Hacker News | abreslav's comments

> * model transforms text into a formal specification

A formal specification is no different from code: it will have bugs :)

There's no free lunch here: the informal-to-formal transition (be it words-to-code or words-to-formal-spec) comes through the non-deterministic models, period.

If we want to use the immense power of LLMs, we need to figure out a way to make this transition good enough.


When you translate a spec to tests (whether those are traditional unit tests or any automated tests that call the rest of the code), that fixes the API of the code, i.e. the code gets designed implicitly in the test-generation step. Is this working well in your experience?


Yes, it is passable.

Good enough that I don't review it.

Granted, it is a personal project that I only care about to the point that I want it to work. There is no money on the line. Nothing professional.

I believe that part of the secret is that I force CC to run the whole test suite after it changes ANY file, using hooks.

It makes iteration slower because it kinda forces it to go from green to green. Or better, from red to less red (since we start in red).
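For anyone curious, here is a minimal sketch of what such a hook could look like in Claude Code's `.claude/settings.json`. The `npm test` command is just a placeholder; substitute whatever runs your suite:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npm test" }
        ]
      }
    ]
  }
}
```

The `matcher` limits the hook to file-modifying tools, so read-only operations don't trigger a test run.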

But overall I am definitely happy with the results.

Again, personal projects. Not really professional code.


Another trick that I use.

I force the code to be almost 100% dependency-injectable.

It simplifies writing tests and getting coverage a lot. And I see the LLM handling it very, very well.
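To illustrate the point (hypothetical names, not from the thread): dependency injection here just means passing collaborators in as parameters instead of constructing them inside, so tests can pin them to fixed values.

```python
import datetime

# Hard to test: the clock is baked into the function.
def greeting_hardwired():
    hour = datetime.datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"

# Injectable: the clock is a parameter, so tests control it.
def greeting(now=None):
    now = now or datetime.datetime.now()
    return "Good morning" if now.hour < 12 else "Good afternoon"

# Tests can now cover both branches deterministically:
assert greeting(datetime.datetime(2026, 1, 1, 9)) == "Good morning"
assert greeting(datetime.datetime(2026, 1, 1, 15)) == "Good afternoon"
```

The same pattern applies to databases, HTTP clients, filesystems, and randomness: anything the code depends on becomes a parameter with a sensible default.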


Very much agree on coverage. We're actually doing something in that area: https://codespeak.dev/blog/coverage-20260302

For now, it's only about test coverage of the code, but the spec coverage is coming too.


I think you guys are doing pretty much everything right.


Very much agree. I like the imperative vs declarative angle you take here. Thank you!


We'd love to hear your feedback! Feel free to come to our discord to ask questions/share experience: https://l.codespeak.dev/discord


We are not trying to make things easier for LLMs. LLMs will be fine. CodeSpeak is built for humans, because we benefit from some structure, knowing how to express what we want, etc.


> Also it seems that the tool severely limits the configurability of the agentic generation process, although that's just a limitation of the specific tool.

Working on that as well. We need to be a lot more flexible and configurable.


> The limitation seems to be that you can't modify the code yourself if you want the spec to reflect it

Eventually, we'll end up in a world where humans don't need to touch code, but we are not there yet. We are looking into ways to "catch up" the specs with whatever changes happen in the code not through CodeSpeak (agents or manual changes or whatever). It's an interesting exercise. In the case of agents, it's very helpful to look at the prompts users gave them (we are experimenting with inspecting the sessions from ~/.claude).

More generally, `codespeak takeover` [1] is a tool to convert code into specs, and we are teaching it to take prompts from agent sessions into account. Seems very helpful, actually.

I think it's a valid use case to start something in vibe coding mode and then switch to CodeSpeak if you want long-term maintainability. From "sprint mode" to "marathon mode", so to speak.

[1] https://codespeak.dev/blog/codespeak-takeover-20260223


> Eventually, we'll end up in a world where humans don't need to touch code, but we are not there yet.

Will we though? Wouldn't AI need to reach a stage where it is a tool, like a compiler, which is 100% deterministic?


Two things to mention here:

1. You are right that we can redefine what is code. If code is the central artefact that humans are dealing with to tell machines and other humans how the system works, then CodeSpeak specs will become code, and CodeSpeak will be a compiler. This is why I often refer to CodeSpeak as a next-level programming language.

2. I don't think being deterministic per se is what matters. Being predictable certainly does. Human engineers are not deterministic yet people pay them a lot of money and use their work all the time.


>Human engineers are not deterministic yet people pay them

Human carpenters are not deterministic, yet they won't use a machine saw that wanders off the line even 1% of the time. The whole history of tools, including software, is one of trying to make the thing do more precisely what is intended, whether the intent is right or not.

Can you imagine some machine tool maker making something faulty and then saying, "Well hey, humans aren't deterministic."


They do it all the time with their EULAs.


We will, and soon, because it does not have to be deterministic like a compiler. It only has to pass all tests.


Who is writing the tests?


There are different kinds of tests:

* regression tests – can be generated

* conformance tests – often can be generated

* acceptance tests – are another form of specification and should come from humans.
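A rough sketch of the distinction (a hypothetical example with plain asserts, not anything from the thread):

```python
def apply_discount(price, discount):
    """Toy function under test."""
    return max(0.0, price - discount)

# Acceptance test: encodes human intent ("a discount must never
# push a price below zero") and should come from a person.
def test_acceptance_no_negative_prices():
    assert apply_discount(10.0, 15.0) == 0.0

# Regression test: pins today's observed behavior so later changes
# don't silently alter it; this kind can be generated from the code.
def test_regression_exact_output():
    assert apply_discount(10.0, 2.5) == 7.5

test_acceptance_no_negative_prices()
test_regression_exact_output()
```

The mechanics are identical; what differs is where the expected values come from: human judgment in the first case, the current implementation in the second.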

Human intent can be expressed as

* documents (specs, etc)

* review comments, etc

* tests with clear yes/no feedback (data for automated tests, or just manual testing)

And this is basically all that matters, see more here: https://www.linkedin.com/posts/abreslav_so-what-would-you-sa...


In the future users will write the tests


A compiler is not 100% deterministic. Its output can change when you upgrade its version, and it can change when you change optimization options. Profile-guided optimization can also change the output between runs.


If you change the inputs, then obviously you will get a different output. Crucially, using the same inputs produces the same output. So compilers are actually deterministic.


This is irrelevant over the long run because the environment changes even if nothing else does. A compiler from the 1980s still produces identical output given the original source code, if you can run it at all. Some form of virtualization might be in order, but the environment keeps changing while the deterministic subset shrinks.

Having faith that determinism will last forever is foolish. You have to upgrade at some point, and you will run into problems: new bugs, incompatibilities, workflow changes. Whatever the case, it will make the determinism property moot.


Many compilers aren't deterministic. That's why making Linux distros' builds reproducible took so long and so much effort.

The reason is, it's often more work to be deterministic than not, so compilers don't bother. For example, they may compile functions in parallel and append them to the output in the order they complete.
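A toy sketch of that failure mode, simulated in Python: tasks finish in a timing-dependent order, so collecting results as they complete gives output whose order can vary run to run, while sorting before emitting restores determinism.

```python
import concurrent.futures as cf
import random
import time

def compile_unit(name):
    # Simulate a compilation step that takes a variable amount of time.
    time.sleep(random.uniform(0, 0.05))
    return name

units = ["a.c", "b.c", "c.c", "d.c"]

with cf.ThreadPoolExecutor() as pool:
    futures = [pool.submit(compile_unit, u) for u in units]
    # Appending in completion order: ordering may differ between runs.
    as_finished = [f.result() for f in cf.as_completed(futures)]

# A reproducible build fixes the order (e.g. sorts) before emitting output.
reproducible = sorted(as_finished)
assert reproducible == ["a.c", "b.c", "c.c", "d.c"]
```

Real reproducible-build work involves much more (timestamps, embedded paths, archive metadata), but output ordering is one of the classic culprits.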


Why are we eliminating our own jobs, and maybe our hobby, so eagerly? Whatever. It is done.


I second that :)


Great slides! I hope we'll be able to adopt this style for our introductory materials

