Someone still has to orchestrate the shit show. Like a captain at the helm in the middle of a storm.
Or you can go full accelerationist and give an agent the job of standing up all the other agents. But then you need someone whose job is to be angry when the $7,000 cloud bill arrives.
Skeptical about replacing Redis with a table serialized to disk. The point of Redis is that it lives in memory, so you can hammer it with hot-path queries while taking a lot of load off the backing DB. That design also requires a cron job for expiry, which means the table could fill the disk between key purges.
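For contrast, here is a minimal cache-aside sketch, assuming redis-py, psycopg2, a local Redis and Postgres, and a hypothetical `users` table; the hot path stays in memory and expiry is handled by per-key Redis TTLs rather than a purge cron:

```python
# Minimal cache-aside sketch (hypothetical schema and connection strings).
import json
import psycopg2
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
pg = psycopg2.connect("dbname=app")

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)                      # hot path: memory only, no disk I/O
    if cached is not None:
        return json.loads(cached)

    with pg.cursor() as cur:                 # cold path: hit the backing DB once
        cur.execute("SELECT id, name FROM users WHERE id = %s", (user_id,))
        row = cur.fetchone()

    user = {"id": row[0], "name": row[1]}
    r.set(key, json.dumps(user), ex=300)     # per-key TTL, no cron needed
    return user
```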
I think the article is wrong. UNLOGGED means the table isn't written to the WAL, so crash-recovery guarantees don't apply: a transaction can commit before the pages are synced to disk, and after a crash or unclean shutdown Postgres truncates the table. It trades durability for faster writes.
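A small sketch of what that trade-off looks like in practice, assuming psycopg2 and a made-up cache table:

```python
# Hypothetical UNLOGGED cache table; name and schema are made up.
import psycopg2

conn = psycopg2.connect("dbname=app")
conn.autocommit = True

with conn.cursor() as cur:
    # UNLOGGED skips the write-ahead log: writes are faster, but the table
    # is truncated after a crash or unclean shutdown and is not replicated.
    cur.execute("""
        CREATE UNLOGGED TABLE IF NOT EXISTS kv_cache (
            key        text PRIMARY KEY,
            value      jsonb NOT NULL,
            expires_at timestamptz NOT NULL
        )
    """)
    # Expiry still needs an external sweep -- the cron the parent mentions.
    cur.execute("DELETE FROM kv_cache WHERE expires_at < now()")
```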
Open models have been about 6 to 9 months behind frontier models, and that has been the case since 2024. That is a very long time for this technology at its current rate of development. If fast-takeoff theory is right, the gap should widen (although with Kimi K2.5 it may have actually shortened).
If we consider what typically happens with other technologies, we would expect open models to catch up to frontier models on general-intelligence benchmarks in time. Sort of like how every brand of battery-powered drill you find at the store is very similar, despite being head and shoulders better than the best drill from 25 years ago.
> That is a very long time for this technology at its current rate of development.
Yes, and as long as that gap stays consistent, there is no problem with building on ~9-month-old tech from a business perspective. Heck, many companies lag decades behind the state of the art and are doing fine.
> Sort of like how every brand of battery-powered drill you find at the store is very similar, despite being head and shoulders better than the best drill from 25 years ago.
They all get made in China, mostly in the same facilities. Designs tend to converge under those conditions, especially since design is not open-loop: you talk to the supplier that will make your drill, and the supplier may share how they already make drills for other brands.
You understate the capabilities of the latest-gen LLMs. I can typically describe a user's bug in a few sentences, or tell Claude to fetch the 500 error from the Cloud Run logs, and it will explain the root cause, propose a fix, and throw in a new unit test within a couple of minutes.
I do this all the time in my Claude Code workflow:
- Claude will stumble a few times before figuring out how to do part of a complex task
- I will ask it to explain what it was trying to do, how it eventually solved it, and what was missing from its environment.
- Trivial pointers go into the CLAUDE.md. Complex tasks go into a new project skill or a helper script
This is the best way to reinforce a copilot, because models are pretty smart most of the time and I can correct the cases where they stumble with minimal cognitive effort. Over time I find more and more tasks are solved by agent intelligence or these happy-path hints. As primitive as it is, CLAUDE.md is the best we have for long-term adaptive memory.
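As a concrete (entirely hypothetical) example of the helper-script half of this: a small wrapper the agent can invoke instead of re-deriving the gcloud incantation every time. The script path, default service name, and log filter are my own illustration, not the parent's setup:

```python
#!/usr/bin/env python3
# scripts/fetch_cloudrun_errors.py -- hypothetical helper referenced from CLAUDE.md
# so the agent can pull recent 500s instead of rediscovering the gcloud flags.
import json
import subprocess
import sys

SERVICE = sys.argv[1] if len(sys.argv) > 1 else "api"   # made-up default service name

filter_expr = (
    'resource.type="cloud_run_revision" '
    f'resource.labels.service_name="{SERVICE}" '
    "httpRequest.status>=500"
)

# gcloud logging read returns log entries matching the filter, newest first.
out = subprocess.run(
    ["gcloud", "logging", "read", filter_expr, "--limit=20", "--format=json"],
    check=True,
    capture_output=True,
    text=True,
)

for entry in json.loads(out.stdout):
    req = entry.get("httpRequest", {})
    print(entry.get("timestamp"), req.get("status"), req.get("requestUrl"))
```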
These complaints are about technical limitations that will go away for codebase-sized problems as inference cost continues to collapse and context windows grow.
There are literally hundreds of engineering improvements we will see along the way: an intelligent replacement for compaction to deal with diff explosion, more raw memory and dedicated inference hardware, models that can actually handle >1M-token context windows without attention loss, and so on.
Switching from my 8-core Ryzen mini-PC to an 8-core Ryzen desktop makes my unit tests run way faster. TDP limits can tip you off to very different performance envelopes in otherwise similar-spec CPUs.
A full-size desktop computer will always be much faster for any workload that fully utilizes the CPU.
However, a full-size desktop computer seldom makes sense as a personal computer, i.e. as the computer that interfaces to a human via display, keyboard and graphic pointer.
For most of the activities done directly by a human, i.e. reading & editing documents, browsing the Internet, watching movies and so on, a mini-PC is powerful enough. The only exception is playing games designed for big GPUs, but many computer users are not gamers.
In most cases the optimal setup is to use a mini-PC as your personal computer and a full-size desktop as a server on which you can launch any time-consuming tasks, e.g. compilation of big software projects, EDA/CAD simulations, test suites, etc.
The desktop used as a server can stay powered off when not needed and be woken with Wake-on-LAN whenever it has to run a task remotely.
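A minimal Wake-on-LAN sketch, in case it is useful: the magic packet is six 0xFF bytes followed by the target MAC repeated 16 times, sent as a UDP broadcast (port 9 here); the MAC below is a placeholder:

```python
# Send a Wake-on-LAN magic packet to a sleeping machine on the LAN.
import socket

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    packet = b"\xff" * 6 + mac_bytes * 16           # 6x 0xFF, then MAC x16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, (broadcast, port))

wake("aa:bb:cc:dd:ee:ff")   # placeholder: the desktop's NIC MAC
```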
Even if you could cool the full TDP in a micro PC, a full-size desktop lets you use a massive AIO radiator with fans running at very slow, very quiet speeds instead of the jet-turbine howl of a micro case. The quiet, and the ease of working inside a bigger case, are usually a good trade-off for a slightly larger form factor under a desk.
It's good to be skeptical of new ideas as long as you don't box yourself in with dogmatism. If you're young you do this by looking at the world with fresh eyes. If you are experienced you do it by identifying assumptions and testing them.