BenoitP · 2026-04-14T07:31:09 1776151869

Yet another manycore proposal, but I feel we're getting closer each time. Bandwidth is one of the few dimensions still growing, and there's something to unlock by shaping computation around this firehose of bits.

However, it seems the software side is always the blocker, and these architectures only serve a handful of program types well.

Is this time different? I think it is. The paradigm here is about small threads that yield very often, and we already have that style of programming: Erlang, Go, Java's virtual threads.
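To make "small threads that yield very often" concrete, here is a minimal sketch using plain Python generators as cooperative tasks. It is not how Erlang, Go, or Java virtual threads are implemented; `worker` and `run_round_robin` are hypothetical names, and the round-robin policy is an assumption chosen for simplicity.

```python
from collections import deque


def worker(name, steps, log):
    """A tiny task: do one unit of work, then yield control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks one step at a time until all finish."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield
            queue.append(task)  # not finished: back of the line
        except StopIteration:
            pass                # finished: drop it


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Each task runs for a very short stretch between yields, which is the property such hardware would presumably want from its software.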

BenoitP · 2026-04-07T16:44:25 1775580265

LCOE is good for marginal cost (e.g. one more solar panel), but fails dramatically at evaluating systemic costs.

A nuclear reactor moves the entire market down, including the cost to consumers when they buy solar energy.

Here is a UN document explaining it: https://unece.org/sites/default/files/2025-09/GECES-21_2025_...

Ringz · 2026-04-08T13:37:36 1775655456

The SCBOE score is a good idea. However, in the case of Germany, it is often overlooked that the power grid dating from the 1970s, which was built as a one-way system from large power plants (nuclear power plants) to consumers, would have needed to be rebuilt regardless. A large share of the grid costs would therefore have been passed on to consumers even without the transition to renewable energy. Additionally, Germany is located in the center of Europe and is thus a major transit country for electricity. Here too, the corresponding capacity would have had to be expanded. The expansion of a European power grid also means that the disadvantages of renewable energy variability can be offset. As the SCBOE system also shows, the individual power plant still accounts for the largest share of costs. Many of the additional factors can actually come down in price as renewables scale up (nuclear has yet to prove that this can work there too). In that regard, LCOE remains relevant.

leonidasrup · 2026-04-11T21:03:40 1775941420

Even at the peak of nuclear electricity production in 2001, coal was the dominant source of electricity in the German grid. (Data for 2001: nuclear 171 TWh, coal 293 TWh.)

https://ourworldindata.org/grapher/electricity-prod-source-s...

The power grid is not and never was a one-way system: AC power lines and transformers don't care about the direction of the current. Only the amount of current passing through each power line and transformer matters.

The side effect of many electricity customers installing PV panels and reducing their demand from the grid is that the fuel costs of on-demand power plants decrease, but their fixed costs (installation, maintenance) stay the same. These fixed costs have to be recouped from the smaller amount of electricity sold by on-demand power plants, therefore the per-MWh price from on-demand power plants will increase for grid customers.
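A quick worked example of that fixed-cost dilution, as a sketch. All numbers are hypothetical and chosen only to make the arithmetic easy to follow; `price_per_mwh` is an illustrative helper, not a real pricing model.

```python
def price_per_mwh(fixed_cost, fuel_cost_per_mwh, mwh_sold):
    """Break-even price: fixed costs spread over sales, plus fuel per MWh."""
    return fixed_cost / mwh_sold + fuel_cost_per_mwh


FIXED_COST = 50_000_000  # hypothetical annual fixed cost, EUR
FUEL_COST = 40           # hypothetical fuel cost, EUR per MWh

# Same plant, same fixed cost, but rooftop PV halves the energy it sells.
before = price_per_mwh(FIXED_COST, FUEL_COST, mwh_sold=1_000_000)
after = price_per_mwh(FIXED_COST, FUEL_COST, mwh_sold=500_000)

print(before)  # 90.0 EUR/MWh
print(after)   # 140.0 EUR/MWh
```

The plant burns less fuel in total, yet every MWh it still sells has to carry twice the share of fixed cost.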

For most electricity customers it's not possible to disconnect from the grid and rely on PV panels and batteries alone.

Germany is not a major transit country for electricity. According to 2019 data, the electricity interconnection level for Germany was only 10%.

https://en.wikipedia.org/wiki/Continental_Europe_Synchronous...

Germany is projected to have import capacity equal to less than 15% of its domestic electricity generation by 2030.

https://ember-energy.org/latest-insights/money-on-the-line-s...

Building long, high-capacity power lines is expensive, therefore many big industrial electricity consumers were built near power plants, or power plants were built near major industrial customers.

BenoitP · 2026-04-02T07:24:27 1775114667

JAX is designed from the start to fit well with systolic arrays (TPUs, Nvidia's tensor cores, etc), which are extremely energy-efficient. WebGL won't be the tool that connects it on the web, but the generation after WebGPU will.

BenoitP · 2026-04-02T07:19:28 1775114368

I believe we could get there eventually. For example for collision there is work to make it differentiable (or use a local surrogate at the collision point): https://arxiv.org/abs/2207.00669

Robotics will need to connect vision with motors with haptics with 3D modelling, and to propagate gradients seamlessly, for example to calibrate torque against the elastic deformation of the material. After all, matter is not discrete at small scales (staying above the atomic scale).

All this will require every module to be compatible with differentiability. It'll be expensive at first, but I'm sure some optimizations can get us close to the cost of the discrete case.

Also even for meshes there is a lot to gain with trying to go the continuous way:

https://www.cs.cmu.edu/~kmcrane/Projects/DDG/

BenoitP · 2026-04-02T07:07:45 1775113665

Yeah :)

I had a lot of fun writing the article! And it is only half a joke.

My intuition for so-called world models is that we'll have to plug modules together, each responsible for a domain (text, video, sound, robot haptics, physical modelling), and to plug them in a way that allows the gradient to propagate: a differentiable architecture. JAX seems well placed for this because it makes function manipulation a first-class citizen. Your testimony reassures me in this view.
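A toy illustration of "plugging modules so the gradient propagates", as a sketch in plain Python rather than JAX. It uses forward-mode automatic differentiation with dual numbers; `Dual`, `vision_module`, and `motor_module` are hypothetical stand-ins, and real modules would of course be far richer than these two scalar functions.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value together with its derivative with respect to one input."""
    val: float
    grad: float

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.grad + other.grad)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.grad * other.val + self.val * other.grad)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def vision_module(x):   # hypothetical module: h = 3x + 1
    return 3 * x + 1


def motor_module(h):    # hypothetical module: y = h^2
    return h * h


def pipeline(x):
    return motor_module(vision_module(x))


out = pipeline(Dual(2.0, 1.0))  # seed dx/dx = 1
print(out.val, out.grad)        # 49.0 42.0
```

Neither module knows about the other, yet the derivative of the whole pipeline comes out of the composition, which is the property a differentiable architecture relies on.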

BenoitP · 2026-04-02T06:58:02 1775113082

Damn, I should have spent more time QA-ing that post. I'll try to patch it.

You did not miss much though: it just rotates the scene.

BenoitP · 2026-03-25T08:02:07 1774425727

It is AI generated. Or it was written by someone a bit far from the technical advances, IMHO. The Johnson-Lindenstrauss Lemma is a very specific and powerful concept, whereas the QLJ explanation in the article is vacuous. A knowledgeable human would not have left the reader wondering how it relates to the Lemma.
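For readers who want the concrete content of the Lemma: a random linear projection into far fewer dimensions approximately preserves pairwise distances. A minimal sketch in plain Python follows; the dimensions (1000 down to 200), the Gaussian projection, and the seed are assumptions for illustration and say nothing about how the article's method uses it.

```python
import math
import random


def random_projection(dim_in, dim_out, rng):
    """Gaussian random matrix, scaled so squared norms are preserved on average."""
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(matrix, v):
    return [sum(r * x for r, x in zip(row, v)) for row in matrix]


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(5)]
P = random_projection(1000, 200, rng)
low = [project(P, p) for p in points]

# Ratio of projected distance to original distance, for every pair of points.
ratios = [dist(low[i], low[j]) / dist(points[i], points[j])
          for i in range(5) for j in range(i + 1, 5)]
print(min(ratios), max(ratios))  # both should be close to 1
```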

BenoitP · 2026-03-13T13:24:55 1773408295

> the whole process remains differentiable: we can even propagate gradients through the computation itself. That makes this fundamentally different from an external tool. It becomes a trainable computational substrate that can be integrated directly into a larger model.

IMHO the key point at which this technique has an unfair advantage vs a traditional interpreter is here.

How disruptive is it to have differentiability? To me it would mean that some tweaking can happen in an LLM-program at train time, like changing a constant, or switching from one function call to another. Can we gradient-descend effectively inside this huge space? How different is it from tool-calling from a pool of learned programs (think GitHub, but for LLM programs written in classic languages)?
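The simplest version of "changing a constant at train time" can be sketched as follows. This is a toy with a hand-derived gradient, not the technique from the article; `program`, the data, the learning rate, and the step count are all hypothetical.

```python
def program(x, c):
    """Hypothetical 'program' with one tunable constant c."""
    return c * x + 1.0


def loss_and_grad(c, data):
    """Mean squared error and its derivative with respect to c."""
    n = len(data)
    loss = sum((program(x, c) - y) ** 2 for x, y in data) / n
    grad = sum(2 * (program(x, c) - y) * x for x, y in data) / n
    return loss, grad


# Examples produced by the "correct" program, whose constant is 3.
data = [(x, 3.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]

c = 0.0  # start from a wrong constant
for _ in range(200):
    _, g = loss_and_grad(c, data)
    c -= 0.05 * g  # plain gradient descent step

print(round(c, 6))  # 3.0
```

Tuning a continuous constant like this is the easy case; switching between function calls is a discrete choice, so it would need some relaxation before gradients could say anything about it.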

BenoitP · 2026-03-12T17:55:18 1773338118

> He's still computing cross(z, d) and dot(z, d) separately. that looks like a code smell to me. with quaternions ...

Fair point, but I think you misspelled Projective Geometric Algebra

aap_ · 2026-03-12T18:38:13 1773340693

If you only care about rotations in 3d, quaternions do everything you need :) with all the added benefits of having a division algebra to play with (after all the cross product is a division-algebraic operation). PGA is absolutely great, but quite a bit more complex mathematically, and its spinors are not as obvious as quaternionic ones. in addition GA is commonly taught in a very vector-brained way, but i find spinors much easier to deal with.
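The point about cross and dot can be checked directly: the Hamilton product of two pure quaternions (0, z) and (0, d) yields -dot(z, d) as its scalar part and cross(z, d) as its vector part, in one operation. A minimal sketch in plain Python, with arbitrary example vectors; `qmul`, `dot`, and `cross` are illustrative helpers.

```python
def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


z = (1.0, 2.0, 3.0)
d = (4.0, 5.0, 6.0)

# Embed both vectors as pure quaternions (zero scalar part) and multiply.
w, *vec = qmul((0.0, *z), (0.0, *d))

print(w, -dot(z, d))            # -32.0 -32.0
print(tuple(vec), cross(z, d))  # (-3.0, 6.0, -3.0) (-3.0, 6.0, -3.0)
```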

BenoitP · 2026-03-01T06:55:02 1772348102

But the newer commenters are most probably younger