babol's comments

babol · 2026-03-21T10:55:02

What you wrote aligns with my experience so far. It's fast and easy to get something working, but in a number of cases it (Opus) just gets stuck 'spinning' and no number of prompts is going to fix that. Moreover, when creating things from scratch it tends to use average, insecure, or inefficient approaches that later take a lot of time to fix.

The whole thing reminds me a bit of the many RAD tools that were supposed to 'solve' programming. While it was easy to start and produce something with those tools, at some point you started spending way too much time working around their limitations and wished you had started from scratch without them.

babol · 2026-02-26T11:49:33

Would running an application with the chosen GC, subtracting the GC time reported by the methods you introduced, and then comparing the result with an Epsilon-based run be a good estimate of barrier overhead?

Thank you for the well-written article!

jonasn · 2026-02-26T12:03:15

That is a creative idea, but unfortunately, Epsilon changes the execution profile too much to act as a clean baseline for barrier costs.

One huge issue is spatial locality. Epsilon never reclaims, whereas other GCs reclaim and reuse memory blocks. This means their L2/L3 cache hit rates will be fundamentally different.

If you compare them, the delta wouldn't just be the barrier overhead; it would be the barrier overhead mixed with completely different CPU cache behavior, memory layout, and so on. The GC is a complex feedback loop, so results from Epsilon are rarely directly transferable to a "real" system.

babol · on Dec 21, 2023

> > QuestDB organizes data sorted by time, so relying on insertion order may help to avoid redundant sorting if there is an ORDER BY clause with the timestamp column.

> If data is already sorted and you have an 'order by' then just use the data directly – bingo, instant merge join, no hash table needed.

I reckon keeping data on the heap in insertion order isn't that useful for joins, because the hash table is used for lookups while iterating over the other table (so the main table determines the output order). Where it could help is, e.g., storing the results of a GROUP BY. For a query such as:

SELECT timestamp, key, sum(value) FROM data GROUP BY timestamp, key ORDER BY timestamp

if the data table stores rows ordered by timestamp and the hash table maintains insertion order, then no sorting is required after aggregating all rows, because iterating over the heap produces the right order.
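
To make the idea concrete, here is a minimal sketch in Python, not QuestDB's actual implementation. It assumes the input rows already arrive ordered by timestamp and uses a plain `dict`, which preserves insertion order in Python 3.7+, as a stand-in for an insertion-ordered hash table. The `Row` type, the `group_by_sum` helper, and the sample data are all made up for illustration.

```python
from collections import namedtuple

# Hypothetical row shape matching the query's columns.
Row = namedtuple("Row", ["timestamp", "key", "value"])


def group_by_sum(rows):
    """Compute sum(value) per (timestamp, key), keeping groups in first-seen order."""
    totals = {}  # dict keeps insertion order; updating a key does not move it
    for row in rows:
        group = (row.timestamp, row.key)
        totals[group] = totals.get(group, 0) + row.value
    # Iterating the table yields groups in insertion order, so no sort step is needed
    # as long as the input was ordered by timestamp.
    return [(ts, key, total) for (ts, key), total in totals.items()]


# Sample input, already ordered by timestamp (the assumption this relies on).
rows = [
    Row(1, "a", 10),
    Row(1, "b", 5),
    Row(1, "a", 7),
    Row(2, "b", 1),
    Row(3, "a", 4),
    Row(3, "a", 6),
]

result = group_by_sum(rows)
for line in result:
    print(line)
```

Because a new group is only ever appended when its timestamp is first seen, and timestamps never decrease in the input, the output timestamps never decrease either. The same argument does not hold if the input is unordered, in which case an explicit sort would still be needed.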