I’ve only had their platforms explained to me by Palantir themselves at a conference, but the mental model that stuck with me was more of an operating system than a single tool. Think AWS managed services + Databricks + whatever library of ready-made intelligence software they’ve already built for other customers.
They also have “forward deployed engineers” to help organizations actually use the platform. It looked complicated enough to probably be completely useless without these specialists, even in a “self hosted” setup.
The managed hosting also seems like a major selling point, so many deployments that should probably be self-hosted probably aren’t, because muh managed-services added value.
And the backdoors of course. There is no way it isn’t full of plausibly deniable “metrics endpoints” that helpfully spew out all the internal data if the right key comes knocking. There’s no way it’s auditable at the level of detail you would need compared to the value of the data and the sophistication of the potential attacker (NSA).
Maybe my brain is oversaturated with culture war nonsense from too much doomscrolling but that’s where my train of thought went too, even if it wasn’t directly implied.
By claiming our ancient predecessors had terrible taste you can make them look like primitive fools, and make our own modernity appear superior in comparison.
When boiled down to culture war brainrot the poor coloring in the reconstructions becomes a woke statement that the brutish patriarchal empires of antiquity have nothing to teach our sophisticated modern selves and that new is good and old is bad. A progressive hit-piece on muh heritage.
Anything you don’t like is a purple haired marxist if you squint hard enough.
Idk why my brain went there. I’m guessing the years of daily exposure to engagement-farming ragebait had something to do with it.
I like it! We have a service with a similar Postgres task queue, but we use an insert trigger on the tasks table that does NOTIFY, and the worker runs LISTEN. It feels a bit tidier than polling IMO.
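Roughly, the trigger side looks like this (a sketch — the `tasks` table, `id` column, and `task_ready` channel name are made up for illustration):

```sql
-- Hypothetical sketch: notify listeners whenever a task row is inserted.
CREATE FUNCTION notify_task_ready() RETURNS trigger AS $$
BEGIN
  -- Send the new row's id as the notification payload.
  PERFORM pg_notify('task_ready', NEW.id::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tasks_notify_insert
AFTER INSERT ON tasks
FOR EACH ROW EXECUTE FUNCTION notify_task_ready();

-- Worker side: run `LISTEN task_ready;` on its connection and block
-- until a notification arrives, then fetch and process the task row.
```

Because the NOTIFY fires inside the inserting transaction, it's only delivered on commit, so workers never get woken up for rows that were rolled back.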
LISTEN/NOTIFY works great, but it has no mechanism for ACKs or retries, so there are tradeoffs to consider. It’s a good fit when you’re willing to sacrifice some durability!
I had a few periods of doing the same in Sublime Text, though I did use syntax highlighting. It’s a really great feeling and very liberating, especially in a greenfield project.
Can’t really justify it at work though, projects are too big and gnarly to keep in my head.
The best practice way to swap fullname for firstname, lastname would be to:
1. Run a migration that adds firstname and lastname columns, all nulls initially.
2. Deploy an application code change that starts populating firstname and lastname alongside fullname, while still reading fullname.
3. Backfill the firstname and lastname values with a script/command/migration.
4. Change app code to read firstname and lastname and stop writing fullname.
5. Drop the fullname column.
I don't think there's a safe way to do all that in a single migration unless all your app code also lives in the database so it can be atomically deployed. If you have multiple app servers and do rolling deploys with no downtime I think it has to be done in these 5 steps.
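In SQL terms, steps 1, 3, and 5 look roughly like this (a sketch — the `users` table name is assumed, and the naive split-on-first-space backfill is purely illustrative; real name data needs more care):

```sql
-- Step 1: add the new columns as nullable; no backfill yet.
ALTER TABLE users
  ADD COLUMN firstname text,
  ADD COLUMN lastname  text;

-- Step 3: backfill existing rows (Postgres syntax). Naive split on the
-- first space, for illustration only.
UPDATE users
SET firstname = split_part(fullname, ' ', 1),
    lastname  = nullif(substr(fullname,
                  length(split_part(fullname, ' ', 1)) + 2), '')
WHERE firstname IS NULL;

-- Step 5: only after every app server reads and writes the new columns.
ALTER TABLE users DROP COLUMN fullname;
```

The backfill in step 3 is often run in batches on large tables to avoid holding long row locks while the app is live.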
There's nothing wrong with nullable fields when it's appropriate. When kids are born they don't have names. Not all users want to tell you their names. A null value is data too.
Timescale is definitely worth a look. pg_partman gets you part of the way. We ended up going with BigQuery for our workload because it solved a bigger bag of problems for our needs (data warehouse). It’s very hard to beat for big… queries.
I never understood the rationale behind TimescaleDB: if you’re building a time series database on row-oriented storage, you’ve already got one hand tied behind your back.
What does your testing strategy look like with BigQuery? We use Snowflake, but the only way to iterate and develop is using Snowflake itself, which is so painful that it limits the set of features we have the stomach to build.
Testing strategy? What’s that? I kid, but only a bit. Our use case is a data warehouse. We use dbt to build everything. Each commit is built in CI into a CI target project, with its commit hash prefixed onto dataset names. Each developer also has their own prefix for local development. The dev and CI datasets expire and are deleted after about a week. We use data tests on the actual data for “foreign keys”, duplicates, and allowed values, but that’s pretty much it. It’s very difficult to do TDD for a data warehouse in SQL.
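The prefixing can be done by overriding dbt’s generate_schema_name macro, something like this (a sketch — the prefix is assumed to arrive via target.schema, e.g. “ci_&lt;hash&gt;” in CI or “dev_&lt;name&gt;” locally, and the fallback name is made up):

```sql
-- Sketch: route non-prod builds into prefixed datasets.
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if target.name == 'prod' -%}
        {#- prod uses the plain schema names -#}
        {{- custom_schema_name if custom_schema_name else target.schema -}}
    {%- else -%}
        {#- target.schema carries the per-commit or per-developer prefix -#}
        {{- target.schema }}_{{ custom_schema_name if custom_schema_name else 'default' -}}
    {%- endif -%}
{%- endmacro %}
```

Combined with dataset expiration, this keeps CI and dev builds isolated and self-cleaning without touching prod naming.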
My current headache is what to do with an actually big table, 25 billion rows of JSON, for development. It’s going to be some dbt hacks, I think.
God help you if you want to unit test application code that relies on BigQuery. I’m sure there are ways, but I suspect they all hurt a lot.
Interesting strategy with appending the commit hash to the dataset name. If one of those commits is known to be good and you want to “ship” it, do you then rename it?
What are you doing with that JSON? What’s the reason why you can’t get a representative sample of rows onto your dev machine and hack on that?
Reuters really broke the site for me when they did what looked like a rewrite about a year ago. Glad to see this, as I hardly read it anymore. BBC News is going downhill too; another prime candidate for a project like this.