karlmdavis's comments | Hacker News

I find it kind of baffling that this toolkit is so popular when it makes handling database joins so difficult. After bashing my head against it for a while, I moved to Diesel, and while that has its own set of problems, I am generally able to get through them without resorting to horrible hacks or losing compile time checks.


What do you mean? It takes SQL queries. You use the `JOIN` keyword in the SQL to do joins.


What problems have you had with joins? I have this comment in one of my projects:

```
It is required to mark left-joined columns in the query as nullable,
otherwise SQLx expects them to not be null even though it is a left join.
For more information, see the link below:
https://github.com/launchbadge/sqlx/issues/367#issuecomment-...
```

Did you have other problems beyond this, or are you referring to something different?

The issue above is a bit annoying but not enough that I'd switch to an ORM over it. I think SQLx overall is great.
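
For reference, a minimal sketch of how the workaround looks, assuming a made-up `users`/`orders` schema; the `AS "last_order_id?"` override in the SQL string is what tells the sqlx macros to treat the left-joined column as nullable:

```
use sqlx::PgPool;

// Hypothetical result type for a users LEFT JOIN orders query.
struct UserWithLastOrder {
    user_id: i64,
    email: String,
    last_order_id: Option<i64>, // None when the user has no orders
}

async fn users_with_last_order(
    pool: &PgPool,
) -> Result<Vec<UserWithLastOrder>, sqlx::Error> {
    sqlx::query_as!(
        UserWithLastOrder,
        r#"
        SELECT u.id   AS user_id,
               u.email,
               o.id   AS "last_order_id?"  -- force nullable for the left join
        FROM users u
        LEFT JOIN orders o ON o.user_id = u.id
        "#
    )
    .fetch_all(pool)
    .await
}
```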


I suspect they meant “scenario”.


I almost gave up on Leptos, because I was trying to use it with Actix, which it supports less well than Axum (and I'm too stubborn for my own good and wouldn't switch).

I came back to it recently after the Leptos 0.7 release, though, and it’s MUCH smoother.

Still early days for a framework like this, but I think it’s got a lot of magic.


What an absolutely delightful little project and write up.


A number of US federal agencies still have astonishing amounts of it. The world's largest insurer, Medicare, uses 10M+ lines of COBOL to process the claims it receives; the total dollar amount of those claims makes up 3% of yearly US GDP.

Maintaining and modernizing these critical systems is important work.


From personal experience, it scales very well vertically. Have a system in production with tens of billions of rows and north of 12 TB of storage total. That system is read-heavy with large batched inserts, not many deletes or updates.

Biggest limiter is memory, since the need for it grows linearly with table index size. Postgres really, really wants to keep the index pages hot in the OS cache. Gets very sad and weird if it can't: will unpredictably resort to table scans sometimes.

We are running on AWS Aurora, on a db.r6i.12xlarge. Nowhere even close to maxed out on potential vertical scaling.
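
If it's useful, a rough way to see how much RAM those indexes want is to ask Postgres directly for heap vs. index size per table. A sketch using sqlx (the `public` schema and the LIMIT are arbitrary; adjust for your own database):

```
use sqlx::{PgPool, Row};

// Print heap vs. index size for the 20 tables with the largest indexes.
async fn print_index_sizes(pool: &PgPool) -> Result<(), sqlx::Error> {
    let rows = sqlx::query(
        r#"
        SELECT relname::text AS tbl,
               pg_size_pretty(pg_table_size(oid))   AS heap,
               pg_size_pretty(pg_indexes_size(oid)) AS indexes
        FROM pg_class
        WHERE relkind = 'r'
          AND relnamespace = 'public'::regnamespace
        ORDER BY pg_indexes_size(oid) DESC
        LIMIT 20
        "#,
    )
    .fetch_all(pool)
    .await?;

    for row in rows {
        println!(
            "{:<40} heap {:>10}  indexes {:>10}",
            row.get::<String, _>("tbl"),
            row.get::<String, _>("heap"),
            row.get::<String, _>("indexes"),
        );
    }
    Ok(())
}
```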


Isn’t Aurora horizontal by default?

EDIT: Here's what I was thinking about. It's chunked in 10 GB increments that are replicated across AZs.

> Fault-tolerant and self-healing storage

> Aurora's database storage volume is segmented in 10 GiB chunks and replicated across three Availability Zones, with each Availability Zone persisting 2 copies of each write. Aurora storage is fault-tolerant, transparently handling the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Aurora storage is also self-healing; data blocks and disks are continuously scanned for errors and replaced automatically.

https://aws.amazon.com/rds/aurora/features/


No? It's a standard single-node primary-with-replicas setup, with a fancy log-based storage layer.


I recently did some DB maintenance on a write-heavy workload, and I found that a table with 500 million records will eventually bloat over time. Switching it to a proper partitioning scheme helped a lot. So people should not read this and assume you can just dump massive workloads into PG and they will be screamingly performant without some tuning and thoughtful design (I don't think this is what you are implying, just a PSA).
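
For anyone curious, the kind of partitioning I mean is Postgres's declarative range partitioning, so old data can be detached or dropped instead of waiting on VACUUM. Sketch only: the `events` table, monthly ranges, and use of sqlx here are illustrative, not the actual schema.

```
use sqlx::PgPool;

async fn create_partitioned_events(pool: &PgPool) -> Result<(), sqlx::Error> {
    // Parent table: rows are routed to partitions by `created_at`.
    // Note the primary key must include the partition key.
    sqlx::query(
        r#"
        CREATE TABLE events (
            device_id  BIGINT      NOT NULL,
            created_at TIMESTAMPTZ NOT NULL,
            payload    JSONB       NOT NULL,
            PRIMARY KEY (device_id, created_at)
        ) PARTITION BY RANGE (created_at)
        "#,
    )
    .execute(pool)
    .await?;

    // One partition per month. Detaching or dropping an old partition
    // reclaims its space immediately instead of leaving dead tuples
    // for VACUUM to chew through.
    sqlx::query(
        r#"
        CREATE TABLE events_2024_01 PARTITION OF events
            FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
        "#,
    )
    .execute(pool)
    .await?;

    Ok(())
}
```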


Is there a chance you're running an older version of PG? They reduced bloat significantly in the last few releases.


Yes! PG12 in this case. Thanks, this definitely motivates my in-flight project to get these DBs up to PG16. Very happy to hear that.


If you need more motivation, check out the feature matrix; there have been a fair number of pretty big advances since 12.

https://www.postgresql.org/about/featurematrix/


My cluster clocks in at 230 TB in Aurora, and it is approaching the 250 TB hard limit that AWS can support.


No, we do not store logs or IoT data. The data are all business-related metrics. I didn't choose Aurora; I inherited it from another team. We have 4 read replicas to scale out read access. The internal team owns ingestion (inserts) to the writer node; all other, external access is read-only.

I think the reason behind the Aurora pick was to support arbitrary aggregation and filtering with low-latency reads (p90 < 3000ms). We could not pick a distributed engine based on Presto, Athena, or Redshift, mainly because of the latency requirements.

The other contender I considered was Elasticsearch, but I do think using it in this case would be akin to the square-peg-in-a-round-hole saying.


Out of curiosity, I was wondering what type of application could generate this quantity of data.

Is it IoT / remote sensing related?


You are thinking of a well-architected application storing normalized (BCNF, if not 3NF) structured data; unless the app has 100 million+ users or grew super fast, a 250 TB size would be hard to get to.

Time-series data (like the IoT you mentioned), binary blobs, logs, or any other data that shouldn't really be in SQL storage can hit any size, so that wouldn't be all that interesting.

I can't speak for the OP, but managing data for apps with a few million users, what I have observed is that most SQL stores hit the single-digit-TB range and then start getting broken down into smaller DBs: either because teams have grown and want their own microservice or DB, or because infra wants them easier to handle in a variety of ways, including backup/recovery, since it is extremely difficult to get reasonable RTO/RPO numbers for larger DBs.


If you want to store video data as BLOBs in a DB, you can get there easily.

Maybe not the best idea; I guess a file system would be better for that, and you'd just use the DB for metadata.

But OTOH all the data is in one place, so you just migrate the DB. Less to worry about.

I just looked it up: all of English Wikipedia (including images) is barely even 100 GB... crazy world we live in.
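
A minimal sketch of that "files on disk, metadata in the DB" split, with a hypothetical `videos` table and media path: the bytes go to the filesystem (or object storage), and only the metadata goes into Postgres where it can be indexed and queried:

```
use sqlx::PgPool;
use std::path::Path;
use tokio::fs;

async fn store_video(
    pool: &PgPool,
    name: &str,
    bytes: &[u8],
) -> Result<(), Box<dyn std::error::Error>> {
    // Large binary goes to the filesystem...
    let path = Path::new("/var/media/videos").join(name);
    fs::write(&path, bytes).await?;

    // ...and only the metadata goes into Postgres.
    sqlx::query("INSERT INTO videos (name, path, size_bytes) VALUES ($1, $2, $3)")
        .bind(name)
        .bind(path.to_string_lossy().into_owned())
        .bind(bytes.len() as i64)
        .execute(pool)
        .await?;

    Ok(())
}
```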


You wouldn't say "less to worry about" when you have to do a full backup, or demonstrate recovery from backup within a set recovery time.

"One data store is easier" is a myth; it just offloads complexity from developers onto infra teams, who are now provisioning premium NVMe storage instead of cold object stores for binary data.

Binary data is not indexed or aggregated in a SQL store, so there is no value in keeping it all in one place, except dev experience at the cost of the infra team's experience.


It's super easy to generate arbitrary amounts of data if you start using Postgres as a log of any sort.

I worked for a company that had only a few thousand active customers yet had dozens of terabytes of data, for this reason.


You and I must work in very different contexts, as these questions are so obvious that they first seemed like satire to me.

You enforce API contracts in a monolith (or any codebase, really) via an at-least-modest amount of typing and a compiler. You diagnose performance issues via any number of tools, prominently including metrics and profilers.

My context for this is a lot of years working with backend languages like Java, Rust, etc. though the same assurances and tooling are available for most every platform I’m aware of.


Sure… and then the return type of the API is 'CustomerORMModel'… and now the API consumer can build N+1 query problems across components.

You need a few more restrictions than that. But I agree it’s doable.
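
For example, one such restriction (names here are made up): have the module's public API return a plain summary type instead of the ORM entity, so callers can't reach back into the database and build N+1 patterns:

```
// The module's public API returns a plain data type, not the ORM entity,
// so consumers cannot trigger additional lazy queries across components.
mod customers {
    pub struct CustomerSummary {
        pub id: i64,
        pub name: String,
        pub open_orders: u32, // pre-computed in one query, not lazily loaded
    }

    // Internal ORM/row type never leaves this module.
    struct CustomerEntity {
        id: i64,
        name: String,
    }

    pub fn get_summary(id: i64) -> CustomerSummary {
        // A real implementation would run a single joined/aggregated
        // query here and map the result into the summary type.
        let entity = CustomerEntity { id, name: "example".into() };
        CustomerSummary { id: entity.id, name: entity.name, open_orders: 0 }
    }
}

fn main() {
    let summary = customers::get_summary(42);
    println!("{} has {} open orders", summary.name, summary.open_orders);
}
```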


Looks perfect for me, with one showstopper: no HomeKit support. I love the idea of Home Assistant, but I don't have the free time to pick up another service to support in our house.


One of our community members actually added HomeKit support.

See: https://forum.airgradient.com/t/airgradient-integrations/703...


Nu uses a dataframe abstraction to drive this, and it's really quite powerful and fast.

I was rather skeptical on the value of having this stuff as a shell built in, but it won me over. Very convenient!


I’ve used Synthea for a whole assortment of small and large projects and it’s been boring in the best possible way: reliable and easy to use.

I’ve also had the pleasure of working directly with the team at MITRE that owns it on a consulting engagement (we needed some improvements to it) and they are a delight to work with.

