Hacker News | bavell's comments

Perhaps you could generate a few tokens before the entire model is downloaded, but since every token takes a potentially different "path" through an MoE model, you'd still need to wait for the entire download before getting deeper than a handful of tokens... which is not really a UX improvement imo.

Even at its worst, it's a minor UX improvement over having to download everything before getting the first token. The full download still has to complete eventually, but we can prioritize so that the first handful of tokens comes through early.
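A minimal sketch of the idea, with entirely made-up router statistics: if you know (or estimate) how often each expert gets routed to, you can fetch the popular ones first so early tokens are likely to be servable mid-download.

```python
# Hypothetical sketch: prioritize downloading the MoE experts the router
# picks most often, so the first few tokens can be generated before the
# whole model is on disk. All numbers below are invented for illustration.

def download_order(expert_usage_freq):
    """Return expert ids sorted most-used first."""
    return sorted(expert_usage_freq, key=expert_usage_freq.get, reverse=True)

# Imagined router stats: expert id -> fraction of tokens routed to it
usage = {0: 0.05, 1: 0.40, 2: 0.25, 3: 0.30}
print(download_order(usage))  # [1, 3, 2, 0]
```

If the first tokens happen to route to an expert that hasn't arrived yet, generation still stalls, which is the "potentially different path" problem above.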

Going on 10 years now for me, tried Helm a bit and yep - all I've really needed was a package.json deploy script with sed to bump the image version.
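The sed-style tag bump can be sketched in a few lines; here it is as Python for illustration, assuming a hypothetical manifest with an `image:` line (the real script and manifest layout will differ).

```python
import re

def bump_image_tag(manifest_text, new_tag):
    """Replace the tag after the last ':' on any 'image:' line.
    Hypothetical manifest format; mirrors sed -E 's/(image: .+:).+/\\1NEW/'."""
    return re.sub(r"(image:\s*\S+:)\S+", r"\g<1>" + new_tag, manifest_text)

manifest = "containers:\n  - image: registry.example.com/app:v1.2.3\n"
print(bump_image_tag(manifest, "v1.2.4"))
```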

Funny, I've been doing the same thing lately! CC + godot + some game ideas I've had banging around in my head for years but daunting to dive into.

The results so far are... okay, but getting something working to validate the gameplay loop and experiment with different systems is a lot of fun!


How well does it work with Godot? Engines like Unity and Godot are very focused on using the editor UI, so I've always wondered if there's any better workflow than generating code snippets. Unless you're going full .NET/GDExtension...

> I would also expect to see it taking exponentially longer to process a prompt. I don't believe LLMs work like that.

Try this out using a local LLM. You'll see that as the conversation grows, your prompts take longer to execute. It's not exponential but it's significant. This is in fact how all autoregressive LLMs work.
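Back-of-envelope arithmetic for why this happens (illustrative dimensions, not any particular model): without prefix caching, self-attention over n prompt tokens costs O(n^2) work per layer, so a conversation twice as long takes roughly four times as long to prefill.

```python
# Rough sketch of why prompt processing slows as a conversation grows.
# Counts only the attention score/value matmuls; d_model and n_layers
# are made-up illustrative values.

def attention_flops(n_tokens, d_model=4096, n_layers=32):
    per_layer = 2 * n_tokens * n_tokens * d_model  # QK^T plus attn @ V
    return n_layers * per_layer

short = attention_flops(1_000)
long_ = attention_flops(4_000)
print(long_ / short)  # 16.0: 4x the tokens -> ~16x the attention work
```

In practice the linear (MLP) terms dominate at short contexts, but the quadratic term is what you feel as the conversation gets long.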


Yesterday I was playing around with Gemma4 26B A4B at a 3-bit quant, sizing it for my 16GB 9070 XT:

  Total VRAM: 16GB
  Model: ~12GB
  128k context size: ~3.9GB
At least I'm pretty sure I landed on 128k... might have been 64k. Regardless, you can see the massive weight (ha) of the meager context size (at least compared to frontier models).
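For anyone curious where those gigabytes come from, here's the back-of-envelope KV-cache estimate. Every parameter below is hypothetical (the real model's layer/head counts differ); it just shows how a "meager" 128k context balloons into multi-GB territory.

```python
# KV-cache sizing sketch with invented architecture numbers.
# bytes_per_elem=1 assumes an 8-bit quantized cache; fp16 would double it.

def kv_cache_bytes(ctx_len, n_layers=32, n_kv_heads=4, head_dim=128,
                   bytes_per_elem=1):
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # K and V
    return ctx_len * per_token

gib = kv_cache_bytes(128 * 1024) / 2**30
print(f"{gib:.1f} GiB")  # 4.0 GiB -- same ballpark as the ~3.9GB above
```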

> As a user, I _expect_ the cost of resuming X hours/days later to be no different to resuming seconds or minutes later.

As an informed user who understands his tools, I of course expect large uncached conversations to massively eat into my token budget, since that's how all of the big LLM providers work. I also understand these providers are businesses trying to make money and they aren't going to hold every conversation in their caches indefinitely.
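The budget math is simple but brutal. With made-up prices (real provider rates vary), replaying a long history after the cache has expired costs an order of magnitude more than resuming while it's warm:

```python
# Illustrative arithmetic with hypothetical prices: cache-hit input tokens
# are typically billed at a fraction of uncached ones.

PRICE_IN = 3.00 / 1_000_000        # $/uncached input token (made up)
PRICE_CACHED = 0.30 / 1_000_000    # $/cache-hit input token (made up)

history = 150_000  # tokens of prior conversation replayed with each message

fresh_cost = history * PRICE_IN       # cache expired: pay full price
cached_cost = history * PRICE_CACHED  # resumed quickly: mostly cache hits
print(f"${fresh_cost:.2f} vs ${cached_cost:.3f}")  # $0.45 vs $0.045
```

And that's per message sent into the resumed conversation, not a one-time cost.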


I'd hazard a guess that there's a large gulf between the proportion of users who know as much as you and the total number using these tools. The fact that a message can perform wildly differently (in either cost or behaviour, if using one of the mitigations) based on whether I send it at time t vs t+1 seems like a major UX issue, especially given t is very likely not exposed in the UI.

I definitely agree that it should be shown and obvious in the UI. They do show a warning now when resuming old sessions but still could be better.

Haven't had a chance to test 4.7 much, but one of my pet peeves with 4.6 is how eager it is to jump into implementation. Maybe 4.7 is smarter about this now.

The system prompt is always loaded in its entirety IIUC. It's technically possible to modify it during a conversation but that would invalidate the prefill cache for the big model providers.
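To see why, here's a toy sketch of prefix-keyed caching (a hypothetical scheme, not any provider's actual implementation): the cache key covers the exact token prefix, and the system prompt sits at the very front of it, so changing one word there misses the entire cache.

```python
import hashlib

def prefix_key(system_prompt, messages):
    """Toy cache key over the exact conversation prefix."""
    blob = system_prompt + "\x00" + "\x00".join(messages)
    return hashlib.sha256(blob.encode()).hexdigest()

msgs = ["hi", "write me a haiku"]
k1 = prefix_key("You are a helpful assistant.", msgs)
k2 = prefix_key("You are a terse assistant.", msgs)
print(k1 == k2)  # False: one changed word up front, whole prefix missed
```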

Is it exponential or logistic?

Nope, the original tariffs were under IEEPA; then the Supreme Court ruled the administration didn't have the authority to use IEEPA, so they had to drop those tariffs and start working on refunds. It would only have been illegal if they'd kept the tariffs in place after the ruling.

Lots of propaganda and emotion around this straightforward chain of events.


Under this reasoning, it's not illegal to just take things from stores (stores hate this one simple trick). If you're caught and your specific actions are then adjudicated to be illegal, at that point you can just start making a plan to bring the items back (even if some are used/damaged/etc) and everything is fine.

In reality of course, the actions were illegal the whole time. The big festering problem is that there is no actual punishment for government agents who break the law.


Definitely some problems in the current system, broad and creeping executive overreach extending back decades now.

Pretty sure stealing from stores is already illegal, not sure I understand your analogy... lots of case law / precedent there.


The existence of case law / precedent does not affect whether something is "already illegal", but rather only how strongly one can predict if something is illegal. The original tariffs were illegal from day 1.

The point of the analogy was exactly to point at something with a lot of case law where this dynamic is crystal clear (although if Trump starts petty shoplifting after he's done looting our government, it's even odds whether this corrupt "court" will find some way to excuse it. Anything for the cause, of course)


So the USA was under, per Wikipedia on IEEPA, "An Act with respect to the powers of the President in time of war or national emergency"?

I mean, that's very wishy-washy. So are we both aligned that it looks like misuse? Because if it only comes down to a word definition (no, what he did wasn't illegal, but it was clear misconduct) then it feels like word play.


I do agree it was a weak case, I think SCOTUS ruled correctly.
