
UX is really, really hard - and for some reason still not fully respected as a discipline.

Fast track to loss of respect:

I visit a site/launch the app I always use with the intent of getting something done quickly, and I find that since the last time I used it someone's rearranged the deck chairs and hidden or removed the functionality I need. Something that should take a minute or two suddenly becomes rage-inducing and eats an entire day.


Or the feature is still there but they've renamed it to something totally unrelated which you would never guess. Honestly, it's like they are actively trying to lose users.

The most depressing email to receive is "Good news! We've improved our website ..."


The Win8 and Metro design disaster is what happens when you give UX free rein: instead of focusing on users, they try to start design trends to impress other UX designers (essential for their careers).

I wonder how much of Apple's design was basically ‘if you confuse Steve Jobs you’re fired.’ And this acted as a necessary governing force to counteract the need to impress peers.


Metro was a wonderful design for the media player app it was made for. It's great for menu-heavy interactions, not so much for representing stateful things like options and checkboxes and such. Metro isn't the problem; trying to shoehorn UIs into it regardless of fit is.

I don’t agree, but that’s design: people have different opinions. I actually like the Ribbon interface; I would have liked it more if they’d added a search box to it as well, but designers hate search boxes.

Part of UX is leveraging what users are already familiar with.

100% agree, but that is in contention with the desire to invent something new. As a separate discipline where the career trajectory is determined by peers, the user becomes less important.

Respect has to be earned, and I don't think anyone (within margin of error) with UX in their job title has earned it. Most of their work consists of shuffling design elements around for its own sake. Sometimes they strike gold (or at least silver or copper), but it never feels like that happens because they were targeting a better design; rather, they stumble upon it while making designs whose goal is simply to be different.

You have to go back to when it was called HCI (human–computer interaction) to find people who weren't completely brain-dead or ad-pilled when it came to design, who did actual work and research trying to make better designs, and who were thus at least somewhat respected.


The Hermes agent dates back to at least September last year too, pre-dating Moltbot/OpenClow by a couple of months: https://github.com/NousResearch/hermes-agent/commit/17608c11...

I've genuinely lost count of the number of little vibe-coded things I've built but then failed to use, because it turns out I have limited bandwidth in terms of fully trying out the quirky ideas I'm popping out through coding agents.

If we're really, really lucky.

Destroying institutions is one heck of a lot easier than building new ones.


It's not even about rebuilding. Some things, once destroyed, can never be recreated, like trust, ocean liners, or the practice of Dísting. The initial act of destruction creates an expectation that it will happen again. Once it does, the process accelerates until the full expectation is that the thing, concept, or practice can never exist again as anything more than a fleeting revival.

This is the kind of scientific research which companies don't generally pay for because it doesn't have direct commercial application, but which companies and the economy benefit from enormously, because the results of that science can be used to build a great many useful commercial things.

> This is the kind of scientific research which companies don't generally pay for because it doesn't have direct commercial application

Tom over at the Explosions&Fire channel (and the Extractions&Ire channel) just published a video[1] about his academic career. In it he noted that in Australia, where he's located, defense companies were an exception to that general rule and did indeed sponsor a fair bit of basic research, including his PhD. I assume in areas they figured had potential, but still.

[1]: https://www.youtube.com/watch?v=4CbdVkcr-Nw


Even so, Australia still has the CSIRO (Commonwealth Scientific and Industrial Research Organisation), so there's that funding and research too. Per capita its funding is about similar (the equivalent of US$9B, adjusted), though it generally does most of that research in-house rather than funding it externally.

The more important research is the kind that the economy doesn't especially benefit from, but which needs to happen in order to improve the quality of human life.

I had a job paid by the National Science Foundation, doing genomics research on children with extremely rare (sometimes unique) genetic diseases. We did publish papers, and Big Pharma can glean a little bit about how we handled the biomedical informatics of managing data across different highly specialized labs, and maybe a researcher will incrementally improve GWAS across the field. But that research was important because actual human children were suffering and needed help.



See sibling comment - NSF also funds science which doesn't have direct or indirect commercial applications (I shouldn't have implied that only commercial applications matter): https://news.ycombinator.com/item?id=47906005

What kind of an agenda does studying the gendered impact of COVID-19 in the Arctic carry?


Your comment here appears to be a perfect illustration of what Nilay calls "software brain" in the article.

(I have a strong case of software brain as he describes it myself.)


Hacker News isn't a great place to discuss papers generally.

Having a productive discussion around a paper requires at least reading and understanding the abstract, and the most successful content on HN (sadly) is content where people can jump in with an opinion purely from reading the headline.

Anyone know of any forums that are good for discussing papers?


This is true across all research subject areas (I'm not especially tuned into LLM research, but I am to cryptography, which also happens to be a field that gets a lot of play on HN). I think it's just a function of how many people conversant in the field are available to talk about it at any one time.

/r/MachineLearning is not bad

But the gold standard is a small Signal or Discord community of like-minded, fairly tight-knit friends. You may have to organize this yourself.


There are/were isolated communities on Discord around fast.ai, MLC, and MLOps that talk papers more in depth, but it's hard to organize a community without commercial or academic incentive.

The difficulty is perhaps unsurprising given the time sink that is reading a given paper to any reasonably complete degree of understanding.

I just email the authors with questions. Surprisingly high response rate.

Unironically, very niche subreddits.

... and this thread over here seems to be proving me wrong already: https://news.ycombinator.com/item?id=47893779

I wonder if the fact that GPT-5.5 was already available in their Codex-specific API, which they had explicitly told people they were allowed to use for other purposes - https://simonwillison.net/2026/Apr/23/gpt-5-5/#the-openclaw-... - accelerated this release!

I've been calling that the "streaming experts" trick. The key idea is to take advantage of Mixture of Experts models, where only a subset of the weights is used for each round of calculations, by loading just those weights from SSD into RAM for each round.

As I understand it, if DeepSeek v4 Pro is 1.6T parameters with 49B active, that means you'd need just those 49B parameters in memory: ~100GB at 16-bit, or ~50GB quantized to 8-bit.

v4 Flash is 284B with 13B active, so it might even fit in <32GB.
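
Quick back-of-envelope in Python, using the parameter counts quoted above (the counts and quant levels are just the ones from this thread, not official numbers):

  # GB needed to hold just the *active* parameters of an MoE model.
  # 1B params at 8 bits per weight = 1GB.
  def active_weight_gb(active_params_b, bits_per_weight):
      return active_params_b * bits_per_weight / 8

  for name, active in [("DeepSeek v4 Pro", 49), ("v4 Flash", 13)]:
      for bits in (16, 8, 4):
          print(f"{name}: {active}B active @ {bits}-bit = "
                f"{active_weight_gb(active, bits):.1f}GB")

That gives 98 / 49 / 24.5GB for v4 Pro and 26 / 13 / 6.5GB for v4 Flash, which is where the ~100GB, ~50GB, and <32GB figures come from.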


The "active" count is not very meaningful except as a broad measure of sparsity, since the experts in MoE models are chosen per layer. Once you're streaming experts from disk, there's nothing that inherently requires having 49B parameters in memory at once. Of course, the less caching memory does, the higher the performance overhead of fetching from disk.

Streaming weights from RAM to GPU for prefill makes sense due to batching, and PCIe 5.0 x16 is fast enough to make it worthwhile.

Streaming weights from RAM to GPU for decode makes no sense at all because batching requires multiple parallel streams.

Streaming weights from SSD _never_ makes sense because the delta between SSD and RAM is too large. There is no situation where you would not be able to fit a model in RAM and also have useful speeds from SSD.
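
To put a number on that delta: here's the bandwidth-bound ceiling on decode speed if every active weight had to be read once per token with no caching (round, illustrative bandwidth figures, 4-bit weights assumed):

  # Upper bound on decode tok/s when all active weights are read per token.
  def max_tok_per_s(active_params_b, bits, bandwidth_gb_s):
      gb_per_token = active_params_b * bits / 8  # weight bytes read per token
      return bandwidth_gb_s / gb_per_token

  active, bits = 49, 4  # DeepSeek v4 Pro active params at 4-bit
  for source, bw in [("SSD ~8GB/s", 8), ("fast SSD ~17GB/s", 17),
                     ("DDR5 ~100GB/s", 100), ("unified memory ~450GB/s", 450)]:
      print(f"{source}: at most {max_tok_per_s(active, bits, bw):.1f} tok/s")

That's roughly 0.3 / 0.7 / 4.1 / 18.4 tok/s respectively; anything faster than the raw SSD line has to come from keeping hot experts cached in RAM.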


There have been some very interesting experiments with streaming from SSD recently: https://simonwillison.net/2026/Mar/18/llm-in-a-flash/

I don't mean to be a jerk, but a 2-bit quant, reducing the experts from 10 to 4, who knows whether the test ran long enough for the SSD to thermally throttle, and still only getting 5.5 tokens/s does not sound useful to me.

It's a lot more useful than being entirely unable to try out the model.

But you aren't trying out the model. You quantized beyond what people generally say is acceptable, and reduced the number of experts, which these models are not designed for.

Even worse, the GitHub repo advertises:

> Pure C/Metal inference engine that runs Qwen3.5-397B-A17B (a 397 billion parameter Mixture-of-Experts model) on a MacBook Pro with 48GB RAM at 4.4+ tokens/second with production-quality output including tool calling.

Hiding the fact that the active params are _not_ 17B.


It doesn't have to be a 2-bit quant - see the update at the bottom of my post:

> Update: Dan's latest version upgrades to 4-bit quantization of the experts (209GB on disk, 4.36 tokens/second) after finding that the 2-bit version broke tool calling while 4-bit handles that well.

That was also just the first version of this pattern that I encountered; it's since seen a bunch of additional activity from other developers in other projects.

I linked to some of those in this follow-up: https://simonwillison.net/2026/Mar/24/streaming-experts/


On Apple Silicon Macs the RAM is shared, so while maybe not up to raw GPU VRAM speeds, it still manages over 450GB/s real-world on the M4 Pro/Max series, available wherever it's needed.

They are all still limited by the SSD, but Apple SSDs can do over 17GB/s on the high-end models; the more normal ones are around 8GB/s.


Yeah, I'm mostly talking about the SSD bottleneck being too slow. There's no way Apple gets 17GB/s sustained: SSDs thermally throttle really fast, and there's some random access involved when it needs the next expert.

> ~100GB at 16 bit or ~50GB at 8bit quantized.

V4 is natively mixed FP4 and FP8, so significantly less than that. 50 GB max unquantized.


Ahh, that actually makes more sense now. (As you can tell, I just skimmed through the READMEs and starred "for later".)

My Mac can fit almost 70B (Q3_K_M) in memory at once, so I really need to try this out soon at maybe Q5-ish.


Unsloth often turn them around within a few hours, though they might have gone to bed already!

Keep an eye on https://huggingface.co/unsloth/models

Update ten minutes later: https://huggingface.co/unsloth/DeepSeek-V4-Pro just appeared but doesn't have files in it yet, so they are clearly awake and pushing updates.



Those are quants, not distills.
