Hacker News | rytill's comments

The sheer em dash density of this post really struck me, so I asked Claude to write a script which ranked text post Show HNs over the last week in order of em dash density. Script here: https://github.com/mturnshek/hn-em-dash-density/tree/master

This post comes in 12th place out of 668, with 0.6232% em dash density. I was also surprised by the large number of Show HNs in the last week.

Here are the highest density 4 from the last week:

1. 1.2195% — Show HN: AI-native memory, recall and reminder on CLI – 100% local with Ollama (https://news.ycombinator.com/item?id=47417498)

2. 1.0417% — Show HN: CEL v0.2 Pro – cryptographic black box recorder for AI systems (Python) (https://news.ycombinator.com/item?id=47406917)

3. 1.0174% — Show HN: FalconAI – AI-Powered Smart TV Streaming, Search and Voice Control (https://news.ycombinator.com/item?id=47377118)

4. 0.9259% — Show HN: Ostov.js – Backbone.js Fork Without jQuery/Underscore, Classes, TS, ES (https://news.ycombinator.com/item?id=47401800)

Posts with at least one em or en dash: 457/668 (68.4%)

I also learned about the existence of the en dash. I only knew about the em dash and the hyphen, but there's:

— em dash (U+2014)

– en dash (U+2013)

- hyphen-minus (U+002D)

Anyway.
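For reference, the per-character metric is simple to sketch. This is my reconstruction of what such a script would compute, not code from the linked repo:

```python
# Sketch of a dash-density metric (my reconstruction, not code from the
# linked repo): occurrences of a given dash per character, as a percentage.
EM_DASH = "\u2014"  # em dash
EN_DASH = "\u2013"  # en dash
HYPHEN = "\u002D"   # hyphen-minus

def dash_density(text: str, dash: str = EM_DASH) -> float:
    """Percentage of characters in `text` that are the given dash."""
    if not text:
        return 0.0
    return 100.0 * text.count(dash) / len(text)
```

The same function covers all three characters, since each is a single code point.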


Are you trying to imply that humans don’t need generalized knowledge, or that we’re not “rewarded” for having highly generalized knowledge?

If so, good luck walking to your kitchen this morning, knowing how to breathe, etc.


Do you need to learn Latin and marine biology to work as a cashier in your local shop? That's the point: humans get on with their jobs just fine on very limited general knowledge. LLMs have gotten this good because their datasets, pre-training, and RL are larger than before.


> that corporate profits would rise while consumer spend dropped are literally incompatible realities

These are not incompatible realities.

I would be willing to accept the statement that corporate revenues increasing and consumer spending decreasing are incompatible realities.

But it’s feasible to think the following occurs:

- labor income falls

- consumer spending drops

- corporate revenues drop

- corporate profits moderately increase because profit margins get much higher

- government deficit continues (which, from an accounting perspective, means other accounts are in surplus, potentially US corporations)

I’m not saying I strongly predict the above, necessarily! I just don’t think it’s correct to say it’s not a conceivable reality.
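As a toy illustration of the margin point, with entirely made-up numbers (not a forecast): revenue can fall while absolute profit rises, as long as the margin expands enough.

```python
# Made-up numbers, purely illustrative: revenue falls 10% while the profit
# margin expands from 8% to 10%, and absolute profit still rises.
rev_before, margin_pct_before = 100, 8  # $100 revenue at an 8% margin
rev_after, margin_pct_after = 90, 10    # $90 revenue at a 10% margin

profit_before = rev_before * margin_pct_before / 100  # $8
profit_after = rev_after * margin_pct_after / 100     # $9

assert rev_after < rev_before        # revenue (and spending) down
assert profit_after > profit_before  # profits up anyway
```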


When I hear "coding agent", I think of both the harness and the LLM as a pair. Like, Claude Opus 4.6 and Claude Code is a coding agent, or Gemini 3 Pro and Pi is a coding agent.

"Harness" is a way to reference the coding agent minus the "LLM" part.

If an agent is an LLM in a loop with tool calls, there are two components: 1) the LLM, and 2) the loop with tool calls. That second part could be called the harness.
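The "LLM in a loop with tool calls" framing can be sketched in a few lines. Everything here is invented for illustration: `call_llm`, the message dicts, and the tool registry are stand-ins, not any real harness's API.

```python
# Toy harness sketch. `call_llm` and the message shapes are invented for
# illustration; real harnesses (Claude Code, etc.) are far more involved.
def run_agent(call_llm, tools, task, max_steps=10):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(history)          # component 1: the LLM
        history.append(reply)
        if reply.get("tool") is None:      # no tool call -> final answer
            return reply["content"]
        # component 2: the loop executes the requested tool and feeds
        # the result back into the conversation
        result = tools[reply["tool"]](**reply.get("args", {}))
        history.append({"role": "tool", "content": str(result)})
    return None                            # gave up after max_steps
```

Swap in a different `call_llm` and you have the same harness driving a different model, which is why it makes sense to name the two parts separately.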


LLMs are not "average text generation machines" once they have context. LLMs learn a distribution.

The moment you start the prompt with "You are an interactive CLI tool that helps users with software engineering at the level of a veteran expert" you have biased the LLM such that the tokens it produces are from a very non-average part of the distribution it's modeling.


True, but nuanced. The model does not react to "you are an experienced programmer" kinds of prompts. It reacts to being given relevant information that needs to be reflected in the output.

See the examples in https://arxiv.org/abs/2305.14688. They certainly do say things like "You are a physicist specialized in atomic structure ...", but the important point is that the rest of the "expert persona" prompt _calls attention to key details_ that improve the response. The hint about electromagnetic forces in the expert persona prompt is what tipped off the model to mention them in the output.

Bringing attention to key details is what makes this work. A great tip for anyone who wants to micromanage code with an LLM is to include precise details about what they wish to micromanage: say "store it in a hash map keyed by unsigned integers" instead of letting the model decide which data structure to use.


Speak English


It is not a "narrative", "philosophical paradigm", or him "getting high on his own supply". It is simply him sharing his thoughts about something.


He is in fact getting high on his own supply of narratives and philosophical paradigms. There are no facts in that entire blog post. It's a useless fart in the wind.


Could you please stop posting shallow and curmudgeonly dismissals? It's not what this site is for, and destroys what it is for.

If you want to make your substantive points without putdowns, that's fine, but please don't use this place to superciliously posture over others.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.


Alright that's a valid answer. Thank you.


It really doesn't, at all. Every sentence has a clear, unequivocal meaning, and it doesn't use any LLM tropes. Your LLM sensor is seriously faulty.


What is the goal of doing that vs using L2 loss?


To add to the existing answers: L2 losses induce a "blurring" effect when you autoregressively roll out these models. That means you not only lose important spatial features, you also truncate the extrema of the predictions. In other words, you can't forecast high-impact extreme weather with these models at moderate lead times.


Yes, very good point. To me this is one of the most magical elements of this loss: how it suddenly makes the model "collapse" onto one output, so the predictions become sharp.


Yeah, it's underplayed in the writeup, but the context here is important. The "sharpness" issue was a major impediment to improving the skill and utility of these models. When GDM published GenCast two years ago, there was a lot of excitement because the generative approach seemed to completely eliminate this issue. But there was a trade-off: GenCast was significantly more expensive to train and run inference with, and there wasn't an obvious way to make improvements there. Still faster than an NWP model, but the edge starts to dull.

FGN (and NVIDIA's FourCastNet-v3) show a new path forward that balances inference/training cost without sacrificing the sharpness of the outputs. And you get well-calibrated ensembles if you run them with random seeds for their noise vectors, too!

This is a much bigger deal than people realize.


To encourage diversity between the different members of an ensemble. I think people are doing very similar things for MoE networks, but I'm not that deep into that topic.


The goal of using CRPS is to produce an ensemble that is a good probabilistic forecast without needing calibration/post processing.

[edit: "without", not "with"]
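For concreteness, the standard finite-ensemble estimator of CRPS can be sketched in a few lines. This is the textbook estimator, E|X - y| - 0.5 E|X - X'|, not any particular model's actual loss code:

```python
def ensemble_crps(members, obs):
    """CRPS estimate for a finite ensemble: E|X - y| - 0.5 * E|X - X'|.

    A sketch of the textbook estimator, not any specific model's loss code.
    `members` are scalar ensemble forecasts, `obs` the verifying observation.
    """
    m = len(members)
    # Mean absolute distance from each member to the observation.
    skill_term = sum(abs(x - obs) for x in members) / m
    # Mean absolute distance between members (penalizes over-dispersion).
    spread_term = sum(abs(a - b) for a in members for b in members) / (m * m)
    return skill_term - 0.5 * spread_term
```

With a single member it reduces to absolute error; the spread term is what rewards an ensemble whose dispersion matches its actual uncertainty, which is why a CRPS-trained ensemble can come out calibrated without post-processing.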


So, I have heard a number of people say this, and I feel like I'm the person in your conversations saying it's a coarse description that downplays the details. What I don't understand is what, specifically, we gain from thinking of it as a Markov chain.

Like, what is one insight, beyond the fact that LLMs are Markov chains, that you've derived from thinking of them that way? I'm genuinely very curious.


It depends on whether you already had experience using large Markov models for practical purposes.

Around 2009, I had developed an algorithm for building the Burrows–Wheeler transform at (what was back then) very large scale. If you have the BWT of a text corpus, you can use it to simulate a Markov model with any context length. I tried that with a Wikipedia dump, which was amusing for a while but not interesting enough to develop further.

Then, around 2019, I was working in genomics. We were using pangenomes based on thousands of (human) haplotypes as reference genomes. The problem was that adding more haplotypes also added rare variants and rare combinations of variants, which could be misleading and eventually started decreasing accuracy in the tasks we were interested in. The standard practice was dropping variants that were too rare (e.g. <1%) in the population. I got better results with synthetic haplotypes generated by downsampling the true haplotypes with a Markov model (using the BWT-based approach). The distribution of local haplotypes within each context window was similar to the full set of haplotypes, but the noise from rare combinations of variants was mostly gone.

Other people were doing haplotype inference with Markov models based on similarly large sets of haplotypes. If you knew, for a suitably large subset of variants, whether each variant was likely absent, heterozygous, or homozygous in the sequenced genome, you could use the model to get a good approximation of the genome.

When ChatGPT appeared, the application was surprising (even though I knew some people who had been experimenting with GPT-2 and GPT-3). But it was less surprising on a technical level, as it was close enough to what I had intuitively considered possible with large Markov models.
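The "simulate a Markov model over a text corpus" idea can be sketched with a plain dictionary-based fixed-order character model. This is not the BWT machinery described above (which supports any context length without building a table per k), just the simplest illustration of the same distribution:

```python
# Fixed-order character-level Markov model, as a plain-dictionary sketch.
# The BWT-based approach mentioned above achieves this without fixing k
# in advance; this is only an illustration of the sampling idea.
import random
from collections import defaultdict

def build_model(text, k):
    # Map each length-k context to the characters observed after it.
    model = defaultdict(list)
    for i in range(len(text) - k):
        model[text[i:i + k]].append(text[i + k])
    return model

def generate(model, seed, k, length, rng=None):
    # Sample forward one character at a time from the empirical
    # follower distribution of the current context.
    rng = rng or random.Random(0)
    out = seed
    for _ in range(length):
        followers = model.get(out[-k:])
        if not followers:
            break  # unseen context: stop rather than invent
        out += rng.choice(followers)
    return out
```

Storing followers as a list (with repeats) means `rng.choice` samples in proportion to the observed counts, which is exactly the empirical k-order distribution.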


> Boglehead

> 140% gain on your holdings this year

Choose one.


Generally true, but NVDA and PLTR are normie stocks and can account for these returns this year.


Boglehead is basically: pick 2-3 Vanguard ETFs and check back in 25 years.


That's my approach. I got my quarterly statement in the mail yesterday. Looks like the market must have gone up over the past three months. Not sure what to do with this information since it's not like I'm going to change anything.


But then it's not a Boglehead lol


https://www.bogleheads.org/wiki/Passively_managing_individua...

I understand where you're coming from, but there isn't an incongruity. Individual stock investments are a relatively small part of my overall portfolio.


> The discussion here assumes that you are not trying to beat the market, but instead passively managing individual stocks to create your own "DIY index fund."

