_cs2017_'s favorites | Hacker News

1.		Agent design is still hard (pocoo.org)
		426 points by the_mitsuhiko 21 days ago | 258 comments
2.		The Continual Learning Problem (jessylin.com)
		102 points by Bogdanp 49 days ago | 8 comments
3.		Kafka is Fast – I'll use Postgres (topicpartition.io)
		561 points by enether 45 days ago | 401 comments
4.		BERT is just a single text diffusion step (nathan.rs)
		455 points by nathan-barry 54 days ago | 110 comments
5.		SWE-Grep and SWE-Grep-Mini: RL for Fast Multi-Turn Context Retrieval (cognition.ai)
		97 points by meetpateltech 58 days ago | 31 comments
6.		Writing an LLM from scratch, part 22 – training our LLM (gilesthomas.com)
		254 points by gpjt 58 days ago | 10 comments
7.		Show HN: I invented a new generative model and got accepted to ICLR (discrete-distribution-networks.github.io)
		656 points by diyer22 64 days ago | 91 comments
8.		A small number of samples can poison LLMs of any size (anthropic.com)
		1202 points by meetpateltech 65 days ago | 439 comments
9.		Reasoning LLMs are wandering solution explorers (arxiv.org)
		90 points by Surreal4434 64 days ago | 98 comments
10.		Building the heap: racking 30 petabytes of hard drives for pretraining (si.inc)
		412 points by nee1r 73 days ago | 274 comments
11.		We reverse-engineered Flash Attention 4 (modal.com)
		134 points by birdculture 77 days ago | 48 comments
12.		Claude’s memory architecture is the opposite of ChatGPT’s (shloked.com)
		448 points by shloked 3 months ago | 236 comments
13.		Le Chat: Custom MCP Connectors, Memories (mistral.ai)
		398 points by Anon84 3 months ago | 165 comments
14.		A PM's Guide to AI Agent Architecture (productcurious.com)
		208 points by umangsehgal93 3 months ago | 62 comments
15.		Physics of badminton's new killer spin serve (arstechnica.com)
		119 points by amichail 3 months ago | 16 comments
16.		Dispelling misconceptions about RLHF (aerial-toothpaste-34a.notion.site)
		120 points by fpgaminer 3 months ago | 32 comments
17.		Diffusion language models are super data learners (jinjieni.notion.site)
		218 points by babelfish 4 months ago | 16 comments
18.		My Lethal Trifecta talk at the Bay Area AI Security Meetup (simonwillison.net)
		430 points by vismit2000 4 months ago | 115 comments
19.		How attention sinks keep language models stable (hanlab.mit.edu)
		219 points by pr337h4m 4 months ago | 36 comments
20.		Gemini 2.5 Deep Think (blog.google)
		461 points by meetpateltech 4 months ago | 249 comments
21.		The Math Is Haunted (overreacted.io)
		409 points by danabramov 4 months ago | 194 comments
22.		Hierarchical Reasoning Model – 1k training samples SoTA reasoning v/s CoT (github.com/sapientinc)
		26 points by dreamer7 4 months ago | 6 comments
23.		Hierarchical Reasoning Model (arxiv.org)
		339 points by hansmayer 4 months ago | 106 comments
24.		Study mode (openai.com)
		1130 points by meetpateltech 4 months ago | 805 comments
25.		What went wrong for Yahoo (homeip.net)
		255 points by giuliomagnifico 4 months ago | 265 comments
26.		Major rule about cooking meat turns out to be wrong (seriouseats.com)
		325 points by voxadam 4 months ago | 250 comments
27.		A conceptual overview of asyncio (github.com/anordin95)
		149 points by anordin95 4 months ago | 30 comments
28.		LLM architecture comparison (sebastianraschka.com)
		418 points by mdp2021 4 months ago | 24 comments
29.		Apple Intelligence Foundation Language Models Tech Report 2025 (machinelearning.apple.com)
		242 points by 2bit 4 months ago | 204 comments
30.		To be a better programmer, write little proofs in your head (the-nerve-blog.ghost.io)
		463 points by mprast 5 months ago | 167 comments