
It is just much more efficient to train on synthetic data. When you train on real data, the only signal at each position is the one token that actually came next. With synthetic data you know the full probability distribution over the next token (from the model that generated it), so each example carries more information. This results in a multiplier effect, and sometimes the effect is dramatic. [1]
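A toy sketch of the difference, in plain Python. The vocabulary, the "teacher" probabilities, and the helper names are made up for illustration and are not taken from the linked paper. It compares the gradient a student model gets from a one-hot target (real data) with the gradient from a full distribution (synthetic data), using the standard softmax plus cross-entropy result that the gradient with respect to the logits is `predicted - target`.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_wrt_logits(target, predicted):
    """Gradient of cross-entropy w.r.t. logits for a softmax output.

    Valid when `target` sums to 1: d(loss)/d(logit_i) = predicted_i - target_i.
    """
    return [p - t for t, p in zip(target, predicted)]

# Hypothetical 4-token vocabulary; the student starts out uniform.
vocab = ["cat", "dog", "bird", "fish"]
student_probs = softmax([0.0, 0.0, 0.0, 0.0])

# Real data: we only observe that the next token was "cat".
hard_target = [1.0, 0.0, 0.0, 0.0]

# Synthetic data: an assumed teacher distribution over the next token.
soft_target = [0.6, 0.3, 0.08, 0.02]

hard_grad = grad_wrt_logits(hard_target, student_probs)
soft_grad = grad_wrt_logits(soft_target, student_probs)

for tok, h, s in zip(vocab, hard_grad, soft_grad):
    print(f"{tok:>4}  hard={h:+.2f}  soft={s:+.2f}")
```

With the one-hot target, every token other than "cat" gets the same push, so a single example says nothing about whether "dog" is more plausible than "fish". With the soft target, each token gets its own correction in one step, which is one way to read the multiplier effect.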

[1] https://arxiv.org/pdf/2504.14772v1


