Diffusion LLMs also follow scaling laws - https://proceedings.neurips.cc/paper_files/paper/2023/file/3...


Those aren't the modern type with discrete masking-based diffusion, though.

Of course, these too will have scaling laws.


Is it possible that combining multiple AIs will be able to somewhat bypass scaling laws, in a similar way that multicore CPUs can somewhat bypass the limitations of a single CPU core?


I’m sure there are ways of bypassing scaling laws, but I think we need more research to discover and validate them.



