For most (text-dense) documents without much layout variation, these small prompt-engineering tricks work pretty well! Scaling this to complex layouts and 1000+ page docs, we found the models don't stick to their instructions. Perhaps there's some work to be done with 1M+ context-length models so they don't lose layout memory.


Do any models use some sort of context pruning to keep the [most] relevant parts of the context?
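Not at the model level that I know of, but pruning is easy to do on the application side before building the prompt. A toy sketch, assuming a word-overlap relevance score (real systems would use embedding similarity; all names here are made up):

```python
# Minimal sketch of context pruning: score stored chunks against the
# current query and keep only the top-k before building the prompt.
# `score` is a toy word-overlap heuristic, not a real retrieval method.

def score(chunk: str, query: str) -> int:
    # Toy relevance score: number of distinct query words in the chunk.
    q_words = set(query.lower().split())
    return sum(1 for w in set(chunk.lower().split()) if w in q_words)

def prune_context(chunks: list[str], query: str, k: int = 2) -> list[str]:
    # Keep the k highest-scoring chunks, preserving original order.
    ranked = sorted(chunks, key=lambda c: score(c, query), reverse=True)
    keep = set(ranked[:k])
    return [c for c in chunks if c in keep]

chunks = [
    "Invoice totals appear on the last page.",
    "The cover page lists the vendor address.",
    "Table of contents spans pages 2-4.",
]
print(prune_context(chunks, "what is the invoice total", k=1))
```

Swap `score` for cosine similarity over embeddings and this is basically retrieval-augmented pruning.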

What single documents are you processing that are 1000+ pages?


Is processing one page at a time not feasible? I'm always chunking things as small as possible for LLMs.
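The page-at-a-time approach is simple to wire up. A hedged sketch, assuming pages are separated by form-feed characters (as many text extractors emit) and with `extract_fields` as a stand-in for the actual model call:

```python
# Sketch of page-at-a-time processing: split on form-feed page breaks,
# run each page through a (stubbed) model call, stitch results together.

def split_pages(doc: str) -> list[str]:
    # Many PDF/text extractors separate pages with form-feed (\f).
    return [p.strip() for p in doc.split("\f") if p.strip()]

def extract_fields(page: str) -> str:
    # Stub for a per-page LLM call; a real version would send `page`
    # plus extraction instructions to the model.
    return f"{len(page.split())} words"

def process_document(doc: str) -> list[str]:
    # One independent model call per page, results in page order.
    return [extract_fields(p) for p in split_pages(doc)]

doc = "page one text\fpage two has more text\f"
print(process_document(doc))  # one result per page
```

The catch the parent comment hints at: fields that span page boundaries (multi-page tables, running headers) get cut, which is where per-page chunking falls down on complex layouts.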



