For most (text-dense) documents without much layout variation, these small prompt-engineering tricks work pretty well! Scaling this to complex layouts and 1000+ page docs, we found the models don't stick to their instructions. Perhaps there's some work to be done with 1M+ context-length models so they don't lose layout memory.


Do any models use some sort of context pruning to keep the [most] relevant parts of the context?
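Not at the model level that I know of, but pruning is easy to do on the application side before building the prompt. A toy sketch, assuming a word-overlap relevance score (real systems would use embedding similarity; all names here are made up):

```python
# Minimal sketch of context pruning: score stored chunks against the
# current query and keep only the top-k before building the prompt.
# `score` is a toy word-overlap heuristic, not a real retrieval method.

def score(chunk: str, query: str) -> int:
    # Toy relevance score: number of distinct query words in the chunk.
    q_words = set(query.lower().split())
    return sum(1 for w in set(chunk.lower().split()) if w in q_words)

def prune_context(chunks: list[str], query: str, k: int = 2) -> list[str]:
    # Keep the k highest-scoring chunks, preserving original order.
    ranked = sorted(chunks, key=lambda c: score(c, query), reverse=True)
    keep = set(ranked[:k])
    return [c for c in chunks if c in keep]

chunks = [
    "Invoice totals appear on the last page.",
    "The cover page lists the vendor address.",
    "Table of contents spans pages 2-4.",
]
print(prune_context(chunks, "what is the invoice total", k=1))
```

Swap `score` for cosine similarity over embeddings and this is basically retrieval-augmented pruning.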

What single documents are you processing that are 1000+ pages?


Is processing one page at a time not feasible? I'm always chunking things as small as possible for LLMs.
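The page-at-a-time approach is simple to wire up. A hedged sketch, assuming pages are separated by form-feed characters (as many text extractors emit) and with `extract_fields` as a stand-in for the actual model call:

```python
# Sketch of page-at-a-time processing: split on form-feed page breaks,
# run each page through a (stubbed) model call, stitch results together.

def split_pages(doc: str) -> list[str]:
    # Many PDF/text extractors separate pages with form-feed (\f).
    return [p.strip() for p in doc.split("\f") if p.strip()]

def extract_fields(page: str) -> str:
    # Stub for a per-page LLM call; a real version would send `page`
    # plus extraction instructions to the model.
    return f"{len(page.split())} words"

def process_document(doc: str) -> list[str]:
    # One independent model call per page, results in page order.
    return [extract_fields(p) for p in split_pages(doc)]

doc = "page one text\fpage two has more text\f"
print(process_document(doc))  # one result per page
```

The catch the parent comment hints at: fields that span page boundaries (multi-page tables, running headers) get cut, which is where per-page chunking falls down on complex layouts.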



