What happens when a prompt injection attack exploits the judge LLM and results in a higher level of attacker control than if the judge never existed?



How could it result in a higher level of control? I don't see why the "judge" should have access to anything except a single tool that lets it send an "accept" or "deny" command.
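A minimal sketch of the constrained-judge pattern described above, assuming a hypothetical `judge` callable that returns the judge LLM's raw text (the function names here are illustrative, not any real library's API). The point is that the caller reduces the judge's entire influence to one bit, failing closed to "deny":

```python
def parse_judge_verdict(raw: str) -> bool:
    """Reduce the judge LLM's free-form output to a single bit.

    Even if a prompt injection hijacks the judge, the only thing it
    can influence is this accept/deny decision; it has no other tool
    access. Anything other than an exact "accept" fails closed to deny.
    """
    return raw.strip().lower() == "accept"


def guarded_action(candidate_output: str, judge) -> str:
    """Run candidate_output past the judge before releasing it.

    `judge` is any callable taking the candidate text and returning
    the judge LLM's raw response (hypothetical interface).
    """
    if not parse_judge_verdict(judge(candidate_output)):
        return "[blocked by judge]"
    return candidate_output
```

Under this design, the worst an injected judge can do is wrongly accept or wrongly deny one output; it cannot escalate to any capability the caller did not already have.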


