Hype grows over “autonomous” AI agents that loop GPT-4 outputs

srslack · on April 15, 2023

LLMs predict text. They predict text by picking a random seed and then using the context to modify the weights of how it walks down a prediction tree. The definition of autonomous is "Not controlled by others or by outside forces." It's why the "jailbreak" prompts work so well against the moralizing RLHF that OpenAI does. You use strong words to modify the weights to go over there, instead of where they want you to go. They have other mechanisms, probably another model that checks output to ensure that it at least has their disclaimer and such. I know that Bing's chatbot has a model checking the output that revokes all output when it detects bad speak.

Not that this article is particularly bad since it's in quotes, but the woo I've seen everywhere is out of control these past few months. "AGI" in particular grinds my gears.

A few weeks ago I'd started a local project in the same vein as these, with "autonomous" as the name as a joke. It's not going to be autonomous. For what it's worth, I'm aiming to build a full-blown docker swarm to support ReAct problem solving and objectives in such, to support the commands in prompts just like Auto-GPT, giving it a file bucket, access to google, wikipedia, calculator and a browser, all prefixed with UUIDs so different configurations don't step on each other's toes. I've succeeded in getting it to interface with Playwright over a JSON API I wrote and prompted it to use, so it can use it for sure if you give it the commands. But you're always limited by context, so you have to use obvious tricks like constraining the viewport and walking the DOM, checking what's in the viewport, etc. I'm probably going to switch to the accessibility tree that's hopefully accessible over the CDP or a hybrid approach with that when I get time.

As a full blown assistant with access to your saved passwords in your browser, though? I don't think there's a chance with these LLMs, at least safely, because of prompt injections.

darepublic · on April 15, 2023

> I've succeeded in getting it to interface with Playwright over a JSON API I wrote and prompted it to use, so it can use it for sure if you give it the commands. But you're always limited by context, so you have to use obvious tricks like constraining the viewport and walking the DOM, checking what's in the viewport, etc.

I have created something similar in the last few weeks. With puppeteer instead of playwright. And yeah basically feeding gpt4 filtered views of webpages. Full blown assistant is the idea.

srslack · on April 15, 2023

>Full blown assistant

Well, by that I meant personal assistant, for actions with access to email and a password manager for anything personally associated without a chance to review. That'd be a bit inconvenient in my own case anyways, because hardware key. You're braver than me, for sure.

It's a pain when it runs for so long with how far it gets away from the original prompt, but light prompts with the system message do help, as well as a way for it to refresh its "commands" and goals/tasks back into the context window. It's like Memento, with the complex system of Polaroids, notes and tattoos.

I don't know if it's possible to develop this stuff in the open without a headache considering all of the hysteria, but was thinking if I fixed consistent output for my browser command and got a swarm working I'd try putting it on GitHub. Mainly I want to make sure things like Google Sheets are output correctly, try to leverage as much accessibility for SPAs that try, as I can.

lgreiv · on April 15, 2023

I've built a proof of concept myself in Ruby, an agent that extends itself with new abilities at runtime based on code suggestions from ChatGPT until it deems itself capable of fulfilling an arbitrary task given in natural language. It then persists that „specialized“ version of the task was completed successfully.

So far it was able to add an API to fetch me the local weather, return the capital of Venezuela, control brightness and volume of my MacBook and replicate itself at random locations (but tell me where, after).

That being said, I added multiple human-in-the-loop points in the assess/suggest/patch/execute cycle and (given the nature of LLMs) would never use it outside of a sandbox without these safety rails.

spacetime_cmplx · on April 15, 2023

>fetch weather

>get capital

>control volume

>oh btw, it's also a self-replicating AI

https://tvtropes.org/pmwiki/pmwiki.php/Main/ArsonMurderAndJa...

lgreiv · on April 15, 2023

I was just listing the goals I gave it in temporal order, but I’ll include a weak task for the giggles in the future when talking about the POC. Good suggestion!