My mind went to some kind of Q-learning combined with something like Monte Carlo Tree Search, using an A*-style heuristic to combine learned value estimates with short-horizon planning.
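Just to make the hunch concrete, here's a toy sketch of what "learned values as an A*-style heuristic" could mean: a best-first search where the heuristic slot is filled by a value estimate you'd normally get from Q-learning. Everything here is hypothetical, the `value_estimate` function is a hand-coded stand-in (Manhattan distance on a grid) for a learned V(s)/max-a Q(s,a), not anything from a real system.

```python
import heapq

GOAL = (4, 4)  # toy goal state on a 5x5 grid

def value_estimate(state):
    # Stand-in for a learned cost-to-go estimate (e.g. derived from Q-values).
    # Here: exact Manhattan distance, so the search behaves like textbook A*.
    return abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])

def neighbors(state):
    # Deterministic toy transitions: 4-connected moves inside the grid.
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < 5 and 0 <= ny < 5:
            yield (nx, ny)

def plan(start):
    # A*-shaped best-first search: priority f = cost so far + value_estimate.
    frontier = [(value_estimate(start), 0, start, [start])]
    seen = set()
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == GOAL:
            return path
        if state in seen:
            continue
        seen.add(state)
        for nxt in neighbors(state):
            if nxt not in seen:
                heapq.heappush(
                    frontier,
                    (g + 1 + value_estimate(nxt), g + 1, nxt, path + [nxt]),
                )
    return None
```

With an exact heuristic this is just A*; the speculative part is swapping in a learned estimate, which is where the Q-learning/MCTS flavor would come from.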
likewise. i can already imagine a* being useful for efficiently solving basic algebra and proofs.
it could form the basis of a generalized planning engine, and that engine could potentially be dangerous given the inherently competitive reasoning behind any minimax-style approach.
Ok so maybe nothing to do with A*, but actually a way for GPT-powered models or agents to learn through automated reinforcement learning. Or something.
I wonder if DeepMind is working on something similar also.
If your hunch is right, this could lead to the type of self-improvement that scares people.