One thing I love about LangChain (and LlamaIndex) is that rather than having to find the latest relevant arXiv papers and then try to understand them by reading 20-30 pages each, I can just monitor these libraries' blogs and release notes to discover the latest relevant papers, then read their implementation code.
LLM-based autonomous agents remind me a lot of Leonard from the movie Memento. In the movie, Leonard is trying desperately to find the murderer of his wife, but he has one big problem - he can no longer form new memories. So, he needs to develop a system for storing new knowledge and then retrieving the relevant bits every time he formulates a plan for what to do next. Throughout the movie, Leonard makes a number of missteps because his system for recording and retrieving knowledge from this external memory system is imperfect. The movie is very well done.
The analogy does a really good job of conveying the fact that current LLMs have a horrendously inefficient method for achieving something resembling working + long-term memory: they need someone else to write code that (see the sketch after this list)
* stores in advance all possible pieces of text that might be relevant to future prompts;
* finds and retrieves at runtime the pieces of text that are supposedly most relevant to a prompt; and
* tacks on those supposedly most relevant pieces of text as a contextual preamble to each new prompt (query).
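Concretely, that pattern looks something like the sketch below. Here embed() is a toy hashed bag-of-words stand-in for a real embedding model, and the stored chunks are just placeholders:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embedding model: a hashed bag-of-words vector.
    v = np.zeros(256)
    for word in text.lower().split():
        v[hash(word) % 256] += 1.0
    return v

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# 1. Store in advance every piece of text that might be relevant later.
chunks = [
    "Leonard can no longer build new memories.",
    "Leonard records new knowledge in an external system.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Find and retrieve at runtime the chunks most similar to the prompt.
def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# 3. Tack the retrieved chunks onto the prompt as a contextual preamble.
def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```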
Current LLMs truly are Leonard-Like Models.
Surely there must be a better way... but it hasn't been discovered yet.
--
[a] I saw "Memento" years ago and still remember it vividly. It was written and directed by a young Christopher Nolan.
Exactly! The next step could be removing the (user-facing) prompt altogether, since (presumably) there should be enough data for this part to become proactive.
These agent approaches should really incorporate the recent neurology research around predictive processing, which appears to be a three-tiered process of "long-term prediction," "intermediate prediction," and "short-term prediction."
What I've seen is very heavily weighted towards chain of thought, using short-term prediction to satisfy the long-term goal, but explicitly stepping down from broad to intermediate to narrow (and re-evaluating as it goes) will probably work even better.
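As a rough sketch of what that stepping-down could look like (llm here is a hypothetical completion function, not any particular library's API):

```python
# Sketch of explicit broad -> intermediate -> narrow planning; llm(prompt)
# is a hypothetical completion function, not any particular library's API.
def plan_hierarchically(goal: str, llm) -> list[str]:
    # Long-term: a handful of broad milestones for the whole goal.
    milestones = llm(f"List 3 high-level milestones for: {goal}").splitlines()
    steps = []
    for milestone in milestones:
        # Intermediate: expand each milestone, re-evaluating against the goal.
        tasks = llm(f"Goal: {goal}\nMilestone: {milestone}\n"
                    "List the concrete sub-tasks.").splitlines()
        for task in tasks:
            # Short-term: the narrow, immediately executable next action.
            steps.append(llm(f"Write the single next action for: {task}"))
    return steps
```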
What's fascinating for me is how directly LangChain's Input > Thought > Action > Observation maps to the Observe > Orient > Decide > Act (OODA) loop that is used in many other industries: https://en.wikipedia.org/wiki/OODA_loop. The fact that we can do OODA loops without a human in the loop--by just passing a bunch of tools and adding memory to a LangChain--is simply mind-blowing.
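The mapping is easy to see in a stripped-down loop; llm and run_tool below are hypothetical stand-ins, not LangChain's actual interface:

```python
# A stripped-down agent loop; llm(prompt) and run_tool(name, arg) are
# hypothetical stand-ins, not LangChain's actual interface.
def agent_loop(task: str, llm, run_tool, max_steps: int = 10) -> str:
    history = f"Task: {task}"
    for _ in range(max_steps):
        thought = llm(history + "\nThought:")                                   # Orient
        action = llm(history + f"\nThought: {thought}\nAction (tool: input):")  # Decide
        tool, _, arg = action.partition(":")
        if tool.strip() == "finish":
            return arg.strip()
        observation = run_tool(tool.strip(), arg.strip())                       # Act
        history += (f"\nThought: {thought}\nAction: {action}"
                    f"\nObservation: {observation}")                            # Observe
    return history
```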
Having said that, I believe LangChain can benefit from incorporating learnings/failures from successful OODA Loops where human feedback continually improves the automation. I am not an expert, but it seems like this type of human feedback can be incorporated into LangChain by plugging in an RLHF framework. Not sure though if this RLHF framework plugs into Thought or Action.
A lot of those human lessons learned and failures probably are incorporated in the training data. I wonder if literally telling it to perform an OODA loop in those words might allow it to perform better than otherwise, using reasoning examples from the literature.
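For example, something as blunt as this system prompt might do it (the wording is my own guess, not a tested recipe):

```python
# An illustrative system prompt; the wording is my own guess, not a tested recipe.
OODA_SYSTEM_PROMPT = """You solve tasks by running an OODA loop.
At every step, explicitly write:
Observe: what new information do you have?
Orient: how does it change your picture of the situation?
Decide: which single action is best next?
Act: the action, written as `tool_name: input`.
Repeat until you can answer, then write `finish: <answer>`."""
```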
Well, that's something that I want to understand better. I know ChatGPT--inspired by InstructGPT--used RLHF to improve its models. Since OpenAI's model is not available, how does LangChain make sure it gets performance similar to ChatGPT's without using RLHF underneath?
The only other way I can think of is for them to take a core dependency on OpenAI's API which seems like a bad strategy for a promising open-source project like LangChain.
How do you see this evolving?
"For this discussion, we will use LangChain nomenclature, although it’s worth noting that this field is so new there’s no super standard terminology."
There is in fact a "super standard" terminology. This breathless sentence is denying that a computer science "field" existed prior to the LLM chatbots ("this field").
Research on agent architectures is evolving really fast. I read Park et al.'s paper on Generative Agents a couple of weeks back, which seems to have the potential to put LangChain's Memory on steroids: https://arxiv.org/abs/2304.03442
"To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior."
Can't wait for LangChain to incorporate a long-term, reflection-based memory system.
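If I read the paper right, retrieval scores each memory as a sum of recency (exponential decay), an LLM-assigned importance score, and embedding relevance. A minimal sketch, with embed() as a hypothetical embedding function and the decay factor and equal weights as I recall them from the paper:

```python
import time
import numpy as np

# Sketch of the paper's retrieval score; embed() is a hypothetical
# embedding function, and the weights/decay are as I recall them.
class Memory:
    def __init__(self, text: str, importance: float, embed):
        self.text = text
        self.importance = importance / 10.0      # LLM-assigned 1-10, normalized
        self.created = time.time()
        self.vector = embed(text)

def score(memory: Memory, query_vec: np.ndarray, decay: float = 0.995) -> float:
    hours = (time.time() - memory.created) / 3600.0
    recency = decay ** hours                     # exponential decay over hours
    relevance = float(memory.vector @ query_vec /
                      (np.linalg.norm(memory.vector) * np.linalg.norm(query_vec)))
    # The paper weights the three components equally, if I recall correctly.
    return recency + memory.importance + relevance
```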
Slightly related: what's the difference in use cases between llama-index and LangChain? I know that LangChain can do retrieval using embeddings, and I suspect that the node synthesis step is exclusive to llama-index, but I might be wrong.
Not sure of your interest/use case, but something that is designed for "documents in" -> "documents out" is here: https://github.com/marqo-ai/marqo. It does retrieval using embeddings, combines all the text splitting and inference operations, and can be easily deployed to production (it's designed for that, not just pip install). It works across images and allows for multi-vector representations.
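Basic usage looks roughly like this (from memory of the project's README; exact arguments may differ by version):

```python
import marqo

# Sketch based on the project's README; exact arguments may differ by version,
# and this assumes a marqo instance running locally on the default port.
mq = marqo.Client(url="http://localhost:8882")
mq.create_index("my-docs")
mq.index("my-docs").add_documents([
    {"Title": "Memory systems", "Description": "Storing and retrieving knowledge for LLMs."},
])
results = mq.index("my-docs").search("how do agents retrieve knowledge?")
print(results["hits"][0]["Title"])
```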
There's some overlap, but our primary use case is indexing+retrieval for LLMs, and going deep in that area. We're not really doing agents/chatbots/prompt management/etc.
I think LangChain is currently broader, containing those text-splitting and indexing tools as well as summarization and memory, 'Tool' integration, etc.
These agents, even in their embryonic form, are so powerful. The world is about to change. I'm convinced we're at the beginning of a massive exponential.