LLMs can already browse the web with the help of auxiliary systems, consuming current content. AFAIK GPT-4 has a browser plugin for that, as well as several other plugins for retrieving specific types of (current) information.
A well-trained language model only needs to be retrained when the language changes. New facts can be fed to the model by simply telling it about them, which is how those "plugins" work behind the scenes.
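To make that concrete, here is a minimal sketch of what "telling the model about new facts" looks like in practice: the retrieved facts are simply pasted into the prompt of a frozen model. fetch_current_facts and ask_llm are hypothetical stand-ins for illustration, not the actual plugin API.

    # Sketch of a retrieval "plugin": no retraining, just prompt injection.
    # fetch_current_facts() and ask_llm() are hypothetical placeholders.

    def fetch_current_facts(query: str) -> list[str]:
        # A real plugin would hit a search engine, docs, SO, etc. here.
        return [
            "Fact A retrieved from the live web relevant to: " + query,
            "Fact B retrieved from the live web relevant to: " + query,
        ]

    def ask_llm(prompt: str) -> str:
        # Placeholder for whatever model/API is actually called.
        return "<model answer conditioned on the prompt above>"

    def answer_with_plugin(question: str) -> str:
        facts = fetch_current_facts(question)
        context = "\n".join("- " + fact for fact in facts)
        prompt = (
            "Use the facts below, which may be newer than your training data.\n"
            "Facts:\n" + context + "\n\n"
            "Question: " + question + "\n"
        )
        return ask_llm(prompt)

    print(answer_with_plugin("What changed in the latest release?"))

The model's weights never change; only the prompt does, which is why no retraining is needed for new facts.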
I'd like to know this too. It's conceivable that most new web content will be output from an LLM, so it would just be feeding on itself. That exact outcome will never quite happen, since people will transform and validate the output. Still, I see sense in the general worry that people "won't bother" if LLMs can do it all more easily than a person. Conversely, I also see the potential for it to bootstrap even greater human expression, learning and applied thought.
That still doesn’t answer my question about human-to-human discovery. Say a site like SO disappears because it’s not economically viable to keep it running. Then having something that scrapes the internet doesn’t seem to help, does it? Like a quirk in hardware that a human notices, documents, and fixes. Something low level.
How is that situation different from how it was before AI? If a human documents something and then the document becomes inaccessible, the information is gone, regardless of whether the "researcher" is an LLM or a human.
The problem I think I’m not conveying correctly is that human-to-human discovery is made easier by communication hubs like SO. If something like an LLM reduces the economic viability of running a site like SO, is there a gap or period where we lose that easy human-to-human communication? Eventually new human-generated content has to enter the loop until the AI is at a level where it can replace humans, but if less and less human-generated content is viable, then we have a clear chicken-and-egg problem for new training data.
Is it specifically human-to-human contact that's required to generate new insights or document things? I see no major distinction between that and human-LLM contact (aside from awe), or, for that matter, an intelligence simply pursuing a goal by itself.
Again, I could be a massive idiot, but as I see it? For certain things, yes, 100%. When it comes to low-level things, an LLM currently has no way to confirm the behavior of something that doesn't give feedback. The FairPlay example I gave requires you to actually play a video with DRM. Another example I can think of is debouncing buttons on a microcontroller-based device. It's a super basic example, but if you are working on an IoT device and need to know when a pin has to be debounced, that can sometimes only be determined with an oscilloscope. However, an open-source device could be policed with human-to-human interaction; in fact, I did exactly that, making a PR a long, long time ago for a crypto wallet that was open sourced. These are all anecdotal examples, and I still see a need for a human to enter the loop at some level, but in a disaster scenario where it becomes unviable to run sites like SO, you slowly lose the hubs for that knowledge base.
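A minimal sketch of what that debounce logic might look like, assuming a hypothetical read_pin() and a guessed 20 ms window; the window itself is exactly the thing you would have to measure on an oscilloscope, not something the code can know:

    # Software debounce sketch for an active-low button on an IoT device.
    # read_pin() and the 20 ms window are assumptions for illustration only.

    import time

    DEBOUNCE_S = 0.020  # assumed 20 ms settle time; hardware-dependent

    def read_pin() -> int:
        # Hypothetical stand-in for a GPIO read (0 = pressed, 1 = released).
        return 0

    def wait_for_stable_press(poll_interval: float = 0.001) -> None:
        # Block until the pin reads low continuously for DEBOUNCE_S seconds.
        stable_since = None
        while True:
            if read_pin() == 0:
                if stable_since is None:
                    stable_since = time.monotonic()
                elif time.monotonic() - stable_since >= DEBOUNCE_S:
                    return  # low long enough to count as a real press
            else:
                stable_since = None  # bounce detected: reset the timer
            time.sleep(poll_interval)

The code is trivial; the hard part is picking DEBOUNCE_S, which depends on the physical switch and is the kind of fact that only gets written down when a human observes the hardware and shares it somewhere.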
I hadn't considered that, but it has already begun with plugins and AutoGPT. LLMs can test their hypotheses. Next, that might be possible via robotics... so sci-fi.