Going to reply to myself here to say that it does seem like there is definitely information from December/November 2021 available, like clear-cut facts.
I'm just curious why OpenAI didn't make an announcement or what's the deal here, wouldn't this warrant retraining the entire model?
They probably won't share how they did it, but there's been a lot of research over the past 6 months showing how you don't have to retrain the entire model to add in new sources. I know nothing about this stuff, but my limited understanding from blog posts is it's easier than anyone had thought to add in new data to a pre-existing model.
Do you by any chance have any of these blog posts available for my own reference? If you don't, maybe someone else does — I don't recall seeing them, but it sounds interesting.
I think there was a paper from Google showing that if you included about 5% of your original dataset together with the new data during fine-tuning, then catastrophic forgetting didn't occur. Perhaps it's that simple.
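If it really is that simple, the recipe would look roughly like this. To be clear, this is a purely illustrative sketch of the "replay"/rehearsal idea described above — the function and data names are made up, and this isn't anything OpenAI has confirmed doing:

```python
import random

def build_finetune_set(new_data, original_data, replay_fraction=0.05, seed=0):
    """Mix a small slice of the original training data into the
    fine-tuning set, so the model keeps 'rehearsing' old knowledge
    while learning the new data (the replay trick against
    catastrophic forgetting). Names here are illustrative only.
    """
    rng = random.Random(seed)
    # Take ~5% of the original corpus as replay examples.
    n_replay = max(1, int(len(original_data) * replay_fraction))
    replay = rng.sample(original_data, n_replay)
    # Fine-tune on the new data plus the replay slice, shuffled together.
    mixed = list(new_data) + replay
    rng.shuffle(mixed)
    return mixed
```

So instead of retraining from scratch, you'd just run a (much cheaper) fine-tuning pass over this mixed set.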