Given 2 datasets: 1. A special code dataset 2. A bunch of "unrelated" books My u...

Given 2 datasets:

1. A special code dataset 2. A bunch of "unrelated" books

My understanding is that the model trained on just the first will never beat the model trained on both. Bloomberg model is my favorite example of this.

If you can squirell away special data then that special data plus everything else will beat the any other models. But that's basically what openai, google, and anthropic are all currently doing.