For that test to work, it would have to have been trained on the papers without being aware of the retractions. Otherwise it's limited to whatever papers it gets back from a search engine query, which likely won't include any illegitimate papers that haven't yet been retracted.
I once worked at a crypto company that outsourced some of its marketing to a content marketing firm. A piece that firm submitted to us contained a link to an "academic" article about global poverty with a totally garbled abstract and absolutely no content whatsoever. I don't know how they found it; when I search Google Scholar for a subject, the results usually aren't so blatantly FUBAR. I was hoping Claude could help me find something like that for a point I was making in a blog post about BS in scientific literature (https://regressiontothemeat.substack.com/p/how-i-read-studie...).
The articles it provided in which the AI prompts were left in the text were definitely in the right ballpark. Still, I do wonder whether chatbots mean that, going forward, we'll see fewer errors in the "WTF are you even talking about" category, which, I must say, were typically funnier and more interesting than the generic blather of "what a great point. It's not X -- it's Y."