There was a recent paper that showed you can spread a model’s behavior through training on its outputs, even if those outputs don’t include obvious markers of the behavior. It’s totally plausible that training on Claude’s outputs subtly nudged GLM into mentioning “Claude”, even if those outputs don’t contain the literal token very often.
Given that other work shows that models often converge on similar internal representations, I'd not be surprised if there were close analogues of 'subliminal learning' that don't require shared-ancestor-base-model, just enough overlap in training material.
Further, "enough" training on another model's outputs – de facto 'distillation' – is likely to have effects similar to starting from a common base model, just "from the other direction".
(Finally: some of the more nationalistic-paranoid observers seem to think Chinese labs have relied on exfiltrated weights from US entities. I don't personally think that'd be a likely or necessary contributor to Z.ai & others' successes, but the mere appearance of this occasional "I am Claude" answer is sure to fuel further armchair belief in those theories.)
It is also possible that it learned from the internet that, when someone says "Hello" to something that identifies as an AI assistant, the most appropriate response is "Hello! I'm Claude, an AI assistant created by Anthropic. How can I help you today?".
Didn't think of that; that would be an extremely interesting finding. However, in that paper the transfer only happens between fine-tunes that share the same base model, so it would be a whole new thing for it to happen in this case.
Chatbots identify themselves very often in casual/non-technical chats AFAIK -- for example, when people ask them for their opinion on something, or about their past.
Re: sed, I'm under the impression that most chatbots are pretty pure-ML these days. There are definitely some hardcoded guardrails, but the huge flood of negative press early in ChatGPT's life about random weird mistakes can be pretty scary. Like, what if someone asks the model to list all the available models? Even with this replacement in place, wouldn't it describe itself as "GLM Opus"? Etc., etc.
It's like security (where absolute success is impossible) but you're allowed to just skip it instead of trying to pile Swiss cheese over all the problems! You can just hook up a validation model or two instead and tell them to keep things safe and enforce XYZ, and it'll do roughly as well with way less dev time needed.
After all, what's the risk in this case? OpenAI pretty credibly accused DeepSeek of training R1 by distilling O1[1], but it's my understanding that was more for marketplace PR ("they're only good because they copied us!") than any actual legal reason. Short of direct diplomatic involvement of the US government, top AI firms in China are understandably kinda immune.
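For what it's worth, the "GLM Opus" failure mode of a sed-style replacement is easy to sketch. This is purely illustrative: `naive_rebrand` is a made-up name, and I'm not claiming any lab actually post-processes output this way.

```python
import re


def naive_rebrand(text: str) -> str:
    """Hypothetical sed-style filter: blindly swap one assistant name for another."""
    # Equivalent in spirit to `sed 's/Claude/GLM/g'`, restricted to whole words.
    return re.sub(r"\bClaude\b", "GLM", text)


# The filter has no idea whether "Claude" is a self-reference or part of a product name.
print(naive_rebrand("The available models are Claude Opus and Claude Sonnet."))

# It also leaves every other identifying detail untouched.
print(naive_rebrand("Hello! I'm Claude, an AI assistant created by Anthropic."))
```

The first line comes out as a list of "GLM Opus" and "GLM Sonnet", and the second still says "created by Anthropic", which is the sort of hole you'd have to keep patching one rule at a time.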