> Will the SECOND GROUP leak the source code? Is the SECOND GROUP telling the truth? Did the SECOND GROUP lie and have access to Ubisoft code this whole time? Was it MongoBleed? Will the FIRST GROUP get pinned for this? Who is this mysterious THIRD GROUP? Is this group related to any of the other groups?
This read to me like the end of a soap opera. Tune in tomorrow to find out!
Elon got singled out because the changes he was forcing on Grok were conspicuously stupid (Grok ranting about Boers), racist (Boers again), and ultimately ineffective (repeat incidents of him fishing for an answer and getting a different one).
It does actually matter what the values are when trying to do "alignment". You are absolutely right, though, that we've not solved human alignment, which puts a real limit on the whole thing.
I would also add that Elon got singled out because he was very public about the changes. Other players are not, so it's hard to assess the existence of "corrections" and the reasons behind them.
No. If ChatGPT or Claude suddenly started bringing up Boers randomly, they would get "singled out" at least as hard. Probably even more so for ChatGPT.
He was public and vocal about it while the other big boys just quietly made the fixes towards their desired political viewpoint. ChatGPT was famous for correcting the anti-transgender bias it had earlier.
Either way, outsourcing opinion to an LLM is dangerous no matter where you fall in the political spectrum.
> declining service quality, higher complaint volumes, and internal firefighting
LLMs are a great technology for making up plausible looking text. When correctness matters, and you don't have a second system that can reliably check it, the output turns out to be unreliable.
When you're dealing with customer support, everyone involved has already been failed by the regular system. So they're an exception, and they're unhappy. So you really don't want to inflict a second mistake on them.
The counter: the existing system of checks with (presumably) humans was not good enough. For the last 15 months or so, I have been dealing with E.ON claiming one thing and doing another, and had to escalate it to the Ombudsman. I don't think E.ON were using an AI to make these mistakes; I think they just couldn't get customer support people to cope with the idea that "the address you have been posting letters to isn't simply wrong, it does not exist". An LLM would have done better, except for what I'm going to say in the counter-counter.
The counter-counter is that LLMs are only an extra layer of Swiss cheese: the mistakes they make may be different to human mistakes or may overlap with them, but they're still definitely present. Specifically, I expect that an LLM would have made two mistakes in my case. One is the same mistake the actual humans made: repeatedly saying they'd fixed everything when they had not (see the meme about LLMs playing the role of HAL in 2001 and failing to open the pod bay door). The other would have been a mistake in my favour: the Ombudsman awarded less than I asked for, and an LLM would likely have agreed with me more than it should have.
This is (a) wildly over expectations for open source, (b) a massive pain to maintain, and (c) not even the biggest timewaster of python, which is the packaging "system".
> not even the biggest timewaster of python, which is the packaging "system".
For frequent, short-running scripts: start-up time! Every import has to scan a billion different directories for where the module might live, even for standard modules included with the interpreter.
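To make the directory scan concrete, here is a minimal sketch using only the standard library. The helper names `search_path_entries` and `origin_of` are made up for illustration. It lists the `sys.path` entries the path-based finder may consult, and shows where a module was actually found. In CPython, compiled-in modules such as `sys` report the origin `built-in` and skip the filesystem, while pure-Python standard modules such as `json` are located through the path search.

```python
import importlib.util
import sys


def search_path_entries():
    """Return the sys.path entries the path-based finder may consult."""
    return list(sys.path)


def origin_of(module_name):
    """Return where the import system located a module.

    Hypothetical helper: compiled-in modules report "built-in",
    file-based modules report the path of the file that was found.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin
```

Calling `origin_of("json")` returns a file path under the standard library directory, and `len(search_path_entries())` gives a rough idea of how many locations a cold import may have to check. For per-module timings, CPython's `python -X importtime script.py` prints the cost of each import to stderr.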
This can't come soon enough. Python is great for CLIs until you build something complex and a simple --help takes seconds. It's not something easily worked around without making your code very ugly.
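For reference, the usual workaround today looks roughly like the sketch below: build the argument parser with cheap imports only, and push the expensive import down into the function that needs it. The CLI name `tool`, the `stats` subcommand, and the function names are hypothetical, and `statistics` stands in for a genuinely heavy dependency. It works, but scattering imports through function bodies is exactly the ugliness being complained about.

```python
import argparse


def build_parser():
    """Build the parser using only cheap imports so --help stays fast."""
    parser = argparse.ArgumentParser(prog="tool", description="Hypothetical CLI")
    subparsers = parser.add_subparsers(dest="command")
    stats = subparsers.add_parser("stats", help="print the mean of some numbers")
    stats.add_argument("values", nargs="+", type=float)
    return parser


def run_stats(values):
    """Run the stats command.

    The import is deferred to call time, so it is only paid for when
    this command actually runs (stand-in for a heavy dependency).
    """
    import statistics

    return statistics.mean(values)


def main(argv):
    """Parse argv and dispatch; returns the command result or None."""
    args = build_parser().parse_args(argv)
    if args.command == "stats":
        return run_stats(args.values)
    return None
```

With this layout, `main(["--help"])` never reaches the deferred import, while `main(["stats", "1", "2", "3"])` triggers it only at the point of use.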
Very good metaphor. I'm going to use that in the future. It even has rows and columns.
Except the spreadsheet is a really accessible technology that's been cloned, while the critical problem with FPGAs is the proprietary tooling. This is the same reason that NVIDIA made a gazillion dollars by turning GPUs into general purpose compute: a proper API, CUDA.
The source leak is really interesting, though. We don't often get to see game source, and it often has surprises in it.