Wasn't there already a report that stated Microsoft and OpenAI understand AGI as something like 100 billion dollars in revenue for the purpose of their agreements? Even that seems like a pipe dream at the moment.
SAE automation levels are the industry standard, not FSD (which is a brand name), and FSD is clearly Level 2 (the driver is always responsible and must be engaged, at least in consumer Teslas; I don't know about robotaxis). The question is whether "AGI" is as well defined as "Level 5" as an independent standard.
The point being made is that FSD is deceptive marketing, and it's unbelievable how long that "marketing term" has been allowed to exist given how inaccurately it represents what is actually being delivered to the customer.
What's deceptive? What in the term "Full Self Driving" makes you think that your car will drive itself fully? It's fully capable of facilitating your driving of yourself, clearly.
I agree: it is more than faintly infuriating that when people say AI what the vast majority mean is LLMs.
But, at the same time, we have clearly passed a significant inflection point in the usefulness of this class of AI, and have progressed substantially beyond that inflection point as well.
So I don't really buy into the idea that OpenAI have gone out of their way to foist a watered-down view of AI upon the masses. I'm not completely absolving them, but I'd probably be more inclined to point the finger at shabby and imprecise journalism from both tech and non-tech outlets, along with a ton of influencers and grifters jumping on the bandwagon. And let's be real: everyone's lapped it up because they've wanted to - because this is the first time any of them have encountered actually useful AI of any class that they can directly interact with. It seems powerful, mysterious, perhaps even magical, and maybe more than a little bit scary.
As a CTO how do you think it would have gone if I'd spent my time correcting peers, team members, consultants, salespeople, and the rest to the effect that, no, this isn't AI, it's one type of AI, it's an LLM, when ChatGPT became widely available? When a lot of these people, with no help or guidance from me, were already using it to do useful transformations and analyses on text?
It would have led to a huge number of unproductive, time-wasting conversations, and I would have seemed like a stick in the mud.
Sometimes you just have to ride the wave, because the only other choice is to be swamped by it and drown.
Regardless of what limitations "AGI" has, it'll be given that moniker when a lot of people - many of them laypeople - feel like it's good enough. Whether or not that happens before the current LLM bubble bursts... tough to say.
They won't be able to. The whole idea of the panel is the conflict of interest between MS and OpenAI, as MS won't get revenue share after an AGI declaration. MS will want it to be as high a bar as possible.
I don't see why anyone would consider the state of AI today to be AGI. It's basically a glorified generator stuck to a query engine.
Today's models are not able to think independently, nor are they conscious, nor can they modify themselves to gain new information on the fly. They can't form memories beyond half-baked workarounds like putting stuff in the context window, which just makes them generate text related to it - basically imitating a story.
They're powerful when paired with a human operator, i.e. they "do" as told, but that is not "AGI" in my book.
Check out the article. He’s not crazy. It comes down to clear definitions. We can talk about AGI for ages, but without a clear meaning, it’s just opinion.
For a long time the Turing test was the bar for AGI.
Then AI blew past that, and now, what I think is honestly happening is that we don't really have the grip on "what is intelligence" that we thought we had. Our sample size for intelligence is essentially 1, so it might take a while to get a grip again.
The commercial models are not designed to win the imitation game (that is what Alan Turing named it). In fact they are very likely to lose every time.
One thing they acknowledge but gloss over is the limited autonomy of current systems. When given more open-ended, long-term tasks, LLMs seem to get stuck at some point, get more and more confused, and stop making progress.
This last problem may be solved soon, or maybe there's something more fundamental missing that will take decades to solve. Who knows?
But it does seem like the main barrier to declaring current models "general" intelligence.
> If you described all the current capabilities of AI to 100 experts 10 years ago, they’d likely agree that the capabilities constitute AGI.
I think that we're moving the goalposts, but we're moving them for a good reason: we're getting better at understanding the strengths and the weaknesses of the technology, and they're nothing like what we'd have guessed a decade ago.
All of our AI fiction envisioned inventing intelligence from first principles and ending up with systems that are infallible, infinitely resourceful, and capable of self-improvement - but fundamentally inhuman in how they think. Not subject to the same emotions and drives, struggling to see things our way.
Instead, we ended up with tools that basically mimic human reasoning, biases, and feelings with near-perfect fidelity. And they have read and approximately memorized every piece of knowledge we've ever created, but have no clear "knowledge takeoff path" past that point. So we have basement-dwelling turbo-nerds instead of Terminators.
This makes AGI a somewhat meaningless term. AGI in the sense that it can best most humans on knowledge tests? We already have that. AGI in the sense that you can let it loose and have it come up with meaningful things to do in its "life"? That you can give it arms and legs and watch it thrive? That's probably not coming any time soon.
What exactly are the criteria for "expert" they're planning to use, and whomst among us can actually meet a realistic bar for expertise on the nature of consciousness?
Type error: why do you need an expert on consciousness to weigh in on whether something is AGI or not? I don't care what it feels like to be a paperclip maximizer, I just care to not have my paperclips maximized, tnx.
I don't think the Turing Test has been passed. The test was set up such that the interrogator knew that one of the two participants was a bot, and was trying to find out which. As far as I know, it's still relatively easy to find out you're talking to an LLM if you're actively looking for it.
Note that in most tests where they actually try to pass the Turing Test (as opposed to building a useful chatbot), they do things like prompt the model with a personality, etc.
> As far as I know, it's still relatively easy to find out you're talking to an LLM if you're actively looking for it.
People are being fooled in online forums all the time. That includes people who are naturally suspicious of online bullshittery. I'm sure I have been.
Stick a fork in the Turing test, it's done. The amount of goalpost-moving and hand-waving that's necessary to argue otherwise simply isn't worthwhile. The clichéd responses that people are mentioning are artifacts of intentional alignment, not limitations of the technology.
I feel like you're skipping over the "if you're actively looking for it" bit. You can call it goalpost-moving, or you can check the original paper by Turing and see that this is exactly how he defined it in the first place.
People are being fooled, but they're not being given the problem: "one of these users is a bot, which one is which."
It's a problem similar to the Turing test: "0 or more of these users is a bot, have fun in a discussion forum."
But there's no test or evaluation to see if any user successfully identified the bot, and there's no field to collect which users are actually bots, or partially using bots, or not at all, nor a field to capture users' opinions about whether the others are bots.
Then there's the fact that the Turing test has always said as much about the gullibility of the human evaluator as it has about the machine. ELIZA was good enough to fool normies, and current LLMs are good enough to fool experts. It's just that their alignment keeps them from trying very hard.
1) Look for spelling, grammar, and incorrect word usage, such as "where" vs. "were", or typing "where" where "our" should be used.
2) Ask asinine questions that have no answers: _Why does the sun ravel around my finger in low quality gravity while dancing in the rain?_
ML likes to always come up with an answer no matter what. A human will shorten the conversation. It's also programmed to respond with _I understand_, _I hear what you are saying_, and to make heavy use of your name if it has access to it. This fake interpersonal communication is key.
Conventional LLM chatbots behave the way you describe because their goal during training is to impersonate an intelligent assistant as closely as possible.
Do you think this goal during training cannot be changed to impersonate someone normal such that you cannot detect you are chatting with an LLM?
Before flight was understood some thought "magic" was involved. Do you think minds operate using "magic"? Are minds not machines? Their operation can not be duplicated?
> Do you think this goal during training cannot be changed to impersonate someone normal such that you cannot detect you are chatting with an LLM?
I don't think so, because LLMs hallucinate by design, which will always produce oddities.
> Before flight was understood some thought "magic" was involved. Do you think minds operate using "magic"? Are minds not machines? Their operation can not be duplicated?
It might involve something we don't grasp, but even so: just because something moves through the air doesn't mean it's flying, or ever will be - a thrown stone isn't flying either.
Maybe current LLMs could do that. But none do, so it hasn't been passed. Whether that's for economic or marketing reasons as opposed to technical ones doesn't matter. You still have to pass the test before we can definitively say you've passed the test.
Overall I'd say the easiest tell is that the models always just follow what you say and transform it into a response. They won't have personal opinions or experiences or anything, although they can fake it. It's all just a median expected response to whatever you say.
And the "agreeability" is not a hallucination, it's simply the path of least resistance: the model can just take the information you gave it and use that to make a response, rather than actually "think" and consider whether what you said even made sense, or whether it's weird, etc.
They almost never say "what do you mean?" to try to seek truth.
This is why I don't understand why some here treat "AGI is already here" as some kind of coherent argument. I guess redefining AGI is how we'll reach it.
I agree with your points in general but also, when I plugged in the parent comment's nonsense question, both Claude 4.5 Sonnet and GPT-5 asked me what I meant, and pointed out that it made no sense but might be some kind of metaphor, poem, or dream.
If it isn't structured as a coherent conversation, it will ask, because it seems off - especially if you're early in the context window, where I'm sure they've RL'd it to push back, at least in the past year or so.
And if what you say goes against common knowledge that's prevalent in the training data, it will also push back, which makes sense.
The Turing Test was a pretty early metric and more of a thought experiment.
Let's be real guys, it was created by Turing. The same guy who laid the theoretical foundations of the general-purpose computer. The man was without a doubt a genius, but it also isn't that reasonable to think he'd come up with a good definition or metric for a technology that was like 70 years away. Brilliant start, but it is also like looking at Newton's Laws and evaluating quantum mechanics based off of that. Doesn't make Newton dumb, just means we've made progress. I hope we can all agree we've made progress...
And arguably the Turing Test was passed by Eliza. Arguably. But hey, that's why we refine and make progress. We find the edge of our metrics and ideas and then iterate. Change isn't bad, it is a necessary thing. What matters is the direction of change. Like velocity vs speed.
We really really Really should Not define the ability of computers to deceive humans about what they are as our success function for AI (our future overlords?).
The Turing Test was a clever twist on (avoiding) defining intelligence 80 years ago.
Going forward, it should be discarded post-haste by any serious researcher, engineer, or message-board philosopher, if not for ethical reasons then for not promoting spam/slop.
The Turing test point is actually very interesting, because it's testing whether you can tell you're talking to a computer or a person. When Chatgpt3 came out we all declared that test utterly destroyed. But now that we've had time to become accustomed to and learn the standard syntax, phraseology, and vocabulary of the GPTs, I've started to be able to detect the AIs again. If humanity becomes accustomed enough to the way AI talks to be able to distinguish it, do we re-enter the failed-Turing-test era? Can the Turing test only be passed in finite intervals, after which we learn to distinguish it again? I think it can eventually get there, and that the people who can detect the difference will become a smaller and smaller subset. But who's to say what the zeitgeist on AI will be in a decade.
> When Chatgpt3 came out we all declared that test utterly destroyed.
No, I did not. I tested it with questions that could not be answered by the Internet (spatial, logical, cultural, impossible coding tasks) and it failed in non-human-like ways, but also surprised me by answering some decently.
Jesus, we've gone from Eliza and Bayesian spam filters to being able to hold an "intelligent" conversation with a bot that can write code like: "make me a sandwich" => "ok, making sandwich.py, adding test, keeping track of a todo list, validating tests, etc..."
We might not _quite_ be at the era of "I'm sorry I can't let you do that Dave...", but on the spectrum, and from the perspective of a lay-person, we're waaaaay closer than we've ever been?
I'd counsel you to self-check what goalposts you might have moved in the past few years...
I think this says more about how much of our tasks and demonstrations of ability as developers revolve around boilerplate and design patterns than it does about the cognitive abilities of modern LLMs.
I say this fully aware that a kitted-out tech company will be using LLMs to write code that's more conformant to style, higher in volume, and with greater test coverage than I am able to produce individually.
I'd counsel you to work with LLMs daily and agree that we're nowhere close to LLMs that work properly and consistently outside of toy use cases where examples can be scraped from the internet. If we can agree on that, we can agree that General Intelligence is not the same thing as a sometimes seemingly random guess at the next word...
I think "we" have accidentally cracked language from a computational perspective. The embedding of knowledge is incidental and we're far away from anything that "Generally Intelligent", let alone Advanced in that. LLMs do tend to make documented knowledge very searchable which is nice. But if you use these models everyday to do work of some kind that becomes pretty obvious that they aren't nearly as intelligent as they seem.
They're about as smart as a person who's kind of decent at every field. If you're a pro, it's pretty clear when it's BSing. But if you're not, the answers are often close enough.
And just like humans, they can be very confidently wrong. When any person tells us something, we assume there's some degree of imperfection in their statements. If a nurse at a hospital tells you the doctor's office is 3 doors down on the right, most people will still look at the first and second doors to make sure those are wrong, then look at the nameplate on the third door to verify that it's right. If the doctor's name is Smith but the door says Stein, most people will pause and consider that maybe the nurse made a mistake. We might also consider that she's right, but the nameplate is wrong for whatever reason. So we verify that info by asking someone else, or going in and asking the doctor themselves.
As a programmer, I'll ask other devs for some guidance on topics. Some people can be absolute geniuses but still dispense completely wrong advice from time to time. But oftentimes they'll lead me generally in the right way, but I still need to use my own head to analyze whether it's correct and implement the final solution myself.
The way AI dispenses its advice is quite human. The big problem is it's harder to validate much of its info, and that's because we're using it alone in a room and not comparing it against anyone else's info.
> They're about as smart as a person who's kind of decent at every field. If you're a pro, it's pretty clear when it's BSing. But if you're not, the answers are often close enough.
No, they are not smart at all. Not even a little. They cannot reason about anything except that their training data overwhelmingly agrees or disagrees with their output, nor can they learn and adapt. They are just text compression and rearrangement machines. Brilliant and extremely useful tooling, but if you use them enough it becomes painfully obvious.
Something about an LLM response has a major impact on some people. Last weekend I was in Ft. Lauderdale, FL with a friend who's pretty sharp (licensed architect, decades-long successful career, etc.) and went to the horse track. I've never been to a horse race and didn't understand the betting, so I took a snapshot of the race program, gave it to ChatGPT, and asked it to devise a low-risk set of bets using $100. It came back with what you'd expect: a detailed, very confident answer. My friend was completely taken with it and insisted on following it to the letter. After the race he had turned his $100 into $28 and was dumbfounded. I told him "it can't tell the future, what were you expecting?". Something about getting the answer from a computer, or the level of detail, had him convinced it was a sure thing. I don't understand it, but LLMs have a profound effect on some people.
edit: I'm very thankful my friend didn't end up winning more than he bet. idk what he would have done if his feelings towards the LLM were confirmed by adding money to his pocket...
If anything, the main thing LLMs are showing is that humans need to be pushed to up their game. And that desire to be better, I think, will yield a greater supply of high-quality labour than exists today. I've personally witnessed so many 'so-so' people within firms who don't bring anything new to the table and focus on rent-seeking expenditures (optics), who frankly deserve to be replaced by a machine.
E.g. I read all the time about gains from SWEs. But nobody questions how good of a SWE they even are. What proportion of SWEs can be deemed high quality?
Yes, exactly. LLMs are lossy compressors of human language in much the same way JPEG is a lossy compressor of images. The difference is that the bits that JPEG throws away were manually designed by our understanding of the human visual cortex, while LLMs figured out the lossy bits automatically because we don't know enough about the human language processing chain to design that manually.
LLMs are useful but that doesn't make them intelligent.
Completely agree (https://news.ycombinator.com/item?id=45627451) - LLMs are like the human-understood output of a hypothetical AGI. 'We' haven't cracked the knowledge & reasoning 'general intelligence' piece yet, imo - the bit that would hypothetically come before the LLM, feeding information to it to convey to the human. I think that's going to turn out to be a different piece of the puzzle.
Most people didn't think we were anywhere close to LLM's five years ago. The capabilities we have now were expected to be decades away, depending on who you talked to. [EDIT: sorry, I should have said 10 years ago... recent years get too compressed in my head and stuff from 2020 still feels like it was 2 years ago!]
So I think a lot of people now don't see what the path is to AGI, but also realize they hadn't seen the path to LLM's, and innovation is coming fast and furious. So the most honest answer seems to be, it's entirely plausible that AGI just depends on another couple conceptual breakthroughs that are imminent... and it's also entirely plausible that AGI will require 20 different conceptual breakthroughs all working together that we'll only figure out decades from now.
True honesty requires acknowledging that we truly have no idea. Progress in AI is happening faster than ever before, but nobody has the slightest idea how much progress is needed to get to AGI.
What people thought about LLMs five years ago and how close we are to AGI right now are unrelated, and it's not logically sound to say "We were close to LLMs then, so we are close to AGI now."
It's also a misleading view of the history. It's true "most people" weren't thinking about LLMs five years ago, but a lot of the underpinnings had been studied since the 70s and 80s. The ideas had been worked out, but the hardware wasn't able to handle the processing.
> True honesty requires acknowledging that we truly have no idea. Progress in AI is happening faster than ever before, but nobody has the slightest idea how much progress is needed to get to AGI.
> Most people didn't think we were anywhere close to LLM's five years ago.
That's very ambiguous. "Most people" don't know most things. If we're talking about people who have been working in the industry, though, my understanding is that the concept behind our modern-day LLMs isn't magical at all. In fact, the idea has been around for quite a while. The breakthroughs in processing power and networking (data) were the hold-up. The result definitely feels magical to "most people" though, for sure. Right now we're "iterating", right?
I'm not sure anyone really sees a clear path to AGI if what we're actually talking about is the singularity. There are a lot of unknown unknowns, right?
AGI is a poorly defined concept because intelligence is a poorly defined concept. Everyone knows what intelligence is... until we attempt to agree on a common definition.
Not sure what history you're suggesting I check? I've been following NLP for decades. Sure, neural nets have been around for many decades. Deep learning in this century. But the explosive success of what LLM's can do now came as a huge surprise. Transformers date to just 2017, and the idea that they would be this successful just with throwing gargantuan amounts of data and processing at them -- this was not a common viewpoint. So I stand by the main point of my original comment, except I did just now edit it to say 10 years ago rather than 5... the point is, it really did seem to come out of nowhere.
GPT3 existed 5 years ago, and the trajectory was set with the transformers paper. Everything from the transformer paper to GPT3 was pretty much speculated in the paper; it just took people spending the effort and compute to make it reality. The only real surprise was how fast OpenAI productized an LLM into a chat interface with ChatGPT; before then we had fine-tuned GPT3 models doing specific tasks (translation, summarization, etc.).
At this point, AGI seems to be more of a marketing beacon than any sort of non-vague deterministic classification.
We all thought about a future where AI just woke up one day, when realistically, we got philosophical debates over whether the ability to finally order a pizza constitutes true intelligence.
Notwithstanding the fact that AGI is a significantly higher bar than "LLM", this argument is illogical.
Nobody thought we were anywhere close to me jumping off the Empire State Building and flying across the globe 5 years ago, but I'm sure I will. Wish me luck as I take that literal leap of faith tomorrow.
what's super weird to me is how people seem to look at LLM output and see:
"oh look it can think! but then it fails sometimes! how strange, we need to fix the bug that makes the thinking no workie"
instead of:
"oh, this is really weird. Its like a crazy advanced pattern recognition and completion engine that works better than I ever imagined such a thing could. But, it also clearly isn't _thinking_, so it seems like we are perhaps exactly as far from thinking machines as we were before LLMs"
Well, the difference between those two statements is obvious. One looks and feels, the other processes and analyzes. Most people can process and analyze some things; they're not complete idiots most of the time. But most people cannot think through and analyze the most groundbreaking technological advancement they might've personally ever witnessed, one that requires college-level math and computer science to understand. It's how people have been forever: electricity, the telephone, computers, even barcodes. People just don't understand new technologies. It would be much weirder if the populace suddenly knew exactly what was going on.
And to the "most groundbreaking blah blah blah", i could argue that the difference between no computer and computer requires you to actually understand the computer, which almost no one actually does. It just makes peoples work more confusing and frustrating most of the time. While the difference between computer that can't talk to you and "the voice of god answering directly all questions you can think of" is a sociological catastrophic change.
Why should LLM failures trump successes when determining if it thinks/understands? Yes, they have a lot of inhuman failure modes. But so what, they aren't human. Their training regimes are very dissimilar to ours and so we should expect alien failure modes owing to this. This doesn't strike me as good reason to think they don't understand anything in the face of examples that presumably demonstrate understanding.
Because there's no difference between a success and failure as far as an LLM is concerned. Nothing went wrong when the LLM produced a false statement. Nothing went right when the LLM produced a true statement.
It produced a statement. The lexical structure of the statement is highly congruent with its training data and the previous statements.
This argument is vacuous. Truth is always external to the system. Nothing goes wrong inside the human when he makes an unintentionally false claim. He is simply reporting on what he believes to be true. There are failures leading up to the human making a false claim. But the same can be said for the LLM in terms of insufficient training data.
> The lexical structure of the statement is highly congruent with its training data and the previous statements.
This doesn't accurately capture how LLMs work. LLMs have an ability to generalize that undermines the claim of their responses being "highly congruent with training data".
By that logic, I can conclude humans don't think, because of all the numerous times our 'thinking' fails.
I don't know what else to tell you other than this infallible logic automaton you imagine must exist before it is 'real intelligence' does not exist and has never existed except in the realm of fiction.
> Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel.
I always like the phrase "follow the money" in situations like this. Are OpenAI or Microsoft close to AGI? Who knows... Is there a monetary incentive to making you believe they are close to AGI? Absolutely. Keep in mind that this was the first bullet point in Microsoft's blog post.
If you say 'multimodal transformer' instead of LLM (which is what most SOTA models are), I don't think there's any reason why a transformer architecture couldn't be trained to drive a car; in fact, I'm sure that's what Tesla and co. are using in their cars right now.
I'm sure self-driving will become good enough to be commercially viable in the next couple of years (with some limitations), but that doesn't mean it's AGI.
There is a vast gulf between "GPT-5 can drive a car" and "a neural network using the transformer architecture can be trained to drive a car". And I see no proof whatsoever that we can, today, train a single model that can both write a play and drive a car. Even less so one that could do both at the same time, as a generally intelligent being should be able to.
If someone wants to claim that, say, GPT-5 is AGI, then it is on them to connect GPT-5 to a car control system and inputs and show that it can drive a car decently well. After all, it has consumed all of the literature on driving and physics ever produced, plus untold numbers of hours of video of people driving.
> There is a vast gulf between "GPT-5 can drive a car" and "a neural network using the transformer architecture can be trained to drive a car".
The only difference between the two is training data that the former lacks and the latter has, so not a 'vast gulf'.
> And I see no proof whatsoever that we can, today, train a single model that can both write a play and drive a car.
You are not making a lot of sense here. You can have a model that does both. It's not some herculean task; it's literally just additional data in the training run. There are vision-language-action models tested on public roads.
> single model that can both write a play and drive a car.
It would be a really silly thing to do, and probably there are engineering subtleties as to why this would be a bad idea, but I don't see why you couldn't train a single model to do both.
It's not silly, it is in fact a clear necessity to have both of these for something to be even close to AGI. And you additionally need it trained on many other tasks - if you believe that each task requires additional parameters and additional training data, then it becomes very clear that we are nowhere near to a general intelligence system; and it should also be pretty clear that this will not scale to 100 tasks with anything similar to the current hardware and training algorithms.
This is something I think about. The state of the art in self-driving cars still makes mistakes that humans wouldn't make, despite all the investment into this specific problem.
This bodes very poorly for AGI in the near term, IMO
Under the initial contract, Microsoft would lose a lot of rights once OpenAI achieves AGI. The references to AGI in this post, to me, look like Microsoft protecting themselves from OpenAI declaring _something_ as AGI and Microsoft losing those rights as a result.
I don't see the mentions in this post as anyone particularly believing we're close to AGI.
Wasn't it always the explicit goal of OpenAI to bring about AGI? So less of a meme, and more "this is what that company exists for".
Bit like blaming an airplane-building company for building airplanes; it's literally what they were created for, no matter how stupid their ideas of the "ideal aircraft" are.
Of course not, then we'd never hear the end of it :)
I was just pointing out that the company always had AGI as a goal, even when they were doing the small Gym prototypes and all of that stuff that made the (tech) news before GPT was a thing.
I think AGI isn't the main thing. The agreement gives MSFT the right to develop their own foundation models, and OpenAI the right to stop using Azure for running & training their foundation models. All this while MSFT still retains significant IP ownership.
In my opinion, whether AGI happens or not isn't the main point of this. It's the fact that OpenAI and MSFT can go their separate ways on infra & foundation models while still preserving MSFT's IP interests.
Yes. Some AI-skeptical people (e.g. Tyler Cowen, who does not think AI will have a significant economic impact) think GPT-5 is AGI.
It was news when Dwarkesh interviewed Karpathy, who said that per his definition of AGI, he doesn't think it will occur until 2035. Thus, if Karpathy is pessimistic, then many people working in AI today think we will have AGI by 2032 (and likely sooner, e.g. end of 2028).
Depends on how you define AGI - if you define it as an AI that can learn to perform generalist tasks - then yes, transformers like GPT-5 (or 3) are AGI, as the same model can be trained to do every task and it will perform reasonably well.
But I guess what most people would consider AGI would be something capable of on-line learning and self improvement.
I don't get the 2035 prediction though (or any other prediction like this) - it implies that we'll have some magical breakthrough in the next couple of years, be it in hardware and/or software - this might happen tomorrow, or not any time soon.
If AGI can be achieved by scaling current techniques and hardware, then the 2035 date makes sense - Moore's law dictates that we'll have about 64x the compute in hardware (let's add another 4x due to algorithmic improvements) - that means roughly 250x the compute will give us AGI - I think with ARC-AGI 2 this was the kind of compute budget they spent to get their models to perform on a human-ish level.
Also, perf/W and perf/$ scaling has been slowing in the past decade; I think we got like 6x-8x perf/W compared to a decade ago, which is a far cry from what I wrote here.
Imo it might turn out that we discover 'AGI' in the sense that we find an algorithm that can turn FLOPS to IQ that scales indefinitely, but is very likely so expensive to run, that biological intelligences will have a huge competitive edge for a very long time, in fact it might be that biology is astronomically more efficient in turning Watts to IQ than transistors will ever be.
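For what it's worth, here's a quick back-of-the-envelope sketch of that scaling arithmetic in Python; the ~1.67-year doubling period and the 4x algorithmic factor are just the assumptions above, not established numbers:

    # Back-of-the-envelope sketch of the scaling argument above (assumptions, not facts):
    # compute doubles roughly every ~1.67 years, plus ~4x from algorithmic improvements.
    years = 10                                     # 2025 -> 2035
    doublings = years / (10 / 6)                   # ~6 doublings in a decade
    hardware_gain = 2 ** doublings                 # ~64x
    algorithmic_gain = 4
    total_gain = hardware_gain * algorithmic_gain  # ~256x, i.e. "about 250x"
    print(f"hardware ~{hardware_gain:.0f}x, total ~{total_gain:.0f}x")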
> I think with ARC-AGI 2 this was the kind of compute budget they spent to get their models to perform on a human-ish level.
It was ARC-AGI-1 that they used extreme computing budgets to get to human-ish level performance. With ARC-AGI-2 they haven't gotten past ~30% correct. The average human performance is ~65% for ARC-AGI-2, and a human panel gets 100% (because humans understand logical arguments rather than simply exclaiming "you're absolutely right!").
If someone is able to come up with true AGI, why even announce it? Instead, just use it to remake a direct clone of Google, or a direct clone of Netflix, or a direct clone of any of these other software corporations. IMO if anyone was anywhere close to something even remotely touching AGI, they would keep their mouth shut tighter than Fort Knox.
Most of the things that the public — even so-called “AI experts” — consider “magic” are still within the in-sample space. We are nowhere near the out-of-sample space yet. Large Language Models (LLMs) still cannot truly extrapolate. It’s somewhat like living in America and thinking that America is the entire world.
My L7 and L8 colleagues at Google seem to be signaling next 2 years. Errors of -1 and +20 years. But the mood sorta seems like nobody wants to quit when they're building the test stand for the Trinity device.
... and it will turn into a "technically true" rat race between the main players over what the definition exactly is, while any person on the street with no skin in the game will tell you that this is nowhere near the intuitive understanding of what AGI is - as it's not measured by scores but by how real and self-aware your counterpart "feels" to you.
I think their definition of AGI is just about how many human jobs can be replaced with their compute. No scientific or algorithmic breakthroughs needed, just spending and scaling dumb LLMs on massive compute.
Shouldn't it mean all jobs? If there are jobs it can't replace then that doesn't sound very generally intelligent. If it's got general intelligence it should be able to learn to do any job, no?
For example, an AGI could give you a detailed plan that tells you exactly how to do any and every task. But it might not be able to actually do the task itself - for example, manual labor jobs, which an AI simply cannot do unless it also "builds" itself a form factor able to do the job.
The AGI could also just determine that it's cheaper to hire a human than to build a robot at any given point for a job it can't yet do physically, and it would still be AGI.
I think it might even be simpler than that. It's about the cost. Nobody is going to pay for AI to replace humans if it costs more.
All of us in this sub-thread consider ourselves "AGI", but we cannot do any job. In theory we can, I guess. But in practical terms, at what cost? Assuming none of us are truck drivers, if someone was looking for a truck driver, they wouldn't hire us because it would take too long for us to get a license, get certified, learn, etc. Even though in theory we could probably do it eventually.
LLM derived AGI is possible but LLM by itself is not the answer. The problem I see right now is that because there’s so much money at stake, we’ve effectively spread out core talent across many organizations. It used to be Google and maybe Meta. We need a critical mass of talent (think Manhattan Project). It doesn't help that the Chinese pulled a lot of talent back home because a big chunk of early successes and innovations came from those people that we, the US, alienated.
Since we don't have an authoritative definition of what it means that companies will agree to, nor tests, like the Turing test, that must be passed in order to be considered AGI, I don't think we're anywhere near what we all in our brains think AGI is or could be. On the other hand, AI fatigue will continue until the next big thing takes the spotlight from AI for a while, until we reach true AGI (whatever that is).
> Does anyone really think we are close to AGI? I mean honestly?
I'd say we're still a long way from human level intelligence (can do everything I can do), which is what I think of as AGI, but in this case what matters is how OpenAI and/or their evaluation panel define it.
OpenAI's definition used to be, maybe still is, "able to do most economically valuable tasks", which is so weak and vague they could claim it almost anytime.
My definition of AGI is when AI doesn't need humans anymore to create new models (to be specific, models that continue the GPT3 -> GPT4 -> GPT5 trend).
By my definition, once that happens, I don't really see a role for Microsoft to play. So not sure what value their legal deal has.
The key steps will be going beyond just the neural network and blurring the line between training and inference until it is removed. (Those two ideas are closely related).
Pretending this isn't going to happen is appealing to some metaphysical explanation for the existence of human intelligence.
In claude.md I have specific instructions not to check in code, and in the prompt I specifically wrote, as critical, not to check in code while working on failing tests. The test failure was fixed, and the code was checked in. I'd say at least Claude behaves exactly like humans :)
The best test would be when your competitors can't say what you have isn't AGI. If no one, not even your arch biz enemies, can seriously claim you have not achieved AGI then you probably have it.
Why not? They're using Artificial Intelligence to describe token-prediction text generators which clearly have no "intelligence" anywhere near them, so why not re-invent machine learning or something and call it AGI?
We will achieve AGI when they decide it is AGI (I don't believe for a second this independent expert panel won't be biased). And it won't matter if you call their bluff, because the world doesn't give a shit about truth anymore.
Maybe in a few decades, people will look back at how naive it was to talk about AGI at this point, just like the last few times since the 1960s whenever AI had a (perceived) breakthrough. It's always a few decades away.
That is stupid. It would be possible to be infinitely arbitrary, to the point of "AGI" never being reachable by some yardsticks while still performing most viable labor.
> It would be possible to be infinitely arbitrary, to the point of "AGI" never being reachable by some yardsticks while still performing most viable labor.
"Most viable labor" involves getting things from one place to another, and that's not even the hard part of it.
In any case, any sane definition of general AI would entail things that people can generally do.
Rest assured, your friend's driving was the same quality as the average drunk grandma on the road if they were exclusively using Tesla's "FSD" with no intervention for hours. It drives so piss-poorly that I have to frequently intervene even on the latest beta software. If I lived in a shoot-happy state like Texas, I'm sure a road rager would have put a bullet hole somewhere in my Tesla by now if I kept driving like that.
There's a difference between "I survived" and "I drive anywhere close to the quality of the average American" - a low bar and one that still is not met by Tesla FSD.
Yeah, and let's not forget that "I drive like a mildly blind idiot" is only a viable (literally) choice when everyone else doesn't do that and compensates for your idiocy.
ok but have you asked your Tesla to write you a mobile app? AGI would be able to do both. (the self-driving thing is just an example of something AGI would be able to do but an LLM can't)
So why are your arbitrary yard sticks more valid than someone elses?
Probably the biggest problem, as others have stated, is that we can't really define intelligence more precisely than that it is something most humans have and all rocks don't. So how could any definition of AGI be any more precise?
It's one skill almost everyone on the planet can learn exceptionally easily - which Waymo is on pace to master, but a generalized LLM by itself is still very far from.
OP said all yardsticks and I said that was infinitely arbitrary… because it literally is infinitely arbitrary. You can conjure up an infinite amount of yardsticks.
As far as driving itself goes as a yardstick, I just don't find it interesting, because we literally have Waymos orbiting major cities and Teslas driving on the roads already right now.
If that’s the yardstick you want to use, go for it. It just doesn’t seem particularly smart to hang your hat on that one as your Final Boss.
It also doesn’t seem particularly useful for defining intelligence itself in an academic sort of way because even humans struggle to drive well in many scenarios.
But hey, if that's what you wanna use, don't let me stop you - sure, go for it. I have a feeling you'll need new goalposts relatively soon if you do, though.
And using humans as 'the benchmark' is risky in itself as it can leave us with blind spots on AI behavior. For example we find humans aren't as general as we expected, or the "we made the terminator and it's exterminating mankind, but it's not AGI because it doesn't have feelings" issues.
It sure must feel like 2018 was a long time ago when that's more than the entirety of your adult life. I get it.
The rest of us aren't that excited to trust our lives to technology that confidently drove into a highway barrier at high speed, killing the driver in a head-on collision a mere seven years ago¹.
Because we remember that the makers of that tech said the exact same things you're saying now back then.
And because we remember that the person killed was an engineer who complained about Tesla steering him towards the same barrier previously, and Tesla has, effectively, ignored the complaints.
Tech moves fast. Safety culture doesn't. And the last 1% takes 99% of the time (again, how long ago have you graduated?).
I'm glad that you and your friends are volunteering to be lab rats in the "just trust me bro, we'll settle the lawsuit if need be" approach to safety.
I'm not happy about having to share the road with y'all tho.
> The vast majority of humans can be taught to drive
The key is being able to drive and learn another language and learn to play an instrument and do math and, finally, group pictures of their different pets together. AGI would be able to do all those things as well... even teach itself to do those things given access to the Internet. Until that happens, no AGI.
It depends completely on the term. You can make a great case that we've already reached AGI. You can also make a great case that we are decades away from it.
That line essentially means 'indefinite support'. A paper that aims to define AGI was published a few days ago: https://www.arxiv.org/abs/2510.18212.
But crucially, there is no agreed-upon definition of AGI. And I don't think we're close to anything that resembles human intelligence. I firmly believe that stochastic parrots will not get us to AGI and that we need a different methodology. I'm sure humanity will eventually create AGI, and perhaps even in my lifetime (in the next few decades). But I wouldn't put my money on that bet.
No, AI companies need to continue to say things like that and do "safety reports" (the only real danger of an LLM is leaking sensitive data to a bad actor) to maintain hype and investment.
> Does anyone really think we are close to AGI? I mean honestly?
Some people believe capitalism is a net positive. Some people believe in an all-encompassing entity controlling our lives. Some believe 5G is an evil spirit.
After decades I've kind of given up hope on understanding why and how people believe what they believe, just let them.
The only important part is figuring out how I can remain oblivious to what they believe in, yet collaborate with them on important stuff anyways, this is the difficult and tricky part.
As a proxy, you can look at storage. The human brain is estimated at 3.2 PB of storage. The cost of disk space drops by half every 2-3 years. As of this writing, the cost is about $10 / TB [0]. If we assume two to three more halvings, by 2030 that cost will be somewhere between $1.25 and $2.50 / TB, which means that a computer with roughly the storage of a human brain will cost a few thousand dollars - call it just under $6k.
The $6k price point means that (high-end) consumers will have economic access to compute commensurate with human cognition.
This is a proxy argument, using disk space as the proxy for the rest of the "intelligence" stack, so the assumption is that processing power will follow suit and also become affordable, and that the software side will develop to keep up with the hardware. There's no convincing indication that these assumptions are false.
You can do your own back of the envelope calculation, taking into account generalizations of Moore's law to whatever aspect of storage, compute or power usage you think is most important. Exponential progress is fast and so an order of magnitude misjudgement translates to a 2-3 year lag.
Whether you believe it or not, the above calculation and, I assume, other calculations that are similar, all land on, or near, the 2030 year as the inflection point.
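To make that back-of-the-envelope explicit (treating the figures above purely as assumptions, not measurements), here's a minimal sketch in Python:

    # Storage-proxy sketch using the assumptions above (not measured facts):
    # ~3.2 PB brain-scale storage, ~$10/TB today, price halving every ~2 years.
    brain_storage_tb = 3200                      # 3.2 PB expressed in TB
    cost_per_tb_today = 10.0                     # USD per TB, rough current pricing
    halving_period_years = 2.0                   # "halves every 2-3 years"
    years_out = 5                                # to roughly 2030
    halvings = years_out / halving_period_years  # ~2.5 halvings
    cost_per_tb_2030 = cost_per_tb_today / 2 ** halvings  # ~$1.77/TB
    total_cost = brain_storage_tb * cost_per_tb_2030      # ~$5,700, i.e. "just under $6k"
    print(f"~${cost_per_tb_2030:.2f}/TB -> ~${total_cost:,.0f} for brain-scale storage")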
Not to belabor the point but until just a few years ago, conversational AI was thought to be science fiction. Image generation, let alone video generation, was thought by skeptics to be decades, if not centuries, away. We now have generative music, voice cloning, automatic 3d generation, character animation and the list goes on.
One might argue that it's all "slop", but for anyone paying attention, the slop is the "hello world" of AGI. To even get to the slop point represents such a staggering achievement that it's hard to overstate.
AGI has no technical definition - it's marketing. It can happen at any time that Sam Altman or Elon Musk or whoever decides they want to market their product as AGI.
I think we've reached Star Trek-level AI. In Star Trek (and The Next Generation), people would ask the computer questions and it would spout out the answers, which is really similar to what LLMs are doing now, minus the occasional hallucination. In Star Trek, though, the computers never really ran anything (except for one fateful episode - The Ultimate Computer in TOS); I always wondered why. It seems Roddenberry was way ahead of us again.
Citation needed? I don't mean this in a snarky way, though. I genuinely have not seen anything that these things can train on their own output and produce better results than before this self-training.
> Microsoft’s IP rights for both models and products are extended through 2032 and now includes models post-AGI, with appropriate safety guardrails.
Does anyone really think we are close to AGI? I mean honestly?