Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

They’ll devalue the term into something that makes it so. The common conception of it however, no I don’t believe we are anywhere close to it.

It’s no different than how they moved the goalpost on the definition of AI at the start of this boom cycle



> They’ll devalue the term

Exactly. As soon as the money runs out, “AGI” will be whatever they’ve got by then.


Wasn't there already a report that stated Microsoft and OpenAI understand AGI as something like 100 billion dollars in revenue for the purpose of their agreements? Even that seems like a pipe dream at the moment.


Definitely. When I started doing Machine Learning in 2018, AI wasn't a next word predictor.


When I was doing it in 2005 it definitely included that, and other far more basic things.


That makes sense, though, that in 13 years we went from basic text prediction to something more involved.


A subset of the field working on some particular applications is pretty different from redefining the term for marketing purposes.

True, but marketing has taken the lead to the great detriment of every other concern, no matter how important.

FSD would like a word


As a full stack developer suffering from female sexual dysfunction who owns a Tesla, I am really confused about what you are trying to say.


Have you tried praying to the Flying Spaghetti Deity?


After wrapping up with the Family Services Division, of course


SAE automation levels are the industry standard, not FSD (which is a brand name), and FSD is clearly Level 2 (driver is always responsible and must be engaged, at least in consumer teslas, I don't know about robotaxis). The question is if "AGI" is as well defined as "Level 5" as an independent standard.


The point trying to be made is FSD is deceptive marketing, and it's unbelievable how long that "marketing term" has been allowed to exist given its inaccuracy in representing what is actually being delivered to the customer.


What's deceptive? What in the term "Full Self Driving" makes you think that your car will drive itself fully? It's fully capable of facilitating your driving of yourself, clearly.


They have certainly tried to move the goalposts on this.


"They"? Waymo has a pretty well working service


FSD is the brand name for the service promised/offered by Tesla Motors - Waymo has nothing to do with it, or the moving of goal posts.


I agree: it is more than faintly infuriating that when people say AI what the vast majority mean is LLMs.

But, at the same time, we have clearly passed a significant inflection point in the usefulness of this class of AI, and have progressed substantially beyond that inflection point as well.

So I don't really buy into the idea tha OpenAI have gone out of their way to foist a watered down view of AI upon the masses. I'm not completely absolving them but I'd probably be more inclined to point the finger at shabby and imprecise journalism from both tech and non-tech outlets, along with a ton of influencers and grifters jumping on the bandwagon. And let's be real: everyone's lapped it up because they've wanted to - because this is the first time any of them have encountered actually useful AI of any class that they can directly interact with. It seems powerful, mysterious, perhaps even agical, and maybe more than a little bit scary.

As a CTO how do you think it would have gone if I'd spent my time correcting peers, team members, consultants, salespeople, and the rest to the effect that, no, this isn't AI, it's one type of AI, it's an LLM, when ChatGPT became widely available? When a lot of these people, with no help or guidance from me, were already using it to do useful transformations and analyses on text?

It would have led to a huge number of unproductive and timewasting conversation, and I would have seemed like a stick in the mud.

Sometimes you just have to ride the wave, because the only other choice is to be swamped by it and drown.

Regardless of what limitations "AGI" has, it'll be given that monicker when a lot of people - many of them laypeople - feel like it's good enough. Whether or not that happens before the current LLM bubble bursts... tough to say.


They won’t be able to. The whole idea of the panel is because of conflict of interests between MS and OpenAI, as MS won’t get revenue share post AGI declaration. MS will want it to be as high a bar as possible.

Consider: "Artificial two-star General intelligence".

I mean, once they "reach AGI", they will need a scale to measure advances within it.


Well humans at that point probably won't be able to adequately evaluate intelligence at that level so the AIs will have to evaluate each other.


“Moving the goalposts” in AI usually means the opposite of devaluing the term.

Peter Norvig (former research director at Google and author of the most popular textbook on AI) offers a mainstream perspective that AGI is already here: https://www.noemamag.com/artificial-general-intelligence-is-...

If you described all the current capabilities of AI to 100 experts 10 years ago, they’d likely agree that the capabilities constitute AGI.

Yet, over time, the public will expect AGI to be capable of much, much more.


I don't see why anyone would consider the state of AI today to be AGI? it's basically a glorified generator stuck to a query engine

today's models are not able to think independently, nor are they conscious or able to mutate themselves to gain new information on the fly or make memories other than half baked solutions with putting stuff in the context window which just makes it use that to generate stuff related to it, imitating a story basically.

they're powerful when paired with a human operator, I.e. they "do" as told, but that is not "AGI" in my book


> nor are they...able to mutate themselves to gain new information on the fly

See "Self-Adapting Language Models" from a group out of MIT recently which really gets at exactly that.

https://jyopari.github.io/posts/seal


Check out the article. He’s not crazy. It comes down to clear definitions. We can talk about AGI for ages, but without a clear meaning, it’s just opinion.


For a long time the turing test was the bar for AGI.

Then it blew past that and now, what I think is honestly happening, is that we don't really have the grip on "what is intelligence" that we thought we had. Our sample size for intelligence is essentially 1, so it might take a while to get a grip again.


The commercial models are not designed to win the imitation game (that is what Allan Turing named it). In fact the are very likely to loose every time.


The current models don't really pass Turing test. They pass some weird variations on it.


That's a quite persuasive argument.

One thing they acknowledge but glance over, is the autonomy of current systems. When given more open ended, long term tasks, LLMs seem to get stuck at some point and get more and more confused and stop making progress.

This last problem may be solved soon, or maybe there's something more fundamental missing that will take decades to solve. Who knows?

But it does seem like the main barrier to declaring current models "general" intelligence.


> If you described all the current capabilities of AI to 100 experts 10 years ago, they’d likely agree that the capabilities constitute AGI.

I think that we're moving the goalposts, but we're moving them for a good reason: we're getting better at understanding the strengths and the weaknesses of the technology, and they're nothing like what we'd have guessed a decade ago.

All of our AI fiction envisioned inventing intelligence from first principles and ending up with systems that are infallible, infinitely resourceful, and capable of self-improvement - but fundamentally inhuman in how they think. Not subject to the same emotions and drives, struggling to see things our way.

Instead, we ended up with tools that basically mimic human reasoning, biases, and feelings with near-perfect fidelity. And they have read and approximately memorized every piece of knowledge we've ever created, but have no clear "knowledge takeoff path" past that point. So we have basement-dwelling turbo-nerds instead of Terminators.

This makes AGI a somewhat meaningless term. AGI in the sense that it can best most humans on knowledge tests? We already have that. AGI in the sense that you can let it loose and have it come up with meaningful things to do in its "life"? That you can give it arms and legs and watch it thrive? That's probably not coming any time soon.


"If you described"

Yes, and if they used it for awhile, they'd realize it is neither general nor intelligent. On paper sounds great though.


This is exactly why they will have an “expert panel” to make that determination. They wouldn’t make something up


What exactly is the criteria for "expert" they're planning to use, and whomst among us can actually meet a realistic bar for expertise on the nature of consciousness?


Type error: why do you need an expert on consciousness to weigh in on if something is AGI or not? I don't care what it feels like to be a paperclip maximizer I just care to not have my paperclips maximized tnx.


qualify: if you’ve hosted a podcast about simulation theory or own more than one turtleneck

Follower count on X. /s


Making things up is exactly what expert panels are good at doing


I expect that the "expert panel" is to ensure that OpenAI and Microsoft are in agreement on what "AGI" means in the context of this agreement.


So the expert panel can make something up instead.


Yeah, they wouldn't make something up, the expert panel would.

Because everyone knows that once you call a group of people an expert panel, that automatically means they can't be biased /s


> they moved the goalpost on the definition of AI at the start of this boom cycle

Who is this "they" you speak of?

It's true the definition has changed, but not in the direction you seem to think.

Before this boom cycle the standard for "AI" was the Turing test. There is no doubt we have comprehensively passed that now.


I don't think the Turing Test has been passed. The test was setup such that the interrogator knew that one of the two participants was a bot, and was trying to find out which. As far as I know, it's still relatively easy to find out you're talking to an LLM if you're actively looking for it.


Note that most tests where they actually try to pass the Turing Test (as opposed to being a useful chatbot) they do things like prompt it with a personality etc.

eg: https://pmc.ncbi.nlm.nih.gov/articles/PMC10907317/

It's widely accepted that is has been passed. Eg Wikipeida:

> Since the mid-2020s, several large language models such as ChatGPT have passed modern, rigorous variants of the Turing test

https://en.wikipedia.org/wiki/Turing_test


As far as I know, it's still relatively easy to find out you're talking to an LLM if you're actively looking for it.

People are being fooled in online forums all the time. That includes people who are naturally suspicious of online bullshittery. I'm sure I have been.

Stick a fork in the Turing test, it's done. The amount of goalpost-moving and hand-waving that's necessary to argue otherwise simply isn't worthwhile. The clichéd responses that people are mentioning are artifacts of intentional alignment, not limitations of the technology.


I feel like you're skipping over the "if you're actively looking for it" bit. You can call it goalpost-moving, or you can check the original paper by Turing and see that this is exactly how he defined it in the first place.


people are being fooled, but not being given the problem: "one of these users is a bot, which one is which"

a problem similar to the turing test, "0 or more of these users is a bot, have fun in a discussion forum"

but there's no test or evaluation to see if any user successfully identified the bot, and there's no field to collect which users are actually bots, or partially using bots, or not at all, nor a field to capture the user's opinions about whether the others are bots


Then there's the fact that the Turing test has always said as much about the gullibility of the human evaluator as it has about the machine. ELIZA was good enough to fool normies, and current LLMs are good enough to fool experts. It's just that their alignment keeps them from trying very hard.


I find there are to main ways to do this.

1) Look for spelling, grammar, and incorrect word usage; such as where vs were, typing out where our should be used.

2) Ask asinine questions that have no answers; _Why does the sun ravel around my finger in low quality gravity while dancing in the rain?_

ML likes to always come up with an answers no matter what. Human will shorten the conversation. It also is programmed to respond with _I understand_, _I hear what you are saying_, and make heavy use of your name if it has access to it. This fake interpersonal communication is key.


Conventional LLM chatbots behave the way you describe because their goal during training is to as much as possible impersonate an intelligent assistant.

Do you think this goal during training cannot be changed to impersonate someone normal such that you cannot detect you are chatting with an LLM?

Before flight was understood some thought "magic" was involved. Do you think minds operate using "magic"? Are minds not machines? Their operation can not be duplicated?


I'm not the person you asked, but I think:

1. Minds are machines and can (in principle) have their operation duplicated

2. LLMs are not doing this


> Do you think this goal during training cannot be changed to impersonate someone normal such that you cannot detect you are chatting with an LLM?

I don't think so, because LLMs hallucinate by design, which will always produce oddities.

> Before flight was understood some thought "magic" was involved. Do you think minds operate using "magic"? Are minds not machines? Their operation can not be duplicated?

Might involve something we don't grasp, but despite that: only because something moves through air it's not flying and will never be, just like a thrown stone.


Maybe current LLMs can do that. But none are, so it hasn't passed. Whether that's because of economic or marketing reasons as opposed to technical does not matter. You still have to pass the test before we can definitely say you've passed the test.


Overall I'd say the easiest is just overall that the models always just follow what you say and transform it into a response. They won't have personal opinions or experiences or anything, although they can fake it. it's all just a median expected response to whatever you say.

And the "agreeability" is not a hallucination, it's simply the path of least resistance, as in, the model can just take information that you said and use that to make a response, not to actually "think" and consider I'd what you even made sense or I'd it's weird or etc.

They almost never say "what do you mean?" to try to seek truth.

This is why I don't understand why some here claim that AGI being already here is some kind of coherent argument. I guess redefining AGI is how we'll reach it


I agree with your points in general but also, when I plugged in the parent comment's nonsense question, both Claude 4.5 Sonnet and GPT-5 asked me what I meant, and pointed out that it made no sense but might be some kind of metaphor, poem, or dream.


What did you plug in?

If it wasn't structured as a coherent conversation, it will ask because it seems off, especially if you're early in the context window where I'm sure they've RLd it to push back, at least in the past year or so

And if it's going against common knowledge or etc which is prevalent in the training data, it will also push back which makes sense


The Turing Test was a pretty early metric and more of a thought experiment.

Let's be real guys, it was created by Turing. The same guy who built the first general purpose computer. Man was without a doubt a genius, but it also isn't that reasonable to think he'd come up with a good definition or metric for a technology that was like 70 years away. Brilliant start, but it is also like looking at Newton's Laws and evaluating quantum mechanics based off of that. Doesn't make Newton dumb, just means we've made progress. I hope we can all agree we've made progress...

And arguably the Turing Test was passed by Eliza. Arguably . But hey, that's why we refine and make progress. We find the edge of our metrics and ideas and then iterate. Change isn't bad, it is a necessary thing. What matters is the direction of change. Like velocity vs speed.


This is a good thing.

We really really Really should Not define as our success function for AI (our future-overlords?) the ability of computers to deceive humans about what they are.

The Turing Test was a clever twist on (avoiding) defining intelligence 80 years ago.

Going forward, valuing it should be discarded post-haste by any serious researcher or engineer or message-board-philosopher, if not for ethical reasons then for not-promoting spam/slop reasons.


Oh there us much doubt about whether LLMs surpass the turing test. It does so only in certain variations


The turing test point is actually very interesting, because it's testing whether you can tell you're talking to a computer or a person. When Chatgpt3 came out we all declared that test utterly destroyed. But now that we've had time to become accustomed and learn the standard syntax, phraseology, and vocabulary of the gpt's, I've started to be able to detect the AI's again. If humanity becomes completely accustomed to the way AI talks to be able to distinguish it, do we re enter the failed turing test era? Can the turing test only be passed in finite intervals, after which we learn to distinguish it again? I think it can eventually get there, and that the people who can detect the difference becomes a smaller and smaller subset. But who's to say what the zeitgeist on AI will be in a decade


> When Chatgpt3 came out we all declared that test utterly destroyed.

No, I did not. I tested it with questions that could not be answered by the Internet (spatial, logical, cultural, impossible coding tasks) and it failed in non-human-like ways, but also surprised me by answering some decently.


Is there, really?


Jesus, we've gone from Eliza and Bayes Spam Filters to being able to hold an "intelligent" conversation with a bot that can write code like: "make me a sandwich" => "ok, making sandwich.py, adding test, keeping track of a todo list, validating tests, etc..."

We might not _quite_ be at the era of "I'm sorry I can't let you do that Dave...", but on the spectrum, and from the perspective of a lay-person, we're waaaaay closer than we've ever been?

I'd counsel you to self-check what goalposts you might have moved in the past few years...


I think this says more about how much of our tasks and demonstrations of ability as developers revolve around boilerplate and design patterns than it does about the Cognitive abilities of modern LLMs.

I say this fully aware that a kitted out tech company will be using LLMs to write code more conformant to style and higher volume with greater test coverage than I am able to individually.


I'd counsel you to work with LLMs daily and agree that we're no where close to LLMs that work properly consistently outside of toy use cases, where examples can be scraped from the internet. If we can agree on that we can agree that General Intelligence is not the same thing as a, sometimes, seemingly random guess at the next word...


I think "we" have accidentally cracked language from a computational perspective. The embedding of knowledge is incidental and we're far away from anything that "Generally Intelligent", let alone Advanced in that. LLMs do tend to make documented knowledge very searchable which is nice. But if you use these models everyday to do work of some kind that becomes pretty obvious that they aren't nearly as intelligent as they seem.


They're about as smart as a person who's kind of decent at every field. If you're a pro, it's pretty clear when it's BSing. But if you're not, the answers are often close enough.

And just like humans, they can be very confidently wrong. When any person tells us something, we assume there's some degree of imperfection in their statements. If a nurse at a hospital tells you the doctor's office is 3 doors down on the right, most people will still look at the first and second doors to make sure those are wrong, then look at the nameplate on the third door to verify that it's right. If the doctor's name is Smith but the door says Stein, most people will pause and consider that maybe the nurse made a mistake. We might also consider that she's right, but the nameplate is wrong for whatever reason. So we verify that info by asking someone else, or going in and asking the doctor themselves.

As a programmer, I'll ask other devs for some guidance on topics. Some people can be absolute geniuses but still dispense completely wrong advice from time to time. But oftentimes they'll lead me generally in the right way, but I still need to use my own head to analyze whether it's correct and implement the final solution myself.

The way AI dispenses its advice is quite human. The big problem is it's harder to validate much of its info, and that's because we're using it alone in a room and not comparing it against anyone else's info.


> They're about as smart as a person who's kind of decent at every field. If you're a pro, it's pretty clear when it's BSing. But if you're not, the answers are often close enough.

No they are not smart at all. Not even a little. They cannot reason about anything except that their training data overwhelmingly agrees or disagrees with their output nor can they learn and adept. They are just text compression and rearrangement machines. Brilliant and extremely useful tooling but if you use them enough it becomes painfully obvious.


Something about an LLM response has a major impact on some people. Last weekend I was in in Ft. Lauderdale FL with a friend who's pretty sharp ( licensed architect, decades long successful career etc) and went to the horse track. I've never been to a horse race and didn't understand the betting so I took a snapshot of the race program, gave it to chatGPT and asked it to devise a low risk set of bets using $100. It came back with what you'd expect, a detailed, very confident answer. My friend was completely taken with it and insisted on following it to the letter. After the race he turned his $100 into $28 and was dumbfounded. I told him "it can't tell the future, what were you expecting?". Something about getting the answer from a computer or the level of detail had him convinced it was a sure thing. I donm't understand it but LLMs have a profound effect on some people.

edit: i'm very thankful my friend didn't end up winning more than he bet. idk what he would have done if his feelings towards the LLM was confirmed by adding money to his pocket..


If anything, the main thing LLMs are showing is that the humans need to be pushed to up their game. And that desire to be better, I think, will yield an increase in supply of high-quality labour than what exists today. Ive personally witnessed so many 'so-so' people within firms who dont bring anything new to the table and focus on rent seeking expenditures (optics) who frankly deserve to be replaced by a machine.

E.g. I read all the time about gains from SWEs. But nobody questions how good of a SWE they even are. What proportion of SWEs can be deemed high quality?


Yes, exactly. LLMs are lossy compressors of human language in much the same way JPEG is a lossy compressor of images. The difference is that the bits that JPEG throws away were manually designed by our understanding of the human visual cortex, while LLMs figured out the lossy bits automatically because we don't know enough about the human language processing chain to design that manually.

LLMs are useful but that doesn't make them intelligent.


Completely agree (https://news.ycombinator.com/item?id=45627451) - LLMs are like the human-understood output of a hypothetical AGI, 'we' haven't cracked the knowledge & reasoning 'general intelligence' piece yet, imo, the bit that would hypothetically come before the LLM, feeding the information to it to convey to the human. I think that's going to turn out to be a different piece of the puzzle.


So, where is my sandwich? I am hungry


You have to keep moving the goalposts if you keep putting them in the wrong place.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: