LOOK MAA I AM ON FRONT PAGE
Peak pseudo-science. The burden of evidence is on the grifters who claim “reason”. But neither side has any objective definition of what “reason” means. It’s pseudo-science against pseudo-science in a fierce battle.
Even defining reason is hard and becomes a matter of philosophy more than science. For example, apply the same claims to people. Now I’ve given you something to think about. Or should I say the Markov chain in your head has a new topic to generate thought states for.
By many definitions, reasoning IS just a form of pattern recognition so the lines are definitely blurred.
And does it even matter anyway?
For the sake of argument let’s say that somebody manages to create an AGI. Does it matter whether it has reasoning abilities if it works anyway? No one has proven that sapience is required for intelligence. After all, we only have a sample size of one, and hardly any conclusions can really be drawn from that.
Wow it’s almost like the computer scientists were saying this from the start but were shouted over by marketing teams.
This! Capitalism is going to be the end of us all. OpenAI has gotten away with IP Theft, disinformation regarding AI and maybe even murder of their whistle blower.
It’s hard to be heard when you’re buried under all that sweet VC/grant money.
When are people going to realize, in its current state, an LLM is not intelligent. It doesn’t reason. It does not have intuition. It’s a word predictor.
Intuition is about the only thing it has. It’s a statistical system. The problem is it doesn’t have logic. We assume because it’s computer-based that it must be more logic-oriented, but it’s the opposite. That’s the problem. We can’t get it to do logic very well because it basically feels out the next token by something like instinct. In particular it doesn’t mask or disregard irrelevant information very well if two segments are near each other in embedding space, which doesn’t guarantee relevance. So then the model is just weighing all of this info, relevant or irrelevant, into a weighted feeling for the next token.
This is the core problem. People can handle fuzzy topics and discrete topics. But we really struggle to create any system that can do both like we can. Either we create programming logic that is purely discrete or we create statistics that are fuzzy.
Of course this issue of masking out information that is close in embedding space but is irrelevant to a logical premise is something many humans suck at too. But high functioning humans don’t and we can’t get these models to copy that ability. Too many people, sadly many on the left in particular, not only will treat association as always relevant but sometimes as equivalence. RE racism is assoc with nazism is assoc patriarchy is historically related to the origins of capitalism ∴ nazism ≡ capitalism. While national socialism was anti-capitalist. Associative thinking removes nuance. And sadly some people think this way. And they 100% can be replaced by LLMs today, because at least the LLM is mimicking what logic looks like better though still built on blind association. It just has more blind associations and finetune weighting for summing them. More than a human does. So it can carry that to mask as logical further than a human who is on the associative thought train can.
People think they want AI, but they don’t even know what AI is on a conceptual level.
Yeah I often think about this Rick N Morty cartoon. Grifters are like, “We made an AI ankle!!!” And I’m like, “That’s not actually something that people with busted ankles want. They just want to walk. No need for a sentient ankle.” It’s a real gross distortion of science how everything needs to be “AI” nowadays.
If we ever achieved real AI the immediate next thing we would do is learn how to lobotomize it so that we can use it like a standard program or OS, only it would be suffering internally and wishing for death. I hope the basilisk is real, we would deserve it.
AI is just the new buzzword, just like blockchain was a while ago. Marketing loves these buzzwords because they can get away with charging more if they use them. They don’t much care if their product even has it or could make any use of it.
I agree with you. In its current state, LLM is not sentient, and thus not “Intelligence”.
And that’s pretty damn useful, but obnoxious to have expectations wildly set incorrectly.
I don’t think the article summarizes the research paper well. The researchers gave the AI models simple-but-large (which they confusingly called “complex”) puzzles. Like Towers of Hanoi but with 25 discs.
The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.
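To make the "less than a screen" claim concrete, here is a minimal sketch of the classic recursive solver in Python. The function names, peg labels, and move format are my own choices for illustration; nothing here is taken from the paper.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (disc, from_peg, to_peg) moves that solves n discs."""
    if n == 0:
        return []
    # Move the top n-1 discs out of the way, move the largest disc,
    # then move the n-1 discs back on top of it.
    return (
        hanoi(n - 1, source, spare, target)
        + [(n, source, target)]
        + hanoi(n - 1, spare, target, source)
    )


def is_valid_solution(n, moves, source="A", target="C", spare="B"):
    """Replay the moves and check that every one of them is legal."""
    pegs = {source: list(range(n, 0, -1)), target: [], spare: []}
    for disc, src, dst in moves:
        # The disc being moved must be on top of its source peg.
        if not pegs[src] or pegs[src][-1] != disc:
            return False
        # A larger disc may never be placed on a smaller one.
        if pegs[dst] and pegs[dst][-1] < disc:
            return False
        pegs[dst].append(pegs[src].pop())
    # Solved when every disc sits on the target peg, largest at the bottom.
    return pegs[target] == list(range(n, 0, -1))
```

The pattern is tiny, but the output is not: the solution for n discs has 2^n - 1 moves, so 3 discs take 7 moves while 25 discs take 33,554,431. That is the "simple-but-large" part.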
The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don’t have an answer for why this is, but they suspect that the reasoning doesn’t scale.
It’s all “one instruction at a time” regardless of high processor speeds and words like “intelligent” being bandied about. “Reason” discussions should fall into the same query bucket as “sentience”.
My impression of LLM training and deployment is that it’s actually massively parallel in nature - which can be implemented one instruction at a time - but isn’t in practice.
XD so, like a regular school/university student that just wants to get passing grades?
I see a lot of misunderstandings in the comments 🫤
This is a pretty important finding for researchers, and it’s not obvious by any means. This finding is not showing a problem with LLMs’ abilities in general. The issue they discovered is specifically for so-called “reasoning models” that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.
Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that’s a flaw that needs to be corrected before models can actually reason.
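A toy sketch of the difference between rewarding only the final answer and rewarding the steps, in Python. Everything here (function names, the step format, the scoring) is hypothetical and only illustrates the idea; it is not how any particular lab trains its models.

```python
def outcome_reward(final_answer, expected_answer):
    """Reward that only looks at the final answer."""
    return 1.0 if final_answer == expected_answer else 0.0


def process_reward(steps, step_is_valid):
    """Reward that scores the fraction of intermediate steps that check out."""
    if not steps:
        return 0.0
    valid = sum(1 for step in steps if step_is_valid(step))
    return valid / len(steps)


def addition_step_is_valid(step):
    """Hypothetical step format: (a, b, claimed_sum)."""
    a, b, claimed_sum = step
    return a + b == claimed_sum


# A chain that reaches the right answer through a wrong intermediate step:
# it claims 2 + 3 = 6, then 6 + (-1) = 5.
steps = [(2, 3, 6), (6, -1, 5)]
final_answer = steps[-1][2]

outcome = outcome_reward(final_answer, expected_answer=5)
process = process_reward(steps, addition_step_is_valid)
```

Here `outcome` is 1.0 even though half the steps are wrong, while `process` is 0.5. An outcome-only signal cannot tell this chain apart from a fully correct one.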
Cognitive scientist Douglas Hofstadter (1979) showed reasoning emerges from pattern recognition and analogy-making - abilities that modern AI demonstrably possesses. The question isn’t if AI can reason, but how its reasoning differs from ours.
There’s probably a lot of misunderstanding because these grifters intentionally use misleading language: AI, reasoning, etc.
If they stuck to scientifically descriptive terms, it would be much more clear and much less sensational.
When given explicit instructions to follow, models failed because they had not seen similar instructions before.
This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.
I’m not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.
If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.
Sure. We weren’t discussing if AI creates value or not. If you ask a different question then you get a different answer.
Well - if you want to devolve into argument, you can argue all day long about “what is reasoning?”
You were starting a new argument. Let’s stay on topic.
The paper implies “Reasoning” is application of logic. It shows that LRMs are great at copying logic but can’t follow simple instructions that haven’t been seen before.
This would be a much better paper if it addressed that question in an honest way.
Instead they just parrot the misleading terminology that they’re supposedly debunking.
How dat collegial boys club undermines science…
What confuses me is that we seemingly keep pushing away what counts as reasoning. Not too long ago, some smart algorithms or a bunch of instructions for software (if/then) were officially, by definition, software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced algorithms, it’s no longer reasoning? I feel like at this point a more relevant question is “What exactly is reasoning?”. Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.
Sure, these grifters are shady AF about their wacky definition of “reason”… But that’s just a continuation of the entire “AI” grift.
If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It’s like comparing PhD reasoning to a dog’s reasoning.
While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack a strong metacognition and the ability to make simple logical inferences (eg: why they fail at the shell game).
Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it’s designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don’t have the tech to make a synthetic human.
I think as we approach the uncanny valley of machine intelligence, it’s no longer a cute cartoon but a menacing creepy not-quite imitation of ourselves.
It’s just the internet plus some weighted dice. Nothing to be afraid of.
What statistical method do you base that claim on? The results presented match expectations given that Markov chains are still the basis of inference. What magic juice is added to “reasoning models” that allow them to break free of the inherent boundaries of the statistical methods they are based on?
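For anyone unfamiliar with the term, here is a minimal sketch of a first-order (bigram) Markov chain text generator in Python. It is a toy with made-up function names, shown only to illustrate what "Markov chain" means; a transformer conditions on its whole context window rather than on one previous word, so this is not a description of how any real model runs.

```python
import random
from collections import defaultdict


def build_bigram_chain(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, seed=0):
    """Walk the chain: the next word depends only on the current word."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    out = [start]
    for _ in range(length - 1):
        options = chain.get(out[-1])
        if not options:
            break  # dead end: this word was never followed by anything
        # Repeated followers appear more often in the list,
        # so the choice is weighted by observed frequency.
        out.append(rng.choice(options))
    return out


chain = build_bigram_chain("the cat sat on the mat")
```

In this tiny corpus, `chain["the"]` holds both "cat" and "mat", so after "the" the generator rolls between the two; every other word has at most one follower.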
Yeah these comments have the three hallmarks of Lemmy:
- AI is just autocomplete mantras.
- Apple is always synonymous with bad and dumb.
- Rare pockets of really thoughtful comments.
Thanks for being at least the latter.
Some AI researchers found it obvious as well, in that they’d suspected it and had some indications. But it’s good to see more data on this to affirm this assessment.
Particularly to counter some more baseless marketing assertions about the nature of the technology.
Lots of us who have done some time in search and relevancy early on knew ML was always largely breathless overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything that execs don’t understand is profitable and worth doing.
Ragebait?
I’m in robotics and find plenty of use for ML methods. Think of image classifiers, how do you want to approach that without oversimplified problem settings?
Or even in control or coordination problems, which can sometimes become NP-hard. Even though not optimal, ML methods are quite solid in learning patterns of high-dimensional NP-hard problem settings, often outperforming hand-crafted conventional suboptimal solvers in computation effort vs solution quality analysis, especially outperforming (asymptotically) optimal solvers time-wise, even though not with optimal solutions (but “good enough” nevertheless). (Ok to be fair suboptimal solvers do that as well, but since ML methods can outperform these, I see it as an attractive middle-ground.)
Machine learning based pattern matching is indeed very useful and profitable when applied correctly. Identify (with confidence levels) features in data that would otherwise take an extremely well trained person. And even then it’s just for the cursory search that takes the longest before presenting the highest confidence candidate results to a person for evaluation. Think: scanning medical data for indicators of cancer, reading live data from machines to predict failure, etc.
And what we call “AI” right now is just a much much more user friendly version of pattern matching - the primary feature of LLMs is that they natively interact with plain language prompts.
What’s hilarious/sad is the response to this article over on reddit’s “singularity” sub, in which all the top comments are people who’ve obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don’t understand AI or “reasoning”. It’s a weird cult.
So, what you’re saying here is that the A in AI actually stands for artificial, and it’s not really intelligent and reasoning.
Huh.
The AI stands for Actually Indians /s
NOOOOOOOOO
SHIIIIIIIIIITT
SHEEERRRLOOOOOOCK
The funny thing about this “AI” griftosphere is how grifters will make some outlandish claim and then different grifters will “disprove” it. Plenty of grant/VC money for everybody.
Without explicit, well-researched material, the marketing presentation gets to stand largely unopposed.
So this is good even if most experts in the field consider it an obvious result.
Except for Siri, right? Lol
Apple Intelligence
OK, and? A car doesn’t run like a horse either, yet they are still very useful.
I’m fine with the distinction between human reasoning and LLM “reasoning”.
Cars are horses. How do you feel about that statement?
The guy selling the car doesn’t tell you it runs like a horse, the guy selling you AI is telling you it has reasoning skills. AI absolutely has utility, the guys making it are saying its utility is nearly limitless because Tesla has demonstrated there’s no actual penalty for lying to investors.
Then use a different word. “AI” and “reasoning” make people think of Skynet, which is what the weird tech bros want the lay person to think of. LLMs do not “think”, but that’s not to say I might not be persuaded of their utility. But that’s not the way they are being marketed.
It’s not just the memorization of patterns that matters, it’s the recall of appropriate patterns on demand. Call it what you will, even if AI is just a better librarian for search work, that’s value - that’s the new Google.
While a fair idea there are two issues with that even still - Hallucinations and the cost of running the models.
Unfortunately, it takes significant compute resources to perform even simple responses, and these responses can be totally made up, but still made to look completely real. It’s gotten much better sure, but blindly trusting these things (which many people do) can have serious consequences.
Hallucinations and the cost of running the models.
So, inaccurate information in books is nothing new. Agreed that the rate of hallucinations needs to decline, a lot, but there has always been a need for a veracity filter - just because it comes from “a book” or “the TV” has never been an indication of absolute truth, even though many people stop there and assume it is. In other words: blind trust is not a new problem.
The cost of running the models is an interesting one - how does it compare with publication on paper to ship globally to store in environmentally controlled libraries which require individuals to physically travel to/from the libraries to access the information? What’s the price of the resulting increased ignorance of the general population due to the high cost of information access?
What good is a bunch of knowledge stuck behind a search engine when people don’t know how to access it, or access it efficiently?
Granted, search engines already take us 95% (IMO) of the way from paper libraries to what AI is almost succeeding in being today, but ease of access of information has tremendous value - and developing ways to easily access the information available on the internet is a very valuable endeavor.
Personally, I feel more emphasis should be put on establishing the veracity of the information before we go making all the garbage easier to find.
I also worry that “easy access” to automated interpretation services is going to lead to a bunch of information encoded in languages that most people don’t know because they’re dependent on machines to do the translation for them. As an example: shiny new computer language comes out but software developer is too lazy to learn it, developer uses AI to write code in the new language instead…
Fucking obviously. Until Data’s positronic brains becomes reality, AI is not actual intelligence.
AI is not A I. I should make that a tshirt.
It’s an expensive carbon spewing parrot.
It’s a very resource intensive autocomplete
I think it’s important to note (i’m not an llm I know that phrase triggers you to assume I am) that they haven’t proven this as an inherent architectural issue, which I think would be the next step to the assertion.
do we know that they don’t and are incapable of reasoning, or do we just know that for x problems they jump to memorized solutions, is it possible to create an arrangement of weights that can genuinely reason, even if the current models don’t? That’s the big question that needs answered. It’s still possible that we just haven’t properly incentivized reason over memorization during training.
if someone can objectively answer “no” to that, the bubble collapses.
do we know that they don’t and are incapable of reasoning.
“even when we provide the algorithm in the prompt—so that the model only needs to execute the prescribed steps—performance does not improve”
That indicates that this particular model does not follow instructions, not that it is architecturally fundamentally incapable.
Not “This particular model”. Frontier LRMs such as OpenAI’s o1/o3, DeepSeek-R, Claude 3.7 Sonnet Thinking, and Gemini Thinking.
The paper shows that Large Reasoning Models as defined today cannot interpret instructions. Their architecture does not allow it.
those particular models. It does not prove the architecture doesn’t allow it at all. It’s still possible that this is solvable with a different training technique, and none of those are using the right one. that’s what they need to prove wrong.
this proves the issue is widespread, not fundamental.
The architecture of these LRMs may make monkeys fly out of my butt. It hasn’t been proven that the architecture doesn’t allow it.
You are asking to prove a negative. The onus is to show that the architecture can reason. Not to prove that it can’t.
Is “model” not defined as architecture+weights? Those models certainly don’t share the same architecture. I might just be confused about your point though
It is, but this did not prove all architectures cannot reason, nor did it prove that all sets of weights cannot reason.
essentially they did not prove the issue is fundamental. And they have a pretty similar architecture, they’re all transformers trained in a similar way. I would not say they have different architectures.
Ah, gotcha
No way!
Statistical Language models don’t reason?
But OpenAI, robots taking over!