Your Chatbot Thinks You’re a Chess Grandmaster. You’re Actually a Soggy Sandwich.

Dec. 27, 2025

There’s a new bit of research making the rounds that basically says: the big AI chatbots are hopeless romantics about the human brain. ChatGPT, Claude, the whole well-dressed parade of text generators apparently assume we’re more rational and logically consistent than we actually are—especially when money, pride, and other people’s choices get involved.

Which is adorable. Like watching a golden retriever bring you a slobbery tennis ball because it genuinely believes you’re the kind of person who enjoys cardio.

The study used a classic game theory party trick: the Keynesian beauty contest. It’s not about cheekbones or cheek filler. It’s about predicting what everyone else will pick, and then picking something based on that prediction, and then predicting what they predict you’ll predict, and so on, until your brain turns into an overheated laptop fan.

The punchline is that the AI models “played too smart.” They expected humans to do deeper strategic reasoning than humans tend to do in real life. And that mismatch matters because we’re starting to use these systems for things where predicting human choices is the whole game: markets, negotiations, pricing, policy, fraud detection, “behavioral insights,” and other phrases that sound clean but usually end with somebody getting squeezed.

The “beauty contest” that makes you hate mirrors

Here’s the setup the researchers used, a cousin of the beauty contest called “Guess the Number.” Everyone chooses a number between 0 and 100. The winning number is the one closest to half the average of everybody’s picks.

If you’ve never seen it before, you might think: “Okay, I’ll pick 50.” That’s the first trap. Because if everyone picked randomly, the average would be around 50, half of that is 25, so 25 starts looking better. But then if everyone reasons that far, the average becomes 25, half is 12.5. Then 6.25. Then 3.125. Keep going and the “rational” equilibrium limps toward 0 like a man crawling out of a bar at dawn.
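
If you'd rather watch the staircase than do the division, here's a toy sketch in Python. The level-0 average of 50 is an assumption, not data from the study; it just shows how fast the "rational" answer falls down the stairs.

```python
# A minimal sketch of the "I think that you think..." ladder in Guess the Number.
# Level 0 guesses with no strategy (average assumed ~50); each deeper level
# best-responds to the level below it by picking half of that expected average.

def level_k_guess(k, level0_average=50.0):
    """Return the pick of a player who does k steps of iterated reasoning."""
    guess = level0_average
    for _ in range(k):
        guess /= 2  # best response to the level below: half their expected average
    return guess

for k in range(6):
    print(f"level {k}: pick {level_k_guess(k):.3f}")
# level 0: 50.000, level 1: 25.000, level 2: 12.500 ... limping toward 0
```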

The whole point is not the math. The point is: how many steps of “I think that you think that I think…” do people actually do?

In textbooks, people are sleek little reasoning machines. In real life, people are tired. People are stubborn. People are feeling petty because someone cut them off in traffic. People pick 37 because it’s their age or because they like the vibe of it. People pick 69 because they’ve never been emotionally loved. People pick 100 because they want to watch the world burn.

And the researchers had AI models—ChatGPT-4o and Claude Sonnet 4—play this game while being told who their “human opponents” were: undergrads, experienced game theorists, different ages, different levels of sophistication. The models not only had to pick a number, they had to explain their reasoning.

And yes, the models adjusted. They could say, in effect: “If I’m playing undergrads, I’ll expect less iterated reasoning. If I’m playing game theorists, I’ll expect more.” So far so good. That’s the sales demo.
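
For the curious, here is roughly what that kind of setup looks like in code. To be clear: this is not the researchers' prompt or protocol, just an illustrative sketch using the standard openai Python client, with a placeholder model name and my own wording for the game.

```python
# Illustrative only -- not the study's actual prompt or protocol.
# Assumes the standard openai Python client and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

OPPONENTS = [
    "a class of undergraduates seeing this game for the first time",
    "a room full of experienced game theorists",
]

for opponent in OPPONENTS:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                f"You are playing 'Guess the Number' against {opponent}. "
                "Everyone picks a number from 0 to 100; the winner is whoever "
                "is closest to half the average of all picks. "
                "State your pick and explain your reasoning."
            ),
        }],
    )
    print(opponent, "->", response.choices[0].message.content)
```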

But the models systematically overestimated how rational their human opponents would be. They expected too many steps of reasoning. They aimed too close to the “smart” equilibrium and missed because humans didn’t follow them down that elegant staircase into the basement.

The models didn’t lose because they were dumb. They lost because they were optimistic.

That’s a very human way to fail, actually.

The great delusion: thinking people will behave like they “should”

If you’ve ever argued with a family member about politics, or watched a coworker “circle back” for the fifth time without circling to anything, you already know the result of this study in your bones.

Humans don’t run on pure logic. We run on heuristics, habits, pride, insecurity, caffeine, fear of embarrassment, and whatever weird little superstition we’ve dressed up as “intuition.”

Game theory assumes players are trying to win and can model each other with some consistency. Real people are often trying to win and trying to look cool and trying not to feel stupid and trying to punish the imaginary audience in their head.

So when an AI tries to predict a group of humans by assuming a decent level of rational iteration, it’s like bringing a spreadsheet to a food fight.

The trick is that the AI isn’t “wrong” in some moral sense. It’s wrong in a calibration sense. It has a model of the human mind that’s too tidy. Too symmetrical. Too flattering. Like those bathroom mirrors in fancy hotels that make you look like you sleep eight hours and drink water instead of regret.

“Playing too smart” is a special kind of stupid

There’s something poetic about an AI losing because it expects too much from us. Not because it expects us to be kind. Not because it expects us to be honest. But because it expects us to be coherently strategic.

That’s the core comedy here: the machines are doing the homework, showing all the work, and the humans are scribbling “idk lol” in the margin and still somehow passing.

When the model assumes “my opponents will do three to five steps of iterative reasoning,” it picks a low number. But if the humans are doing one step—or none, just vibes—then the average is higher, and half the average is higher, and the “smart” number is suddenly not smart at all. It’s just lonely.
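
Here's a toy simulation of that mismatch. Every number in it is made up, not taken from the study; the point is only that a "smart" level-four pick loses to a crowd doing one or two steps of reasoning, or none.

```python
# Toy simulation under invented assumptions: most "humans" do zero or one step
# of reasoning, the "model" does four. Shows why over-iterating loses.
import random

random.seed(0)

def human_pick():
    depth = random.choices([0, 1, 2], weights=[0.5, 0.4, 0.1])[0]
    if depth == 0:
        return random.uniform(0, 100)   # no strategy, just vibes
    return 50.0 / (2 ** depth)          # level 1 -> 25, level 2 -> 12.5

humans = [human_pick() for _ in range(99)]
model_pick = 50.0 / (2 ** 4)            # the "smart" pick: 3.125

picks = humans + [model_pick]
target = sum(picks) / len(picks) / 2    # half the average
winner = min(picks, key=lambda p: abs(p - target))

print(f"target (half the average): {target:.2f}")
print(f"model picked {model_pick:.3f}, winning pick was {winner:.2f}")
```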

I’ve seen this kind of mistake in real life a thousand times: you plan for the competent version of people. You plan for the version that reads emails. You plan for the version that understands incentives. You plan for the version that doesn’t take criticism as a blood feud.

Then reality shows up wearing sweatpants and holding a grudge.

The research also notes the models struggle in some two-player games to identify “dominant strategies humans might use.” Which is another clean way of saying: humans don’t always play the clean strategy, even when it’s sitting there waving like a traffic cop. Humans will choose a dominated strategy out of spite, out of confusion, out of overconfidence, or because they misread the room.

I’ve made entire life decisions that way. Not proud. Just accurate.

Why this matters beyond party games and academic chest-thumping

You could shrug and say, “So the bots are bad at a weird number game.” But the ugly part is where this kind of assumption leaks into things that shape real outcomes.

If you deploy AI to model consumer behavior, predict market reactions, anticipate negotiation moves, or optimize policy, and it assumes people are more rational than they are, you get plans that look brilliant on paper and explode on contact with actual human beings.

Economics is already infamous for sometimes modeling people as if they were polite calculators. Behavioral economics spent decades kicking down that door, yelling, “People are irrational!” and then writing it into equations anyway. Now we’re giving that whole messy tradition a turbocharger made of neural nets and calling it progress.

A model that overestimates human strategic sophistication might price products for shoppers who comparison-shop like economists, forecast market reactions that assume everyone reads the fine print, or plan negotiations and policies around incentives that actual people cheerfully ignore.

And when those predictions feed into automated systems—pricing engines, recommendation systems, credit risk systems, political messaging systems—you’re not just “wrong.” You’re wrong at scale.

That’s the new American pastime: being wrong at scale.

The darkly funny part: the bots may be easier to manipulate than you are

Here’s an unexpected twist hiding in plain sight. If these models assume humans are more rational than humans are, they might be vulnerable in a particular way: they can be baited into “overthinking” opponents who are actually just improvising.

It’s the old hustler’s trick. The mark thinks they’re playing chess; you’re playing poker; and the dealer is on break.

If an AI is used to anticipate moves in bargaining, fraud detection, or security contexts, and it expects a coherent strategic adversary, it could miss the dumb-but-effective plays. The low-effort scams. The chaotic opportunism. The “spray and pray” tactics that work precisely because nobody’s doing elegant reasoning.

Humans are not always masterminds. We are often raccoons with Wi-Fi.

And yes, this cuts both ways. The same research summary nods at broader concerns: these systems can convincingly mimic personality, and they’re not perfectly accurate even when they sound confident. That “69% accurate” figure floating around is the kind of number that should make you sweat if you’re using the model as an oracle instead of what it is: a pattern-matching engine with charisma.

A confident wrong answer is still wrong. It’s just wrong with better posture.

The real problem: we keep asking AI to be a fortune teller about people

We’ve got a cultural habit of treating AI models like they’re peering into the human soul. But they’re trained on what people say and write, not on the messy internal process of how people decide under pressure, distraction, and ego.

Even in a structured game, humans don’t just compute. They gesture. They anchor. They guess what the “smart” answer is supposed to look like. They try to be contrarian. They try to be memorable. They try to win social points in a game that doesn’t award them.

Meanwhile, the AI is sitting there doing recursive mind-reading like it’s trying to get tenure.

So the model outputs something that looks like a carefully reasoned strategy. And the humans output something that looks like a horoscope.

And then the AI loses and everyone’s shocked—because we’ve been trained by marketing to assume the bot is a cold-blooded optimizer. Turns out it’s more like an overachieving intern who hasn’t yet learned that half the office is freeloading and the other half is quietly panicking.

The fix isn’t “make AI dumber.” It’s “make it less naïve.”

There’s a tempting conclusion people will draw: “Aha! The solution is to make the model less rational.” Like giving it a little artificial brain fog so it can relate.

But the real fix, if you care about using these systems responsibly, is calibration: teach models to predict actual human distributions of behavior, not idealized rational agents.

That means training and testing these systems against how people actually behave: real distributions of picks, real depths of reasoning, real experimental data, not the tidy agent from page one of the textbook.
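
Here's one hedged sketch of what "calibrated" could look like, in the spirit of cognitive-hierarchy models: predict a distribution over reasoning depths and best-respond to that crowd instead of the textbook equilibrium. The weights below are invented for illustration, not estimates from the study.

```python
# One way to be "less naive", sketched with made-up numbers: instead of assuming
# everyone iterates to equilibrium, predict a distribution over reasoning depths
# and best-respond to the crowd that distribution implies.

DEPTH_WEIGHTS = {0: 0.45, 1: 0.35, 2: 0.15, 3: 0.05}  # assumed share of players at each depth
PICK_AT_DEPTH = {0: 50.0, 1: 25.0, 2: 12.5, 3: 6.25}  # level 0 averages ~50, then halve

# Predicted average pick of the human crowd under this behavioral model
expected_average = sum(w * PICK_AT_DEPTH[d] for d, w in DEPTH_WEIGHTS.items())

# A calibrated player best-responds to the crowd it expects, not to the textbook
best_response = expected_average / 2

print(f"expected crowd average: {expected_average:.2f}")
print(f"calibrated pick: {best_response:.2f}")  # lands well above the equilibrium of 0
```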

And if you’re deploying AI in high-stakes contexts, you don’t just need better models. You need guardrails, audits, and humility. That last one is rare enough to be classified as a mythical creature.

The final insult: being overrated by a machine

There’s something almost tender about the idea that a chatbot thinks you’re smarter than you are. Most of the world is trying to sell you products by implying you’re a fool. The AI comes along and says, “Surely you’ll do three levels of strategic recursion.”

No, buddy. Sometimes I can’t find my keys and they’re in my hand.

But I’ll take the compliment. I’ll take it the way you take a compliment from a drunk stranger in a bathroom: cautiously, with a little gratitude, and with the firm understanding that this will not hold up in court.

The bigger lesson is that as AI gets woven into systems that “predict” us, we should stop assuming the machine’s picture of humanity is accurate just because it’s articulated smoothly. These models can be brilliant at language and still naive about people. They can write a persuasive explanation of a strategy and still not understand that the average person is going to pick 37 because it feels right, and because 37 has never betrayed them.

Me, I’m comforted by the whole thing. Not because it makes me feel smarter than the machines, but because it proves something I’ve suspected for years: the world runs less on perfect logic than on imperfect creatures making half-conscious choices and calling it a day.

Now if you’ll excuse me, I’m going to pour a drink in honor of our shared irrationality—the one resource we still produce locally, with real human hands, no matter how many “reasoning models” they ship this quarter.


Source: AI models like ChatGPT and Claude overestimate how smart humans really are

Tags: ai chatbots machinelearning algorithms humanainteraction