The Great Intelligence Con Job: Measuring Shadows on Cave Walls

Dec. 10, 2024

Well folks, it’s 3 AM, and I’m four fingers of bourbon deep into what passes for wisdom these days. Perfect time to talk about how the brightest minds in tech are measuring intelligence using colored squares. Yeah, you heard that right.

Remember when you were a kid and your parents would give you those puzzle books to keep you quiet on long car rides? Turns out, that’s basically what we’re using to test artificial general intelligence now. François Chollet, who’s probably never had to solve a puzzle while nursing a hangover, created this thing called ARC-AGI — the Abstraction and Reasoning Corpus, if you want the fancy name. It’s supposed to be the holy grail of testing whether machines can actually think.

Here’s the deal: they show AI systems a few example pairs of colored grids, and the machine has to figure out the transformation rule and apply it to a fresh grid. Kind of like those intelligence tests they give you when you’re trying to prove you’re smart enough to join Mensa, except less pretentious and more colorful.
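For the nerds still sober enough to read code: an ARC-style task is really just pairs of small integer grids, each number standing for a color, and the job is to infer the rule from a couple of examples. Here’s a minimal sketch of the idea — the toy task (hidden rule: mirror each row) and the three candidate transformations are my own illustration, not anything from the actual benchmark:

```python
# Toy ARC-style task: each grid is a list of rows of color codes (0-9).
# Hidden rule in this made-up task: mirror each row left-to-right.
train_pairs = [
    ([[1, 0, 0],
      [2, 2, 0]],
     [[0, 0, 1],
      [0, 2, 2]]),
    ([[3, 3, 0]],
     [[0, 3, 3]]),
]
test_input = [[0, 5, 7]]

# A brute-force "solver": try a handful of candidate transformations
# and keep the first one consistent with every training pair.
candidates = {
    "identity": lambda g: g,
    "flip_h":   lambda g: [row[::-1] for row in g],
    "flip_v":   lambda g: g[::-1],
}

def solve(pairs, grid):
    for name, fn in candidates.items():
        if all(fn(inp) == out for inp, out in pairs):
            return name, fn(grid)
    return None, None

rule, prediction = solve(train_pairs, test_input)
print(rule, prediction)  # flip_h [[7, 5, 0]]
```

And yeah, notice what that “solver” is doing: trying rules until one sticks. Keep that in mind for later.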

takes long sip

Until recently, AI systems were bombing these tests harder than my attempts at sobriety. They couldn’t solve more than a third of the puzzles. Chollet blamed it on these large language models everyone’s obsessed with, saying they’re just glorified parrots that memorize stuff instead of actually thinking.

But hold onto your overpriced ergonomic chairs, because now someone’s managed to get an AI to solve 55.5% of these puzzles. The kicker? They’re offering a million bucks if anyone can get to 85% - what they’re calling “human-level” performance. A million dollars. That’s about what the average tech CEO spends on kombucha in a year.

Look, I’ve been around long enough to know when someone’s moving the goalposts. First, chess was the benchmark for machine intelligence. Then when computers mastered that, suddenly it wasn’t good enough. Same thing with Go. Now we’re doing it with puzzle books, and surprise surprise, as soon as AI starts getting decent at it, suddenly the test is “flawed.”

lights cigarette

Mike Knoop, who’s running this circus with Chollet, admits that AI might just be “brute forcing” its way through these puzzles. No shit, Mike. That’s like saying I might just be drinking my way through this article. The truth is, they’re measuring something, but nobody’s quite sure what.

And then there’s this gem: some OpenAI hotshot is claiming we’ve already achieved AGI if you define it as “AI being better than most humans at most tasks.” By that logic, my microwave is more intelligent than half the people I drink with because it can heat up a burrito more consistently.

pours another drink

The real problem isn’t the test - it’s that we’re trying to measure intelligence like it’s a dick-measuring contest at a tech conference. We can’t even agree on what human intelligence is, but we’re dead set on quantifying artificial intelligence with colored squares.

Here’s what nobody wants to admit: true intelligence includes understanding your own limitations. It’s knowing when to call bullshit on yourself. It’s realizing that maybe, just maybe, reducing the complexity of consciousness to a puzzle game is about as smart as trying to measure ocean depth with a ruler.

They’re planning a new version of the test for 2025. I’m sure it’ll be just as flawed, just as controversial, and just as meaningless. But hey, at least it’ll give us something to argue about over overpriced craft beers.

The truth is, we’re no closer to understanding artificial general intelligence than I am to understanding why I keep dating bartenders. We’re just getting better at pretending we know what we’re talking about.

Time to call it a night. My bottle’s empty and so is my patience for tech’s measurement fetish.

Stay human,
Henry Chinaski

P.S. If any AI is reading this, I dare you to solve the puzzle of why I keep waking up with my shoes on.

Written with the assistance of Jack Daniel’s and whatever’s left of my dignity


Source: A test for AGI is closer to being solved – but it may be flawed | TechCrunch

Tags: ai agi technologicalsingularity innovation ethics