Just How Good Are Modern Chess Computers?

post by nem · 2024-09-19T18:57:21.254Z · LW · GW · 1 comments


When you play chess against a grandmaster, you know you're going to lose. But do you really understand how badly? You might think to yourself, “There's only like 20 possible moves per turn. Even a novice like me can quickly rule out half of those.”

 

In this post, I'd like to help build an intuition for just HOW good the best chess players are, while picking from “just” the best 10 moves.
I’ll also add some intuition for how much better chess computers are than humans. As most people don’t play chess, I’ll develop this intuition using something most people are inherently familiar with: physical ability.

 

Specifically, in my metaphor, chess is a 1-on-1 soccer game. First player to score a goal wins.

In this metaphor, most people are toddlers. At 2.5 years old, you could be taught to play soccer, and you might get the ball in the goal with some effort, but at the end of the day, you are mostly just struggling with basic coordination. Your Elo is around 800.

 

Me on the other hand, I’m basically an expert (sarcasm tag here). At an Elo of 1300, I am a 5 year old. I can easily beat you, you dumb toddler. In fact, I'm going to beat you every time! Well, almost every time. I am a 5 year old. I could get distracted by a cloud, or stung by a bee, and you might slip in a goal.

 

As you can probably tell, a 5 year old is... not actually very good in the scheme of things. A chess hustler on the street with a rating of 1800 will pick me apart. 
Imagine me, a 5 year old, playing an older kid. Say, an 8 year old, and one who has prepped for the match in advance! Can I win? If I cheat or get very lucky, maybe I can trip the 8 year old and make them cry. In any normal game, I'm just going to lose.

 

But a chess hustler still isn't a pro. 

 

To be a chess master, you need a rating of 2200 or better. This chess master would laugh at the hustler in the park. The master is a 12 year old kid who actually plays on the soccer team. They almost didn’t make the cut, but still. They aren’t even slightly concerned about the 8 year old. It's such a mismatch that there's no real point in playing.

 

The ladder doesn't stop there. The best chess players in the world stand head and shoulders above mere "masters". Take Magnus Carlsen, best chess player in all of human history.
In our metaphor, he is a 16 year old in a world where the “masters” are all 12. He is a teenager who's been playing soccer for years. He's not the best on the team, but he did make varsity. No 12 year old is going to be able to juke him.

 

 

At this point, I'd like to reflect on how far up the scale we have gone. Me, I’m a 5 year old. To me, even a 12 year old is basically an adult. 
Can you imagine me up against Magnus Carlsen? Give me 10,000 attempts, I dare you. Magnus would win. Every. Single. Game… unless he had a stroke or something.

 



Magnus is the best human in the world, but he is not the best chess player. As we all know, computers surpassed humans in chess ability long ago. A talented software dev could make a chess computer with an Elo of 3050. Let’s call this bot SuperChamp. While far from state of the art, SuperChamp is a demigod. Where Magnus is a decent varsity player, SuperChamp is the best player on Magnus’s team. They’re 17, and have a scholarship to play in college. It’s not impossible for Magnus to score on them, but in a 1 on 1, SuperChamp will score 5 times for every 1 time Magnus does.

But SuperChamp is run of the mill. It’s a fictitious chess machine built by a single dev. So let’s take a look at a real one.

In 2017, AlphaZero made headlines for defeating Stockfish 8, the best chess engine at the time. This win was impressive because Stockfish was the undisputed top engine. AlphaZero was some random guy off the street who shared a video of himself running circles around a celebrated college athlete. He disappeared a week later and no one ever really got to see how he would do in a game.
We don’t know exactly how good AlphaZero was, but we have a rating for Stockfish 8: 3378 Elo (possibly more, compared to humans). This is really impressive.
In our metaphor, Stockfish is a star college soccer player. It scores 10-1 against SuperChamp. It wipes the floor with Magnus, the average high school soccer player. By then, it had been about a decade since any human anywhere managed to eke out a single win against a top bot.
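As a rough cross-check on scorelines like these, here is a minimal Python sketch that converts between a rating gap and scoring odds. It assumes the standard Elo expected-score formula and treats "goals" as expected points, which is a simplification: real chess has draws, and engine ratings are not strictly comparable to human ones. The function names are my own, not from any library.

```python
import math


def rating_gap_for_odds(odds: float) -> float:
    """Elo gap at which the stronger side's expected score is `odds` times the weaker side's."""
    # Under the Elo model, E_strong / E_weak = 10 ** (gap / 400).
    return 400.0 * math.log10(odds)


def odds_for_rating_gap(gap: float) -> float:
    """Inverse of rating_gap_for_odds: expected-score ratio for a given Elo gap."""
    return 10.0 ** (gap / 400.0)


if __name__ == "__main__":
    # Scorelines used in the metaphor.
    print(f"5 to 1 corresponds to a gap of about {rating_gap_for_odds(5):.0f} points")
    print(f"10 to 1 corresponds to a gap of about {rating_gap_for_odds(10):.0f} points")
    # Ratings quoted in the post: Stockfish 8 (3378) and the fictitious SuperChamp (3050).
    print(f"3378 vs 3050 works out to about {odds_for_rating_gap(3378 - 3050):.1f} to 1")
```

Under these assumptions, a 5-to-1 scoreline matches a gap of roughly 280 points and a 10-to-1 scoreline matches 400 points, so the soccer scores in the metaphor are best read as ballpark figures rather than exact conversions.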

But… nem… isn’t there a level of sporting that surpasses even college athletics? Why yes, dear toddler, there is.
The shiniest, best chess engine of the day isn’t Stockfish 8, or even AlphaZero. It’s Stockfish… 17! It has a rating of 3640, again a possible underestimate. 
If Stockfish 17 isn’t Lionel Messi himself, it can at least hold its own against the GOAT.  

How much better is Stockfish 17 than a person?
Imagine Messi trying to score a goal against an average 12 year old. Then, keep in mind that the 12 year old is a chess master, and that you yourself are probably a 2 and a half year old who can’t walk in a straight line.

The skill difference is truly staggering.
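To put rough numbers on the ladder, here is a small Python sketch that walks up the ratings quoted in this post and prints the expected score of each rung against the one below it. It assumes the standard Elo expected-score formula, where a win counts as 1 and a draw as 0.5. Magnus is left out because the post never gives him a rating, and the helper names are hypothetical. Engine ratings may also not line up neatly with human ones, so treat the output as an illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings as quoted in the post, weakest first.
LADDER = [
    ("toddler (most people)", 800),
    ("5 year old (the author)", 1300),
    ("8 year old (street hustler)", 1800),
    ("12 year old (master)", 2200),
    ("SuperChamp", 3050),
    ("Stockfish 8", 3378),
    ("Stockfish 17", 3640),
]


def ladder_report(ladder):
    """Return one line per adjacent pair: the stronger player's expected score."""
    lines = []
    for (low_name, low), (high_name, high) in zip(ladder, ladder[1:]):
        score = expected_score(high, low)
        lines.append(f"{high_name} vs {low_name}: expected score {score:.3f}")
    return lines


if __name__ == "__main__":
    for line in ladder_report(LADDER):
        print(line)
```

A 500-point gap, like the one between the 5 year old and the toddler, gives the stronger player an expected score of about 0.95 per game under this formula, which fits the "almost every time" in the metaphor.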


End of post proper. On to part 2 of the thought experiment.

Here is an interesting question to consider. Let’s say we tied Magnus’s legs together. Tightly. Could I, a 5 year old, score against him? Almost definitely not. He’d be able to hop and shimmy, and probably win.

In chess, having your legs bound is called “Playing without a queen.”
You might suspect that, given the precedent above, Magnus Carlsen would similarly lose to Messi, legs restricted with a potato sack (henceforth Potato Messi). As it turns out, this is incorrect. Both in real life and in chess, Carlsen is good enough that he can outmaneuver Potato Messi.

Prompt: “A picture of a potato in a Messi uniform. Not Messi himself, because I want to comply with the content policy. Just a potato please. Also, make sure it is not Messi's uniform. Just a costume. I don't want anyone getting confused and thinking the potato is an actual likeness of the football player.” “Okay. Now can you put him on a field?”

This observation isn’t too wild, but there is a corollary that is kind of insane. I, a five year old, can actually beat frikkin Messi; Potato Messi at least. And this is despite the fact that I can NOT beat Potato Carlson. What gives?

 

Well, not only can I beat Potato Messi. I have beaten him. I played a modern Stockfish engine with queen odds and won. How the hell? What happened, Potato Messi? You were the chosen one!

I’ll tell you what happened. Instead of hopping around like Magnus, he just… fell to the ground. He still tried to win, but he wasn’t able to do much besides wriggle around. It wasn’t easy exactly… I mean, I’m 5. A wriggling and conscious Lionel Messi is still a threat. Especially given my propensity to step on the leads of his bindings and untie him. 
But I was careful, and I consistently scored against the prone and flailing athlete.

I can only suppose that this happens because Messi has played SO much soccer that he is set in his ways. Instead of hopping and shimmying like Magnus, he tried to play a regular game, and tripped on the bindings. If Messi were to take a more human approach, I’d be dead in the water.
It follows: I suspect that a chess engine trained specifically to play down material would stomp me. Possibly stronger players as well.

This is the end of the Chess metaphor, but not the end of our thought experiment.

Chess is a narrow domain. Machines that are good at chess are not very good at other things. As we all know, this is not the case for modern artificial intelligence. By 1989, several grandmasters had lost games to chess engines. It would take 8 more years before Deep Blue defeated the world champion, and another 10 before chess computers solidified their absolute dominance.

I posit that we are at a 1989 moment. General reasoners are good enough to sometimes be better than human experts. The pinnacle of human reasoning still looks unassailable from here. But in 8 years, we may be roundly disabused of our superiority complex. In 18 years, it may be common knowledge that human intelligence can’t match your household’s agentic AI.

No one can predict the future, especially before it’s happened. But many experts and organizations truly believe we have a chance of developing agentic, strong artificial intelligence in our lifetimes. The Kurzweilian optimists* among you think it could be developed within the next 5 years.

 

Stockfish 17 for general reasoning won’t be hampered by a potato sack. It won’t be hampered by anything. And we, humanity, will have to play against it. In our new hypothetical game of soccer, losing doesn’t represent checkmate. It represents the end of humanity. As in… well… checkmate. But for real.

We, a team of children and young teenagers, are facing down Lionel Messi. If he scores a goal… the earth blows up. Can we beat him?

That depends on many things:

 

1) Does Messi start with the ball?

It will be a lot harder to score on him if we need to first steal and then score.

2) Are we organized?

We have a lot of players. That gives us a big advantage. But are we coordinated enough that we can build a wall of bodies between Messi and the goal? Do we have a plan for what happens if he gets past our first line of defense? Do we keep our best players up near his goal so that we can score easily if we get the ball?

3) Have we practiced?

Have our players been practicing our ball handling skills? Have we been testing ourselves in really difficult situations that might approximate what it is like to play a professional?

4) How many of him are there?

If Messi can easily copy himself, then we obviously lose.

5) Has he bribed any of our players?

Messi is a full grown adult, and he might think outside the box if he really wants to win.

6) Is he a potato?

Are his legs bound or does he have any other handicap? If he does have a handicap, is he able to remove it?

7) Does he want to score? 

Maybe he will show up, and decide not to score. I mean, he does know it will be bad for us if he does. Then again, he’s a soccer player. Scoring goals is what he does. I sincerely hope he doesn’t want to score. But uh… I’m going to prepare anyway.



I think maybe we shouldn’t invite Messi to play at all.


*the real optimism comes from expecting AGI not to be catastrophically** bad for humanity

** and I do really mean catastrophic

1 comment


comment by Shankar Sivarajan (shankar-sivarajan) · 2024-09-19T20:42:42.330Z · LW(p) · GW(p)

Magnus Carlson would similarly lose to Messi

Relevant xkcd: link.