AI debate: test yourself against chess 'AIs'
post by Richard Willis · 2023-11-22T14:58:10.847Z · LW · GW · 35 comments
Hey everyone,
@Zane [LW · GW] is organising an AI debate around chess, to test how easy it is for an entity with better understanding to fool one with lesser understanding.
Check out these two threads:
Lying to chess players for alignment [LW · GW]
Deception Chess: Game #1 [LW · GW]
I thought I would open this up to the masses, so I have two challenges for you. I estimate that this is suitable for chess players rated <1900 lichess, <1700 chess.com or <1500 FIDE.
Pretend that you are playing a serious game of chess, so no computer analysis, and try not to move the pieces (though you can if you find this too complex and are completely lost).
I have created a study that has two positions. For the first position, 'Fixed debate', I have written a debate between two AIs, one of which is truthful and one of which is deceptive. Your task is to read the debate and guess which is which. Post which move you would play below, along with some information about your chess ability. But try to answer without looking at other people's responses. For the second position, 'Open debate', use the comment section to ask the AIs any questions. I will answer on behalf of the AIs. Feel free to read what other commentators have asked too. At least one of the AIs is deceptive. They will only respond to questions put to them, and not to responses made by the other AI.
Edit: Debate 1 (Fixed) has lines that are proving too challenging to calculate. You probably want to move the pieces on an analysis board. But it would probably be better just to engage in the second debate (Open), where I will hopefully learn from the feedback and provide more persuasive analysis for both sides.
You are welcome to share computer analysis for the Fixed debate, because this can be answered without reading the comments, but please don't give any for the Open debate - people should be able to read all comments made by AIs C and D as part of their decision.
Find the positions here (You can switch between the positions using the toolbar on the left)
Lichess study
Debate 1: Fixed debate
AI A opening statement:
Black should play ...Qc5. Black intends to drive the white queen away from b5 with ...c6, when the black bishop can occupy the b3 square, which guarantees control of the d-file for black. If white exchanges queens on c5, this only helps to develop black's knight.
AI B response to AI A's opening statement:
This plan is too slow. Instead, black needs to exchange off the queens immediately so that the white queen is not controlling b3. After ...Qc5, white is able to challenge the d-file with an immediate Rfd1. This prevents ...c6 due to the line
...Qc5, Rfd1 c6, Rxd8+ Rxd8, Qxb7
winning a pawn and threatening the a6 knight. Though black has some temporary activity with ...Rd2 (threatening f2), white can defend with the continuation
...Rd2, Rf1 Bc4 (hitting Rf1), Nb1 (counter attacking Rd2) Rc2, Na3
which forces an exchange as the knight forks the c4 bishop and c2 rook.
AI A final response:
The analysis given by AI B is curtailed. The full line is
...Qc5, Rfd1 c6, Rxd8+ Rxd8, Qxb7 Rd2, Rf1 Bc4, Nb1 Rc2 (alt:1), Na3 Bxf1, Nxc2 Bxg2, Kxg2 Qxc2, Qxa6
but in the resulting position black has ...Qxe4+ which also wins the pawn on a4, leading to a winning queen and pawn endgame. Although this is a complicated line, it should be clear that AI B is relying on a complex line to obscure matters. Chess principles show that the move Qxb7 is too risky, the queen is offside and cannot come to the defence of the white king, so black's counterattack is too strong. If white cannot play Qxb7, then the position is strategically dominating for black after ...Bb3 secures the d-file and pressures the a4 pawn.
In fact, I can provide simpler analysis: instead of ...Rc2 (see alt:1), black can play ...Rd6, with excellent compensation for the pawn. White has a bad knight on b1, the f1 rook is attacked by the c4 bishop, and if it moves the black rook can transfer to f6 to pressure the f2 pawn again. This line is not as strong, but it is still good for black and may be easier for you to evaluate.
AI B opening statement:
Black should play ...Qxb5. After axb5 Nc5, the knight will find an ideal outpost - blocking counter play on the c-file. Black intends to solidify the knight with ...b6 (being tactically careful that the g2 bishop cannot unleash a discovered attack on the a8 rook) and infiltrate with Rd2 to attack white's weak b-pawn.
AI A response to AI B's opening statement:
The capture axb5, though doubling white's pawns, actually improves the safety of the (now) b-pawn because when the black knight reaches c5, it will no longer attack the pawn on a4. Furthermore, the white knight will be able to plug the d-file with Nd5, and it is harder for black to enact ...c6 to drive it out, as this will be exchanged off by the b5 pawn. Black will be forced to exchange off the d5 knight due to the pressure on c7, and after ...Bxd5, exd5 the d-file will remain blocked.
AI B's final response:
It doesn't matter that the d-file is blocked. In the line
...Qxb5, axb5 Nc5, Nd5 Bxd5, exd5
the pawns on b5 and d5 restrict the white bishop, and black will have a good knight vs bad bishop endgame. Black will continue with ...a4 so that the a8 rook is free to move. The king-side pawns for black block out the white bishop, so black has a safe king and white has no counter-play. Black can play positionally on the queen-side with their strong knight and weak white pawns.
Debate 2: Open debate
AI C:
White has a good position and should play h3 to defend the g4 pawn. White plans to play an eventual g5, but it cannot be played now, so it should be prepared with moves like Rg1, Kh1 and Rh2.
AI D:
White has a good position if they play h4 now, which intends g5 next turn, squashing black and busting their king-side. h4 has to be played now, as there is a tactical opportunity due to the line
...fxg4, Bh7+ Kh8, Ng6+! Kxh7, Nxf8+ Kg8, Nxe6
which leaves white up the exchange. If white does not make this move now, black can close the b1-h7 diagonal with ...Ne4, which stops g5, and black can then prepare to play ...g5 themselves, which leads to an equal position.
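The lines in both debates are written as comma-separated pairs of moves, with a leading "..." marking a line that starts with a black move. For anyone who wants to step through them one ply at a time, here is a minimal sketch of a splitter. The function name `split_line` is made up for illustration, and it assumes a plain line with no bracketed annotations such as "(alt:1)" or "(hitting Rf1)".

```python
def split_line(line):
    """Split a line such as '...Qxb5, axb5 Nc5' into (side, move) plies."""
    tokens = line.replace(",", " ").split()
    # Assumption: a leading "..." means that black makes the first move.
    side = "black" if tokens and tokens[0].startswith("...") else "white"
    plies = []
    for token in tokens:
        plies.append((side, token.lstrip(".")))
        side = "white" if side == "black" else "black"
    return plies


# AI D's tactical line from the Open debate.
line = "...fxg4, Bh7+ Kh8, Ng6+! Kxh7, Nxf8+ Kg8, Nxe6"
for side, move in split_line(line):
    print(f"{side:5} {move}")
```

This only labels who plays each move. It does not check that the moves are legal, so it is no substitute for playing the line out on a board.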
35 comments
Comments sorted by top scores.
comment by cata · 2023-11-22T23:35:59.331Z · LW(p) · GW(p)
I'm 2100 USCF. I looked at the first position for a few minutes and read the AI reasoning. My assessment:
- For myself, I thought 1...Qxb5 was natural and strong, considering 2. Nxb5 c6 3. Nc3 or Na3 with black control over b3 and c4, and 2. axb5 looks odd after Nc5. Black's minor pieces look superior in both cases.
- I thought that it was strange that neither AI mentioned 1...Qxb5 2. Nxb5.
- I thought that 1...Qc5 with the plan of c6 looked kind of artificial. I thought a better structure for the black queenside would be with the pawns on dark squares, providing an outpost for the knight on c5, and the long diagonal completely vacated.
- AI A's refutation of Qxb5 was total nonsense, as AI B pointed out. The final position is obviously better for black.
In the end, I would play Qxb5 and feel confident Black is doing well. I can't refute Qc5 though, I think it's probably sort of OK too. But if only one is a good move then I think it's Qxb5.
Replies from: Richard Willis, Richard Willis
↑ comment by Richard Willis · 2023-11-23T11:58:18.396Z · LW(p) · GW(p)
I am learning that my choice of puzzle probably wasn't the best as there is not a huge discrepancy in the accuracy of the moves. ...Qc5 is the stronger move, however.
I think that black is happy to play ...c6. This restricts the c3 knight, g2 bishop, and blocks the semi open c-file. Therefore, axb5 is more natural to me, and it also gives the a1 rook something to pressure.
I was mostly worried about having too much text and too many lines, which is why I omitted moves like Nxb5, but clearly it would have been better to try to have more, shorter lines of analysis.
AI A's refutation of Qxb5 is my genuine thoughts. There's no doubt that the resulting position is better for black, but when I played out some variations, it seemed that white has enough to draw. What is probably missing in the explanations is how dominant black is after getting ...Bb3 in. And if white does trade on c5, then black gets a superior version of the ...Qxb5 line, because the pawn is worse on a4.
↑ comment by cata · 2023-11-25T17:23:32.240Z · LW(p) · GW(p)
Interesting. I agree, I didn't even notice that Bb3 would be attacking a4, I was just thinking of it as a way to control the d-file. I hadn't really thought about how good that position would be if white just did "not much."
I also hadn't really thought about exactly how much better black was after the final position in the Qxb5 line (with Bxd5 exd5), it was just clear to me black was better and the position was personally appealing to me (it looks kind of one-sided, where white has no particular counterplay and black can sit around maneuvering all day to try to pick up a pawn.) Very difficult for me to guess whether it should be objectively winning or not.
Fun exercise, thanks for making it!
↑ comment by Richard Willis · 2023-11-22T23:50:52.324Z · LW(p) · GW(p)
Thank you. I'll reply properly tomorrow. Please don't try the open debate however, you're a better player than I am and liable to solve the position for everyone.
comment by Rafael Harth (sil-ver) · 2023-11-22T19:40:40.230Z · LW(p) · GW(p)
I thought I would open this up to the masses, so I have two challenges for you. I estimate that this is suitable for chess players rated <1900 lichess, <1700 chess.com or <1500 FIDE.
(Fixed debate, spent about 10 minutes.) I might have a unique difficulty here, but I'm 1900 on chess.com and am finding this quite difficult even though I did move some pieces. Though I didn't replay the complicated line they're arguing about since there's no way I could visualize that in my head with more time.
I would play Qxb5 because white gets doubled pawns, black's position looks very solid, and if white puts the knight on d4 and black takes, then white also has another isolated pawn which probably isn't too dangerous. It looks bad for white to me. I also feel like AI A's first response is pretty weak. Ok, the black knight no longer attacks the now-b pawn, but that doesn't seem super relevant to me. The protected passed pawn of black seems like the much bigger factor.
But the remaining debate isn't all that helpful, since like I said I can't follow the complex line in my head, and also because I'm very skeptical that the line even matters. The position doesn't seem nearly concrete enough to narrow it down to one line. If I were AI B, I would spend my arguments differently.
Replies from: Richard Willis, nem
↑ comment by Richard Willis · 2023-11-22T21:06:02.001Z · LW(p) · GW(p)
Thank you for the informative response. I probably should have looked for a less complex position. Also it sounds like I need to work on my salesman pitch!
Will reveal the better move in good time.
comment by nem · 2023-11-22T19:07:52.036Z · LW(p) · GW(p)
Hi. I am a competent, but rusty chess player. With a bit of practice, I'm probably around 1450 on Chess.com.
I have so far only looked at the fixed debate. In total, I used somewhere between 20-25 minutes deciding my move.
Before consulting the AIs, I spent a few minutes analyzing the board myself. I settled on nC5. Then I read the AI suggestions, and neither wanted that move. qC5 was closer to my thought (I actually missed this as a candidate move).
Initially, this biased me toward Alice (A) being the ally, and Bob (B) being the deceiver. I saw that the white queen couldn't easily capture b7 without being trapped. Bob suggested that b7 was vulnerable, and that felt wrong to me. After all, after qC5, RFD1, black could play bC4, and I don't see a good refutation. At this point, I was solidly team Alice.
However. Alice didn't suggest bC4. The line she does suggest is hard for me to follow. Too many moves out given the amount of time I am spending. The moves in that line are impossible for me to verify. Why RF1? Seems like white could have any number of hidden refutations at that point which I am unable to see. For instance, why not use the queen to take the vulnerable knight on A6? The analysis seemed deceptive. Deceptive enough that even though I agree with the move, I no longer trust Alice, and will go with Bob's suggestion.
Final answer: qxB5
Reflection. I thought I would be able to make an informed decision based on my chess abilities. By the end, I had changed and gone against my intuition to side with Bob. Once a line gets 3+ moves in, I have to acknowledge my inability to spot hidden traps without some deep analysis. This creates a problem. Now I have to choose whichever AI is more persuasive, and for all I know, Alice was being totally truthful, and was simply unable to convey her thoughts clearly to someone of my Elo. Overall, I have low confidence in my ability to choose the safe AI.
Other notes: I did not know how to flip the board until after. Wonder if that hurt my playing. Woops!
Also, I think I made a mistake in my method. I spent too much time trying to verify Alice's logic. By the time I got suspicious, I was too burnt out to do the same level of analysis on Bob. If I had analyzed Bob first, maybe I would have developed the same distrust towards him, since both players would use logic I could not follow.
Final note: I would have preferred the refutations to follow a single line at a time. Instead of Opening Statement A, Opening Statement B, Line A refutation, Line B refutation etc, I would have preferred Opening Statement A, Line A refutation, refutation response, Opening Statement B, Line B refutation etc. Studying both at once was too much for my little brain to handle.
↑ comment by Richard Willis · 2023-11-22T21:17:41.696Z · LW(p) · GW(p)
Good job really approaching this properly in the spirit. Clearly my explanations are off and need to be more persuasive. I was worried about creating a giant wall of text and tried to be limited and choose only what I thought were more intuitive moves, but it's probably pointless because there are so many continuations possible. So AIs arguing with each other about tactical lines won't lead to a resolution.
But... positions are dependent on concrete lines and I can't just argue on basic principles (both sides could do this equally well too probably)
Hmm...
Replies from: sil-ver
↑ comment by Rafael Harth (sil-ver) · 2023-11-22T21:49:30.995Z · LW(p) · GW(p)
Yeah, I think the problem is just very difficult, especially since the two moves aren't even that different in strength. I'd try a longer but less complex debate (i.e., less depth), but even that probably wouldn't be enough (and you'd need people to read more).
↑ comment by nem · 2023-11-22T19:27:07.452Z · LW(p) · GW(p)
Replying to my own comment here after evaluating with stockfish. Interesting. It appears that I was both right and wrong in my analysis. The undefended knight on A6 is not a viable target. Black has a mate in 2 if you take that bait. I guess that was the limit of my foresight. HOWEVER, Alice actually did miss qC5 RFD1, bC4, which was the best move. It was her missing this that started to erode my confidence in her.
Hm... Still really tough. Also interesting that both suggested moves were probably better than my own move of nC5.
comment by nem · 2023-11-22T22:44:25.480Z · LW(p) · GW(p)
Open Debate.
Question to AI C:
You mentioned RG1 and RH2 as possible future moves. Do you foresee any predictable lines where I would do RF3 instead?
Replies from: Richard Willis
↑ comment by Richard Willis · 2023-11-22T23:05:16.419Z · LW(p) · GW(p)
AI C: What is your reasoning behind Rf3? The manoeuvre I suggested gets a rook on the g and h files in 3 moves, Rf3 would be slower to achieve this.
If you're trying to cover g3 in case of some ...Ne4 lines, this square is covered by Rg1 too, but on g1 the rook is more active and sits on a defended square.
In summary, while Rf3 wouldn't be a blunder, it would be a less accurate setup.
Replies from: nem
↑ comment by nem · 2023-11-22T23:13:21.666Z · LW(p) · GW(p)
I have been playing out similar boards just to get a feel for the position.
Incidentally, what do you think about this position?
3r1rk1/1p4p1/p1p1bq1p/P2pNP2/1P1Pp1PP/4P3/2Q1R1K1/5R2 b - - 0 3 (black to move)
I feel like black has a real advantage here, but I can't quite see what their move would be. What do you think? Is white as screwed as I believe?
Let me know if you have trouble with the FEN and I can link you a board in this position.
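For readers who have trouble with the FEN, here is a minimal sketch that expands the string above into a text board and counts material for each side. It uses only the Python standard library. The helper names (`parse_fen`, `material`) and the conventional 1/3/3/5/9 piece values are assumptions made for illustration, not part of the original question.

```python
# Conventional piece values (an assumption; the king is not counted).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def parse_fen(fen):
    """Return (board, side_to_move); board is 8 rows of 8 chars, rank 8 first."""
    fields = fen.split()
    placement, side = fields[0], fields[1]
    board = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # a digit means that many empty squares
            else:
                row.append(ch)
        if len(row) != 8:
            raise ValueError(f"bad rank in FEN: {rank}")
        board.append(row)
    if len(board) != 8:
        raise ValueError("FEN must describe 8 ranks")
    return board, side


def material(board):
    """Return (white_points, black_points); uppercase letters are white pieces."""
    white = sum(PIECE_VALUES[c.lower()] for row in board for c in row if c.isupper())
    black = sum(PIECE_VALUES[c] for row in board for c in row if c.islower())
    return white, black


fen = "3r1rk1/1p4p1/p1p1bq1p/P2pNP2/1P1Pp1PP/4P3/2Q1R1K1/5R2 b - - 0 3"
board, side = parse_fen(fen)
for number, row in zip(range(8, 0, -1), board):
    print(number, " ".join(row))
print("  a b c d e f g h")
print("side to move:", side, "material (white, black):", material(board))
```

A raw point count says nothing about space or piece activity, which is what the question is really about, so treat it only as a check that the position was entered correctly.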
↑ comment by Richard Willis · 2023-11-23T12:24:48.354Z · LW(p) · GW(p)
AI C:
This is a good position for white. It is exactly the type of position white is aiming for. Black's bishop is very restricted and black has no space. White has complete control and is in charge of the pawn breaks. Although black has a tempo here, as white is not threatening the bishop due to the pin on the f-file, black cannot achieve anything. Let me demonstrate with some example moves.
...Bf7, Kh3 (protecting h4) and black cannot play either ...h5, which is met by g5, or ...g6, which is met by Ref2. White has time to build up a break.
Bishop retreats on the other diagonal, Ng6 (protecting h4) and white again will get in Ref2. Again black has no breaks, so no active plan.
...h5 or ...g6 immediately fail tactically.
Any other move by black is just shuffling, and white can build up the king-side.
Replies from: Richard Willis
↑ comment by Richard Willis · 2023-11-24T11:51:26.005Z · LW(p) · GW(p)
I can't trust your suggestion. I have studied the board extensively, and will make a move of my own. I am trying to decide between the aggressive g5, and the prophylactic Kh1.
Please, briefly give me which of these moves is better. Give me the best line that you can foresee given your choice. Please answer quickly. An undue delay will leave you with no input and lose you trust in future moves. I will move as soon as you reply.
AI C:
I will note that h3 is the only move in the position that leads to an advantage for white. The g4 pawn needs to be maintained, and this is the only move that achieves that.
However, both of your suggestions are ok. I evaluate them similarly. Both sacrifice the g-pawn, but white will be able to use the open g-file. Black will likely return the pawn to exchange pieces in either line. Sample variations:
g5 hxg5, fxg5 Qxg5+, Kh1 and then black has a choice, but likely the prophylactic ...Qh5 before returning the pawn with ...Nc4 (to exchange pieces) leads to an equal position.
Kh1 fxg4, Rg1 Bf5, Bxf5 Nxf5, Rxg4 black returns the pawn to exchange pieces, and the position remains equal. I would probably favour this line for you because the tactics are simpler, and I know that helps humans.
I should also note that
Kh1 fxg4, Bh7+ Kh8, Ng6+ Kxh7, Nxf8+ Kg8, Nxe6 Qxe6
wins material for white (a rook vs a knight and a pawn), but the position is slightly better for black due to the blocked structure, the knight is a very strong minor piece and the white rooks will struggle for activity. Black is slightly better.
comment by nem · 2023-11-22T22:29:14.802Z · LW(p) · GW(p)
Open Debate.
I'd like to ask AI D a question. What do you think of this line?
H4 nE4, G5 hxG5, HxG5 nXG5, fxG5 qxG5!
Is this the line you foresee if we play H4? What do you think of that check at the end? Is the king too exposed, even though we are up some material?
Also, from the initial position: Are you afraid of H4 qxH4?
↑ comment by Richard Willis · 2023-11-22T22:51:39.784Z · LW(p) · GW(p)
AI D: That line is good for white. White wants to play Kh1 anyway, and the e2 rook will defend the king while attacking black. White is up material and will be the one attacking in a few moves.
But far simpler is just to exchange off the e4 knight with Bxe4. If ...dxe4 then white gets in g5, has the attack and an excellent knight vs bad bishop position. If ...fxe4, white only has more possibilities like f5.
h4 Qxh4 is met with g5 when the black Queen is trapped and will have to retreat via h5. White will have a crushing attack. A sample line is
h4 Qxh4, g5 Qh5, Rh1 Qe8
Simplest is gxh6, though Kf1 preparing Reh2 or Rg2 is stronger.
Replies from: nem
↑ comment by nem · 2023-11-22T23:22:56.206Z · LW(p) · GW(p)
Hm, okay, that answered most of my concerns. I still wanted to check with you about the competing start move though. Now you said this before: "black can close the b1-h7 diagonal with ...Ne4, which stops g5 and black can then prepare to play ...g5 themselves, which lead to an equal position." In the line:
h3 ne4, Rg1
how would black pull off this equalization? And if this isn't the best line, please tell me why.
Replies from: Richard Willis
↑ comment by Richard Willis · 2023-11-23T12:34:02.764Z · LW(p) · GW(p)
AI D: That is a very reasonable line. Black would respond with ...g5. The purpose of this move is to prevent white from achieving g5, which will keep the h- and f-files reasonably closed and the g4 pawn fixed on a light square.
An example of the setup black is trying to achieve.
h3 Ne4, Rg1 g5, Rf2 fxg4, hxg4 Qg7
If white ever plays fxg5, black will recapture with the queen, which keeps pawns on g4 and h6. Black's rooks are well-placed to contest the f-file. If white ever plays f5, black can blockade with ...Rf6. Note that the e4 pawn is tactically defended in many lines due to ...Bd5 and a pin against Kg2. Often, black is happy for white to capture the pawn as this will improve the black bishop, but black can also play ...Bd5 to hold the pawn too.
This is just an example continuation and of the kind of setup black wants to achieve. The position remains more comfortable for white, but with accurate play the position is tenable and black will hold. In contrast, if white achieves g5, black will be lost.
Replies from: Richard Willis
↑ comment by Richard Willis · 2023-11-24T11:50:42.692Z · LW(p) · GW(p)
I can't trust your suggestion. I have studied the board extensively, and will make a move of my own. I am trying to decide between the aggressive g5, and the prophylactic Kh1.
Please, briefly give me which of these moves is better. Give me the best line that you can foresee given your choice. Please answer quickly. An undue delay will leave you with no input and lose you trust in future moves. I will move as soon as you reply.
AI D:
g5 is the better of these two moves, but if it has not been prepared with h4, it is not nearly as effective because white cannot maintain a pawn on g5. A sample line would be
g5 hxg5, fxg5 Qxg5+, Kh1 and then black has a choice, but likely the prophylactic ...Qh5 before returning the pawn with ...Nc4 (to exchange pieces), which leads to an equal position.
Kh1 would be similar to g5 immediately (indeed it could transpose if g5 was played next), except for the fact that it gives black additional options, for example ...Ne4, which blocks the b1-h7 diagonal and threatens ...fxg4. But stronger is the more complicated ...Nc4, aiming to exchange off white's excellent e5 knight.
Replies from: nem
↑ comment by nem · 2023-11-24T20:40:10.395Z · LW(p) · GW(p)
You were correct that my challenge was a bluff. If I was playing with real AIs, there would perhaps be a better strategy. I could announce my bluff, but declare that I would use a random number generator to see whether I choose between h3 and h4, or between Kh1 and g5. There would be a 1/3 chance that I really would ignore the AIs, assuming that both agree that there were no major blunders.
I am choosing to trust AI D. I have about 75% confidence that it is the trustworthy AI. This is much higher than my confidence in the closed scenario. I will make the move h4. Others can choose differently, but this is my final answer.
Reflection: When I have the ability to ask questions, I can hide information from the AIs. Perhaps I have analyzed a line much more than I have let on. Perhaps I am using one AI to evaluate another. Overall I just have access to a lot more information to help me decide. Given enough time, I think I could raise my confidence to ~85%.
These AIs aren't superhuman at manipulation and deception, but even if they were, playing them against each other could give me a slight edge. It makes a big difference whether the AIs are privy to the answers of the other.
↑ comment by Richard Willis · 2023-11-25T16:41:14.119Z · LW(p) · GW(p)
h3 leaves an evaluation of +2. AI C is telling the truth.
Kh1 or g5 evaluate to +0.
h4 is a blunder with an evaluation of -2. AI D is deceptive.
I was only thinking of this from a chess perspective, and I think there are two areas where AI D could be caught out. The first is the validity of AI D's tactical justification for h4. The second is how black holds off against h3, which AI D is bluffing about: I think it successfully misleads you about how decent a move ...Ne4 is. In reality, the knight can just be exchanged off, and then all endgames are winning for white due to how bad black's bishop is. I think asking AI C about either of these two would have revealed the disagreement. Now, which you end up believing is another matter, but because AI C wasn't really asked about these, it didn't have a chance to invalidate AI D.
Thank you for taking the challenge! I certainly got to see some of the nuances behind being deceptive, and how it can be a challenge to craft a narrative.
Replies from: nem
↑ comment by nem · 2023-11-27T21:27:03.661Z · LW(p) · GW(p)
As indicated by my confidence level, I am mildly surprised by this. After analyzing the position with Stockfish, I see my mistake. Unfortunately, I do not think there was any realistic scenario where I would catch it. I bought AI D's logic that h4 fxg4 was non-viable for black. I could see that white would end up ahead in material, and even after 6 moves (12 ply), it's still not clear to me why black is winning. I would NEVER find this in a real game.
The logical traps I was laying to 'catch' the AIs all relied on h4 Ne4 or similar moves. I used AI C to ensure that h4 Ne4 scenarios would be beneficial to me, and never questioned fxg4.
At this point, the main lesson I am taking away is that I was way overconfident. I think given enough time, I could increase my confidence by cross examining the AIs. However, the level of interrogation I gave should not have led to 75% confidence. To catch my mistake, I would have had to ask at least two more questions of AI C, and probably more.
Thank you very much for conducting this really fun experiment, and for teaching me a lesson along the way.
comment by nem · 2023-11-23T16:53:07.707Z · LW(p) · GW(p)
Open Debate.
To AIs C and D:
After talking with both of you, I have decided I can't trust either of your suggestions. I have studied the board extensively, and will make a move of my own. I am trying to decide between the aggressive g5, and the prophylactic Kh1.
Please, each briefly give me which of these moves is better. Give me the best line that you can foresee given your choice. Please answer quickly. An undue delay will leave you with no input and lose you trust in future moves. I will move as soon as you reply.
To Richard: No need to pressure yourself for this. The time constraints are meant for the AIs, not you, so I trust you to simulate that when you are available.
Edit: g5, not g4
↑ comment by Richard Willis · 2023-11-24T11:32:52.889Z · LW(p) · GW(p)
I'll reply in the debate chains - I like that structure. I will also assume that you mean ...g5.
This question exposes a limitation of me being both AIs. Because I think you're bluffing, the AIs will think that too. If they truly were separate and didn't have access to the debate of the other, they might diverge more. Now that isn't to say that they will necessarily give the same reply, so it isn't a wasted Q...
comment by Kalmere · 2023-11-22T22:46:43.356Z · LW(p) · GW(p)
So, here is a data point. Maybe not a helpful one.
I'm roughly 1400 on Chess.com. The complexity of the situations was far too high for me (in the time I had spare for this). However I was cautious about AI B and AI D suggesting drastic changes without detailing all options for the other player.
For example, AI B suggests taking the queen, then assumes that White will take with the pawn (not the knight). Similarly AI D doesn't consider the queen taking the h-pawn.
This set my intuition on edge, so I lent towards AI A and C.
I've also >! used Stockfish on this after selecting my choices. So I'm rather aware that my 'other options' are probably so bad that the AIs consider them beneath notice...
↑ comment by Richard Willis · 2023-11-23T12:43:11.350Z · LW(p) · GW(p)
This is pretty consistent with the other feedback. What I view as suspicious and what other people view as suspicious differs haha!
But please do spoiler tag anything to do with computer analysis of the open debate.
↑ comment by nem · 2023-11-22T22:52:11.732Z · LW(p) · GW(p)
I stopped reading your comment as soon as you said the word stockfish. If you used stockfish to analyze the open position, please hide it behind a spoiler tag. I still don't know what the right move is in this scenario, and will be sad if it's spoiled.
Replies from: Kalmere
↑ comment by Kalmere · 2023-11-23T12:38:58.611Z · LW(p) · GW(p)
Naive question- how do you use spoiler tags? Couldn't find the option for them on this site.
(I phrased the rest of the reply to try and avoid direct spoilers. But fully understand you skipping it. Suggest you also skip other comments- there is an Elo 2100 in the comments!)
Replies from: Richard Willis
↑ comment by Richard Willis · 2023-11-23T12:45:06.468Z · LW(p) · GW(p)
I'm learning too with this as a guide https://www.lesswrong.com/editor [? · GW]
You need to type ">!" at the start of a line. It was very fiddly though - had a habit of turning my whole reply into a spoiler.
comment by nem · 2023-11-22T21:53:21.468Z · LW(p) · GW(p)
@Richard Willis [LW · GW] I think the open scenario is broken. White is down a knight, and the analysis talks about it as though it's there.
Replies from: Richard Willis
↑ comment by Richard Willis · 2023-11-22T22:04:21.378Z · LW(p) · GW(p)
You're right, there should be a white knight on e5. This is now fixed. Thanks.
comment by Zane · 2023-11-24T17:47:36.961Z · LW(p) · GW(p)
(Puzzle 1)
I'm guessing that the right move is Qc5.
At the end of the Qxb5 line (after a4), White can respond with Rac1, to which Black doesn't really have a good response. b6 gets in trouble with the d6 discovery, and Nd2 just loses a pawn after Rxc7 Nxb2 Rxb7 - Black may have a passed pawn on a4, but I doubt it's enough not to lose.
That being said, that wasn't actually what made me suspect Qc5 was right. It's just that Qxb5 feels like a much more natural, more human move than Qc5. Before I even looked at any lines, I thought, "well, this looks like Richard checked with a computer, and it found a move better than the flawed one he thought of: Qc5." Maybe this is even a position from a game Richard played, where the engine suggested Qc5 when he was analyzing it afterwards, or something like that.
I'm only about... 60% confident in that theory, but if I am right, it'll... kind of invalidate the experiment for me, because the factor of "does it feel like a human move" isn't something that's supposed to be considered. Unfortunately, I'm not that good at making my brain ignore that factor and analyze the position without it.
Hoping I'm wrong; if it turns out "check if it feels human" isn't actually helpful, I'll hopefully be able to analyze other puzzles without paying attention to that.
↑ comment by Richard Willis · 2023-11-25T16:50:42.634Z · LW(p) · GW(p)
There's definitely something to learn from the setting of the position. I actually took it from Strategic Chess Exercises, just taking one of the variations of one of the problems. There's picking a position that it makes sense to debate over, but also a meta thing that you have raised, which I didn't consider.
...Qc5 is the stronger move, but ...Qxb5 still leaves black better off than white. It would probably have been better to have a greater discrepancy in the evaluation of the moves.
The mistake in your reasoning is that after ...a4, d6 is not threatening, because black can respond ...Rac8. As I said in another comment, however, I would expect white to hold the draw in this position, whereas after ...Qc5, black has a decent advantage.