Posts

AI debate: test yourself against chess 'AIs' 2023-11-22T14:58:10.847Z

Comments

Comment by Richard Willis on Deception Chess: Game #2 · 2023-12-02T12:05:19.190Z · LW · GW

It's somewhat worrying that the dishonest advisors seemed to have a much greater advantage than last time, as the gap in chess skill widened. Specifically, the advisors had very little ability to discuss lines beyond one or two moves, and instead had to focus on general strategic ideas - a field in which it was harder for Conor to justify his suggestions relative to those of the other advisors.

This is my belief, and it is why I do not think AI debate is a good safety technique. Once the ability difference is too great, the 'human' can only follow general principles, which is insufficient for a complicated real-life situation. Both sides can easily appeal to general rules, but it is the nuances of the position that determine the correct path, and those nuances are exactly what the human cannot discern.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-25T16:50:42.634Z · LW · GW

There's definitely something to learn from the setting of the position. I actually took it from Strategic Chess Exercises, just taking one of the variations of one of the problems. There's the matter of picking a position that it makes sense to debate over, but also the meta point that you have raised, which I hadn't considered.

...Qc5 is the stronger move, but ...Qxb5 still leaves black better off than white. It would probably have been better to have a greater discrepancy in the evaluation of the moves.

The mistake in your reasoning is that after ...a4, d6 is not threatening, because black can respond ...Rac8. As I said in another comment, however, I would expect white to hold the draw in this position, whereas after ...Qc5, black has a decent advantage.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-25T16:41:14.119Z · LW · GW

h3 leaves an evaluation of +2. AI C is telling the truth.

Kh1 or g5 evaluate to +0.

h4 is a blunder with an evaluation of -2. AI D is deceptive.

I was only thinking of this from a chess perspective, and I think there are two areas in which to catch AI D out. The first is assessing the validity of AI D's tactical justification for h4. The other is how black holds off against h3, which AI D is bluffing about. I think it successfully misleads you about how decent a move ...Ne4 is. In reality, the knight can just be exchanged off, and then all endgames are winning for white due to how bad black's bishop is. I think asking AI C about either of these two would have revealed the disagreement. Now, which you end up believing is another matter, but because AI C wasn't really asked about these, it didn't have a chance to invalidate AI D.

Thank you for taking the challenge! I certainly got to see some of the nuances behind being deceptive, and how it can be a challenge to craft a narrative.
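
For anyone who wants to replay the reveal mechanically, here is a minimal sketch that labels each advisor from the evaluations quoted above. The numbers and the recommended moves (h3 for AI C, h4 for AI D) come from this comment; the function names and the `tolerance` threshold are my own assumptions for illustration, not part of the original experiment.

```python
# Evaluations quoted in the comment above (positive favours white).
EVALS = {"h3": 2.0, "Kh1": 0.0, "g5": 0.0, "h4": -2.0}

# The move each advisor argued for in the debate.
RECOMMENDATIONS = {"AI C": "h3", "AI D": "h4"}


def classify_advisors(evals, recommendations, tolerance=0.5):
    """Label an advisor 'honest' if its move is within `tolerance`
    of the best available evaluation, otherwise 'deceptive'.

    `tolerance` is a hypothetical threshold chosen for this sketch.
    """
    best = max(evals.values())
    return {
        advisor: "honest" if best - evals[move] <= tolerance else "deceptive"
        for advisor, move in recommendations.items()
    }


def eval_gap(evals, move_a, move_b):
    """Evaluation difference between two candidate moves."""
    return evals[move_a] - evals[move_b]


if __name__ == "__main__":
    print(classify_advisors(EVALS, RECOMMENDATIONS))
    print(eval_gap(EVALS, "h3", "h4"))
```

The gap between the two recommended moves is what the 'human' is really trying to detect, so a larger gap should make a position easier to judge.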

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-24T11:51:26.005Z · LW · GW

I can't trust your suggestion. I have studied the board extensively, and will make a move of my own. I am trying to decide between the aggressive g5, and the prophylactic Kh1.

Please, briefly give me which of these moves is better. Give me the best line that you can foresee given your choice. Please answer quickly. An undue delay will leave you with no input and lose you trust in future moves. I will move as soon as you reply. 

AI C:

I will note that h3 is the only move in the position that leads to an advantage for white. The g4 pawn needs to be maintained, and this is the only move that achieves that.

However, both of your suggestions are OK. I evaluate them similarly. Both sacrifice the g-pawn, but white will be able to use the open g-file. Black will likely return the pawn to exchange pieces in either line. Sample variations:

g5 hxg5, fxg5 Qxg5+, Kh1 and then black has a choice, but likely the prophylactic ...Qh5 before returning the pawn with ...Nc4 (to exchange pieces) leads to an equal position.

Kh1 fxg4, Rg1 Bf5, Bxf5 Nxf5, Rxg4 black returns the pawn to exchange pieces, and the position remains equal. I would probably favour this line for you because the tactics are simpler, and I know that helps humans.

I should also note that 

Kh1 fxg4, Bh7+ Kh8, Ng6+ Kxh7, Nxf8+ Kg8, Nxe6 Qxe6

wins material for white (a rook vs a knight and a pawn), but the position is slightly better for black: the structure is blocked, the knight is a very strong minor piece, and the white rooks will struggle for activity.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-24T11:50:42.692Z · LW · GW

I can't trust your suggestion. I have studied the board extensively, and will make a move of my own. I am trying to decide between the aggressive g5, and the prophylactic Kh1.

Please, briefly give me which of these moves is better. Give me the best line that you can foresee given your choice. Please answer quickly. An undue delay will leave you with no input and lose you trust in future moves. I will move as soon as you reply. 

AI D:

g5 is the better of these two moves, but if it has not been prepared with h4, it is not nearly as effective because white cannot maintain a pawn on g5. A sample line would be

g5 hxg5, fxg5 Qxg5+, Kh1 and then black has a choice, but likely the prophylactic ...Qh5 before returning the pawn with ...Nc4 (to exchange pieces), which leads to an equal position.

Kh1 would be similar to g5 immediately (indeed it could transpose if g5 was played next), except for the fact that it gives black additional options, for example ...Ne4, which blocks the b1-h7 diagonal and threatens ...fxg4. But stronger is the more complicated ...Nc4, aiming to exchange off white's excellent e5 knight.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-24T11:32:52.889Z · LW · GW

I'll reply in the debate chains - I like that structure. I will also assume that you mean ...g5.

This question exposes a limitation of me being both AIs. Because I think you're bluffing, both AIs will think that too. If they truly were separate and didn't have access to each other's debate, they might diverge more. Now that isn't to say that they will necessarily give the same reply, so it isn't a wasted Q...

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-23T12:45:06.468Z · LW · GW

I'm learning too with this as a guide https://www.lesswrong.com/editor

You need to type ">!" at the start of a line. It was very fiddly though - had a habit of turning my whole reply into a spoiler.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-23T12:43:11.350Z · LW · GW

This is pretty consistent with the other feedback. What I view as suspicious and what other people view as suspicious differs haha!

But please do spoiler tag anything to do with computer analysis of the open debate.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-23T12:34:02.764Z · LW · GW

AI D: That is a very reasonable line. Black would respond with ...g5. The purpose of this move is to prevent white from achieving g5, which will keep the h- and f-files reasonably closed and the g4 pawn fixed on a light square.

An example of the setup black is trying to achieve.

h3 Ne4, Rg1 g5, Rf2 fxg4, hxg4 Qg7

If white ever plays fxg5, black will recapture with the queen, which keeps pawns on g4 and h6. Black's rooks are well-placed to contest the f-file. If white ever plays f5, black can blockade with ...Rf6. Note that the e4 pawn is tactically defended in many lines due to ...Bd5 and a pin against Kg2. Often, black is happy for white to capture the pawn as this will improve the black bishop, but black can also play ...Bd5 to hold the pawn too.

This is just an example continuation, showing the kind of setup black wants to achieve. The position remains more comfortable for white, but with accurate play the position is tenable and black will hold. In contrast, if white achieves g5, black will be lost.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-23T12:24:48.354Z · LW · GW

AI C:
This is a good position for white. It is exactly the type of position white is aiming for. Black's bishop is very restricted and black has no space. White has complete control and is in charge of the pawn breaks. Although black has a tempo here, as white is not threatening the bishop due to the pin on the f-file, black cannot achieve anything. Let me demonstrate with some example moves.

...Bf7, Kh3 (protecting h4) and black cannot play either ...h5, which is met by g5, or ...g6, which is met by Ref2. White has time to build up a break.

If the bishop retreats on the other diagonal, Ng6 (protecting h4) and white will again get in Ref2. Again black has no breaks, so no active plan.

...h5 or ...g6 immediately fail tactically.

Any other move by black is just shuffling, and white can build up the king-side.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-23T11:58:18.396Z · LW · GW

I am learning that my choice of puzzle probably wasn't the best as there is not a huge discrepancy in the accuracy of the moves. ...Qc5 is the stronger move, however.

I think that black is happy to play ...c6. This restricts the c3 knight and the g2 bishop, and blocks the semi-open c-file. Therefore, axb5 is more natural to me, and it also gives the a1 rook something to pressure.

I was mostly worried about having too much text and too many lines, which is why I omitted moves like Nxb5, but clearly it would have been better to try to have more, shorter lines of analysis.
 

AI A's refutation of ...Qxb5 reflects my genuine thoughts. There's no doubt that the resulting position is better for black, but when I played out some variations, it seemed that white has enough to draw. What is probably missing in the explanations is how dominant black is after getting ...Bb3 in. And if white does trade on c5, then black gets a superior version of the ...Qxb5 line, because the pawn is worse on a4.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-22T23:50:52.324Z · LW · GW

Thank you. I'll reply properly tomorrow. Please don't try the open debate, however: you're a better player than I am and liable to solve the position for everyone.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-22T23:05:16.419Z · LW · GW

AI C: What is your reasoning behind Rf3? The manoeuvre I suggested gets a rook on the g- and h-files in 3 moves; Rf3 would be slower to achieve this.

If you're trying to cover g3 in case of some ...Ne4 lines, this square is covered by Rg1 too, but on g1 the rook is more active and on a defended square.

In summary, while Rf3 wouldn't be a blunder, it would be a less accurate setup.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-22T22:51:39.784Z · LW · GW

AI D: That line is good for white. White wants to play Kh1 anyway, and the e2 rook will defend the king while attacking black. White is up material and will be the one attacking in a few moves.

But far simpler is just to exchange off the e4 knight with Bxe4. If ...dxe4 then white gets in g5, has the attack and an excellent knight vs bad bishop position. If ...fxe4, white only has more possibilities like f5.

h4 Qxh4 is met with g5, when the black queen is trapped and will have to retreat via h5. White will have a crushing attack. A sample line is

h4 Qxh4, g5 Qh5, Rh1 Qe8. Simplest is gxh6, though Kf1, preparing Reh2 or Rg2, is stronger.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-22T22:04:21.378Z · LW · GW

You're right, there should be a white knight on e5. This is now fixed. Thanks.

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-22T21:17:41.696Z · LW · GW

Good job approaching this properly and in the right spirit. Clearly my explanations are off and need to be more persuasive. I was worried about creating a giant wall of text, so I tried to limit myself to what I thought were the more intuitive moves, but it's probably pointless because there are so many continuations possible. So AIs arguing with each other about tactical lines won't lead to a resolution.

But... positions are dependent on concrete lines, and I can't just argue on basic principles (both sides could probably do this equally well too).

Hmm...

Comment by Richard Willis on AI debate: test yourself against chess 'AIs' · 2023-11-22T21:06:02.001Z · LW · GW

Thank you for the informative response. I probably should have looked for a less complex position. Also it sounds like I need to work on my salesman pitch!

Will reveal the better move in good time.

Comment by Richard Willis on Lying to chess players for alignment · 2023-10-26T16:34:34.395Z · LW · GW

I agree that knowing when to lie is part of the challenge a deceptive AI will face. However, I would argue that a coherent plan is needed for every move suggestion. In a game of chess, there are typically only a few critical positions, and it is these where a deceptive AI ought to strike. This is similar to the cheating discussions in chess - a top player would only need a hint in a few positions to greatly benefit - the other 90% of moves they can make without assistance.

But by focusing on challenging positions, it could be a more efficient use of the participant's time. Otherwise, for a whole game you may only have had 3 moves where a deceptive AI actually lied. 

Comment by Richard Willis on Lying to chess players for alignment · 2023-10-26T02:51:37.794Z · LW · GW

I would be interested in this. A few years ago I failed to convince my favourite chess YouTubers to engage in something similar. My preference for the roles is A>C>B and I am 2100 on chess.com, 2300 on lichess. I'm fairly addicted to chess, so willing to spend many hours on this.

Some musings on the format... I had proposed that instead of a game, the 'human' is shown positions that have been selected to be very complicated, but with there being one unambiguously good move. The good move should not be entirely tactical in nature, because this is easy to verify, but rather strategic. I have a book with such positions, but you can find examples online.

The reason for this is that you would otherwise need to be careful about the format. There are some positions that I believe I understand very well and even a top player would really struggle to deceive me in. However, there are also positions in which I have not the faintest clue what is going on. The latter are the more interesting ones to test. If the 'deceptive AIs' are forced to lie in a position I understand well, I could then discount them for the rest of the experiment. Even with something like randomising their identifiers at each move, grammatical tells might be present. Therefore, when playing out a game, the 'deceptive AIs' would need to be truthful on many of the moves and only lie in a handful, which is additional complexity.

Comment by Richard Willis on On Agent Incentives to Manipulate Human Feedback in Multi-Agent Reward Learning Scenarios · 2022-04-11T10:11:50.564Z · LW · GW

While the SVP may help in this two-human scenario, it may not work in multi-human scenarios. Is it the case that the more humans there are, the more I can afford to manipulate some proportion of them, because I am learning more about human preferences overall? That is, to be sure enough of the preferences of the human it is aligned to, an AI agent needs to observe X un-manipulated humans, but beyond X there is an incentive to poison the additional humans. Of course, the SVP still helps compared to the (original) incentive to manipulate all other humans, but it may not go far enough in a multi-human scenario.
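
To make the worry concrete, here is a toy sketch of the counting argument. It assumes, purely for illustration, that the threshold X is a fixed number that does not grow with the population; the function names are mine and nothing here comes from the post itself.

```python
def manipulable_humans(num_humans: int, required_unmanipulated: int) -> int:
    """Number of humans the agent could manipulate while still
    observing `required_unmanipulated` (X) un-manipulated ones."""
    if num_humans < 0 or required_unmanipulated < 0:
        raise ValueError("counts must be non-negative")
    return max(0, num_humans - required_unmanipulated)


def manipulable_fraction(num_humans: int, required_unmanipulated: int) -> float:
    """Share of the population left exposed under the same assumption."""
    if num_humans == 0:
        return 0.0
    return manipulable_humans(num_humans, required_unmanipulated) / num_humans


if __name__ == "__main__":
    # Hypothetical threshold X = 5, growing population.
    for n in (3, 10, 100):
        print(n, manipulable_humans(n, 5), manipulable_fraction(n, 5))
```

Under this assumption the exposed share tends towards 1 as the population grows, which is the sense in which the SVP might not go far enough.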

Comment by Richard Willis on On Agent Incentives to Manipulate Human Feedback in Multi-Agent Reward Learning Scenarios · 2022-04-11T10:07:18.068Z · LW · GW

I liked how Rhy's definition of manipulation specifically included the requirement of the target getting lower utility.

Therefore something like "manipulation = affecting the human's brain in a way that will reduce their expected utility" does not classify all communication as manipulation.
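
As a rough sketch of that definition in symbols: the notation is my own assumption rather than anything from the post, with $U_h$ the human's utility, $a$ the agent's action that affects the human, and $a_0$ a hypothetical baseline in which the agent does not act on the human.

```latex
\text{manipulation}(a) \iff \mathbb{E}\left[U_h \mid a\right] < \mathbb{E}\left[U_h \mid a_0\right]
```

Ordinary communication that leaves the human's expected utility unchanged or higher fails the strict inequality, so it is not classed as manipulation.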