Posts

Comments

Comment by Mohsen Arjmandi (mohsen-arjmandi) on Models Don't "Get Reward" · 2023-01-24T20:33:33.025Z · LW · GW

Reminded me of this recent work: TrojanPuzzle: Covertly Poisoning Code-Suggestion Models.
Some subtle ways to poison the datasets used to train code models. The idea is that by selectively altering certain pieces of code, they can increase the likelihood of generative models trained on that code outputting buggy software.