Comments

Comment by snikolenko on All AGI Safety questions welcome (especially basic ones) [~monthly thread] · 2023-06-03T06:39:14.573Z · LW · GW

Somewhat of a tangent -- I realized I don't really understand the basic reasoning behind current efforts to make AI stop saying naughty words. Like, what is the actual problem with an LLM producing racist or otherwise offensive content that warrants so much effort? Why don't the researchers just slap an M content label on the model and be done with it? Movie characters say naughty words all the time, are racist all the time, dismember other people in ingenious and sometimes realistic ways, and nobody cares -- so what's the difference?

Comment by snikolenko on Pain and gain motivation · 2017-04-14T11:55:30.864Z · LW · GW

Sometimes, when the pain level of not having done a task grows too high -- like just before a deadline -- it'll push you to do it. But this fools people into thinking that negative consequences alone will be a motivator, so they try to psyche themselves up by thinking about how bad it would be to fail. In truth, this only makes things worse, as an increased chance of failure will increase the negative motivation that's going on.

It appears that this part conflates two aspects of negative motivation: the magnitude of the consequences and the chance of failure. "How bad it would be to fail" does not look like a good predictor of akrasia/procrastination. For example, it would be extremely bad to fail at crossing a street and get hit by a car, so most people are careful when crossing streets; but if somebody were procrastinating about crossing a street, it would probably be a sign of a serious mental condition. In my opinion, that is precisely because you're highly unlikely to fail: dire consequences produce caution, not avoidance, when failure is improbable.

I believe that "how likely it would be to fail" is a much stronger predictor; people just don't like failing, even when the consequences are small or completely imaginary (and obviously so on minimal introspection). To run with one of your examples -- I have seen people actually procrastinate about making a decision in a single-player computer game, even though it's obvious that there are no consequences beyond extra time spent (which you're spending on the game anyway).

The prehistoric example does not differentiate between the two because it has both components: getting eaten is both very bad and quite likely, given that you're being chased by a predator.

[This is my first comment here, so I apologize if I'm breaking any unspoken rules by commenting on an old popular post.]