The Power of Reinforcement

post by lukeprog · 2012-06-21T13:42:29.475Z · score: 101 (107 votes) · LW · GW · Legacy · 474 comments

Contents

  The power of reinforcement
  A quick reminder of what you learned in high school
  What works
  Example applications
474 comments

Part of the sequence: The Science of Winning at Life

Also see: Basics of Animal Reinforcement, Basics of Human Reinforcement, Physical and Mental Behavior, Wanting vs. Liking Revisited, Approving reinforces low-effort behaviors, Applying Behavioral Psychology on Myself.

 

Story 1:

On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

 

Story 2:

I once witnessed a worker who hated keeping a work log because it was only used "against" him. His supervisor would call to say "Why did you spend so much time on that?" or "Why isn't this done yet?" but never "I saw you handled X, great job!" Not surprisingly, he often "forgot" to fill out his worklog.

Ever since I got everyone at the Singularity Institute to keep work logs, I've tried to avoid connections between "concerned" feedback and staff work logs, and instead take time to comment positively on things I see in those work logs.

 

Story 3:

Chatting with Eliezer, I said, "Eliezer, I get the sense that I've inadvertently caused you to be slightly averse to talking to me. Maybe because we disagree on so many things, or something?"

Eliezer's reply was: "No, it's much simpler. Our conversations usually run longer than our previously set deadline, so whenever I finish talking with you I feel drained and slightly cranky."

Now I finish our conversations on time.

 

Story 4:

A major Singularity Institute donor recently said to me: "By the way, I decided that every time I donate to the Singularity Institute, I'll set aside an additional 5% for myself to do fun things with, as a motivation to donate."


The power of reinforcement

It's amazing to me how consistently we fail to take advantage of the power of reinforcement.

Maybe it's because behaviorist techniques like reinforcement feel like they don't respect human agency enough. But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.

You are not an agenty homunculus "corrupted" by heuristics and biases. You just are heuristics and biases. And you respond to reinforcement, because most of your motivation systems still work like the motivation systems of other animals.

 

A quick reminder of what you learned in high school

 

What works

  1. Small reinforcers are fine, as long as there is a strong correlation between the behavior and the reinforcer (Schneider 1973; Todorov et al. 1984). All else equal, a large reinforcer is more effective than a small one (Christopher 1988; Ludvig et al. 2007; Wolfe 1936), but the more you increase the reinforcer magnitude, the less benefit you get from the increase (Frisch & Dickinson 1990).
  2. The reinforcer should immediately follow the target behavior (Escobar & Bruner 2007; Schlinger & Blakely 1994; Schneider 1990). Pryor (2007) notes that when the reward is food, small bits (like M&Ms) are best because they can be consumed instantly instead of being consumed over an extended period of time.
  3. Any feature of a behavior can be strengthened (e.g., its intensity, frequency, rate, duration, persistence, its shape or form), so long as a reinforcer can be made contingent on that particular feature (Neuringer 2002).

 

Example applications

For additional examples and studies, see The Power of Reinforcement (2004), Don't Shoot the Dog (2006), and Learning and Behavior (2008).

 

I close with Story 5, from Amy Sutherland:

For a book I was writing about a school for exotic animal trainers, I started commuting from Maine to California, where I spent my days watching students do the seemingly impossible: teaching hyenas to pirouette on command, cougars to offer their paws for a nail clipping, and baboons to skateboard.

I listened, rapt, as professional trainers explained how they taught dolphins to flip and elephants to paint. Eventually it hit me that the same techniques might work on that stubborn but lovable species, the American husband.

The central lesson I learned from exotic animal trainers is that I should reward behavior I like and ignore behavior I don't. After all, you don't get a sea lion to balance a ball on the end of its nose by nagging. The same goes for the American husband.

Back in Maine, I began thanking Scott if he threw one dirty shirt into the hamper. If he threw in two, I'd kiss him. Meanwhile, I would step over any soiled clothes on the floor without one sharp word, though I did sometimes kick them under the bed. But as he basked in my appreciation, the piles became smaller.

I was using what trainers call "approximations," rewarding the small steps toward learning a whole new behavior...

Once I started thinking this way, I couldn't stop. At the school in California, I'd be scribbling notes on how to walk an emu or have a wolf accept you as a pack member, but I'd be thinking, "I can't wait to try this on Scott."

...After two years of exotic animal training, my marriage is far smoother, my husband much easier to love.

 

Next post: Rational Romantic Relationships Part 1

Previous post: The Good News of Situationist Psychology

 

 

My thanks to Erica Edelman for doing much of the research for this post.

474 comments

Comments sorted by top scores.

comment by MBlume · 2012-06-21T03:00:14.190Z · score: 37 (55 votes) · LW · GW

Good post! Thank you for writing it Luke =)

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-06-21T03:57:36.117Z · score: 30 (40 votes) · LW · GW

Thanks for reinforcing Luke! And it's great that you applied the theory so quickly!

comment by JGWeissman · 2012-06-21T04:06:19.403Z · score: 21 (31 votes) · LW · GW

Yay recursive reinforcement!

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-06-21T04:19:53.604Z · score: 9 (23 votes) · LW · GW

Why, thanks! It's helpful to hear you say that!

comment by CharlieSheen · 2012-06-21T14:35:41.595Z · score: 23 (33 votes) · LW · GW

I think I'm going to be ill if this continues.

comment by Dorikka · 2012-06-21T04:51:16.483Z · score: 0 (12 votes) · LW · GW

Moar recursion! Keep it up! :D

comment by Will_Newsome · 2012-06-21T04:56:58.942Z · score: 28 (36 votes) · LW · GW

No. Unreflective happy death spirals get people killed. Shame on all of you for being bad people.

comment by RomeoStevens · 2012-06-21T05:49:25.379Z · score: 11 (23 votes) · LW · GW

I'm glad you mentioned this.

comment by Will_Newsome · 2012-06-21T05:59:39.136Z · score: 8 (18 votes) · LW · GW

Don't be glad. If you need reinforcement, be relieved. Gladness tends to cause unreflective happy death spirals. Shame on you for being glad.

Presumably the emotion you actually felt was relief, and "glad" was merely used as an inaccurate/misleading synonym? In which case, shame on you for using inaccurate/misleading synonyms.

(I'm totally at least a quarter serious, maybe half.)

comment by JGWeissman · 2012-06-21T06:06:54.467Z · score: 12 (18 votes) · LW · GW

Thank you for wanting us to not have unreflective happy death spirals. I will have to repeat the behavior that caused you to express such caring.

comment by Viliam_Bur · 2012-06-21T09:31:47.172Z · score: 21 (25 votes) · LW · GW

I guess now it's the right time to say big thanks to everyone who didn't contribute to this thread!

comment by NancyLebovitz · 2012-06-22T03:15:09.817Z · score: 5 (7 votes) · LW · GW

You're welcome.

comment by Will_Newsome · 2012-06-21T06:17:24.763Z · score: 2 (12 votes) · LW · GW

I don't want you to not have unreflective happy death spirals, I'm just horrified at the potential consequences of not going out of my way to prevent you from having unreflective happy death spirals. Shame on you for imprecision and/or implicitly accusing me of hypocrisy.

comment by JGWeissman · 2012-06-21T06:24:35.884Z · score: 2 (4 votes) · LW · GW

I don't mean to accuse you of hypocrisy, just that your actions may have consequences you didn't intend. But it is nice of you to try to deny your caring once you learned it is reinforcing behavior you consider destructive.

comment by Will_Newsome · 2012-06-21T06:38:15.697Z · score: 0 (0 votes) · LW · GW

I don't deny my caring, it's just that caring isn't necessarily a positive motivation. I do care. The way I care is by coercing myself into paralyzing fear, so as to bootstrap that paralyzing fear beyond myself and into others, others who might be able to use that fear to help me avert all possible circumstances of desperate, trapped, thrashing justified self-recrimination and socially-mediated abandonment to terrifying consequences causally isomorphic to retaliation against abject, utter historical indefensibility. You should feel guilty for not having already considered that there are many ways for a person to care.

comment by JGWeissman · 2012-06-21T06:53:02.254Z · score: 0 (0 votes) · LW · GW

Ok, I don't want you to have paralyzing fear over this. I am not in any real danger of happy death spirals around positive reinforcement, I was just having fun with recursion and exaggerating the topic under discussion, and I expect this is true of the other participants in this thread. If you notice us going overboard in a similar manner elsewhere, I will take your concerns seriously.

comment by arundelo · 2012-06-21T03:11:04.245Z · score: 7 (15 votes) · LW · GW

I see what you did there!

comment by [deleted] · 2012-06-21T08:33:42.900Z · score: 0 (4 votes) · LW · GW

(I didn't until EY pointed that out.)

comment by FiftyTwo · 2012-06-26T01:17:11.689Z · score: 0 (2 votes) · LW · GW

Good on you for admitting error.

comment by CommanderShepard · 2012-06-21T15:25:13.488Z · score: 4 (12 votes) · LW · GW

"god this is even more phygish than just that quote about eliezer getting fed mnms"

comment by John_Maxwell (John_Maxwell_IV) · 2012-06-22T01:34:58.721Z · score: 3 (5 votes) · LW · GW

That strikes me as goofy, not phygish.

comment by Dorikka · 2012-06-22T03:21:57.780Z · score: 1 (1 votes) · LW · GW

I agree, so much that I think I might be missing something.

comment by Will_Sawin · 2012-06-21T17:33:51.556Z · score: 3 (3 votes) · LW · GW

what's "phygish"?

comment by shminux · 2012-06-21T17:37:50.328Z · score: 3 (3 votes) · LW · GW

rot13 it

comment by radical_negative_one · 2012-06-21T19:14:21.359Z · score: -1 (1 votes) · LW · GW

Source.

comment by [deleted] · 2012-06-21T15:11:37.036Z · score: 36 (52 votes) · LW · GW

"Eventually it hit me that the same techniques might work on that stubborn but lovable species, the American wife." "Back in Maine, I began thanking Amy if she threw one dirty shirt into the hamper. If she threw in two, I'd kiss her." "...After two years of exotic animal training, my marriage is far smoother, my wife much easier to love."

comment by Kaj_Sotala · 2012-06-26T11:53:24.813Z · score: 17 (21 votes) · LW · GW

It's probably worth noting that the original article, which lukeprog quoted, ended with this:

PROFESSIONALS talk of animals that understand training so well they eventually use it back on the trainer. My animal did the same. When the training techniques worked so beautifully, I couldn't resist telling my husband what I was up to. He wasn't offended, just amused. As I explained the techniques and terminology, he soaked it up. Far more than I realized.

Last fall, firmly in middle age, I learned that I needed braces. They were not only humiliating, but also excruciating. For weeks my gums, teeth, jaw and sinuses throbbed. I complained frequently and loudly. Scott assured me that I would become used to all the metal in my mouth. I did not.

One morning, as I launched into yet another tirade about how uncomfortable I was, Scott just looked at me blankly. He didn't say a word or acknowledge my rant in any way, not even with a nod.

I quickly ran out of steam and started to walk away. Then I realized what was happening, and I turned and asked, "Are you giving me an L. R. S.?" Silence. "You are, aren't you?"

He finally smiled, but his L. R. S. has already done the trick. He'd begun to train me, the American wife.

comment by handoflixue · 2012-06-22T18:57:41.166Z · score: 11 (19 votes) · LW · GW

This actually bothers me less than the original, simply because the stereotype of "properly raised wife having to train her lower-status husband to act appropriately" is a VERY common social meme, whereas "husband training wife" is something I generally only see in the context of physical abuse (which, given the lack of violence, this obviously isn't).

Is there a cultural meme I'm missing here that makes THIS version the more offensive one? o.o

comment by Raemon · 2012-06-22T19:14:16.627Z · score: 21 (21 votes) · LW · GW

"Woman Training Man" is generally presented as funny with no negative ramifications. "Husband training wife" is presented in the context of either physical abuse, emotional abuse, or as part of a widespread societal trend of women being "domesticated" which is now generally considered distasteful. If this had been phrased "husband training wife", it wouldn't pattern match to "funny, harmless joke", it'd pattern-match to either abuse or societal oppression. (The abuse angle wouldn't necessarily be accurate, but for many people it would come to mind before the "mirror-image-of-the-woman-training-man" concept did).

So whether it actually makes sense, the example would produce negative affect in many people.

comment by TheOtherDave · 2012-06-22T19:13:50.218Z · score: 2 (2 votes) · LW · GW

No, it sounds like you're aware of the relevant cultural meme.

comment by Viliam_Bur · 2012-06-22T20:11:03.069Z · score: 5 (9 votes) · LW · GW

"wife training lower-status husband" is a cultural meme

"man abusing woman" is a very strong meme, and "man woman" pattern-matches it

comment by TheOtherDave · 2012-06-22T20:57:30.656Z · score: 1 (1 votes) · LW · GW

I agree with all of those statements, and am left with the sense that you were trying to convey an additional message that I didn't quite get.

comment by Viliam_Bur · 2012-06-23T10:08:47.477Z · score: 3 (23 votes) · LW · GW

Just an observation of sexism in our society. We are hypersensitive about anything negative that happens to women (it is a great opportunity for signalling moral superiority above people who are not outraged), while misfortunes of low-status males are just funny (signalling care about them is low-status).

How exactly does this happen? How exactly does the paradox arise that this unequal reaction is perceived as fair, while complaining about it can be so easily labeled as sexist?

There is an obvious evolutionary explanation (low-status males are expendable, there is no advantage for high-status males or any-status females to care about them), but how does the algorithm feel from inside? First, there is a rationalization that problems of low-status males are either not real, or could (and should) be easily avoided by them, so if they don't avoid the situations, they obviously deserve the consequences. (Unless they are members of some minority, in which case it is OK to express moral outrage about the oppression of the given minority.) Second, we are hyper-sensitized by feminism about everything related to women, because even the smallest joke means that you are a supporter of patriarchy and rape culture, which makes you an accomplice in every abuse and murder and whatever. There are no innocent jokes about women. Saying "thank you" to your wife for doing something nice for you is just a first step on a slippery slope of evil male behavior. (And no, there is no female privilege, and if you have a misunderstood word, go read feminism 101 until you accept it.)

There. Sorry for the mindkilling, I don't know how to write it better without spending too big a part of a weekend online.

EDIT: related video

comment by [deleted] · 2012-06-25T18:53:03.214Z · score: 12 (12 votes) · LW · GW

And no, there is no female privilege, and if you have a misunderstood word, go read feminism 101 until you accept it.

I seem to recall having seen at least one introduction to feminism which did acknowledge that there are forms of female privilege (e.g. children usually end up with the mother after divorces), even though far fewer than forms of male privilege (their list was about an order of magnitude shorter). (This made me find that introduction much more credible, as otherwise it would have failed Policy Debates Should Not Appear One-Sided.)

comment by Viliam_Bur · 2012-06-25T19:29:57.407Z · score: 1 (3 votes) · LW · GW

I would have more respect for such an introduction, too, for pretty much the same reasons.

comment by TheOtherDave · 2012-06-25T20:32:33.321Z · score: 0 (0 votes) · LW · GW

There are several such, but they don't tend to inspire quite as strong a reaction as the ones the OC is reacting to.

comment by TheOtherDave · 2012-06-23T14:46:18.916Z · score: 1 (1 votes) · LW · GW

OK. Thanks for being explicit.

comment by private_messaging · 2012-06-23T12:30:38.341Z · score: 0 (22 votes) · LW · GW

Man abusing woman is not only a very strong "meme", but also a common occurrence, due to the biological detail that males in mammals are generally (a) larger, (b) more aggressive, and (c) likely naturally more selfish (due to their different reproductive role). Edit: all I am saying is that there is a biologically justified prior here, one that most people use: a body of utterly indisputable evidence across many species of mammals. Except subpar evidence-evaluators, of course, who do not process the prior and are also subject to the Dunning-Kruger effect about it.

comment by [deleted] · 2012-06-25T18:58:49.124Z · score: 2 (8 votes) · LW · GW

Why the hell was that downvoted? I guess it was supposed to be a descriptive statement but people misunderstood it as a normative one.

comment by private_messaging · 2012-06-26T08:49:54.106Z · score: 0 (4 votes) · LW · GW

At least 2 people seem to think you guess wrong.

Edit: as for how I interpret reactions to such statements, I already have an explanation for e.g. gaming forums, where we have very similar white privileged male nerd demographics. We don't do downvoting there because enabling downvotes lets the white privileged male nerd majority enforce their worldviews and discourage any dissent, which we cannot afford because we make games for everyone, not just the white privileged male nerd majority. Though it's up to -1 here.

comment by zslastman · 2013-07-18T13:29:54.063Z · score: -1 (1 votes) · LW · GW

The edit is worthy of a downvote, the original part an upvote.

comment by waveman · 2016-08-07T01:19:57.235Z · score: 0 (0 votes) · LW · GW

lower-status husband

Interesting. I have seen research that suggests a major difference in perceptions between men and women. Men tend to assess the average woman as, well, average in overall attractiveness. Women tend to assess about 80% of men as below average. So in a monogamous society women tend to think they have settled too low.

Such a gap in perceptions would make sense in a polygamous society where a few men at the top have most of the women - so the women marry up, and end up perceiving this as normal. From my reading, most hunter gatherer societies were polygamous.

comment by lukeprog · 2012-06-22T01:31:22.640Z · score: 8 (22 votes) · LW · GW

Have some tact, man. My post was fine, but you... you are a god damned sexist.

comment by Andreas Källberg (anka-213) · 2018-10-23T09:33:49.202Z · score: 1 (1 votes) · LW · GW

I feel a bit icky when reading both versions. A slightly less icky version would be: "... that stubborn but lovable species, the American partner." I really dislike the idea of treating the two sexes as separate species, especially given how much people already do this.

I also kind of dislike the idea of training your partner like that because it feels manipulative, but that is mostly a question of how you use it. As long as you only do it on behaviours that your partner agrees are good/bad and want to do more/less, it's fine with me. In other words, only when you know they would consent or when they have consented.

comment by Paul Crowley (ciphergoth) · 2012-06-23T07:29:11.055Z · score: -2 (6 votes) · LW · GW

Given the many asymmetries between men and women, it seems at least plausible to me that the above would be much more problematic than the original.

comment by wedrifid · 2012-06-23T15:32:48.578Z · score: 9 (11 votes) · LW · GW

Given the many asymmetries between men and women, it seems at least plausible to me that the above would be much more problematic than the original.

It also seems plausible that the reverse is true. Or neither.

comment by TheOtherDave · 2012-06-23T16:19:47.330Z · score: 1 (3 votes) · LW · GW

Or, most likely of all, that it depends on the relative salience at any given moment of the large set of factors that "problematic" aggregates.

comment by RichardKennaway · 2012-06-23T07:46:30.411Z · score: -2 (10 votes) · LW · GW

Sounds like standard PUA to me.

comment by wedrifid · 2012-06-23T15:31:33.590Z · score: 5 (7 votes) · LW · GW

Sounds like standard PUA to me.

Really? Exactly which PUA recommends thanking women more as a way to pick up women? That seems out of character.

There is a relation, I suppose, in as much as both are about a male influencing a female subject and both rely on principles of human or mammalian psychology. They differ in goal and (so) differ in the specific kinds of tactics.

comment by pjeby · 2012-06-24T00:43:19.739Z · score: 10 (14 votes) · LW · GW

Really? Exactly which PUA recommends thanking women more as a way to pick up women? That seems out of character.

Quite a few PUA schools advise ignoring behavior you don't like, and rewarding behavior you do like, as well as ensuring that you aren't inadvertently sending out a lot of positive reinforcement just because someone is attractive.

True, "thank you" is not generally a recommended form of reinforcement; non-verbal reinforcements like smiles, nods, touch, laughter, looking interested, turning towards the person, etc. are more generally recommended. Occasionally, a certain old story is cited: the one about the professor whose class conditioned him to stop pacing back and forth by looking interested only when he was in the middle of the room.

comment by NancyLebovitz · 2012-06-25T15:25:22.439Z · score: 4 (10 votes) · LW · GW

Your reaction to the idea of kisses to encourage a man to pick up his clothes reminds me of the way a number of women (including me) react to the idea of PUA. It's going ballistic about a hypothetical boundary violation and it's more fun in LW, where one is apparently outnumbered by people who don't see the boundary violation at all. (The boundary violation is hypothetical because the person may not have experienced it.)

comment by wedrifid · 2012-06-26T05:13:24.543Z · score: 0 (2 votes) · LW · GW

It's going ballistic about

Applying that label is both grossly inaccurate and unwelcome.

I noted that certain instances of 'influence by reward' I wouldn't accept and would respond by asking her politely to stop and then escalating as necessary to ensure that the undesired rewarding was not itself rewarded. A couple of users seemed to find the notion that someone else doesn't unconditionally accept all reinforcement offensive.

comment by NancyLebovitz · 2012-06-26T05:53:52.487Z · score: 3 (7 votes) · LW · GW

I'd say that describing small amounts of M&Ms as a significant health threat is a sign of using arguments as soldiers.

On the other hand, you've got better access to your internal experience than I do.

comment by wedrifid · 2012-06-26T08:36:34.421Z · score: 2 (4 votes) · LW · GW

I'd say that describing small amounts of M&Ms as a significant health threat is a sign of using arguments as soldiers.

This is utterly bizarre. Even allowing that you completely missed the obvious meaning of "the most significant risks are the health and dental considerations and they are so insignificant that I'm making a joke about them" my words still can't be taken to mean "there is a significant health threat to small amounts of M&Ms". Not only that but the tangent being answered, something about the relative "risk" of kisses vs M&Ms isn't something I have a position on so I have no idea which side to send 'soldiers' to. Neither of those things are at all 'risky'. It pretty much comes down to "rotten teeth and diabetes vs spreading infectious mononucleosis and herpes simplex" - both at insignificant probabilities and I don't care either way.

On the other hand, you've got better access to your internal experience than I do.

Access to internal experience isn't required to dismiss your accusations. Non-motivated reading of my actual words is.

If I was going to "go ballistic" about anything it would be the active misrepresentation of my words and actions by yourself and pjeby. Not only have you been allowed to get away with slander without sanction you have been actually rewarded for it. I am disgusted.

comment by NancyLebovitz · 2012-06-26T13:46:54.479Z · score: 2 (4 votes) · LW · GW

Sorry for not getting that you intended to make a joke-- I've found that, even in real life and more so online, hyperbolic humor and reduction to absurdity are risky strategies. People are apt to not get the context, or to not agree on what's absurd.

I hadn't gotten around to asking why I was getting upvotes on my previous comments in this thread. It's possible that people agreed with my take on what you said, but it's also possible that they mostly found the prospect of a quarrel entertaining. (They presumably agreed with me to some extent, or we'd both be getting upvotes.)

Part of my reason for saying "ballistic" is that I don't think most people would consider a policy of kisses for putting clothes in the hamper to be such a serious infringement that if it isn't stopped after one request, it's a good reason for divorce.

My aversion to hostile takeover of internal motivations is much stronger than my desire for the affections of any particular individual.

I admit I missed this sentence on previous readings, and it's probably at the center of your objections. I do think "hostile" is extreme, but maybe I'm missing something.

I think there's a middle range between benign efforts at improvement and hostility-- the range where the person is fairly indifferent to the attempted behavior change. I'm guessing that it's the lack of respect for conscious choice by the person being reinforced which causes you to frame it as hostile.

comment by TheOtherDave · 2012-06-26T14:21:58.269Z · score: 2 (2 votes) · LW · GW

even in real life and more so online, hyperbolic humor and reduction to absurdity are risky strategies. People are apt to not get the context, or to not agree on what's absurd.

This is true.

I've also found, especially online, that characterizing the emotional states of my interlocutors for them is a risky strategy. On those rare occasions where the other person's emotional state really is important, I find I do better to explicitly ask for confirmation of my perception about it, rather than implying or referring to it as an observed fact.

comment by NancyLebovitz · 2012-06-26T14:27:53.023Z · score: -1 (1 votes) · LW · GW

You're right about describing other people's emotional states.

comment by wedrifid · 2012-06-27T14:30:03.673Z · score: 1 (1 votes) · LW · GW

Part of my reason for saying "ballistic" is that I don't think most people would consider a policy of kisses for putting clothes in the hamper to be such a serious infringement that if it isn't stopped after one request, it's a good reason for divorce.

That position sounds bizarre, I don't think it exists outside of pjeby's straw man. I believe my stated response was to shun the kisses.

As it happens I've never even had to escalate to the "ask politely" level. A smirk, a knowing look and a "Really?" avoided the conflict while keeping the interaction at the level of play, while still communicating the presence of a boundary.

I think there's a middle range between benign efforts at improvement and hostility-- the range where the person is fairly indifferent to the attempted behavior change. I'm guessing that it's the lack of respect for conscious choice by the person being reinforced which causes you to frame it as hostile.

Yes.

comment by TheOtherDave · 2012-06-23T16:22:02.767Z · score: 2 (2 votes) · LW · GW

both rely on principles of human or mammalian psychology

Operant conditioning works pretty much the same way on some non-mammals as well.

comment by wedrifid · 2012-06-23T16:37:21.408Z · score: 1 (1 votes) · LW · GW

Operant conditioning works pretty much the same way on some non-mammals as well.

Yes, it's the PUA tactics that are in general more mammal specific (at least).

comment by RichardKennaway · 2012-06-24T09:19:32.680Z · score: -2 (4 votes) · LW · GW

I was thinking at a higher level of abstraction. Moulding the woman's behaviour by psychological manipulation, indeed a form of "exotic animal training". This is standard doctrine in the PUA blogosphere -- see also pjeby's reply. PUA, btw, is not about picking up women.

"Psychological endocytosis" might be a better metaphor than "animal training" at the more extreme end of things.

comment by wedrifid · 2012-06-24T09:56:44.744Z · score: 1 (1 votes) · LW · GW

I was thinking at a higher level of abstraction.

Or rather, a lower standard of epistemic accuracy.

PUA skills pertain to influence by males over female behavior using methods that include operant conditioning (including reinforcement). It does not follow that all instances of influence by a male over a female using operant conditioning is standard PUA methodology. In fact this example is significantly different to the kind of application we see in standard PUA. This is unsurprising - after all, we got the example in question when Konkvistador took a wife-influencing-her-husband example and substituted roles.

see also pjeby's reply

I prefer the grandparent:

There is a relation, I suppose, in as much as both are about a male influencing a female subject and both rely on principles of human or mammalian psychology. They differ in goal and (so) differ in the specific kinds of tactics.

comment by NancyLebovitz · 2012-06-25T13:34:07.476Z · score: 0 (2 votes) · LW · GW

"Psychological endocytosis"-- I don't understand the metaphor.

comment by RichardKennaway · 2012-06-25T14:23:08.539Z · score: -2 (4 votes) · LW · GW

Endocytosis is the process by which a cell engulfs a food particle, by extending itself around it and pulling it into its interior. Metaphorically, I am suggesting a process whereby one person similarly extends their own reality around another, undermining the other's perceptions and replacing them with their own. For example, that is what "negging" is about. It is intended to convey the message, at least in the imagination of those advocating it (fictionally imagined here), that the man's beliefs are reality and the woman's are merely pretty lies that deserve to die.

comment by NancyLebovitz · 2012-06-25T15:21:10.090Z · score: 5 (9 votes) · LW · GW

I recommend Clarisse Thorne's Confessions of a Pickup Artist Chaser, a substantial overview of the PUA communities.

PUA covers a wide range from decent behavior to just plain vile. Depending on who's talking, negging can be light-hearted teasing between people who know it's a game or a deliberate effort to keep the target off-balance and dependent on the targeter's good opinion.

It can also be an effort at light-hearted teasing which goes wrong because some PUAs just assume that beautiful women aren't nervous about how they're perceived.

Endocytosis is an interesting metaphor, and it would cover everything from total environment abusiveness (prisons, cults, some dysfunctional families) to efforts to keep one's voice whispering in the back of a subject's mind. (Anyone have the quote about Saruman handy?)

comment by RichardKennaway · 2012-06-25T18:47:30.657Z · score: -1 (1 votes) · LW · GW

(Anyone have the quote about Saruman handy?)

"Suddenly another voice spoke, low and melodious, its very sound an enchantment. Those who listened unwarily to that voice could seldom report the words that they heard; and if they did, they wondered, for little power remained in them. Mostly they remembered only that it was a delight to hear the voice speaking, all that it said seemed wise and reasonable, and desire awoke in them by swift agreement to seem wise themselves. When others spoke they seemed harsh and uncouth by contrast; and if they gainsaid the voice, anger was kindled in the hearts of those under the spell. For some the spell lasted only while the voice spoke to them, and when it spoke to another they smiled, as men do who see through a juggler's trick while others gape at it. For many the sound of the voice alone was enough to hold them enthralled; but for those whom it conquered the spell endured when they were far away, and ever they heard that soft voice whispering and urging them. But none were unmoved; none rejected its pleas and its commands without an effort of mind and will..."

From The Two Towers, the chapter "The Voice of Saruman". The passage, btw, seems to have become a favorite of the American Right to use of Obama.

comment by RichardKennaway · 2012-06-25T16:22:35.780Z · score: -2 (6 votes) · LW · GW

I recommend Clarisse Thorne's Confessions of a Pickup Artist Chaser, a substantial overview of the PUA communities.

In an Amazon box on my desk right now :-).

PUA does cover a wide range, but so does, for example, science fiction fandom. Is that one thing, or many things? Fannish fans may look down on Trekkies, and literary types scoff at fannish fans, and all of them scoff at commercial conventions, but really, they do all join up, even if some of them are barely aware of the others' existence. PUA is also many things, but they also join up, and if you try to take some and leave the rest, you'll have contact with the rest anyway through the community, and one way or another will have to take up an attitude about it. And one of the many things that is PUA is this particular thing that I've been talking about. To name it more explicitly, MDFS BDSM, not as bedroom games, but as ideology. There are smoking guns here.

And beyond endocytosis is phagocytosis, the digestion or destruction of the ingested particle.

comment by NancyLebovitz · 2012-06-25T17:36:00.167Z · score: 1 (7 votes) · LW · GW

I agree that MDFS (presumably Male Dominant Female Submissive) as ideology is worse than problematic. It's putting a penny in the fusebox so far as abuse is concerned.

Is there a LessWrongian term for a self-sustaining blind spot?

Interesting point about to what extent fandom is a thing, or more generally, any diverse bunch of human social systems which are sort of under one name are a thing.

comment by [deleted] · 2012-06-25T18:46:30.243Z · score: -2 (2 votes) · LW · GW

Is there a LessWrongian term for a self-sustaining blind spot?

There ought to be one, given that there have been lots of posts about them like this one which are often mentioned.

(I'm being deliberately vague about whether by ought to I mean ‘is likely’ or ‘had better’. :-))

comment by steven0461 · 2012-06-25T19:47:10.755Z · score: 2 (4 votes) · LW · GW

It is intended to convey the message, at least in the imagination of those advocating it,

Your link appears to point to the imagination of a critic, not the imagination of an advocate.

comment by RichardKennaway · 2012-06-26T08:01:41.127Z · score: -1 (1 votes) · LW · GW

It's the imagination of a critic imagining an advocate. I'll try and reword the link to make that clearer.

comment by shminux · 2012-06-21T15:28:20.578Z · score: -3 (3 votes) · LW · GW

Downvoted for excerpting what's already on this page without offering any comment.

Reread... Oops, should have paid attention.

comment by [deleted] · 2012-06-21T15:29:42.475Z · score: 8 (12 votes) · LW · GW

You should reread it. Compare it to the original. Consider what emotions you or the average person would experience reading these two.

Then consider why I would point that difference out.

comment by Kindly · 2012-06-22T12:55:19.279Z · score: 2 (2 votes) · LW · GW

The "American husband" version is the only one I had a strong emotional reaction to, because I think in popular culture "training" a husband to behave is much more common than the mirror version. Was that your intent?

(In fact, I had to tell myself that if I were in the husband's place I would probably approve of getting into the habit of picking up dirty clothes, enough to get past the condescension involved.)

comment by [deleted] · 2012-06-22T13:36:01.310Z · score: 6 (6 votes) · LW · GW

My intent was to point out that the husband version is usually taken as acceptable while the wife version is mostly not. Like I said I did expect many people on LessWrong to have different intuitions, this is why I invoked the mythical average person.

comment by CharlieSheen · 2012-06-21T15:36:25.456Z · score: 2 (8 votes) · LW · GW

Ah, the good old thought experiment, what would we do without it?

I can't wait until someone casually dismisses PUA/game as unethical or manipulative.

comment by [deleted] · 2012-06-21T15:38:21.908Z · score: -1 (5 votes) · LW · GW

Don't worry, they have a backup plan. Someone just has to bring up the "negging" straw man. It doesn't matter in their minds that most PUAs don't teach that any more because newbies tend to mess it up.

comment by wedrifid · 2012-06-21T16:17:41.062Z · score: 5 (11 votes) · LW · GW

Don't worry, they have a backup plan. Someone just has to bring up the "negging" straw man. It doesn't matter in their minds that most PUAs don't teach that any more because newbies tend to mess it up.

"Deliberately behaving attractively is Rape!"

It's comparatively easy to start a flame war on that subject. I've argued against both sides at times, when their respective extremists started saying ridiculous things.

comment by [deleted] · 2012-06-21T16:58:15.532Z · score: 6 (6 votes) · LW · GW

I'm rather frustrated because people refuse to believe that negging is a rather minor part of some approaches to PUA. When done right it's playful teasing. This is why I brought it up.

comment by Manfred · 2012-06-22T11:57:57.654Z · score: 3 (3 votes) · LW · GW

I'm rather frustrated because people refuse to believe that negging is a rather minor part of some approaches to PUA. When done right it's playful teasing. This is why I brought it up.

I suspect that, speaking literally, they do not refuse to believe that, and that you're disagreeing about something else.

comment by [deleted] · 2012-06-22T10:32:57.690Z · score: 0 (4 votes) · LW · GW

Actually David DeAngelo (the only PUA from which I've read a non-negligible amount of stuff) makes ‘cocky and funny’ the central point of his technique, IIRC.

comment by wedrifid · 2012-06-22T11:28:48.766Z · score: 1 (1 votes) · LW · GW

Actually David DeAngelo (the only PUA from which I've read a non-negligible amount of stuff) makes ‘cocky and funny’ the central point of his technique, IIRC.

Which, given that you have read a non-negligible amount of his stuff, you would know does not mean the same thing as negging (although you can do the latter while being the former if it happens to be appropriate). The "actually" is a non sequitur.

comment by [deleted] · 2012-06-22T12:41:48.194Z · score: 0 (0 votes) · LW · GW

(I don't remember him using the word “neg”, actually -- but it's been years since I read him.)

comment by Karmakaiser · 2012-06-21T17:40:59.245Z · score: 2 (2 votes) · LW · GW

I think I remember that. Weren't you replying to yourself over and over again taking different sides?

comment by wedrifid · 2012-06-21T17:58:30.254Z · score: 4 (4 votes) · LW · GW

Not that I recall, but if I did it sounds kind of amusing. :)

comment by [deleted] · 2012-06-22T10:30:49.326Z · score: 2 (6 votes) · LW · GW

“Negging” means jokingly insulting someone? And that's bad? I do that all the time, what an awful person I am. And that's supposed to lower people's self-esteem? (Gawd, are they that brittle?) And that's in turn supposed to make them more likely to sleep with me? (Huh, that doesn't seem to be working with me.)

comment by [deleted] · 2012-06-22T10:19:14.945Z · score: 1 (3 votes) · LW · GW

Huh? The only difference is that the genders are switched AFAICT... Maybe I would sense a difference if a stereotypically masculine task (e.g. washing the car) or a stereotypically feminine one (e.g. doing the laundry) was substituted for throwing dirty clothes into the hamper, but this is something that each spouse is supposed to do with their own clothes, so the situation is symmetrical... or what am I missing?

ETA: this is about the emotions I experience reading the two stories. I won't guess about ‘the average person’ because I'm well aware that such guesses are very unreliable.

comment by TheOtherDave · 2012-06-22T13:08:50.353Z · score: 8 (8 votes) · LW · GW

What you're missing is that many people will respond to the gender-swapped version differently, and Konk is calling attention to that fact.

comment by AdeleneDawner · 2012-06-21T05:03:47.008Z · score: 36 (36 votes) · LW · GW

Bit of a tangent, but if you ever run across someone for whom this doesn't seem to work, check the hypothesis that they don't parse praise as a positive reinforcer. I don't know how common this is, but I actually have to make a conscious effort to keep it from acting as a mild punishment in most cases when it's applied to me. (Ditto M&Ms in the given context, I expect. Attention Bad.)

comment by [deleted] · 2012-06-21T06:37:00.678Z · score: 10 (10 votes) · LW · GW

You are correct that there are many kinds of reinforcers, and it's important to make sure that the one you choose to use is something the receiver will desire.

"In other studies, animals and people given a choice between performing a task for either of two reinforcers often show strong preferences (Parsons & Reid, 1990; Simmons, 1924). Identifying preferred reinforcers can improve the effectiveness of a reinforcement procedure in applied settings (Mace et al., 1997).”

-Learning and Behavior, p149

comment by Will_Newsome · 2012-06-21T22:17:15.539Z · score: 8 (16 votes) · LW · GW

Furthermore at least one person I know (er, myself) picks up on any sort of test-like or game-like or we're-judging-you-so-you-better-not-screw-up-like context and starts acting in extremely confusing/uninformative/atypical/misleading ways so as not to be seen as the kind of person who is easily manipulable (there are probably other motivations involved too). Any incentive structure I'm put under thus has to somehow take this into account, even e.g. the LessWrong karma system. Explicitly manipulative socially mediated praise/M&Ms would strike my brain as outright evil and would stand some chance of being inverted entirely. That said I don't get the impression this sort of defense mechanism is very common.

comment by Oligopsony · 2012-06-27T20:37:23.418Z · score: 27 (27 votes) · LW · GW

Excellent insight. Downvoted.

comment by Lethalmud · 2013-06-26T14:45:51.413Z · score: 4 (4 votes) · LW · GW

So you are saying that, to change your mode of behavior, all one has to do is create a judging context? That would actually make you very easy to manipulate.

comment by TheOtherDave · 2012-06-21T22:21:21.501Z · score: 2 (2 votes) · LW · GW

Explicitly manipulative socially mediated praise/M&Ms would strike my brain as outright evil and would stand some chance of being inverted entirely. That said I don't get the impression this sort of defense mechanism is very common.

I experience this as common, but I suspect it's because of a small number of exceptionally vocal "manipulation is evil!" types in my life, rather than a larger number of typically vocal ones.

comment by Viliam_Bur · 2012-06-21T09:56:42.753Z · score: 3 (3 votes) · LW · GW

Yes, the situation is usually not so easy that behavior is just a result of inputs, like this:

output := f(input)

People have minds, and a mind is an environment, different for different people. The real equation would be more like this:

[mind1, output] := f([mind0, input])

For example, many people like the attention of others, but some people may be trained (for example by previous abuse) that attention of others is usually followed by pain. For them, positive reinforcement by giving them attention wouldn't work, because the important thing is not the attention per se, but what it means for them.
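A minimal sketch of the second equation, written in Python for illustration only. The `Mind` class, its `attention_valence` field, the step size of 0.1, and the output strings are all hypothetical stand-ins and not a model of real psychology; the point is only that the same input can yield different outputs and different next states depending on the mind it meets.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Mind:
    # Hypothetical internal state: positive if this person expects
    # attention to be followed by something good, negative if bad.
    attention_valence: float


def f(mind: Mind, stimulus: str) -> Tuple[Mind, str]:
    """Return (next mind, output) for a given (mind, input) pair."""
    if stimulus != "attention":
        # Inputs other than attention are ignored in this toy model.
        return mind, "no change"
    if mind.attention_valence > 0:
        # Attention is experienced as a reward, so the behavior is strengthened.
        return Mind(mind.attention_valence + 0.1), "repeats behavior"
    # Attention is experienced as a warning sign, so the behavior is suppressed.
    return Mind(mind.attention_valence - 0.1), "avoids behavior"


likes_attention = Mind(attention_valence=0.5)
fears_attention = Mind(attention_valence=-0.5)

# Same input, different minds, different outputs.
_, out_a = f(likes_attention, "attention")
_, out_b = f(fears_attention, "attention")
print(out_a)
print(out_b)
```

Because `f` also returns the updated mind, repeated calls would drift the two people further apart, which is one way to read the claim that the reinforcer matters less than what it means to the person.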

On a meta level, for someone even the idea of "learning" or "improving" or "changing" may be already associated with pain, so they will resist any such process if they notice it. A human mind can be messed up rather easily.

comment by DaFranker · 2012-07-07T04:16:39.137Z · score: 1 (1 votes) · LW · GW

[...] On a meta level, for someone even the idea of "learning" or "improving" or "changing" may be already associated with pain, so they will resist any such process if they notice it. A human mind can be messed up rather easily.

This becomes painfully common (and obvious to any observant third-party that knows these concepts) for subjects that students "are just not made for", such as the large numbers of students who "just don't get" maths. They've been trained in so many ways to associate actual learning (especially the actions taken when attempting to learn a concept) with negativity that it becomes obviously so much more rewarding to just guess the teacher's password, and so they are positively reinforced into doing everything they can to avoid mental modeling and seek password-guessing through aggregation and correlation of symbol-data. In most cases I've observed, they become experts at the skill of subconsciously forming "truth-tables" of teachers' passwords through brute-force trial-and-error tactics. What's more, this tactic, which they've been trained to do and learned so well and associate so much with positive feedback, often feeds itself into a vicious circle through several possible methods, which makes getting out (or, for that matter, even realizing that it's there and you need to get out of it) so much more difficult than if that behavior had been blocked immediately when it first appeared.

When I realized that, I started to feel sad for every student I see showing signs of spending hours upon hours studying and memorizing and headsmashing against the same math problems "until they finally understand them", when in truth they haven't really gained anything worthwhile (IMO) from the experience.

comment by [deleted] · 2012-06-21T05:20:21.920Z · score: 2 (4 votes) · LW · GW

I'd have to say that it shouldn't be that common. Most people want to be praised.

comment by erratio · 2012-06-21T13:21:35.467Z · score: 18 (18 votes) · LW · GW

Most people want to be sincerely praised. Someone who reads this post and applies it poorly is going to be saying praise while their body language says something else entirely. Or acting out of character for themselves, leading the reinforcee to suspect that the praise is insincere. Or they may go around praising seemingly everything, causing the reinforcee to interpret the praise as meaningless noise.

There are lots of ways for using praise as reinforcement to go wrong, and if someone is in one of those environments for long enough they will end up being conditioned to interpret praise as neutral or negative.

comment by JGWeissman · 2012-06-21T05:35:09.721Z · score: 3 (3 votes) · LW · GW

I suspect it is common enough that when you observe that praising someone doesn't reinforce their behavior or makes them uncomfortable, you should consider that they might have an unusual aversion to praise.

comment by pjeby · 2012-06-21T16:41:34.175Z · score: 8 (8 votes) · LW · GW

I suspect it is common enough that when you observe that praising someone doesn't reinforce their behavior or makes them uncomfortable, you should consider that they might have an unusual aversion to praise.

And also, that you might just be really bad at it. ;-)

This was my problem for quite a while: believing that I ought to praise people, while alieving that there wasn't anything to praise and that they didn't deserve it, due to all their obvious imperfections.

This, as you can imagine, produced sub-optimal results. ;-)

comment by AdeleneDawner · 2012-06-21T06:27:00.497Z · score: 1 (1 votes) · LW · GW

Yep. It's not a situation you're likely to come across often, but when you do, it's worth having the alternate theory available to check.

comment by mwengler · 2012-07-03T21:45:12.912Z · score: 0 (0 votes) · LW · GW

I would expect the rate at which people run counter to "usual" reinforcers to be far less than the rate at which people claim to run counter to "usual" reinforcers. Humans are not naturally very good at reflection of certain sorts.

Having said that I imagine reinforcing techniques would work brilliantly on me whether I knew I was being trained or not. I'd rather know I was being trained (and therefore know what I was being trained to do) and I would therefore wish to reinforce the trainer's openness about the process by having it work better when s/he is open. But even when I didn't realize it, I have every reason to believe that I am not some sort of exception to human neurobiology.

comment by AdeleneDawner · 2012-07-12T03:20:30.773Z · score: 1 (1 votes) · LW · GW

I would expect the rate at which people run counter to "usual" reinforcers to be far less than the rate at which people claim to run counter to "usual" reinforcers.

Yes, which is why I said 'someone for whom this doesn't seem to work', not 'someone who claims that this doesn't work on them' - though of course in the latter case it's at least polite to humor them.

I also didn't say that reinforcing techniques don't work on me - I've never run into anyone for whom that was even remotely plausible, in fact. Just, you have to use things that don't squick me out as positive reinforcers, and overt praise and rewards aren't in that category.

comment by lukeprog · 2012-06-21T03:35:23.645Z · score: 22 (24 votes) · LW · GW

Reason #228 I'm crazy and irrational: Without conscious attention to the reinforcement process, my behaviors are selected for reinforcement almost at random. The process selecting behaviors for reinforcement has tons of steps in it like "Did I happen to glance in the direction of the bag of M&Ms right now?" instead of "Is the thing I'm doing now something I want to reinforce?"

comment by TheOtherDave · 2012-06-21T03:39:47.722Z · score: 9 (9 votes) · LW · GW

(nods) For my own part, it's frequently worse than random... when I don't attend to what I'm doing, I frequently berate or otherwise punish myself for attempts to achieve a target that fall short of that target, and I'm more likely to do that the more I value achieving the target. Which is a great way to extinguish the behaviors I value.

comment by Viliam_Bur · 2012-06-21T09:45:25.347Z · score: 4 (4 votes) · LW · GW

I suspect it's very difficult to design the right reinforcement strategy. It's easy to reward something that seems related to the goal, but can gradually become a replacement for the goal.

For example rewarding success and punishing failure reinforces choosing only trivial tasks, which prevents learning new things. Rewarding starting new things reinforces starting new tasks without finishing them, also choosing tasks for being new, not being useful. Etc.

Rational thinking about consequences, and changing the strategy when necessary, cannot be avoided. So perhaps this should be reinforced. But how do we distinguish between genuine rationality and signalling? Yeah, rationalists should win, but by rewarding success and punishing failure... see the previous paragraph.

Anyway, many people do worse than random, so some reinforcement can be used to improve the situation.

EDIT: Another problem: I suspect that any reinforcement inevitably goes meta. When I get a reward for doing X, I will do X more, but I will also like the reinforcement mechanism more. When I get punished for doing Y, I will do Y less, but I will also hate the reinforcement mechanism and rationalize why I must get rid of it.

I suspect that people prefer wireheading, except in cases when it becomes too obvious that it is wireheading. If I am allowed to choose my reinforcement mechanisms, I will probably unknowingly slowly optimize them towards wireheading. If someone else chooses my reinforcement mechanisms, I suspect they will choose it to optimize their utility function instead of mine.

comment by mwengler · 2012-07-03T21:54:07.489Z · score: 2 (2 votes) · LW · GW

Yoiks! This may well be why my procrastination at work has increased and increased over the decades. I almost always (habitually?) feel like my efforts are not good enough, will be criticized negatively.

comment by TheOtherDave · 2012-07-03T22:02:44.442Z · score: 1 (1 votes) · LW · GW

(nods) That's a pretty common result of relying on punishment to shape behavior.

comment by tgb · 2012-06-21T02:42:27.586Z · score: 22 (28 votes) · LW · GW

I like this article because it is reasonably short, but very clear and highly actionable.

comment by sketerpot · 2012-06-21T23:21:58.058Z · score: 14 (14 votes) · LW · GW

This compliment is particularly effective because it's specific, verifiable, and true. I've never been very good at accepting vague compliments -- I tend to get embarrassed and self-conscious -- but more specific compliments are really nice.

comment by XFrequentist · 2012-06-25T17:42:34.348Z · score: 2 (2 votes) · LW · GW

This explanation of why the complimentary comment on the article was effective is itself effective, because it gives specific reasons why the compliment is unlikely to evoke the embarrassment sometimes associated with more vague compliments.

comment by johnlawrenceaspden · 2012-06-21T15:45:25.420Z · score: 21 (21 votes) · LW · GW

Thank you Luke for this beautifully written post.

A while ago I saw a kindly waitress give my friend's two year old daughter a small cookie in a restaurant. Various emotions flickered across her tiny face, and then she made a decision, accompanied by a small smile.

She broke the cookie into three pieces and gave them to her brothers. Completely unprompted.

I couldn't believe my eyes. I asked my friend, who is a lecturer in experimental psychology, whether altruism was normal amongst very young siblings.

He looked a bit smug and said "Well we put a lot of reinforcement into that."

I hadn't really thought about what that meant until now. Your clear writing has made it obvious.

As a result of your post, I think I'm going to try deliberately modifying some of my own behaviours this way, and maybe try the techniques on some friends. (The first time, by the way, that I've changed my behaviour as a result of reading less wrong, rather than just treating it as philosophical crack.)

For friends it seems that sincere praise / avoiding criticism would be good, but what would you recommend as rewards to self? I'm pretty sure that nicotine and pizza slices would work for me, but I'm also sure that those aren't things I want to do more of.

comment by Viliam_Bur · 2012-06-22T10:16:30.563Z · score: 8 (8 votes) · LW · GW

For friends it seems that sincere praise / avoiding criticism would be good, but what would you recommend as rewards to self? I'm pretty sure that nicotine and pizza slices would work for me, but I'm also sure that those aren't things I want to do more of.

M&Ms, one piece at a time -- they are small enough. (It would probably be good if you stop eating them in all other circumstances, but that is not a big sacrifice.)

Or try a symbolic reward. For example, put two glass boxes on your table, put 100 stones in the first one, and every time you want to reward yourself, move one stone from the first box to the second one, and congratulate yourself on progress. When all stones are in the second box, give yourself a big reward (pizza or whatever), swap the boxes, and start again. (This way the reward is still linked to pizza, but it is less pizza. And you see your progress all the time.)
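The two-box scheme can be sketched as a tiny counter, assuming Python purely for illustration. The class name `StoneBoxes`, its fields, and the demo size of 3 stones are hypothetical choices; the only parts taken from the description above are the two boxes, moving one stone per small reward, and the big reward plus swap when the first box is empty.

```python
class StoneBoxes:
    """Toy model of the two-box symbolic reward scheme."""

    def __init__(self, total_stones: int = 100) -> None:
        self.first = total_stones   # stones still waiting to be moved
        self.second = 0             # stones already earned
        self.big_rewards = 0        # how many times the big reward was earned

    def reward(self) -> bool:
        """Move one stone; return True when the big reward is due."""
        self.first -= 1
        self.second += 1
        if self.first == 0:
            self.big_rewards += 1
            # Swap the boxes: the full one becomes the new first box.
            self.first, self.second = self.second, 0
            return True
        return False


# A small demo with 3 stones instead of 100, so the cycle is visible.
boxes = StoneBoxes(total_stones=3)
results = [boxes.reward() for _ in range(4)]
print(results)
```

The count in the second box doubles as the visible progress indicator mentioned above, and the number of stones sets how rarely the large reward arrives.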

comment by TheOtherDave · 2012-06-22T12:33:49.427Z · score: 7 (7 votes) · LW · GW

Don't underestimate the power of praise as a self-reward. It feels really goofy to explicitly praise myself -- especially to do it out loud -- but that doesn't mean it doesn't work.

IME, the biggest problem with self-reward, whatever the mechanism, is that it requires quite a lot of discipline to differentially reward the thing I want to reinforce at all consistently.

The only time I ever really maintained that discipline for any length of time was when I was recovering from brain damage, when continued focus on self-improvement was the single most important thing in my life for about 18 months. In my real life, I just don't care that much. YMMV.

Recruiting allies to reward me works better for me.

comment by mstevens · 2012-06-21T15:47:01.318Z · score: 18 (18 votes) · LW · GW

To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Allyon 1981) The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."

I've noticed in pilates classes with one specific teacher you get positive feedback in one specific situation - when you're having trouble, and have just barely managed something basic. This leads to the association that whenever you get positive comments you know you're doing badly.

comment by [deleted] · 2012-06-21T18:20:33.310Z · score: 6 (6 votes) · LW · GW

Yeah, there's kind of a perceptual/patternmatching arms race going on there -- if you're too blatant about it, or the intended recipient of the reinforcement is just that perceptive, then they're reading the script too and it won't have the intended result. It could backfire (as in your example; semantically-positive reinforcement becomes pragmatically-negative), or send undesirable information ("you wouldn't have put it that way unless something were up, and that gives me a clue"), or open you to counter social-engineering scripts if the other party knows what they're doing.

comment by pnrjulius · 2012-07-05T01:27:58.737Z · score: 2 (2 votes) · LW · GW

If that's the case (and it seems like it is), then reinforcing yourself is going to be almost impossible, because you will by definition know the reinforcement script.

comment by Caspian · 2013-07-18T13:18:12.982Z · score: 0 (0 votes) · LW · GW

Reinforcing effort only in combination with poor performance wasn't the intent. Pick a better criterion that you can reinforce with honest self-praise. You do need to start off with low enough standards so you can reward improvement from your initial level though.

comment by mstevens · 2012-06-21T20:18:31.285Z · score: 2 (2 votes) · LW · GW

In my case I'm not terribly perceptive, but there's a lot of repetition of the same situation to give you a clue.

comment by Will_Newsome · 2012-06-21T04:51:33.425Z · score: 15 (21 votes) · LW · GW

Eagerly awaiting "The Power of Punishment".

comment by JulianMorrison · 2012-06-21T09:25:43.171Z · score: 13 (15 votes) · LW · GW

Anecdotally, punishment seems to be a good guilt-releaser, while guilt is dysthymic. Punishment may be effective at snapping someone out of a blue funk and getting them to be responsive to rewards. Guilty people reject rewards. (The above may work better if you are kinked that way.)

comment by RichardKennaway · 2012-06-26T09:19:56.688Z · score: 0 (0 votes) · LW · GW

I'm curious about the anecdotes. I feel like I'm reading travellers' tales of the weird customs of a distant tribe.

comment by JulianMorrison · 2012-06-26T20:27:24.620Z · score: 0 (0 votes) · LW · GW

How about I direct you to this blog for a gentle introduction?

comment by arundelo · 2012-06-26T21:51:01.510Z · score: 0 (0 votes) · LW · GW

It's guessable from context, but an NSFW tag would probably be good here.

comment by RichardKennaway · 2012-06-26T21:30:17.808Z · score: 0 (0 votes) · LW · GW

Ah, that sort of thing. Ok, not so Martian after all.

comment by JulianMorrison · 2012-06-26T21:36:06.603Z · score: 1 (1 votes) · LW · GW

Now you've made me curious what you thought it was.

Although to clarify, I meant that as generally as I said it. It applies in kink, and it applies out of kink. Kink just has the most readily accessible anecdotes.

comment by RichardKennaway · 2012-06-26T22:22:44.821Z · score: 1 (1 votes) · LW · GW

Now you've made me curious what you thought it was.

I just found it opaque. The context you linked had not occurred to me. Still is, to some extent, despite having cough some slight familiarity with such things, but people's experiences and conceptualisations vary enormously.

comment by FiftyTwo · 2012-06-26T01:18:51.081Z · score: 0 (0 votes) · LW · GW

What about self punishment?

comment by Viliam_Bur · 2012-06-26T09:08:47.669Z · score: 1 (1 votes) · LW · GW

What about self punishment?

Every time you do it, you condition yourself against doing it.

comment by JulianMorrison · 2012-06-26T21:28:33.753Z · score: 1 (1 votes) · LW · GW

If and only if it's negatively self reinforcing. Which it might not be, if it's serving some purpose.

Self-harm can help you to feel in control, and reduce uncomfortable feelings of tension and distress. If you feel guilty, it can be a way of punishing yourself and relieving your guilt. Either way, it can become a 'quick fix' for feeling bad.

-- http://www.rcpsych.ac.uk/mentalhealthinfo/problems/depression/self-harm.aspx

comment by wedrifid · 2012-06-21T05:10:45.401Z · score: 9 (11 votes) · LW · GW

Eagerly awaiting "The Power of Punishment".

Particularly good for demonstrating to observers that you have more status and power than the person you are punishing.

comment by Viliam_Bur · 2012-06-21T10:04:17.042Z · score: 7 (7 votes) · LW · GW

meh. downvoted.

(just joking)

comment by Will_Newsome · 2012-06-21T05:14:54.091Z · score: 1 (5 votes) · LW · GW

(demonstrating to observers / demonstrating to self / demonstrating to punished; status / power / resources / justification / need / etc; person / cognitive subsystem / institution / problem representation / etc)

comment by Dorikka · 2012-06-21T04:56:38.993Z · score: 0 (0 votes) · LW · GW

I'm curious -- where did your other post (a few paragraphs) go? I didn't think that people could permanently delete posts, only retract them, and I thought that a star appeared if you edited your post.

comment by Will_Newsome · 2012-06-21T06:09:36.609Z · score: 2 (2 votes) · LW · GW

Here's a fixed, less passive-aggressive version of the deleted comment:

This article implicitly reinforces reinforcement and punishes punishment. But there are situations in which punishment should be reinforced, e.g. if this article is in fact correct to punish punishment. I hope someday someone writes out a list of ways to efficiently torture oneself into having at least some hope of ultimately not being seen as obviously stupid in retrospect, to complement this article and perhaps adjust for any optimistic-biased selection that might have generated it.

comment by Will_Newsome · 2012-06-21T04:58:17.768Z · score: 2 (2 votes) · LW · GW

You get the option to delete if you retract and no one's commented. Which is perhaps not good, because I made a rather embarrassing terminological error in that comment that I probably deserve to be punished for.

comment by Eugine_Nier · 2012-06-22T04:01:06.122Z · score: 1 (3 votes) · LW · GW

You are aware that you can edit your posts? Such as fixing terminological errors.

comment by Will_Newsome · 2012-06-22T04:20:45.466Z · score: 4 (4 votes) · LW · GW

Upon a few minutes of reflection, I've decided that it wouldn't technically be logically impossible for me to not be aware that I can edit my posts. At first I thought that any person who can contribute to LessWrong for two years without realizing that they can edit their posts simply couldn't be me in any possible world. But it's true that weirdly specific brain damage or supernatural influence could in fact make it happen while leaving my identity intact. I have a stricter sense of logical possibility than most, but I guess I'll cordon off that debate for some other time.

Um anyway yeah I'm aware.

comment by Eugine_Nier · 2012-06-22T04:39:18.030Z · score: -1 (1 votes) · LW · GW

Then why didn't you fix the error rather than delete the post?

comment by Will_Newsome · 2012-06-22T05:01:47.021Z · score: 0 (0 votes) · LW · GW

It was a bad post overall. Also the terminological error was rather embarrassing so I wanted to erase it as quickly as possible.

comment by Dorikka · 2012-06-21T04:59:16.968Z · score: 0 (0 votes) · LW · GW

Gotcha -- thanks.

comment by Will_Newsome · 2012-06-21T05:00:50.997Z · score: 0 (0 votes) · LW · GW

But for some reason I can still see where the comment used to be, now with a "comment deleted" indicator, and it says it has one child, but it doesn't. Perhaps a synchronization error.

comment by JGWeissman · 2012-06-21T05:09:33.119Z · score: 1 (1 votes) · LW · GW

I had replied to point out the terminological error. You must have deleted after I started the comment but before I submitted. I then noticed your comment was deleted, so I deleted my response. (It might be a good idea to not allow a response to be submitted after the parent is deleted.)

comment by coffeespoons · 2012-06-22T10:32:33.113Z · score: 12 (12 votes) · LW · GW

I read this post last night. I was in the office late, not because I had a great deal to do, but because I was procrastinating. After reading it, I asked my friend to give me a quick call in half an hour to say congratulations if I'd finished all the work. It took me 10 minutes to finish! :)

comment by Jonathan_Graehl · 2012-06-22T23:18:15.809Z · score: 7 (7 votes) · LW · GW

But that's probably more of a public commitment effect.

comment by Normal_Anomaly · 2012-06-23T00:36:57.272Z · score: 7 (7 votes) · LW · GW

True. But I bet if coffeespoons makes this a routine thing, they'll eventually find themselves enjoying work more.

comment by Vladimir_Golovin · 2012-06-21T09:44:46.796Z · score: 12 (12 votes) · LW · GW
  1. Nice post SIAI! Have a $5 donation!

  2. I tried a similar reinforcement technique on myself but it didn't stick because I couldn't find a reliable trigger condition for dispensing the reward.

  3. Does this mean that we should stop punishing ourselves for procrastination?

comment by Kaj_Sotala · 2012-06-21T11:14:47.124Z · score: 15 (17 votes) · LW · GW

Does this mean that we should stop punishing ourselves for procrastination?

My personal experience strongly suggests that "stop punishing yourself for X" helps avoid X, for most if not all X. For instance, becoming a vegetarian was much easier when I didn't try to go cold turkey, but rather was fine with the fact that I would succumb to the lure of eating meat every now and then. When I did, I felt a little guilty, but then shrugged and thought that I'd try better the next time. I still fall victim to that temptation occasionally, but it's much more rare now than it used to be.

This might have something to do with the fact that if you punish yourself for trying and failing, you stop wanting to try in the first place, as it becomes associated with the negative emotions. Also, accepting and being okay with the occasional failure makes you treat it as a genuine choice where you have agency, not something that you're forced to do against your will.

See also It's okay to be (at least a little) irrational.

comment by Vladimir_Golovin · 2012-06-21T12:05:24.399Z · score: 5 (5 votes) · LW · GW

Perhaps this is why I like Autofocus better than GTD. "It is fine to have incomplete tasks in your task list".

Also, non-punishment for failures may be one of the distinctions between play-like work and work-like work.

comment by Caspian · 2013-07-18T13:35:29.676Z · score: 1 (1 votes) · LW · GW

I think I even have work-like play where a game stops being fun. And yes, play-like work is what I want to achieve.

comment by Vladimir_Golovin · 2013-07-19T06:29:36.220Z · score: 1 (1 votes) · LW · GW

In case of work-like play, I have a resolution: stop playing immediately. It doesn't mean quitting the game for good, but rather "end the session now, if a game permits that". Also, this is why I generally don't play games that punish me for leaving early (e.g. WoW raids, DOTA2).

comment by CAE_Jones · 2013-07-18T14:06:39.502Z · score: 1 (1 votes) · LW · GW

Anecdote: I've made suggestions to someone on how he might optimize the time he spends writing for his various projects, and more than once he's responded that a given strategy would make it feel too much like work (I don't remember offhand if he said explicitly that this would be an instrumental problem, or if that was only implied). I'm not really sure how I might go about applying this concept, mostly because of my extremely vague definitions of work / play, but I do find that having certain restrictions--something as simple as paper size, for example--tends to make it much easier to work on something. (I wrote a short story by specifying what it would need to fit in, and measuring books I'd made in the same format years earlier; I made a large number of maps for a game by using a format restricted to 32 tiles across, etc. I haven't found good ways to apply this strategy to most of what I try to do, though.)

comment by Rain · 2012-06-21T03:13:15.670Z · score: 12 (14 votes) · LW · GW

That's why I tried to stay positive when talking about the new SI website. Especially with technical changes like that, the (vocal) negative response can be overwhelming.

comment by lukeprog · 2012-06-21T03:37:15.405Z · score: 22 (22 votes) · LW · GW

Yup. When reading through the comments about the new website, I could feel my effort being punished.

comment by pjeby · 2012-06-21T16:52:37.735Z · score: 14 (14 votes) · LW · GW

Yup. When reading through the comments about the new website, I could feel my effort being punished.

Perhaps you could have somebody read them for you and summarize them in a non-critical way, thus creating a reinforcement shield.

Alternately, you could adapt what internet marketing "personalities" do, and promote doing: practice celebrating criticism. One marketer (I forget which one) described making a practice of throwing his hands in the air and shouting "Woo!" when he received a criticism via email.

(Background: "personality" marketers promote by writing emotionally charged material that's intended to divide their audience into people who either love or hate them. Thus, the presence of hate mail is evidence that their strategy is working. They will then often publicize the hate mail, in order to stir up the emotions of the people on the opposite side of the debate. Talk radio hosts, bloggers, political commentators, etc. also use these strategies, even if they're not always considered "marketers" in a traditional sense. Whether you consider this "dark arts" is largely a political question, since the LW sequences use these tactics also. Whether he knows it or not, Eliezer is a personality marketer in this sense, it's just that he's not as efficiently monetizing the results. ;-) )

comment by John_Maxwell (John_Maxwell_IV) · 2012-06-21T05:45:32.170Z · score: 4 (4 votes) · LW · GW

Sorry about that.

It seems to me that if humans were emotionless utility maximizers, we would prefer hearing criticism over praise, the same way programmers purchase more utility by fixing bugs in their programs than polishing features that already work. I suspect criticism is generally more valuable from a pure decision theoretic perspective.

I wonder if there is an effective way to buy encouragement and criticism separately. Also, it's hard to know exactly how best to encourage folks. In theory it's possible that making a new website is not the best use of SI's resources, which suggests reinforcement would not be optimal. But we still may want to reinforce you towards the more general behavior of taking steps to achieve your organizational goals. So what's the best response?

Maybe someone can develop some general guidelines for reinforcing/criticizing people, similar to what the nonviolent communication people came up with. (When {observable event} happened, I felt {feeling} because I need/value {underlying need that felt unmet or value that felt jeopardized}. Would you be willing to {specific request that person could do} in the future?) E.g. check to see if the person was acting with good intentions and reinforce them for those if they existed, check for super goals you endorse and reinforce them for working to accomplish those, check to see if the person could just as easily have sat around doing nothing and reinforce them for expending effort if this was the case, etc.

I think optimally criticism would have lots more reinforcers associated with it: people should be reinforced for requesting, giving, and receiving criticism because these are all activities that are naturally aversive but actually have high expected value.

So, I wholeheartedly endorse the following actions of yours: attempting to maximize humanity's collective utility function, working on the super goal of AGI safety, actually doing stuff, and deliberately gathering critical feedback. Go Luke!

comment by handoflixue · 2012-06-22T19:27:54.828Z · score: 0 (0 votes) · LW · GW

Criticism > Praise > Nothing. The problem is, people default to "Criticize or stay quiet", and so I tend to value praise highly, as it's culturally much rarer.

Also, if it's a matter of opinion (rather than an actual code bug), praise can actively offset criticism (1 person dislikes it, but the other 99 users all love the new UI... probably not wise to revert!)

comment by TimS · 2012-06-22T19:32:33.483Z · score: 0 (0 votes) · LW · GW

Edit: This statement is basically wrong. I was confusing negative instruction with punishment. Comment preserved for continuity.


Interestingly, adding a stimulus that decreases the frequency of a behavior (aka positive punishment) is less effective at changing behavior frequency than positive reinforcement.

That is, reinforcing an alternate behavior is more effective at decreasing a problem behavior than simply punishing the problem behavior. (I think this is even true when there are only two possible behaviors).

comment by TheOtherDave · 2012-06-22T19:41:27.192Z · score: 0 (0 votes) · LW · GW

Really? I'd love a reference for this. My understanding was always that positive punishment has a stronger effect on behavior frequency than (for example) training an incompatible behavior, but also has lots of other effects that I don't want to instill, which are often more important than maximizing effect on behavior frequency (e.g., reducing the rate at which novel behaviors are offered).

comment by TimS · 2012-06-22T19:54:15.681Z · score: 0 (0 votes) · LW · GW

Let me clarify slightly, because I wasn't trying to say something earth-shaking. If I did say something earth-shaking, I'm probably wrong.

My statement was made assuming that Bob already has a problem behavior that we would like to decrease the frequency of, and eventually extinguish. To be more concrete, let's say Bob has bathroom accidents (he voids away from the toilet). All I meant to say was the statement "Good job going pee-pee on the potty" is more effective at reducing the frequency of accidents than "Bob, you shouldn't go pee-pee in your underwear."

Yes, I'm toilet training my son - why do you ask? :)

comment by TheOtherDave · 2012-06-22T21:03:57.861Z · score: 1 (1 votes) · LW · GW

Well... hrm. That might very well be true about toilet-training, as increased anxiety is one of the side-effects of positive punishment, and anxiety interacts exceptionally poorly with bladder control. So, I dunno.

But in general, I'm pretty sure what you're saying isn't quite right. If I want to extinguish, say, jumping on the couch, consistently punishing incidents of jumping on the couch will extinguish the behavior much faster than pretty much anything else I can do.

Please don't misunderstand me; I absolutely don't endorse this as a training technique. But the reason I reject it isn't because it doesn't extinguish the behavior quickly... it does. The reason I reject it is because it creates a host of related side-effects that make subsequent training much more difficult, not to mention make the subsequent relationship with the trainer (and often with everyone else) much more unpleasant for the trainee.

Punishment is a blunt axe, but it's a powerful blunt axe.

comment by TimS · 2012-06-22T23:47:36.340Z · score: 2 (2 votes) · LW · GW

I talked with my wife, the future BCBA, and it appears that my intellectual reach has exceeded my grasp. First, I seem to have confused positive reinforcement v. punishment and positive and negative instruction. It is the case that negative instruction ("Don't throw your toy car") is less effective than positive instruction ("We only throw balls").

Second, there are some interventions, reinforcing and punishing, that could teach in one trial (consider heroin injections as reinforcement and flamethrowers as punishment). Edit: my wife says this point is about salience.

Third, best practices among behavior analysts are to use reinforcement prior to using punishment. My wife says that this is for ethical reasons - her reference book didn't talk about the relative effectiveness of reinforcement and punishment.

comment by TheOtherDave · 2012-06-23T01:26:58.414Z · score: 1 (1 votes) · LW · GW

I seem to have confused positive reinforcement v. punishment and positive and negative instruction.

Ah! Yes, that makes sense. Negative instruction doesn't work very well, it's true.

there are some interventions, reinforcing and punishing, that could teach in one trial

Mm... yeah, that's a good point. I was eliding the distinction between salience and reward/punishment, and ought not have.

comment by wedrifid · 2012-06-21T05:16:20.768Z · score: 2 (4 votes) · LW · GW

Yup. When reading through the comments about the new website, I could feel my effort being punished.

I am slightly surprised to hear this. I perhaps expected slightly less emotional involvement with the effort and more of a ", Go! Fix!" feeling.

comment by lukeprog · 2012-06-21T05:20:05.196Z · score: 2 (2 votes) · LW · GW

What happened is that (1) I felt my effort being punished, and then (2) I sent an email to Nickolai or Kamil asking them to fix X.

comment by wedrifid · 2012-06-21T05:26:36.762Z · score: 10 (12 votes) · LW · GW

I sent an email to Nickolai or Kamil asking them to fix X.

Great work Nickolai or Kamil, if either of you read lesswrong at all. The website is a much needed improvement! ;)

comment by wedrifid · 2012-06-21T05:48:36.557Z · score: 4 (4 votes) · LW · GW

(2) I sent an email to Nickolai or Kamil asking them to fix X.

I've noticed (while being such a minion) that when making such change requests yourself, you manage to do so with a frame that minimises a criticism vibe or 'effort punishment' feelings. I would pay many, many M&Ms for that effort in careful phrasing.

comment by NancyLebovitz · 2012-06-22T03:16:09.990Z · score: 0 (0 votes) · LW · GW

Thank you for toughing it out.

I'm sorry if my comments were too harsh.

comment by Vladimir_Nesov · 2012-06-21T18:56:10.292Z · score: 0 (0 votes) · LW · GW

I don't have that response, which probably accounts for me not being sufficiently mindful of expressing criticism to others... Do you think there may be a way to train positive or neutral response to criticism? Are there effective methods for making criticism less painful to a typical person?

comment by dbaupp · 2012-06-21T05:49:48.309Z · score: 2 (2 votes) · LW · GW

I tried to do the same. Although, I was probably significantly less successful than I'd liked to have been (sorry Luke, Nickolai, Kamil and anyone else who'd made an effort!).

Also, given lukeprog's comment, this unfortunately appears to be a case of history repeating itself: matt had a similarly negative experience when LW was redesigned a little while ago.

comment by wedrifid · 2012-06-21T06:28:39.070Z · score: 6 (8 votes) · LW · GW

matt had a similarly negative experience when LW was redesigned a little while ago.

That circumstance is somewhat different in nature. While, as far as I know, nobody wanted matt to experience negative affect, the discouragement of 'effort' was actually a perceived instrumental good, given an expectation that more effort would produce undesired outcomes.

I note that this relies on beliefs at the time. In that context users had to make the prediction "If a website administrator implements detrimental changes when previous discussion had already explained why such a thing was not desired and a prediction had been made that a change to the website would probably be bad, what is the probability that future 'effort' will be beneficial?" The answer is very, very low. The emotional distress matt experienced was his social instincts warning him that interfering with the tribe when they do not want you to is a dangerous act - especially when that interference is to (in effect) institute a prohibition against something they could previously do.

It turns out, however, that matt is a superior human being to the typical person in his role. While his ego did cause him to act more defensive than optimal and seemed to cause him to experience emotional distress, it did not cripple his ability to respond to user feedback or cause him to lash out with actions against the users as many would. The undesired change was eventually fixed, as were the few bugs that were introduced.

I expect users to drastically update how they would respond to matt if he made future website upgrades due to having more information about matt. He definitely deserves a lot of rewarding for going ahead and doing the bugfixes and implementing 'retraction then deletion' despite having received discouragement. He lives (here) in Melbourne. Perhaps I should give him a packet of M&Ms if I run into him at one of our meetups!

comment by [deleted] · 2012-06-26T19:10:24.732Z · score: 11 (11 votes) · LW · GW

The lead article conflates two processes: habits and incentives. The very term "reinforcement" dates back to before the distinction was well-understood. Only in the last decade has it been known that habit operates from a neurology distinct from incentives. (The habit mechanism is in a much older part of the brain.) Only the first story, Yudkowsky and the jellybeans, deals clearly with reinforcement of habit. The others are probably primarily adjustment of incentives.

In using habit and incentive, different rules apply. Incentives require that the subject discern the contingency. The processes Skinner studied as "reinforcement" are mostly about incentives. You adjust schedules of reinforcement to alter the organism's expectancies. For incentive effects, consistent reinforcement is not usually best, as the results are subject to extinction soon after the organism stops getting the reward.

Habits, on the other hand, are blind. The organism doesn't need to see any contingency. Yudkowsky continued to be nice even after he no longer received the jellybeans. To form habits, as opposed to incentive structures, consistency is key.

In short, as a general rule, you want consistency to reward habits and considerable randomness to create lasting incentives.

But the difference extends also to the ethical questions raised. Altering others' incentives for our own benefit is part of ordinary human interaction. If his colleagues surreptitiously timed the offer of jellybeans to Yudkowsky when he acted nice, this is something else; the ethical reason is that Yudkowsky need not recognize what he's being rewarded for to be affected by the jellybeans.

Both habit and incentive are "powerful." But they're powerful for different reasons, in different ways; and to apply them effectively and ethically requires different procedures.

comment by mwengler · 2012-07-03T21:52:14.460Z · score: 4 (4 votes) · LW · GW

How do you tell which things you want to reinforce are habits (and should therefore be reinforced consistently) and which things are incentives?

comment by bbleeker · 2013-02-19T11:22:12.468Z · score: 1 (1 votes) · LW · GW

I'd think a habit is something that just goes on as long as nothing happens to disrupt it. You no longer need to reinforce it.

comment by Houshalter · 2013-05-13T17:04:12.788Z · score: 0 (0 votes) · LW · GW

But the only difference in the process of creating the two is the amount of consistency? That doesn't seem quite right.

comment by bbleeker · 2013-05-14T09:08:46.828Z · score: 1 (1 votes) · LW · GW

I must confess I don't really understand the way 'incentive' is used in [deleted]'s post. Isn't an incentive usually a reward you use to get someone to do something? When I give my cat a treat to get her to come when I call her, the treat is the incentive. I didn't create the incentive - the cat already liked the treats. All I did is get her to associate me calling her with getting a treat.

comment by Pablo_Stafforini · 2012-10-19T00:21:03.294Z · score: 1 (1 votes) · LW · GW

Can anyone here point me to the relevant scholarly literature discussing the differences between habits and incentives? I tried Google and Google Scholar but failed to find any paper or survey article that explicitly contrasts these two processes.

comment by Swimmer963 · 2012-06-21T01:19:27.859Z · score: 11 (15 votes) · LW · GW

To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Ayllon 1981) The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."

I got a demonstration of how true this is yesterday when, during my taekwondo class, I was paired up with one of the senior black belt students, who has some but not a lot of experience teaching. He was supposed to be fixing up my poomsae (same thing as a kata in karate) and each time he watched me do it, I would finish and he would immediately launch into a description of what I was doing wrong. His feedback was pretty useful–specific, with demonstrations of exactly what to change in order to do it right–but without any prelude of "yay, good job!" or even "okay, the punches were way better that time...now let's work on the stances", I found myself getting really discouraged. Reminding myself that I wasn't actually doing worse than usual, that he just had a different teaching style, helped a little... But my subconscious brain still decided to feel resentful and unenthusiastic, no matter how counterproductive that might be towards my actual goal of improving my poomsae.

As a swimming instructor, I do make sure to dole out a LOT of praise, but I'm wondering if I should push it even further...

comment by [deleted] · 2012-06-21T01:32:15.858Z · score: 7 (7 votes) · LW · GW

I'm not sure a lot of praise is a good idea since that would lower its effectiveness as a reinforcer.

comment by Swimmer963 · 2012-06-21T02:01:00.478Z · score: 12 (12 votes) · LW · GW

Well, a lot of non-specific praise would water down the value of non-specific praise as a reinforcer, but taking the time to pick out more specific elements that are good/improving would probably reduce discouragement.

I think one of the things I forget most as an instructor is how easy it is to get discouraged, especially when you're being taught by someone who seems to be able to do all of it effortlessly. There's also the element of "I already know I'm doing it wrong! I just can't get my body to listen to my brain!" Instructors who don't acknowledge this and give praise for trying or noticing that I'm doing it wrong are a major source of discouragement for any new physical skill I try to learn.

comment by NancyLebovitz · 2012-06-22T03:12:39.709Z · score: 2 (2 votes) · LW · GW

"I already know I'm doing it wrong! I just can't get my body to listen to my brain!

Any advice on getting one's body or one's student's body to be more cooperative?

comment by Swimmer963 · 2012-06-22T12:31:34.142Z · score: 3 (3 votes) · LW · GW

Break complex movements down into lots of simple movements ("drills") and practice them individually, a lot...then string together the first two simple movements and practice that sequence a lot...then the first three in sequence...etc. Also, don't start by teaching/trying to learn the full complex movement in the first place–always start with the simplest possible subset, master that, and then worry about the next step.

comment by handoflixue · 2012-06-22T19:13:00.172Z · score: 1 (1 votes) · LW · GW

Very much second this. The most useful thing I've learned about learning is how to break down a complex action into multiple simpler ones that I can drill independent of each other.

comment by [deleted] · 2012-06-21T02:14:01.150Z · score: 1 (1 votes) · LW · GW

Excellent point. I stand corrected.

comment by Swimmer963 · 2012-06-21T02:23:58.974Z · score: 5 (5 votes) · LW · GW

I think you do have a valid point... However, in my experience, most instructors err way on the side of "too little praise" and don't have to worry about using it too much and lowering its effectiveness. And most humans I know have a brain setup where after hearing "good job on X" ten times, hearing it an eleventh time is still really reinforcing. So you'd have to really go to extremes to praise them too much...

comment by handoflixue · 2012-06-22T19:23:10.247Z · score: 5 (5 votes) · LW · GW

I was originally going to say I don't like excessive praise, but thinking about it, what I actually dislike are two things:

1) False praise. I really hate it when it's obvious someone is formatting EVERYTHING they say to match a script (the one that annoys me most is the "sandwich" model of praise-critique-praise. It's great for blows that need to be softened, but if you soften everything, then every mistake becomes equally trivialized).

2) Wasting my time. If the feedback boils down to "You did exactly as well as you did last time" then (a) I probably know this and (b) you can say approximately that without spending 2 minutes extolling my virtues. I'm usually impatient to get back to actually doing the activity. If I'm not impatient, it means I'm either seriously discouraged or don't value the activity at all - either way, unless it's my actual job, I'm unlikely to care about feedback at that point.

These apply to critiques even more-so than praise, but the style of "make everything 90% praise and act like mistakes are just this mild little thing" is a pattern I recognize very quickly, and find extremely discouraging, since it means I'm no longer receiving feedback that actually honestly represents how well I'm doing.

I also hate being given "points for effort" unless the person is correctly reinforcing "You'll probably need to repeat this drill 500 times before you have it down correctly, so be patient with yourself" (saying this when others are figuring it out in 5-10 drills is clearly lying, and will again seriously impact all feedback from that source >.>)

comment by Swimmer963 · 2012-06-22T19:59:13.979Z · score: 3 (3 votes) · LW · GW

Well, it sounds like you're someone who, if you already know how you did on something, don't need people to shower on praise unless it conveys new information. Am I right that you would find specific praise, or praise on something that you genuinely didn't know whether you'd done well on, less annoying?

I also hate being given "points for effort" unless the person is correctly reinforcing "You'll probably need to repeat this drill 500 times before you have it down correctly, so be patient with yourself" (saying this when others are figuring it out in 5-10 drills is clearly lying, and will again seriously impact all feedback from that source.)

As someone who regularly takes more repetitions of a drill to learn a certain (physical) skill than the average person, I damn well like getting points for effort–they're likely to be the only points I get for a while, and I tend to get seriously discouraged watching other people learn stuff easily when I'm struggling with it. I agree that someone who says "be patient, everyone needs to do this 500 times to get it right", when that's not the case, is not being helpful...but a simple "good effort, you will improve on this, don't worry" is a) not lying, and b) helps with the discouragement factor.

comment by handoflixue · 2012-06-25T19:25:27.466Z · score: 2 (2 votes) · LW · GW

conveys new information

This is indeed key, thank you for putting it more concisely than I

Am I right that you would find specific praise, or praise on something that you genuinely didn't know whether you'd done well on, less annoying?

It varies. "Be specific" is usually better, but "be brief" is also often important to me. A slow break-down of specifics is important if I don't know how to improve. A brief summary is fine if I'm improving on my own and really just need to get more repetitions. These days I'm usually aware of which one I need, and can ask for it. Previously, I'd just get frustrated if I needed more specific advice, because communication is exhausting, and learning is exhausting, and the combination of the two sucked.

I damn well like getting points for effort–they're likely to be the only points I get for a while

I've found that it varies - if I have things down and just need to drill, I'll often be entirely content off in a corner repeating something mindlessly with minimal feedback (an occasional "good effort, you will improve on this, don't worry" is very rewarding, but we're talking every 15-30 minutes)

Basically, I can enjoy drilling. I actually find it a ton of fun with most skills I've gotten good at - the skills I fail to improve are usually the ones where I don't enjoy drilling, and thus... don't drill. The assumption that just because I'm stuck repeating something a lot, I must need encouragement... tends to de-motivate me, because it says "hey, you're slow and abnormal and so I'm going to focus a lot on fixing you", which has a lot of bad connotations for me.

I strongly suspect your teaching style would not annoy me (or that you'd quickly adapt it around me), but a lot of people get stuck in the meme of "ALWAYS follow this script" and start completely ignoring body language cues that indicate a particular student DOESN'T do well with a certain script.

As an example: I hate, HATE the "positive - negative - positive" sandwich. It means I associate positive feedback with a lead-in to "something bad", and so any compliment is now a threat to me, a sign I did something else wrong. I also pattern-match fairly well, and once I figure out when it's a sandwich vs a genuine compliment, I'll get impatient to know why I was REALLY brought in to talk (i.e. what I did wrong).

comment by Swimmer963 · 2012-06-25T22:14:01.785Z · score: 0 (0 votes) · LW · GW

Previously, I'd just get frustrated if I needed more specific advice, because communication is exhausting, and learning is exhausting, and the combination of the two sucked.

This sounds like a challenging situation. How were you able to move past this in order to be able to ask for more specific feedback when you needed it?

I've found that it varies - if I have things down and just need to drill, I'll often be entirely content off in a corner repeating something mindlessly with minimal feedback (an occasional "good effort, you will improve on this, don't worry" is very rewarding, but we're talking every 15-30 minutes)

You are very lucky to be content in this kind of situation. I wish I could be more content.

The assumption that just because I'm stuck repeating something a lot, I must need encouragement... tends to de-motivate me, because it says "hey, you're slow and abnormal and so I'm going to focus a lot on fixing you", which has a lot of bad connotations for me.

I think I almost have a good connotation around this kind of situation. There are at least two areas (singing as a strong example, and competitive swimming as a weaker example) where I started out pretty awful. I could have compared myself to the people starting out at the same skill level as me...but that would have been pretty pointless. Other people who were as tone deaf as I was at age 11 just didn't try learning to sing. So I made my reference group the people who were doing solos in my choir. After a few years, I think most people actually forgot that they had originally considered me "slow and abnormal." I started to get the comment "well, obviously someone with your natural musical talent..." Ha. Right. But I did succeed in proving, to myself if not anyone else, that if I put myself into situations where I am "slow and abnormal" compared to everyone else, I will make much bigger improvements than if I stick with the activities where I'm already stronger than average.

comment by handoflixue · 2012-06-25T23:01:13.855Z · score: 3 (3 votes) · LW · GW

This sounds like a challenging situation. How were you able to move past this in order to be able to ask for more specific feedback when you needed it?

It's not really exciting to say it, but: 1) I learned to identify, internally, what my emotions correspond to (most critically, if I'm frustrated, it's probably because I'm practicing the wrong thing)

2) I've memorized a few phrases that tend to garner the feedback I need ("Can you be more specific?", "Can you break that down into smaller pieces?", "I feel like there's some little piece I'm missing that would make this all click together", and "Can you demonstrate slowly and narrate what you're doing?")

3) Most important, I have a strong CONCEPT of "this technique is actually a series of smaller techniques that I can drill separately". It's very hard to ask someone to break something down into simpler steps when you're stuck thinking about it as a single step. And I've broken things down often enough that I can communicate the idea to an instructor who doesn't have it as a concept.

3rd one also helps me evaluate things in advance: "this skill is beyond me - I will need to do something smaller and simpler first, otherwise I'll feel totally overwhelmed and have trouble learning." The tricky bit is usually just finding smaller pieces, but that's where an instructor is useful :)

comment by shokwave · 2012-06-26T01:49:00.593Z · score: 1 (1 votes) · LW · GW

Markdown doesn't play nice with the # character; you may need to edit it out to return to normal size.

comment by handoflixue · 2012-06-26T21:07:17.022Z · score: 0 (0 votes) · LW · GW

Thanks :)

comment by Armok_GoB · 2012-06-21T14:25:59.275Z · score: 1 (1 votes) · LW · GW

Hmm, I wonder if providing a lot of negative reinforcement on some attribute of them you don't care about would make the positive reinforcements more effective on the things you do care about.

Example: trying to teach someone math, praising them for everything they do right with the math, including trying, but complaining about their physique, fashion choices, hygiene, etc. Especially timing those unrelated complaints to when they seem less focused on the math, but subtly enough that they don't consciously notice the correlation.

Not that this isn't a bad idea for other unrelated reasons...

comment by TheOtherDave · 2012-06-21T15:24:48.808Z · score: 2 (2 votes) · LW · GW

There are a couple of factors here worth keeping in mind.

One is that classical conditioning continues to work, even when I'm concentrating on operant conditioning. So one result of this strategy is that my target will come to associate me with aversive stimuli, which will in turn reduce the effectiveness of my attempts at reinforcement. They will similarly associate the teaching sessions and math with those stimuli, which may be counterproductive.

Another is that a target consciously noticing my attempts at conditioning changes the whole ball game, in ways I don't entirely understand and I'm not sure are entirely understood. Sometimes it's a huge win. Sometimes it's a huge loss. Staying subtle is more predictable, if I can do it, but of course it's not always possible to avoid detection, and sometimes it's better to admit to my attempts at conditioning than to be caught out at them. The safest move is to first establish a social context where my attempts at conditioning can be labelled "manners," such that any attempt to call me out on them is inherently low-status, but that's not always possible either.

When using praise signals as reinforcers for systems, like some humans, who are capable of skepticism about my motives, it helps to be seen to use expensive signals. (Attention often works well, which is one reason Internet trolls are so persistent.) Of course, that typically means I have to invest resources into my conditioning efforts.

In general, the approach I endorse is to maintain (and adjust as needed) a consistent threshold of evaluation, ignore behavior that falls below that threshold, reward behavior that clears it, and resist the temptation to go meta about the process.

comment by [deleted] · 2012-06-21T17:47:50.581Z · score: 1 (1 votes) · LW · GW

Sounds like an interesting idea for an experiment, although it would probably violate ethical guidelines. :P

comment by wedrifid · 2012-06-21T14:38:04.948Z · score: 1 (1 votes) · LW · GW

Hmm, I wonder if providing a lot of negative reinforcement on some attribute of them you don't care about would make the positive reinforcements more effective on the things you do care about.

The example you give is either punishment of the other attributes or negative reinforcement of the desired behavior (if you look at it from the perspective of taking away the aversive stimulus only when the math is done.)

comment by phonypapercut · 2012-06-21T02:06:30.377Z · score: 0 (0 votes) · LW · GW

Would it? There would be greater contrast between the reinforcement and the ignoring of poor performance.

comment by [deleted] · 2012-06-21T02:15:39.749Z · score: 2 (2 votes) · LW · GW

Well the idea I was going for was that it would be better to praise improvements in skill rather than just good performance.

comment by mapnoterritory · 2012-06-21T19:30:51.039Z · score: 10 (10 votes) · LW · GW

Daniel Kahneman in Thinking, Fast and Slow:

I had stumbled onto a significant fact of the human condition: the feedback to which life exposes us is perverse. Because we tend to be nice to other people when they please us and nasty when they do not, we are statistically punished for being nice and rewarded for being nasty.

The reason for that lies in regression to the mean when training (his example is flight instructors in the Israeli Air Force):

I pointed out to the instructors that what they saw on the board coincided with what we had heard about the performance of aerobatic maneuvers on successive attempts: poor performance was typically followed by improvement and good performance by deterioration, without any help from either praise or punishment.

Since positive reinforcement is so counterintuitive: don't forget to reward yourself for rewarding somebody for good behaviour! :)
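The regression effect in the second quote can be checked with a small simulation. This is only a sketch under assumptions that are mine, not Kahneman's: every attempt is a fixed skill level plus independent Gaussian noise, feedback has no effect at all, and the cutoffs of -1.0 and +1.0 for "bad" and "good" attempts are arbitrary.

```python
import random


def simulate(n_trials=100_000, seed=0):
    """Average change in performance after bad and after good attempts.

    Skill is constant and nothing is learned, so any apparent
    "improvement after punishment" is pure regression to the mean.
    """
    rng = random.Random(seed)
    skill = 0.0  # fixed skill: feedback changes nothing in this model
    after_bad, after_good = [], []
    prev = skill + rng.gauss(0, 1)
    for _ in range(n_trials):
        cur = skill + rng.gauss(0, 1)
        if prev < -1.0:    # a bad attempt, the kind that gets yelled at
            after_bad.append(cur - prev)
        elif prev > 1.0:   # a good attempt, the kind that gets praised
            after_good.append(cur - prev)
        prev = cur

    def mean(values):
        return sum(values) / len(values)

    return mean(after_bad), mean(after_good)


change_after_bad, change_after_good = simulate()
print(f"average change after a bad attempt:  {change_after_bad:+.2f}")
print(f"average change after a good attempt: {change_after_good:+.2f}")
```

Under these assumptions the average change after a bad attempt comes out positive and the average change after a good attempt comes out negative, even though neither praise nor punishment exists in the model.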

comment by faul_sname · 2012-06-22T01:47:15.509Z · score: 9 (9 votes) · LW · GW

Speaking of regression to the mean, that seems to be one topic that wasn't really covered in the sequences that really should have been.

comment by Eugine_Nier · 2012-06-22T04:44:33.864Z · score: -1 (1 votes) · LW · GW

I had stumbled onto a significant fact of the human condition: the feedback to which life exposes us is perverse. Because we tend to be nice to other people when they please us and nasty when they do not, we are statistically punished for being nice and rewarded for being nasty.

So you (or at least Kahneman) implicitly admit that punishment is effective at changing behavior.

comment by mapnoterritory · 2012-06-22T06:48:19.346Z · score: 2 (2 votes) · LW · GW

Yes, I think so and apparently so does Kahneman. I don't think this is particularly controversial. Kahneman does say that positive reinforcement is more efficient (both in animals and humans).

comment by Vaniver · 2012-06-22T05:33:55.975Z · score: 2 (2 votes) · LW · GW

Everyone who's looked at the data thinks that punishment can change behavior. The question is whether punishment makes the changes you want, and people dramatically overestimate the usefulness of punishment and dramatically underestimate the usefulness of positive reinforcement.

comment by Viliam_Bur · 2012-06-22T10:23:10.375Z · score: 5 (5 votes) · LW · GW

The question is whether punishment makes the changes you want

Also it depends on the definition of what you "want" -- for example if you punish someone for bad behavior, what exactly is your goal?

  • to help them improve their behavior?
  • to signal to other people that you care?
  • to have higher status than the punished person?

All three goals are pleasant, though only the first one is officially desirable. The punishment works in all directions. Perhaps this is the reason why behavior change by punishment is popular more than it deserves; and why people rationalize its usefulness even when the first goal visibly fails.

comment by Vaniver · 2012-06-22T16:09:11.316Z · score: 1 (1 votes) · LW · GW

Agreed. Hopefully, instructors care most about the first- but in general human interaction, the others can easily rise to prominence.

comment by Eugine_Nier · 2012-06-23T06:18:53.957Z · score: 0 (2 votes) · LW · GW

Depends, the current "everyone is special, everyone deserves an A for trying" culture almost certainly overvalues positive reinforcement.

comment by pnrjulius · 2012-07-05T01:26:03.914Z · score: 1 (1 votes) · LW · GW

Everyone getting an A isn't reinforcement. Reinforcement has to be conditional on something. If you give everyone who writes a long paper an A, that's reinforcing writing long papers. If you give everyone who writes a well-written paper an A, that's reinforcing well-written papers (and probably more what you want to do).

But if you just give everyone an A, that may be positive, but it simply isn't reinforcement.

comment by Vaniver · 2012-06-23T15:26:39.101Z · score: 1 (1 votes) · LW · GW

I see a difference between 'niceness' and 'positive reinforcement'. The "everyone deserves an A for trying" approach is 'nice' but it generally isn't skillful positive reinforcement; I think a major problem with it is underestimating how much it rewards behaviors that look like trying but aren't trying.

There's also a basic value question- if you're trying to build self-esteem, it's not clear that an "A for trying" approach overvalues positive reinforcement, though if you're trying to build understanding, it clearly would be a misapplication of positive reinforcement.

comment by shminux · 2012-06-21T15:21:11.690Z · score: 10 (10 votes) · LW · GW

But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.

Thanks for pointing out this particular low-hanging fruit.

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

I wonder if they had just (re-)watched this Big Bang Theory episode.

you don't get a sea lion to balance a ball on the end of its nose by nagging

Hmm, I better keep this in mind at all times when dealing with my family.

comment by JGWeissman · 2012-06-21T01:58:56.814Z · score: 10 (12 votes) · LW · GW

On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability than an Eliezer-led FAI research team will work. What caused this change, do you think?"

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

If I recall my high school psychology class correctly, you can get a stronger and more persistent effect by secretly rolling a die and noting the number; when Eliezer has said that many nice things, give him an M&M, then roll the die again for a new target number of nice things.
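The die-rolling procedure described above can be sketched as follows. The class and method names are hypothetical, and a six-sided die is assumed; the comment does not specify one.

```python
import random


class DiceSchedule:
    """Reward after a secretly rolled number of nice remarks, then re-roll."""

    def __init__(self, sides=6, rng=None):
        self.sides = sides  # assumed six-sided die
        self.rng = rng if rng is not None else random.Random()
        self.remaining = self._roll()

    def _roll(self):
        # New secret target: how many nice remarks until the next M&M.
        return self.rng.randint(1, self.sides)

    def record_nice_thing(self):
        """Count one nice remark; return True if it earns a reward."""
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self._roll()
            return True
        return False


schedule = DiceSchedule(rng=random.Random(42))
rewards = [schedule.record_nice_thing() for _ in range(600)]
print(f"{sum(rewards)} rewards for {len(rewards)} nice remarks")
```

With a fair six-sided die the gap between rewards is anywhere from one to six remarks, so the recipient cannot predict which remark will pay off.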

comment by TheOtherDave · 2012-06-21T02:12:48.595Z · score: 18 (18 votes) · LW · GW

That's true and false. Intermittent reinforcement gets a more robust effect than continual reinforcement, yes, but randomly intermittent reinforcement isn't as effective as setting the reward threshold higher as the behavior becomes more common... e.g., rewarding only the 10% nicest things.

comment by matt · 2012-06-21T19:09:46.052Z · score: 5 (5 votes) · LW · GW

I want to design a reinforcement schedule in one of our apps. Can anyone link me to some specific guidelines on how to optimise this?

(Reinforce exactly what % of successes (30%? 26%? 8%?)? Reinforce performances in the top 10% of past performances (or the top 12%, or the top 8%?)? How does time factor (if the user hasn't used the app for a week, should I push a reinforcer forward?)?)

comment by TheOtherDave · 2012-06-21T19:17:44.674Z · score: 0 (0 votes) · LW · GW

I can't, but if you find anything concise and useful, I'd love to hear about it myself.

My rule of thumb is to set the threshold so as to reinforce the top 20% or so of performances, and arrange performance frequencies so I'm reinforcing 2-3 times/minute during active training periods. But that's not based on anything.

I'll also note that reinforcing higher-tier performances more strongly works really well (though is hard to do consistently by hand), as do very intermittent "jackpots" (disproportional and unpredictable mega-rewards).
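For an app, the rule of thumb above might be sketched like this. It is an illustration, not a tested guideline: the class name, the 50-score history window, the five-score warm-up, and the 2% jackpot chance are all assumptions of mine, and the top-20% cutoff is the rule of thumb above, which its author says is not based on anything.

```python
import random
from collections import deque


class ThresholdReinforcer:
    """Reward scores that beat roughly the top 20% of recent scores."""

    def __init__(self, top_fraction=0.2, jackpot_chance=0.02,
                 history=50, rng=None):
        self.top_fraction = top_fraction
        self.jackpot_chance = jackpot_chance  # assumed value, not from the thread
        self.scores = deque(maxlen=history)   # only recent scores set the bar
        self.rng = rng if rng is not None else random.Random()

    def evaluate(self, score):
        """Return 'jackpot', 'reward', or None for a new performance score."""
        past = sorted(self.scores)
        self.scores.append(score)
        if len(past) < 5:
            return None  # too little history to set a threshold yet
        index = min(int(len(past) * (1 - self.top_fraction)), len(past) - 1)
        if score <= past[index]:
            return None  # below the bar: ignore rather than punish
        if self.rng.random() < self.jackpot_chance:
            return "jackpot"  # rare, unpredictable mega-reward
        return "reward"


rng = random.Random(1)
reinforcer = ThresholdReinforcer(rng=rng)
outcomes = [reinforcer.evaluate(rng.random()) for _ in range(1000)]
rewarded = sum(outcome is not None for outcome in outcomes)
print(f"rewarded fraction: {rewarded / len(outcomes):.2f}")
```

Because the cutoff is computed from a sliding window of recent scores, the bar rises automatically as performance improves, which matches the earlier suggestion of raising the threshold as the behavior becomes more common.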

comment by Paul Crowley (ciphergoth) · 2012-06-21T06:18:23.482Z · score: 5 (7 votes) · LW · GW

When the threshold is "something nice", there's going to be randomness in the reinforcement anyway.

comment by dbaupp · 2012-06-21T05:41:52.466Z · score: 4 (4 votes) · LW · GW

Some previous discussion about this form of conditioning.

comment by wedrifid · 2012-06-21T16:50:22.117Z · score: 7 (17 votes) · LW · GW

The central lesson I learned from exotic animal trainers is that I should reward behavior I like and ignore behavior I don't. After all, you don't get a sea lion to balance a ball on the end of its nose by nagging. The same goes for the American husband.

Back in Maine, I began thanking Scott if he threw one dirty shirt into the hamper. If he threw in two, I'd kiss him. Meanwhile, I would step over any soiled clothes on the floor without one sharp word, though I did sometimes kick them under the bed. But as he basked in my appreciation, the piles became smaller.

My wife, if pulling that kind of stunt, would quickly find that her affections were shunned and her thanks were met with clear contempt (after she was asked politely not to do that the first time). It is almost certainly not in her interests to produce a pavlovian association between her affections and attempts to control me against my wishes. My aversion to hostile takeover of internal motivations is much stronger than my desire for the affections of any particular individual.

This would be entirely different if I had made a prior agreement regarding shirts and hampers. Making it motivationally easier and more enjoyable to do things I am willing to do is to be encouraged.

comment by Viliam_Bur · 2012-06-22T11:06:49.476Z · score: 5 (5 votes) · LW · GW

I do accept this kind of reinforcement from my significant other, assuming that:

  • it is for a goal I agree with (extrapolated volition)
  • I am free to say "stop doing this" if I don't feel like being reinforced; and my wish is respected
  • I do get the same signs of affection in other situations too.

Actually I consider it very useful, and for me it would be a waste not to use this kind of cheap "external willpower". YMMV.

comment by wedrifid · 2012-06-22T11:24:58.532Z · score: 1 (1 votes) · LW · GW

I do accept this kind of reinforcement from my significant other, assuming that:

Note that I consider the reinforcement you are describing to be entirely different in kind (not "this kind"). The boundaries around the kind I accept are approximately the same as yours:

  • it is for a goal I agree with (extrapolated volition)
  • I am free to say "stop doing this" if I don't feel like being reinforced; and my wish is respected
  • I do get the same signs of affection in other situations too.

I go by what my intuition tells me but when formalizing those intuitions something similar is generated.

Actually I consider it very useful, and for me it would be a waste not to use this kind of cheap "external willpower".

I make a point of rewarding desired reinforcement (while attempting 'extinction' on less desirable influence tactics like nagging or punishment.)

comment by [deleted] · 2012-06-24T18:31:43.958Z · score: 0 (2 votes) · LW · GW

The boundaries around the kind I accept are approximately the same as yours:

it is for a goal I agree with (extrapolated volition)

I supposed the reason why the husband in the story didn't put his clothes in the hamper was that he was too lazy to do that, not that he (terminally) valued that the clothes stayed outside the hamper.

comment by wedrifid · 2012-06-25T06:49:54.539Z · score: 0 (2 votes) · LW · GW

I supposed the reason why the husband in the story didn't put his clothes in the hamper was that he was too lazy to do that, not that he (terminally) valued that the clothes stayed outside the hamper.

Having a terminal value for clothes outside the hamper isn't the point. It is whether given the negotiated relationship boundaries and typical behaviors as they currently are the person being modified would prefer "status quo except I do more" over "status quo".

"Too lazy" can be left out of such considerations. That doesn't distinguish between akrasia and considered intent not to do the thing (for whatever reason). For the most part judgements like "too lazy" are just another method of attempting influence - usually a method that is inferior to reinforcement.

comment by TheOtherDave · 2012-06-25T13:58:40.692Z · score: 1 (1 votes) · LW · GW

For the most part judgements like "too lazy" are just another method of attempting influence - usually a method that is inferior to reinforcement.

Well, making judgments like "too lazy" can also provide valuable social cover for other kinds of reinforcement (or punishment), within communities where deliberately altering the behavior of others is seen as unacceptable unless I can frame it as being for their benefit.

More generally, motivated speculation about other people's best interests (including but not limited to positing that they possess unexpressed "terminal values" that happen to align better with what I seem to want than with what they seem to want) can be a very useful way to ignore people's stated preferences without feeling (or being seen by third parties as) indebted to them.

comment by pjeby · 2012-06-21T17:27:30.225Z · score: 5 (17 votes) · LW · GW

My wife, if pulling that kind of stunt, would quickly find that her affections were shunned and her thanks were met with clear contempt

Seriously? You'd shun your wife because she said thank you? i.e.

I began thanking Scott if he threw one dirty shirt into the hamper

comment by [deleted] · 2012-06-21T18:15:21.025Z · score: 13 (13 votes) · LW · GW

Some people react quite viscerally to the awareness that another party is trying intentionally to steer their behavior in any way. It seems to just be a massive squick button for some (indeed, I notice that most randomly-selected people who are made aware of explicit attempts to condition behavior react with discomfort at minimum); for others, there seems to be a correlation with triggers gained from abusive interactions earlier in life; a few I knew who reacted strongly showed strong indications of sociopathy and seemed to instinctively feel violated if someone else successfully, or even just obviously, tried to affect their behavior in a deliberate manner toward some end (a normal part of cognition and social interaction for them directed at others).

comment by wedrifid · 2012-06-21T18:21:11.292Z · score: 6 (10 votes) · LW · GW

Seriously? You'd shun your wife because she said thank you?

(No, I said I would shun kisses delivered under those circumstances. No cutting and pasting of my keywords for the sake of hyperbole thanks.)

If people use their affection in a way that is obviously intended to systematically manipulate me to do things that I do not, in fact, wish to do then yes, of course those instances of affection I will shun. While I know some people are more tolerant to that kind of blatant disrespect I would expect you to at least be able to comprehend the subset of people that will not.

I'm afraid that all women who want kisses to serve the role of doggy treats within our relationship are out of luck. I have yet to experience a problem with having that policy. My model of myself predicts that rewarding hostile-to-my-interests-reward-training with increased compliance or acceptance would leave me with relationships that were far less satisfying and in particular far less enjoyment of displays of affection.

comment by Vaniver · 2012-06-21T18:26:39.241Z · score: 10 (18 votes) · LW · GW

So, I have to ask: do you in fact have a wife?

comment by handoflixue · 2012-06-22T19:41:55.584Z · score: 6 (10 votes) · LW · GW

The phrases "of course" and "blatant disrespect" imply a shared frame of reference that doesn't seem to be in evidence. While it might be considered rude to you, it's pretty much human nature. The phrase "thank you" is, as near as I can tell, pretty much entirely meant as a positive reinforcer.

So, having established that we have different frames of reference, can you go into WHAT behaviors bother you? Is it the use of specific actions as reinforcers ("thank you" is okay but kissing is not?) or is it just the deliberate (as opposed to socialized and subconscious) application of these techniques? Or something else that I'm missing?

comment by pjeby · 2012-06-21T21:04:15.816Z · score: 5 (9 votes) · LW · GW

If people use their affection in a way that is obviously intended to systematically manipulate me to do things that I do not, in fact, wish to do then yes, of course those instances of affection I will shun.

Since positive reinforcement can only be applied after you already do a thing, then presumably, you at least wished to do it once. So, how is providing you with a bonus to something you've already done, manipulating you to do something you don't "wish to do"?

comment by wedrifid · 2012-06-22T03:04:27.258Z · score: 2 (4 votes) · LW · GW

Caveat: I don't know why the husband in question doesn't just put his damn clothes in the hamper. Doesn't the idea of having soiled clothes lying around repulse him anyway? Especially when sharing the space with another. I mean... ewww. But now back to assuming the target behavioral territory is not already granted by the obvious Schelling point or prior arrangement.

So, how is providing you with a bonus to something you've already done, manipulating you to do something you don't "wish to do"?

It seems you wish to unilaterally accept rewarding behavior as positive. I don't. I have no trouble detecting when rewards are being used as "approximations" towards a behavioral landscape that I clearly don't want or, especially, have previously declared that I would not accept. I am also able to predict - by reference to past experience and knowledge of my own preferences - that encouraging that reward pattern gives undesired outcomes. As Vaniver mentioned, an important skill to develop is the ability to detect the difference between desired and undesired manipulations.

As a somewhat separate issue, excessive use of physical affection (kisses, hugs, sex) as a "reward" for good behavior changes the experience of those activities - and not in a good way.

comment by handoflixue · 2012-06-22T19:45:02.824Z · score: 4 (4 votes) · LW · GW

excessive use of physical affection (kisses, hugs, sex) as a "reward" for good behavior changes the experience of those activities - and not in a good way.

Could you elaborate on that? I'm entirely okay with physical affection being used as a "reward", as long as it's also clear that the person genuinely wants affection with me, and initiates it "just because" too (actually I'd probably be entirely okay with a strictly reward-based system of affection, as long as it was explicit...)

I have no trouble detecting when rewards are being used as "approximations" towards a behavioral landscape that I clearly don't want

You seem to be assuming, in the example, that the husband doesn't WANT to be modified to put away his laundry. Is that correct?

If so, is it correct that your objection is "you're manipulating me into a state I don't desire" rather than simply "you're manipulating me"? Given that you PERSONALLY find soiled clothes disgusting, would you PERSONALLY appreciate reinforcement that helped you overcome such a habit?

comment by wedrifid · 2012-06-23T03:39:59.978Z · score: 2 (4 votes) · LW · GW

You seem to be assuming, in the example, that the husband doesn't WANT to be modified to put away his laundry. Is that correct?

Yes.

If so, is it correct that your objection is "you're manipulating me into a state I don't desire" rather than simply "you're manipulating me"? Given that you PERSONALLY find soiled clothes disgusting, would you PERSONALLY appreciate reinforcement that helped you overcome such a habit?

Yes.

comment by pjeby · 2012-06-22T20:34:10.810Z · score: 3 (5 votes) · LW · GW

Hm. You quoted a question I asked, and then proceeded to not answer it in any way. The question was:

How is providing you with a bonus to something you've already done, manipulating you to do something you don't "wish to do"?

Instead of answering that question, you supplied various generalizations whose referents in physical reality I can't ascertain. Please give an example of a situation where somebody being, say, happy that you did something, means that they are manipulating you to do something you don't "wish to do" (your previous words).

comment by TheOtherDave · 2012-06-22T21:13:15.663Z · score: 4 (4 votes) · LW · GW

Well, I'm not wedrifid, but OK.

Suppose there's a crisis at work, and in response to that crisis I step in and solve a problem.
Suppose, as part of solving that problem, I take some steps (X) that I don't enjoy doing and don't wish to do again.
Suppose my boss notices that I did X and was effective at it and decides that she wants me to do X more regularly, and being familiar with the uses of positive reinforcement decides to hand me a large bonus at our next status meeting. Further, she praises me to the skies in public for having done X, and does so in a way that communicates the (entirely accurate) message that my continuing to receive such praise is contingent on my continuing to do X.

I assert that, in this scenario, my boss is applying positive reinforcement techniques with the goal of increasing my likelihood of doing X, by providing me with a bonus to something I've already done, where X is something I don't wish to do.

Do you agree?

As to whether, in so doing, she's manipulating me... (shrug) I've already had that discussion once too often this week. If our only remaining point of disagreement about that scenario is whether the word "manipulating" properly applies to it, I'm happy to leave that point unresolved.

comment by pjeby · 2012-06-23T01:25:41.414Z · score: 0 (4 votes) · LW · GW

I assert that, in this scenario, my boss is applying positive reinforcement techniques with the goal of increasing my likelihood of doing X, by providing me with a bonus to something I've already done, where X is something I don't wish to do.

So? Are you saying this is a bad thing? That's what I'm asking wedrifid. Are you offended by said boss doing this?

Ironically, in your scenario, your boss is actually elevating your status: trying to please you in order to obtain a consent that in principle could be had by simply ordering you to do more X. So I don't think it's analogous to the situation that upsets wedrifid here.

comment by TheOtherDave · 2012-06-23T01:37:05.078Z · score: 4 (6 votes) · LW · GW

So?

So, you asked for "an example of a situation where somebody being, say, happy that you did something, means that they are manipulating you to do something you don't "wish to do"," and I gave you one.

Apparently, you also wanted an example where the person isn't also elevating my status in the process, isn't trying to please me, and isn't trying to get me to agree to something that they could order me to do. I didn't realize that, sorry.

No, I can't think of any coherent examples where someone tries to use positive reinforcement to alter my behavior by doing something that doesn't please me.

Tapping out now.

comment by wedrifid · 2012-06-23T02:32:50.378Z · score: 1 (3 votes) · LW · GW

Tapping out now.

As am I. I refer any interested observers to the previous comments by myself, TheOtherDave, Vaniver and others, as well as the details of the originally quoted example, including the emphasis on successive approximation. I expect that everyone who wishes to understand will from existing comments and that further engagement would be both futile and constitute a reward of an interaction style which is undesirable.

comment by pjeby · 2012-06-23T01:40:35.075Z · score: 0 (0 votes) · LW · GW

Apparently, you also wanted an example where the person isn't also elevating my status in the process

Nope, that was a side comment. The main point is that wedrifid said this was a bad thing, and I was asking him. So, it's actually an answer from someone other than wedrifid that didn't meet my criteria. ;-)

comment by NancyLebovitz · 2012-06-24T03:54:53.796Z · score: 1 (1 votes) · LW · GW

It depends on why TheOtherDave doesn't like doing whatever. If it's something that he could get to like or at least tolerate by being more familiar with it, no biggie.

If it's just aggravating and he doesn't get used to it, but it doesn't come up often enough to make him miserable, then it's one of those things which is apt to happen in jobs.

If it's something that takes so many additional hours that he's running himself ragged, then reinforcing him for doing it would be bad for him in the long run.

comment by TimS · 2012-06-21T18:34:20.786Z · score: 3 (3 votes) · LW · GW

The question is not whether positive reinforcement is effective in changing your behavior. The question is whether kisses are positive reinforcement in particular contexts.

Suppose your spouse says, "Please pick up my prescription from the store" and you don't want to, but you do it anyway. When you get back, spouse says "Thanks for dealing with that." Do you really think continued experiences like that won't increase the frequency of the behavior "Run an errand even when I don't want to"?

comment by [deleted] · 2012-06-21T18:40:03.239Z · score: 1 (3 votes) · LW · GW

Do you really think continued experiences like that won't increase the frequency of the behavior "Run an errand even when I don't want to"?

I think it depends a lot on her intention. If she says 'thank you' for the purposes of positive reinforcement, I mean if she thinks about her 'thank you's' that way, then I think she's being manipulative.

If she says 'thank you' to say what those words mean, namely, that she's grateful, then even if this does have the effect of positive reinforcement there's nothing wrong about her behavior.

comment by TheOtherDave · 2012-06-21T18:57:05.245Z · score: 14 (16 votes) · LW · GW

I find the idea of endorsing manipulative behavior if and only if I remain unaware of the fact that it's manipulative behavior deeply troubling.

It strikes me as similar to saying that hurting people is OK as long as I don't know I'm hurting them. No, it isn't. If hurting people is not OK, then it follows that I ought not hurt people, and learning to recognize when I'm hurting people is part of that, and I ought to learn to recognize it. The behavior doesn't suddenly become "not OK" the moment I learn to recognize it... it never was OK, and now I know it and can improve.

Conversely, if hurting people is OK, then it's OK whether I know I'm doing it or not.

The same goes for manipulating people. Whether I know I'm doing it or not isn't the determiner of whether I'm doing good or ill.

To my mind, the determiner of whether I'm doing good or ill is whether, when I'm done doing it, we're all better off or worse off.

comment by TimS · 2012-06-21T19:12:06.818Z · score: 3 (5 votes) · LW · GW

I agree with your point, but I think that "manipulate" needs to be tabooed. If we define manipulate as "acts that tend to change the behavior of others" then I agree with your implicit point that it is impossible to interact with others without changing their behaviors, and there is nothing wrong with thinking about how I would like someone else to behave when considering how I interact with them.

That said, there are connotations of manipulate as the word is ordinarily used that are not captured by the way you (and I) are using the word.

comment by TheOtherDave · 2012-06-21T19:19:32.291Z · score: 2 (2 votes) · LW · GW

Sure. I'm perfectly happy to drop the word altogether and instead talk about changing the behavior of others.

comment by [deleted] · 2012-06-22T15:41:16.547Z · score: 2 (2 votes) · LW · GW

I find the idea of endorsing manipulative behavior if and only if I remain unaware of the fact that it's manipulative behavior deeply troubling.

Awareness of side effects isn't equivalent to intentionality. You can thank someone to express genuine feelings of gratitude. If you wouldn't do that in a counterfactual world in which the gratitude was absent, then I wouldn't call that behavior intentionally manipulative regardless of whether you know about positive reinforcement.

comment by TheOtherDave · 2012-06-22T16:05:08.364Z · score: 6 (6 votes) · LW · GW

If you wouldn't do that in a counterfactual world in which the gratitude was absent, then I wouldn't call that behavior intentionally manipulative regardless of whether you know about positive reinforcement.

Suppose I am not in the habit of expressing gratitude when people do nice things for me. Never mind why... maybe I was raised wrong. For whatever reason, I'm not in that habit. I feel gratitude, certainly, I just don't express it.

Then one Monday, I learn that expressing gratitude to people for doing nice things for me will increase the odds that they will do it again. Suppose I want people to do nice things for me, and I therefore conclude that I ought to express gratitude when people do nice things for me, in order to get them to do it more, and I therefore start expressing gratitude when people do nice things for me, whether I feel gratitude or not.

Then on Wednesday, I learn that this only works when I genuinely do feel gratitude... when I express gratitude I don't actually feel, I get bad results. (Again, it doesn't matter why. Maybe I'm a lousy liar.) So I stop expressing gratitude when people do nice things for me when I don't feel gratitude, but I continue doing so when I do, since that still gets me stuff I want.

If I've understood you correctly, you would call me intentionally manipulative on Tuesday, but not on Thursday. I'm happy to restrict the term "intentionally manipulative" to Tuesday behavior and not Thursday behavior, if that makes communication easier, though I don't use those words that way myself.

Regardless of what words we use, presumably we agree that on both Tuesday and Thursday, I am doing something with the intention of causing changes in other people's behavior, and am doing so without their awareness or consent. Yes?

Do you endorse this on Tuesday?
Do you endorse this on Thursday?

For my own part, I find the idea of endorsing that behavior on Thursday but not on Tuesday deeply troubling, for many of the reasons I listed before.

comment by MixedNuts · 2012-06-26T06:52:57.592Z · score: 0 (0 votes) · LW · GW

Obvious remark is obvious: you might disapprove of the behavior on Tuesday because it involves lying.

comment by [deleted] · 2012-06-21T18:58:51.658Z · score: 2 (4 votes) · LW · GW

I find the idea of endorsing manipulative behavior if and only if I remain unaware of the fact that it's manipulative behavior deeply troubling.

If you don't know you're manipulating someone, you're not manipulating someone. Manipulation is an intentional behavior, like lying, or congratulating, or taking a vow. Knowing what you're doing is part of doing it.

comment by TheOtherDave · 2012-06-21T19:06:19.473Z · score: 9 (9 votes) · LW · GW

Yeah, I pretty much disagree with this statement completely.

comment by [deleted] · 2012-06-21T19:32:23.324Z · score: 1 (1 votes) · LW · GW

That's... incredible to me. Do you disagree that there is such a category (i.e. actions you have to know you're doing in order to be doing them at all), or that manipulation falls under it?

comment by TheOtherDave · 2012-06-21T19:45:05.414Z · score: 2 (2 votes) · LW · GW

I disagree that manipulation falls under it.

comment by [deleted] · 2012-06-21T20:05:54.088Z · score: 0 (0 votes) · LW · GW

Do you agree that manipulation can be intentional (let's call this Imanipulation), and that what Luke is advising is the intentional kind?

comment by TheOtherDave · 2012-06-21T20:27:40.454Z · score: 2 (2 votes) · LW · GW

I agree that manipulation can be intentional, certainly.

I agree that the examples Luke is talking about are intentional ones, but I suspect that's rather incidental. To talk about it as "the intentional kind of manipulation" strikes me as misleading in the same sense that, while I agree that his example of Anna and Alicorn manipulating Eliezer was manipulation of a man by women, I would consider it misleading to refer to it as "the heterosexual kind of manipulation."

For example, if I practiced positive-reinforcement conditioning so assiduously that I started doing it without having to form explicit intention to do it (in the same way that I don't always form the explicit intention to catch a ball flying at my face before catching it), I expect that Luke would endorse doing it just the same; the fact that it's intentional in one case and not the other just wouldn't matter.

Actually, now that I think about it, what's your take on that? That is, if I practice modifying others' behavior until I reach the point where I can do it instinctively, without an overt intention-forming stage, does it suddenly become ethically acceptable for me to do so? (Since, after all, it's no longer manipulation, on your account.)

comment by [deleted] · 2012-06-21T20:41:55.081Z · score: 1 (1 votes) · LW · GW

I would consider it misleading to refer to it as "the heterosexual kind of manipulation."

I didn't follow this at all.

For example, if I practiced positive-reinforcement conditioning...

Say I practiced my swing so assiduously that I could hit a 90mph fastball without thinking (indeed, there is no time to think). Would you say that every time I knock such a pitch into the outfield, I've done so unintentionally? The fact that I don't go through an explicit thought process (if such a category is intelligible) every time doesn't make a difference. A practiced liar and a pathological liar could both lie without thinking, but the former is doing something typically unethical (unless they're like a spy or something) while the latter is just, well, pathological.

The way I'd put your point here is that one can practice a behavior to the point where it becomes a basic action, something which requires no deliberation as to how it is done, like walking or taking a drink or saying a sentence in your native language. I don't think the basic vs. non-basic action distinction (if you think this is a fair way to put it) tracks the intentional vs. unintentional action distinction.

And the unintentional manipulations I'd exclude from this discussion are cases where you, say, ask how someone's kids are because you care, and this happens to make them feel good (largely because they think you care), as opposed to cases where you ask about their kids in order to make them feel good. Those unintentional cases fall outside the conventional use of 'manipulation', but I won't stand on semantics.

comment by TheOtherDave · 2012-06-21T20:50:20.518Z · score: 0 (0 votes) · LW · GW

Would you say that every time I knock such a pitch into the outfield, I've done so unintentionally?

Nope. If you intentionally put yourself into a situation where you're going to have 90mph fastballs thrown near you, with the intention of hitting them with a baseball bat, I would not say that your subsequent hitting of a 90mph fastball with a bat was unintentional.

I'm going to drop this thread here, because I feel like you're sidestepping my questions rather than addressing them, and it's beginning to get on my nerves. (You are, of course, under no obligation to answer my questions. I'm also perfectly prepared to believe that you aren't intentionally sidestepping them.)

comment by [deleted] · 2012-06-21T21:13:25.130Z · score: 0 (0 votes) · LW · GW

Well, I am sidestepping, because I think the point about practice is tangential to our discussion. Habitual insincerity is not therefore unintentional in the relevant sense.

comment by TimS · 2012-06-21T19:40:31.971Z · score: 1 (1 votes) · LW · GW

This exchange may be helpful to understand TheOtherDave's point.

comment by [deleted] · 2012-06-21T19:53:12.969Z · score: 1 (1 votes) · LW · GW

Thanks, that is helpful.

comment by adamtpack · 2012-06-23T02:00:28.551Z · score: 0 (0 votes) · LW · GW

.... And what about helping other people without knowing you helped them? /sly look/

comment by TheOtherDave · 2012-06-23T02:22:08.074Z · score: 2 (2 votes) · LW · GW

Similarly, if helping people is OK, it's OK whether I know I'm doing it or not, and if it's not OK, it's not OK whether I know I'm doing it or not.

comment by Viliam_Bur · 2012-06-22T11:00:08.249Z · score: 0 (0 votes) · LW · GW

The same goes for manipulating people. Whether I know I'm doing it or not isn't the determiner of whether I'm doing good or ill.

Yes. But maybe there is a correlation that people who know what they are doing, are doing it more.

If that's true, then it would make sense to criticize intentional manipulation more.

comment by TheOtherDave · 2012-06-22T12:20:55.413Z · score: 1 (1 votes) · LW · GW

Well, only if doing it is worth criticizing in the first place.

comment by wedrifid · 2012-06-21T18:49:27.685Z · score: 0 (0 votes) · LW · GW

The question is not whether positive reinforcement is effective in changing your behavior. The question is whether kisses are positive reinforcement in particular contexts.

Neither of those seem to be the question - at least neither of those are the question I'm asking when I evaluate whether a given trend of behaviors constitutes a Defection::Manipulation.

Suppose your spouse says, "Please pick up my prescription from the store" and you don't want to, but you do it anyway. When you get back, spouse says "Thanks for dealing with that."

That is kind of me, and it would, all else being equal, be somewhat rude if she didn't thank me for doing a favour like that. (This assumes a weak instantiation of 'want' such that I reflectively endorse doing the errand but experience emotional reluctance. If I reflectively endorse not doing the errand but still do it, then that is not kind but weak.)

Do you really think continued experiences like that won't increase the frequency of the behavior "Run an errand even when I don't want to"?

Being influenced isn't something to be universally avoided. Having negotiated boundaries subverted by the strategic use of kisses as doggy treats is. That way leads to madness - often for both parties.

comment by michaelsullivan · 2012-06-27T13:08:42.323Z · score: 1 (1 votes) · LW · GW

For my part, I didn't experience the positive reinforcement description in the article as being about subverting negotiated boundaries, but about changing what seem likely to be unthinking habitual behaviors that the person is barely aware of.

I don't know of anyone that I wish to be associated with who specifically desires to leave dirty clothes on the floor instead of in the hamper, it's just something that is easy to do without thinking unless and until you are in the habit of doing something differently.

If the husband in question had actually negotiated a boundary about being able to leave his clothes on the floor, or even expressed reflective hesitancy about using the hamper as a theoretically desired or acceptable action, then I would agree that the author's behavior was highly unethical, and as the husband, if I became aware of it, I would have a problem.

A more typical scenario is one in which the husband would reflectively endorse putting dirty clothes in the hamper on principle, but has a previously developed habit of leaving clothes on the floor and does not judge it important enough to do the hard mental work of changing the habit. Positive reinforcement in this scenario basically represents the wife attempting to do a big portion of the work required to change the habit in the hopes it will get him over this threshold.

In this case, I am having trouble imagining a situation in which one would have reflective desire not to use an existing hamper for dirty clothes.

comment by wedrifid · 2012-06-27T13:41:28.689Z · score: 1 (1 votes) · LW · GW

In this case, I am having trouble imagining a situation in which one would have reflective desire not to use an existing hamper for dirty clothes.

Everyone here who has commented on the subject of dirty clothes, myself included, has mentioned that they much prefer to put them in a designated repository. However, the precise nature of the example is not important and precisely where the boundaries of responsibility have been set in someone else's relationship are not my business to determine.

comment by michaelsullivan · 2012-06-27T17:11:12.289Z · score: 0 (0 votes) · LW · GW

Of course it is not our business to determine those boundaries in someone else's relationship.

Yet my reaction to the behavior described is very largely determined by what I imagine as the relationship context. The reason I did not have your reaction to this story is because I implicitly assumed that there was no boundary the husband had set about the fact of having clothes end up in the hamper by his hands.

I was somewhat troubled by the story, and the conversation in this subthread has clarified why -- the relationship context is crucial to determining the ethics of the behavior, and the ethical line or the necessary context was not discussed seriously in the article. While I find it unlikely that this particular example was crossing a line in their relationship, similar strategies could easily be used in an attempt to cross explicit or implicit boundaries in a way I would find abhorrent.

There is one point on which I am not clear whether we are drawing the line in the same place.

In the absence of any prior negotiation one way or another, do you consider the wife's behavior unethical? That seemed to be what you suggested with your initial comment, that it would only be acceptable in the context of a prior explicit agreement.

I think I fall on the side of thinking it is sometimes acceptable in some possible middle cases, but I'm not completely comfortable with my decision yet and would be interested in hearing arguments on either side.

I am clear (and think you will agree) that it is ok to use this strategy to reinforce a previous agreement, and NOT ok to use it to break/bend/adjust a previous agreement. It is the situation with no prior agreement that I am interested in.

To describe it semi-formally.

Party A wants to use positive reinforcement on party B in order to get them to do X

Middle cases I consider to be important (aside from there being some explicit agreement/boundary)

Party B has given some indication (but not an explicit statement/agreement) that doing X would be acceptable or desirable in principle --- PR OK

Party B has given some indication (not explicit statement/agreement) that doing X would be undesirable in principle --- PR NOT OK

Party B has given no indication one way or another -- ??

In this last case, are social expectations relevant? In the particular case of clothes in hamper, there are clear social expectations that most people normatively desire clothes in hamper. Perhaps our difference lies in whether we consider social expectations a relevant part of the context.

My tentative line is that where no indication has been given, reinforcing social expectations is acceptable, and violating social expectations is at least dubious and probably not OK without discussion.

If social expectations matter, then questions about which social circle is relevant come into play. If party A and party B would agree about which social expectation is relevant, then that is the correct one.

The interesting subcase would be where the relevant social expectations are different for party A and for Party B. My current position is that party A's best information about what party B would choose as a relevant set of social expectations should determine the ethics.

comment by NancyLebovitz · 2012-06-23T02:44:02.252Z · score: 1 (3 votes) · LW · GW

I seem to have more sympathy for your point of view than most here, but I'm not sure I have the thing articulated.

I think a piece of it is that a kiss given in order to get a spouse to do a routine chore seems very different from a kiss given out of affection or lust.

Intuitively, a kiss given out of enthusiasm for help received seems like a different sort of thing than a kiss given as part of a program to get behavioral change.

comment by NancyLebovitz · 2012-06-23T13:21:18.334Z · score: 2 (4 votes) · LW · GW

From a different context

And I think that another way to put it is that whereas someone compassionate might think “how can I get this person from A to B safely?”, an abuser tends to think “how can I get this person from A to B?”

Would it be different and less risky if the reward were M&Ms rather than kisses? If both partners were using reinforcement schemes on each other? The latter seems to have some comic potential, but in a way that isn't quite coming into focus.

comment by wedrifid · 2012-06-23T15:36:48.322Z · score: 0 (4 votes) · LW · GW

Would it be different and less risky if the reward were M&Ms rather than kisses?

Do diabetes, arteriosclerosis and dental costs count as 'risks'?

comment by [deleted] · 2012-06-24T18:33:37.521Z · score: 1 (1 votes) · LW · GW

EY must be saying lots of nice things if that's a non-negligible risk.

comment by NancyLebovitz · 2012-06-23T16:19:11.973Z · score: 1 (3 votes) · LW · GW

I assume we're talking about something like a dozen M&Ms/day, which wouldn't be a large risk for most people (I agree they'd be a bad idea for diabetics). Unless the person otherwise would eat no sweets at all, I can't see the M&Ms making a difference.

comment by TheOtherDave · 2012-06-23T05:48:24.940Z · score: 2 (2 votes) · LW · GW

Intuitively, a kiss given out of enthusiasm for help received seems like a different sort of thing than a kiss given as part of a program to get behavioral change.

I agree. That said, this is similar to saying that me going to work because they pay me is a different sort of thing than me going to work because I enjoy my job. In practice, the lines between expressions of enthusiasm and attempts to manage behavior are rarely that clearcut.

comment by Swimmer963 · 2012-06-22T20:16:02.540Z · score: 1 (3 votes) · LW · GW

What would you see as the difference between a) the story described, and b) a wife who kisses her husband because it makes her happy when he does helpful, nice things, of which putting laundry in the hamper is one, and her automatic response to this surge of happiness is "thank you, you're an amazing man!" [kiss]? The latter includes most of the same actions on the part of the wife, and probably occurs in a lot of healthy relationships.

My aversion to hostile takeover of internal motivations is much stronger than my desire for the affections of any particular individual.

Are there some internal motivations that you are less protective of than others? For example, if someone tried to condition me to be less averse to harming people, I would have a pretty big reaction, because that particular internal motivation is sacrosanct to me. But preferences for levels of tidiness...meh. I barely consider that an internal motivation, and definitely not a facet of who I am...it's just a habit, and I don't really care about changing it in either direction.

Is the difference with you that you consider all of your motivations to be a sacrosanct part of who you are? Or just that you place a higher value on your autonomy, and being the one 100% entirely responsible for all of your decisions?

comment by TheOtherDave · 2012-06-22T21:19:22.299Z · score: 7 (7 votes) · LW · GW

It may be worth sharing, anecdotally, that years ago my husband expressed annoyance with me over the fact that I only ever rubbed his back while he was doing dishes, and it made him feel much like how wedrifid describes.

This utterly bewildered me, so I agreed to pay attention to the behavior and see what was going on. Pretty quickly it became clear to me that this was absolutely true, for reasons I wasn't entirely clear on myself, although my working theory was it was the only time that I'd regularly walk past him while he was hunched over in that particular posture, which apparently served as a "give me a backrub" signal for me, for whatever reason.

My response to this was to start giving him random backrubs at other times, which solved the problem.

My point being that (a) being annoyed by this sort of behavior is not at all unique to wedrifid, and (b) whether the behavior pattern is intentional doesn't necessarily matter very much. (I don't mean to suggest that it doesn't matter to wedrifid; actually, they have made it somewhat clear that it's part of what they're objecting to.)

comment by Swimmer963 · 2012-06-22T21:47:16.126Z · score: 3 (3 votes) · LW · GW

The main lesson I'm taking from your anecdote is "people are complicated, everyone is complicated in a different way, and for almost any action or behaviour X, there will be a person somewhere who finds it awful." It's hard to guess at the relative numbers without doing a poll, but I'm guessing there's a range of people who wouldn't care if their significant other used physical affection as a reward (or who would even like it, because "yay, more total physical affection!"), and there's a range of people who would find it mildly to extremely unpleasant.

comment by TheOtherDave · 2012-06-22T21:51:28.663Z · score: 2 (2 votes) · LW · GW

I'm guessing there's a range of people who wouldn't care if their significant other used physical affection as a reward (or who would even like it, because "yay, more total physical affection!"), and there's a range of people who would find it mildly to extremely unpleasant.

Yup, that's consistent with my experience.

comment by wedrifid · 2012-06-23T02:45:41.196Z · score: 1 (1 votes) · LW · GW

Pretty quickly it became clear to me that this was absolutely true, for reasons I wasn't entirely clear on myself,

Well, the whole thing where he is standing up against the sink with his back to you but his hands are busy and he can't turn around (to engage in other forms of affection) seems like the obvious guess.

comment by TheOtherDave · 2012-06-21T01:36:18.965Z · score: 7 (7 votes) · LW · GW

"Don't Shoot the Dog" remains my favorite book for these sorts of anecdotes, as well as some of the theory and a lot of the practice. I recommend it.

comment by Arkanj3l · 2012-07-05T17:47:22.463Z · score: 4 (4 votes) · LW · GW

So, reinforcement with M&Ms doesn't translate into an addiction to extrinsic rewards and the reduction of intrinsic motivation?

I'm missing something here, I know.

comment by Insert_Idionym_Here · 2012-08-12T17:45:23.254Z · score: 0 (0 votes) · LW · GW

One could attempt to fight that by reducing the number or frequency of M&Ms eaten over a long period of time, essentially weaning one's self off of extrinsic rewards.

comment by Arkanj3l · 2012-08-15T23:05:30.652Z · score: 0 (0 votes) · LW · GW

I think it's still hard to privilege if that kind of effect exists in the first place.

comment by EphemeralNight · 2012-06-21T21:43:29.171Z · score: 3 (3 votes) · LW · GW

The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."

I am probably unusual in this regard, but I think I would find both approaches equally aggravating. If someone points out that I've made a mistake, anything other than a concise detailing of exactly how what I did differs from what I was supposed to do is just going to irritate me. Also, my brain tends to interpret being ignored as a signal that I'm doing it correctly.

comment by Swimmer963 · 2012-06-21T21:56:04.393Z · score: 2 (2 votes) · LW · GW

If someone points out that I've made a mistake, anything other than a concise detailing of exactly how what I did differs from what I was supposed to do, is just going to irritate me.

Is this because of the "damn it, I know I made a mistake, you telling me I did doesn't help!" effect? I get that too... A good thought experiment is that if I was making a type of mistake that I couldn't automatically tell I was making on my own, I would prefer it to be pointed out, even if not in a concise detailed fashion–the idea of not knowing that I'm making a mistake is kind of scary. What would your reaction be in that situation?

comment by EphemeralNight · 2012-06-21T22:23:58.686Z · score: 2 (2 votes) · LW · GW

Is this because of the "damn it, I know I made a mistake, you telling me I did doesn't help!" effect?

No, I react the same way whether I was previously aware of my mistake or not. I only experience that effect when I'm told to do something I am already doing.

A good thought experiment is that if I was making a type of mistake that I couldn't automatically tell I was making on my own, I would prefer it to be pointed out, even if not in a concise detailed fashion–the idea of not knowing that I'm making a mistake is kind of scary. What would your reaction be in that situation?

Pragmatically, we as humans, just barely over the threshold into sapient intelligence, make mistakes we're not aware of constantly. If we didn't, we wouldn't need a superintelligence to fix the world; we'd have already done it ourselves. So finding the concept scary seems kind of pointless. (Sort of like being hydrophobic about the water in one's own body.) However, I would, of course, rather be aware of my mistakes than not.

But none of this is really on the topic, which was that the listed reinforcements don't seem even remotely applicable to humans in a universal way.

comment by Swimmer963 · 2012-06-22T02:16:26.732Z · score: 2 (2 votes) · LW · GW

So finding the concept scary seems kind of pointless. However, I would, of course, rather be aware of my mistakes than not.

My actions have impacts on others. In general, I prefer to help other people or at least not harm them–however, I may harm someone by mistake, and I really don't want this to happen. If I make a mistake once and I realize it–fine, hopefully no harm done, I won't do it again. If I make a mistake and I don't know about it, well, maybe no harm done that time in particular, but I'm likely to keep making this mistake over and over, and possibly the first time I'll find out is when there is harm done. I think that justifies finding it scary.

comment by pnrjulius · 2012-07-05T01:28:59.200Z · score: 0 (0 votes) · LW · GW

I've always found that recommendations of what to do are much more useful than any kind of praise, reward, punishment, or criticism.

On the other hand, if everyone told you how to do everything, you might never learn the very important skill of teaching yourself to do things.

comment by hrishimittal · 2012-06-22T07:58:22.040Z · score: 2 (2 votes) · LW · GW

What expert timing, Luke! Just two days ago, I came across the fascinating practice of clicker training for horses - http://www.theclickercenter.com, while reading Kathy Sierra's old blog - http://headrush.typepad.com/creating_passionate_users/2006/03/clicker_trained.html.

My only problem is that I need to train my own behaviour rather than someone else's. I'm going to try to use these techniques on myself, although I'm not sure if that's supposed to work.

comment by shminux · 2012-06-21T22:06:59.767Z · score: 2 (6 votes) · LW · GW

Lessons learned:

  • continue to mentally /ignore people and posts I don't care for on IRC and online forums

  • never comment on bad posts or explain my downvote on LW

  • be more generous with upvoting good contributions and give a short praise when warranted.

comment by Vaniver · 2012-06-22T05:32:19.933Z · score: 8 (8 votes) · LW · GW

never comment on bad posts or explain my downvote on LW

This is not quite justified; this is a post on how to use positive reinforcement, not how to use punishment.

comment by shminux · 2012-06-22T06:02:39.815Z · score: 0 (2 votes) · LW · GW

When a dolphin does something wrong, the trainer doesn't respond in any way.

(from the link)

comment by Vaniver · 2012-06-22T06:30:12.687Z · score: 4 (4 votes) · LW · GW

Dolphins are more difficult to punish usefully than humans; for one, they're less likely to understand English.

Moving to object-level advice: I agree that not responding to bad comments or posts is generally a good idea. I think that responding to downvote explanation requests is a good idea about half of the time. Unsolicited downvote explanation is typically done to sway bystander opinion as well as inform the poster, and so deserves its own treatment.

comment by tgb · 2012-06-23T13:44:32.149Z · score: 5 (5 votes) · LW · GW

The difference between explaining bad posts and punishing misbehaving dolphins is that the explaining is done for the purpose of the other readers, not just as a punishment.

comment by dbaupp · 2012-06-22T06:26:41.211Z · score: 3 (3 votes) · LW · GW

never comment on bad posts or explain my downvote on LW

I think this should be "never downvote".

comment by Nornagest · 2012-06-22T22:58:03.879Z · score: 2 (2 votes) · LW · GW

I think this should be "never downvote".

Seems to me that a downvote would associate negative valence with both the act of posting on LW and with whatever their specific mistake is, with the latter being stronger. So no vote and a comment with a mixture of praise and criticism is probably the stronger play if you're looking to improve someone's writing or fix some technical mistake while keeping them as a contributor, but a downvote is still effective if all you care about is seeing fewer posts of that kind.

comment by wedrifid · 2012-06-22T07:54:39.880Z · score: 0 (0 votes) · LW · GW

never comment on bad posts or explain my downvote on LW

I think this should be "never downvote".

That would be true if the point was actually about implementing the reinforcement ideal rather than using it to validate a premeditated ideal.

comment by hvass · 2012-06-21T04:22:17.480Z · score: 2 (4 votes) · LW · GW

Thanks, Luke! I've always enjoyed this sequence. (It's funny that I was tempted to include a note that I would've been happier if you contributed to the sequence more often, but let's stick with the praise for now. :-)

comment by [deleted] · 2012-06-21T01:13:19.327Z · score: 2 (6 votes) · LW · GW

Excellent article. I wonder if reinforcement could be used to speed up rationality training? I would love to see a study done on that.

comment by lukeprog · 2012-06-21T01:42:13.245Z · score: 4 (4 votes) · LW · GW

I wonder if reinforcement could be used to speed up rationality training?

Almost certainly. CFAR is doing this to some extent at their minicamps, but much more can be done. There is tons of room for rationality training via videogames, for example. Raytheon is developing some debiasing video games for military officers.

comment by [deleted] · 2012-06-21T01:46:24.780Z · score: 2 (2 votes) · LW · GW

Now that is exciting as hell. I've always felt that rationality was a potential competitive advantage for organizations that wasn't being utilized. Way to go.

comment by zslastman · 2013-07-19T23:57:27.714Z · score: 1 (3 votes) · LW · GW

Attacking your opponent's intelligence is just that, regardless of the terminology you dress it up in. That your opinion is the rational one, and that those who disagree with it are less rational than you, is obviously the position of anyone who makes an argument. Steering the conversation in that direction adds nothing.

comment by private_messaging · 2013-07-20T06:22:20.858Z · score: 0 (0 votes) · LW · GW

I'm speaking of a specific way of being irrational which is highly popular here: explaining normal behaviour in terms of "cultural memes" and "pattern matching" (misusing the meanings of both) while neglecting the facts of the case. In this specific instance, the world being what it is, if you hear that a man is training his wife, that probably implies physical abuse, whereas if you hear that a wife is training her husband, that implies abuse much less strongly. Thus one normally infers abuse from the former but not the latter (which may be unfair, but definitely isn't wrong in any sense other than fairness).

comment by Caspian · 2013-07-18T13:04:50.146Z · score: 1 (1 votes) · LW · GW

I just read Don't Shoot The Dog, and one of the interesting bits was that it seemed like getting trained the way it described was fun for the animals, like a good game. Also as the skill was learnt the task difficulty level was raised so it wasn't too easy. And the rewards seemed somewhat symbolic - a clicker, and being fed with food that wasn't officially restricted outside the training sessions.

Thinking about applying it to myself, having the reward not be too important outside the game/practise means I'm not likely to want to bypass the game to get the reward directly. Having the system be fun means it's improving my quality of life in that way in addition to any behaviour change.

I haven't done much about ramping up the challenge. How does one make doing the dishes more challenging?

But I did make sure to make the rewards quicker/more frequent by rewarding subtasks.

comment by tsakinis · 2012-09-28T00:55:14.384Z · score: 1 (1 votes) · LW · GW

Wow, thanks for this great article; it was the final piece of information that tipped me over towards getting my shit together. Within 10 minutes of reading it and browsing the comments, I was on my bicycle going to buy small treats I like, which I now give myself for every small goal achieved (~2-10 min of work).

I now wonder though if maybe I should give myself another reinforcer when starting to work with a new goal, otherwise maybe I will only strive for finishing as fast as possible, but starting with a new small goal won't be that much reinforced? Maybe this is my mind trying to get more candy though, so I would be thankful for outside perspective.

comment by lukeprog · 2013-01-27T05:22:26.939Z · score: 1 (1 votes) · LW · GW

Have you been trying this? Any luck?

comment by tsakinis · 2013-03-06T09:57:22.211Z · score: 1 (1 votes) · LW · GW

It worked with similar effectiveness as other techniques I implement - that means only until I have done enough to feel good about myself (2-5 productive days)...

comment by potato · 2012-06-21T17:55:31.303Z · score: 1 (1 votes) · LW · GW

Does this still work if I reinforce myself? Every time I read 5 LessWrong articles in a day, I give myself a reward. Or every time I have a cigarette, I kick a brick wall with no shoes on. If I was consistent with this for a long time, would it work?

comment by wedrifid · 2012-06-21T18:25:28.345Z · score: 7 (7 votes) · LW · GW

Or every time I have a cigarette, I kick a brick wall with no shoes on. If I was consistent with this for a long time, would it work?

Totally. The wall will fall over in 20 years, tops!

The actual answer is maybe - it works for some but not others. A common point of failure is that people just train themselves to cheat and take the reward anyway. I'm not sure what the response rate is when full compliance with the reward schedule is assumed.

comment by TheOtherDave · 2012-06-21T19:04:33.014Z · score: 1 (1 votes) · LW · GW

It can. Basically the failure modes are the same as when reinforcing others. In particular, it's common to fail to maintain consistent thresholds of self-reward.

comment by cicatriz · 2012-06-21T17:19:54.756Z · score: 1 (1 votes) · LW · GW

This seems to contradict the very powerful effect of learning from failure and corrective feedback. See http://www.wired.com/wiredscience/2011/10/why-do-some-people-learn-faster-2/ for an accessible overview.

I'd conjecture this works better when someone can already perform the desired behavior and wants to form a habit, whereas learning from failure comes in when new information needs to be stored and reorganized.

comment by Will_Sawin · 2012-06-21T17:51:44.224Z · score: 2 (2 votes) · LW · GW

That article especially seems to demonstrate the critical importance of choosing what you reinforce, and how a teacher's model of what they are reinforcing may differ from the students'.

comment by Swimmer963 · 2012-06-21T20:16:36.402Z · score: 1 (1 votes) · LW · GW

I was about to reply "hmm, I wonder how you could reward someone for making an effort rather than just for succeeding, or reward them for noticing when they make a mistake." Then I read the article, and realized that that's basically what it talks about.

Yeah, failures are important. But the natural tendency, whether teaching others or trying to change our own behaviour, is to correct and criticize failures–which is basically negative reinforcement and trains people to stop trying because failing is so painful. The interesting new point in the article is that positively reinforcing for success, if done in a certain way (the "wow you're smart!" group of kids) can actually have the same effect as negatively reinforcing for failure.

comment by Viliam_Bur · 2012-06-22T10:48:47.866Z · score: 2 (2 votes) · LW · GW

More generally, a positive motivation often contains an implicit negative motivation -- a threat of not receiving the same reward next time. ("What pushes you forward, holds you back.")

Telling someone they are smart implies that the teacher has the ability to judge smart and stupid students based on their work. So if tomorrow the work is not good enough, the same student could be judged as stupid. This could also happen if the student tries something new, where they obviously cannot have as good results as when they stick with what they already know well.

Telling someone they work hard avoids this danger somehow. Maybe because it contains actionable advice on what to do next time to achieve the reward -- so it feels more under control, less threatening.

Maybe the secret is in finding a motivation that feels under control, but not too much to allow cheating. Maybe it's a moving target; I suspect that given enough time, some children in the experiment would find ways to appear working hard without doing the hard work.

comment by Swimmer963 · 2012-06-22T12:35:48.132Z · score: 2 (2 votes) · LW · GW

Telling someone they work hard avoids this danger somehow. Maybe because it contains actionable advice on what to do next time to achieve the reward -- so it feels more under control, less threatening.

Yeah, working hard is something that isn't associated with a fixed mindset in the same way that intelligence is. A lot of people see intelligence as something that you either have or you don't.

comment by CharlieSheen · 2012-06-21T14:36:32.799Z · score: 1 (13 votes) · LW · GW

We have enough happy death spirals here.

comment by wedrifid · 2012-06-21T15:22:19.404Z · score: 3 (7 votes) · LW · GW

We have enough happy death spirals here.

Who is happy about what?

comment by CharlieSheen · 2012-06-21T15:28:22.786Z · score: -3 (11 votes) · LW · GW

Let sleeping mind killers lie.

comment by wedrifid · 2012-06-21T15:35:46.366Z · score: 8 (8 votes) · LW · GW

Your unsubstantiated assertion is rejected. There is nothing that fits that label here. There are things that people like to say that everyone else is in a happy death spiral about but they are too powerfully skeptical to be one of the gullible crowd. This is useless cheap signalling that is a net detriment.

-3 M&Ms for all instances of vague self-reinforcing negativity.

comment by CharlieSheen · 2012-06-21T15:40:38.442Z · score: 3 (9 votes) · LW · GW

Very well, I'll be explicit; I simply wanted to avoid a flame war. Most obvious example:

  • Relationship advice.

Now give me my M&Ms back.

comment by wedrifid · 2012-06-21T15:54:04.181Z · score: 3 (3 votes) · LW · GW

Very well, I'll be explicit; I simply wanted to avoid a flame war. Most obvious example:

Relationship advice.

That isn't a Happy Death Spiral. It is a disgraceful mindkiller, sure. But it isn't remotely happy, isn't encouraged by universal reward and absence of criticism. It certainly isn't treated with or caused by the kind of positive feedback Luke's post advocates.

Now give me my M&Ms back.

You can have one back - but being fundamentally confused about what it is you are trying to criticize is only a weak mitigating factor.

comment by CharlieSheen · 2012-06-21T15:59:08.461Z · score: 0 (10 votes) · LW · GW

That isn't a Happy Death Spiral. It is a disgraceful mindkiller, sure. But it isn't remotely happy, isn't encouraged by universal reward and absence of criticism. It certainly isn't treated with or caused by the kind of positive feedback Luke's post advocates.

Do you remember the online dating profile optimization thread? LessWrong went, in Vladimir_M's words, "healing crystal equivalent". That thread was a happy death spiral.

Also, if you recall, the critics in the relationship threads are getting tired and frustrated and just aren't showing up any more; someone even wrote out a full comment to that effect! Evaporative cooling, dude. Sure, we haven't had a relationship thread since Luke's part I, but it's only a matter of time before someone brings it up and the critics won't be there any more.

I only bother because I'm a Charlie Sheen.

comment by JoshuaZ · 2012-07-11T04:26:23.247Z · score: 0 (0 votes) · LW · GW

Do you remember the online dating profile optimization thread? LessWrong went, in Vladimir_M's words, "healing crystal equivalent". That thread was a happy death spiral.

Can you expand on why you thought it amounted to a happy death spiral? I didn't get that impression. (I am very likely a biased source in this regard because I started using OkCupid essentially because one of the LW threads on it suggested that it was worth trying, and it has worked quite well for me, so I may be missing something obvious.)

comment by [deleted] · 2012-06-21T15:53:32.935Z · score: 0 (2 votes) · LW · GW

Does he have to vomit the M&M's back up?

I really hope that's not the procedure.

comment by Vaniver · 2012-06-21T16:02:41.531Z · score: 5 (5 votes) · LW · GW

Incidentally, chewing M&Ms and then spitting them out is a moderately effective way to wean yourself off of chocolate cravings.

comment by [deleted] · 2012-06-21T18:23:43.362Z · score: 3 (5 votes) · LW · GW

Any suggestions for sugar specifically? I like chocolate and can get it in low-sweet, high-theobromine form, but shaking off sugar cravings would do me a world of good.

comment by Vaniver · 2012-06-21T18:29:48.332Z · score: 4 (4 votes) · LW · GW

From my incomplete understanding of taste psychology, sugar is one of the instinctual taste preferences, whereas things like chocolate are learned taste preferences that are possible to unlearn. I've found that sugar/salt/fat cravings have been useful signals about the quality of my diet, and so would recommend taking a hard look at your diet before trying to alter those signals. (They could be mistuned, but I don't have any advice on how to correctly tune them.)

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-06-21T15:02:03.322Z · score: 0 (28 votes) · LW · GW

Whatever it is that rationalists are supposed to use instead of death spirals, we don't have enough of it until everything is funded. GO TEAM HAPPINESS!

comment by gwern · 2012-06-21T16:11:36.622Z · score: 16 (16 votes) · LW · GW

'My Little SIAI: Positive Reinforcement is Magic'?

comment by CharlieSheen · 2012-06-21T15:14:02.237Z · score: 4 (12 votes) · LW · GW

No.

comment by Strange7 · 2012-06-21T16:06:16.536Z · score: 1 (3 votes) · LW · GW

How long has it been since you had a post that stabilized at net negative votes?

comment by wedrifid · 2012-06-21T16:11:22.203Z · score: 7 (7 votes) · LW · GW

29 March 2012

comment by [deleted] · 2012-06-21T15:51:24.234Z · score: 0 (2 votes) · LW · GW

I hope you realize that LWers would give you M&Ms for this particular post regardless of whether you were right or not.

comment by Gastogh · 2012-06-21T10:26:57.298Z · score: 1 (5 votes) · LW · GW

On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

Made me smile. Thanks for sharing.

comment by Viliam_Bur · 2012-06-21T10:59:22.713Z · score: 7 (9 votes) · LW · GW

Hopefully now that the experiment is over, they will return to the original schedule of giving M&Ms for new HPMoR chapters. Seriously, people are suffering here. :D

comment by Benquo · 2012-06-21T13:54:18.609Z · score: 12 (12 votes) · LW · GW

Too infrequent. They need to start by giving him an M&M every time he thinks about writing more HPMoR.

comment by gwern · 2012-06-21T16:10:48.414Z · score: 9 (9 votes) · LW · GW

But then when he starts actually writing, Eliezer will become diabetic!

comment by Benquo · 2012-06-21T17:59:41.797Z · score: 6 (6 votes) · LW · GW

Shush, don't give away the plan!

But seriously, one can always increase the reward threshold once the first behavior has been firmly established.

comment by wedrifid · 2012-06-21T16:31:54.728Z · score: 5 (5 votes) · LW · GW

But then when he starts actually writing, Eliezer will become diabetic!

If he gets into flow quickly he could be safe. That would mean he is writing more HPMoR but not thinking about writing HPMoR.

comment by philh · 2012-06-21T10:12:05.412Z · score: 1 (3 votes) · LW · GW

I think next time I go shopping, I'll buy a pack of M&Ms, and take one whenever I make a git commit.

comment by thomblake · 2012-06-21T14:17:20.185Z · score: 2 (2 votes) · LW · GW

Careful not to over-reinforce! Think of the commit logs!

comment by lukeprog · 2013-08-17T02:10:59.334Z · score: 0 (0 votes) · LW · GW

Lots of cool videos of operant conditioning and animals are here (also click on the other chapters).

comment by Curiouskid · 2012-07-12T20:16:25.278Z · score: 0 (0 votes) · LW · GW

related: http://gettingstronger.org/2012/01/hormesis-and-the-limbic-brain/

"Reprogramming the amygdala. This is the indirect way to re-program the hypothalamus, by altering the amygdaloid reward circuitry that feeds it. There are a number of approaches to achieving this, some of which I’ve outlined in previous articles, but all of them fall generally under the umbrella of classical or Pavlovian conditioning. There are a few basic strategies:

  • Extinction. An addictive response becomes weaker and eventually dies out when you stop responding to a triggering cue. This approach works, but can take a long time and requires patience and discipline.
  • Cue exposure or deconditioning. This involves deliberate, repeated and provocative exposure to the triggering cue, withholding the response. After some initial discomfort, this approach proceeds rapidly and can be quite effective. Success is improved the more realistic and varied the presentation of the cue.
  • Putting on cue. A new cue is developed and the behavior is only allowed in the presence of this cue. It could be a special sound, or a location. Then the special cue is withheld and the behavior disappears.
  • Counter conditioning. This involves the substitution of an alternative behavior to actively displace the old reward circuitry. It can be very effective."

comment by Arkanj3l · 2012-07-07T01:39:46.568Z · score: 0 (0 votes) · LW · GW

Any non-sugary food related reinforcer?

comment by Swimmer963 · 2012-07-07T03:23:47.284Z · score: 0 (0 votes) · LW · GW

Coffee/tea. Twice in the past week, during extremely busy shifts at work, one of the doctors has decided to buy Tim Hortons for all the nurses. I can't think of any other single event that has made me as happy (albeit for a short 5-minute time period.)

comment by gwern · 2012-07-07T15:58:11.104Z · score: 2 (2 votes) · LW · GW

Perhaps it's just the unexpected generosity.

I remember a few years ago I was reading about positive psychology's various findings, such as that spending money on experiences with friends or family made one happier than spending money on an object for oneself; so I tried out buying donuts or bagels for a few university clubs. Everyone seemed much happier, for what was a relatively trivial amount of money invested (<$10 in every case).

(I don't remember noticing that the donut people were happier than the bagel people.)

comment by JGWeissman · 2012-07-07T02:34:31.914Z · score: 0 (0 votes) · LW · GW

Cheese would work well as a reinforcement for me. Also the proverbial carrot. And probably lots of other things.

comment by Dorikka · 2012-07-07T02:28:38.519Z · score: 0 (0 votes) · LW · GW

Dark chocolate?

comment by tgb · 2012-06-23T13:56:28.823Z · score: 0 (0 votes) · LW · GW

This post may have the highest upvotes per comment I've ever seen. Does anyone with access to the database want to confirm that?

comment by dbaupp · 2012-06-23T15:06:45.605Z · score: 0 (0 votes) · LW · GW

Nope, this is higher. (In fact, many of the top posts have much higher upvotes/comment ratio.)

comment by tgb · 2012-06-24T13:30:25.999Z · score: 0 (0 votes) · LW · GW

Hm, I think I wasn't clear. I'm interested in comment upvotes per comment, not top-level post upvotes per comment. My assumption is that everyone here is so primed to upvote by the content of the post that most of the comments here are highly upvoted.

Or maybe you really did measure that? You are probably right that some long-lasting classic posts have gotten high numbers of upvotes on their comments over the years just because so many people have read them.

comment by Oscar_Cunningham · 2012-06-21T18:17:40.345Z · score: 0 (0 votes) · LW · GW

Note that there are many circumstances when it is right to criticise. For instance, group brainstorming exercises are more productive if the participants criticise each other's ideas.

comment by [deleted] · 2012-06-21T16:12:59.964Z · score: -1 (5 votes) · LW · GW

Maybe it's because behaviorist techniques like reinforcement feel like they don't respect human agency enough. But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.

But treating human beings, especially adults, like animals is characteristically unethical. Applying some system of reinforcement where someone has asked you to effectively treat their behavior is innocuous enough, as is of course treating yourself.

But generally manipulating the behavior of other people by means other than convincing them that they should behave in a certain way seems to me to be almost definitional of a dark art. If that's not controversial, then I think this article should be qualified appropriately: never do this to other people without their explicit consent.

comment by Vaniver · 2012-06-21T18:20:18.951Z · score: 10 (10 votes) · LW · GW

But treating human beings, especially adults, like animals is characteristically unethical.

It seems to me like the flow is in the reverse direction: many unethical manipulations involve treating adults like animals. But people who skillfully use positive reinforcement are both more pleasant to be around and more effective- which seems like something ethical systems should point you towards, not away from.

comment by adamtpack · 2012-06-23T02:13:10.329Z · score: 1 (1 votes) · LW · GW

.... And here begins the debate.

What do we do? What do we think about this piece of freaking powerful magic-science?

I vote we keep it a secret. Some secrets are too dangerous and powerful to be shared.

comment by beoShaffer · 2012-06-23T02:30:49.485Z · score: 4 (4 votes) · LW · GW

I think the cat is out of the bag on this one.

comment by [deleted] · 2012-06-21T18:28:39.274Z · score: 1 (1 votes) · LW · GW

That's a fair point: I may have been treating a conditional like a bi-conditional. I think my sense of the matter is this: if a friend told me that he spent a lot of our time together thinking through ways to positively reinforce some of my behaviors, even to my benefit, I would become very suspicious of him. I would feel that I'd been treated as a child or a dog. His behavior would seem to me to be manipulative and dishonest, and I think I would feel this way even if I agreed that the results of his actions were on the whole good and good for me.

Do you think this sort of reaction on my part would be misguided? Or am I on to something?

comment by Vaniver · 2012-06-21T18:51:20.871Z · score: 10 (10 votes) · LW · GW

I agree with you that your autonomy is threatened by the manipulations of others. But threats only sometimes turn into harm- distinguishing between manipulations you agree with and disagree with is a valuable skill.

Indeed, there's a general point that needs to be made about human interaction, and another about status, but first a recommendation: try to view as many of your actions as manipulations as possible. This will help separate out the things that, on reflection, you want to do and the things that, on reflection, you don't want to do. For example:

if a friend told me that he spent a lot of our time together thinking through ways to positively reinforce some of my behaviors, even to my benefit, I would become very suspicious of him. I would feel that I'd been treated as a child or a dog. His behavior would seem to me to be manipulative and dishonest,

Emphasis mine. The reaction- of calling his behavior manipulative and dishonest- feels like it punishes manipulation, which you might want to do to protect your autonomy. But it actually punishes honesty, because the trigger was your friend telling you! Now, if your friend wants to change you, they'll need to try to do it subtly. Your reaction has manipulated your friend without his explicit consent- and probably not in the direction you wanted it to.

So, the general point: human social interaction is an incredibly thorny field, in part because there are rarely ways to learn or teach it without externalities. Parents, for example, tell their children to share- not because sharing is an objective moral principle, but because it minimizes conflict. As well, some aspects of human social interaction are zero sum games- in which people who are skilled at interaction will lose if others get better at interaction, and thus discourage discussions that raise general social interaction skills.

The status interpretation: generally, manipulation increases the status of the manipulator and decreases the status of the manipulated. Resistance to manipulation could then be a status-preserving move, and interest in manipulation could be a status-increasing move. What articles like this try to do is lower the status effects of manipulation (in both directions)- Luke proudly recounts the time Eliezer manipulated him so that he could better manipulate Eliezer. If being molded like this is seen more positively, then resistance to being molded (by others in the community) will decrease, and the community will work better and be happier. As well, I suspect that people are much more comfortable with manipulations if they know how to do them themselves- if positive reinforcement is a tool used by creepy Others, it's much easier to dislike than if it's the way you got your roommate to finally stop annoying you.

comment by wedrifid · 2012-06-21T18:59:35.774Z · score: 7 (7 votes) · LW · GW

distinguishing between manipulations you agree with and disagree with is a valuable skill.

This, with extra emphasis!

comment by adamtpack · 2012-06-23T02:15:23.780Z · score: -1 (1 votes) · LW · GW

I'm confused, not only by the beginning of this comment, but by several others as well.

I thought being a LessWronger meant you no longer thought in terms of free will. That it's a naive theory of human behavior, somewhat like naive physics.

I thought so, anyway. I guess I was wrong? (This comment still upvoted for amazing analysis.)

comment by Vaniver · 2012-06-23T15:21:46.862Z · score: 3 (3 votes) · LW · GW

I thought being a LessWronger meant you no longer thought in terms of free will. That it's a naive theory of human behavior, somewhat like naive physics.

Autonomy and philosophical free will are different things. Philosophical free will is the question "well, if physical laws govern how my body acts, and my brain is a component of my body, then don't physical laws govern what choices I make?", to which the answer is mu. One does not need volition on the level of atoms to have volition on the level of people- and volition on the level of people is autonomy.

(You will note that LW is very interested in techniques to increase one's will, take more control over one's goals, and so on. Those would be senseless goals for a fatalist.)

comment by adamtpack · 2012-06-23T21:40:32.520Z · score: 2 (2 votes) · LW · GW

Thanks for clarifying that. I should note that I am very interested in techniques for self-improvement, too. I am currently learning how to read. (Apparently, I never knew :( ) And also get everything organized, GTD-style. (It seems a far less daunting prospect now than when I first heard of the idea, because I'm pseudo-minimalist.)

I am still surprised at the average LWer's reaction here. Probably because the nature of 'volition on the level of people' is not clear to me. Not something I expect you to answer; clarifying the distinction was helpful enough.

comment by [deleted] · 2012-06-21T19:36:11.769Z · score: 3 (3 votes) · LW · GW

I think it's misguided personally. You're already being manipulated this way by your environment whether or not you realize it.

comment by [deleted] · 2012-06-21T19:51:48.475Z · score: 1 (1 votes) · LW · GW

You're already being manipulated this way by your environment whether or not you realize it.

Well, I'm claiming that this kind of manipulation is often, even characteristically, unethical. Since my environment is not capable of being ethical or unethical (that would be a category mistake, I think) then that's not relevant to my claim.

comment by [deleted] · 2012-06-21T19:59:07.115Z · score: 1 (1 votes) · LW · GW

I was referring though to the case of your friend using reinforcement to alter your behavior in a way that would benefit you. I just have a hard time seeing someone trying to help you as an unethical behavior.

comment by AdeleneDawner · 2012-06-21T23:42:58.478Z · score: 1 (1 votes) · LW · GW

I just have a hard time seeing someone trying to help you as an unethical behavior.

It does depend on whose definition of 'help' they're using.

comment by [deleted] · 2012-06-22T00:08:15.923Z · score: 0 (0 votes) · LW · GW

Good point. Do you think it would be ethical if they were helping to fulfill your preferences?

comment by AdeleneDawner · 2012-06-22T01:07:58.624Z · score: 0 (0 votes) · LW · GW

Usually, yes, though there are several qualifications and corner cases.

comment by [deleted] · 2012-06-22T06:09:15.149Z · score: 0 (0 votes) · LW · GW

Agreed, there probably are.

comment by [deleted] · 2012-06-22T01:37:57.836Z · score: 0 (0 votes) · LW · GW

That's fair. I should tone down my point and say that doing this sort of thing is disrespectful, not evil or anything. It's the sort of thing parents and teachers do with kids. With your peers, unsolicited reinforcement training is seen as disrespectful because it stands in lieu of just explaining to the person what you think they should be doing.

comment by TheOtherDave · 2012-06-22T01:55:32.326Z · score: 0 (0 votes) · LW · GW

In my experience, telling other people how I think they should behave is also often seen as disrespectful.

comment by [deleted] · 2012-06-22T02:03:44.452Z · score: 0 (2 votes) · LW · GW

Often it is, we agree. But it's the 'telling' there that's the problem. A respectful way to modify someone's behavior is to convince them to do something different (which may mean convincing them to subject themselves to positive reinforcement training). The difference is often whether we appeal to someone's rationality, or take a run at their emotions.

comment by TheOtherDave · 2012-06-22T02:28:36.148Z · score: 2 (2 votes) · LW · GW

A respectful way to modify someone's behavior is to convince them to do something different

I agree that there are respectful ways to convince me to do something different, thereby respectfully modifying my behavior.
Many of those ways involve appealing to my rationality.
Many of those ways involve appealing to my emotions.

There are also disrespectful ways to convince me to do something different.
Many of those ways involve appealing to my rationality.
Many of those ways involve appealing to my emotions.

comment by [deleted] · 2012-06-22T02:39:06.606Z · score: 0 (0 votes) · LW · GW

There are also disrespectful ways to convince me to do something different. Many of those ways involve appealing to my rationality.

So, by 'appealing to someone's rationality' I mean, at least, arguing honestly. Perhaps I should have specified that. Do you still think there are such examples?

comment by TheOtherDave · 2012-06-22T03:37:13.045Z · score: 1 (1 votes) · LW · GW

Do I think there are disrespectful ways to convince me to do something different that involve arguing honestly? Sure. Do you not?

comment by [deleted] · 2012-06-22T15:52:08.338Z · score: 0 (0 votes) · LW · GW

Not that I can think of, no. Can you think of an example?

comment by TheOtherDave · 2012-06-22T16:09:57.909Z · score: 1 (1 votes) · LW · GW

Sure. Suppose I believe my husband is a foolish, clumsy, unattractive oaf, and I want him to take dance lessons. Suppose I say to him, "Hey, husband! You are a foolish, clumsy, unattractive oaf. If you take dance lessons, you will be less clumsy. That's a good thing. Go take dance lessons!" I would say, in that situation, I have presented an honest, disrespectful argument to my husband with the intention of convincing him to do something different.

comment by [deleted] · 2012-06-22T19:11:42.655Z · score: 0 (0 votes) · LW · GW

That's not really a very good example. That in virtue of which it's disrespectful is unconnected to that in virtue of which it appeals to reason.

comment by TheOtherDave · 2012-06-22T19:19:31.833Z · score: 1 (1 votes) · LW · GW

I agree completely that my example is disrespectful in virtue of (in vice of?) something other than its appeal to reason.

If that makes it a poor example of what you're asking for, I misunderstood what you were asking for. Which, given that you're repeatedly asking me for "an example" without actually saying precisely what you want an example of, is not too surprising.

So, perhaps it's best to back all the way out. If there's something specific you'd like me to provide an example of, and you can tell me what it is, I'll try to provide an example of it if I can. If there isn't, or you can't, that's OK too and we can drop this here.

comment by [deleted] · 2012-06-22T06:08:40.431Z · score: 1 (1 votes) · LW · GW

Well this runs into the problem of giving unsolicited advice. Most people don't respond well to that. I think it's probably difficult for most rationalists to remember this since we are probably more open to that.

comment by pjeby · 2012-06-22T20:53:09.815Z · score: -1 (1 votes) · LW · GW

Well this runs into the problem of giving unsolicited advice. Most people don't respond well to that. I think it's probably difficult for most rationalists to remember this since we are probably more open to that.

Not really. Rationalists are just open to different advice. There's lots of advice rationalists will reject out of hand. (Some of which is actually bad advice, and some of which is not.)

Everyone believes themselves to be open-minded; the catch is that we're all open to what we're open to, and not open to what we're not.

comment by [deleted] · 2012-06-23T01:44:13.972Z · score: 0 (2 votes) · LW · GW

Well I agree that none of us is completely rational when it comes to accepting advice. But don't you think rationalists are at least better at that than most people?

comment by pjeby · 2012-06-23T15:20:01.788Z · score: -1 (1 votes) · LW · GW

But don't you think rationalists are at least better at that than most people?

Based on what evidence?

comment by [deleted] · 2012-06-24T01:02:04.436Z · score: 0 (0 votes) · LW · GW

It's a guess but I think it's a fairly logical one. Think about all the stories of rationalists who've overcome a belief in God, or ESP or whatever. Seems to me that demonstrates an ability to suppress emotion and follow logic that should carry over into other areas.

comment by pjeby · 2012-06-24T01:53:46.821Z · score: 1 (3 votes) · LW · GW

As I mentioned in another comment, you can just read LW threads on contentious topics to observe as a matter of practice that LW rationalists at least are no different than other people in this respect: open only to what they're not already opposed to.

This is relevant evidence: evidence directly connected to the topic (openness to unsolicited advice). Your evidence is not, because it describes situations where rationalists changed their minds on their own. This is really different -- changing your own mind is in no way similar to being open to someone else changing your mind, since somebody else trying to change your mind creates internal resistance in a way that changing your own mind does not.

It's like using people's ability to walk on dry land as evidence of their ability to swim underwater, when an actual swimming test shows the people all drowning. ;-)

comment by Desrtopa · 2012-06-24T03:08:26.061Z · score: 1 (1 votes) · LW · GW

Since I remember your username being associated with various PUA discussions, I assume you at least partly have those in mind, and I can't say much about those not having ever really been part of the discussion, but I'll note that it's a particularly contentious issue (my position has changed somewhat but given that I only had a vague awareness of the PUA community before and not through anyone who participated in or approved of them I don't consider that especially remarkable,) and Less Wrongers seem to be more pliable than the norm on less contentious matters which still provoke significant resistance in much of the population, such as the safety of fireplaces.

comment by pjeby · 2012-06-25T01:23:43.269Z · score: -1 (1 votes) · LW · GW

Since I remember your username being associated with various PUA discussions, I assume you at least partly have those in mind, and I can't say much about those not having ever really been part of the discussion, but I'll note that it's a particularly contentious issue

It's not the only one. See any thread on cryonics, how well SIAI is doing on various dimensions, discussions of nutrition, exercise, and nootropics... it's not hard to run across examples of similar instances of closed-mindedness on BOTH sides of a discussion.

Less Wrongers seem to be more pliable than the norm

My point is: not nearly enough.

on less contentious matters which still provoke significant resistance in much of the population, such as the safety of fireplaces.

As I mentioned in that thread, LWers skew young and toward not already having fireplaces: that they'd be less attached to them is kind of a given.

comment by wedrifid · 2012-06-24T05:35:50.847Z · score: 0 (0 votes) · LW · GW

This is really different -- changing your own mind is in no way similar to being open to someone else changing your mind, since somebody else trying to change your mind creates internal resistance in a way that changing your own mind does not.

Not only is it similar, the abilities in those areas are significantly correlated.

comment by Swimmer963 · 2012-06-24T05:39:59.568Z · score: 1 (3 votes) · LW · GW

Agreed. Wanting to be "the kind of person who changes their mind" means that when you get into a situation of someone else trying to change your mind, and you notice that you're getting defensive and making excuses not to change your mind, the cognitive dissonance of not being the kind of person you want to be makes it more likely, at least some of the time, that you'll make yourself be open to changing your mind.

comment by pjeby · 2012-06-25T01:32:55.865Z · score: -1 (1 votes) · LW · GW

This is a nice idea, but it doesn't hold up that well under mindkilling conditions: i.e. any condition where you have a stronger, more concrete loyalty to some other chunk of your identity than being the kind of person who changes your mind, and you perceive that other identity to be threatened.

It also doesn't apply when you're blocked from even perceiving someone's arguments, because your brain has already cached a conclusion as being so obvious that only an evil or lunatic person could think something so stupid. Under such a condition, the idea that there is even something to change your mind about will not occur to you: the other person will just seem to be irredeemably wrong, and instead of feeling cognitive dissonance at trying to rationalize, you will feel like you are just patiently trying to explain common sense to a lunatic or a troll.

IOW, everyone in this thread who's using their own experience (inside view) as a guide to how rational rationalists are, is erring in not using the available outside-view evidence of how rational rationalists aren't: your own experience doesn't include the times where you didn't notice you were being closed-minded, and thus your estimates will be way off.

comment by pjeby · 2012-06-25T01:35:57.961Z · score: 0 (0 votes) · LW · GW

Not only is it similar, the abilities in those areas are significantly correlated.

In order to use that ability, you have to realize it needs to be used. If someone is setting out to change their own mind, then they have already realized the need. If someone is being offered advice by others, they may or may not realize there is anything to change their mind about. It is this latter skill (noticing that there's something to change your mind about) that I'm distinguishing from the skill of changing your mind. They are not at all similar, nor is there any particular reason for them to be correlated.

comment by Swimmer963 · 2012-06-25T02:13:06.876Z · score: 1 (1 votes) · LW · GW

Really? You don't think the sort of person who tries harder than average to actually change their mind more often will also try harder than average to examine various issues that they should change their mind about?

comment by pjeby · 2012-06-25T15:41:51.487Z · score: 1 (1 votes) · LW · GW

Really? You don't think the sort of person who tries harder than average to actually change their mind more often will also try harder than average to examine various issues that they should change their mind about?

But that isn't the issue: it's noticing that there is something you need to examine in the first place, vs. just "knowing" that the other person is wrong.

Honestly, I don't think that the skill of being able to change your mind is all that difficult. The real test of skill is noticing that there's something to even consider changing your mind about in the first place. It's much easier to notice when other people need to do it. ;-)

comment by [deleted] · 2012-06-25T03:11:29.654Z · score: 0 (0 votes) · LW · GW

Inasmuch as internal reflective coherence, and a desire to self-modify (towards any goal) or even just the urge to signal that desire are not the same thing...yeah, it doesn't seem to follow that these two traits would necessarily correlate.

comment by [deleted] · 2012-06-24T02:52:49.527Z · score: 0 (0 votes) · LW · GW

I hadn't considered that. Ego does get in the way more when other people are involved.

comment by Dolores1984 · 2012-06-22T21:38:08.624Z · score: 0 (0 votes) · LW · GW

This feels like an equivocating-shades-of-grey argument, of the form 'nobody is perfectly receptive to good arguments, and perfectly unswayed by bad ones, therefore, everyone is equally bad at it.' Which is, of course, unjustified. In truth, if rationalists are not at least somewhat more swayed by good arguments than bad ones (as compared to the general population), we're doing something wrong.

comment by pjeby · 2012-06-23T01:35:31.580Z · score: 0 (2 votes) · LW · GW

Which is, of course, unjustified. In truth, if rationalists are not at least somewhat more swayed by good arguments than bad ones (as compared to the general population), we're doing something wrong.

Not really, we're just equally susceptible to irrational biases.

Trivial proof for LW rationalists: read any LW thread regarding a controversial self-improvement topic, including nutrition, exercise, dating advice, etc., where people are diametrically opposed in their positions, using every iota of their argumentative reasoning power in order not to open themselves to even understanding their opponents' position, let alone reasoning about it. It is extremely improbable that all divisive advice (including diametrically-opposed divisive advice) is incorrect, and therefore the bulk of LW rationalists are correctly rejecting it.

(Side note: I didn't say anything about receptiveness to good arguments, I said receptiveness to unsolicited advice, as did the comment I was replying to. I actually assumed that we were talking about bad arguments, since most arguments, on average, are bad. My point was more that there are many topics which rationalists will reject out of hand without even bothering to listen to the arguments, good or bad, and that in this, they are just like any other human being. The point isn't to invoke a fallacy of the grey, the point is for rationalists not to pat ourselves on the back in thinking we're demonstrably better at this than other human beings: demonstrably, we're not.)

comment by TheOtherDave · 2012-06-22T21:05:42.334Z · score: 0 (0 votes) · LW · GW

It amuses me how readily my brain offered "I am not neither open-minded!" as a response to that.

comment by adamtpack · 2012-06-23T02:17:57.247Z · score: 0 (0 votes) · LW · GW

But your environment includes people, dude.

This shouldn't be a puzzle. Reinforcement happens, consciously or subconsciously. Why in the name of FSM would you choose to relinquish the power to actually control what would otherwise happen just subconsciously?

How is that not on the face of it a paragon, a prototype of optimization? Isn't that what optimizing is, more or less: consciously changing what is otherwise unconscious?

comment by Swimmer963 · 2012-06-21T19:50:17.923Z · score: 0 (0 votes) · LW · GW

I don't think I would be suspicious of him, as long as I agreed with the behaviours he was trying to reinforce. (I don't know for sure–my reactions are based only on a thought experiment.) I think I would be grateful, both that he cared enough about me to put that much time and effort in, and that he considered me emotionally mature enough to tell me honestly what he was doing.

However, I do think that being aware of his deliberate reinforcement might make it less effective. Being reinforced for Behaviour A would feel less like "wow, the world likes it when I do A, I should do it more!" and more like "Person X wants me to do A", which is a bit less motivating.

comment by [deleted] · 2012-06-21T20:03:43.306Z · score: 1 (1 votes) · LW · GW

I don't think I would be suspicious of him, as long as I agreed with the behaviours he was trying to reinforce.

Really? So say I tell you that all those times that I smiled at you and asked how you were doing were part of a long term plan to change the way you behave. The next day I smile and ask you how you're doing. Has my confession done nothing to change the way you think about my question?

I'm saying that things like smiles and friendly, concerned questions have a certain importance for us that is directly undermined by their being used for the purposes of changing our behavior. I don't think using them this way is always bad, but it seems to me that people who generally treat people this way are people we tend not to like once we discover the nature of their kindness.

comment by Swimmer963 · 2012-06-21T20:28:26.802Z · score: 0 (0 votes) · LW · GW

Like I said, thought experiments about "how would I feel if X happened" are not always accurate. However, when I try to simulate that situation in my head, I find that although I would probably think about his smile and question differently (and be more likely to respond with a joke along the lines of "trying to reinforce me again, huh?") I don't think I would like him less.

Anyway, I think I regularly use smiles and "how are you doing?" to change the way people behave...namely, to get strangers, i.e. coworkers at a new job, to start liking me more.

comment by [deleted] · 2012-06-21T20:43:31.813Z · score: 0 (0 votes) · LW · GW

Well, I guess I'll tap out then. I'm not sure how to voice my position at this point.

comment by Swimmer963 · 2012-06-21T21:53:09.657Z · score: 1 (1 votes) · LW · GW

Your position is that you have a certain emotional response to knowing someone is trying to modify your behaviour. My position is that I have a different emotional response. I can imagine myself having an emotional response like yours...I just don't. (Conversely, I can imagine someone experiencing jealousy in the context of a relationship, but romantic jealousy isn't something I really experience personally.) I don't think that makes either of us wrong.

comment by [deleted] · 2012-06-21T22:42:12.389Z · score: 0 (0 votes) · LW · GW

Well, my position is that doing things like asking how someone is doing so as to reinforce behavior rather than because you want to know the answer is ethically bad. I used the example of the friend to try to motivate and explain that position, but at some point if you are totally fine with that sort of behavior, I don't have very much to argue with. I think you're wrong to be fine with that, but I also don't think I can mount a convincing argument to that effect. So you've pretty much reached the bottom of my thoughts on the matter, such as they are.

comment by Swimmer963 · 2012-06-22T02:12:45.950Z · score: 1 (1 votes) · LW · GW

I'm curious about whether your reasons for considering this kind of behaviour "unethical" are consequentialist (i.e. a world where people do X is going to be worse overall than a world where no one does X) or deontological (there are certain behaviours, like lying or stealing, that are just bad no matter what world they take place in, and using social cues to manipulate other people is a behaviour that falls into that class.)

comment by [deleted] · 2012-06-22T02:21:45.253Z · score: 0 (0 votes) · LW · GW

Ah, I'm not a consequentialist or a deontologist, but I do think this is a case where intentions are particularly important. Doing this kind of reinforcement training to someone without their knowledge is characteristically disrespectful if you just do it to help them, but it may also be the right thing to do in some cases (I'm toning down my claim a bit). Doing it with the result that they are harmed is vicious (that is, an expression or manifestation of a vice) regardless of your intentions. So that puts me somewhere in the middle.

comment by wedrifid · 2012-06-22T03:22:25.780Z · score: 0 (0 votes) · LW · GW

Doing this kind of reinforcement training to someone without their knowledge is characteristically disrespectful if you just do it to help them, but it may also be the right thing to do in some cases (I'm toning down my claim a bit).

I wouldn't necessarily say that. Doing it when you know they don't (or would not) want you to is disrespectful.

Doing it with the result that they are harmed is vicious (that is, an expression or manifestation of a vice) regardless of your intentions.

This definitely seems false. It is the expected result, given information that you have (or should be expected to have) that can indicate viciousness, not actual results. For example, I could reward my children such that they never Jaywalk (still not quite sure what this is) and only cross the road at official crossings. Then one of my children gets hit by a car waiting at a crossing when they would have been fine crossing the street earlier. I haven't been vicious. My kid has been unlucky.

In the general case it is never the result that determines whether your decision was the right decision to make in the circumstance. It is the information available at the time. (The actual result can be used as a proxy by those with insufficient access to your information at the time or when differing incentives would otherwise encourage corruption with 'plausible deniability').

comment by [deleted] · 2012-06-22T15:56:48.834Z · score: 0 (0 votes) · LW · GW

On the unlucky kid: fair enough. But using positive reinforcement to make someone violent or cowardly, even if you think you're benefiting them, is vicious. That's the sort of case I was thinking about.

I disagree with you about the actual vs. expected result, but that's a bigger discussion.

comment by TheOtherDave · 2012-06-22T16:13:08.048Z · score: 0 (0 votes) · LW · GW

On your account, is using positive reinforcement to make someone peaceful vicious? Virtuous? Neither?

comment by [deleted] · 2012-06-22T19:15:57.445Z · score: 0 (0 votes) · LW · GW

It depends on whether or not they should be peaceful, I guess. But if they're not your child or student or something like that, then it's probably disrespectful at the least.

comment by TheOtherDave · 2012-06-22T19:20:44.249Z · score: 0 (0 votes) · LW · GW

OK. Tapping out now.

comment by shminux · 2012-06-21T23:28:01.449Z · score: 1 (1 votes) · LW · GW

Well, my position is that doing things like asking how someone is doing so as to reinforce behavior rather than because you want to know the answer is ethically bad.

Can you express your personal ethics explicitly and clarify where it comes from?

comment by [deleted] · 2012-06-22T01:31:44.730Z · score: 0 (0 votes) · LW · GW

I'd be happy to try. Do you want a brief account specific to this topic, or something more general?

comment by shminux · 2012-06-22T02:20:17.195Z · score: 0 (0 votes) · LW · GW

If you could trace your ethics backward from "it's unethical when people consciously use punishment/reward system to modify my behavior to their liking" to some basic ideas that you hold inviolate and cannot further trace to anything deeper, I'd appreciate it.

comment by [deleted] · 2012-06-22T02:49:47.142Z · score: 1 (1 votes) · LW · GW

I think there are basically two aspects to our ethical lives: the biological and habituated arrangement of our emotions and our rationality. Our lives involve two corresponding phases. As children, we (and our teachers, parents, etc.) aim at developing the right kinds of emotional responses, and as adults we aim at doing good things. Becoming an adult means having an intellectual grasp of ethics, and being able (if one is raised well) to think through one's actions.

When you use positive reinforcement training, you treat someone as if they were in the childhood phase of their development, even if the behavioral modification is fairly superficial. This isn't necessarily evil or anything, but it's often disrespectful if it stands in place of appealing to someone's ethical rationality. I guess an analogue would be using dark arts tactics to convince someone to have the right opinions about something. It's disrespectful because it ignores or holds in contempt their ability to reason for themselves.

comment by OnTheOtherHandle · 2012-08-19T19:19:25.440Z · score: 1 (1 votes) · LW · GW

I think I disagree with this because the brain is modular, an evolutionary hodge-podge of old and new subroutines each with different functions. Only a few of those modules are conscious, self-aware, deliberative thinkers capable of planning ahead and accurately judging the consequences of potential actions to decide what to do. The rest is composed of a series of unconscious impulses, temptations, and habits. When I say "I," I refer to the former. When I say "my brain", I refer to the latter.

And I am always trying to trick and manipulate my brain. If I'm on a diet, I'll lock the refrigerator door to make it harder to get a midnight snack. I'll go grocery shopping only when I'm full. I'll praise myself when I eat celery, etc.

Personally, I only identify with, approve of, and demand respect for those conscious, self-reflective modules, and the various emotions and habits that are in harmony with them. And if someone who loves me wants to help me trick my brain into better aligning with my values, I'm all for it. Even if a particular technique to condition my brain requires that I don't know what they're doing.

And when it comes to reinforcing behaviors that align with my extrapolated volition ("What is OTOH likely to want to do, but is too scared/lazy/squicked out/biased to get herself to do?"), deliberate, considered, scientifically sound manipulation is probably better than the subconscious manipulation we all engage in, because the chances of getting undesired results are lower.

comment by [deleted] · 2012-08-19T21:20:00.424Z · score: 0 (0 votes) · LW · GW

My objection is basically that it's disrespectful (to the point of being unethical) to do this sort of thing to someone without their consent. As with many such things, there are going to be cases where someone has not or cannot actually give consent, and so we have to ask whether or not they would do so if they had all the facts on the table. In these cases, it's a tricky question whether or not you can assume someone's consent, and it is often best to err on the side of not assuming consent.

I notice that you put this in terms of someone you love manipulating your habits in accordance with your values. That sounds a lot like a case where someone is safe assuming your consent.

I was objecting, in the OP, to the lack of any discussion of what seems to me to be the central moral question in this kind of activity, as well as what I took to be the view that this kind of consent can be quite broadly assumed. With some very few exceptions, I think this is unethical.

comment by OnTheOtherHandle · 2012-08-19T21:56:31.232Z · score: 3 (3 votes) · LW · GW

The thing is, other people's actions and reactions will always sway our behavior in a particular direction, and our actions will do the same to others. We evolved to speak and act in such a way as to get allies, friends, mates, etc. - i.e., make people like us so we can then get them to do things for us. Those who were good at getting others to like and help them reproduced more frequently than those who were not. Even if I were to agree that influencing others' behavior without their explicit knowledge and consent is unethical, I can't not do that.

My every smile, frown, thank-you, sorry, and nagging criticism will do something to affect the behavior of others, and they won't be thinking "Ah, she thanked me, this will have the effect of reinforcing this behavior." So if I can't avoid it, the next best thing would be to influence skillfully, not clumsily. In both cases, the other person's behavior is being influenced, and in both cases they are not explicitly aware of this. The only difference in the second case is that I know what I'm doing.

I definitely understand where you're coming from. I can empathize with the sense of violation and disrespect, and I agree that in a lot of situations such behavior is problematic, but I probably wouldn't agree with you on what situations, or how often they occur. This was my biggest problem with PUA when I first heard about it. I found it horrifyingly offensive that men might take advantage of the security holes in my brain to get me to sleep with them. But...confident, suave men are attractive. If a man were "naturally" that way, then he's "just sexy," but if someone who didn't initially start out that way explicitly studies how to behave in an attractive manner, that's creepy.

Why? It's not like no one's ever allowed to try to get anyone to sleep with them, and it's not like I would favor a strict rule of a complete, explicit disclaimer explaining, "Everything I say is with the sole intention of convincing you to have sex with me." (Such a disclaimer wouldn't even be true, necessarily. Human interaction is complex and multi-faceted, and any given conversation would have multiple motives, even if one dominates.)

So what's the difference between a man who's "just sexy" and a "creepy PUA" who behaves the same way? (We'll ignore some of the blatant misogyny and unattractive bitterness among many PUA, because many women find the abstract concept itself creepy, with or without misogyny.)

I think it's the knowledge differential, which causes a very skewed power balance. The naturally confident, extroverted man is unconsciously playing out a dance which he never really examined, and the woman he's chatting up is doing the same. When this man is replaced with a hyper self-aware PUA, the actions are the same, but the woman is in the dark while the man can see exactly why what he says causes her to react the way she does.

It's like a chess game between Garry Kasparov and a guy who only vaguely realizes he's playing chess. Yes, it's unfair. But I think the more practical solution is not making Kasparov handicap himself, but teaching the other guy how to play chess.

I think the line between conscious and unconscious influencing of behavior is thinner and more fluid than you seem to say, more like a sliding scale of social self-awareness. And the line between manipulation and self-improvement is even thinner. What if I decided to be much nicer to everyone all of a sudden because I wanted people to like me? The brain is not a perfect deceiver; soon I'll probably fake it til I make it, and everyone's lives would be more pleasant.

In the end, I treat emotional manipulation (which involves changing one's emotional responses to certain behaviors, rather than telling people factual lies) the way I treat offense. It's just not practical to ban offending people. I think it's more useful to be aware of what offends us, and moderate our responses to it. In the same way, it's not possible to ban influencing other people's behavior without their explicit knowledge; the naturally sexy man does this just as much as the PUA does. It's possible to have a norm of taking the other person's wishes into account, and it's possible to study the security holes in our own minds and try to patch them up.

comment by [deleted] · 2012-08-20T14:10:11.007Z · score: 0 (0 votes) · LW · GW

So if I can't avoid it, the next best thing would be to influence skillfully, not clumsily. In both cases, the other person's behavior is being influenced, and in both cases they are not explicitly aware of this. The only difference in the second case is that I know what I'm doing.

I think there is a difference. You're right that all our behavior has or can have a reinforcing effect on other people. But smiles, and frowns, and thank-yous and such aren't therefore just reinforcers. When I smile at someone, I express something like affection, and if I don't feel any affection, I smile falsely. All these kinds of behaviors are the sorts of things that can be done honestly or falsely, and we ought to do them honestly. We do this with children, but with adults it's disrespectful.

It might be possible to smile at someone for the sake of reinforcing some behavior of theirs, and to feel affection all the while, but my sense is that either a smile is an expression of affection, or it is done for some ulterior end.

I think your initial reaction to PUA is spot on. It's a monstrous practice.

comment by OnTheOtherHandle · 2012-08-21T05:50:26.648Z · score: 1 (1 votes) · LW · GW

my sense is that either a smile is an expression of affection, or it is done for some ulterior end.

Here's where I think human thinking is more complicated, muddled, and mutually-reinforcing than you say. In the example of saying "Thank you," is it really so inconceivable that someone might say "Thank you," while thinking (or, more likely, wordlessly intuiting) something along the lines of "I'm grateful and happy that this person did this, and I would like them to do it again"? In fact, much of these "reinforcement" or "animal training" tips, while phrased repulsively, mostly end up advising, "Remember to consistently express the gratitude you feel, and refrain from expressing any annoyance you might feel."

Here's what I might think, if I were the wife in that example: "Not only does nagging and expressing annoyance when I feel my reasonable expectations were not met belittle and irritate my husband, it doesn't even work. He still doesn't put the damn clothes in the damn hamper! We're both less happy, and I didn't even get him to change." If I understand you correctly, that last part, where I discuss the efficacy of my nagging at getting me what I want, sounds dishonestly manipulative to you.

We all expect things from others, and we all care about others. Is it always, inevitably wrong to sully considerations of caring/being a nice person with considerations of ensuring your expectations and needs get met? Or is it that the only legitimate way to get other human beings to meet your expectations is to sit them down and explain it all to them, even if they're annoyed and made unhappy by this Talk and its lack of emotional salience means it doesn't work?

Saying "Thank you" and ignoring the clothes that don't get put in the hamper works. It bypasses defensive, angry, annoyed reactions to nagging. It accurately expresses that clothes-in-the-hamper make me happy - in fact, more directly than the nagging method did, because the nagging method required the husband to infer that clothes-on-floor causes irate nagging, therefore clothes-in-the-hamper must cause happiness and gratitude. He's happy, because he feels appreciated and doesn't feel like he's a teenager again being prodded by his mother. I'm happy, because I don't feel like a grumpy middle-aged mother of a teenager. The clothes are in the hamper.

Was it wrong that I started all this because I was annoyed at having to nag him and wanted a more reliable way to get him to put his clothes in the hamper? Even though the (empirically sound) advice only told me to frame the same content - "Floor bad, hamper good" - in a more positive light, expressing happiness and gratitude when things go right, rather than irritation and disappointment when things go wrong? Even though once I shook myself of the nagging mindset the happiness and gratitude was not grudgingly given, was not an inaccurate portrayal of my now-happier mental state, was not intended to belittle my husband, but only to make us both happier AND get him to put the clothes in the hamper?

comment by Jonathan_Graehl · 2012-06-22T23:38:16.247Z · score: 1 (1 votes) · LW · GW

That's sensible, but realize that it's atypical. Make those expectations clear before you cry foul in a relationship.

If you make an appeal to the "adult" in most people, you'll confuse and infuriate them ("why is he lecturing me?"). Better (by default) stick with a smile when they do right by you, and ignore/brush off when possible if they don't.

comment by shminux · 2012-06-22T02:57:41.042Z · score: 0 (2 votes) · LW · GW

Becoming an adult means having an intellectual grasp of ethics, and being able (if one is raised well) to think through one's actions.

Even without any feedback from others? Or are you OK with a specific kind of feedback? What kind would it be? Is explicitly telling a person what you expect of them OK? If so, when does it become not OK?

comment by [deleted] · 2012-06-22T15:51:08.718Z · score: 0 (0 votes) · LW · GW

Yes, even without feedback, though it's always helpful to have other people to think with. As to when telling someone what to do is okay and not, I can't imagine there's any general rule, but I also expect we're all familiar with the kinds of situations when you can do that and when not.

comment by TheOtherDave · 2012-06-22T16:17:14.531Z · score: 1 (1 votes) · LW · GW

As to when telling someone what to do is okay and not, [...] I also expect we're all familiar with the kinds of situations when you can do that and when not.

Just to be clear: if a hundred randomly-selected humans are presented with an identical list describing, in full detail, a hundred cases where person A tells person B what to do, and those humans are asked to classify those cases into acceptable, unacceptable, and borderline, your expectation is that most or all of those humans will arrive at the same classifications?

Because I find that extremely unlikely.

comment by TimS · 2012-06-22T16:22:31.049Z · score: 0 (0 votes) · LW · GW

Really? To me, it depends substantially on how the list is generated. If we try to "rip from the headlines," I'd expect substantial disagreement. If we follow you around and watch you tell people what to do in your ordinary week, I expect more agreement.

In short, there are lots of points of disagreement about social interaction, but there are far more mundane and uncontroversial interactions than controversial ones.

comment by TheOtherDave · 2012-06-22T16:43:52.847Z · score: 1 (1 votes) · LW · GW

Hm.

Well, I certainly agree that it's possible to generate a list of a hundred cases that 95% of people would agree on the classification of.

But if you followed me around for a week and picked samples randomly from that (both of cases where I tell people what to do, and cases where I could have told people what to do and didn't), and you asked a hundred people, I expect you'd get <60% congruence. I work in an office full of Americans and Israelis; I am frequently amused and sometimes horrified by the spread of opinion on this sort of thing.

Of course, if you narrowed your sample to middle-class Americans, you might well get up above 90%.

Edit: I should explicitly admit, though, that I was not envisioning a randomly generated list of cases. It was a good question.

comment by [deleted] · 2012-06-22T19:20:37.425Z · score: 0 (0 votes) · LW · GW

I had something like a set of mundane cases in mind. My post was just meant to point out that discerning these sorts of situations is not something we use a set of rules or criteria for (at least no fixed set we could usefully enumerate), but most people are socially competent enough to tell the difference.

comment by TheOtherDave · 2012-06-22T19:26:35.240Z · score: 0 (0 votes) · LW · GW

I agree that most people who share what you're calling "social competence" within a given culture share a set of rules that determine acceptable utterances in that culture, and that those rules are difficult to enumerate.

comment by TheOtherDave · 2012-06-21T19:00:19.094Z · score: 0 (0 votes) · LW · GW

Oh, you're definitely on to something, and it's something important.

That said, I don't think what you're on to has to do with whether and when it's ethical to manipulate people's behavior.

comment by [deleted] · 2012-06-21T19:32:53.738Z · score: 0 (0 votes) · LW · GW

So what am I on to then?

comment by TheOtherDave · 2012-06-21T20:18:09.877Z · score: 2 (2 votes) · LW · GW

Roughly, that we often respond to others' ability to cause us harm (whether by modifying our behavior or our bank accounts or our internal organs or whatever other mechanism) as a threat, independent of their likelihood of causing us harm.

So if you demonstrate, or even just tell me about, your ability to do these things, then while depending on the specific context, my specific reaction will be somewhat different... my reaction to you knowing my bank PIN number will be different from my reaction to you knowing how to modify my behavior or how to modify the beating of my heart or how to break into my home... they will all have a common emotional component: I will feel threatened, frightened, suspicious, attacked, violated.

That all is perfectly natural and reasonable. And a common and entirely understandable response to that might be for me to declare that, OK, maybe you are able to do those things, but a decent or ethical person never will do those things. (That sort of declaration is one relatively common way that I can attempt to modify your likelihood of performing those actions. I realize that you would only consider that a form of manipulation if I realize that such declarations will modify your likelihood of performing those actions. Regardless, the declaration modifies your behavior just the same whether I realize it or not, and whether it's manipulation or not.)

But it doesn't follow from any of that that it's actually unethical for you to log into my bank account, modify my heartbeat, break into my home, or modify my behavior. To my mind, as I said before, the determiner of whether such behavior is ethical or not is whether the result leaves me better or worse off.

Breaking into my home to turn off the main water valve to keep my house from flooding while I'm at work is perfectly ethical, indeed praiseworthy, and I absolutely endorse you doing so. Nevertheless, I suspect that if you told me that you spent a lot of time thinking about how to break into my home, I would become very suspicious of you.

Again, my emotional reaction to your demonstrated or claimed threat capacity is independent of my beliefs about your likely behaviors, let alone my beliefs about your likely intentions.

comment by [deleted] · 2012-06-21T20:30:42.779Z · score: 0 (0 votes) · LW · GW

Roughly, that we often respond to others' ability to cause us harm (whether by modifying our behavior or our bank accounts or our internal organs or whatever other mechanism) as a threat, independent of their likelihood of causing us harm.

This seems very implausible to me. I often encounter people with the ability to do me great harm (a police officer with a gun, say), and this rarely if ever causes me to be angry, or feel as if my dignity has been infringed upon, or anything like that. Yet these are the reactions typically associated with finding out you've been intentionally manipulated. Do you have some independent reason to believe this is true?

comment by TheOtherDave · 2012-06-21T20:41:39.534Z · score: 0 (0 votes) · LW · GW

Yes, but no reasons I can readily share. And, sure, I might be wrong.

comment by TheOtherDave · 2012-06-21T17:00:27.346Z · score: 9 (9 votes) · LW · GW

But treating human beings, especially adults, like animals is characteristically unethical.

This statement without context is clearly incorrect; there are all sorts of behaviors we can ethically execute with respect to both humans and other animals. I understand that what you and the OP both mean to connote is particular behaviors which we restrict in typical contexts only to non-human animals, but if you're going to label them as unethical when applied to humans it helps to specify what behaviors and context those are.

manipulating the behavior of other people by means other than convincing them that they should behave in a certain way seems to me to be almost definitional of a dark art.

That's a little more specific, but not too much, as I'm not really sure what you mean by "convincing" here.

That is, if at time T1 I don't exhibit behavior B and don't assert that I should exhibit B, and you perform some act A at T2 after which I exhibit B and assert that I should exhibit B, is A an act of convincing me (and therefore OK on your account) or not (and therefore unethical on your account)? How might I test that?

never do this to other people without their explicit consent

This, on the other hand, is clear. Thank you.
I disagree with it strongly.

comment by TimS · 2012-06-21T17:49:33.734Z · score: 2 (2 votes) · LW · GW

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

That story doesn't trouble you at all?

For most people, there's lots of low-hanging fruit from trying to recognize when they are reinforcing and punishing behaviors of others. Also, positive reinforcement is more effective at changing behavior than positive punishment.

But that doesn't mean that we should embrace conditioning-type behavior-modification wholesale. I'm highly doubtful that conditioning responses are entirely justifiable by decision-theoretic reasons. And "not justifiable by decision-theoretic reasons" is a reasonable definition of non-rational. Which implies that relying on those types of processes to change others' behaviors might be unethical.

comment by TheOtherDave · 2012-06-21T18:51:07.095Z · score: 5 (5 votes) · LW · GW

Does it trouble me at all? I suppose. Not a huge amount, but some. Had Esar said "Doing this to people without their consent is troubling" rather than "never do this to other people without their explicit consent" I likely wouldn't have objected.

My response to the rest of this would mostly be repeating myself, so I'll point to here instead.

More generally, "conditioning-type behavior-modification" isn't some kind of special category of activity that is clearly separable from ordinary behavior. We modify one another's behavior through conditioning all the time. You did it just now when you replied to my comment. Declaring it unethical across the board seems about as useful as saying "never kill a living thing."

comment by [deleted] · 2012-06-21T17:06:53.315Z · score: 0 (0 votes) · LW · GW

This statement without context is clearly incorrect...

You seem to know what I mean, so I won't go into a bunch of unnecessary qualifications.

is A an act of convincing me?

Not necessarily. Is the meaning of 'convince' really unclear? Threatening someone with a gun seems to satisfy your description, but it's obviously not a case of convincing. I'm not sure what you're unclear about.

I disagree with it strongly.

If you care to explain why, please do so.

comment by TheOtherDave · 2012-06-21T18:45:02.095Z · score: 4 (4 votes) · LW · GW

If you care to explain why, please do so.

Sure.

The easiest way to get at it is with an example.

Suppose I decide I want my coworkers to visit my desk more often at work, and therefore begin a practice of smiling at everyone who visits, keeping treats on my desk and inviting visitors to partake, being nicer to people when they visit me at my desk than I am at other times, and otherwise setting up a schedule of differential reinforcement designed to increase the incidence of desk-visiting behavior, and I do all of that without ever announcing to anyone that I'm doing it or why I'm doing it, let alone securing anyone's consent. (Let alone securing everyone's consent.)

Do you consider that an example of unethical behavior? I don't.

Now, maybe you don't either. Maybe it's "obviously" not an example of manipulating the behavior of other people by means other than convincing them that they should behave in a certain way. I don't really know, since you've declined to clarify your constraints. But it sure does seem to match what you described.

comment by [deleted] · 2012-06-21T18:57:33.130Z · score: 0 (0 votes) · LW · GW

Do you consider that an example of unethical behavior? I don't.

You're right that this doesn't seem quite unethical, but it is awfully creepy and I'm not sure how to pull my intuitions apart there. Sitting across from someone who is faking affection and smiles and pleasantries so as to manipulate my behavior would cause me to avoid them like the plague.

In professional environments I find this happens all the time, and when the fake friendliness is discovered as such, the effect reverses considerably. If it's terribly important to something's being effective that the person you're doing it to doesn't know what's going on, it's probably bad.

comment by TheOtherDave · 2012-06-21T19:11:37.642Z · score: 5 (5 votes) · LW · GW

(nods) Absolutely. I could have also framed it to make it seem far creepier, or to make it seem significantly less creepy.

In particular, the use of loaded words like "faking" and "manipulate" ups the creepy factor of the description a lot. The difference between faking affection and choosing to be affectionate is difficult to state precisely, but boy do we respond to the difference between the words!

I agree that most activities which depend on my ignorance for their effectiveness are bad. I even agree that a higher percentage of activities which depend on my ignorance for their effectiveness are bad than the equivalent percentage of activities that don't so depend.

That said, you seem to be going from that claim to the implicit claim that they are bad by virtue of depending on my ignorance. That's less clear to me.

comment by [deleted] · 2012-06-21T19:49:47.224Z · score: 2 (2 votes) · LW · GW

I could have also framed it to make it seem far creepier

I'll put it simply: if someone asks me about my kids, neither to be polite nor because they care, but because they want to change the way I behave, then they're (in most cases) being manipulative and insincere. While perhaps they're not wronging me, per se, it's certainly not something that speaks well of them, ethically speaking. If you find this controversial, then you surprise me.

It would be bad advice, I think, to encourage people to use positive reinforcement on others when their ignorance is necessary for it to be effective. Not just practically bad advice, as people are pretty good at picking up on fake friendliness. But full stop ethically damaging advice, if taken seriously. I'm not saying that every such case is going to be unethical, but I'm not in the business of lawlike ethical principles anyway.

That said, you seem to be going from that claim to the implicit claim that they are bad by virtue of depending on my ignorance. That's less clear to me.

No, what I said was that behaviors which depend on someone's ignorance for their effectiveness are often also bad behaviors. I didn't say anything one way or the other about a stricter relation between the two properties, but I'll say now that I don't think they're unrelated.

comment by MixedNuts · 2012-06-26T07:00:25.077Z · score: 1 (3 votes) · LW · GW

What do you think being polite is?

comment by TheOtherDave · 2012-06-21T20:37:24.495Z · score: 1 (1 votes) · LW · GW

I agree that asking you about your kids solely to change your behavior is manipulative.
I also agree that it's insincere. (Which is an entirely distinct thing.)
I would also say that asking you about your kids solely to be polite is insincere.
I would not agree that any of these are necessarily unethical.

I am not quite sure what you mean by "ethically damaging advice."
I agree with you that it's not always unethical to positively reinforce others without their knowledge.
I would agree that "Positively reinforcing others without their knowledge is a good thing to do, do it constantly" is advice that, if taken seriously, would often lead me to perform unethical acts. I can accept calling it unethical advice for that reason, I suppose.
But I also think that "Positively reinforcing others without their knowledge is a bad thing to do, never do it." is unethical advice in the same (somewhat unclear) sense.

I agree that behaviors that depend on others' ignorance are often also bad behaviors.
Behaviors that depend on others' knowledge are also often bad behaviors.

comment by [deleted] · 2012-06-21T20:50:11.325Z · score: 1 (1 votes) · LW · GW

Agreed on all counts. In fact, it doesn't look like we disagree at all, judging from your comment.

comment by TheOtherDave · 2012-06-21T20:51:36.805Z · score: 0 (0 votes) · LW · GW

Oh good!
When you started out by saying "never do this," I concluded otherwise.
I'm pleased to discover I was wrong.

comment by [deleted] · 2012-06-21T21:01:09.145Z · score: 0 (0 votes) · LW · GW

Well, I think I'd stand by what I said originally. Though I guess I'm counting on no one reading that as the exceptionless proposition 'for all x such that x is a case of using positive reinforcement without someone's knowledge, x is unethical'. Likewise, if someone asked me, I'd say 'Don't ever shoplift, it's unethical.' Though I wouldn't want or expect anyone to read that as 'all cases of shoplifting are, without exception, unethical.'

comment by TheOtherDave · 2012-06-21T21:07:08.449Z · score: 0 (0 votes) · LW · GW

OK. I apologize for misunderstanding your original comment.

comment by [deleted] · 2012-06-21T21:13:56.637Z · score: 0 (0 votes) · LW · GW

Quite alright, I've enjoyed the discussion.

comment by AdeleneDawner · 2012-06-21T23:07:34.929Z · score: 1 (1 votes) · LW · GW

You and Esar both: Taboo 'creepy'? Particularly with an eye to 'why is it important that this situation evokes this emotion'?

comment by TheOtherDave · 2012-06-21T23:59:39.916Z · score: 0 (0 votes) · LW · GW

Well, I think it's important because IMHO that negative emotional response is what underlies the (incorrect) description of the corresponding behavior as unethical. But I expect Esar would find that implausible.

comment by AdeleneDawner · 2012-06-22T00:05:19.898Z · score: 1 (1 votes) · LW · GW

'Taboo with an eye to this question', not 'answer this question'. I'd already noticed the pattern that people consider finding something creepy to be sufficient reason to label it unethical, but that observation isn't useful for very much beyond predicting other people's labeling habits.

comment by TheOtherDave · 2012-06-22T00:23:50.763Z · score: 0 (0 votes) · LW · GW

Oh, I see.
Sorry, misunderstood.
I could replace "creepy" everywhere it appears with "emotionally disquieting", but I'm not sure what that would help. I figured using the same language Esar was using would be helpful, but I may well have been wrong.

comment by OnTheOtherHandle · 2012-08-19T22:35:55.344Z · score: 0 (0 votes) · LW · GW

I think it's false to suggest that pleasantries are being outright faked. This person is probably not sitting there going, "Oh, woe is me, I have to pay the horrible price of smiling and being nice to these imbeciles in order to make them give me what I want; I would never do that otherwise." In fact, why would he even want his coworkers to visit his desk more if he had such utter contempt for them that he had to fake affection wholesale?

Rather, like many people, there's a part of him which would probably like to be a nicer person overall, but he can't always bring himself to live up to the ideal. "People will visit my desk more" is a good immediate incentive to be a better person. The coworker who wants more people to visit their desk is also affected by the results of his own behavior. He'll probably be happier because of the visitations, and his happiness would cause him to smile more, and the very act of smiling would make him even more happy. After a while the "initial motivation," whether it was 100% selfish "I want people to visit my desk more; damn their own desires" or the 100% altruistic "I want to manipulate myself into being a nicer person," or, more likely, a mixture of the two, has faded away, and all that remains is the slightly modified, more pleasant person.

comment by Tuesday_Next · 2012-08-03T01:10:47.152Z · score: 0 (0 votes) · LW · GW

I don't understand how using friendly behavior to reinforce people visiting one's desk precludes that behavior being genuine. You seem to be dismissing the possibility that the person in question feels real affection, and is smiling because they are in fact happy that their desk is being visited. Just because they are using their (real) positive response to coworkers visiting their desk as positive reinforcement doesn't mean that their behavior is "fake" in any way.

Just like a woman who feels a surge of affection towards her husband when he puts away the laundry, and kisses or praises him.

Yes, it's positive reinforcement, but it's also a genuine response.

comment by drethelin · 2012-06-21T15:41:33.425Z · score: -1 (15 votes) · LW · GW

http://en.wikipedia.org/wiki/Love_bombing

This is getting creepy.

comment by wedrifid · 2012-06-22T11:59:31.201Z · score: 15 (15 votes) · LW · GW

http://en.wikipedia.org/wiki/Love_bombing

If this genuinely looks like love bombing then it could be an indication that you need more affection in your life to recalibrate the base rate.

comment by sketerpot · 2012-06-22T00:47:01.993Z · score: 4 (4 votes) · LW · GW

You realize that almost all people express appreciation or displeasure routinely, right? It's a normal and reasonable part of human interaction, and it's a skill that someone can try to improve without needing to feel too conflicted. Love bombing is far more extreme than anything that this post even touched on. So, while we're linking to things, here's one:

http://lesswrong.com/lw/md/cultish_countercultishness/

comment by Viliam_Bur · 2012-06-22T10:33:55.836Z · score: 5 (5 votes) · LW · GW

Love bombing is just a tool -- its morality depends on how it is used. In a typical situation it is used to ruin the person's natural resistance towards groups that exploit them; that is obviously evil.

A different thing would be to use love bombing with the person's explicit consent, as a reinforcement for things the person values, and for nothing else. Preferably for a limited time specified in advance. It could be a great tool to overcome akrasia.

comment by MarkusRamikin · 2012-06-22T11:01:31.572Z · score: 4 (4 votes) · LW · GW

love bombing with the person's explicit consent

That sounds even more creepy. I like it.

comment by faul_sname · 2012-06-24T01:45:58.529Z · score: 3 (3 votes) · LW · GW

LWers do many cultish things, but I think it's safe to say that's not one of them.

comment by Desrtopa · 2012-06-24T01:58:33.296Z · score: 2 (2 votes) · LW · GW

LWers do many cultish things

How many?

comment by faul_sname · 2012-06-24T04:44:27.889Z · score: 4 (8 votes) · LW · GW

At least 3:

Specifically: foster a distrust of what outsiders say, quote a lot of stuff by a self-appointed charismatic leader, and emphasize a single solution (rationality) for a large number of problems.

Notable also are the large number of cultish things LWers don't do, such as aggressive recruiting (or really, any recruiting at all).

comment by Desrtopa · 2012-06-24T05:22:14.134Z · score: 4 (4 votes) · LW · GW

quote a lot of stuff by a self-appointed charismatic leader

I wouldn't exactly call Eliezer a self-appointed leader. The community basically accreted around him. If he disavowed being the leader, I think we'd say he was being dishonest or fooling himself.

Not that this is a distinction from cults; the same would probably be true of most of them. I just think it's not quite accurate as a characterization.

Oh, also I think most cult leaders probably have more charisma off the internet.

comment by faul_sname · 2012-06-24T05:31:52.215Z · score: 3 (3 votes) · LW · GW

Oh, probably. I hear Luke has more real-life charisma... Though he kind of kills the "fosters a distrust of outside sources" with the amount he cites outside sources.

comment by wedrifid · 2012-06-24T06:19:57.727Z · score: 9 (9 votes) · LW · GW

Oh, probably. I hear Luke has more real-life charisma... Though he kind of kills the "fosters a distrust of outside sources" with the amount he cites outside sources.

Quite a lot of charisma, but nothing near the level a cult leader would need to pull off a personality cult. (Although he could probably make up for this if he really wanted to by spending a few weeks reading up research on cult formation then applying it systematically as a 'how to' guide.)

comment by Swimmer963 · 2012-06-24T08:41:09.060Z · score: 3 (3 votes) · LW · GW

Quite a lot of charisma, but nothing near the level a cult leader would need to pull off a personality cult. (Although he could probably make up for this if he really wanted to by spending a few weeks reading up research on cult formation then applying it systematically as a 'how to' guide.)

I would like to see Lukeprog post an article on that topic. It would be fascinating.

comment by wedrifid · 2012-06-24T09:05:49.556Z · score: 5 (5 votes) · LW · GW

I would like to see Lukeprog post an article on that topic. It would be fascinating.

Fascinating but suboptimal signalling.

comment by Will_Newsome · 2012-06-21T04:44:22.406Z · score: -1 (1 votes) · LW · GW

This article implicitly positively reinforces positive reinforcement and negatively reinforces negative reinforcement. But there are situations in which negative reinforcement should be positively reinforced, e.g. if this article is in fact correct to negatively reinforce negative reinforcement. The article thus implicitly contradicts itself.

Yes, in the should-world we could've all learned to avoid putting our hands on hot stovetops simply by getting an M&M for every hour we managed to avoid putting our hands on hot stovetops. In the real world, learning via pain is simply a better mechanism. Sometimes it's a good idea to cultivate a paralyzing fear of being wrong, including a paralyzing fear of being wrong about cultivating a paralyzing fear of being wrong. Sometimes that's the only option you have if you want to reliably bind yourself to reality, instead of being insane like every other Panglossian happy-go-lucky fool.

So I really hope someday someone writes out a list of ways to efficiently torture yourself into having at least some hope of ultimately not being seen as obviously stupid in retrospect, to complement this article and perhaps adjust for any optimistic biases that might have crept in.

(Downvoters: So you agree with me that negative reinforcement has its place. Clever of you.)

comment by JGWeissman · 2012-06-21T04:54:11.633Z · score: 0 (0 votes) · LW · GW
comment by roland · 2012-06-22T22:58:11.402Z · score: -3 (13 votes) · LW · GW

Edit: relevant quotes from the post:

When trying to maintain order in a class, ignore unruly behavior and praise good behavior (Madsen et al. 1968; McNamara 1987).

To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Allyon 1981). The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."

Reward opinion-expressing to get people to express their opinions more often

Now that we all know this, shouldn't we abolish downvotes? From my personal experience the emotional impact of a downvote is extremely frustrating and not helpful at all. The message I get from a downvote is "You are wrong!" or "What you said doesn't agree with the group consensus so we will punish you for it!". I don't see this as constructive in any sense.

comment by RichardKennaway · 2012-06-23T07:48:41.174Z · score: 1 (7 votes) · LW · GW

The message I get from a downvote is "You are wrong!" or "What you said doesn't agree with the group consensus so we will punish you for it!".

The message I get from a downvote is "Someone did not like this." Obviously, that person is wrong. :-)

ETA: -2! Two people did not like this! I die. My brain turns into maggots which burst from my skull and multiply until they devour the world. All die. O the embarrassment.

comment by Jonathan_Graehl · 2012-06-22T23:17:27.500Z · score: 0 (6 votes) · LW · GW

I think downvotes are generally useful to other readers (though it's odd that the parent suggestion has one as I type), but I agree that people should be protected from the discouraging effect of an early, single downvote. So, why not postpone displaying the negative score to the user for long enough for possible upvotes to counter? (I don't volunteer to implement this).
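
A minimal sketch of that delayed-display idea, assuming a per-comment score and posting time are available. The function name, the six-hour grace period, and the choice to show 0 in place of a hidden negative score are all hypothetical illustrations, not part of any actual LessWrong code:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical grace period; the suggestion above does not name a length.
GRACE_PERIOD = timedelta(hours=6)


def displayed_score(score, posted_at, now, grace=GRACE_PERIOD):
    """Return the score to show the comment's author.

    A negative score is shown as 0 until the grace period has elapsed,
    giving later upvotes a chance to offset an early downvote.
    Non-negative scores are always shown unchanged.
    """
    if score < 0 and now - posted_at < grace:
        return 0
    return score


posted = datetime(2012, 6, 22, 23, 0, tzinfo=timezone.utc)
print(displayed_score(-1, posted, posted + timedelta(hours=1)))   # inside the grace period
print(displayed_score(-1, posted, posted + timedelta(hours=12)))  # after the grace period
print(displayed_score(3, posted, posted + timedelta(hours=1)))    # positive score, unaffected
```

Only what the author sees is delayed here; other readers could still be shown the true score, which keeps downvotes useful to them.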

comment by TimS · 2012-06-22T23:51:46.471Z · score: 5 (5 votes) · LW · GW

The fact that reinforcement can be very effective in changing frequency of behavior doesn't say that punishment should never be used to change the frequency of behavior.

Reinforcement is useful for increasing frequency of behavior. When decreased frequency of behavior is desired, punishment is the type of intervention to use. (For applied behavior analysis, those are the definitions of reinforcement and punishment).

comment by Jonathan_Graehl · 2012-06-22T23:56:21.329Z · score: 2 (2 votes) · LW · GW

Sure. Although I wasn't clear about this, I had in mind the common case of a non-punishing downvoter who simply disagrees with the comment (or wants to see less of its ilk) without saying why. In case punishment is the desired effect, you're right - immediate is better.

comment by wedrifid · 2012-06-23T02:58:38.182Z · score: 0 (0 votes) · LW · GW

When decreased frequency of behavior is desired, punishment is the type of intervention to use.

Either punishment or extinction (no punishment, no reward).

comment by TheOtherDave · 2012-06-23T01:43:11.740Z · score: 4 (4 votes) · LW · GW

Be aware that some people upvote comments "back to zero" that they wouldn't otherwise upvote. (Some other people consider this bad practice.)