Calibrate your self-assessments

scottalexander

post by Scott Alexander (Yvain) · 2011-10-09T23:26:01.432Z · LW · GW · Legacy · 122 comments

When I moved to Ireland, I knew that their school system, and in particular their examinations, would be different from the ones I was used to. I educated myself on them and by the time I took my first exam I thought I was reasonably prepared.

I walked out of my first examination almost certain I had failed. I remember emailing my parents, apologizing to them for my failure and promising I would do better when I repeated the class.

Then I got my results back, and learned I had passed with honors.

This situation repeated itself with depressing regularity over the next few semesters. Took exam, walked out in tears certain I had failed, made angsty complaints and apologies, got results back, celebrated. Eventually I decided that I might as well skip steps two to five and go straight to the celebrations.

This was harder than I expected. Just knowing that my feelings of abject failure usually turned out all right did not change those feelings of abject failure. I still walked out of each exam with the same gut certainty of disaster I had always had. What I did learn to do was ignore it: to force myself to walk home with a smile on my face and refuse to let myself dwell on the feelings of failure or take them seriously. And in this I was successful, and now the feelings of abject failure produce only a tiny twinge of stress.

In LW terminology, I am calibrating my self-assessment of examination success¹.

We appreciate objective measurements, like a percent score on an examination, or the running time of a marathon in minutes and seconds. But in the absence of such measurements, we use subjective mental estimates: how I feel I did on this exam, or how plausible that theory sounds.

The rationality literature has especially focused on one particular subjective mental estimate: our feelings of probability. For example, someone may say they feel 80% certain that Germany is larger than France. However, if they consistently answer questions like this with 80% confidence, and only get 60% right, then we say they are mis-calibrated: their subjective mental estimate of probability has a consistent mismatch with a more normatively correct probability. Calibration means revising your subjective mental estimate until it matches the objective value it tries to estimate; so that when you estimate something with 80% confidence, you get it right 80% of the time.
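The 80%-versus-60% example can be made concrete with a short sketch. It assumes you have kept a record of (stated confidence, whether you were right) pairs; the function name `calibration_table` and the sample data are hypothetical, chosen only to mirror the numbers in the paragraph above.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Group (stated_confidence, was_correct) pairs by stated confidence
    and return {confidence: (hit_rate, count)}."""
    buckets = defaultdict(list)
    for confidence, was_correct in predictions:
        buckets[confidence].append(was_correct)
    return {
        confidence: (sum(outcomes) / len(outcomes), len(outcomes))
        for confidence, outcomes in sorted(buckets.items())
    }


# Hypothetical record: ten answers given at 80% confidence, six of them right.
answers = [(0.8, True)] * 6 + [(0.8, False)] * 4

for confidence, (hit_rate, count) in calibration_table(answers).items():
    # A positive gap means overconfidence; a negative gap means underconfidence.
    gap = confidence - hit_rate
    print(f"stated {confidence:.0%}, actual {hit_rate:.0%} "
          f"over {count} answers, gap {gap:+.0%}")
```

A well-calibrated estimator would show a gap near zero in every confidence bucket, given enough answers per bucket for the hit rate to be meaningful.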

My story about exam scores is also a story about calibration. My subjective mental estimate of my exam scores was consistently too low; I would estimate I failed when I had really passed by a wide margin. By suppressing my original mental estimate and replacing it with one better informed by past experience, I am calibrating my estimate of exam scores.

Since passing my exams, I've identified other areas of my life where I need to calibrate my estimates:

-- Embarrassment. I used to be mortified if I answered a question wrong in class, assuming that people would judge me on it as long as they knew me. After thinking about it, I realized that although many people in my class answer questions wrong every day, I literally cannot remember a single one. If you pointed out any student in my class, even one of my close friends who I would be expected to pay extra attention to, and asked me "Has this person ever answered a question wrong in class?" I wouldn't be able to tell you. This suggests they won't remember my mistakes either, and that my subjective feeling of loss of respect on answering a question wrong is exaggerated to say the least².

-- Interestingness. I tend to think that if I talk about something I'm interested in, other people will be interested in it too. No matter how fascinating the underlying concept to me, nor how well I think I'm explaining it, this almost never happens.

-- Flirting. Through painful trial and error, I've found that my hunch that a woman likes me is almost always wrong. Someone will be flirting very heavily with me, and I'll think "there is no way in the world she's not into me", and then it will turn out she will not be into me.

These aren't just things I'm often wrong about; making a list of those would be a Sisyphean task. They're the things that I'm wrong about that my natural instincts never auto-correct, so that I know I'm going to keep being wrong unless I consciously calibrate my natural instincts against a reasoned opinion.

In general, I find I am most often miscalibrated in areas that relate to self evaluation. Cognitive psychology has a slew of ideas about so-called "self-assessment biases". You've probably heard the self-serving ones where 94% of professors rate their teaching ability above average, or how everyone thinks they're an above average driver, or how (ironically) everyone thinks they're less susceptible to biases than other people. But more surprisingly, I also find cases where people consistently underestimate themselves - like my own tendency to always think I've failed my examinations. I don't have a good explanation of this - I don't know if it's strategic humility, self-verification, some underlying depression-like state, or what - but I'm pretty sure it exists. And there are two situations in which I find it most common and most annoying.

The first involves good looks. Some people just have no idea how attractive they are or aren't. This is most obvious in body dysmorphic disorder, a condition where normal looking (or even very attractive) people somehow get it into their head that some feature of theirs - their nose, their hair, their weight - is inhumanly hideous and that they look like some kind of swamp monster. This is an officially recognized psychiatric disorder because it's completely divorced from reality - usually their nose or hair or whatever looks absolutely normal and just like everyone else's.

BDD is less rare than people think, at about one to two percent of the population, but even people without the full-fledged disorder can be really bad at determining how attractive they are or aren't. There are a lot of pretty girls who go around saying they're ugly in order to trick people into complimenting them, or to signal that they're available and not too picky, but I've come to realize that there are also a lot of pretty girls who genuinely believe they're ugly (it's less obvious in men, but I wouldn't be surprised if it were there under the surface).

And research agrees: studies show that people are uniquely bad at rating their own physical attractiveness. The opinions of unbiased observers evaluating a subject's attractiveness usually correlate at a level of r = .4 to .5; the opinion of the subject herself correlates with everyone else's only around the r = .2 level. Other studies using purportedly "objective" measures of attractiveness like facial symmetry report a similarly low level of correlation between the objective measures and the self-reports. What self-reported physical attractiveness correlates strongly with is not objective attractiveness, but self-reported self-esteem, with r values around .5 or .6 depending on the study.

If you're not so good at statistics - that means that people often agree on how attractive a particular subject is, but that subject's estimate of her own attractiveness is often completely different from everyone else's (in either direction), and more related to that subject's self-esteem than to reality.

Sites like hotornot.com or okcupid's MyBestFace have a lot of problems, most obviously that they depend a lot upon how good a specific photo is. But I think either is leagues ahead of trying to guess how attractive you are to others based on how attractive you feel. If you have any concern whatsoever about how attractive you are, the worst thing you can do is trust your own brain, especially if it's telling you you're probably pretty ugly when everyone around you seems to think you're okay.

Which brings me to the number one most tragic failure of the inside view I see in my friends, my acquaintances, and the psychiatric patients I encounter.

Nietzsche said that a casual stroll through an insane asylum shows that faith does not prove anything. Such an experience might also teach people to be skeptical of their own subjective valuation of themselves - their self-esteem. If our hypothetical visitor doesn't figure it out after seeing the depressed patients, who are obsessed with their own guilt and moral worthlessness to the point of confessing to any crime they hear about because it seems like the sort of thing someone as awful as themselves might do, she can go visit the schizophrenics with delusions of grandeur, who insist they are the next Jesus or Einstein, or God's chosen representative on Earth.

These people get locked away because their self-esteem is at an extreme no sane human would ever reach. But your location outside the insane asylum doesn't prove your own calculations of self-esteem come from reasoning processes that are any more valid. We all know self-obsessed narcissists without any real achievements to their name, and we all know people who insist that they are ugly and stupid and unlikeable even though they don't seem any worse off than anyone else.

Research confirms that people's self-esteem is poorly correlated with reality. Across many experiments with many different designs, people's self-reported likeability has no correlation with their likeability as reported by other people with whom they interact. This is true whether the experiment measures artificial interaction in a lab, simulated "dates" with people of the opposite sex, or the attitudes of their roommates.

There are no studies correlating self-reported morality with experimentally determined morality, but if you want to conduct one, you could probably gather enough secretly gay evangelical ministers and adulterous family-values politicians to make up a pretty good sample size.

If you hate yourself and think you're worthless, take a moment to consider whether you have any evidence that you're objectively doing any worse than anyone else, or whether you just have a low self-esteem set point. If the latter appears to be true, then try to replace the inside view with the outside view when worrying about how much bother you're being to other people or whether you "deserve" to be happy.

(If your problem is in the other direction you may not have as much vested interest in correcting yourself, but do keep in mind that most of the purported benefits of self-confidence have been exaggerated.)

SUMMARY

People's subjective mental estimates are often way off, especially when they're estimating qualities closely linked to their self-worth. Both everyday experience and scientific research provide ample evidence of people who both underestimate and overestimate themselves in various ways. If you worry you may be one of those people, try and get objective estimates of the parameter you're concerned about from other people or from empirical testing. Then make an effort of will to consciously replace your subjective estimates with your new better-calibrated estimates.

FOOTNOTES

1: This could also be interpreted as replacing the Inside View with the Outside View and this would also be a good moral to draw from the story; I'm phrasing it in terms of calibration because it's more appropriate for some of the other examples later down.

2: See Gilovich, Medvec, and Savitsky, 2000 for experimental proof of the same idea.

122 comments

Comments sorted by top scores.

comment by Vladimir_M · 2011-10-09T23:31:58.857Z · LW(p) · GW(p)

Through painful trial and error, I've found that my hunch that a woman likes me is almost always wrong. Someone will be flirting very heavily with me, and I'll think "there is no way in the world she's not into me", and then it will turn out she will not be into me.

It may also be that you are recognizing the indications of interest correctly, but then screwing things up with your follow-up behavior. Usually, female attraction in the initial phases is easy to destroy with even a single serious misstep. (And it may be serious even if it seems insignificant or altogether non-obvious to you.)

Replies from: HughRistik, HughRistik

↑ comment by HughRistik · 2011-10-22T07:03:35.741Z · LW(p) · GW(p)

Flirting. Through painful trial and error, I've found that my hunch that a woman likes me is almost always wrong. Someone will be flirting very heavily with me, and I'll think "there is no way in the world she's not into me", and then it will turn out she will not be into me.

Another possibility behind this, in addition to Vladimir_M's excellent hypothesis:

There is a small percentage of women who look like they are flirting with everyone, even though they are merely being friendly.

If 5% of women are flirtatious with no attraction, they could still dominate Yvain's flirting experience if the base rate of attracted women flirting with him is low. Meanwhile, perhaps 100% of the women attracted to Yvain are a sort of serious or shy type who isn't into overt flirting. Then both P(attraction | flirting) and P(flirting | attraction) could be low.

In contrast, another man could experience that same 5% of merely friendly flirts, but also run into a greater number of attracted flirts, making most of his flirting experiences indicative of attraction.
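The base-rate argument above can be worked through with Bayes' rule. This is only a sketch: it reads the "5%" as the chance that a woman who is not attracted flirts anyway, and the 10% attraction rate and 80% flirting rate for the second man are made-up numbers for illustration. The function name is hypothetical too.

```python
def p_attraction_given_flirting(p_attracted, p_flirt_if_attracted, p_flirt_if_not):
    """Bayes' rule: P(attraction | flirting)."""
    flirt_and_attracted = p_attracted * p_flirt_if_attracted
    flirt_and_not = (1 - p_attracted) * p_flirt_if_not
    total = flirt_and_attracted + flirt_and_not
    # If nobody ever flirts, the conditional probability is undefined; return 0.0.
    return flirt_and_attracted / total if total else 0.0


# First scenario: every attracted woman is the shy type and never flirts overtly.
shy_case = p_attraction_given_flirting(
    p_attracted=0.10,           # assumed base rate, not from the comment
    p_flirt_if_attracted=0.0,   # "100% ... isn't into overt flirting"
    p_flirt_if_not=0.05,        # the comment's 5% of friendly flirts
)

# Second scenario: same friendly flirts, but most attracted women do flirt.
other_man = p_attraction_given_flirting(
    p_attracted=0.10,           # assumed
    p_flirt_if_attracted=0.80,  # assumed
    p_flirt_if_not=0.05,
)

print(f"P(attraction | flirting), shy case:  {shy_case:.2f}")
print(f"P(attraction | flirting), other man: {other_man:.2f}")
```

Under these assumed numbers the first man's flirting experiences carry no evidence of attraction at all, while for the second man roughly two flirts in three come from someone who is attracted.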

↑ comment by HughRistik · 2011-10-22T06:12:32.400Z · LW(p) · GW(p)

I agree. Failing to recognize sex differences in attraction (particularly greater female selectiveness and preferences for behavior and personality traits) will sabotage males, and leave females turned off and creeped out.

comment by Prismattic · 2011-10-10T02:33:51.631Z · LW(p) · GW(p)

I've noticed that I have a particular form of calibration problem, for which I don't know if there is a specific term. Tentatively, I'm calling it "pernicious sliding selective self-assessment."

What I mean by this is that all of my achievements become diminished in my own eyes, because my frame of reference for comparison gradually excludes people who haven't reached at least an equal level of achievement:

-- When I started working out, I gradually came to ignore the 80% (or whatever it is) of the population that is sedentary and could only compare myself to the people I see at the gym, a disproportionate number of whom make me look weak by comparison.

-- Similarly, when I took up a martial art, I ended up comparing myself not to the population as a whole, but to the more advanced practitioners, thus feeling incompetent.

-- I have a lot of academic accomplishments, including a degree summa cum laude from a competitive university and a Fulbright fellowship. Yet I suffer horribly from imposter syndrome, in part because my frame of reference for comparison gradually weeds out anyone who isn't also academically accomplished.

Unfortunately, the fact that I am aware this is happening doesn't seem to help overcome it.

Replies from: ciphergoth, John_Maxwell_IV, kalla724, taryneast, duckduckMOO, JenniferRM

↑ comment by Paul Crowley (ciphergoth) · 2011-10-10T08:35:30.335Z · LW(p) · GW(p)

I remember the shock of going to my first crypto conference and realizing that I was nowhere near being the smartest person in the room. From there, it seemed to me that unless you were at the very top of your profession, you were always going to compare yourself to your immediate superiors and feel bad. However, I'm reliably informed that in at least one instance, being at the very pinnacle of a world-respected field of endeavour is not enough to feel good about your abilities.

Replies from: Prismattic

↑ comment by Prismattic · 2011-10-10T17:04:24.446Z · LW(p) · GW(p)

It occurs to me that I also neglected to include participating on Lesswrong in my list. It's a slightly different phenomenon, but here the local sample is so skewed in terms of intelligence that even those of us with IQs 2 or 3 standard deviations above the mean can be quietly nursing the humiliating thought that maybe we are idiots after all.

That is especially so for those of us who excel more in verbal intelligence than in math and programming capabilities.

Replies from: Swimmer963, AspiringRationalist

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2011-10-11T13:41:17.089Z · LW(p) · GW(p)

So I'm not the only one who's found that!

When I was younger, for reasons that I don't understand well now, I really didn't want to be defined by "intelligence." People often told me that I was smart, and that because I was smart, I ought to do x, y, z (be a biologist, be a physicist, whatever, and if it was a teacher, it was usually the subject they taught.) Which prompted me not to want to do x, y, z even though I found pretty much all subjects fascinating.

So I went into nursing, where a lot of the material (practical skills and empathy-based skills) involves stuff I'm not naturally good at...and all of a sudden intelligence is something I want to prove, and the fact that most people on LW are smarter than I am bothers me way more than it should.

↑ comment by NoSignalNoNoise (AspiringRationalist) · 2012-03-18T21:00:17.328Z · LW(p) · GW(p)

I have found that LessWrong has the opposite effect on me. While I think that I am less rational and less intelligent than the average person here (or perhaps the availability-weighted average?), my main cognitive response has been an increase in self-esteem.

Strangely, in college, where there was also an abundance of people smarter than me, my response was a general feeling of inferiority.

I would hypothesize (~40% confidence) that the source of this difference is a sense of competing with my college classmates for jobs vs. aspiring to gain the abilities that others here have.

↑ comment by John_Maxwell (John_Maxwell_IV) · 2011-10-10T05:31:07.584Z · LW(p) · GW(p)

Will it help if I congratulate you on your academic accomplishments?

↑ comment by kalla724 · 2011-10-10T18:45:59.276Z · LW(p) · GW(p)

This is a VERY common form of anxiety. The good thing is, it is fairly well understood, and in the vast majority of cases, the problem responds really, really well to CBT interventions. I'm familiar with cases where a dozen sessions (and a willing commitment to overcoming the issue) completely and permanently solved the problem. Have you tried something like that?

In case you haven't, I highly recommend it. Even if you don't want/can't afford treatment right now, you could try basic CBT on your own (keeping thought records can be particularly helpful). It is likely to be less effective than guided therapy, but it is certainly cheaper.

↑ comment by taryneast · 2011-10-10T13:07:41.384Z · LW(p) · GW(p)

I've noticed this trend myself. I also see it most frequently amongst my upper-middle-class friends (as opposed to lower-middle or working class friends). Amongst said group, just passing or even passing well isn't enough - you have to be top of the class, or you aren't anybody special.

It's an extremely judgmental attitude and very difficult to live up to. I don't know about the men, but it seems to make for hyperactive, control freak women... or for early nervous breakdowns. Nasty destructive cycles crop up pretty often too.

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2011-10-12T14:24:00.546Z · LW(p) · GW(p)

Three books on common, inhumanly stringent standards.

Compassion and Self-Hate by Theodore Rubin. (Self-hatred as a semi-autonomous mental habit, with compassion as the only way out)

I Thought It Was Just Me (but it isn't) by Brene Brown. (Women and shame, with a claim that women are haunted by incompatible standards, while men are haunted by a single unachievable standard-- I'm not sure this is true, but I'm keeping an eye out for evidence one way or the other. )

Perfect Girls, Starving Daughters: The Frightening New Normalcy of Hating Your Body by Courtney Martin (Eating disorders among high-achieving young women.)

↑ comment by duckduckMOO · 2011-11-07T00:06:58.998Z · LW(p) · GW(p)

I don't see the problem. You have high standards. It would be crazy to compare yourself to an average person in each of these situations. Do you really want to feel good about lifting more weight than the average sedentary person as a gym-goer? In martial arts specifically I think you should always be comparing yourself to the person directly above you. It's competitive and it's self-improvement. Slightly better is what you should be next week. A lot better is what you should be next year. The "impostor syndrome" (yes, those are scare quotes) seems like a separate issue to me. Comparing yourself to those around you might make you feel insecure and untalented if you have a bias towards overrating others or if you are less talented, but that only makes your achievements more impressive.

The scare quotes are because it seems to be assumed that anyone who has an accomplishment deserves it. Some people must luck out. If we're not going to just reject the notion of deserving entirely there must be some people who don't deserve their accomplishments and as a result feel like they don't deserve their accomplishments. Additionally, feeling like you don't deserve your accomplishment, even if most people feel like they do, doesn't mean you're pathological. People have different standards for considering themselves deserving. Some are way off one end of the bell curve but that doesn't mean there's anything wrong. When you consider yourself competent or deserving is a personal judgement. There's nothing inconsistent in an above average or even elite person thinking they are incompetent.

↑ comment by JenniferRM · 2011-10-12T19:49:18.418Z · LW(p) · GW(p)

The change in your basis of comparison is probably quite common, and is probably part of the cause for the Dunning-Kruger effect...

...a cognitive bias in which unskilled people make poor decisions and reach erroneous conclusions, but their incompetence denies them the metacognitive ability to recognize their mistakes. The unskilled therefore suffer from illusory superiority, rating their ability as above average, much higher than it actually is, while the highly skilled underrate their own abilities, suffering from illusory inferiority.

It is worth pointing out that the effect is more dramatic and common in Americans relative to Europeans and actually appears to be reversed in people from East Asia. In other words: culture matters, and it wouldn't surprise me if outlier-types (who consciously self modify) can exaggerate, reverse, or correct for the effect by exposing themselves to some kind of reminders, training, and/or social context.

comment by kalla724 · 2011-10-10T19:01:53.660Z · LW(p) · GW(p)

I tend to think that if I talk about something I'm interested in, other people will be interested in it too. No matter how fascinating the underlying concept to me, nor how well I think I'm explaining it, this almost never happens.

Heh, I hear you on this one.

My initial response was to try to "fit in better" by simply avoiding those topics and sticking to smalltalk. Worked really well for the purpose, but also made me feel stupid - and worse, it drastically reduced the number of interesting people I got to know. Essentially, while most people found the stuff I care about horribly boring, occasionally I would run into someone to have a worthwhile chat with; but if you eliminate all the possibilities for such chats, they never happen.

So I resorted to introductory stories: interesting anecdotes and personal tales that have something to do with the subjects I'm interested in, even if they are tangential to the main area. If you draw people into the subject first, they are much more likely to allow you to expound on the boring bits as well.

For example, a side interest of mine is the factors that change people's motivation patterns. Essentially impossible to talk about directly. But I can introduce the subject in two steps. First I start smalltalk about the financial crisis and the motivations of bankers responsible for it. Then I segue into short descriptions of some of Dan Ariely's experiments (where he showed that amount of theft increases proportionally with the number of steps a person is removed from actual, physical money), which can be described in a way that most people find very intriguing. THEN you can go into more technical details, and have a spellbound audience.

All of the above is probably a very obvious approach, but was quite a revelation to me. We all have our little blind spots...

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2011-10-12T14:26:31.395Z · LW(p) · GW(p)

I'm interested in seeing your motivation patterns summary expanded into a whole article.

Replies from: kalla724, None, AmagicalFishy, pianocigarette

↑ comment by kalla724 · 2011-11-02T20:39:25.573Z · LW(p) · GW(p)

Thanks. :) That could be done - however, I'm relatively new to this community, I come by fairly infrequently (as you can see by the delay in this response), and I have no idea how to move in that direction. Any advice?

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2011-11-03T04:42:05.734Z · LW(p) · GW(p)

I'd probably be content with as many detailed examples as you want to post.

Other things which could go into an article: how you came to that strategy and anything you've done to refine it, and links to research you use in your examples or that you've used to develop that strategy.

↑ comment by [deleted] · 2011-10-17T00:53:14.263Z · LW(p) · GW(p)

I third it. Motion passed?

↑ comment by AmagicalFishy · 2011-10-14T05:28:29.244Z · LW(p) · GW(p)

I second this!

↑ comment by pianocigarette · 2011-10-14T14:41:04.904Z · LW(p) · GW(p)

Aye!

comment by thelittledoctor · 2011-10-10T01:23:20.450Z · LW(p) · GW(p)

On the subject of body dysmorphic disorder (WARNING: Some gooey personal details ahead, and notes on how I more-or-less fixed mine):

I am male, and have been told that I have it (and by my psychiatrist, even!). To me, therefore, this is a particularly good example of the discrepancy between automatic self-evaluation and abstract self-evaluation; I could abstractly note that people seemed to find me attractive, and that I was in many ways on the right side of the physical appearance bell curve, but looking in the mirror inspired nothing but revulsion.

So, of course, this became the first subject at which I attempted mindhacking. I started with the typical self-help-book advice: Look in the mirror every morning, and tell yourself you're attractive. This was, somewhat surprisingly to me, partway successful - but only partway. The result was that my subjective evaluation eventually transformed to "I could be so good-looking, if it weren't for X, Y, and Z!" (although really the list I made had a lot more than 3 items). The primary feature to which the fixation reduced at this stage was my nose - rather than go into the specifics, allow me simply to say that I disliked it very much. So, I thought, this halfway worked - why not meet it from the other direction? I'll get a nose job.

So I got a nose job. And I didn't tell anyone I was getting one, figuring I could see if there was a change in their reactions (Results: some people figured it out, some did not. Of those who did not, the most common comment was "Did you get your teeth whitened?".) The change to my subjective self-evaluations, after a settling-in phase, was quite the opposite of the usual "plastic surgery will never make you happy/satisfied" advice we hear - my perception of my own attractiveness rose substantially. The other items on the list did not go away, but with the primary one gone, I self-evaluated much more positively.

So, I thought, I'll give the self-help method some more time. I started playing with lighting around and above my bathroom mirror, which made for an interesting effect - using more flattering lighting improved my self-evaluation and feelings of attractiveness for that day, even though I abstractly know it doesn't make a difference outside, but having the ability to switch lightings around in general also helped me grok that one's toothsomeness is not a fixed value; it very much changes based on the situation.

Grokking that particular fact produced another increase in my self-evaluation of my appearance, albeit a smaller one, and also put me on another track - what other circumstantial things that I've been neglecting could help? This post grows long, but basically: I fixed my posture and started dressing nicer (where "nicer" simply meant anything that I self-evaluated positively when looking in the mirror) and tried various new things with my hair (which had been very long and shaggy). Both helped notably. I also opted to try more elective procedures - first Lasik (which went swimmingly - worthwhile even if you don't think your glasses make you look ugly; having 20/15 vision is AWESOME) then electrolysis. Both of these also seemed successful, although nothing produced the dramatic change in my self-evaluation that rhinoplasty did.

In sum, it does seem to be possible, at least in my case (and n=1 so p >>>.05, obviously) to remove physical attractiveness as a major daily concern for someone with BDD. I still think about my appearance, and sometimes contemplate things to change/improve, but no longer have angst or pervasive worries about it. Since my self-evaluation is no longer so overpoweringly negative, I seem to be able to just take people at their word (or body language) about my appearance these days. This is not quite the same as perfecting my ability to self-evaluate, but I've manually pushed and prodded my self-evaluation close enough to the outside consensus that outside evaluations are now responsible for much of the daily/weekly/monthly variation.

Replies from: pianocigarette, Logos01

↑ comment by pianocigarette · 2011-10-14T15:02:52.977Z · LW(p) · GW(p)

Forgive my noobity, but what are n, p in n=1 so p>>>.05?

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2011-10-14T15:09:00.247Z · LW(p) · GW(p)

In this context n is the sample size (the number of people), and p is the probability that a result like this would arise by chance alone. In classical statistics one generally cares about results that have p < .05 (that is, there's a less than 1/20 chance of the result occurring due to random chance).
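To see why a single case cannot clear that bar, here is a rough sketch using an exact one-sided binomial test. The 50/50 null hypothesis is purely an illustrative assumption (nothing in the thread specifies one), and `one_sided_p_value` is a hypothetical helper, not a library function.

```python
from math import comb


def one_sided_p_value(successes, n, p_null=0.5):
    """Probability of seeing at least `successes` successes in n trials
    if each trial succeeds with probability p_null (the null hypothesis)."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )


# One person improved out of one person observed, under an assumed 50/50 null.
print(one_sided_p_value(1, 1))    # 0.5, far above .05

# Nine out of ten would be a different story under the same assumed null.
print(one_sided_p_value(9, 10))   # 11/1024, about 0.011
```

With n = 1 the smallest p-value this test can produce is the null probability itself, so no single observation reaches p < .05 unless the null makes that outcome very unlikely to begin with.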

Replies from: Apprentice

↑ comment by Apprentice · 2011-10-18T23:59:08.723Z · LW(p) · GW(p)

Yes, but it's worth spelling out the heresy more explicitly. A good LW Bayesian isn't supposed to regard the number .05 as somehow special. Also, the whole notion of p values is misguided and counterproductive. Any more questions?

↑ comment by Logos01 · 2011-10-10T10:44:28.069Z · LW(p) · GW(p)

The change to my subjective self-evaluations, after a settling-in phase, was quite the opposite of the usual "plastic surgery will never make you happy/satisfied" advice we hear - my perception of my own attractiveness rose substantially. The other items on the list did not go away, but with the primary one gone, I self-evaluated much more positively.

How much of this reaction (decline in severity of BDD) was from the material change observable directly by you, and how much of this reaction was due to the failure of others to notice any change and yet still rate you as attractive (in your opinion, if you can even guess)? I'm curious.

Replies from: thelittledoctor

↑ comment by thelittledoctor · 2011-10-10T12:32:22.258Z · LW(p) · GW(p)

I should clarify - though a lot of people didn't figure out that it was specifically my nose that had changed, most people with whom I had any semi-regular interaction noticed that something was different about me, and commented to that effect (and of course there were certain groups - flatmates, lovers, my anatomy teacher - who immediately realized it was a nose job). The lesson I took away from that was that while people do evaluate your attractiveness when they see you, mostly they don't cache much more than a general positive or negative impression of your appearance. Thus when something changes for the better, they think "Ey looks better" but can't quite place why. I should have realized this before - the number of times I've thought someone looked different but not been able to place why defies counting.

In light of that, I would say it was definitely partly due to the reactions of others (positive without knowing why) and partly due to being able to observe a material change myself, but as to their proportions I have only speculation. As a hunch, I would guess that the latter was a larger factor - my self-evaluations after the settling-in period were pretty close to what I imagined they'd be, when I was visualizing myself with a new nose before the operation.

comment by KPier · 2011-10-09T17:31:04.879Z · LW(p) · GW(p)

Since I first read about calibration on LessWrong, I've been trying this with tests and debate tournaments.

With a sample size of about 50: 95% of my estimated test grades are within 3% of my actual test grades.

On debate, however, if I am 60% confident I won a round, I won it 90% of the time; if I am 80% confident I won, I win 100% of the time. Other people seem to be much better than me at assessing the probability I won a debate round (if they observed it).

It seems that I am really good at some forms of estimating, and really bad in other situations, which means that overall switching from Inside View to Outside View wouldn't necessarily be an improvement, but that in certain situations it would help me enormously. Has anyone else encountered this?
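For anyone who wants to run the same experiment, here is a minimal sketch of the bookkeeping: group each prediction by the confidence you stated, then compare it with how often you were actually right. The function name and the sample record are hypothetical, chosen only to mirror the 60%/80% pattern described above.

```python
from collections import defaultdict

def calibration_table(predictions):
    """Group (stated_confidence, outcome) pairs by stated confidence.

    Returns {confidence: (observed_hit_rate, number_of_predictions)}.
    """
    buckets = defaultdict(list)
    for confidence, won in predictions:
        buckets[confidence].append(1 if won else 0)
    return {c: (sum(v) / len(v), len(v)) for c, v in sorted(buckets.items())}

# Hypothetical record: ten rounds called at 60%, five rounds called at 80%.
rounds = [(0.6, True)] * 9 + [(0.6, False)] + [(0.8, True)] * 5
table = calibration_table(rounds)

for confidence, (hit_rate, n) in table.items():
    gap = hit_rate - confidence  # a positive gap means underconfidence
    print(f"stated {confidence:.0%}  observed {hit_rate:.0%}  n={n}  gap {gap:+.0%}")
```

With only a handful of predictions per bucket the observed rates are noisy, so a large gap is worth taking seriously only once the counts grow.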

Replies from: AShepard, TheOtherDave

↑ comment by AShepard · 2011-10-09T21:54:52.107Z · LW(p) · GW(p)

Interesting that your debate predictions tend too low. In my debate experience, nearly everyone consistently overestimated their likelihood of winning a given round. This bias tended to increase the better the debaters perceived themselves to be.

Replies from: KPier

↑ comment by KPier · 2011-10-10T00:53:39.307Z · LW(p) · GW(p)

I think a lot of debaters I know fall into the general trap of believing the things they argue. In a debate round, you have to be focused on the mentality of "I'm winning", or you won't be able to convince the judge of that; I am probably atypical in that I notice that kind of self-deception and apparently overcorrect for it. I've convinced a number of my teammates to try this experiment as well, and most of them follow the trend you noticed.

Replies from: FiftyTwo

↑ comment by FiftyTwo · 2011-10-10T01:24:25.103Z · LW(p) · GW(p)

My own experience of debating is that while I can estimate the 'strategic' side relatively effectively I find it more difficult to predict whether the judges accept an individual argument. I've noticed this as a problem with several debaters, often due to the inferential gaps between them and the judges (e.g. assuming some psychological/philosophical/economic concept is intuitively obvious).

[Incidentally, I'm involved in UK bp debating, so if that makes it probable we've met pm me a name or a hint. ]

Replies from: KPier

↑ comment by KPier · 2011-10-10T01:35:42.725Z · LW(p) · GW(p)

Nope, US high school policy. I'm thinking of writing an article on debate and rationality (though not until after I'm done applying to college, which will be January); if you'd have something to say about that, PM me.

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-20T08:54:23.843Z · LW(p) · GW(p)

Could debate tournaments be to some extent responsible for the extremely irritating, counterproductive arguments online where you are left wondering what exactly convinced the other side so thoroughly, and why they won't tell you what it is? I never did debates at school.

↑ comment by TheOtherDave · 2011-10-09T18:57:09.598Z · LW(p) · GW(p)

I've encountered similar things insofar as I'm better calibrated for some tasks than others. And I agree with you that defining the right reference classes for when to trust my estimations vs. when to trust the outside view (and which outside views to trust) is important.

I'm curious: if you re-express your data set in terms of standard deviations... e.g., the percentage of your estimated test grades that are within a std dev of the correct answer... rather than absolute percentages, do you still get very different results in the two cases?

Replies from: KPier

↑ comment by KPier · 2011-10-10T00:56:44.693Z · LW(p) · GW(p)

Maybe I'm being really stupid, but how exactly would I define a standard deviation of the correct answer? Using the distribution for the whole class?

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2011-10-10T04:33:32.032Z · LW(p) · GW(p)

I meant within the set of your 50 test scores, assuming they're normalized to a common range.

To pick an extreme example: if all your test scores fall between 92% and 98%, it becomes less remarkable that your estimations of your test scores all fall within 3% of your actual test scores... anyone else could do about as well, given that fact about the data set. So it seems that knowing something about the distribution is helpful in reasoning about the causes of the differences in the accuracy of your judgments.
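A rough sketch of that baseline check, under the assumption that "anyone else" simply guesses the midpoint of the score range every time. Both score lists are hypothetical, and the 3-point tolerance is taken from the claim being discussed.

```python
def fraction_within(predictions, actuals, tolerance=3.0):
    """Share of predictions that land within `tolerance` points of the actual score."""
    hits = sum(1 for p, a in zip(predictions, actuals) if abs(p - a) <= tolerance)
    return hits / len(actuals)

def constant_guess(actuals):
    """A no-skill baseline: always guess the midpoint of the observed range."""
    midpoint = (min(actuals) + max(actuals)) / 2
    return [midpoint] * len(actuals)

narrow_scores = [92, 94, 95, 96, 98, 93]   # hypothetical: everything between 92 and 98
wide_scores = [60, 75, 88, 95, 70, 100]    # hypothetical: widely spread scores

narrow_baseline = fraction_within(constant_guess(narrow_scores), narrow_scores)
wide_baseline = fraction_within(constant_guess(wide_scores), wide_scores)

print(narrow_baseline, wide_baseline)
```

If the no-skill baseline already scores close to your own hit rate, the accuracy says more about the data set than about your judgment.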

Replies from: KPier

↑ comment by KPier · 2011-10-11T00:56:47.734Z · LW(p) · GW(p)

Oh, that makes sense.

Nope, still a big difference. For example, here are my scores from the last few weeks:

Predicted/Actual: 98/100 72/72.5 94/94 85/86 82.5/87.5 90/92
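As a quick way to read those pairs, the sketch below computes the signed error for each one (predicted minus actual, so a negative value means the prediction was too low). The variable names are illustrative; the numbers are copied from the line above.

```python
# Predicted/actual pairs copied from the comment above.
scores = [(98, 100), (72, 72.5), (94, 94), (85, 86), (82.5, 87.5), (90, 92)]

# Signed error: negative means the prediction was lower than the actual grade.
signed_errors = [predicted - actual for predicted, actual in scores]

mean_signed_error = sum(signed_errors) / len(signed_errors)
max_abs_error = max(abs(e) for e in signed_errors)
overestimates = sum(1 for e in signed_errors if e > 0)

print(signed_errors)
print(mean_signed_error, max_abs_error, overestimates)
```

The mean signed error measures bias (a consistent lean in one direction), while the largest absolute error measures how far off the worst single guess was; the two can differ a lot.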

Replies from: Luke_A_Somers

↑ comment by Luke_A_Somers · 2011-10-11T14:26:42.418Z · LW(p) · GW(p)

Interesting that there were no too-high predictions.

comment by XiXiDu · 2011-10-09T14:33:57.469Z · LW(p) · GW(p)

I walked out of my first examination almost certain I had failed. I got my results back, and learned I had passed with honors. This situation repeated itself with depressing regularity over the next few semesters.

Back during my schooldays I was often certain that I had failed a test, just like many other people in my class. It turned out that most of the time the others didn't fail the test while I did. I concluded that they were for some reason just bullshitting about their self-assessments; it didn't occur to me that they just repeatedly failed to accurately predict their own success.

Replies from: Solvent, taryneast

↑ comment by Solvent · 2011-10-11T07:29:05.488Z · LW(p) · GW(p)

I often walk out of an exam thinking I did brilliantly, only to be highly surprised with my crap grades later.

Replies from: Swimmer963, taelor

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2011-10-11T13:45:16.507Z · LW(p) · GW(p)

My brother said to me a few days ago that "whenever I think I've done well, I've done terribly, and whenever I think I've done terribly, I've done well."

Replies from: Solvent

↑ comment by Solvent · 2011-10-12T06:11:23.480Z · LW(p) · GW(p)

Hrmm. That one could be related to impostor syndrome, maybe. However, what this thread has really established is that people can have any possible combination of experiences with test results (always thinking you've done badly, always thinking you've done well, thinking the opposite of what actually happened, and actually being right.)

↑ comment by taelor · 2011-10-12T22:29:58.108Z · LW(p) · GW(p)

Interestingly, just yesterday, I got back the results of a math test and an English paper. I was convinced I was firmly in the B+ range on the test, and ended up with a C-; I was also convinced that I was in the C+ range on the paper, and got a solid A. I've noticed that my expectations are often inaccurate, but they're accurate enough that I'm hesitant to simply negate them.

↑ comment by taryneast · 2011-10-10T13:02:10.746Z · LW(p) · GW(p)

:) yet another example of "never attribute to malice what can easily be ascribed to incompetence"

Replies from: AspiringRationalist

↑ comment by NoSignalNoNoise (AspiringRationalist) · 2012-03-18T21:12:35.929Z · LW(p) · GW(p)

As soon as this catches on, malicious people will learn to appear incompetent.

comment by spriteless · 2011-10-12T19:31:51.275Z · LW(p) · GW(p)

You know, not sensing people's emotions from their faces is seeming less like a handicap and more like I am immune to illusions every time I hear about the biases most people have. Then I realize I don't expect people to be feeling something unless they tell me anyways, and that's potentially just as inaccurate. Then I realize I might not be compensating exactly enough, and worry recursively.

Replies from: fredspeaking

↑ comment by fredspeaking · 2012-03-22T17:43:28.095Z · LW(p) · GW(p)

Have you ever considered studying Facial Action Coding System?

Replies from: spriteless

↑ comment by spriteless · 2012-03-25T21:31:54.520Z · LW(p) · GW(p)

Thanks for the link. I have been reading people's faces for a while, but there's a second or so of lag, and I can miss things.

comment by Bobertron · 2011-10-10T12:03:32.123Z · LW(p) · GW(p)

If you hate yourself and think you're worthless, take a moment to consider whether you have any evidence that you're objectively doing any worse than anyone else

Okay, suppose I'm objectively doing worse than everyone else, is that a reason to hate myself? I don't think so.

Replies from: Vaniver

↑ comment by Vaniver · 2011-10-10T14:04:36.821Z · LW(p) · GW(p)

If statements are uni-directional.

Replies from: Bobertron

↑ comment by Bobertron · 2011-10-10T14:40:12.184Z · LW(p) · GW(p)

Let's test that:

If you take a moment to consider whether you have any evidence that you're objectively doing any worse than anyone else, then hate yourself and think you're worthless.

You are right that's not what Yvain said. If statements really are uni-directional ;-)

However, if you are interested in a realistic assessment, you should consider how you are objectively doing even if you don't hate yourself. If you are interested in your own wellbeing, you shouldn't hate yourself even if you are doing worse.

It's true, if you hate yourself it might be more likely that you don't assess yourself objectively, so the advice has some merit, but I think it can strengthen the idea that if you are doing worse than everyone else, you should hate yourself (even if it is not intended or logically implied).

Replies from: Vaniver

↑ comment by Vaniver · 2011-10-10T16:45:56.187Z · LW(p) · GW(p)

I agree with you that care needs to be taken that exercises intended to increase cheer have a low risk of decreasing cheer, but my experience with depressed people is that experimenting until you find something that works for them is fairly low-cost. They're already trapped in a spiral of negative thoughts; what matters is the difference between the old thought and the new thought. If the old thought was -5 and the new thought is -6, not much has changed; but if the new thought is a 2, then things are looking up.

The obvious solution to any mood problem is "stop it," and the obvious approach to self-loathing is "don't do it." Pointing those things out is not particularly helpful. (That is how I interpret your criticism of Yvain's statement; the implicit proposal is "instead of giving them a non-rigorous reason to not hate themselves, don't give a reason.")

Replies from: Bobertron

↑ comment by Bobertron · 2011-10-10T17:23:55.098Z · LW(p) · GW(p)

instead of giving them a non-rigorous reason to not hate themselves, don't give a reason

I think there are more options than just a more realistic assessment of how well someone is doing at something. For example if someone thinks he/she is fat, one could point out that there is more to attractiveness than body weight or that there is more to a person than physical attractiveness. If the person is not really fat, pointing that out would be helpful, too, but while doing so the belief that body weight is important shouldn't be reinforced.

I agree that just saying "stop it" is hardly helpful and it was not my intention to claim that it is.

comment by juliawise · 2011-10-14T14:25:35.913Z · LW(p) · GW(p)

These people get locked away because their self-esteem is at an extreme no sane human would ever reach

I work in a psychiatric hospital, and I don't think self-esteem has much to do with it. You only actually get locked away if a doctor can convince a judge you're going to get yourself or someone else physically hurt. (To be fair, this was different in Nietzsche's day.)

I'd say the vast majority of people with poor reality testing, including inaccurate self-assessments, are living in the community rather than in locked wards. Which does reinforce your message: it could be any of us.

comment by FiftyTwo · 2011-10-10T01:11:29.630Z · LW(p) · GW(p)

I've always had this problem when asked to self assess on personality traits. E.g. 'Are you extroverted?' Compared to what baseline? My friends? Some hypothetical average person?

Replies from: scav

↑ comment by scav · 2011-10-10T08:08:08.398Z · LW(p) · GW(p)

Not a problem: you have correctly identified a meaningless question. The correct answer is "mu".

I have read, and maybe don't take my word for it, that so called "personality traits" don't reliably correspond to objective properties of a person. Or rather, that we apply them as labels to a person on seeing one set of behaviour we associate with it, but they aren't predictors of any of the other kinds of behaviour that typically get assigned the same label.

I don't have a reference to it here at work, but at home I have a psychology book which contains a paper about it. Maybe someone else here will have a clearer memory of what I'm talking about?

Anyway, that's my main objection to so-called personality tests that give you a numeric personality score on one or more axes. I just want to ask "what are the units on that number?", "how was this calibrated against objective measurements?". The Myers-Briggs test goes one step further and combines 4 uncalibrated numeric scores into a "personality type". Might as well be a horoscope.

Replies from: Morendil

↑ comment by Morendil · 2011-10-10T08:44:36.614Z · LW(p) · GW(p)

clearer memory of what I'm talking about

This paper by Borsboom (pdf) maybe? (Specifically attacking the Big Five.)

comment by dreeves · 2011-10-10T06:26:58.656Z · LW(p) · GW(p)

I'm not sure how much this relates to the question of calibrating self-assessments in particular, but when I was at Yahoo, a few of us made this game -- http://yootles.com/calibration/guessum -- with the objective of being a fun way to train yourself to be better calibrated, at least for mundane estimation tasks. It was based on this article: http://messymatters.com/calibration (and especially the follow-on post with the results, and the speculation that people could learn to be better calibrated, hence the game).

Replies from: Morendil, AspiringRationalist

↑ comment by Morendil · 2011-10-10T10:02:24.038Z · LW(p) · GW(p)

Guessum is nice work! Needs a "Share" button, maybe. I've taken the all-time high score and I know I'm not that good in either calibration or discrimination, so I'd expect this has had way too few players.

Replies from: pianocigarette

↑ comment by pianocigarette · 2011-10-14T15:16:37.675Z · LW(p) · GW(p)

I second the share button. Maybe a chrome app.

↑ comment by NoSignalNoNoise (AspiringRationalist) · 2012-03-18T21:07:49.804Z · LW(p) · GW(p)

A bit late to the game, but the link now appears to be broken. Has Guessum been moved?

Replies from: dreeves

↑ comment by dreeves · 2017-05-30T22:38:20.982Z · LW(p) · GW(p)

Now resurrected!

comment by qvalq (qv^!q) · 2023-02-24T17:46:53.825Z · LW(p) · GW(p)

we all know people who insist that they are ugly and stupid and unlikeable even though they don't seem any worse off than anyone else.

I don't know these people.

comment by CronoDAS · 2011-10-09T22:06:33.405Z · LW(p) · GW(p)

And research agrees: studies show that people are uniquely bad at rating their own physical attractiveness.

Well, that's a relief, that I'm not unusual in being unable to evaluate my own physical attractiveness. (Or am I unusual in being aware that I don't know how good or bad looking I am?)

Replies from: selylindi

↑ comment by selylindi · 2011-10-09T22:38:03.531Z · LW(p) · GW(p)

One valuable insight from OkCupid is that, "the more men as a group disagree about a woman's looks, the more they end up liking her". For what it's worth, I can add one data point to that as a man. My HotOrNot pictures earned a measly 4.3 rating, which rather hurt my self-image for a while. But in more practical terms, I get asked out multiple times a night at bars. For anyone else who lacks self-esteem about their looks, I'd like to reiterate the usual advice, now backed with OkCupid data: get out there and let other people do the judging of your looks instead of you, because it's quite likely that someone in a crowd will have eyes for you!

Replies from: Eliezer_Yudkowsky, atucker

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2011-10-09T23:28:27.684Z · LW(p) · GW(p)

That plus I'd expect a certain amount of sampling bias at HotOrNot. I mean, I could be wrong, but AFAIK it could easily be true that you are in the 43% bracket of HotOrNot (not that I expect their 10-point system actually correlates to this, but anyway...) while still being pretty attractive by real-world mortal human standards.

Replies from: nazgulnarsil

↑ comment by nazgulnarsil · 2011-10-10T00:11:14.328Z · LW(p) · GW(p)

People's subjective experience of how attractive someone is is heavily influenced by framing. I can't find the relevant study but basically people responded with better ratings when someone was surrounded by less attractive people than when someone was surrounded by people who were around the same or more attractive. Conclusion? The same as Mises: preference rankings are ordinal, not cardinal. The frame of hotornot is looking at a very large group, so all but the most attractive in the set will rank slightly worse than they otherwise would have (real life situations are always much smaller sets).

In addition, as the okcupid article indicates, variance matters a lot. 3 people rating you a 9 or 10 and 7 people rating you 1 or 2 means your overall rating will be low, even though a significant fraction of people think you're the bee's knees.
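To make the variance point concrete, here is a small sketch comparing two made-up sets of ten ratings with the same mean. The specific ratings and the "fan" threshold of 9 are assumptions for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical ratings on a 1-10 scale; both lists average 3.8.
polarizing = [10, 9, 9, 1, 2, 1, 2, 1, 2, 1]   # three fans, seven detractors
consensus = [4] * 8 + [3] * 2                  # everyone is lukewarm

def summarize(ratings, fan_threshold=9):
    """Return the mean, the spread, and the share of raters at or above the threshold."""
    fans = sum(1 for r in ratings if r >= fan_threshold)
    return {
        "mean": mean(ratings),
        "spread": pstdev(ratings),
        "fan_share": fans / len(ratings),
    }

polarizing_summary = summarize(polarizing)
consensus_summary = summarize(consensus)

print(polarizing_summary)
print(consensus_summary)
```

The averages are identical, but only the first list contains anyone who would rate the person a 9 or 10, which is the information the single number throws away.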

Oh and to quantify: the research I'm familiar with indicates that women should, on average, bump up their estimation of their own attractiveness and men should bump it downward (but a smaller bump than women). But this hides an important dynamic: we don't care what the average person thinks of us. We care about what people whom we find attractive think. A rating of 8 from someone who we rate an 8 is roughly twelve billion times more important than from someone we rate a 2.

Replies from: Prismattic

↑ comment by Prismattic · 2011-10-10T02:25:57.337Z · LW(p) · GW(p)

Additionally, assessments of physical attractiveness are also influenced by assessments of other traits. Suppose you meet someone you think is a 10, but you discover that you cringe every time they open their mouth (to disambiguate, this is a reference to the content of their speech, not their dentistry). Not only are you probably not going to want to be with that person, but your physical assessment is going to change. I don't mean they will suddenly seem ugly, but probably they'll be a 7 or 8, and you won't be able to understand how you ever thought they were a 10 in the first place.

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2011-10-10T11:05:11.080Z · LW(p) · GW(p)

Another rating bias: people probably don't use the rating scale uniformly.

I remember reading somewhere that when women rate men online, instead of "nice, medium, ugly" their rating is more like "nice, ugly, ugly" (the median guy is rated disproportionally low).

If this is true, then this bias could be partially corrected if the web would not display "your rating is 4.3 of 10", but rather "your rating is higher than 70% of people in the same category". Or if the site displays global statistics, you can locate yourself in the distribution curve.

Replies from: Vaniver

↑ comment by Vaniver · 2011-10-10T14:18:44.548Z · LW(p) · GW(p)

You can find that on okCupid's post about attractiveness. Women rate 80% of guys as worse-looking than medium, whereas male ratings are symmetric and fairly normalized.

↑ comment by atucker · 2011-10-10T04:30:34.200Z · LW(p) · GW(p)

I feel a bit ambivalent about that finding (but find the site's stats really fun to look through).

If male attention is disproportionately directed towards people that the particular guy finds attractive, then it seems possible that women with the same average rating can have different amounts of attention for reasons other than the disagreement. Like, if a guy who rates someone a 5 gives her 10 attention, a guy rating someone a 4 gives her 5 attention, 3: 3, 2: 1, 1: 0 (entirely made up), then there are different ways of getting a 4. It depends on the ratio of attractiveness rating to attention giving.

A person with an average rating of 4 with four people giving them a 4-rating gets 20 attention. A person with an average rating of 4 with three 5-ratings and one 1-rating gets 30 attention.
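The arithmetic in this example can be checked with a short sketch. The attention-per-rating table uses the made-up numbers from the comment; the function names are illustrative.

```python
# Attention given per rating, using the (entirely made-up) numbers from the comment.
ATTENTION = {5: 10, 4: 5, 3: 3, 2: 1, 1: 0}

def average_rating(ratings):
    return sum(ratings) / len(ratings)

def total_attention(ratings):
    """Sum the attention each rater gives, based on the rating they assigned."""
    return sum(ATTENTION[r] for r in ratings)

uniform = [4, 4, 4, 4]   # four raters, all giving a 4
split = [5, 5, 5, 1]     # three 5s and one 1

print(average_rating(uniform), total_attention(uniform))  # 4.0 20
print(average_rating(split), total_attention(split))      # 4.0 30
```

Because attention rises faster than linearly with rating in this table, the same average can produce different totals without any appeal to disagreement as such.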

Replies from: Vaniver

↑ comment by Vaniver · 2011-10-10T14:13:36.512Z · LW(p) · GW(p)

Rather than making up numbers, check out their linear regression model:

msgs = .4m1-.5m2-.1m4+.9m5+k

4s ("eh, cute") get asked out less. 5s ("hot") get asked out more. 1s ("weird") get asked out more, because somebody thought that was a five, and rather than a supermodel who must be inundated with messages, it's someone quirky whose average rating is only a 3 (and thus approachable).

Replies from: HughRistik, atucker

↑ comment by HughRistik · 2011-10-22T06:43:32.864Z · LW(p) · GW(p)

and rather than a supermodel who must be inundated with messages, it's someone quirky whose average rating is only a 3 (and thus approachable)

That's the hypothesis that OkCupid advanced: game-theoretically, it makes sense to go for people you are strongly into who other people aren't into. But there's a problem with this hypothesis: it could turn out to be true, but right now, it's sort of silly.

It's unnecessary. Look at some normal distributions, and it's easy to see that having a high variance of attractiveness is sufficient to explain high positive responses (that motivate 5-ratings and messaging) and highly negative responses (that motivate 1-ratings). Let's say you message a woman with high variance (lots of 5s, lots of 1s) who you consider a five. Maybe that's because for you, she isn't actually a 5, she's a 6! But the scale only goes to 5 inducing a ceiling effect. You are going for her because you are really, really into her (for the same reasons that other guys are really, really not into her), not because you anticipate less competition.

There is no evidence that people are thinking of that sort of game theory. It's possible, but if men really cared so much about minimizing competition, you'd think they would message women they found 3s and 4s more often.
If you think someone is a 5, even due to high variance traits that other guys hate, you don't necessarily realize that other guys hate those traits (typical mind fallacy). Instead, you may assume that other guys would be into her just as much as you, which undermines the notion that you are trying to get the women that other guys won't pursue. This hypothesis gives men too much credit predicting the psychology of other men, and calculating the average appeal of a woman across the whole male population. For instance, I can't figure out what the guys who give flower-hair-girl a 1-rating are smoking.
Assortative mating. If you message a woman with tattoos and piercings (to use the example in the article), is that because you are thinking "aha! tats and piercings will turn off other guys, so I'll have her all to myself," or "wow, I really like that she has tattoos and piercings, and she is probably going to like my tattoos and piercings, too!" This hypothesis isn't really supported or necessary either, but it helps explain why men don't treat all 5s (from their perspective) equally. If someone has tribal markers, it explains why you might both find them a 5 and message them, while you are less likely to message another woman you rate a 5 without tribal markers, and why other guys from other tribes can't stand women with the affiliations you like.

Replies from: Vaniver

↑ comment by Vaniver · 2011-10-22T08:16:13.910Z · LW(p) · GW(p)

It's unnecessary. Look at some normal distributions, and it's easy to see that having a high variance of attractiveness is sufficient to explain high positive responses (that motivate 5-ratings and messaging) and highly negative responses (that motivate 1-ratings).

Emphasis mine. Is there a difference between this and "quirky"?

Replies from: HughRistik

↑ comment by HughRistik · 2011-10-22T20:03:10.798Z · LW(p) · GW(p)

I think you were on the right track with the word "quirky." It was the OkCupid article's game theoretic hypothesis that I was objecting to (referenced by avoiding people "inundated with messages" in your comment).

Replies from: Vaniver

↑ comment by Vaniver · 2011-10-22T22:29:35.994Z · LW(p) · GW(p)

Gotcha. I saw their game theory as justifying why people think quirkiness is (sometimes) attractive, not something people are consciously doing.

↑ comment by atucker · 2011-10-10T14:30:32.879Z · LW(p) · GW(p)

Numbers were made up because people rating someone a 4 don't give them negative attention (as in intercepting messages), so much as something more like give them less attention than average given their attractiveness level.

Replies from: Vaniver

↑ comment by Vaniver · 2011-10-10T16:33:35.924Z · LW(p) · GW(p)

It may actually give them negative attention; suppose I don't message anyone I rate a 4 (I don't) and by raising their rating I make others less likely to message them (because their average rating is higher). (I thought there was a way to determine another user's average rating, but I'm not seeing it from a quick check of the site, so this may not be the case.)

To the best of my knowledge, though, the coefficients for m4 and m2 aren't "relative to m3" but absolute; if someone gets 10 5s, they're expected to get 9 messages. If they got 9 4s and a 5, they're expected to get no messages. (Of course, what would be interesting is looking at clusters rather than just linearly regressing the data.)
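Here is a sketch of that reading of the regression, msgs = .4m1-.5m2-.1m4+.9m5+k. It assumes each m is the number of raters giving that score (the interpretation used in the two examples above), that 3-ratings carry a coefficient of zero, and that the unknown constant k is set to 0; none of this is confirmed by the source article.

```python
# Coefficients from the quoted model; the 0.0 for 3-ratings is an assumption
# (the model as quoted has no m3 term).
COEFFICIENTS = {1: 0.4, 2: -0.5, 3: 0.0, 4: -0.1, 5: 0.9}

def expected_messages(rating_counts, k=0.0):
    """rating_counts maps a rating (1-5) to how many raters gave it.

    k is the model's unknown intercept, assumed to be 0 here.
    """
    return k + sum(COEFFICIENTS[r] * n for r, n in rating_counts.items())

ten_fives = expected_messages({5: 10})
nine_fours_one_five = expected_messages({4: 9, 5: 1})

print(round(ten_fives, 2), round(nine_fours_one_five, 2))
```

Under these assumptions the two cases come out at roughly 9 and roughly 0 (plus whatever k is), which matches the figures in the comment.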

Replies from: atucker

↑ comment by atucker · 2011-10-10T17:30:40.207Z · LW(p) · GW(p)

Fair point, that works too.

comment by Laoch · 2011-10-12T17:45:22.057Z · LW(p) · GW(p)

Great to hear that your exams went well after all! Well done.

comment by taw · 2011-10-10T14:42:53.361Z · LW(p) · GW(p)

I never had any anxiety about exams. You just need to realize they have some implicit curve and cannot fail too many people, so if you're better than median, you simply have to pass, and if you're better than the 80th/90th percentile you'll pass with a very good result. It's very easy to observe your skills relative to others in your group.

But then I outside view naturally when the vast majority of people seem to inside view naturally.

comment by Lapsed_Lurker · 2011-10-10T20:13:04.880Z · LW(p) · GW(p)

When I moved to Ireland, I knew that their school system, and in particular their examinations, would be different from the ones I was used to.

What were the main differences? I am guessing the school system you moved from was the US? As I only know parts of the UK system, I feel like I'm missing possibly important information.

Replies from: Barry_Cotter

↑ comment by Barry_Cotter · 2011-10-11T14:01:29.283Z · LW(p) · GW(p)

The only (unfortunate) major difference between the English[0] and Irish university systems is that most English degrees take three years, while Ireland has been steadily moving towards four for decades. We have the same grading system for degrees, the same (old) academic calendar with some universities having adopted the American one. I am not under the impression that the manner of teaching is wildly different in the US from the rest of the world (except for the abomination that is the Socratic method, in that other abomination, the postgraduate law school). They do seem to be much more fond of multiple choice tests than in those parts of the world with more dialects of English.

[0] I could probably have said British, but the Scottish system is different in some ways I'm too lazy to look up.

Replies from: a_gramsci

↑ comment by a_gramsci · 2011-11-04T19:27:26.351Z · LW(p) · GW(p)

On the Socratic method: I was wondering if anyone had any ideas about it or could write an article on its benefits and drawbacks. From what I see, the above-average students get frustrated when they jump to conclusions faster than the teacher guides the class to them, and so do the below-average students who consistently don't understand the questions, with the Socratic method really only working for the average students. (This scale can be re-calibrated, though: for example, if the teacher caters to the below-average students, now the average students are also frustrated, and vice versa.)

Replies from: taelor, TheOtherDave

↑ comment by taelor · 2011-11-04T21:31:34.639Z · LW(p) · GW(p)

The main problem with using the Socratic Method as a didactic tool is that it really wasn't intended for that purpose; Socrates was a man who claimed to know nothing, and the "Socratic method" is simply a collection of techniques he developed to demonstrate that other people didn't know anything either. 90% of his so-called Method (as demonstrated in the early dialogues like Euthyphro or Charmides -- which have the highest probability of actually being representative of things he actually said, and not just mouthpiecing from Plato) consists of Socrates demanding that people define their terms, refusing to continue the argument until they did so, and then pointing out that the definitions they supply are either self-contradictory or inconsistent with what they're actually arguing. When used correctly, the Socratic method is great at exposing logical inconsistency and self-contradiction, but extremely inefficient when it comes to guiding people to truth -- its purpose is to destroy; it does not create.

Replies from: a_gramsci

↑ comment by a_gramsci · 2011-11-05T00:22:53.109Z · LW(p) · GW(p)

That's really interesting; maybe we need a new name for the (convoluted) modern Socratic method?

↑ comment by TheOtherDave · 2011-11-04T19:46:07.402Z · LW(p) · GW(p)

I'm curious: are you comparing the Socratic method here to some other technique that works more reliably with a broad range of capabilities?

I would have thought that no matter what technique I use, the subset of my class that I devote most of my attention to will get the most benefit, and everyone else will be frustrated that they aren't getting as much out of it as they could with more attention.

Replies from: a_gramsci

↑ comment by a_gramsci · 2011-11-04T21:12:26.650Z · LW(p) · GW(p)

You're right, I really wasn't thinking of a specific method of comparison; rather, I was just kind of ranting about how much I dislike it. Of the teaching methods we have:

Lecturing- Above average students might be bored if the teacher is telling them information they already knew, but it many times has just a blanket boredom effect.

Demonstrating- Even if certain students already know the information, it can still be interesting if they try to extend their thinking on the demonstration. The opposite of a lecture, it many times has a blanket engaging effect.

Socratic- See above post.

So really, there is no silver bullet, only what you say about devoting attention to specific subsets of the class. Apart from the limited use cases of a demonstration, the only way to maximize the part of the class that is interested is by catering to the largest subset.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2011-11-04T21:55:12.016Z · LW(p) · GW(p)

Well, there is of course the alternative of getting away from the idea of one-teacher-many-students altogether, by encouraging students to teach one another, or by creating autodidactic tools that students can explore on their own, or by segmenting students differently, etc.

But yeah, if we're restricting the scope of discussion to traditional teacher-led classrooms, the segmentation problem is hard to get away from.

That said, I'm rather fond of Socratic inquiry myself.

Replies from: Bugmaster

↑ comment by Bugmaster · 2011-11-04T22:32:49.962Z · LW(p) · GW(p)

I'm partial to the Socratic method as well, but whenever I'm explaining something to someone, I have to constantly remind myself to stay away from it. Unfortunately, in my experience (which may not be representative), the Socratic method elicits very strong negative emotions in the target audience, and it does so very quickly. Making people hate you is not a good educational technique.

Replies from: TheOtherDave, Vaniver, a_gramsci

↑ comment by TheOtherDave · 2011-11-04T23:53:45.832Z · LW(p) · GW(p)

I find it works moderately well for me, but then again I am usually operating in a community of peers where it's not a given who is teaching whom. I ask questions that are designed to elicit clear thought about areas of uncertainty, and either my interlocutor answers them sensibly and I am enlightened, or they fail to and they are.

I can see where it would be different if I went into the exchange convinced I was the instructor.

↑ comment by Vaniver · 2011-11-04T22:35:44.299Z · LW(p) · GW(p)

Yeah, the actual Socratic method is designed more to instill doubt than it is to explain concepts. (The student takes a position, then you shred it with pointed questions.)

The 'modern' Socratic method of asking leading questions doesn't work all that well because of inferential distances and awkwardness. You standing there expectantly waiting while they think something through is oftentimes an unpleasant experience for them.

Replies from: dlthomas

↑ comment by dlthomas · 2011-11-04T22:58:00.749Z · LW(p) · GW(p)

In any form of teaching, expecting an appropriate inferential distance is important. I wonder to what degree that can be trained explicitly.

Replies from: a_gramsci

↑ comment by a_gramsci · 2011-11-04T22:58:53.032Z · LW(p) · GW(p)

That's why I think that the basic concept of "building block" schooling works: you essentially keep the distance constant, but teach them ever more challenging topics. The one time where there is a large gap is in the introduction of completely new ideas or subjects. For example, in physics, when people first learn of general relativity there is a large inferential distance, which is very hard to remedy.

Replies from: dlthomas

↑ comment by dlthomas · 2011-11-04T23:12:50.843Z · LW(p) · GW(p)

Amount you need to understand to get from what you currently understand to also understanding the new thing.

Eliezer talks about it in a piece well worth reading.

↑ comment by a_gramsci · 2011-11-04T22:51:24.484Z · LW(p) · GW(p)

Interestingly, the one time that I find that the modern Socratic method works is math. Because it is so much more helpful in math to have an innate understanding of the subjects, you have to be able to explain why an equation or theorem works/is true. So when time permits, guiding them with questions is very helpful, as figuring something out sticks in your mind more than having it on a board.

comment by Dmytry · 2012-03-20T07:15:16.934Z · LW(p) · GW(p)

re: flirting, your underlying system cares for maximizing max. expected pay-off, it doesn't care for maximizing accuracy. Biologically, you get giant utilons loss for missing a mating opportunity (Think about it! For your underlying system, that's a huge fraction of worth of your life! It's the utility loss comparable to being forced to play Russian roulette with live bullets, several times in the row!), and microscopic utilon loss for being rejected. Your underlying system, though, doesn't trust your cognition with the correct mating odds - if it tells your cognition correct odds, you end up not trying, and not reproducing.
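The asymmetry described above can be put into a small arithmetic sketch. The payoff numbers below are hypothetical, chosen only for illustration, and the "missed opportunity" is modelled as a forgone gain rather than a loss, which is an equivalent framing. The sketch shows only that when the gain dwarfs the rejection cost, trying pays off even at very low success probabilities; it does not establish the further claim about what cognition is or is not told.

```python
# Hypothetical payoff numbers, chosen only to illustrate the asymmetry.

def expected_payoff(p_success, gain, rejection_cost):
    """Expected payoff of making an advance.

    p_success      -- probability the advance succeeds (0..1)
    gain           -- payoff on success (hypothetical units)
    rejection_cost -- cost paid on rejection (same units)
    """
    return p_success * gain - (1.0 - p_success) * rejection_cost


def break_even_probability(gain, rejection_cost):
    """Success probability at which trying has zero expected payoff.

    Solves p * gain - (1 - p) * rejection_cost = 0 for p.
    """
    return rejection_cost / (gain + rejection_cost)


if __name__ == "__main__":
    gain, cost = 100.0, 1.0  # assumed: giant gain, microscopic cost
    print("break-even probability:", break_even_probability(gain, cost))
    print("payoff at 5% odds:", expected_payoff(0.05, gain, cost))
```

With a larger assumed rejection cost, as argued in the replies, the break-even probability rises accordingly, so the disagreement in this thread is largely about which value of `rejection_cost` is realistic.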

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T07:42:48.427Z · LW(p) · GW(p)

(Think about it! For your underlying system, that's a huge fraction of worth of your life! It's the utility loss comparable to being forced to play Russian roulette with live bullets, several times in the row!), and microscopic utilon loss for being rejected.

Our underlying system assigns massive utilon loss for being rejected. Far more than either current reproductive maximisation incentives or our own practical hedonistic interests would assign. ('Think about it'...)

Replies from: army1987, Dmytry

↑ comment by A1987dM (army1987) · 2012-03-20T18:09:19.810Z · LW(p) · GW(p)

You mean "hedon loss", right? Yeah, we do hate being rejected, but I'm not sure we should.

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T21:52:06.141Z · LW(p) · GW(p)

You mean "hedon loss", right?

I mean the utilon calculation done by the crudely specified 'underlying system', which we infer based on observed behavior and reported experience. That's not quite the same as "hedon loss" but it is also a very different thing to "utilon loss" as could be described from the perspective of an anthropomorphized agent that seeks to maximise inclusive genetic fitness.

Yeah, we do hate being rejected, but I'm not sure we should.

Yes, the thing that adds up to this!

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-03-20T22:54:54.008Z · LW(p) · GW(p)

I mean the utilon calculation done by the crudely specified 'underlying system', which we infer based on observed behavior and reported experience.

Which doesn't do any utilon calculation, anyway. :-)

(Edited to add more links.)

↑ comment by Dmytry · 2012-03-20T07:50:18.013Z · LW(p) · GW(p)

Nah, it assigns massive punishment to the spokesman part of the brain coz it failed. Then the spokesman, in spare time, complains about it.

If it assigned utilon loss in the end, he wouldn't have been trying to talk to women despite getting rejected. Half of 'omg i dont have any consistent utility system' is the spokesman who's not deciding, whining about getting beaten by underlying system for failures. edit: i do believe in some localization of 'consciousness' in the sense that the part close to the speech centre is to some extent independent and unaware of what is going on in the rest. Brain is a distributed computing system, with non negligible spatial separation of components, and very non-negligible lag. I programmed that sort of systems, you have components working in partial ignorance of the state of other components. And if you are bandwidth limited you see how you can maximize the ignorance to free up bandwidth.

edit: Okay, to summarize for ya: you know how dogs are trained professionally? With treat and punishment method? Now consider one part of your own brain internally training other part of your own brain in precisely same way. With feelings of pain and feelings of reward. Positive and negative feelings are not utility, feelings are instrumental value that trains neural network to perform a task, adjusted for maximum efficacy (ideally).

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T10:51:19.137Z · LW(p) · GW(p)

Nah, it assigns massive punishment to the spokesman part of the brain coz it failed. Then the spokesman, in spare time, complains about it.

The more or less mainstream 'just so' story here paints a somewhat different picture - and it is one that strikes me as fairly credible as far as these things go. As these things so often do it consists of an appeal to the Environment of Evolutionary Adaptation where, it is said:

There are far fewer potential mates. Thus, rejection by one mate actually constitutes a non-trivial loss of a resource.
In a tribe where 40% of male deaths are, essentially, murders a failed sexual advance may actually provoke reprisals if word reaches a more powerful rival.
If your pool of potential mates form one (or a limited number of local) social circles the social status loss of being rejected by one female within the circle can represent a disadvantage when pursuing other candidates within that same circle.

In short, our instincts are calibrated for environments where a sexual advance being rejected has a far greater consequence than what it does for most people here and now.

If it assigned utilon loss in the end, he wouldn't have been trying to talk to women despite getting rejected.

A lot of the time guys don't end up trying to talk to women when doing so would seem to be the best way to satisfy their desires (in the 'maximise utility sense').

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-20T12:15:49.221Z · LW(p) · GW(p)

There is no such thing as environment of evolutionary adaptedness. Look at your foot. It's suboptimal in environment of evolutionary adaptedness and is only explainable by how it got there (from going on the tree and shortening important bones there).

I do agree that there are adaptations to that environment, but with this many free variables up to speculation on what the environment was, and on the method by which one exactly adapts, and how much, one can explain literally anything, including anything that could later be shown to be a product of conditioning. On top of this, you have this huge majority of people NOT taking rejections too seriously, which you can undoubtedly also explain by EEA in some other way, because it can explain everything. In fact I have seen a zillion explanations of why and how we do one-night stands, and why women choose confident men for one-night stands, out of some other EEA.

WRT the utility, pain is not utility. Pain is a value that trains your neural network not to do same stupid stuff again. If i do, or don't, spray with a water bottle my dog for chewing stuff up, (ideally) has nothing to do with utility of the items it is chewing up and everything to do with how much I think would get her not to chew stuff up, and the positive reinforcement not working for this task very well. Apply same principle to yourself feeling sort-of-pain for any internal reasons, and here you go.

WRT the brain being a distributed computing system with non-trivial lags, it is a plain fact, WRT distributed computing system's part having incomplete picture of the whole, that is plain how you get the distributed systems to work optimally. It doesn't give freedom to explain everything, in fact it has trouble explaining consciousness and pervasive sense that you are not a distributed computing system. Ditto goes for facts about 'how do you train a piece of brain'. With stick and carrot, that's how.

WRT maximizing utility, people of course suck at maximizing utility. But if you see utility of a belief for Yvain on the likelihood of rejection, the best belief for him is that rejection is unlikely, a mis-calibrated one, because rest of the brain is mis-calibrated too. Evolution, by the way, doesn't make ideal systems. Nothing affects the prediction that Yvain is better off being overconfident about women. The rejection lasts for, maybe, a day, the successful relationship where he DID dump his interests early (to cause early rejection to avoid wasting time), lasts for lifetime. He hyperbolically discounts for time. To act more optimally with hyperbolic discounting, he also has miscalibrated expectations. If he calibrates his expectations but does not get rid of hyperbolic discounting (which is much harder), he'll act even less optimally.

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T12:58:05.343Z · LW(p) · GW(p)

On top of this, you have this huge majority of people NOT taking rejections too seriously,

Most people do take rejections too seriously. In fact, for this reason, I recommend 'Rejection Therapy' (deliberately asking for stuff that you will not get) as an excellent personal development technique. (We tried it on the rationality boot camp last year.)

Particularly when it comes to mating based rejection people nearly universally experience anxiety grossly out of proportion to what is actually at stake - which is mostly the inconvenience of having to go and try again with someone else.

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-20T13:00:07.414Z · LW(p) · GW(p)

Well, I guess there's different definitions of 'most people' (rationality boot camp?) and 'too seriously' . The too seriously, is when you take it so seriously that you end up alone.

edit: there's also the 'too seriously to have 15 children by 10 women' kind of too seriously, on that i'd agree. With regards to anxiety being grossly out of proportion, that's meant to be compensated for by over-confidence. Yvain is not generally over confident. Just in those cases (flirting).

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T13:22:27.692Z · LW(p) · GW(p)

Well, I guess there's different definitions of 'most people' (rationality boot camp?)

"> 70% of single individuals of a suitable mating age in the world" is both a grossly conservative estimate and satisfies any remotely reasonable usage of 'most people' in the context.

Did you just try to spin that as an insult to those in my reference class? If I recall correctly all but two of the twenty participants were as of that time in at least one relationship. The lesson that should be taken is that even those who are already successful can serve to benefit from recalibrating their aversion to rejection downwards towards the optimal level.

The too seriously, is when you take it so seriously that you end up alone.

That's certainly what you can expect in the extreme case. More often, however, people simply end up with fewer experiences with fewer people and must be satisfied with relationship arrangements that are perhaps less than they could have been. Or, at the least, must counter the aversive emotions that may otherwise have been an inconvenience while going after what they want despite their inhibitions.

If I were to take the narrow group of my own friends and acquaintances and your suggested symptom "forever alone" the experiment would be biased in the other direction. Very few are single and an absolutely sickening proportion has gone and got themselves outright married.

Replies from: army1987, Dmytry

↑ comment by A1987dM (army1987) · 2012-03-20T18:13:22.005Z · LW(p) · GW(p)

70% of single individuals

[emphasis added]

Selection bias much?

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T21:44:53.148Z · LW(p) · GW(p)

Selection bias much?

There certainly could be, in such a class. Mind you I'd extend the 70% estimate to include all other individuals too - it'd just be slightly less of an understatement.

Replies from: Dmytry, Dmytry

↑ comment by Dmytry · 2012-03-20T22:04:34.940Z · LW(p) · GW(p)

Any actual basis to the notion that >70% of people are overly cautious for their own good, considering the risk of status loss (real or perceived), and the fact that they are also over-confident about the success rate?

I'd say that for well over 70% of people worldwide, what you said about environment of evolutionary adaptedness, still applies.

Speaking of which. Almost everyone gone through 'environment of social adaptation' in the learning sense, i.e. daycare then school, and the late stages of it are similar enough to 'environment of evolutionary adaptedness' - with the striving for status, small number of apparent mating opportunities, potentials for scary punishments - the products of conditioning by which can all provide ample food for evolutionary psychologists, and ample set of strange biases. The people who didn't go through school, are small and biased sample; the cultures where there is no school are naturally living in something even more similar to supposed 'environment of evolutionary adaptedness'.

It does seem that the sexual attraction, as a novel feeling, confuses the hell out of brain when it first appears, especially in puritan cultures. Kids even do certain thing, that involves looking at pictures of women (or imagining) , and being afraid that someone would walk in and catch them. That, plus Pavlov's conditioning, already provides ample explanation in terms of well studied, uncontroversial phenomena. It may not be as intellectually engaging as imagining what cavemen were like, but that's uncontroversial theory which is well tested, and predicts social anxiety of men around women in many cultures; EP can only predict additional anxiety on top of this. I've seen enough of different people of different nations, with different approach to talking to women, to know that significant anxiety is not universal. Keep in mind that i am from former soviet union, in which many people of different nations with diverse cultures and religions were moved around.

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T22:32:53.416Z · LW(p) · GW(p)

I'd say that for well over 70% of people worldwide, what you said about environment of evolutionary adaptedness, still applies.

For posterity, let's recall that the claim of yours that I actually disagreed with was:

and microscopic utilon loss for being rejected.

(So be a little more careful with which straw men you take aim at. You just caught yourself in the crossfire.)

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-20T22:36:57.388Z · LW(p) · GW(p)

Jesus Christ man, the woman in question been flirting with him, or so he thought. Hence way smaller loss than usual. Way to twist everything. edit: and him not being among those 70% , but among privileged few, about whom you can say, their anxiety is entirely mis-calibrated and the worst that can happen is having to try with other person.

↑ comment by Dmytry · 2012-03-20T22:05:25.433Z · LW(p) · GW(p)

I'd say that for well over 70% of people worldwide, what you said about environment of evolutionary adaptedness, still applies.

↑ comment by Dmytry · 2012-03-20T13:31:05.580Z · LW(p) · GW(p)

Nah, no insult meant beyond the sample being biased, which I trust we are all rational enough here not to take as an insult. I think it is fair to guess that you have mean IQ well over 100, which too is enough to ruin applicability of experiments.

That's certainly what you can expect in the extreme case. More often, however, people simply end up with fewer experiences with fewer people and must be satisfied with relationship arrangements that are perhaps less than they could have been. Or, at the least, must counter the aversive emotions that may otherwise have been an inconvenience while going after what they want despite their inhibitions.

I agree, actually. But see, there's the example of big problem for smart individuals in general: you do have that hyperbolic discounting, and you do have anxiety, you can't think them away, you must train them away, and that doesn't even fully work. Then you have your wild overestimate of the probability of success when some conditionals are met, and you can think this away easier.

Even if combined with the former it makes a fairly solid strategy, implementing a strategy on tweaked biases. See, there's also environment of 'cultural adaptedness', or 'memetic adaptedness', if you wish, and its at least hundred years back, and hundred years back it is NOT safe to hit on strangers unless some conditions (them flirting) are met. And it still works pretty well now. Not ideally - thanks to anti-murder laws, it is much safer to hit on strangers now - but it works. edit: actually, scratch that. Only in 5..10% of population it is perfectly safe to hit on strangers, and even there, you have a ton of harassment laws so even though you aren't likely to be beaten up, you can be screwed over.

edit: and for full disclosure, I also sucked at hitting on strangers. Its extremely uncomfortable. The point is that the overconfidence after conditions (flirting) are met is compensating, and makes a two way conditional implemented on biases: if (not flirting ) don't proceed ; if ( flirting) do proceed ; Hmm. To think about it maybe we came up with some truth here from disagreement. You can implement simple agents (similar to game AIs i can code) by combining biases, if you can tweak biases. And that is a plausible way how evolution can implement logic without wiring up individual neurons. The results are ultra messy though and have a ton of strange side effects, and become deregulated when one tries to get rid of some of the biases.
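The "two way conditional implemented on biases" can be sketched as a toy agent. Everything here is an assumption made for illustration: the function names, the numeric bias levels, and the rule that the agent proceeds when felt confidence exceeds felt anxiety. It is not a model of any real cognitive mechanism, only a demonstration that two opposing biases can jointly behave like an if/else, and that removing just one of them breaks the behaviour.

```python
# Toy agent: all names and numbers are hypothetical.

ANXIETY = 0.8             # assumed constant aversion to approaching
BASE_CONFIDENCE = 0.3     # assumed confidence with no flirting signal
OVERCONFIDENCE_BOOST = 0.9  # assumed inflation once flirting is perceived


def felt_confidence(flirting, overconfident=True):
    """Subjective confidence; inflated by the flirting signal if biased."""
    boost = OVERCONFIDENCE_BOOST if (flirting and overconfident) else 0.0
    return BASE_CONFIDENCE + boost


def proceeds(flirting, overconfident=True):
    """The agent acts only when felt confidence outweighs felt anxiety."""
    return felt_confidence(flirting, overconfident) > ANXIETY


if __name__ == "__main__":
    # Both biases present: behaves like "if flirting: proceed, else: don't".
    print(proceeds(flirting=False), proceeds(flirting=True))
    # Overconfidence removed, anxiety left in place: the agent never proceeds.
    print(proceeds(flirting=False, overconfident=False),
          proceeds(flirting=True, overconfident=False))
```

The second pair of calls corresponds to the "deregulated" case: calibrating away the overconfidence while the anxiety stays leaves an agent that never acts.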

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T13:42:34.568Z · LW(p) · GW(p)

Nah, no insult meant beyond the sample being biased, which I trust we are all rational enough here not to take as an insult. I think it is fair to guess that you have mean IQ well over 100, which too is enough to ruin applicability of experiments.

You should by now be aware that the claim (that I had previously assumed to be completely uncontroversial) is nothing to do with people at a particular training program (which related only to experiments with a solution) but rather with humanity in general. It isn't presented as the outcome of my own experiment but rather as a matter of both common and expert knowledge.

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-20T13:47:58.277Z · LW(p) · GW(p)

http://en.wikipedia.org/wiki/Social_anxiety_disorder

also known as social phobia, is an anxiety disorder characterized by intense fear in social situations[1] causing considerable distress and impaired ability to function in at least some parts of daily life.

I took it as the social anxiety disorder if the anxiety leads to impaired ability to function. In social situations there's considerable loss of status from rejection, by the way, and the status is good for finding new mates, so I am entirely unconvinced that humanity in general suffers from some anxiety-impaired ability to function, especially given how the over-confidence hits an override on the anxiety, in the vast majority who haven't thought of explicitly calibrating themselves. edit: on top of that you are in the privileged 5% maybe for whom the loss from rejection is only having to try with someone else. Even in your country there's nonzero chance of getting beaten up from hitting on strangers. Everywhere else (outside first world) the chance is not even all that small.

Replies from: wedrifid

↑ comment by wedrifid · 2012-03-20T13:52:36.220Z · LW(p) · GW(p)

It is not my claim that the entire population of the world has a clinically diagnosable anxiety disorder. That would be crazy (and given how 'disorder' is used, only a hop and a step away from outright oxymoronic).

I do maintain the things that I have actually stated.

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-20T13:57:42.989Z · LW(p) · GW(p)

"Far more than either current reproductive maximisation incentives or our own practical hedonistic interests would assign."

Who's 'we' here anyway? Mankind? I'm sure >95% can get beaten up for hitting on strangers. 5% ? There's still the status loss from rejection.

comment by A1987dM (army1987) · 2012-03-18T18:07:38.023Z · LW(p) · GW(p)

University exams are a lot easier to pass in Ireland than in Italy (the only two countries I'm familiar with), but the consequences of failing an exam are way worse in Ireland than in Italy.

comment by Daniel_Burfoot · 2011-10-11T01:20:40.811Z · LW(p) · GW(p)

how everyone thinks they're an above average driver

I think I'm a terrible driver, and I gave up driving when I realized there was a good chance I might hurt someone.

comment by Mike Bishop (MichaelBishop) · 2011-10-10T20:05:12.739Z · LW(p) · GW(p)

broken link on "usually correlate"?

comment by Rubix · 2011-10-16T07:21:03.323Z · LW(p) · GW(p)

very interesting. glad i read it. can really relate to the exam example (same thing happens to me). One comment, though, i wish you would not call women "girls" (as in "pretty girls") if you are talking about adult behavior, they are women, not girls.
