If individual performance is Pareto distributed, how should we reform education?

post by yhoiseth · 2021-05-24T12:22:44.155Z · LW · GW · 27 comments

This is a question post.

In The best and the rest: Revisiting the norm of normality of individual performance (2012), O’Boyle and Aguinis show that individual performance follows a Paretian distribution:

We revisit a long-held assumption in human resource management, organizational behavior, and industrial and organizational psychology that individual performance follows a Gaussian (normal) distribution. We conducted 5 studies involving 198 samples including 633,263 researchers, entertainers, politicians, and amateur and professional athletes. Results are remarkably consistent across industries, types of jobs, types of performance measures, and time frames and indicate that individual performance is not normally distributed—instead, it follows a Paretian (power law) distribution. Assuming normality of individual performance can lead to misspecified theories and misleading practices. Thus, our results have implications for all theories and applications that directly or indirectly address the performance of individual workers including performance measurement and management, utility analysis in preemployment testing and training and development, personnel selection, leadership, and the prediction of performance, among others.

Currently, systems of formal education assume that individual performance is normally distributed. For example, in all countries that I know of, university grades have a strict upper bound and are at least roughly normally distributed. The PISA tests are another example. Following the release of PISA results, “most public attention concentrates on just one outcome: the mean scores of countries and their rankings of countries against one another.”

If it is true that individual performance is Pareto distributed, how should we reform education?

An answer: Decouple age from level and have very lax minimum requirements

Here is my (certainly not original) answer: Decouple age from level and have very lax minimum requirements. Structure schools so that students can progress at their own pace in different subjects. Crucially, make it possible to progress extremely quickly in as little as one subject.

Let’s say that you’re a math prodigy. We will let you concentrate on math as much as you want, ignoring other subjects. You could start making original contributions to mathematics years earlier, greatly increasing the time you have available for advancing the field. We would allow you to proceed to university without knowing how your country’s political system works.

Instead of worrying about the mean, we would let students follow completely different paths:

  1. The majority won’t need a lot of math. They just need to know enough to manage their finances, buy and sell things, et cetera. As long as they have learned the basics, we would let them concentrate on other subjects.
  2. A minority needs levels of expertise orders of magnitude greater than that of the majority. This minority would be allowed to follow a very different path.

I could say a lot more about this idea, but I’ll leave it at that. What other ideas should we consider?

Answers

answer by Dagon · 2021-05-24T16:36:04.384Z · LW(p) · GW(p)

I think that the distribution is mostly irrelevant to the problems and purpose of education systems. Public, large-scale, youth education is mostly about child-care and socialization, and only incidentally about skill or knowledge development.  Outliers, regardless of the distribution or percentage, aren't particularly well-served.

comment by Stuart Anderson (stuart-anderson) · 2021-05-24T19:47:29.191Z · LW(p) · GW(p)

-

Replies from: jaspax
comment by jaspax · 2021-05-25T15:19:46.891Z · LW(p) · GW(p)

They have no formal lessons on prosocial behaviours

Um?

  • How and when to say "please" and "thank you"
  • How to address and talk to police, firemen, and other public officials
  • The importance of "sharing", etc.
  • The bad of "bullying", etc.
  • How and when to write thank-you letters and other social niceties
  • Appropriate ways to talk to someone who lost a family member

These and others were all things that I recall from my grade school years. One could critique the means and content of these lessons all day, but it seems unsupportable to claim that there are no lessons on such behaviours.

(If you're autistic, your problem may be that you were taught the explicit, formal, and decontextualised rules that schools include, but failed to pick up the implicit, informal, and contextually-dependent behaviours that schools don't include.)

Replies from: Viliam, stuart-anderson
comment by Viliam · 2021-05-25T18:36:43.158Z · LW(p) · GW(p)

Schools do many useful and harmful things, also from a social perspective. They teach you that bullying is wrong, but they also create convenient opportunities for bullying and make avoiding the bullies difficult. They teach you social skills, after having separated you from older kids whom you could instinctively emulate. Probably they are a net benefit, but some of the problems they solve are problems they have created.

comment by Stuart Anderson (stuart-anderson) · 2021-05-26T00:41:41.825Z · LW(p) · GW(p)

-

Replies from: jaspax
comment by jaspax · 2021-05-26T05:07:26.516Z · LW(p) · GW(p)

I'm not sure that I agree with the notion that one needs to teach reasons before behaviours. When it comes to socialisation, one needs to teach the desired behaviours first, and the complicated rationale later, if at all. And we do this precisely because we DO care about outcomes: people (including highly intelligent, nerdy people; let's not flatter ourselves) are much better at applying heuristics and rules learned in early childhood than they are deriving proper action from first principles. I think that the general shape of childhood education in this matter is actually correct: first you teach people to do things because It's The Right Thing To Do; later, in an advanced course, you can break out the game theory to show how the prescription is derived.

Replies from: stuart-anderson
comment by yhoiseth · 2021-05-25T15:31:23.025Z · LW(p) · GW(p)

Public, large-scale, youth education is mostly about child-care and socialization, and only incidentally about skill or knowledge development.

I agree that child-care and socialization are big parts of it, but I also think skill and knowledge development play a big role. For example, I care about my doctor’s education due to the skill and knowledge development (as well as certification) that happened during their formal education.

People such as voters and parents also care at least to some degree what people learn in school. They might be mistaken a lot of the time, but they do care.

answer by jacopo · 2021-05-25T13:14:57.890Z · LW(p) · GW(p)

I think there is a crucial difference between performance, as defined in the paper, and ability, which should be taken very much into account. I will not debate whether their definition of performance is consistent with common usage, but they failed to state their definitions clearly and I think you misunderstood their results because of this.

The paper measures performance as the results of (roughly) zero-sum competitions. This is very clear when they analyze athletes (number of wins), politicians (election wins, re-elections) and actors (awards). But this is also true for research, as writing an impactful paper means arriving at a novel result before competing teams or succeeding at explaining something where others have failed.

But, for a professional runner, winning 90% of races is not the same as being 90% faster. Indeed, a runner who is on average 5% faster will win most races (not all, as he will have off days where his speed goes down by more than 5%).
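To make the runner example concrete, here is a minimal sketch under assumptions that are not in the comment above: each runner's speed on a given day is their average speed plus independent Gaussian noise, and the day-to-day standard deviation (3% of average speed) is a made-up illustrative value. The function name `win_probability` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def win_probability(speed_gap, day_to_day_sd):
    """Probability that the faster runner wins a single head-to-head race.

    Assumes (for illustration only) that both runners have independent
    Gaussian day-to-day noise with the same standard deviation.
    """
    # The difference of two independent Gaussian speeds is Gaussian
    # with standard deviation day_to_day_sd * sqrt(2).
    return NormalDist().cdf(speed_gap / (day_to_day_sd * sqrt(2)))


# A 5% average speed advantage with an assumed 3% day-to-day variation.
p_win = win_probability(speed_gap=0.05, day_to_day_sd=0.03)
print(f"faster runner wins about {p_win:.0%} of races")
```

Under these assumed numbers, a small edge in speed turns into winning close to nine races in ten. The exact figure depends entirely on the assumed noise level, so this only illustrates the direction of the effect.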

Tests such as PISA and grades try to measure ability, e.g. your math skill. That is analogous to a runner's speed, not to how many races he wins. I believe this is very much Gaussian distributed, and the paper does not show anything to the contrary. Indeed it is very reasonable to believe that Gaussian distributed abilities result in Pareto distributed outcomes in competitive situations (it may be a provable result but I'm too lazy to do the math now). So, it's pretty much appropriate to give grades on a Gaussian.
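The claim that Gaussian abilities can produce heavily skewed competitive outcomes can be checked roughly with a simulation. This is only a sketch and not a proof: the number of competitors, the number of races, the noise level and the winner-takes-all rule are all assumptions chosen for illustration, and `simulate_wins` is a hypothetical helper.

```python
import random
import statistics


def simulate_wins(n_competitors=100, n_races=2000, noise_sd=0.5, seed=0):
    """Count race wins when ability is Gaussian and each race adds Gaussian noise.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Underlying ability: standard normal, fixed per competitor.
    ability = [rng.gauss(0.0, 1.0) for _ in range(n_competitors)]
    wins = [0] * n_competitors
    for _ in range(n_races):
        # Performance on the day = ability + independent noise.
        scores = [a + rng.gauss(0.0, noise_sd) for a in ability]
        # Winner-takes-all: only the top score gets credit for the race.
        winner = max(range(n_competitors), key=scores.__getitem__)
        wins[winner] += 1
    return ability, wins


ability, wins = simulate_wins()
ranked = sorted(wins, reverse=True)
top_decile_share = sum(ranked[: len(ranked) // 10]) / sum(ranked)
print(f"share of wins held by the top 10% of competitors: {top_decile_share:.0%}")
print(f"mean wins: {statistics.mean(wins):.1f}, median wins: {statistics.median(wins)}")
```

In this setup the distribution of wins is far more concentrated than the distribution of ability, with the median competitor winning much less than the mean. Whether the result is strictly Pareto is a separate question that this sketch does not settle.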

Now, we could debate if productivity comes mostly from exceptional performers in the real world, which might result in similar reform ideas. BTW, that's something I mostly don't believe but it's a tenable position on a very complicated issue. 

comment by Dagon · 2021-05-25T14:39:32.088Z · LW(p) · GW(p)

I think that's very important to note, thank you! In fact, the two measures may be quite related - it's believable that pairwise comparisons across a normal distribution along with some noise (most of these are small numbers of contests) can look a lot like a power law (without the asymptotic crazy-large values).

But really, the tie between education and ability or performance is pretty tenuous in the first place, so we shouldn't take any policy recommendations from this mathematical curiosity.

comment by yhoiseth · 2021-05-25T16:52:38.084Z · LW(p) · GW(p)

Thanks for the insightful comment. I agree that the performance measures used tend toward zero-sum games. I don’t, however, think that research is an example of a (roughly) zero-sum game. Scientific breakthroughs to be made are not a limited resource in anywhere near the same sense as sports trophies are a limited resource. When we’re counting papers, we’re getting closer to zero-sum, but I still think it’s significantly positive-sum.

Leaving that aside, I still think we need more examples from positive-sum games. We could look at things like

  1. jobs created by entrepreneurs;
  2. wealth created by entrepreneurs;
  3. salaries;
  4. books sold by authors;
  5. returns made by investors; and
  6. records sold by artists.

My hunch is that these also follow a Paretian distribution, but I’m only about 70 percent sure of that. Hypothetically, if I was right, what would you think then?

Replies from: jacopo
comment by jacopo · 2021-05-26T15:19:00.459Z · LW(p) · GW(p)

Maybe zero-sum was not the right expression, because I think it is broader than strictly zero-sum games. I meant winner-takes-most situations, where the reward of the best performer is outsized with respect to the reward of the next-best. This does not necessarily mean that the game is strictly zero-sum. In many cases, it is just that the product you deliver is scalable, so everyone will just want the best product (of course, preferences may mean that the ranking is not the same for everyone). 

I am also convinced that all the things you mentioned have a fat tail, even if they don't strictly follow a Pareto distribution (probably books/records will be the closest to Pareto, salaries the closest to a Gaussian but with a fat tail on the right). But I think this does not reflect the distribution of quality/skill but the characteristics of the markets.

Example: book sales. I like fantasy books, but the number of books I read per year is capped. So there are a few authors I follow, plus maybe once per year I look for reviews and check if some good book by other authors has come out. If a certain book I would read is not released, chances are I would read the next best one, and find that in fact it is not much worse. Of course, books of much better/worse quality would convince me to read more/less, but in practice the quality delivered by different authors is close enough that this is a relatively small effect. If everyone had the same taste in books, and everyone read 10 books per year, we would all be reading the same 10. If an outstanding book came out, book number 10 would pass from one billion sales to zero. Of course, this is way oversimplified: we have different tastes, and the interaction of objective quality with subjective tastes, plus other factors, creates a Pareto-like distribution of sales. 

Example 2: tech companies. In most western countries, Google has a market share which is 10x that of Bing. It's not that Google is 10x better than Bing. If people used Bing, they would maybe waste 10% extra time to get to the result they want. But that's fairly consistent across different people. So Google is like a runner who is 10% faster and wins 90% of races. This is not true for all companies, but most of the largest ones rely on mechanisms which create winner-takes-most situations (IP, brand recognition, network effects, economies of scale). That's why you have a fat tail in wealth created by entrepreneurs (IMHO).

To go back to research. Scientific breakthroughs are not a limited resource, it's true. But given the area of expertise of a researcher and the state of the art in the field, the most promising research topics are limited. And there are many researchers going into those topics. The first to find even a partial solution will easily get published on a fast track. The others will get published but much extra work will be required: compare with previous results, fight referees who favor other approaches, show extra rigor in the analysis... All this will lower their apparent productivity. Or, if you are not confident, you can take a less promising topic: you have less risk but your expected productivity goes down anyway. To this, add that better researchers get access to better complements: more funding, more and better collaborators, maybe fewer teaching responsibilities if you are in academia. All this widens the productivity gap between the best and the not-much-worse. Funding is particularly perverse because it's partially awarded on past results without dividing by money spent to obtain them, so good/lucky researchers enter into a cycle of more results -> more funding -> even more results -> even more funding ...

In general, I think fat tails in outcomes are present everywhere because they come out naturally from the interaction of incentive structures (e.g. markets, IP, funding), economies of scale, and network effects. But they don't need to reflect an underlying distribution of abilities. I obviously cannot prove that they never do, but my standard assumption is that they don't. (You could say that I have a prior that ability is distributed in a Gaussian way, given that as far as I know all human characteristics that are directly measurable on an absolute scale look more Gaussian-like than Pareto-like.)

Replies from: yhoiseth
comment by yhoiseth · 2021-05-26T16:23:49.392Z · LW(p) · GW(p)

Thanks a ton. That is very helpful. I think I understand your point now. (Others in the comments have also said something similar, but I didn’t grasp it until now.)

Let me try to work through it in my own words and apply your insight to my question:

Education contributes to people’s abilities — at least, that’s the idea. It also certifies them. Ability is roughly Gaussian, so tests and teaching should assume that. Which they currently do.

Results, however, depend on many other (possibly overlapping) things, such as

  1. luck;
  2. market structure;
  3. intellectual property rights;
  4. economies of scale;
  5. branding; and
  6. network effects.

For education policy, Pareto results don’t matter. Schools can only affect the input, not the output.

I still think my reform suggestion is good. But I am no longer convinced that Pareto performance implies anything for education reform. Unless, of course, it turns out that ability does follow a Pareto distribution. But that seems unlikely to me.

answer by James_Miller · 2021-05-24T16:46:15.829Z · LW(p) · GW(p)

Providing one-on-one tutoring to highly intelligent children should be considered by the effective altruism community in part because many members of this community would themselves be qualified to be such tutors.

comment by yhoiseth · 2021-05-25T15:32:14.952Z · LW(p) · GW(p)

Good idea. Would love to hear if anyone has any experience trying this.

Replies from: James_Miller
comment by James_Miller · 2021-05-27T11:58:28.346Z · LW(p) · GW(p)

I do with my son.

answer by CellBioGuy · 2021-05-25T08:09:27.954Z · LW(p) · GW(p)

Returns on performance being Pareto distributed is emphatically NOT the same thing as performance being Pareto distributed.

comment by Gurkenglas · 2021-05-25T09:23:23.150Z · LW(p) · GW(p)

How do you measure performance, then? If you can only rank it, distributions mean nothing.

comment by yhoiseth · 2021-05-25T15:33:56.116Z · LW(p) · GW(p)

I’m not sure if I understand what you mean. Would you care to elaborate?

Replies from: yhoiseth
answer by Stuart Anderson · 2021-05-24T21:20:04.203Z · LW(p) · GW(p)

-

comment by Vanilla_cabs · 2021-05-26T06:21:25.427Z · LW(p) · GW(p)

The easiest way to deal with the smart outliers is to remove the speed limit as you suggest.

I can't find the American report I read years ago about acceleration, but the conclusion was that grade skipping's benefits almost always outweighed the drawbacks. In particular, socialisation does not always degrade after skipping; it might actually improve. Grade skipping has the advantage of being totally free (actually saving money for everyone involved including taxpayers) and applicable today.

TL/DR: Grade skipping is a low hanging fruit.

Replies from: stuart-anderson
answer by freedomandutility · 2021-05-25T17:17:42.859Z · LW(p) · GW(p)

If ability in the underlying population is normally distributed, competition for jobs should still lead to people from the right side of the normal distribution ending up in the relevant jobs, and people from the left side of the normal distribution not getting the jobs. If we now measure the performance of people with the jobs, shouldn't we expect the graph to look like the right side of a normal distribution, which looks more like a Pareto distribution than an entire normal distribution?

So surely finding that the performance of employees in a field looks more like a Pareto distribution than a normal distribution doesn't demonstrate that individual performance at the population level is more like a Pareto distribution than a normal distribution?
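The selection argument can be illustrated with a short calculation. This is a sketch under assumptions that are not in the comment: ability is standard normal, only people above a hard cutoff get the job, and the cutoff of 1.5 standard deviations is an arbitrary illustrative value. The helper name `truncated_tail_summary` is hypothetical.

```python
from statistics import NormalDist


def truncated_tail_summary(cutoff):
    """Summarise a standard normal restricted to values above `cutoff`.

    Returns (fraction selected, mean of the selected, median of the selected).
    """
    std = NormalDist()
    tail_mass = 1.0 - std.cdf(cutoff)
    # Mean of a standard normal truncated from below: pdf(c) / P(Z > c).
    mean = std.pdf(cutoff) / tail_mass
    # Median: the point that splits the selected tail mass in half.
    median = std.inv_cdf(std.cdf(cutoff) + tail_mass / 2.0)
    return tail_mass, mean, median


tail_mass, tail_mean, tail_median = truncated_tail_summary(cutoff=1.5)
print(f"selected fraction: {tail_mass:.3f}")
print(f"mean ability of selected: {tail_mean:.2f}, median: {tail_median:.2f}")
```

Among those selected, most people sit just above the cutoff and the mean exceeds the median, so the shape is right-skewed and monotonically decreasing, as a Pareto distribution is. One caveat: the tail of a truncated normal still decays much faster than a power law, so the resemblance is in overall shape near the cutoff and not in the far tail.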

comment by yhoiseth · 2021-05-26T08:15:41.669Z · LW(p) · GW(p)

Good question. You might be right about that.

27 comments

Comments sorted by top scores.

comment by Viliam · 2021-05-25T18:51:58.307Z · LW(p) · GW(p)

I agree that in a perfect world, you could progress in each subject individually. It simply does not make sense to say "we will not allow you to learn more math, because you suck at history" (or vice versa).

This does not necessarily imply that minimal requirements need to be removed. I could imagine a school that insists on you attending the subjects you suck at... without preventing you from simultaneously studying other subjects at a higher level.

As Dagon said, the exact distribution is mostly irrelevant for this argument. In a world where skills are normally distributed, the same reform would still be an improvement.

The elephant in the classroom is childcare. Most parents need it. A few don't. If you provide mass childcare, it makes sense to provide some education at the same time. If you don't need the childcare, the forced coupling of childcare and education is annoying. Maybe we should decouple childcare from education, starting by decoupling teaching from certification -- if children are tested by an external institution, it makes it easy to also test homeschooled children fairly using the same system. (Note: by supporting homeschooling you also support all kinds of experiments in education, which can formally pretend to be homeschooling. This in my opinion is even more important than homeschooling as such.) And if testing is external, it also makes it easy to test each subject at individual speed.

I agree that the positive extremes matter in education. The person who in the future invents the cure for cancer should be allowed to progress as quickly as possible, without being artificially slowed down to the level of the average student; even the average student would benefit from this rule.