Evaluating expertise: a clear box model

post by JustinShovelain · 2020-10-15T14:18:23.599Z · score: 26 (14 votes) · LW · GW · 3 comments


    Purpose of expertise modelling
    Types of expertise modelling
    How expertise modelling fits within a truth finding process
  Clear box expertise modelling 
    Main suggested heuristics for clear box expertise modelling
    Use of clear box expertise modelling
  Expertise Calculator


Purpose of expertise modelling

To get what we value we must make good decisions. To make these decisions we must know what relevant facts are true. But the world is so complex that we cannot check everything directly ourselves and so must defer to topic “experts” for some things. How should we choose these experts and how much should we believe what they tell us? In this document, I’ll describe a way to evaluate experts.

Many of the problems in the world, be they political, economic, scientific, or personal, are caused by or exacerbated by making epistemic mistakes. We trust in the wrong advice and don’t seek out the right advice. We vote for the wrong politicians, believe the marketers, promote bad bosses, are mesmerized by conspiracy theories, are distracted by the irrelevant, fight with our neighbors, lack important information, suffer accidents, and don’t know the best of what has been discovered. If we accurately know what to do, how to do it, and why to do it, then we become more effective and motivated. 

Types of expertise modelling

To evaluate these experts individually, we can use three methods: black box models, clear box models, or deferring further to other, “meta”, experts about these topic experts (see also this and this [LW · GW]). 

How expertise modelling fits within a truth finding process

To move towards knowing the truth about a topic, a good process is to work through the following steps:

  1. Figure out one’s own impression: Form our independent impression of the topic, without the experts’ conclusions (though possibly with their models), for instance by using Fermi [LW · GW] modelling and Bayesian analysis, taking into account the limitations [EA · GW] in our data, and dealing with the upsides and downsides of using explicit probabilities [EA · GW].
  2. Evaluate experts individually: Combine the three types of expertise modelling above to evaluate experts. Some additional tools and perspectives for this: double cruxing [LW · GW], frame conflicts [LW · GW], types of disagreement [LW · GW], causes of disagreement analysis [LW · GW], Bayesian truth serum, mechanism design, the meanings [LW · GW] of words [LW · GW], bets, analysis of the dynamics between experts [LW · GW], and Aumann’s agreement theorem.
  3. Aggregate expert opinions: Aggregate across a sample of the topic experts to figure out what we believe given all of them. For example, using tools like prediction markets 1 [LW · GW],2 [? · GW],3 [LW · GW], the techniques used in superforecasting, the Delphi method, and mimesis [LW · GW].
  4. Combine one’s impression and the aggregation of expert opinions into an overall “all things considered” assessment [EA · GW].

(Further gains can be had by complexifying these steps and going back and forth over them rather than just working down the list. Also, gains may be had as a community using a process like ‘Evidential Reasoning’ referred to here and perhaps mechanisms like that described here.)
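Steps 3 and 4 above can be sketched in a few lines. This is only a minimal illustration, assuming that each expert's opinion is a probability for the same claim, that each expert has already been given an expertise weight in step 2, and that opinions are pooled by a weighted average in log-odds space (one common pooling rule among many, not one the post prescribes). All names and numbers are hypothetical.

```python
import math

def logit(p):
    """Convert a probability (strictly between 0 and 1) to log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def pool(probabilities, weights):
    """Weighted average of probabilities, taken in log-odds space."""
    total = sum(weights)
    mean_logit = sum(w * logit(p) for p, w in zip(probabilities, weights)) / total
    return inv_logit(mean_logit)

# Hypothetical inputs: our own impression, plus three experts with
# expertise weights assigned during step 2.
own_impression = 0.30
expert_probs = [0.60, 0.70, 0.55]
expert_weights = [1.0, 3.0, 2.0]

# Step 3: aggregate across experts.
expert_aggregate = pool(expert_probs, expert_weights)

# Step 4: combine our impression with the aggregate. The 1:4 split is an
# assumed illustration of trusting the experts more than ourselves.
all_things_considered = pool([own_impression, expert_aggregate], [1.0, 4.0])

print(round(expert_aggregate, 3), round(all_things_considered, 3))
```

The weights are where the expertise model feeds into the process: a better evaluation of each expert changes how far their opinion moves the final assessment.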

Clear box expertise modelling 

Main suggested heuristics for clear box expertise modelling

Let’s zoom in now on clear box modelling, the primary purpose of this post. How do we evaluate when others know more about a topic than ourselves? How do we compare experts? How can we know how much someone knows about a complex topic and how clear their thinking is about it?

Loosely inspired by AI theory, I believe that some good heuristic features to focus on are the following (see also this post [EA · GW] that makes some similar points):

(note that the necessity of these heuristics will depend on the specific topic and its type of difficulty) 

A Fermi pseudo equation (the mathematical version of pseudocode) to summarize this:

The importance of each factor would vary by topic. As a heuristic composed of heuristics, I think this is a good start.
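As a rough illustration of what a Fermi pseudo equation with topic-dependent importance could look like in code, here is a minimal sketch. It assumes a multiplicative form in which each factor is rated as a ratio against a baseline (1.0 means "same as baseline") and raised to a topic-specific weight. The factor names, ratings, and weights are illustrative placeholders, not the post's own list or the calculator's values.

```python
import math

def expertise_score(ratings, weights):
    """Multiply factor ratings, each raised to its topic-specific weight.

    Computed in log space, so this is a weighted product of the ratings.
    Ratings must be positive; a weight of 0 makes a factor irrelevant.
    """
    log_score = sum(weights[name] * math.log(rating)
                    for name, rating in ratings.items())
    return math.exp(log_score)

# Hypothetical weights for one topic (placeholder factor names).
weights = {"time_on_topic": 1.0, "feedback_quality": 0.5, "incentive_alignment": 0.5}

# Hypothetical ratings for two experts, relative to a baseline of 1.0.
expert_a = {"time_on_topic": 10.0, "feedback_quality": 4.0, "incentive_alignment": 1.0}
expert_b = {"time_on_topic": 2.0, "feedback_quality": 9.0, "incentive_alignment": 4.0}

score_a = expertise_score(expert_a, weights)
score_b = expertise_score(expert_b, weights)

# Only the ratio is meaningful in a Fermi-style comparison.
print(score_a / score_b)
```

Changing the weights dictionary is how the same factors would be re-used for a different topic.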

Use of clear box expertise modelling

These factors can be used either in Fermi pseudo equation form, or as a checklist to compare experts and help ensure you consider all relevant factors. (See here for the usefulness of checklists.)

These heuristics can also be used constructively when trying to become an expert in a topic or when teaching others, as these are the factors to optimize for in order to understand a topic. They also give a sense of how much you yourself know, both compared to others and in an absolute sense, so you can judge how humble you should be and how much you have yet to learn.

Finally, once you have evaluated someone's expertise, you can use that information in your truth finding processes, which you in turn use to make decisions and achieve your goals and values.

In the spirit of providing models that people can interact with, I have provided a simple online calculator for the expertise equation heuristic:

Expertise Calculator

 (this is very much a rough draft calculator and, with its guessed weights, tries to cover the vast range of expertise from your dog Spot considering the topic for a moment to Einstein devoting his life to it)


My thanks to Ozzie Gooen [LW · GW], David Kristoffersson [LW · GW], Denis Drescher [EA · GW], Michael Aird [LW · GW], Marcello Herreshoff, Siebe Rozendal [EA · GW], Elizabeth [LW · GW], Dan Burfoot [LW · GW], Gregory Lewis [EA · GW], Spencer Greenberg, Shri Samson, Andres Gomez Emilsson [EA · GW], Alexey Turchin [LW · GW], and Remmelt Ellen [EA · GW] for reviewing and providing helpful feedback on the article.


Comments sorted by top scores.

comment by waveman · 2020-10-16T01:02:03.878Z · score: 6 (3 votes) · LW(p) · GW(p)
  • Black box/outside view of the expert: This type of modelling would be just looking at the expert’s prediction accuracy in the past without asking about detailed properties of how they come to those decisions. Their prediction accuracy is ultimately what we want to get at but sometimes track records are incomplete or don’t exist yet.

    [Worked out how to exit quote mode: pressing alt-enter 3 times works, today at least.]

You can do a lot better than this. Some signs of an expert from an outside perspective:

1. Can predict the future better than simple extrapolations. 
2. Can fix broken things better than everyman.
3. Can design and make things better than everyman.
4. Can explain things in a parsimonious way better than everyman.

All of the above need to take into account the possibility that luck played a part. For example, if millions of people play the stock market and 29 get rich, then you need to take the large number of "attempts" into account when deciding whether the 29 have skill.

When you take this seriously it is astonishing how many 'experts' appear to have no skill at all.
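The luck adjustment above can be made concrete with a small calculation. This is a sketch under loudly assumed numbers: the 29 winners come from the example above, but the number of players and the chance-only probability of getting rich are hypothetical, and the count of lucky winners is approximated as Poisson.

```python
import math

def poisson_tail(k_min, lam):
    """P(X >= k_min) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**k / math.factorial(k)
                for k in range(k_min))
    return 1.0 - below

n_players = 2_000_000    # hypothetical stand-in for "millions of people"
p_lucky = 1e-5           # hypothetical chance of getting rich with no skill
observed_winners = 29    # from the example above

# How many winners pure luck would be expected to produce.
expected_by_luck = n_players * p_lucky

# How surprising the observed count is if nobody has any skill.
p_at_least_observed = poisson_tail(observed_winners, expected_by_luck)

print(expected_by_luck, p_at_least_observed)
```

Under these assumed numbers, luck alone would be expected to produce about 20 winners, so most of the 29 could be explained by chance and any single winner's success says little about their skill.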

comment by Mike G · 2020-10-17T20:51:44.739Z · score: 1 (1 votes) · LW(p) · GW(p)

Good blog.

How many fields offer "short feedback loops over a long time period that are accurate"?

In my field, K-12 teacher coaching, there is rarely data on teacher performance that isn't "noisy or irrelevant or statistically biased." The Gates Foundation spent $100 million trying to figure out this problem, but it proved thorny in both the stats and the politics.

Even in a data-loving field, like golf, with many competing experts (instructors) for hire, it's nearly impossible to know which ones actually generate the largest gains in their students. To use Waveman's point, it's hard to assess their relative skill at "fixing broken skills" because the data isn't available...even to the instructors.

What are the fields that best lend themselves to this sort of calculator?

comment by NunoSempere (Radamantis) · 2020-10-17T18:24:38.431Z · score: 1 (1 votes) · LW(p) · GW(p)

Very interesting! Your categorization into black box / clear box / social reputation seems like it's missing a level, and hence to me your names feel slightly off. I might instead think in terms of:

  1. Clear box: I fact check some of the expert's claims, and estimate the accuracy of the claims I can't check based on the ones I can. For example, [Ibn Tufail's](https://en.wikipedia.org/wiki/Ibn_Tufail) metaphysical claims might be difficult to refute, but his books also reference biological mechanisms which are easier to evaluate (e.g., men can be born from mud). Similarly, if an expert claims some broad historical thesis, I can compare it to, e.g., Spain in the last centuries to see if it checks out.
  2. Black box: I know the person's track record, or that the track record is good, but not which claims/accomplishments it's based on. For example, I know that Renaissance Technologies has a good track record in making money from the stock market and cultivating startups, even if I don't know how exactly they did it. Or I might know that someone is a super-forecaster without knowing what questions they have predicted to get there.
  3. Proxies box (your clear box): I look at proxies for accuracy/track record. Some can be mechanistic, like skin in the game, alignment, computational power, time. But you can't look at, say, computational power or alignment directly (yet), so you might have to look at correlational proxies for those, like prestigious university affiliations, big car, nice suit, English accent, bringing up cogent and interesting points in a conversation, presentation skills, etc.
  4. Deference pointer: I trust other people's assessment & status signals.

On 1., see Epistemic Spot Checks [LW · GW], and in particular this comment thread [LW(p) · GW(p)]. On 3., see Hanson's How to pick an X.