Evaluating expertise: a clear box model
post by JustinShovelain · 2020-10-15T14:18:23.599Z · LW · GW · 3 comments
Purpose of expertise modelling
To get what we value we must make good decisions. To make these decisions we must know what relevant facts are true. But the world is so complex that we cannot check everything directly ourselves and so must defer to topic “experts” for some things. How should we choose these experts and how much should we believe what they tell us? In this document, I’ll describe a way to evaluate experts.
Many of the problems in the world, be they political, economic, scientific, or personal, are caused by or exacerbated by making epistemic mistakes. We trust in the wrong advice and don’t seek out the right advice. We vote for the wrong politicians, believe the marketers, promote bad bosses, are mesmerized by conspiracy theories, are distracted by the irrelevant, fight with our neighbors, lack important information, suffer accidents, and don’t know the best of what has been discovered. If we accurately know what to do, how to do it, and why to do it, then we become more effective and motivated.
Types of expertise modelling
To evaluate these experts individually, we can use three methods: black box models, clear box models, or deferring further to other, “meta”, experts about these topic experts (see also this and this [LW · GW]).
- Black box/outside view of the expert: This type of modelling looks only at the expert’s past prediction accuracy, without asking about the detailed properties of how they reach those predictions. Their prediction accuracy is ultimately what we want to get at, but sometimes track records are incomplete or don’t exist yet.
- Clear box/white box/inside view of the expert/interpretability: This type of modelling looks inside and asks about the specific properties of the experts that make them accurate. This lets us gauge their opinions when we don’t have a predictive track record for them. It also lets us better estimate to what extent their expertise generalizes, points out possible ways they may err and how to fix these errors, and points out how to gain expertise ourselves and improve upon the state-of-the-art.
- Social reputation/deference pointer: This strategy passes the buck of analyzing whether to believe an expert to other, meta-experts; but it still requires an ability to evaluate a meta-expert's ability to evaluate other experts, and so reduces to using black box or clear box models about the meta-expert’s ability to evaluate other experts. This has the advantage of letting us quickly assess something, but has the downsides of social biases and of playing a game of telephone.
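To make the black box approach concrete, here is a minimal sketch of scoring an expert's track record with the Brier score (mean squared error between stated probabilities and 0/1 outcomes; lower is better). The experts and their predictions are hypothetical, purely for illustration:

```python
# Black-box expert evaluation sketch: score past probabilistic
# predictions against observed outcomes with the Brier score.
# Lower scores indicate better calibration/accuracy.

def brier_score(predictions, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes."""
    assert len(predictions) == len(outcomes) and predictions
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Hypothetical track records: expert A is reasonably calibrated,
# expert B is overconfident and sometimes badly wrong.
expert_a = [0.8, 0.7, 0.4, 0.3]
expert_b = [0.99, 0.99, 0.99, 0.9]
outcomes = [1, 1, 0, 0]

score_a = brier_score(expert_a, outcomes)  # lower (better) than score_b
score_b = brier_score(expert_b, outcomes)
```

Note that this only works when a track record of resolved predictions exists, which is exactly the limitation the clear box approach below is meant to address.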
How expertise modelling fits within a truth finding process
To move towards knowing the truth about a topic, a good process would be the following steps:
- Figure out one’s own impression: Figure out our independent, without-the-experts (though possibly with-their-models) impression of the topic. For instance by using Fermi [LW · GW] modelling and Bayesian analysis, taking into account the limitations [EA · GW] in our data, and dealing with the upsides and downsides of using explicit probabilities [EA · GW].
- Evaluate experts individually: Combine the three types of expertise modelling above to evaluate experts. Some additional tools and perspectives for this: double cruxing [LW · GW], frame conflicts [LW · GW], types of disagreement [LW · GW], causes of disagreement analysis [LW · GW], Bayesian truth serum, mechanism design, the meanings [LW · GW] of words [LW · GW], bets, analysis of the dynamics between experts [LW · GW], and Aumann’s agreement theorem.
- Aggregate expert opinions: Aggregate across a sample of the topic experts to figure out what we believe given all of them. For example, using tools like prediction markets 1 [LW · GW],2 [? · GW],3 [LW · GW], the techniques used in superforecasting, the Delphi method, and mimesis [LW · GW].
- Combine one’s impression and the aggregation of expert opinions into an overall “all things considered” assessment [EA · GW].
(Further gains can be had in complexifying and going back and forth over these steps and not just down the list. Also, gains may be had as a community using a process like ‘Evidential Reasoning’ referred to in here and perhaps mechanisms like that described here.)
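As a toy illustration of the "aggregate expert opinions" step above, one simple pooling rule is to average the experts' probability estimates in log-odds space, which behaves better at the extremes than a plain arithmetic mean. The weights here are illustrative guesses standing in for however much trust your expert evaluation assigns, not a method prescribed by the post:

```python
# Sketch of aggregating expert probability estimates by weighted
# averaging in log-odds space, then mapping back to a probability.
import math

def logit(p):
    """Map a probability in (0, 1) to log-odds."""
    return math.log(p / (1 - p))

def pool_log_odds(probs, weights=None):
    """Weighted mean of log-odds, converted back to a probability."""
    if weights is None:
        weights = [1.0] * len(probs)
    mean_logit = sum(w * logit(p) for w, p in zip(weights, probs)) / sum(weights)
    return 1 / (1 + math.exp(-mean_logit))

# Three experts on the same yes/no question; the first is trusted
# twice as much as the others (hypothetical weights).
pooled = pool_log_odds([0.9, 0.7, 0.6], weights=[2.0, 1.0, 1.0])
```

One could equally plug in the outputs of a prediction market or a Delphi round here; the point is just that the aggregation step is a concrete computation once per-expert weights are chosen.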
Clear box expertise modelling
Main suggested heuristics for clear box expertise modelling
Let’s zoom in now on clear box modelling, the primary purpose of this post. How do we evaluate when others know more about a topic than ourselves? How do we compare experts? How can we know how much someone knows about a complex topic and how clear their thinking is about it?
- Data/unsupervised learning data quality (1, 2): How much data and feedback have these experts been exposed to? If they have short feedback loops over a long time period that are accurate (not noisy, irrelevant, or statistically biased), and if the problem is simple, then they probably have enough data for developing good explicit and intuitive models. Have they been exposed to alternative models and had their own thoughts subject to feedback and criticism? Also, if they have knowledge and skills in a closely related domain, these may transfer to this topic. The AI analogue is the amount and quality of training data they have and roughly corresponds to the units of accuracy/virtual cycle.
- Incentives/motivation/supervised learning: How motivated are they towards understanding the topic? How motivated are they to share that knowledge without distortion? If their incentives are aligned with yours and they are paid to be accurate about the topic then there is a good chance they are motivated to give you the correct answer. This factor can be broken up into incentives to come to know the truth personally and incentives to convey what they believe accurately. The AI analogue is getting the right reinforcement, correctly labeled data, or having the right utility function and roughly corresponds to the units of cycles/cycles.
- Compute: How neurologically intelligent (the neurological contribution to IQ) are they, how creatively are they taking into account diverse perspectives, and how attentively focused on the task are they? This factor is a bit messy because of the neuroscience, but roughly corresponds to raw neurological speed, neurological parallelism, low level neurological wiring efficiency, neuroplasticity, and working memory size acting as a memory cache to speed things up. The AI analogue is having a lot of compute per unit time and corresponds to the units of cycles/second.
- Effective thinking: How good are they at thinking, in terms of rationality and meta-cognition, about the problems? If they don’t have good general methods to learn, think, and find mental errors then they are likely to be inefficient at figuring out the truth. Effective thinking methods are partially dependent on the topic. The AI analogue is having efficient good algorithms that approximate solutions well and don’t have systematic biases and roughly corresponds to the units virtual cycles/cycle.
- Time: How long have they been thinking about the topic at hand? Even given all the above, if they just haven’t had time to think about the question they may very well not give a good answer. The AI analogue corresponds to how deep a search process progresses and simply how much compute time has occurred, and corresponds to the units of seconds.
(Note that how necessary each of these heuristics is will depend on the specific topic and its type of difficulty.)
A Fermi pseudo equation (the mathematical version of pseudocode) to summarize this:

Expertise ≈ Data quality × Incentives × Compute × Effective thinking × Time

The units given above cancel as intended: (accuracy/virtual cycle) × (cycles/cycle) × (cycles/second) × (virtual cycles/cycle) × seconds ≈ accuracy.
The importance of each factor would vary by topic. As a heuristic composed of heuristics, I think this is a good start.
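A hedged sketch of how the pseudo equation could be turned into a rough calculator, in the spirit of the one linked below. The 0-to-1 rating scales, the example profiles, and the idea of weighting factors by exponents are all illustrative guesses, not the post's actual weights:

```python
# Rough-draft expertise calculator: multiply per-factor ratings
# (each a guessed value in (0, 1]) corresponding to the five
# clear-box heuristics. Exponents stand in for per-topic importance.

FACTORS = ["data", "incentives", "compute", "effective_thinking", "time"]

def expertise_score(ratings, exponents=None):
    """Multiplicative expertise heuristic over the five factors."""
    if exponents is None:
        exponents = {f: 1.0 for f in FACTORS}
    score = 1.0
    for f in FACTORS:
        score *= ratings[f] ** exponents[f]
    return score

# Hypothetical profiles: a casual observer vs. a career expert.
casual_observer = {"data": 0.2, "incentives": 0.5, "compute": 0.6,
                   "effective_thinking": 0.4, "time": 0.1}
career_expert = {"data": 0.9, "incentives": 0.8, "compute": 0.7,
                 "effective_thinking": 0.7, "time": 0.9}
```

Because the form is multiplicative, a near-zero rating on any single factor (say, almost no time spent on the topic) drags the whole score down, which matches the intuition that each heuristic is individually necessary.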
Use of clear box expertise modelling
These factors can be used either in Fermi pseudo equation form, or as a checklist to compare experts and help ensure you consider all relevant factors. (See here for the usefulness of checklists.)
These heuristics can also be used constructively when trying to become an expert in a topic or when teaching others, as these are factors to optimize for in order to understand a topic. They also give a sense of how much you know yourself in comparison (to others and in an absolute sense) so you can know how humble you should be and how much you have yet to learn.
Finally once you have evaluated the expertise of someone you can use that information in your truth finding processes which you in turn use to make decisions and achieve your goals and values.
In the spirit of providing models that people can interact with, I have provided a simple online calculator for the expertise equation heuristic:
(This is very much a rough-draft calculator and, with its guessed weights, tries to cover the vast range of expertise, from your dog Spot considering the topic for a moment to Einstein devoting his life to it.)
My thanks to Ozzie Gooen [LW · GW], David Kristoffersson [LW · GW], Denis Drescher [EA · GW], Michael Aird [LW · GW], Marcello Herreshoff, Siebe Rozendal [EA · GW], Elizabeth [LW · GW], Dan Burfoot [LW · GW], Gregory Lewis [EA · GW], Spencer Greenberg, Shri Samson, Andres Gomez Emilsson [EA · GW], Alexey Turchin [LW · GW], and Remmelt Ellen [EA · GW] for reviewing and providing helpful feedback on the article.