Help improve reasoning evaluation in intelligence organisations

post by LThorburn · 2020-05-11T20:26:13.634Z · LW · GW · 4 comments

Contents

  Study Motivation
    Methodological Note
  What does participation involve?
  Sign Up Link
  Related Reading
4 comments

Cross-posted from the EA Forum [EA · GW].

TL;DR - My research group at the University of Melbourne is working to improve methods for evaluating quality of reasoning, particularly for use within government intelligence organisations. We’re conducting a study to compare a new evaluation method with the method currently used by intelligence agencies in the US. By participating, you get access to training materials in both the existing and proposed methods. You can sign up here.

Study Motivation

It is important that conclusions reached by analysts working in professional intelligence organisations are accurate so that resulting decisions made by governments and other decision-makers are grounded in reality. Historically, failures of intelligence have contributed to decisions or oversights that wasted resources and often caused significant harm. Prominent examples from US history include the attack on Pearl Harbor, the 1961 Bay of Pigs invasion, 9/11, and the Iraq War.

Such events are at least partly the result of institutional decisions made based on poor reasoning. To reduce the risk of such events, it is important that the analysis informing those decisions is well reasoned. We use the phrase well reasoned to mean that the arguments articulated establish the stated conclusion. (If the arguments fail to establish the stated conclusion, we say the analysis is poorly reasoned.)

The ‘industry standard’ method for evaluating quality of reasoning (QoR) amongst intelligence organisations in the US is the IC Rating Scale, a rubric based on a set of Analytic Standards issued by the US Office of the Director of National Intelligence (ODNI) in 2015. There are significant question marks over the extent to which the IC Rating Scale is (and can be) operationalised to improve the QoR in intelligence organisations. See here for a detailed summary, but in brief:

Our research group has been developing an alternative method for evaluating QoR, notionally called the Reasoning Stress Test (RST), which focuses on detecting the presence of particular types of reasoning flaws in written reasoning. The RST is designed to be an easy-to-apply and efficient method, but this approach comes at a cost: raters do not consider the degree to which the reasoning displays other reasoning virtues, nor do they go through a checklist of the necessary and sufficient conditions of good reasoning.

We are conducting a study to compare the ability of participants trained in each method to discriminate between well-reasoned and poorly reasoned intelligence-style products (among other research questions).

We are offering training in both the current and novel methods for evaluating QoR in return for participation in the study. The training has been primarily designed for intelligence analysis, so it will give you insight into how reasoning is evaluated in such institutions. However, the principles of reasoning quality taught are much more broadly applicable. They apply to all types of reasoning, and can be used to assess QoR in any institution with intelligence or analytical roles.

Methodological Note

We are aware that by publicly describing the potential limitations of the two methods—as we have done above—we risk prejudicing participants’ responses to either method in the study. The alternative, not to provide such information, would make it harder for you to decide whether the training is of interest. We decided to provide the information because:

Significant work has been done to develop polished, insightful training in both methods, and we are confident that learning the principles behind both methods, and how to apply them, will help you evaluate the reasoning of others.

What does participation involve?

Participating in the study involves:

You can sign up here.

---

Any questions, comments or suggestions welcome.

4 comments


comment by [deleted] · 2020-05-12T04:52:20.775Z · LW(p) · GW(p)

Hi, judging from your post history, you seem to be new to LessWrong. It might help if you add some info on why you think the study would be of interest to our specific community.

Replies from: LThorburn
comment by LThorburn · 2020-05-25T23:09:46.540Z · LW(p) · GW(p)

Thanks for your comment. This is my first post, but I have been reading LessWrong and adjacent sites for 6+ years, so I'm not unfamiliar with the rationalist community.

I don't think I have much to add beyond the pitch in the original post. This is an opportunity to help improve reasoning and decision-making in real-world institutions with significant influence. By participating you get access to training materials used to teach reasoning evaluation in such institutions, which may be of intrinsic methodological interest. Further, by completing the training you may learn new skills that you can apply to improve your own reasoning.

It is also an opportunity to benchmark your own reasoning evaluation skills against others: we will be sending out such feedback once the study is complete, and are currently looking at ways to incorporate benchmarking into the training itself.

comment by Closed Limelike Curves · 2020-05-12T02:37:34.858Z · LW(p) · GW(p)

This post seems both interesting and like a way to get a very unrepresentative sample.

Replies from: LThorburn
comment by LThorburn · 2020-05-25T22:43:26.584Z · LW(p) · GW(p)

Appreciate your comment - we are aware of that and this is just one of several recruitment avenues we are pursuing.