MichaelA's Shortform
post by MichaelA · 2020-01-16T11:33:31.728Z · LW · GW · 22 comments
Comments sorted by top scores.
comment by MichaelA · 2020-03-28T07:42:49.724Z · LW(p) · GW(p)
Collection [EA · GW] of discussions of key cruxes related to AI safety/alignment
These are works that highlight disagreements, cruxes, debates, assumptions, etc. about the importance of AI safety/alignment, about which risks are most likely, about which strategies to prioritise, etc.
I've also included some works that attempt to clearly lay out a particular view in a way that could be particularly helpful for others trying to see where the cruxes are, even if the work itself doesn't spend much time addressing alternative views. I'm not sure precisely where to draw the boundaries in order to make this collection maximally useful.
These are ordered from most to least recent.
I've put in bold those works that very subjectively seem to me especially worth reading.
General, or focused on technical work
Ben Garfinkel on scrutinising classic AI risk arguments - 80,000 Hours, 2020
Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics [EA · GW] - James Fodor, 2020; this received pushback from Rohin Shah, which resulted in a comment thread [EA · GW] worth adding here in its own right
Fireside Chat: AI governance - Ben Garfinkel & Markus Anderljung, 2020
My personal cruxes for working on AI safety [EA · GW] - Buck Shlegeris, 2020
What can the principal-agent literature tell us about AI risk? [LW · GW] - Alexis Carlier & Tom Davidson, 2020
Beyond Near- and Long-Term: Towards a Clearer Account of Research Priorities in AI Ethics and Society - Carina Prunkl & Jess Whittlestone, 2020 (commentary here [EA · GW])
Interviews with Paul Christiano [LW · GW], Rohin Shah [LW · GW], Adam Gleave, and Robin Hanson [LW · GW] - AI Impacts, 2019 (summaries and commentary here [EA · GW] and here [EA · GW])
Brief summary of key disagreements in AI Risk [EA · GW] - iarwain, 2019
A list of good heuristics that the case for AI x-risk fails [LW · GW] - capybaralet, 2019
Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More [LW · GW] - 2019
Clarifying some key hypotheses in AI alignment [LW · GW] - Ben Cottier & Rohin Shah, 2019
A shift in arguments for AI risk - Tom Sittler, 2019 (summary and discussion here [LW · GW])
The Main Sources of AI Risk? [LW · GW] - Wei Dai & Daniel Kokotajlo, 2019
Current Work in AI Alignment - Paul Christiano, 2019 (key graph can be seen at 21:05)
What failure looks like [LW · GW] - Paul Christiano, 2019 (critiques here [LW · GW] and here; counter-critiques here [LW · GW]; commentary here [LW · GW])
Disentangling arguments for the importance of AI safety [LW · GW] - Richard Ngo, 2019
Reframing superintelligence - Eric Drexler, 2019 (I haven't yet read this; maybe it should be in bold)
Prosaic AI alignment [LW · GW] - Paul Christiano, 2018
How sure are we about this AI stuff? [? · GW] - Ben Garfinkel, 2018 (it's been a while since I watched this; maybe it should be in bold)
AI Governance: A Research Agenda - Allan Dafoe, 2018
Some conceptual highlights from “Disjunctive Scenarios of Catastrophic AI Risk” [LW · GW] - Kaj Sotala, 2018 (full paper here)
A model I use when making plans to reduce AI x-risk [LW · GW] - Ben Pace, 2018
Interview series on risks from AI - Alexander Kruel (XiXiDu), 2011 (or 2011 onwards?)
Focused on takeoff speed/discontinuity/FOOM specifically
Discontinuous progress in history: an update [LW · GW] - Katja Grace, 2020 (also some more comments here [EA · GW])
My current framework for thinking about AGI timelines [LW · GW] (and the subsequent posts in the series) - zhukeepa, 2020
What are the best arguments that AGI is on the horizon? [EA · GW] - various authors, 2020
The AI Timelines Scam [LW · GW] - jessicat, 2019 (I also recommend reading Scott Alexander's comment there)
Double Cruxing the AI Foom debate [LW · GW] - agilecaveman, 2018
Quick Nate/Eliezer comments on discontinuity [LW · GW] - 2018
Arguments about fast takeoff [LW · GW] - Paul Christiano, 2018
Likelihood of discontinuous progress around the development of AGI - AI Impacts, 2018
The Hanson-Yudkowsky AI-Foom Debate - various works from 2008-2013
Focused on governance/strategy work
My Updating Thoughts on AI policy [LW · GW] - Ben Pace, 2020
Some cruxes on impactful alternatives to AI policy work [LW · GW] - Richard Ngo, 2018
Somewhat less relevant
A small portion of the answers here [EA · GW] - 2020
I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.
Replies from: matthew-barnett
↑ comment by Matthew Barnett (matthew-barnett) · 2020-03-28T08:26:02.997Z · LW(p) · GW(p)
I think you should add Clarifying some key hypotheses in AI alignment [LW · GW].
Replies from: MichaelA
comment by MichaelA · 2020-01-16T11:33:31.914Z · LW(p) · GW(p)
Ways of describing the “trustworthiness” of probabilities
While doing research for a post on the idea of a distinction between “risk” and “(Knightian) uncertainty”, I came across a surprisingly large number of different ways of describing the idea that some probabilities may be more or less “reliable”, “trustworthy”, “well-grounded”, etc. than others, or things along those lines. (Note that I’m referring to the idea of different degrees of trustworthiness-or-whatever, rather than two or more fundamentally different types of probability that vary in trustworthiness-or-whatever.)
I realised that it might be valuable to write a post collecting [EA · GW] all of these terms/concepts/framings together, analysing the extent to which some may be identical to others, highlighting ways in which they may differ, suggesting ways or contexts in which some of the concepts may be superior to others, etc.[1] But there are already too many things I'm working on writing at the moment, so this is a low effort version of that idea - this is basically just a collection of the concepts, relevant quotes, and links where readers can find more.
Comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how).
Note that this post does not explicitly cover the “risk vs uncertainty” framing itself, as I’m already writing a separate, more thorough post on that.
Epistemic credentials
Dominic Roser speaks of how “high” or “low” the epistemic credentials of our probabilities are. He writes:
The expression ‘‘epistemic credentials of probabilities’’ is a shorthand for two things: First, it refers to the credentials of the epistemic access to the probabilities: Are our beliefs about the probabilities well-grounded? Second—and this applies only to the case of subjective probabilities—it refers to the credentials of the probabilities themselves: Are our subjective probabilities—i.e. our degrees of belief—well-grounded?
He further explains what he means by this in a passage that also alludes to many other ways of describing or framing an idea along the lines of the trustworthiness of given probabilities:
What does it mean for probabilities to have (sufficiently) high epistemic credentials? It could for example mean that we can calculate or reliably estimate the probabilities (Keynes 1937, p. 214; Gardiner 2010, p. 7; Shue 2010, p. 148) rather than just guesstimate them; it could mean that our epistemic access allows for unique, numerical or precise probabilities (Kelsey and Quiggin 1992, p. 135; Friedman 1976, p. 282; Kuhn 1997, p. 56) rather than for qualitative and vague characterizations of probabilities or for ranges of probabilities; or it could mean that our epistemic access allows for knowledge of probabilities, in particular for knowledge that is certain, or which goes beyond the threshold of being extremely insecure, or which is not only based on a partial theory that is only valid ceteris paribus (Hansson 2009, p. 426; Rawls 1999, p. 134; Elster 1983, p. 202).
These examples from the literature provide different ways of spelling out the idea that our epistemic situation with regard to the probabilities must be of sufficient quality before we can properly claim to have probabilities. I will not focus on any single one of those ways. I am only concerned with the fact that they are all distinct from the idea that the mere existence of probabilities and mere epistemic access, however minimal, is sufficient. This second and narrower way of understanding ‘‘having probabilities’’ seems quite common for distinguishing risk from uncertainty. For example, in his discussion of uncertainty, Gardiner (2006, p. 34), based on Rawls (1999, p. 134), speaks of lacking, or having reason to sharply discount, information about probabilities, Peterson (2009, p. 6) speaks of it being virtually impossible to assign probabilities, and Bognar (2011, p. 331) says that precautionary measures are warranted whenever the conditions that Rawls described are approximated. This indicates that in order to distinguish risk from uncertainty, these authors do not examine whether we have probabilities at all, but rather whether we have high-credentials probabilities rather than low-credentials probabilities.
Note also that some believe that scientific progress can turn contexts of uncertainty into contexts of risk. For example, in the third assessment report the IPCC gave a temperature range but it did not indicate the probability of staying within this range. In the fourth assessment report, probabilities were added to the range. If one believes that scientific progress can move us from uncertainty to risk, this indicates as well that one’s risk-uncertainty distinction is not about the sheer availability of probabilities, i.e. having probabilities simpliciter. Given the gradual progress of science, it would be surprising if, after some time, probabilities suddenly became available at all. It seems more plausible that probabilities which were available all along changed from having hardly any credentials (in which case we might call them hunches) to gradually having more credentials. And when they cross some threshold of credentials, then—so I interpret parts of the literature—we switch from uncertainty to risk and we can properly claim to have probabilities. (line breaks added)
Resilience (of credences)
Amanda Askell discusses the idea that we can have “more” or “less” resilient credences[2] in this [? · GW] talk and this book chapter.
From the talk:
if I thought there was a 50% chance that I would get $100, there’s actually a difference between a low resilience 50% and a high resilience 50%.
I’m going to argue that, if your credences are low resilience, then the value of information in this domain is generally higher than it would be in a domain where your credences are high resilience. And, I’m going to argue that this means that actually in many cases, we should prefer interventions with less evidential support, all else being equal.
[...] One kind of simple formulation of resilience [...] is that credo-resilience is how stable you expect your credences to be in response to new evidence. If my credences are high resilience, then there’s more stability. I don’t expect them to vary that much as new evidence comes in, even if the evidence is good and pertinent to the question. If they’re low resilience, then they have low stability. I expect them to change a little in response to new evidence. That’s true in the case of the untested coin, where I just have no data about how good it is, so the resilience of my credence of 50% is fairly low.
It’s worth noting that resilience levels can reflect either the set of evidence that you have about a proposition, or your prior about the proposition. So, if it’s just incredibly plausible that the coins are generally fair. For example, if you saw me simply pick the coin up out of a stack of otherwise fair coins, in this case you would have evidence that it’s fair. But if you simply live in a world that doesn’t include a lot of very biased coins, then your prior might be doing a lot of the work that your evidence would otherwise do. These are the two things that generate credo-resilience.
In both cases, with the coin, your credence that the coin will land heads on the next flip is the same, it’s 0.5. Your credence of 0.5 about the tested coin is resilient, because you’ve done a million trials of this coin. Whereas, your credence about the untested coin is quite fragile. It could easily move in response to new evidence, as we see here.
Later in the talk, Askell highlights an implication of this idea, and how it differs from the idea of just not having precise probabilities at all:
A lot of people seems to be kind of unwilling to assert probability estimates about whether something is going to work or not. I think a really good explanation for this is that, in cases where we don’t have a lot of evidence, our credences about how good our credences are, are fairly low.
We basically think it’s really likely that we’re going to move around a lot in response to new evidence. We’re just not willing to assert a credence that we think is just going to be false, or inaccurate once we gain a little bit more evidence. Sometimes people think you have mushy credences, that you don’t actually have precise probabilities that you can assign to claims like, “This intervention is effective to Degree N.” I actually think resilience might be a good way of explaining that away, to say, “No. You can have really precise estimates. You just aren’t willing to assert them.”
(This comment thread [EA(p) · GW(p)] seems to me to suggest that the term “robustness of credences” may mean the same thing as “resilience of credences”, but I’m not sure about that.)
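As a minimal sketch of the resilience idea (my own illustration, assuming we model Askell's coin examples with Beta posteriors, which isn't something she does in the talk):

```python
def beta_mean(heads, tails):
    # Mean of a Beta(heads + 1, tails + 1) posterior over P(heads), starting from a uniform prior.
    return (heads + 1) / (heads + tails + 2)

# Both credences are currently ~0.5:
tested = (500_000, 500_000)   # coin flipped a million times, half of them heads
untested = (0, 0)             # coin never flipped at all

print(beta_mean(*tested), beta_mean(*untested))   # ~0.5 and 0.5

# Observe one new flip that lands heads, and update each credence:
print(beta_mean(tested[0] + 1, tested[1]))        # ~0.5000005: barely moves (high resilience)
print(beta_mean(untested[0] + 1, untested[1]))    # ~0.667: moves a lot (low resilience)
```

The same point estimate can therefore hide very different dispositions to update, which is what the resilience framing is getting at.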
Evidential weight (balance vs weight of evidence)
In the book chapter linked to above, Askell also discusses the idea of evidential weight (or the idea of the weight of the evidence, as opposed to the balance of evidence). This seems quite similar to the idea of credence resilience.
The balance of the evidence refers to how decisively the evidence supports the proposition. The weight of the evidence is the total amount of relevant evidence that we have.
Since I can’t easily copy and paste from that chapter, for further info see pages 39-41 of that chapter (available in the preview I linked to).
Probability distributions (and confidence intervals)
Roser writes:
Policy choice can still do justice to precautionary intuitions even when making use of probabilities. I submit that what drives precautionary intuitions is that in cases where there is little and unreliable evidence, our subjective probability distributions should exhibit a larger spread around the best guess. These spread out probability distributions yield precautionary policy-making when they are combined with, for example, the general idea of diminishing marginal utility (Stern 2007, p. 38) or the idea that an equal probability of infringing our descendants’ rights and bequeathing more to them than we owe them does not cancel each other out (Roser and Seidel 2017, p. 82). (emphasis added)
And Nate Soares [LW · GW] writes:
we are bounded reasoners, and we usually can't consider all available hypotheses. [...]
Bounded Bayesian reasoners should expect that they don't have access to the full hypothesis space. Bounded Bayesian reasoners can expect that their first-order predictions are incorrect due to a want of the right hypothesis, and thus place high credence on "something I haven't thought of", and place high value on new information or other actions that expand their hypothesis space. Bounded Bayesians can even expect that their credence for an event will change wildly as new information comes in.
[...] if I expect that I have absolutely no idea what the black swans will look like but also have no reason to believe black swans will make this event any more or less likely, then even though I won't adjust my credence further, I can still increase the variance of my distribution over my future credence for this event.
In other words, even if my current credence is 50% I can still expect that in 35 years (after encountering a black swan or two) my credence will be very different. This has the effect of making me act uncertain about my current credence, allowing me to say "my credence for this is 50%" without much confidence. So long as I can't predict the direction of the update, this is consistent Bayesian reasoning. (emphasis added)
A good, quick explanation, accompanied by diagrams, can be found in this comment [EA(p) · GW(p)].
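As a rough illustration of Roser's point (my own sketch, with made-up numbers): two subjective probability distributions can share the same best guess while differing in spread, and under diminishing marginal utility the more spread-out one warrants more caution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same best guess (mean 50), but the "little and unreliable evidence" case is more spread out.
narrow = rng.normal(loc=50, scale=5, size=100_000)
wide = rng.normal(loc=50, scale=25, size=100_000)

def utility(x):
    # A concave function, i.e. diminishing marginal utility (clipped to keep log defined).
    return np.log(np.clip(x, 1, None))

print(narrow.mean(), wide.mean())                    # both ~50: same best guess
print(utility(narrow).mean(), utility(wide).mean())  # the wider distribution has lower expected utility
```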
Precision, sharpness, vagueness
(These ideas seem closely related to the ideas of probability distributions and confidence intervals [above], and to the concept of haziness [below].)
In this paper, Adam Elga writes:
Sometimes one’s evidence for a proposition is sharp. For example: You’ve tossed a biased coin thousands of times. 83% of the tosses landed heads, and no pattern has appeared even though you’ve done a battery of statistical tests. Then it is clear that your confidence that the next toss will land heads should be very close to 83%.
Sometimes one’s evidence for a proposition is sparse but with a clear upshot. For example: You have very little evidence as to whether the number of humans born in 1984 was even. But it is clear that you should be very near to 50% confident in this claim.
But sometimes one’s evidence for a proposition is sparse and unspecific. For example: A stranger approaches you on the street and starts pulling out objects from a bag. The first three objects he pulls out are a regular-sized tube of toothpaste, a live jellyfish, and a travel-sized tube of toothpaste. To what degree should you believe that the next object he pulls out will be another tube of toothpaste?
[...]
It is very natural in such cases to say: You shouldn’t have any very precise degree of confidence in the claim that the next object will be toothpaste. It is very natural to say: Your degree of belief should be indeterminate or vague or interval-valued. On this way of thinking, an appropriate response to this evidence would be a degree of confidence represented not by a single number, but rather by a range of numbers. The idea is that your probability that the next object is toothpaste should not equal 54%, 91%, 18%, or any other particular number. Instead it should span an interval of values, such as [10%, 80%]. (emphasis added)
Elga then quotes various authors making claims along those lines, and writes:
These authors all agree that one’s evidence can make it downright unreasonable to have sharp degrees of belief. The evidence itself may call for unsharp degrees of belief, and this has nothing to do with computational or representational limitations of the believer. Let me write down a very cautious version of this claim:
UNSHARP: It is consistent with perfect rationality that one have unsharp degrees of belief.
However, Elga spends the rest of the paper arguing against this claim, and arguing instead (based on a type of Dutch book argument) for the following claim:
SHARP: Perfect rationality requires one to have sharp degrees of belief.
(Elga’s arguments seem sound to me, but I think they still allow for representing our beliefs as probability distributions that do have some mean or central or whatever value, and then using that value in many of the contexts Elga talks about. Thus, in those contexts, we’d act as if we have a “sharp degree of belief”, but we could still be guided by the shape and width of our probability distributions when thinking about things like how valuable additional information would be. But I’m not an expert on these topics, and haven’t thought about this stuff in depth.)
See also the Wikipedia article on imprecise probability.
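To make the interval-valued idea slightly more concrete, here's a toy sketch (mine, not Elga's): with an unsharp credence spanning [10%, 80%], the expected value of a simple bet is itself indeterminate.

```python
# Unsharp credence in "the next object is toothpaste": a range rather than a point.
low, high = 0.10, 0.80

# A bet that pays $10 if toothpaste, and loses $10 otherwise.
def expected_value(p, win=10, lose=-10):
    return p * win + (1 - p) * lose

print(expected_value(low))   # -8.0
print(expected_value(high))  # +6.0
# Under the interval view, the bet's value is anywhere in [-8, +6], so it's neither
# clearly acceptable nor clearly rejectable; this is the kind of case Elga's paper targets.
```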
Haziness
Chris Smith (I believe that’s their name, based on this post [EA · GW]) writes:
Consider a handful of statements that involve probabilities:
A hypothetical fair coin tossed in a fair manner has a 50% chance of coming up heads.
When two buddies at a bar flip a coin to decide who buys the next round, each person has a 50% chance of winning.
Experts believe there’s a 20% chance the cost of a gallon of gasoline will be higher than $3.00 by this time next year.
Dr. Paulson thinks there’s an 80% chance that Moore’s Law will continue to hold over the next 5 years.
Dr. Johnson thinks there’s a 20% chance quantum computers will commonly be used to solve everyday problems by 2100.
Kyle is an atheist. When asked what odds he places on the possibility that an all-powerful god exists, he says “2%.”
I’d argue that the degree to which probability is a useful tool for understanding uncertainty declines as you descend the list.
The first statement is tautological. When I describe something as “fair,” I mean that it perfectly conforms to abstract probability theory.
In the early statements, the probability estimates can be informed by past experiences with similar situations and explanatory theories.
In the final statement, I don’t know what to make of the probability estimate.
The hypothetical atheist from the final statement, Kyle, wouldn’t be able to draw on past experiences with different realities (i.e., Kyle didn’t previously experience a bunch of realities and learn that some of them had all-powerful gods while others didn’t). If you push someone like Kyle to explain why they chose 2% rather than 4% or 0.5%, you almost certainly won’t get a clear explanation.
If you gave the same “What probability do you place on the existence of an all-powerful god?” question to a number of self-proclaimed atheists, you’d probably get a wide range of answers.
I bet you’d find that some people would give answers like 10%, others 1%, and others 0.001%. While these probabilities can all be described as “low,” they differ by orders of magnitude. If probabilities like these are used alongside probabilistic decision models, they could have extremely different implications. Going forward, I’m going to call probability estimates like these “hazy probabilities.”
Placing hazy probabilities on the same footing as better-grounded probabilities (e.g., the odds a coin comes up heads) can lead to problems. (bolding added)
Hyperpriors, credal sets, and other things I haven't really learned about
Wikipedia says:
Bayesian approaches to probability treat it as a degree of belief and thus they do not draw a distinction between risk and a wider concept of uncertainty: they deny the existence of Knightian uncertainty. They would model uncertain probabilities with hierarchical models, i.e. where the uncertain probabilities are modelled as distributions whose parameters are themselves drawn from a higher-level distribution (hyperpriors). (emphasis added)
I haven’t looked into this and don’t properly understand it, so I won’t say more about it here, but I think it’s relevant. (This also might be related to the idea of confidence intervals mentioned earlier; as stated at the top, this is a low-effort version of this post where I’m not really trying to explain how the different framings might overlap or differ.)
The ideas of a credal set and of robust Bayesian analysis also seem relevant, but I have extremely limited knowledge on those topics.
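That said, here's a very rough sketch of what a hierarchical ("hyperprior") model could look like, with the caveat that this is just my guess at the shape of the idea rather than a standard treatment:

```python
import numpy as np

rng = np.random.default_rng(0)

# The uncertain probability p is itself modelled as a distribution whose parameters
# are drawn from a higher-level distribution (a "hyperprior").
concentration = rng.gamma(shape=2.0, scale=2.0, size=10_000)  # higher-level draws
p_samples = rng.beta(concentration, concentration)            # lower-level draws of p

print(p_samples.mean())  # a point summary of the uncertain probability (~0.5 here, by symmetry)
print(p_samples.std())   # how uncertain we are about the probability itself
```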
I hope you found this somewhat useful. As stated earlier, comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how).
Also, if you know of another term/concept/framing that’s relevant, please add a comment mentioning it, to expand the collection here.
It’s possible that something like this has already been done - I didn’t specifically check if it had been done before. If you know of something like this, please comment or message me a link to it. ↩︎
I think that credences are essentially a subtype of probabilities, and that the fact that Askell uses that term rather than probability doesn’t indicate (a) that we can’t use the term “robustness” in relation to probabilities, or (b) that we can’t use the other terms covered in this post in relation to credences. But I haven’t thought about that in depth. ↩︎
↑ comment by Davidmanheim · 2021-12-26T15:23:47.987Z · LW(p) · GW(p)
Appendix D of this report informed a lot of work we did on this, and in decreasing order of usefulness, it lists Shafer's "Belief functions," Possibility Theory, and the "Dezert-Smarandache Theory of Plausible and Paradoxical Reasoning." I'd add "Fuzzy Sets" / "Fuzzy Logic."
(Note that these are all formalisms in academic writing that predate and anticipate most of what you've listed above, but are harder ways to understand it. Except DST, which is hard to justify except as trying to be exhaustive about what people might want to think about non-probability belief.)
↑ comment by Matt Goldenberg (mr-hire) · 2020-01-24T18:50:29.999Z · LW(p) · GW(p)
I like this and would find a post moderately valuable. I think sometimes posts with a lot of synonyms are hard to have take aways from, because it's hard to remember all the synonyms. What I think is useful is comparing and contrasting the different takes, creating a richer view of the whole framework by examining it from many angles.
Re Knightian Uncertainty vs. Risk, I wrote a post that discusses the interaction of different types of risks (including knightian) here: https://www.lesswrong.com/posts/eA9a5fpi6vAmyyp74/how-to-understand-and-mitigate-risk [LW · GW]
Replies from: MichaelA
↑ comment by MichaelA · 2020-01-24T23:58:16.772Z · LW(p) · GW(p)
Thanks for the feedback!
I think sometimes posts with a lot of synonyms are hard to have take aways from, because it's hard to remember all the synonyms. What I think is useful is comparing and contrasting the different takes, creating a richer view of the whole framework by examining it from many angles.
Yeah, I'd agree with that, and it's part of why fleshing this out is currently low priority for me (since the latter approach takes actual work!), but remains theoretically on the list :)
↑ comment by JesseClifton · 2020-01-24T06:10:02.028Z · LW(p) · GW(p)
There are "reliabilist" accounts of what makes a credence justified. There are different accounts, but they say (very roughly) that a credence is justified if it is produced by a process that is close to the truth on average. See (this paper)[https://philpapers.org/rec/PETWIJ-2].
Frequentist statistics can be seen as a version of reliabilism. Criteria like the Brier score for evaluating forecasters can also be understood in a reliabilist framework.
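(For concreteness: the Brier score is just the mean squared difference between probabilistic forecasts and binary outcomes, with lower being better. A minimal sketch:)

```python
def brier_score(forecasts, outcomes):
    # Mean squared difference between forecast probabilities and 0/1 outcomes; 0 is a perfect score.
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# A forecaster gave probabilities 0.9, 0.7, 0.2 to three events; the first two occurred.
print(brier_score([0.9, 0.7, 0.2], [1, 1, 0]))  # ~0.047
```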
↑ comment by MichaelA · 2021-12-10T16:33:01.811Z · LW(p) · GW(p)
Adam Binks replied [EA(p) · GW(p)] to this list on the EA Forum with:
To add to your list - Subjective Logic represents opinions with three values: degree of belief, degree of disbelief, and degree of uncertainty. One interpretation of this is as a form of second-order uncertainty. It's used for modelling trust. A nice summary here with interactive tools for visualising opinions and a trust network.
comment by MichaelA · 2020-03-30T10:42:21.377Z · LW(p) · GW(p)
Collection [EA · GW] of discussions of epistemic modesty, "rationalist/EA exceptionalism", and similar
These are currently in reverse-chronological order.
Some thoughts on deference and inside-view models [EA · GW] - Buck Shlegeris, 2020
But have they engaged with the arguments? - Philip Trammell, 2019
Epistemic learned helplessness - Scott Alexander, 2019
AI, global coordination, and epistemic humility - Jaan Tallinn, 2018
In defence of epistemic modesty [EA · GW] - Greg Lewis, 2017
Inadequate Equilibria [? · GW] - Eliezer Yudkowsky, 2017
Common sense as a prior [LW · GW] - Nick Beckstead, 2013
From memory, I think a decent amount of Rationality: A-Z [? · GW] by Eliezer Yudkowsky is relevant
Philosophical Majoritarianism - Hal Finney, 2007
Somewhat less relevant/substantial
This comment/question [EA · GW] - Michael Aird (i.e., me), 2020
Naming Beliefs - Hal Finney, 2008
Likely relevant, but I'm not yet sure how relevant as I haven't yet read it
Are Disagreements Honest? - Cowen & Hanson, 2004
Uncommon Priors Require Origin Disputes - Robin Hanson, 2006
Aumann's agreement theorem - Wikipedia
I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.
Replies from: MichaelA
↑ comment by MichaelA · 2020-08-04T12:02:28.487Z · LW(p) · GW(p)
See also EA reading list: cluelessness and epistemic modesty [? · GW].
comment by MichaelA · 2021-01-02T08:39:40.041Z · LW(p) · GW(p)
Problems in AI risk that economists could potentially contribute to
List(s) of relevant problems
- What can the principal-agent literature tell us about AI risk? [LW · GW] (and this comment [LW(p) · GW(p)])
- Many of the questions in Technical AGI safety research outside AI [EA · GW]
- Many of the questions in The Centre for the Governance of AI’s research agenda
- Many of the questions in Cooperation, Conflict, and Transformative Artificial Intelligence [? · GW] (a research agenda of the Center on Long-Term Risk)
- At least a couple of the questions in 80,000 Hours' Research questions that could have a big social impact, organised by discipline
- Longtermist AI policy projects for economists (this doc was originally just made for Risto Uuk's own use, so the ideas shouldn't be taken as high-confidence recommendations to anyone else)
Context
I intend for this to include both technical and governance problems, and problems relevant to a variety of AI risk scenarios (e.g., AI optimising against humanity, AI misuse by humans, AI extinction risk, AI dystopia risk...)
Wei Dai’s list of Problems in AI Alignment that philosophers could potentially contribute to [LW · GW] made me think that it could be useful to have a list of problems in AI risk that economists could potentially contribute to. So I began making such a list.
But:
- I’m neither an AI researcher nor an economist
- I spent hardly any time on this, and just included things I’ve stumbled upon, rather than specifically searching for these things
So I’m sure there’s a lot I’m missing.
Please comment if you know of other things worth mentioning here. Or if you’re better placed to make a list like this than I am, feel very free to do so; you could take whatever you want from this list and then comment here to let people know where to find your better thing.
(It’s also possible another list like this already exists. And it's also possible that economists could contribute to such a large portion of AI risk problems that there’s no added value in making a separate list for economists specifically. If you think either of those things are true, please comment to say so!)
Replies from: Vael Gates
↑ comment by Vael Gates · 2021-09-23T01:00:33.598Z · LW(p) · GW(p)
Recently I was also trying to figure out what resources to send to an economist, and couldn't find a list that existed either! The list I came up with is subsumed by yours, except:
- Questions within Some AI Governance Research Ideas
- "Further Research" section within an OpenPhil 2021 report: https://www.openphilanthropy.org/could-advanced-ai-drive-explosive-economic-growth
- The AI Objectives Institute just launched, and they may have questions in the future
comment by MichaelA · 2020-06-24T03:02:32.837Z · LW(p) · GW(p)
“Intelligence” vs. other capabilities and resources
Legg and Hutter (2007) collect 71 definitions of intelligence. Many, perhaps especially those from AI researchers, would actually cover a wider set of capabilities or resources than people typically want the term “intelligence” to cover. For example, Legg and Hutter’s own “informal definition” is: “Intelligence measures an agent’s ability to achieve goals in a wide range of environments.” But if you gave me a billion dollars, that would vastly increase my ability to achieve goals in a wide range of environments, even if it doesn’t affect anything we’d typically want to refer to as my “intelligence”.
(Having a billion dollars might lead to increases in my intelligence, if I use some of the money for things like paying for educational courses or retiring so I can spend all my time learning. But I can also use money to achieve goals in ways that don’t look like “increasing my intelligence”.)
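(For reference, Legg and Hutter also formalise this as a "universal intelligence" measure; roughly, and from memory rather than quoting the paper:

$$\Upsilon(\pi) := \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}$$

where $E$ is the set of computable environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$, and $V^{\pi}_{\mu}$ is the expected return agent $\pi$ achieves in $\mu$.)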
I would say that there are many capabilities or resources that increase an agent’s ability to achieve goals in a wide range of environments, and intelligence refers to a particular subset of these capabilities or resources. Some of the capabilities or resources which we don’t typically classify as “intelligence” include wealth, physical strength, connections (e.g., having friends in the halls of power), attractiveness, and charisma.
“Intelligence” might help in a wider range of environments than those capabilities or resources help in (e.g., physical strength seems less generically useful). And some of those capabilities or resources might be related to intelligence (e.g., charisma), be “exchangeable” for intelligence (e.g., money), or be attainable via intelligence (e.g., higher intelligence can help one get wealth and connections). But it still seems a useful distinction can be made between “intelligence” and other types of capabilities and resources that also help an agent achieve goals in a wide range of environments.
I’m less sure how to explain why some of those capabilities and resources should fit within “intelligence” while others don’t. At least two approaches to this can be inferred from the definitions Legg and Hutter collect (especially those from psychologists):
- Talk about “mental” or “intellectual” abilities
- But then of course we must define those terms.
- Gesture at examples of the sorts of capabilities one is referring to, such as learning, thinking, reasoning, or remembering.
- This second approach seems useful, though not fully satisfactory.
An approach that I don’t think I’ve seen, but which seems at least somewhat useful, is to suggest that “intelligence” refers to the capabilities or resources that help an agent (a) select or develop plans that are well-aligned with the agent’s values, and (b) implement the plans the agent has selected or developed. In contrast, other capabilities and resources (such as charisma or wealth) primarily help an agent implement its plans, and don’t directly provide much help in selecting or developing plans. (But as noted above, an agent could use those other capabilities or resources to increase their intelligence, which then helps the agent select or develop plans.)
For example, both (a) becoming more knowledgeable and rational and (b) getting a billion dollars would help one more effectively reduce existential risks [EA · GW]. But, compared to getting a billion dollars, becoming more knowledgeable and rational is much more likely to lead one to prioritise existential risk reduction.
I find this third approach useful, because it links to the key reason why I think the distinction between intelligence and other capabilities and resources actually matters. This reason is that I think increasing an agent’s “intelligence” is more often good than increasing an agent’s other capabilities or resources. This is because some agents are well-intentioned yet currently have counterproductive plans. Increasing the intelligence of such agents may help them course-correct and drive faster, whereas increasing their other capabilities and resources may just help them drive faster down a harmful path.
(I plan to publish a post expanding on that last idea soon, where I’ll also provide more justification and examples. There I’ll also argue that there are some cases where increasing an agent’s intelligence would be bad yet increasing their “benevolence” would be good, because some agents have bad values, rather than being well-intentioned yet misguided.)
Replies from: TurnTrout
↑ comment by TurnTrout · 2020-06-24T13:10:10.720Z · LW(p) · GW(p)
But if you gave me a billion dollars, that would vastly increase my ability to achieve goals in a wide range of environments, even if it doesn’t affect anything we’d typically want to refer to as my “intelligence”.
I don't think it would - the "has a billion dollars" is a stateful property - it depends on the world state. I think the LH metric is pretty reasonable and correctly ignores how much money you have. The only thing you "bring" to every environment under the universal prior, is your reasoning abilities.
My understanding is that this analysis conflates "able to achieve goals in general in a fixed environment" (power/resources) vs "able to achieve high reward in a wide range of environments" (LH intelligence), but perhaps I have misunderstood.
Replies from: MichaelA
↑ comment by MichaelA · 2020-06-25T00:28:30.091Z · LW(p) · GW(p)
Firstly, I'll say that, given that people already have a pretty well-shared intuitive understanding of what "intelligence" is meant to mean, I don't think it's a major problem for people to give explicit definitions like Legg and Hutter's. I think people won't then go out and assume that wealth, physical strength, etc. count as part of intelligence - they're more likely to just not notice that the definitions might imply that.
But I think my points do stand. I think I see two things you might be suggesting:
- Intelligence is the only thing that increases an agent’s ability to achieve goals across all environments.
- Intelligence is an ability, which is part of the agent, whereas things like wealth are resources, and are part of the environment.
If you meant the first of those things, I'd agree that "“Intelligence” might help in a wider range of environments than those [other] capabilities or resources help in". E.g., a billion US dollars wouldn't help someone at any time before 1700CE (or whenever) or probably anytime after 3000CE achieve their goals, whereas intelligence probably would.
But note that Legg and Hutter say "across a wide range of environments." A billion US dollars would help anyone, in any job, any country, and any time from 1900 to 2020 achieve most of their goals. I would consider that a "wide" range of environments, even if it's not maximally wide.
And there are aspects of intelligence that would only be useful in a relatively narrow set of environments, or for a relatively narrow set of goals. E.g., factual knowledge is typically included as part of intelligence, and knowledge of the dates of birth and death of US presidents will be helpful in various situations, but probably in fewer situations and for fewer goals than a billion dollars.
If you meant the second thing, I'd point in response to the other capabilities, rather than the other resources. For example, it seems to me intuitive to speak of an agent's charisma or physical strength as a property of the agent, rather than of the state. And I think those capabilities will help it achieve goals in a wide (though not maximally wide) range of environments.
We could decide to say an agent's charisma and physical strength are properties of the state, not the agent, and that this is not the case for intelligence. Perhaps this is useful when modelling an AI and its environment in a standard way, or something like that, and perhaps it's typically assumed (I don't know). If so, then combining an explicit statement of that with Legg and Hutter's definition may address my points, as that might explicitly slice all other types of capabilities and resources out of the definition of "intelligence".
But I don't think it's obvious that things like charisma and physical strength are more a property of the environment than intelligence is - at least for humans, for whom all of these capabilities ultimately just come down to our physical bodies (assuming we reject dualism, which seems safe to me).
Does that make sense? Or did I misunderstand your points?
Replies from: TurnTrout
↑ comment by TurnTrout · 2020-06-25T18:55:09.862Z · LW(p) · GW(p)
Thanks for the clarification. Yes, I'm suggesting bullet point 2.
LH intelligence evaluates learning algorithms. It makes sense to say an algorithm can adapt to a wide range of environments (in their precise formal sense: achieves high return under the universal mixture over computable environments), and maybe that it's more "charismatic" (has hard-coded social skills, or can learn them easily in relevant environments). But it doesn't make sense to say that an algorithm is physically stronger - that has to be a fact which is encoded by the environment's state (especially in this dualistic formalism).
The paper's math automatically captures these facts, in my opinion. I agree the boundary gets fuzzier in an embedded context, but so do a lot of things right now.
Replies from: MichaelA
↑ comment by MichaelA · 2020-06-26T00:03:50.072Z · LW(p) · GW(p)
Ok, so it sounds like Legg and Hutter's definition works given certain background assumptions / ways of modelling things, which they assume in their full paper on their own definition.
But in the paper I cited, Legg and Hutter give their definition without mentioning those assumptions / ways of modelling things. And they don't seem to be alone in that, at least given the out-of-context quotes they provide, which include:
- "[Performance intelligence is] the successful (i.e., goal-achieving) performance of the system in a complicated environment"
- "Achieving complex goals in complex environments"
- "the ability to solve hard problems."
These definitions could all do a good job capturing what "intelligence" typically means if some of the terms in them are defined certain ways, or if certain other things are assumed. But they seem inadequate by themselves, in a way Legg and Hutter don't note in their paper. (Also, Legg and Hutter don't seem to indicate that that paper is just or primarily about how intelligence should be defined in relation to AI systems.)
That said, as I mentioned before, I don't actually think this is a very important oversight on their part.
comment by MichaelA · 2020-09-04T08:51:21.298Z · LW(p) · GW(p)
If anyone reading this has read anything I've written on LessWrong or the EA Forum, I'd really appreciate you taking this brief, anonymous survey. Your feedback is useful whether your opinion of my work is positive, mixed, lukewarm, meh, or negative.
And remember what mama always said: If you’ve got nothing nice to say, self-selecting out of the sample for that reason will just totally bias Michael’s impact survey.
(If you're interested in more info on why I'm running this survey and some thoughts on whether other people should do similar, I give that here [LW · GW].)
comment by MichaelA · 2020-03-28T02:46:38.490Z · LW(p) · GW(p)
Psychology: An Imperfect and Improving Science
This is an essay I wrote in 2017 as coursework for the final year of my Psychology undergrad degree. (That was a year before I learned about EA and the rationalist movement.)
I’m posting this as a shortform comment, rather than as a full post, because it’s now a little outdated, it’s just one of many things that people have written on this topic, and I don’t think the topic is of central interest to a massive portion of LessWrong readers. But I do think it holds up well, is pretty clear, and makes some points that generalise decently beyond psychology (e.g., about drawing boundaries between science and pseudoscience, evaluating research fields, and good research practice).
I put the references in a “reply” to this.
Psychology's scientific status has been denied or questioned by some (e.g., Berezow, 2012; Campbell, 2012). Evaluating such critiques and their rebuttals requires defining “science”, considering what counts as psychology, and exploring how unscientific elements within a field influence the scientific standing of that field as a whole. This essay presents a conception of “science” that consolidates features commonly seen as important into a family resemblance model. Using this model, I argue psychology is indeed a science, despite unscientific individuals, papers, and practices within it. However, these unscientific practices make psychology less scientific than it could be. Thus, I outline their nature and effects, and how psychologists are correcting these issues.
Addressing whether psychology is a science requires specifying what is meant by “science”. This is more difficult than some writers seem to recognise. For example, Berezow (2012) states we can “definitively” say psychology is non-science “[b]ecause psychology often does not meet the five basic requirements for a field to be considered scientifically rigorous: clearly defined terminology, quantifiability, highly controlled experimental conditions, reproducibility and, finally, predictability and testability.” However, there are fields that do not meet those criteria whose scientific status is generally unquestioned. For example, astronomy and earthquake science do not utilise experiments (Irzik & Nola, 2014). Furthermore, Berezow leaves unmentioned other features associated with science, such as data-collection and inference-making (Irzik & Nola, 2011). Many such features have been noted by various writers, though some are contested by others or only present or logical in certain sciences. For example, direct observation of the matters of interest has been rightly noted as helping make fields scientific, as it reduces issues like the gap between self-reported intentions and the behaviours researchers seek to predict (Godin, Conner, & Sheeran, 2005; Rhodes & de Bruijn, 2013; Sheeran, 2002; Skinner, 1987). However, self-reported intentions are still useful predictors of behaviour and levers for manipulating it (Godin et al., 2005; Rhodes & de Bruijn, 2013; Sheeran, 2002), and science often productively investigates constructs such as gravity that are not directly observable (Bringmann & Eronen, 2016; Chomsky, 1971; Fanelli, 2010; Michell, 2013). Thus, definitions of science would benefit from noting the value of direct observation, but cannot exclude indirect measures or unobservable constructs. This highlights the difficulty – or perhaps impossibility – of defining science by way of a list of necessary and sufficient conditions for scientific status (Mahner, 2013).
An attractive solution is instead constructing a family resemblance model of science (Dagher & Erduran, 2016; Irzik & Nola, 2011, 2014; Pigliucci, 2013). Family resemblance models are sets of features shared by many but not all examples of something. To demonstrate, three characteristics common in science are experiments, double-blind trials, and the hypothetico-deductive method (Irzik & Nola, 2014). A definition of science omitting these would be missing something important. However, calling these “necessary” excludes many sciences; for example, particle physics would be rendered unscientific for lack of double-blind trials (Cleland & Brindell, 2013; Irzik & Nola, 2014). Thus, a family resemblance model of science only requires a field to have enough scientific features, rather than requiring the field to have all such features. The full list of features this model should include, the relative importance of each feature, and what number or combination is required for something to be a “science” could all be debated. However, for showing that psychology is a science, it will suffice to provide a rough family resemblance model incorporating some particularly important features, which I shall now outline.
Firstly, Berezow's (2012) “requirements”, while not actually necessary for scientific status, do belong in a family resemblance model of science. That is, when these features can be achieved, they make a field more scientific. The importance of reproducibility is highlighted also by Kahneman (2014) and Klein et al. (2014a, 2014b), and that of testability or falsifiability is also mentioned by Popper (1957) and Ferguson and Heene (2012). These features are related to the more fundamental idea that science should be empirical; claims should be required to be supported by evidence (Irzik & Nola, 2011; Pigliucci, 2013). Together, these features allow science to be self-correcting, incrementally progressing towards truth by accumulation of evidence and peer-review of ideas and findings (Open Science Collaboration, 2015). This is further supported by scientists' methods and results being made public and transparent (Anderson, Martinson, & De Vries, 2007, 2010; Nosek et al., 2015; Stricker, 1997). Additionally, findings and predictions should logically cohere with established theories, including those from other sciences (Lilienfeld, 2011; Mahner, 2013). These features all support science's ultimate aims to benefit humanity by explaining, predicting, and controlling phenomena (Hansson, 2013; Irzik & Nola, 2014; Skinner, cited in Delprato & Midgley, 1992). Each feature may not be necessary for scientific status, and many other features could be added, but the point is that each feature a field possesses makes that field more scientific. Thus, armed with this model, we are nearly ready to productively evaluate the scientific status of psychology.
However, two further questions must first be addressed: What is psychology, and how do unscientific occurrences within psychology affect the scientific status of the field as a whole? For example, it can generally be argued parapsychology is not truly part of psychology, for reasons such as its lack of support from mainstream psychologists. However, there are certain more challenging instances, such as the case of a paper by Bem (2011) claiming to find evidence for precognition. This used accepted methodological and analytical techniques, was published in a leading psychology journal, and was written by a prominent, mainstream psychologist. Thus, one must accept that this paper is, to a substantial extent, part of psychology. It therefore appears important to determine whether Bem's paper exemplifies science. It certainly has many scientific features, such as use of experiments and evidence. However, it lacks other features, such as logical coherence with the established principle of causation only proceeding forwards in time.
But it is unnecessary here to determine whether the paper is non-science, insufficiently scientific, or bad science, because, regardless, this episode shows psychology as a field being scientific. This is because scientific features such as self-correction and reproducibility are most applicable to a field as a whole, rather than to an individual scientist or article, and these features are visible in psychology's response to Bem's (2011) paper. Replication attempts were produced and supported the null hypothesis; namely, that precognition does not occur (Galak, LeBoeuf, Nelson, Simmons, 2012; Ritchie, Wiseman, & French, 2012; Wagenmakers, Wetzels, Borsboom, van der Maas, & Kievit, 2012). Furthermore, publicity, peer-review, and self-correction of findings and ideas were apparent in those failed replications and in commentary on Bem's paper (Wagenmakers, Wetzels, Borsboom, & van der Maas, 2011; Francis, 2012; LeBel & Peters, 2011). Peers discussed many issues with Bem's article, such as several variables having been recorded by Bem's experimental program yet not mentioned in the study (Galak et al., 2012; Ritchie et al., 2012), suggesting that the positive results reported may have been false positives emerging by chance from many, mostly unreported analyses. Wagenmakers et al. (2011) similarly noted other irregularities and unexplained choices in data transformation and analysis, and highlighted that Bem had previously recommended to psychologists: “If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. […] Go on a fishing expedition for something—anything—interesting” (Bem, cited in Wagenmakers et al., 2011). These responses to Bem’s study by psychologists highlight that, while the scientific status of that study is highly questionable, isolated events such as that need not overly affect the scientific status of the entire field of psychology.
Indeed, psychology's response to Bem's (2011) paper exemplifies ways in which the field in general fits the family resemblance model of science outlined earlier. This model captures how different parts of psychology can each be scientific, despite showing different combinations of scientific features. For example, behaviourists may use more direct observation and clearly defined terminology (see Delprato & Midgley, 1992; Skinner, 1987), while evolutionary psychologists better integrate their theories and findings with established theories from other sciences (see Burke, 2014; Confer et al., 2010). These features make subfields that have them more scientific, but lacking one feature does not make a subfield non-science. Similarly, while much of psychology utilises controlled experiments, those parts that do not, like longitudinal studies of the etiology of mental disorders, can still be scientific if they have enough other scientific features, such as accumulation of evidence to increase our capacity for prediction and intervention.
Meanwhile, other scientific features are essentially universal in psychology. For example, all psychological claims and theories are expected to be based on or confirmed by evidence, and are rejected or modified if found not to be. Additionally, psychological methods and findings are made public by publication, with papers being peer-reviewed before this and open to critique afterwards, facilitating self-correction. Such self-correction can be seen in the response to Bem's (2011) paper, as well as in how most psychological researchers now reject the untestable ideas of early psychoanalysis (see Cioffi, 2013; Pigliucci, 2013). Parts of psychology vary in their emphasis on basic versus applied research; for example, some psychologists investigate the processes underlying sadness while others conduct trials of specific cognitive therapy techniques for depression. However, these various branches can support each other, and all psychological research ultimately pursues benefitting humanity by explaining, predicting, and controlling phenomena. Indeed, while there is much work to be done and precision is rarely achieved, psychology can already make predictions much more accurate than chance or intuition in many areas, and thus provides benefits as diverse as anxiety-reduction via exposure therapy and HIV-prevention via soap operas informed by social-cognitive theories (Bandura, 2002; Lilienfeld, Ritschel, Lynn, Cautin, & Latzman, 2013; Zimbardo, 2004). All considered, most of psychology exemplifies most important scientific features, and thus psychology should certainly be considered a science.
However, psychology is not as scientific as it could be. Earlier I noted that isolated papers reporting inaccurate findings and utilising unscientific practices, as Bem (2011) seems highly likely to have, should not significantly affect psychology's scientific status, as long as the field self-corrects adequately. However, as several commentators on Bem's paper noted, more worrying is what that paper reflects regarding psychology more broadly, given that it largely met or exceeded psychology's methodological, analytical, and reporting standards (Francis, 2012; LeBel & Peters, 2011; Wagenmakers et al., 2011). The fact Bem met these standards, yet still “discovered” and got published results that seem to violate fundamental principles about how causation works, highlights the potential prevalence of spurious findings in psychological literature. These findings could result from various flaws and biases, yet might fail to be recognised or countered in the way Bem's report was if they are not as clearly false; indeed, they may be entirely plausible, yet inaccurate (LeBel & Peters, 2011). Thus, I will now discuss how critiques regarding Bem's paper apply to much of mainstream psychology.
Firstly, the kind of “fishing expedition” recommended by Bem (cited in Wagenmakers et al., 2011) is common in psychology. Researchers often record many variables, and have flexibility in which variables, interactions, participants, data transformations, and statistics they use in their analyses (John, Loewenstein, & Prelec, 2012). Wagenmakers et al. (2012) note that such practices are not inherently problematic, and indeed such explorations are useful for suggesting hypotheses to test in a confirmatory manner. The issue is that often these explorations are inadequately reported and are presented as confirmatory themselves, despite the increased risk of false positives when conducting multiple comparisons (Asendorpf et al., 2013; Wagenmakers et al., 2012). Neuropsychological studies can be particularly affected by failures to control for multiple comparisons, even if all analyses are reported, because analysis of brain activity makes huge numbers of comparisons the norm. Thus, without statistical controls, false positives are almost guaranteed (Bennett, Baird, Miller, & Wolford, 2009). The issue of uncontrolled multiple comparisons, whether reported or not, causing false positives can be compounded by hindsight bias making results seem plausible and predictable in retrospect (Wagenmakers et al., 2012). This can cause overconfidence in findings and make researchers feel comfortable writing articles as if these findings were hypothesised beforehand (Kerr, 1998). These practices inflate the number of false discoveries and spurious confirmations of theories in psychological literature.
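As a quick numerical sketch of the multiple-comparisons point (an illustration rather than part of the essay's sources): if every null hypothesis is actually true, the chance of at least one spurious "significant" result across m independent tests at alpha = .05 is 1 - 0.95^m.

```python
# Chance of at least one false positive across m independent tests at alpha = .05,
# when every null hypothesis is actually true.
alpha = 0.05
for m in (1, 10, 20, 100):
    print(m, round(1 - (1 - alpha) ** m, 3))
# 1 -> 0.05, 10 -> 0.401, 20 -> 0.642, 100 -> 0.994
```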
This is compounded by publication bias. Journals are more likely to publish novel and positive results than replications or negative results (Ferguson & Heene, 2012; Francis, 2012; Ioannidis, Munafò, Fusar-Poli, Nosek, & David, 2014; Kerr, 1998). One reason is that, despite the importance of self-correction and incremental progress, replications and negative results are often treated as not showing anything substantially interesting (Klein et al., 2014b). Another reason is the idea that null results are hard to interpret or overly likely to be false negatives (Ferguson & Heene, 2012; Kerr, 1998). Psychological studies regularly have insufficient power: their sample sizes mean that, even if an effect of the expected size does exist, the chance of failing to find it is substantial (Asendorpf et al., 2013; Bakker, Hartgerink, Wicherts, & van der Maas, 2016). Further, the frequentist statistics typically used by psychologists cannot clearly quantify the support data provide for null hypotheses; these statistics have difficulty distinguishing between strong evidence for no effect and a simple failure to find evidence for an effect (Dienes, 2011). While concerns about the interpretability of null results are thus often reasonable, they distort the psychological literature's representation of reality (see Fanelli, 2010; Kerr, 1998). Publication bias also takes the form of researchers being more likely to submit for publication those studies that revealed positive results (John et al., 2012). This can occur because researchers themselves often find negative results difficult to interpret, and know such results are less likely to be published or to lead to incentives like grants or prestige (Kerr, 1998; Open Science Collaboration, 2015). Thus, flexibility in analysis, failure to control for or report multiple comparisons, presentation of exploratory results as confirmatory, publication bias, low power, and difficulty interpreting null results are interrelated issues. Together, they make psychology less scientific by reducing the transparency of methods and findings.
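As a concrete illustration of the low-power point (again my own sketch, not the essay's; the effect size and sample sizes are assumptions meant to be typical of the literature), a two-group design with 30 participants per group has only around a one-in-three chance of detecting a true effect of d = 0.4:

```python
# Minimal sketch of a power calculation for an independent-samples t-test,
# assuming Cohen's d = 0.4 and 30 participants per group.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

power = analysis.power(effect_size=0.4, nobs1=30, alpha=0.05, ratio=1.0,
                       alternative='two-sided')
print(f"Power with n = 30 per group: {power:.2f}")  # roughly 0.3

# Sample size per group needed to reach the conventional 80% power target:
n_needed = analysis.solve_power(effect_size=0.4, power=0.8, alpha=0.05, ratio=1.0)
print(f"n per group for 80% power: {n_needed:.0f}")  # roughly 100
```

Running a calculation like this before data collection, as Bakker et al. (2016) recommend, makes an underpowered design visible in advance rather than only after a hard-to-interpret null result.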
These issues also undermine other scientific features. The Open Science Collaboration (2015) conducted replications of 100 studies from leading psychological journals, finding that fewer than half replicated successfully. This low level of reproducibility itself makes psychology less scientific, and provides further evidence of the likely high prevalence and impact of the issues noted above (Asendorpf et al., 2013; Open Science Collaboration, 2015). Together, these problems impede self-correction, and make psychology's use of evidence and the testability of its theories less meaningful, as replications and negative tests often go unreported (Ferguson & Heene, 2012). This undermines psychology's ability to benefit humanity by explaining, predicting, and controlling phenomena.
However, while these issues make psychology less scientific, they do not make it non-science. Other sciences, including “hard sciences” like physics and biology, also suffer from publication bias and from low reproducibility and transparency (Alatalo, Mappes, & Elgar, 1997; Anderson, Burnham, Gould, & Cherry, 2001; McNutt, 2014; Miguel et al., 2014; Sarewitz, 2012; Service, 2002). These problems demand a response wherever they appear, and may be more pronounced in psychology than in “harder” sciences, but their presence is not necessarily damning (see Fanelli, 2010). For example, the Open Science Collaboration (2015) did find that a substantial proportion of effects replicated, particularly those whose initial evidence was stronger. Meanwhile, Klein et al. (2014a) found a much higher rate of replication for well-established effects than the Open Science Collaboration found for its quasi-random sample of recent findings. Both results highlight that, while psychology certainly has work to do to become more reliable, the field also has the capacity to progress scientifically towards truth, and is already doing so to a meaningful extent.
Furthermore, psychologists themselves are highlighting these issues and researching and implementing solutions to them. Bakker et al. (2016) discuss the problem of low power and how to overcome it with larger sample sizes, reinforced by researchers habitually running power analyses before conducting studies and by reviewers checking that these analyses have been run. Nosek et al. (2015) proposed guidelines for promoting transparency by changing what journals encourage or require, such as replications, better reporting and sharing of materials and data, and pre-registration of studies and analysis plans. Pre-registration sidesteps confirmation and hindsight biases, as well as unreported, uncorrected multiple comparisons, because expectations and analysis plans are on record before data are gathered (Wagenmakers et al., 2012). Journals can also conditionally accept studies for publication based on pre-registered plans, minimising bias against null results by both journals and researchers. Such proposals still welcome exploratory analyses, but prevent those analyses from being presented as confirmatory (Miguel et al., 2014). Finally, psychologists have argued for, outlined how to use, and adopted Bayesian statistics as an alternative to frequentist statistics (Ecker, Lewandowsky, & Apai, 2011; Wagenmakers et al., 2011). Bayesian statistics provide clear quantification of evidence for null hypotheses, combating one source of publication bias and making the testability of psychological claims more meaningful (Dienes, 2011; Francis, 2012). These proposals are beginning to take effect. For example, many journals and organisations are signatories to Nosek et al.'s guidelines. Additionally, the Center for Open Science, led by the psychologist Brian Nosek, has set up online tools for researchers to routinely make their data, code, and pre-registered plans public (Miguel et al., 2014). This shows psychology self-correcting its practices, not just its individual findings, in order to become more scientific.
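To make the contrast with frequentist testing concrete, here is a minimal sketch (my own illustration rather than anything from Dienes or the essay; the 53-hits-in-100-trials data and the uniform prior are assumptions) of a Bayes factor for a Bem-style guessing task:

```python
# Minimal sketch: Bayes factor comparing "performance is at chance" (H0: p = 0.5)
# against "any hit rate is equally plausible" (H1: p ~ Uniform(0, 1)),
# for an assumed 53 correct guesses in 100 binary trials.
from scipy.stats import binom

k, n = 53, 100

# Likelihood of the data under the point null of chance performance.
likelihood_h0 = binom.pmf(k, n, 0.5)

# Marginal likelihood under H1: integrating the binomial likelihood over a
# uniform prior on p gives exactly 1 / (n + 1), whatever k is.
likelihood_h1 = 1 / (n + 1)

bf01 = likelihood_h0 / likelihood_h1
print(f"BF01 = {bf01:.1f}")  # roughly 7: the data favour chance by that factor
```

A frequentist test of the same data merely fails to reject the null (p is around .6), whereas the Bayes factor says the null is actively supported, which is exactly the distinction Dienes (2011) emphasises.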
I have argued here that claims that psychology is non-scientific may often reflect unworkable definitions of science and ignorance of what psychology actually involves. A family resemblance model of science overcomes the former problem by outlining features that a field need not possess to count as science, but does become more scientific by possessing. This model suggests psychology is a science because it generally exemplifies most scientific features; most importantly, it accumulates evidence publicly, incrementally, and self-critically in order to benefit humanity by explaining, predicting, and controlling phenomena. However, psychology is not as scientific as it could be. A variety of interrelated issues with researchers' and journals' practices and incentive structures impede the effectiveness and meaningfulness of psychology's scientific features. But failure to be perfectly scientific is not unique to psychology; it is universal among the sciences. Science has achieved what it has because of its constant commitment to incremental improvement and self-correction of its own practices. In keeping with this, psychologists are researching and discussing psychology's issues and their potential solutions, and those solutions are being put into action. More work must be done, and more researchers and journals must act on and push for these discussions and solutions, but it is already clear both that psychology is a science and that it is actively working to become more scientific.
Replies from: MichaelA
↑ comment by MichaelA · 2020-03-28T02:48:56.201Z · LW(p) · GW(p)
References
Alatalo, R. V., Mappes, J., & Elgar, M. A. (1997). Heritabilities and paradigm shifts. Nature, 385(6615), 402-403. doi:10.1038/385402a0
Anderson, D. R., Burnham, K. P., Gould, W. R., & Cherry, S. (2001). Concerns about finding effects that are actually spurious. Wildlife Society Bulletin, 29(1), 311-316.
Anderson, M. S., Martinson, B. C., & Vries, R. D. (2007). Normative dissonance in science: Results from a national survey of U.S. scientists. Journal of Empirical Research on Human Research Ethics: An International Journal, 2(4), 3-14. doi:10.1525/jer.2007.2.4.3
Anderson, M. S., Ronning, E. A., Vries, R. D., & Martinson, B. C. (2010). Extending the Mertonian norms: Scientists' subscription to norms of research. The Journal of Higher Education, 81(3), 366-393. doi:10.1353/jhe.0.0095
Asendorpf, J. B., Conner, M., Fruyt, F. D., Houwer, J. D., Denissen, J. J., Fiedler, K., … Wicherts, J. M. (2013). Recommendations for increasing replicability in psychology. European Journal of Personality, 27(2), 108-119. doi:10.1002/per.1919
Bakker, M., Hartgerink, C. H., Wicherts, J. M., & van der Maas, H. L. J. (2016). Researchers' intuitions about power in psychological research. Psychological Science, 27(8), 1069-1077. doi:10.1177/0956797616647519
Bandura, A. (2002). Environmental sustainability by sociocognitive deceleration of population growth. In P. Shmuck & W. P. Schultz (Eds.), Psychology of sustainable development (pp. 209-238). New York, NY: Springer.
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407-425. doi:10.1037/a0021524
Bennett, C. M., Baird, A. A., Miller, M. B., & Wolford, G. L. (2009). Neural correlates of interspecies perspective taking in the post-mortem Atlantic salmon: An argument for multiple comparisons correction. NeuroImage, 47(Suppl 1), S125. doi:10.1016/s1053-8119(09)71202-9
Berezow, A. B. (2012, July 13). Why psychology isn't science. Los Angeles Times. Retrieved from http://latimes.com
Bringmann, L. F., & Eronen, M. I. (2015). Heating up the measurement debate: What psychologists can learn from the history of physics. Theory & Psychology, 26(1), 27-43. doi:10.1177/0959354315617253
Burke, D. (2014). Why isn't everyone an evolutionary psychologist? Frontiers in Psychology, 5. doi:10.3389/fpsyg.2014.00910
Campbell, H. (2012, July 17). A biologist and a psychologist square off over the definition of science. Science 2.0. Retrieved from http://www.science20.com
Chomsky, N. (1971). The case against BF Skinner. The New York Review of Books, 17(11), 18-24.
Cleland, C. E, & Brindell, S. (2013). Science and the messy, uncontrollable world of nature. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 183-202). Chicago, IL: University of Chicago Press.
Confer, J. C., Easton, J. A., Fleischman, D. S., Goetz, C. D., Lewis, D. M., Perilloux, C., & Buss, D. M. (2010). Evolutionary psychology: Controversies, questions, prospects, and limitations. American Psychologist, 65(2), 110-126. doi:10.1037/a0018413
Dagher, Z. R., & Erduran, S. (2016). Reconceptualizing nature of science for science education: Why does it matter? Science & Education, 25, 147-164. doi:10.1007/s11191-015-9800-8
Delprato, D. J., & Midgley, B. D. (1992). Some fundamentals of B. F. Skinner's behaviorism. American Psychologist, 47(11), 1507-1520. doi:10.1037//0003-066x.47.11.1507
Dienes, Z. (2011). Bayesian versus orthodox statistics: Which side are you on? Perspectives on Psychological Science, 6(3), 274-290. doi:10.1177/1745691611406920
Ecker, U. K., Lewandowsky, S., & Apai, J. (2011). Terrorists brought down the plane!—No, actually it was a technical fault: Processing corrections of emotive information. The Quarterly Journal of Experimental Psychology, 64(2), 283-310. doi:10.1080/17470218.2010.497927
Fanelli, D. (2010). “Positive” results increase down the hierarchy of the sciences. PLoS ONE, 5(4). doi:10.1371/journal.pone.0010068
Ferguson, C. J., & Heene, M. (2012). A vast graveyard of undead theories: Publication bias and psychological science's aversion to the null. Perspectives on Psychological Science, 7(6), 555-561. doi:10.1177/1745691612459059
Francis, G. (2012). Too good to be true: Publication bias in two prominent studies from experimental psychology. Psychonomic Bulletin & Review, 19(2), 151-156. doi:10.3758/s13423-012-0227-9
Galak, J., LeBoeuf, R. A., Nelson, L. D., & Simmons, J. P. (2012). Correcting the past: Failures to replicate psi. Journal of Personality and Social Psychology, 103(6), 933-948. doi:10.1037/a0029709
Godin, G., Conner, M., & Sheeran, P. (2005). Bridging the intention-behaviour gap: The role of moral norm. British Journal of Social Psychology, 44(4), 497-512. doi:10.1348/014466604x17452
Hansson, S. O. (2013). Defining pseudoscience and science. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 61-77). Chicago, IL: University of Chicago Press.
Ioannidis, J. P., Munafò, M. R., Fusar-Poli, P., Nosek, B. A., & David, S. P. (2014). Publication and other reporting biases in cognitive sciences: Detection, prevalence, and prevention. Trends in Cognitive Sciences, 18(5), 235-241. doi:10.1016/j.tics.2014.02.010
Irzik, G., & Nola, R. (2011). A family resemblance approach to the nature of science for science education. Science & Education, 20(7), 591-607. doi:10.1007/s11191-010-9293-4
Irzik, G., & Nola, R. (2014). New directions for nature of science research. In M. R. Matthews (Ed.), International Handbook of Research in History, Philosophy and Science Teaching (pp. 999-1021). Dordrecht: Springer.
John, L., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth-telling. Psychological Science, 23(5), 524-532. doi:10.1177/0956797611430953
Kahneman, D. (2014). A new etiquette for replication. Social Psychology, 45(4), 310-311.
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2(3), 196-217. doi:10.1207/s15327957pspr0203_4
Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Bahník, S., Bernstein, M. J., Bocian, K., … Nosek, B. (2014a). Investigating variation in replicability: A “many labs” replication project. Social Psychology, 45(3), 142-152. doi:10.1027/1864-9335/a000178
Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Bahník, S., Bernstein, M. J., Bocian, K., … Nosek, B. (2014b). Theory building through replication: Response to commentaries on the “many labs” replication project. Social Psychology, 45(4), 299-311. doi:10.1027/1864-9335/a000202
LeBel, E. P., & Peters, K. R. (2011). Fearing the future of empirical psychology: Bem's (2011) evidence of psi as a case study of deficiencies in modal research practice. Review of General Psychology, 15(4), 371-379. doi:10.1037/a0025172
Lilienfeld, S. O. (2011). Distinguishing scientific from pseudoscientific psychotherapies: Evaluating the role of theoretical plausibility, with a little help from Reverend Bayes. Clinical Psychology: Science and Practice, 18(2), 105-112. doi:10.1111/j.1468-2850.2011.01241.x
Lilienfeld, S. O., Ritschel, L. A., Lynn, S. J., Cautin, R. L., & Latzman, R. D. (2013). Why many clinical psychologists are resistant to evidence-based practice: Root causes and constructive remedies. Clinical Psychology Review, 33(7), 883-900. doi:10.1016/j.cpr.2012.09.008
Mahner, M. (2013). Science and pseudoscience: How to demarcate after the (alleged) demise of the demarcation problem. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 29-43). Chicago, IL: University of Chicago Press.
McNutt, M. (2014). Reproducibility. Science, 343(6168), 229. doi:10.1126/science.1250475
Michell, J. (2013). Constructs, inferences, and mental measurement. New Ideas in Psychology, 31(1), 13-21. doi:10.1016/j.newideapsych.2011.02.004
Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., … Laan, M. V. (2014). Promoting transparency in social science research. Science, 343(6166), 30-31. doi:10.1126/science.1245317
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., … & Contestabile, M. (2015). Promoting an open research culture. Science, 348(6242), 1422-1425.
Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
Pigliucci, M. (2013). The demarcation problem: A (belated) response to Laudan. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 9-28). Chicago, IL: University of Chicago Press.
Popper, K. (1957). Philosophy of science: A personal report. In C. A. Mace (Ed.), British Philosophy in Mid-Century (pp. 155-160). London: Allen and Unwin.
Rhodes, R. E., & Bruijn, G. D. (2013). How big is the physical activity intention-behaviour gap? A meta-analysis using the action control framework. British Journal of Health Psychology, 18(2), 296-309. doi:10.1111/bjhp.12032
Ritchie, S. J., Wiseman, R., & French, C. C. (2012). Failing the future: Three unsuccessful attempts to replicate Bem’s “retroactive facilitation of recall” effect. PLoS ONE, 7(3), e33423. doi:10.1371/journal.pone.0033423
Sarewitz, D. (2012). Beware the creeping cracks of bias. Nature, 485(7397), 149.
Service, R. F. (2002). Scientific misconduct: Bell Labs fires star physicist found guilty of forging data. Science, 298(5591), 30-31. doi:10.1126/science.298.5591.30
Sheeran, P. (2002). Intention—behavior relations: A conceptual and empirical review. European Review of Social Psychology, 12(1), 1-36. doi:10.1080/14792772143000003
Skinner, B. F. (1987). Whatever happened to psychology as the science of behavior? American Psychologist, 42(8), 780-786. doi:10.1037/0003-066x.42.8.780
Stricker, G. (1997). Are science and practice commensurable? American Psychologist, 52(4), 442-448. doi:10.1037//0003-066x.52.4.442
Wagenmakers, E., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (2011). Why psychologists must change the way they analyze their data: The case of psi: Comment on Bem (2011). Journal of Personality and Social Psychology, 100(3), 426-432. doi:10.1037/a0022790
Wagenmakers, E., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7(6), 632-638. doi:10.1177/1745691612463078
Zimbardo, P. G. (2004). Does psychology make a significant difference in our lives? American Psychologist, 59(5), 339-351.