Our plan for 2019-2020: consulting for AI Safety education

post by RAISE · 2019-06-03T16:51:27.534Z · LW · GW · 17 comments

Contents

  Trial results
  If funding wasn’t a problem
  Funding might not be a problem
  Our new direction
  We’re hiring
  Let's talk

UPDATE: this plan received sizable criticism. We are reflecting on it, and working on a revision.

Tl;dr: a conversation with a grantmaker made us drop our long-held assumption that outputs needed to be concrete to be recognized. We decided to take a step back and approach the improvement of the AI Safety pipeline on a more abstract level, doing consulting and research to develop expertise in the area. This will be our focus in the next year.

Trial results

We tested our course in April and didn't get a positive result. This looks like it was due to bad test design: high variance and a low number of participants clouded any pattern that could have emerged. In hindsight, we clearly should have tested knowledge before the intervention as well as after it, though arguably this would have been nearly impossible given the one-month deadline our funder imposed.

What we did learn is that we are largely unaware of the extent to which our course is being used. This is mostly due to using software that is not yet mature enough to give us this kind of data. If we want to continue building the course, our first priority ought to be a feedback mechanism that gives us precise insight into how students are journeying through it.

However, other developments have pointed our attention away from developing the course, and towards developing the question that the course is an answer to.

If funding wasn’t a problem

During RAISE's existence, its runway has never been longer than about two months. This crippled our ability to make long-term decisions, in favor of dishing out quick results to show value. Seen from a "quick feedback loops" paradigm, this may have been a healthy dynamic, but it also led to sacrifices we didn't actually want to make.

Had we been handed our particular niche without any funding constraints, our first move would have been an extensive study of what the field needs. We feel that EA is missing a management layer. There is a lot that a community-focused management consultant could do simply by connecting the dots and coordinating the many projects and initiatives in the LTF space. We have identified 30 (!) small and large organisations involved in AI Safety. Not all of them are talking to each other, or even aware of each other.

Our niche being AI Safety education, we would have spent a good six months developing expertise and a network in this area. We would have studied the scientific frontiers of relevant domains like education and the metasciences. We would have interviewed AIS organisations and asked them what they look for in employees. We would have studied existing alignment researchers and looked for patterns. We would have talked to grantmakers and considered their models.

Funding might not be a problem

After getting turned down by the LTF fund [LW · GW] (which was especially meaningful because they didn’t seem to be constrained by funding), we had a conversation with one of their grantmakers. The premise of the conversation was something like “what version of RAISE would you be willing to fund?” The answer was pretty much what we just described. They thought pipeline improvement was important, but hard, and just going with the first idea that sounds good (an online course) would be a lucky shot if it worked. Instead, someone should be thinking about the bigger picture first.

The mistake we had been making from the beginning was to assume we needed concrete results to be taken seriously.

Our new direction

EA really does seem to be missing a management layer. People are thinking about their careers, starting organisations, doing direct work and research. Not many people are drawing up plans for coordination on a higher level and telling people what to do. Someone ought to be dividing up the big picture into roles for people to fill. You can see the demand for this by how seriously we take 80k. They’re the only ones doing this beyond the organisational level.

Much the same holds in the cause area we call AI Safety Education. Most AIS organisations are necessarily thinking about hiring and training, but no one is specializing in it. In the coming year, our aim is to fill this niche, building expertise and doing management consulting. We will aim to smarten up the coordination there. Concrete outputs might be:

+ Advice for grantmakers that want to invest in the AI Safety researcher pipeline
+ Advice for students that want to get up to speed and test themselves quickly
+ Suggesting interventions for entrepreneurs that want to fill up gaps in the ecosystem
+ Publishing thinkpieces that advance the discussion of the community, like this one [EA · GW]
+ Creating and keeping wiki pages about subjects that are relevant to us
+ Helping AIS research orgs with their recruitment process

We’re hiring

Do you think this is important? Would you like to fast track your involvement with the Xrisk community? Do you have good google-fu, or would you like to conduct depth interviews with admirable people? Most importantly, are you not afraid to hack your own trail?

We think we could use one or two more people to join us in this effort. You’d be living for free in the EA Hotel. We can’t promise any salary in addition to that. Do ask us for more info!

Let's talk

A large part of our work will involve talking to those involved in AI Safety. If you are working in this field, and interested in working on the pipeline, then we would like to talk to you.

If you have important information to share, have been plotting to do something in this area for a while, and want to compare perspectives, then we would like to talk to you.

And even if you would just like to have an open-ended chat about any of this, we would like to talk to you!

You can reach us at raise@aisafety.info



17 comments

Comments sorted by top scores.

comment by habryka (habryka4) · 2019-06-03T22:51:32.902Z · LW(p) · GW(p)

As the funder that you are very likely referring to, I do want to highlight that I don't feel like this summarizes my views particularly well. In particular this section:

EA really does seem to be missing a management layer. People are thinking about their careers, starting organisations, doing direct work and research. Not many people are drawing up plans for coordination on a higher level and telling people what to do. Someone ought to be dividing up the big picture into roles for people to fill. You can see the demand for this by how seriously we take 80k. They’re the only ones doing this beyond the organisational level.
Much the same in the cause area we call AI Safety Education. Most AIS organisations are necessarily thinking about hiring and training, but no one is specializing in it. In the coming year, our aim is to fill this niche, building expertise and doing management consulting. We will aim to smarten up the coordination there. Concrete outputs might be:
+ Advice for grantmakers that want to invest in the AI Safety researcher pipeline
+ Advice for students that want to get up to speed and test themselves quickly
+ Suggesting interventions for entrepreneurs that want to fill up gaps in the ecosystem
+ Publishing thinkpieces that advance the discussion of the community, like this one [EA · GW]
+ Creating and keeping wiki pages about subjects that are relevant to us
+ Helping AIS research orgs with their recruitment process

I think in general people should be very hesitant to work on social coordination problems because they can't find a way to make progress on the object-level problems. My recommendation was very concretely "try to build an internal model of what really needs to happen for AI-risk to go well" and very much not "try to tell other people what really needs to happen for AI-risk", which is almost the exact opposite.

I actually think going explicitly in this direction is possibly worse than RAISE's previous plans. One of my biggest concerns with RAISE was precisely that it was trying far too early to tell people what exactly to learn and what to do, without understanding the relevant problems themselves first. This seems like it exacerbates that problem by trying to make your job explicitly about telling other people what to do.

A lot of my thoughts in this space are summarized by the discussion around Davis' recent post "Go Do Something", in particular Ray's and Ben Hoffman's comments about working on social coordination technology:

Benquo:

This works for versions of "do something" that mainly interact with objective reality, but there's a pretty awful value-misalignment problem if the way you figure out what works is through feedback from social reality.
So, for instance, learning to go camping or cook or move your body better or paint a mural on your wall might count, but starting a socially legible project may be actively harmful if you don't have a specific need that it's meeting that you're explicitly tracking. And unfortunately too much of people's idea of what "go do something" ends up pointing to trying to collect credit for doing things.
Sitting somewhere doing nothing (which is basically what much meditation is) is at least unlikely to be harmful, and while of limited use in some circumstances, often an important intermediate stage in between trying to look like you're doing things, and authentically acting in the world.

Ray:

It's been said before for sure, but worth saying periodically.
Something I'd add, which particularly seems like the failure mode I see in EA spheres (less in rationalist spheres, but they blur together):
Try to do something other than solve coordination problems.
Or, try to do something that provides immediate value to whoever uses it, regardless of whether other people are also using it.
A failure mode I see (and have often fallen to) is looking around and thinking "hmm, I don't know how to do something technical, and/or I don't have the specialist skills necessary to do something specialist. But, I can clearly see problems that stem from people being uncoordinated. I think I roughly know how people work, and I think I can understand this problem, so I will work on that."
But:
+ It actually requires just as much complex specialist knowledge to solve coordination problems as it does to do [whatever other thing you were considering].
+ Every time someone attempts to rally people around a new solution, and fails, they make it harder for the next person who tries to rally people around a new solution. This makes the coordination system overall worse.
This is a fairly different framing than Benquo's (and Eliezer's) advice, although I think it amounts to something similar.
Replies from: Wei_Dai, chris-leong, toonalfrink
comment by Wei_Dai · 2019-06-09T10:08:28.654Z · LW(p) · GW(p)

My recommendation was very concretely “try to build an internal model of what really needs to happen for AI-risk to go well”

I'm not sure anyone knows what really needs to happen for AI-risk to go well, including people who have been thinking about this question for many years. Do you really mean for RAISE to solve this problem, or just to think about this question for more than they already have, or to try to learn the best available model from someone else (if so who)?

Replies from: habryka4
comment by habryka (habryka4) · 2019-06-09T18:50:57.786Z · LW(p) · GW(p)

Mostly think more about this question than they already have, which likely includes learning the best available models from others.

The critique here was more one of intention than one of epistemic state. It seems to me like there is a mental motion of being curious about how to make progress on something, even if one is still confused, which I contrast with a mental motion of "trying to look like you are working on the problem".

Replies from: Wei_Dai
comment by Wei_Dai · 2019-06-09T19:56:04.792Z · LW(p) · GW(p)

Ah ok. Given that, it seems like you need to explain your critique more, or try to figure out the root cause of the wrong intention and address that, otherwise wouldn't they just switch to "trying to look like you're trying to build models of what needs to be done to solve AI risk"?

Another problem is that it seems even harder to distinguish between people who are really trying to build such models and people who are just trying to look like they're doing that, because there's no short-term feedback from reality to tell you whether someone's model is any good. It seems like suggesting that people do that when you're not sure of their intentions is really dangerous, as it could mess up the epistemic situation with AI risk models (even more than it already is). Maybe it would be better to just suggest some concrete short-term projects for them to do instead?

comment by Chris Leong (chris-leong) · 2019-06-04T02:24:04.666Z · LW(p) · GW(p)

Maybe there is a possible project in this direction. I'll assume that this is general advice you'd give to many people who want to work in this space. If it is important for people to build a model of what is required for AI to go well, then people may as well work on this together. And sure, there are websites like LessWrong, but people can exchange information much faster by chatting either in person or over Skype. (Of course, there are worries that this might lead to overly correlated answers.)

comment by toonalfrink · 2019-06-03T23:14:56.955Z · LW(p) · GW(p)

Looks like I didn't entirely succeed in explaining our plan.

My recommendation was very concretely "try to build an internal model of what really needs to happen for AI-risk to go well" and very much not "try to tell other people what really needs to happen for AI-risk", which is almost the exact opposite.

And that's also what we meant. The goal isn't to just give advice. The goal is to give useful and true advice, and this necessarily requires a model of what really needs to happen for AI risk.

We're not just going to spin up some interesting ideas. That's not the mindset. The mindset is to generate a robust model and take it from there, if we ever get that far.

We might be talking to people in the process, but as long as we are in the dark the emphasis will be on asking questions.

EDIT: this wasn't a thoughtful reply. I take it back. See Ruby's comments below

Replies from: Ruby
comment by Ruby · 2019-06-04T18:10:52.055Z · LW(p) · GW(p)
The goal is to give useful and true advice, and this necessarily requires a model of what really needs to happen for AI risk.

My understanding of habryka, and the position I'd agree with, is that the goal should be solely to be forming a model at this time. As currently stated, even before you've formed the model, there is a presumption that once you have a model the right thing to do is then instruct others. That is premature.

The mindset is to generate a robust model and take it from there, if we ever get that far.

That kind of seems good, but that's not what your plan says. The plan as written states that a management layer is required - that seems like a conclusion you might reach after you've done your model-building stage, yet this plan states it up front. That seems like a good recipe for confirmation bias to me.

I think there's an alternative version of this point, and a good underlying procedure, which is that along the way of forming a model of things you generate various hypotheses. That the EA community needs a management layer or someone to connect the dots, etc., is maybe a reasonable hypothesis / proto-model of the situation. It's something you could possibly post about and solicit feedback and ideas from others and talk to about. That seems good and it seems good if you can generate that hypothesis plus other hypotheses (caveats being, as habryka stated, you should likely start with non-coordination work).

An alternative version of your plan which might receive better reception from many is purely research: "We are going to research questions X and Y." You might have a knack for that, and I could see it getting funded if you had a good agenda and plan, your questioning wouldn't disrupt or displace other work, and others weren't worried you would pivot it into something counterproductive.

Replies from: Ruby, Ruby, habryka4
comment by Ruby · 2019-06-04T18:28:58.329Z · LW(p) · GW(p)

Sorry, I have to add one more piece that really seems worth calling out.

Looks like I didn't entirely succeed in explaining our plan.
"My recommendation was very concretely 'try to build an internal model of what really needs to happen for AI-risk to go well' and very much not 'try to tell other people what really needs to happen for AI-risk', which is almost the exact opposite."
And that's also what we meant. The goal isn't to just give advice. The goal is to give useful and true advice, and this necessarily requires a model of what really needs to happen for AI risk.

If someone's position is "very much don't try to tell other people what really needs to happen for AI-risk", you cannot claim that you meant the same thing when your position includes "the goal is to give useful and true advice". You don't also mean what Habryka means at all (you're ignoring it), and it strikes me as disingenuous to claim that you mean the same thing as him at all.

More formally:

Person A: "X, definitely not Y"
Person B: "X and Y"

Person B is absolutely not saying the same thing as A.


Replies from: toonalfrink
comment by toonalfrink · 2019-06-05T15:50:16.654Z · LW(p) · GW(p)

Apologies, that was a knee-jerk reply. I take it back: we did disagree about something.

We're going to take some time to let all of this criticism sink in.

comment by Ruby · 2019-06-04T18:20:41.819Z · LW(p) · GW(p)

Additional thought, just as an example, something in the post that seemed off:

Instead, someone should be thinking about the bigger picture first.

Researchers and people working at the various organizations are clever, capable people. I would venture that they're necessarily thinking about the big picture as part of their work and how it fits in. Given their connection and involvement (possibly over the course of years), their models are probably difficult to surpass if you're starting from the outside. I also don't think it's as simple as talking to lots of people to get all their models and combining them (communicating and synthesizing models is really hard + you'd need to build a lot of credibility before others trusted you could, assuming anyone could really do this well).

The line quoted above, as written, almost makes it seem like there's low-hanging fruit because everyone currently working was heads-down on their own problems and no one thought to work on the big picture. That strikes me as a very bad assumption and makes me worry about the kind of reasoning you would use to advise others. Possibly you meant something different from that and more defensible... but then unambiguous and clear communication is going to be very key in any coordination/advisory role.

Anyhow, I dislike being purely critical. Not many people dedicate themselves to trying to solve important problems, so I want to say that I do approve of efforts to try. I think it's good that you sought feedback on your first project and then formed a new plan. I've written these comments because I hope they help nudge you in the direction of really good plans. If they're biased toward criticism, it's because I'm trying to explain more of the factors I think might be leading to a negative reception of this plan. Because we need all the good plans, and good people working on them, that we can get.

comment by habryka (habryka4) · 2019-06-04T18:53:47.394Z · LW(p) · GW(p)

This seems roughly correct to me.

comment by RyanCarey · 2019-06-04T17:33:25.551Z · LW(p) · GW(p)

Hey! Thanks for sharing your experience with RAISE.

I'm sorry to say it, but I'm not convinced by this plan overall. Also, on the meta-level, I think you got insufficient feedback on the idea before sharing it. Personally, my preferred format for giving inline feedback on a project idea is Google Docs, so I've copied this post into a GDoc HERE and added a bunch of my thoughts there.

I don't mean to discourage you guys, but I think a number of aspects of this proposal are pretty ill-considered and need substantial revision. I'd be happy to provide further input.

Replies from: toonalfrink, habryka4
comment by toonalfrink · 2019-06-05T16:30:36.403Z · LW(p) · GW(p)

Thank you. We're reflecting on this and will reach out to have a conversation soon.

comment by habryka (habryka4) · 2019-06-04T17:46:55.301Z · LW(p) · GW(p)

Note: I think view access to a document is not sufficient to see comments. At least I can't see any comments.

Replies from: RyanCarey
comment by RyanCarey · 2019-06-04T18:53:40.325Z · LW(p) · GW(p)

Should now be fixed