Update on Harvard AI Safety Team and MIT AI Alignment

post by Xander Davies (xanderdavies), Sam Marks (samuel-marks), kaivu (kaivalya-hariharan), tlevin (trevor), eleni (guicosta), maxnadeau, Naomi Bashkansky · 2022-12-02T00:56:45.596Z · LW · GW · 4 comments

Contents

  What we’ve been doing
  What worked
    Communication & Outreach Strategy
    Operations
    Pedagogy
  Mistakes/Areas for Improvement
  Next Steps/Future Plans
  How You Can Get Involved
4 comments

We help organize the Harvard AI Safety Team (HAIST) and MIT AI Alignment (MAIA), and are excited about our groups and the progress we’ve made over the last semester. 

In this post, we’ve attempted to think through what worked (and didn’t work!) for HAIST and MAIA, along with more details about what we’ve done and what our future plans are. We hope this is useful for the many other AI safety groups that exist or may soon exist, as well as for others thinking about how best to build community and excitement around working to reduce risks from advanced AI.

Important things that worked:

Important things we got wrong:

If you’re interested in supporting the alignment community in our area, the Cambridge Boston Alignment Initiative is currently hiring.

What we’ve been doing

HAIST and MAIA are concluding a 3-month period during which we expanded from one group of about 15 Harvard and MIT students who read AI alignment papers together once a week to two large student organizations that:

What worked

Communication & Outreach Strategy

Operations

Pedagogy

After we incorporate a final round of participant feedback, we’ll release our final adaptation of the AGISF curriculum, structured as 9 weeks of two-hour meetings, and with various minor curricular substitutions.

Mistakes/Areas for Improvement

Next Steps/Future Plans

At this stage, we’re most focused on addressing mistakes and opportunities for improvement on existing programming (see above). Concretely, some of our near-term top priorities are:

How You Can Get Involved

4 comments

Comments sorted by top scores.

comment by aogara (Aidan O'Gara) · 2022-12-02T19:11:06.059Z · LW(p) · GW(p)

This is fantastic, thank you for sharing. I helped start USC AI Safety this semester and we're facing a lot of the same challenges. Some questions for you -- feel free to answer some but not all of them:

  • What does your Research Fellows program look like? 
    • In particular: How many different research projects do you have running at once? How many group members are involved in each project? Have you published any results yet?
    • Also, in terms of hours spent or counterfactual likelihood of producing a useful result, how much of the research contributions come from students without significant prior research experience vs. people who've already published papers or otherwise have significant research experience? 
    • The motivation for this question is that we'd like to start our own research track, but we don't have anyone in our group with the research experience of your PhD students or PhD graduates. One option would be to have students lead research projects, hopefully with advising from senior researchers that can contribute ~1 hour / week or less. But if that doesn't seem likely to produce useful outputs or learning experiences, we could also just focus on skilling up and getting people jobs with experienced researchers at other institutions. Which sounds more valuable to you?
  • What about the general member reading group?
    • Is there a curriculum you follow, or do you pick readings week-by-week based on discussion? 
    • It seems like there are a lot of potential activities for advanced members: reading groups, the Research Fellows program, facilitating intro groups, weekly social events, and participating in any opportunities outside of HAIST. Do you see a tradeoff where dedicated members are forced to choose which activities to focus on? Or is it more of a flywheel effect, where more engagement begets more dedication? For the typical person who finished your AGISF intro group and has good technical skills, which activities would you most want them to focus on? (My guess would be research > outreach and facilitation > participant in reading groups > social events.)
    • Broadly I agree with your focus on the most skilled and engaged members, and I'd worry that the ease of scaling up intro discussions could distract us from prioritizing research and skill-building for those members. How do you plan to deeply engage your advanced members going forward?
  • Do you have any thoughts on the tradeoff between using AGISF vs. the ML Safety Scholars curriculum for your introductory reading group? 
    • MLSS requires ML skills as a prerequisite, which is both a barrier to entry and a benefit. Instead of conceptual discussions of AGI and x-risk, it focuses on coding projects and published ML papers on topics like robustness and anomaly detection. 
    • This semester we used a combination of both, and my impression is that the MLSS selections were better received, particularly the coding assignments. (We'll have survey results on this soon.) This squares with your takeaway that students care about "the technically interesting parts of alignment (rather than its altruistic importance)". 
    • MLSS might also be better from a research-centered approach if research opportunities in the EA ecosystem are limited but students can do safety-relevant work with mainstream ML researchers.
    • On the other hand, AGISF seems better at making the case that AGI poses an x-risk this century. A good chunk of our members still are not convinced of that argument, so I'm planning to update the curriculum at least slightly towards more conceptual discussion of AGI and x-risks. 
  • How valuable do you think your Governance track is relative to your technical tracks? 
    • Personally I think governance is interesting and important, and I wouldn't want the entire field of AI safety to be focused on technical topics. But thinking about our group, all of our members are more technically skilled than they are in philosophy, politics, or economics. Do you think it's worth putting in the effort to recruit non-technical members and running a Governance track next semester, or would that effort better be spent focusing on technical members?

Appreciate you sharing all these detailed takeaways, it's really helpful for planning our group's activities. Good luck with next semester!

Replies from: samuel-marks
comment by Sam Marks (samuel-marks) · 2022-12-04T18:32:46.127Z · LW(p) · GW(p)

These are all fantastic questions! I'll try to answer some of the ones I can. (Unfortunately a lot of the people who could answer the rest are pretty busy right now with EAGxBerkeley, getting set up for REMIX, etc., but I'm guessing that they'll start having a chance to answer some of these in the coming days.)

Regarding the research program, I'm guessing there are around 6-10 research projects ongoing, with between 1 and 3 students working on each; I'm guessing almost none of the participants have previous research experience. (Kuhan would have the actual numbers here.) This program just got started in late October, so certainly no published results yet.

I'm guessing the mentors are not all on the same page about how much of the value comes from doing object-level useful research vs. upskilling. My feeling is that it's mostly upskilling, with the exception of a few projects where the mentor was basically taking on an RA for a project they were already working on full-time. In fact, when pitching projects, I explicitly disclaimed for some of them that I thought they were likely not useful for alignment (but would be useful for learning research skills and ML upskilling).

It sounds like in your situation, there's a lack of experienced mentors. (Though I'll note that a mentor spending ~1 hour per week meeting with a group sounds like plenty to me.) If that's right, then I think I'd recommend focusing on ML upskilling programming instead of starting a research program. My thoughts here are: (1) I doubt participants will get much mileage out of working on projects that they came up with themselves, especially without mentors to help them shape their work; (2) poorly mentored research projects can be frustrating for the mentees, and might sour them on further engaging with your programming or AI safety as a whole; (3) ML upskilling programming seems almost as valuable to me and much easier to do well.

 

Regarding general member programming: for our weekly reading group, we pick readings week-by-week, usually based on someone messaging a group chat saying "I'd really love to read X this week." (X was often something that had come out in the last week or so.) I don't think this was an especially good way to do things, but we got lucky and it mostly worked out.

That said, I think most of the value here was from getting a bunch of aligned people in a room reading something and discussing with each other. If you don't already have a lot of people sold on AI x-risk and with a background similar to having completed AGISF, I think it'd be better to run a more structured reading group rather than doing something like this.

Like we mentioned in the post, we think that we actually underinvested in developing programming for our members to participate in (instead putting slightly too much work into making the intro fellowship go well). Most of our full members were too busy for the research program, and the bar for facilitating for our intro fellowship was relatively high (other than Xander, all of our facilitators were PhD students or people who worked full-time on AIS). So the only real things we had for full members were the weekly general member meetings and the retreats at the end of the semester.

For the typical person who finished your AGISF intro group and has good technical skills, which activities would you most want them to focus on? (My guess would be research > outreach and facilitation > participant in reading groups > social events.)

I think my ordering would be

research > further ML upskilling > reading groups > outreach

with social events not really mattering much to me, and facilitating not being an option for most of them, thanks to our wealth of over-qualified facilitators. I'm not sure how this should translate to your situation, sorry.

Regarding the intro fellowship, we hadn't really considered MLSS at all, and probably we should have. I think we were approaching things from a frame separating our programming into things that require coding (ML upskilling) and things that don't (AGISF), but this was potentially a mistake. The MLSS curriculum looks good, I agree that it seems better at getting people research-ready, and I'll think about whether it makes sense to incorporate some of this stuff for next semester -- thanks for this suggestion!

One dynamic to keep in mind is that when you advertise for an AI educational program, you'll get a whole bunch of people who are excited about AI and don't care much about the safety angle (it seems like lots of the people we attracted to our research program were like this). To some extent this is okay -- it gives a chance to persuade people who would have otherwise gone into AI capabilities work! -- but I think it's also worth trying not to spend resources teaching ML to people who will just go off and work in capabilities. One nice thing about AGISF is that it starts off with multiple weeks on safety, allowing people who aren't interested in safety to self-select out before the technical material. (And the technical content is mostly stuff that I'm not worried could advance capabilities anyway.) So if you've noticed that you have a lot of people sticking around to the end of your curriculum without really engaging with the safety angle, I might recommend front-loading some AGISF-style safety content.

Anyway, above-and-beyond anything I say above, I think my top piece of advice is to have a 1-1 call with Xander (or more if you've spoken with him already). I think Xander is really good at this stuff and consistently made really good judgement calls in the process of building HAIST and MAIA, and I expect he'd be really helpful in helping you think through the same issues in your context at USC.

comment by Charlie Steiner · 2022-12-03T09:41:07.715Z · LW(p) · GW(p)

Great to hear! Maybe I'll see some of you next year.

comment by Gabe M (gabe-mukobi) · 2022-12-14T04:53:07.247Z · LW(p) · GW(p)

Congrats all, it seems like you were wildly successful in just 1 semester of this new strategy!

I have a couple of questions:

130 in 13 weekly reading groups

  1. = 10 people per group, that feels like a lot and maybe contributed to the high drop rate. Do you think this size was ideal?

Ran two retreats, with a total of 85 unique attendees

  1. These seem like huge retreats compared to other university EA retreats at least, and more like mini-conferences. Was this the right size, or do you think they would have been more valuable as more selective and smaller things where the participants perhaps got to know each other better?

two weekly AI governance fellowships with 15 initial and 14 continuing participants.

  1. This retention rate seems very high, though I imagine maybe these were mostly people already into AI gov and not representative of what a scaled-up cohort would look like. Do you plan to also expand AI governance outreach/programming next term?

Overall, I'm really glad you're doing all these things and paving the way for others to follow--we'll seek to replicate some of your success at Stanford :)