Theory of Change for AI Safety Camp
post by Linda Linsefors · 2025-01-22T22:07:10.664Z · LW · GW · 3 comments
In a private discussion, related to our fundraiser, it was pointed out that AISC hasn't made clear enough what our theory of change is. Therefore this post.
Some caveats/context:
- This is my personal viewpoint. Other organisers might disagree about what is central or not.
- I’ve co-organised AISC1, AISC8, AISC9, and now AISC10. Remmelt has co-organised all except AISC2. Robert recently joined for AISC10.
- I hope there will be an AISC11, but in either case, I will no longer be an organiser. This is partly because I get restless when running the same thing too many times, and partly because there are other things I want to do. But I do think the AISC is in good hands with Remmelt and Robert.
Introduction
I think that AISC’s theory of change has a number of components/mechanisms, listed here in order of importance (in my judgement).
- Learning by doing
- Networking / Incubating new teams/projects
- Direct project outputs
- Letting people test their personal fit
- Miscellaneous
In the following sections, I’ll describe the motivation for each of them, and how they have shaped the design of AISC.
1. Learning by doing
If someone’s completely new to AI Safety, they should start out just reading up on the arguments. There are lots of resources for this, e.g. AI Safety Fundamentals, Rob Miles’s videos, and so many other reading lists and self-study guides that you probably don’t know where to start.
But at some point, after studying a lot, the time comes to try doing research (or some other concrete project) yourself. Taking this step is hard. Some people can do this on their own, but many more can’t. It’s easier to take this step together. The early AI Safety Camps were in some sense just that. There were some participants with prior AI safety research experience, but there were no research leads and no significant mentor involvement, just executing on the idea that it would be easier to do this together.
There is some tacit knowledge about actually trying to achieve something in AI safety, research or otherwise, that you only get by actually trying. Having an experienced mentor around to guide you helps and definitely speeds things up, but it’s not a necessary ingredient. If it were, AI safety could never have gotten started. I’d also argue that in AI safety there is less know-how for senior researchers to pass on to junior researchers than in a mature field like physics. We still want some amount of junior people trying out their own ideas, for the exploration value. Also, there are still not that many senior people around. For all of the above reasons, AISC has never relied on mentors, with the exception of AISC6.
So what are the research leads then? The research lead format that we’ve used since AISC8 is a solution to a practical problem. At the earlier camps (i.e. AISC1 - AISC5), we first accepted participants, then everyone brainstormed project ideas together, and after that people got coordinated into teams based on which ideas they liked. This process was hard to coordinate well and sometimes produced teams of people who only superficially wanted the same thing. It was kind of a mess.
In the new format, we invite people to suggest project ideas for AISC in the research lead application phase. Then we, the organisers, help them (if needed) turn this idea into an actual plan, with clear expectations for what joining this project would entail. Then everyone on the internet can have a look at these plans, and apply to the ones they like. This creates teams that have a clear direction from the start, and with team members who want the same thing.
Sometimes the research lead also provides mentorship, but this is a bonus rather than a core part. The research lead is not always the most senior person on the team (we get some pretty awesome team member applications).
(To avoid misunderstandings, I should point out that we also do filter which projects we accept. See here for more on this.)
Another large benefit of the new format is that we now let the research leads take care of selecting their teams. This means that the organisers get to spend much less time on applications, which previously was a huge task. This, together with massively simplifying the team formation process, has allowed us to scale up AISC a lot.
I do think something was lost because we no longer let everyone participate in project idea creation, but I also think the tradeoff was worth it.
However, I do know that some research leads (against my explicit warnings of how this can go wrong) have chosen, to different degrees, not to have a clearly specified project, but instead to let their team members take part in deciding what to do. I’d be interested in seeing more of this in future AISCs, but I do think it is a harder task, and would therefore want higher requirements for the research lead than for more narrowly specified projects.
2. Networking / Incubating new teams/projects
I put these as one, since I think of [Incubating new teams/projects] as a subset of [networking], i.e., one of the biggest values of networking is the chance to find long-term collaborators.
I think the networking aspect was weakened when AISC moved online, since this reduced the interactions between teams. We tried to compensate for this with various activities, but most participants are just too busy with their own project (and the rest of their lives) to participate in inter-team activities.
Therefore the networking is mostly just with their own team. This may not seem very efficient, and it’s definitely not how I would design an event that was primarily focused on networking. But on the other hand, the repeated interaction of working together on a project for months does have obvious benefits. You may not meet lots of people, but you have time to get to know them better, and you get to see how you work together on an actual project.
The networking benefits are probably clearest when seen from the point of view of the research lead. As a research lead, you have an idea for something that you want to do, and you would like collaborators, but you may not know how to find them. It’s not easy to get the attention of the people who may be a good fit for your project when you don’t know who they are and they don’t know who you are. Enter AISC! By collecting and listing all the projects together, we attract much more attention than any project would have done on its own. Many research leads have expressed that they are impressed by the quality of applicants this method has provided for them.
While most projects end at the end of AISC, a few teams keep working together. My favourite example is AI Standards Lab, which is the continuation of a project at AISC8 (2023).
3. Direct project outputs
Every project plan includes an intended project output to be delivered at the end of AISC, together with a project-specific theory of change.
However, 3 months of part-time work is not a lot of time, especially when you also need to make time at the start to get to know each other and figure out how to work together. For this reason, we should expect the impact from direct project outputs to be less than the impact from lessons learned and connections made during the camp.
I think this expectation is backed up by observation. E.g. you can read the public endorsements we got from past participants on our current fundraiser. This is just a handful of comments, but in my experience, this is representative of what our alumni value about AISC.
(The reason we insist that every project should aim for real impact has as much to do with learning-by-doing as with impact from direct project outputs.)
However, we do also have some nice project outputs. Here are some from AISC9 (2024):
- Ambitious Mech-Interp, led by Alice Rigg
- SatisfAI, led by Jobst Heitzig
- Asymmetric control in LLMs, led by Domenic Rosati
However, sometimes it’s harder to define what counts as an AISC project output. Extracting Paragraphs from LLM Token Activations is the output of a project that started at AISC9 and continued at SPAR.
4. Letting people test their personal fit
I don’t have much to say about this one. It’s not something I think much about. I felt like it should probably be on the list, since others often mention it as one of the benefits of AISC.
If someone wants to join AISC to see if AI safety is something they want to do, or test their fit for some AI safety relevant role or task, then I’m all for it.
One AISC alum told me that they thought they were not smart enough to do AI safety research, but also thought they should at least give it a try before giving up. They joined AISC6 and are now a full-time AI safety researcher.
5. Miscellaneous
The design of AISC is driven mainly by considerations 1 and 2, together with practical constraints, and scale (by scale I mean that helping more people is better, and this does trade off against other values). But this still leaves quite a bit of wiggle room.
When we can, AISC strives to be welcoming. Ideally we’d like to give everyone a chance. Sadly we can’t accept every applicant. E.g., I’d never ask a research lead to accept someone they don’t want on their team, and even if that wasn’t a constraint, there are just too many applicants. But we’re not interested in being more restrictive than we have to, just to raise the prestige of AISC. I’m proud of our (relatively) high acceptance rate. For AISC10 we accepted 192 out of 420 applicants, by which I mean, our research leads selected these people.
We try to make interacting with us a net positive experience for everyone. I hope the rest of the post has made it clear why research leads and other participants will get value out of AISC. But we’re also trying to do something for people we reject. We talk to every research lead applicant and give them personal feedback. There is no realistic way we can give personal feedback to all team member applicants, but to not leave them totally without value, we sent this email to everyone who didn’t get in. We also sent out email introductions to connect applicants to other nearby applicants, for those who opted in to this service, and as part of this I also collected links to local AI safety groups.
Flowchart model
I was told to include one of these. I personally think flowcharts like this cut out too much of the relevant context, and therefore aren't useful for my own thinking. But I do think it can be useful as a communication tool, so here you go, I made this for you (you = anyone who reads this post and likes flowcharts).
We’re currently fundraising
You can donate to us here. You can discuss our fundraiser, funding situation, and whether or not you think AISC should be funded, here [LW · GW] or here [EA · GW].
Acknowledgement
Thanks to Gergő Gáspár, Remmelt Ellen, Robert Kralisch and Phil Hazelden for helpful feedback and proofreading. The flowchart is modelled after this one for EA Hotel.
3 comments
comment by Lucas Teixeira · 2025-01-22T22:52:40.970Z · LW(p) · GW(p)
so here you go, I made this for you
I don't see a flow chart
comment by Linda Linsefors · 2025-01-23T10:19:52.920Z · LW(p) · GW(p)
This comment has two disagree votes, which I interpret as other people seeing the flowchart. I see it too. If it still doesn't work for you for some reason, you can also see it here: AISC ToC Graph - Google Drawings
comment by Lucas Teixeira · 2025-01-23T17:24:02.013Z · LW(p) · GW(p)
I see it now