Posts

Talk: AI safety fieldbuilding at MATS 2024-06-23T23:06:37.623Z
Talent Needs of Technical AI Safety Teams 2024-05-24T00:36:40.486Z
MATS Winter 2023-24 Retrospective 2024-05-11T00:09:17.059Z
MATS AI Safety Strategy Curriculum 2024-03-07T19:59:37.434Z
Announcing the London Initiative for Safe AI (LISA) 2024-02-02T23:17:47.011Z
MATS Summer 2023 Retrospective 2023-12-01T23:29:47.958Z
Apply for MATS Winter 2023-24! 2023-10-21T02:27:34.350Z
[Job Ad] SERI MATS is (still) hiring for our summer program 2023-06-06T21:07:07.185Z
How MATS addresses “mass movement building” concerns 2023-05-04T00:55:26.913Z
SERI MATS - Summer 2023 Cohort 2023-04-08T15:32:56.737Z
Aspiring AI safety researchers should ~argmax over AGI timelines 2023-03-03T02:04:51.685Z
Would more model evals teams be good? 2023-02-25T22:01:31.568Z
Air-gapping evaluation and support 2022-12-26T22:52:29.881Z
Probably good projects for the AI safety ecosystem 2022-12-05T02:26:41.623Z
Ryan Kidd's Shortform 2022-10-13T19:12:47.984Z
SERI MATS Program - Winter 2022 Cohort 2022-10-08T19:09:53.231Z
Selection processes for subagents 2022-06-30T23:57:25.699Z
SERI ML Alignment Theory Scholars Program 2022 2022-04-27T00:43:38.221Z
Ensembling the greedy doctor problem 2022-04-18T19:16:00.916Z
Is Fisherian Runaway Gradient Hacking? 2022-04-10T13:47:16.454Z
Introduction to inaccessible information 2021-12-09T01:28:48.154Z

Comments

Comment by Ryan Kidd (ryankidd44) on Ryan Kidd's Shortform · 2024-07-15T22:59:09.471Z · LW · GW

I interpret your comment as assuming that new researchers with good ideas produce more impact on their own than in teams working towards a shared goal; this seems false to me. I think that independent research is usually a bad bet in general and that most new AI safety researchers should be working on relatively few impactful research directions, most of which are best pursued within a team due to the nature of the research (though some investment in other directions seems good for the portfolio).

I've addressed this a bit in thread, but here are some more thoughts:

  • New AI safety researchers seem to face mundane barriers to reducing AI catastrophic risk, including funding, infrastructure, and general executive function.
  • MATS alumni are generally doing great stuff (~46% currently work in AI safety/control, ~1.4% work on AI capabilities), but we can do even better.
  • Like any other nascent scientific/engineering discipline, AI safety will produce more impactful research with scale, albeit with some diminishing returns on impact eventually (I think we are far from the inflection point, however).
  • MATS alumni, as a large swathe of the most talented new AI safety researchers in my (possibly biased) opinion, should ideally not experience mundane barriers to reducing AI catastrophic risk.
  • Independent research seems worse than team-based research for most research that aims to reduce AI catastrophic risk:
    • "Pair-programming", builder-breaker, rubber-ducking, etc. are valuable parts of the research process and are benefited by working in a team.
    • Funding insecurity and grantwriting responsibilities are larger for independent researchers and obstruct research.
    • Orgs with larger teams and discretionary funding can take on interns to help scale projects and provide mentorship.
    • Good prosaic AI safety research largely looks more like large teams doing engineering and less like lone geniuses doing maths. Obviously, some lone genius researchers (especially on mathsy non-prosaic agendas) seem great for the portfolio too, but these people seem hard to find/train anyways (so there is often more alpha in the former by my lights). Also, I have doubts that the optimal mechanism to incentivize "lone genius research" is via small independent grants instead of large bounties and academic nerdsniping.
  • Therefore, more infrastructure and funding for MATS alumni, who are generally value-aligned and competent, is good for reducing AI catastrophic risk in expectation.
Comment by Ryan Kidd (ryankidd44) on Ryan Kidd's Shortform · 2024-07-15T19:00:28.952Z · LW · GW

Also note that historically many individuals entering AI safety seem to have been pursuing the "Connector" path, when most jobs now (and probably in the future) are "Iterator"-shaped, and larger AI safety projects are also principally bottlenecked by "Amplifiers". The historical focus on recruiting and training Connectors to the detriment of Iterators and Amplifiers has likely contributed to this relative talent shortage. A caveat: Connectors are also critical for founding new research agendas and organizations, though many self-styled Connectors would likely substantially benefit as founders by improving some Amplifier-shaped soft skills, including leadership, collaboration, networking, and fundraising.

Comment by Ryan Kidd (ryankidd44) on Ryan Kidd's Shortform · 2024-07-15T17:34:19.693Z · LW · GW

In theory, sure! I know @yanni kyriacos recently assessed the need for an ANZ AI safety hub, but I think he concluded there wasn't enough of a need yet?

Comment by Ryan Kidd (ryankidd44) on Ryan Kidd's Shortform · 2024-07-15T17:05:55.210Z · LW · GW

@Elizabeth, Mesa nails it above. I would also add that I am conceptualizing impactful AI safety research as the product of multiple reagents, including talent, ideas, infrastructure, and funding. In my bullet point, I was pointing to an abundance of talent and ideas relative to infrastructure and funding. I'm still mostly working on talent development at MATS, but I'm also helping with infrastructure and funding (e.g., founding LISA, advising Catalyze Impact, regranting via Manifund) and I want to do much more for these limiting reagents.

Comment by Ryan Kidd (ryankidd44) on Ryan Kidd's Shortform · 2024-07-14T02:14:20.134Z · LW · GW

I would amend it to say "sometimes struggles to find meaningful employment despite having the requisite talent to further impactful research directions (which I believe are plentiful)"

Comment by Ryan Kidd (ryankidd44) on Ryan Kidd's Shortform · 2024-07-12T18:20:27.913Z · LW · GW

Why does the AI safety community need help founding projects?

  1. AI safety should scale
    1. Labs need external auditors for the AI control plan to work
    2. We should pursue many research bets in case superalignment/control fails
    3. Talent leaves MATS/ARENA and sometimes struggles to find meaningful work for mundane reasons, not for lack of talent or ideas
    4. Some emerging research agendas don’t have a home
    5. There are diminishing returns at scale for current AI safety teams; sometimes founding new projects is better than joining an existing team
    6. Scaling lab alignment teams are bottlenecked by management capacity, so their talent cut-off is above the level required to do “useful AIS work”
  2. Research organizations (inc. nonprofits) are often more effective than independent researchers
    1. The “block funding model” is more efficient, as researchers can spend more time researching rather than on grant-seeking, management, or other traditional PI duties that can be outsourced
    2. Open source/collective projects often need a central rallying point (e.g., EleutherAI, dev interp at Timaeus, selection theorems and cyborgism agendas seem too delocalized, etc.)
  3. There is (imminently) a market for for-profit AI safety companies and value-aligned people should capture this free energy or let worse alternatives flourish
    1. If labs or API users are made legally liable for their products, they will seek out external red-teaming/auditing consultants to prove they “made a reasonable attempt” to mitigate harms
    2. If government regulations require labs to seek external auditing, there will be a market for many types of companies
    3. “Ethical AI” companies might seek out interpretability or bias/fairness consultants
  4. New AI safety organizations struggle to get funding and co-founders despite having good ideas
    1. AIS researchers are usually not experienced entrepreneurs (e.g., they don’t know how to write grant proposals for EA funders, create pitch decks for VCs, manage/hire new team members, etc.)
    2. There are not many competent start-up founders in the EA/AIS community, and when they do join, they often don’t know where they can help most impactfully
    3. Creating a centralized resource for entrepreneurial education/consulting and co-founder pairing would solve these problems
Comment by Ryan Kidd (ryankidd44) on Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety) · 2024-06-28T21:31:31.584Z · LW · GW

AI that obeys the intention of a human user can be asked to help build unsafe AGI, such as by serving as a coding assistant.

I think a better example of your point is "Corrigible AI can be used by a dictator to enforce their rule."

Comment by Ryan Kidd (ryankidd44) on Talk: AI safety fieldbuilding at MATS · 2024-06-24T20:44:59.570Z · LW · GW

Yep, it was pointed out to me by @LauraVaughan (and I agree) that e.g. working for RAND or a similar government think tank is another high-impact career pathway in the "Nationalized AGI" future.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-30T22:14:38.038Z · LW · GW

Yeah, I basically agree with this nuance. MATS really doesn't want to overanchor on CodeSignal tests or publication count in scholar selection.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-27T04:48:13.150Z · LW · GW

I do think category theory professors or similar would be reasonable advisors for certain types of MIRI research.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-26T22:51:25.996Z · LW · GW

Yes to all this, but also I'll go one level deeper. Even if I had tons more Manifund money to give out (and assuming all the talent needs discussed in the report are saturated with funding), it's not immediately clear to me that "giving 1-3 year stipends to high-calibre young researchers, no questions asked" is the right play if they don't have adequate mentorship, the ability to generate useful feedback loops, researcher support systems, access to frontier models if necessary, etc.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-26T18:35:54.686Z · LW · GW

I want to sidestep critique of "more exploratory AI safety PhDs" for a moment and ask: why doesn't MIRI sponsor high-calibre young researchers with a 1-3 year basic stipend and mentorship? And why did MIRI let Vivek's team go?

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-26T03:00:23.061Z · LW · GW

We changed the title. I don't think keeping the previous title was aiding understanding at this point.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-25T21:32:56.384Z · LW · GW

I like Adam's description of an exploratory AI safety PhD:

You'll also have an unusual degree of autonomy: You’re basically guaranteed funding and a moderately supportive environment for 3-5 years, and if you have a hands-off advisor you can work on pretty much any research topic. This is enough time to try two or more ambitious and risky agendas.

Ex ante funding guarantees, like The Vitalik Buterin PhD Fellowship in AI Existential Safety or Manifund or other funders, mitigate my concerns around overly steering exploratory research. Also, if one is worried about culture/priority drift, there are several AI safety offices in Berkeley, Boston, London, etc. where one could complete their PhD while surrounded by AI safety professionals (which I believe was one of the main benefits of the late Lightcone office).

Comment by Ryan Kidd (ryankidd44) on Ryan Kidd's Shortform · 2024-05-25T18:07:42.155Z · LW · GW

I am a Manifund Regrantor. In addition to general grantmaking, I have requests for proposals in the following areas:

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-25T18:00:56.568Z · LW · GW

I plan to respond regarding MATS' future priorities when I'm able (I can't speak on behalf of MATS alone here and we are currently examining priorities in the lead up to our Winter 2024-25 Program), but in the meantime I've added some requests for proposals to my Manifund Regrantor profile.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-25T16:51:57.245Z · LW · GW

An interesting note: I don't necessarily want to start a debate about the merits of academia, but "fund a smart motivated youngster without a plan for 3 years with little evaluation" sounds a lot like "fund more exploratory AI safety PhDs" to me. If anyone wants to do an AI safety PhD (e.g., with these supervisors) and needs funding, I'm happy to evaluate these with my Manifund Regrantor hat on.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-25T14:50:03.470Z · LW · GW

I can understand if some people are confused by the title, but we do say "the talent needs of safety teams" in the first sentence. Granted, this doesn't explicitly reference "funding opportunities" too, but it does make it clear that it is the (unfulfilled) needs of existing safety teams that we are principally referring to.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-25T00:39:04.151Z · LW · GW

As a concrete proposal, if anyone wants to reboot Refine or similar, I'd be interested to consider that while wearing my Manifund Regrantor hat.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-25T00:21:45.022Z · LW · GW

Yes, there is more than unwieldiness at play here. If we retitled the post "Hiring needs in on-paradigm technical AI safety," (which does seem unwieldy and introduces an unneeded concept, IMO) this seems like it would work at cross purposes to our (now explicit) claim, "there are few opportunities for those pursuing non-prosaic, theoretical AI safety research." I think it benefits no-one to make false or misleading claims about the current job market for non-prosaic, theoretical AI safety research (not that I think you are doing this; I just want our report to be clear). If anyone doesn't like this fact about the world, I encourage them to do something about it! (E.g., found organizations, support mentees, publish concrete agendas, petition funders to change priorities.)

As indicated by MATS' portfolio over research agendas, our revealed preferences largely disagree with point 1 (we definitely want to continue supporting novel ideas too, constraints permitting, but we aren't Refine). Among other objectives, this report aims to show a flaw in the plan for point 2: high-caliber newcomers have few mentorship, job, or funding opportunities to mature as non-prosaic, theoretical technical AI safety researchers and the lead time for impactful Connectors is long. We welcome discussion on how to improve paths-to-impact for the many aspiring Connectors and theoretical AI safety researchers.

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-24T21:57:23.421Z · LW · GW

I think there might be a simple miscommunication here: in our title and report we use "talent needs" to refer to "job and funding opportunities that could use talent." Importantly, we generally make a descriptive, not a normative, claim about the current job and funding opportunities. We could have titled the report "Open and Impactful Job and Funding Opportunities in Technical AI Safety," but this felt unwieldy. Detailing what job and funding opportunities should exist in the technical AI safety field is beyond the scope of this report.

Also, your feedback is definitely appreciated!

Comment by Ryan Kidd (ryankidd44) on Talent Needs of Technical AI Safety Teams · 2024-05-24T20:16:20.923Z · LW · GW

You are, of course, correct in your definitions of "technical" and "prosaic" AI safety. Our interview series did not exclude advocates of theoretical or non-prosaic approaches to AI safety. It was not the intent of this report to ignore talent needs in non-prosaic technical AI safety. We believe that this report summarises our best understanding of the dominant talent needs across all of technical AI safety, at least as expressed by current funders and org leaders.

MATS has supported several theoretical or non-prosaic approaches to improving AI safety, including Vanessa Kosoy’s learning theoretic agenda, Jesse Clifton’s and Caspar Oesterheldt’s cooperative AI research, Vivek Hebbar’s empirical agent foundations research, John Wentworth’s selection theorems agenda, and more. We remain supportive of well-scoped agent foundations research, particularly that with tight empirical feedback loops. If you are an experienced agent foundations researcher who wants to mentor, please contact us; this sub-field seems particularly bottlenecked by high-quality mentorship right now.

I have amended our footnote to say:

Technical AI safety, in turn, here refers to the subset of AI safety research that takes current and future technological paradigms as its chief objects of study, rather than governance, policy, or ethics. Importantly, this does not exclude all theoretical approaches, but does in practice prefer those theoretical approaches which have a strong foundation in experimentation. Due to the dominant focus on prosaic AI safety within the current job and funding market, the main focus of this report, we believe there are few opportunities for those pursuing non-prosaic, theoretical AI safety research.

If you disagree with our assessment, please let us know! We would love to hear about more jobs or funding opportunities for non-prosaic AI safety research.

Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-16T21:07:06.575Z · LW · GW

Do you think there are some cultural things that ought to be examined to figure out why scaling labs are so much more attractive than options that at-least-to-me seem more impactful in expectation?

As a naive guess, I would consider the main reasons to be:

  • People seeking jobs in AI safety often want to take on "heroic responsibility." Work on evals and policy, while essential, might be seen as "passing the buck" onto others, often at scaling labs, who have to "solve the wicked problem of AI alignment/control" (quotes indicate my caricature of a hypothetical person). Anecdotally, I've often heard people in-community disparage AI safety strategies that primarily "buy time" without "substantially increasing the odds AGI is aligned." Programs like MATS emphasizing the importance of AI governance and including AI strategy workshops might help shift this mindset, if it exists.
  • Roles in AI gov/policy, while impactful at reducing AI risk, likely have worse quality-of-life features (e.g., wages, benefits, work culture) than similarly impactful roles in scaling labs. People seeking jobs in AI safety might choose between two high-impact roles based on these salient features without considering how many others making the same decision will affect the talent flow en masse. Programs like MATS might contribute to this problem, but only if the labs keep hiring talent (unlikely given poor returns on scale) and the AI gov/policy orgs don't make attractive offers (unlikely, given that METR and Apollo offer pretty good wages, high status, and work cultures comparable to labs; AISIs might be limited because government roles don't typically pay well, but it seems there are substantial status benefits to working there).
  • AI risk might be particularly appealing as a cause area to people who are dispositionally and experientially suited to technical work and scaling labs might be the most impactful place to do many varieties of technical work. Programs like MATS are definitely not a detriment here, as they mostly attract individuals who were already going to work in technical careers, expose them to governance-adjacent research like evals, and recommend potential careers in AI gov/policy.
Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-16T20:43:43.454Z · LW · GW

Cheers, Akash! Yep, our confirmed mentor list updated in the days after publishing this retrospective. Our website remains the best up-to-date source for our Summer/Winter plans.

Do you think this is the best thing for MATS to be focusing on, relative to governance/policy?

MATS is not bottlenecked on funding for our current Summer plans and hopefully won't be for Winter either. If further high-impact AI gov mentors express interest in the next month or two (and some already seem to be appearing), we will boost this component of our Winter research portfolio. If ERA disappeared tomorrow, we would do our best to support many of their AI gov mentors. In my opinion, MATS is not currently sacrificing opportunities to significantly benefit AI governance and policy; rather, we are rate-limited by factors outside of our control and are taking substantial steps to circumvent these, including:

  • Substantial outreach to potential AI gov mentors;
  • Pursuing institutional partnerships with key AI gov/policy orgs;
  • Offering institutional support and advice to other training programs;
  • Considering alternative program forms less associated with rationality/longtermism;
  • Connecting scholars and alumni with recommended opportunities in AI gov/policy;
  • Regularly recommending scholars and alumni to AI gov/policy org hiring managers.

We appreciate further advice to this end!

Do you think there are some cultural things that ought to be examined to figure out why scaling labs are so much more attractive than options that at-least-to-me seem more impactful in expectation?

I think this is a good question, but it might be misleading in isolation. I would additionally ask:

  • "How many people are the AISIs, METR, and Apollo currently hiring and are they mainly for technical or policy roles? Do we expect this to change?"
  • "Are the available job opportunities for AI gov researchers and junior policy staffers sufficient to justify pursuing this as a primary career pathway if one is already experienced at ML and particularly well-suited (e.g., dispositionally) for empirical research?"
  • "Is there a large demand for AI gov researchers with technical experience in AI safety and familiarity with AI threat models, or will most roles go to experienced policy researchers, including those transitioning from other fields? If the former, where should researchers gain technical experience? If the latter, should we be pushing junior AI gov training programs or retraining bootcamps/workshops for experienced professionals?"
  • "Are existing talent pipelines into AI gov/policy meeting the needs of established research organizations and think tanks (e.g., RAND, GovAI, TFS, IAPS, IFP, etc.)? If not, where can programs like MATS/ERA/etc. best add value?"
  • "Is there a demand for more organizations like CAIP? If so, what experience do the founders require?"
Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-15T20:23:24.818Z · LW · GW

Of the scholars ranked 5/10 and lower on value alignment, 63% worked with a mentor at a scaling lab, compared with 27% of the scholars ranked 6/10 and higher. On average, scaling lab mentors rated their scholars' value alignment at 7.3/10 and rated 78% of their scholars at 6/10 and higher, compared to 8.0/10 and 90% for non-scaling lab mentors. This indicates that our scaling lab mentors were, on average, either more discerning of value alignment than non-scaling lab mentors or had a higher base rate of low-value-alignment scholars (probably both).
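To make the two comparisons above concrete (the first conditions on the rating band, the second on mentor type), here is a minimal sketch of how they could be computed from a per-scholar table; the field names and ratings below are hypothetical, not the actual MATS data.

```python
# Minimal sketch with made-up ratings, assuming a per-scholar table of
# (mentor_is_scaling_lab, value_alignment_rating). Numbers are illustrative only.

scholars = [
    {"scaling_lab_mentor": True, "alignment_rating": 5},
    {"scaling_lab_mentor": True, "alignment_rating": 8},
    {"scaling_lab_mentor": False, "alignment_rating": 9},
    {"scaling_lab_mentor": False, "alignment_rating": 4},
    {"scaling_lab_mentor": True, "alignment_rating": 7},
]

# Comparison 1: condition on the rating band, ask what fraction had scaling lab mentors.
low = [s for s in scholars if s["alignment_rating"] <= 5]
high = [s for s in scholars if s["alignment_rating"] >= 6]
frac_scaling_given_low = sum(s["scaling_lab_mentor"] for s in low) / len(low)
frac_scaling_given_high = sum(s["scaling_lab_mentor"] for s in high) / len(high)

# Comparison 2: condition on mentor type, ask for the mean rating and the share rated >= 6.
scaling = [s for s in scholars if s["scaling_lab_mentor"]]
mean_rating_scaling = sum(s["alignment_rating"] for s in scaling) / len(scaling)
share_high_scaling = sum(s["alignment_rating"] >= 6 for s in scaling) / len(scaling)

print(frac_scaling_given_low, frac_scaling_given_high, mean_rating_scaling, share_high_scaling)
```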

I also want to push back a bit against an implicit framing of the average scaling lab safety researcher we support as being relatively unconcerned about value alignment or the positive impact of their research; this seems manifestly false from my conversations with mentors, their scholars, and the broader community.

Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-15T17:36:15.829Z · LW · GW

It seems plausible to me that at least some MATS scholars are somewhat motivated by a desire to work at scaling labs for money, status, etc. However, the value alignment of scholars towards principally reducing AI risk seems generally very high. In Winter 2023-24, our most empirical-research-dominated cohort, mentors rated the median scholar's value alignment at 8/10 and 85% of scholars were rated 6/10 or above, where 5/10 was “Motivated in part, but would potentially switch focus entirely if it became too personally inconvenient.” To me, this is a very encouraging statistic, but I’m sympathetic to concerns that well-intentioned young researchers who join scaling labs might experience value drift, or find it difficult to promote safety culture internally or sound the alarm if necessary; we are consequently planning a “lab safety culture” workshop in Summer. Notably, only 3.7% of surveyed MATS alumni say they are working on AI capabilities; in one case, an alumnus joined a scaling lab capabilities team and transferred to working on safety projects as soon as they were able. As with all things, maximizing our impact is about striking the right balance between trust and caution, and I’m encouraged by the high apparent value alignment of our alumni and scholars.

We additionally believe:

  1. Advancing researchers to get hired at lab safety teams is generally good;
  2. We would prefer that the people on lab safety teams have more research experience and are more value-aligned, all else equal, and we think MATS improves scholars on these dimensions;
  3. We would prefer lab safety teams to be larger, and it seems likely that MATS helps create a stronger applicant pool for these jobs, resulting in more hires overall;
  4. MATS creates a pipeline for senior researchers on safety teams to hire people they have worked with for up to 6.5 months in-program, observing their competency and value alignment;
  5. Even if MATS alumni defect to work on pure capabilities, we would still prefer them to be more value-aligned than otherwise (though of course this has to be weighed against the boost MATS gave to their research abilities).

Regarding “AI control,” I suspect you might be underestimating the support that this metastrategy has garnered in the technical AI safety community, particularly among prosaic AGI safety thought leaders. I see Paul’s decision to leave ARC in favor of the US AISI as a potential endorsement of the AI control paradigm over intent alignment, rather than necessarily an endorsement of an immediate AI pause (I would update against this if he pushes more for a pause than for evals and regulations). I do not support AI control to the exclusion of other metastrategies (including intent alignment and Pause AI), but I consider it a vital and growing component of my strategy portfolio.

It’s true that many AI safety projects are pivoting towards AI governance. I think the establishment of AISIs is wonderful; I am in contact with MATS alumni Alan Cooney and Max Kauffman at the UK AISI and similarly want to help the US AISI with hiring. I would have been excited for Vivek Hebbar’s, Jeremy Gillen’s, Peter Barnett’s, James Lucassen’s, and Thomas Kwa’s research in empirical agent foundations to continue at MIRI, but I am also excited about the new technical governance focus that MATS alumni Lisa Thiergart and Peter Barnett are exploring. I additionally have supported AI safety org accelerator Catalyze Impact as an advisor and Manifund Regrantor and advised several MATS alumni founding AI safety projects; it's not easy to attract or train good founders!

MATS has been interested in supporting more AI governance research since Winter 2022-23, when we supported Richard Ngo and Daniel Kokotajlo (although both declined to accept scholars past the training program) and offered support to several more AI gov researchers. In Summer 2023, we reached out to seven handpicked governance/strategy mentors (some of whom you recommended, Akash), though only one was interested in mentoring. In Winter 2023-24 we tried again, with little success. In preparation for the upcoming Summer 2024 and Winter 2024-25 Programs, we reached out to 25 AI gov/policy/natsec researchers (whom we asked to also share with their networks) and received expressions of interest from 7 further AI gov researchers. As you can see from our website, MATS is supporting four AI gov mentors in Summer 2024 (six if you count Matija Franklin and Philip Moreira Tomei, who are primarily working on value alignment). We’ve additionally reached out to RAND, IAPS, and others to provide general support. MATS is considering a larger pivot, but available mentors are clearly a limiting constraint. Please contact me if you’re an AI gov researcher and want to mentor!

Part of the reason that AI gov mentors are harder to find is that programs like the RAND TASP, GovAI, IAPS, Horizon, ERA, etc. fellowships seem to be doing a great job collectively of leveraging the available talent. It’s also possible that AI gov researchers are discouraged from mentoring at MATS because of our obvious associations with AI alignment (it’s in the name) and the Berkeley longtermist/rationalist scene (we’re talking on LessWrong and operate in Berkeley). We are currently considering ways to support AI gov researchers who don’t want to affiliate with the alignment, x-risk, longtermist, or rationalist communities.

I’ll additionally note that MATS has historically supported much research that indirectly contributes to AI gov/policy, such as Owain Evans’, Beth Barnes’, and Francis Rhys Ward’s capabilities evals, Evan Hubinger’s alignment evals, Jeffrey Ladish’s capabilities demos, Jesse Clifton’s and Caspar Oesterheldt’s cooperation mechanisms, etc.

Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-12T22:30:10.811Z · LW · GW

Yeah, that amount seems reasonable, if on the low side, for founding a small org. What makes you think $300k is reasonably easy to raise in this current ecosystem? Also, I'll note that larger orgs need significantly more.

Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-12T21:44:46.713Z · LW · GW

I think the high interest in working at scaling labs relative to governance or nonprofit organizations can be explained by:

  1. Most of the scholars in this cohort were working on research agendas for which there are world-leading teams based at scaling labs (e.g., 44% interpretability, 17% oversight/control). Fewer total scholars were working on evals/demos (18%), agent foundations (8%), and formal verification (3%). Therefore, I would not be surprised if many scholars wanted to pursue interpretability or oversight/control at scaling labs.
  2. There seems to be an increasing trend in the AI safety community towards the belief that most useful alignment research will occur at scaling labs (particularly once there are automated research assistants) and external auditors with privileged frontier model access (e.g., METR, Apollo, AISIs). This view seems particularly strongly held by proponents of the "AI control" metastrategy.
  3. Anecdotally, scholars seemed generally in favor of careers at an AISI or evals org, but would prefer to continue pursuing their current research agenda (which might be overdetermined given the large selection pressure they faced to get into MATS to work on that agenda).
  4. Starting new technical AI safety orgs/projects seems quite difficult in the current funding ecosystem. I know of many alumni who have founded or are trying to found projects who express substantial difficulties with securing sufficient funding.

Note that the career fair survey might tell us little about how likely scholars are to start new projects as it was primarily seeking interest in which organizations should attend, not in whether scholars should join orgs vs. found their own.

Comment by Ryan Kidd (ryankidd44) on Key takeaways from our EA and alignment research surveys · 2024-05-11T20:13:35.397Z · LW · GW

Can you estimate dark triad scores from the Big Five survey data?

Comment by Ryan Kidd (ryankidd44) on Key takeaways from our EA and alignment research surveys · 2024-05-11T19:26:13.914Z · LW · GW

You might be interested in this breakdown of gender differences in the research interests of the 719 applicants to the MATS Summer 2024 and Winter 2024-25 Programs who shared their gender. For each research direction, the plot shows the percentage of male applicants who indicated interest minus the percentage of female applicants who did.

The most male-dominated research interest is mech interp, possibly due to the high male representation in software engineering (~80%), physics (~80%), and mathematics (~60%). The most female-dominated research interest is AI governance, possibly due to the high female representation in the humanities (~60%). Interestingly, cooperative AI was a female-dominated research interest, which seems to match the result from your survey where female respondents were less in favor of "controlling" AIs relative to men and more in favor of "coexistence" with AIs.
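In case it helps interpret the plot, here is a minimal sketch of the metric described above, computed per research direction as (share of male applicants interested) minus (share of female applicants interested); the applicant records and topic names below are made up for illustration, not the actual application data.

```python
# Sketch of the plotted metric with hypothetical applicant records.
applicants = [
    {"gender": "male", "interests": {"mech interp", "evals"}},
    {"gender": "female", "interests": {"AI governance", "cooperative AI"}},
    {"gender": "male", "interests": {"mech interp"}},
    {"gender": "female", "interests": {"mech interp", "AI governance"}},
]

topics = {"mech interp", "evals", "AI governance", "cooperative AI"}

males = [a for a in applicants if a["gender"] == "male"]
females = [a for a in applicants if a["gender"] == "female"]

def interest_rate(group, topic):
    # Share of the group that indicated interest in the topic.
    return sum(topic in a["interests"] for a in group) / len(group)

# Positive values = more male-dominated interest; negative = more female-dominated.
difference = {t: interest_rate(males, t) - interest_rate(females, t) for t in topics}
print(sorted(difference.items(), key=lambda kv: kv[1], reverse=True))
```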

Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-11T19:17:36.397Z · LW · GW

This is potentially exciting news! You should definitely visit the LISA office, where many MATS extension program scholars are currently located.

Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-11T19:11:20.688Z · LW · GW

Last program, 44% of scholar research was on interpretability, 18% on evals/demos, 17% on oversight/control, etc. In summer, we intend for 35% of scholar research to be on interpretability, 17% on evals/demos, 27% on oversight/control, etc., based on our available mentor pool and research priorities. Interpretability will still be the largest research track and still has the greatest interest from potential mentors and applicants. The plot below shows the research interests of 1331 MATS applicants and 54 potential mentors who have applied for our Summer 2024 or Winter 2024-25 Programs.

Comment by Ryan Kidd (ryankidd44) on MATS Winter 2023-24 Retrospective · 2024-05-11T05:11:25.699Z · LW · GW

Oh, I think we forgot to ask scholars if they wanted Microsoft at the career fair. Is Microsoft hiring AI safety researchers?

Comment by Ryan Kidd (ryankidd44) on Key takeaways from our EA and alignment research surveys · 2024-05-04T21:42:27.601Z · LW · GW

Thank you so much for conducting this survey! I want to share some information on behalf of MATS:

  • In comparison to the AIS survey gender ratio of 9 M:F, MATS Winter 2023-24 scholars and mentors were 4 M:F and 12 M:F, respectively. Our Winter 2023-24 applicants were 4.6 M:F, whereas our Summer 2024 applicants were 2.6 M:F, closer to the EA survey ratio of 2 M:F. This data seems to indicate a large recent change in gender ratios of people entering the AIS field. Did you find that your AIS survey respondents with more AIS experience were significantly more male than newer entrants to the field?
  • MATS Summer 2024 applicants and interested mentors similarly prioritized research to "understand existing models", such as interpretability and evaluations, over research to "control the AI" or "make the AI solve it", such as scalable oversight and control/red-teaming, over "theory work", such as agent foundations and cooperative AI (note that some cooperative AI work is primarily empirical).
  • The forthcoming summary of our "AI safety talent needs" interview series generally agrees with this survey's findings regarding the importance of "soft skills" and "work ethic" in impactful new AIS contributors. Watch this space!
  • In addition to supporting core established AIS research paradigms, MATS would like to encourage the development of new paradigms. For better or worse, the current AIS funding landscape seems to have a high bar for speculative research into new paradigms. Has AE Studios considered sponsoring significant bounties or impact markets for scoping promising new AIS research directions?
  • Did survey respondents mention how they proposed making AIS more multidisciplinary? Which established research fields are more needed in the AIS community?
  • Did EAs consider AIS exclusively a longtermist cause area, or did they anticipate near-term catastrophic risk from AGI?
  • Thank you for the kind donation to MATS as a result of this survey!
Comment by Ryan Kidd (ryankidd44) on Estimating the Current and Future Number of AI Safety Researchers · 2024-04-24T16:42:39.861Z · LW · GW

I found this article useful. Any plans to update this for 2024?

Comment by Ryan Kidd (ryankidd44) on Shallow review of live agendas in alignment & safety · 2023-12-05T21:43:24.748Z · LW · GW

Wow, high praise for MATS! Thank you so much :) This list is also great for our Summer 2024 Program planning.

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-05T17:11:43.210Z · LW · GW

Another point: Despite our broad call for mentors, only ~2 individuals expressed interest in mentorship whom we did not ultimately decide to support. It's possible our outreach could be improved, and I'm happy to discuss in DMs.

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-03T23:28:26.380Z · LW · GW

I don't see this distribution of research projects as "Goodharting" or "overfocusing" on projects with clear feedback loops. As MATS is principally a program for prosaic AI alignment at the moment, most research conducted within the program should be within this paradigm. We believe projects that frequently "touch reality" often offer the highest expected value in terms of reducing AI catastrophic risk, and principally support non-prosaic, "speculative," and emerging research agendas for their “exploration value," which might aid potential paradigm shifts, as well as to round out our portfolio (i.e., "hedge our bets").

However, even with the focus on prosaic AI alignment research agendas, our Summer 2023 Program supported many emerging or neglected research agendas, including projects in agent foundations, simulator theory, cooperative/multipolar AI (including s-risks), the nascent "activation engineering" approach our program helped pioneer, and the emerging "cyborgism" research agenda.

Additionally, our mentor portfolio is somewhat conditioned on the preferences of our funders. While we largely endorse our funders' priorities, we are seeking additional funding diversification so that we can support further speculative "research bets". If you are aware of large funders willing to support our program, please let me know!

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T20:54:31.910Z · LW · GW

There seems to be a bit of pushback against "postmortem" and our team is ambivalent, so I changed it to "retrospective."

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T19:15:25.286Z · LW · GW

Thank you!

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T19:14:56.510Z · LW · GW

Ok, added!

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T04:30:30.475Z · LW · GW

FYI, the Net Promoter score is 38%.
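For reference, a minimal sketch of the standard Net Promoter Score calculation this figure presumably reflects: the share of promoters (scores 9-10) minus the share of detractors (scores 0-6) on a 0-10 recommendation question. The scores below are made up, not the actual survey responses.

```python
# Standard NPS formula on hypothetical 0-10 "how likely are you to recommend" scores.
scores = [10, 9, 8, 7, 10, 6, 9, 5, 10, 8]

promoters = sum(s >= 9 for s in scores)   # 5 of 10 -> 50%
detractors = sum(s <= 6 for s in scores)  # 2 of 10 -> 20%
nps = 100 * (promoters - detractors) / len(scores)  # 50 - 20 = 30 here
print(nps)
```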

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T04:17:18.067Z · LW · GW

Ok, graph is updated!

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T04:09:23.934Z · LW · GW

Do you think "46% of scholar projects were rated 9/10 or higher" is better? What about "scholar projects were rated 8.1/10 on average" ?

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T03:45:14.937Z · LW · GW

We also asked mentors to rate scholars' "depth of technical ability," "breadth of AI safety knowledge," "research taste," and "value alignment." We omitted these results from the report to prevent bloat, but your comment makes me think we should re-add them.

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T03:33:30.801Z · LW · GW

Yeah, I just realized the graph is wrong; it seems like the 10/10 scores were truncated. We'll upload a new graph shortly.

Comment by Ryan Kidd (ryankidd44) on MATS Summer 2023 Retrospective · 2023-12-02T03:05:08.270Z · LW · GW

Cheers, Vaniver! As indicated in the figure legend for "Mentor ratings of scholar research", mentors were asked, “Taking the above [depth/breadth/taste ratings] into account, how strongly do you support the scholar's research continuing?” and prompted with:

  • 10/10 = Very disappointed if [the research] didn't continue;
  • 5/10 = On the fence, unsure what the right call is;
  • 1/10 = Fine if research doesn't continue.

Mentors rated 18% of scholar research projects as 10/10 and 28% as 9/10.

Comment by Ryan Kidd (ryankidd44) on Apply for MATS Winter 2023-24! · 2023-11-08T01:22:45.383Z · LW · GW

Also, last year's program was 8 weeks long and this year's program is 10 weeks.

Comment by Ryan Kidd (ryankidd44) on Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter · 2023-11-08T01:14:51.374Z · LW · GW

Buck Shlegeris, Ethan Perez, Evan Hubinger, and Owain Evans are mentoring in both programs. The links show their MATS projects, "personal fit" for applicants, and (where applicable) applicant selection questions, designed to mimic the research experience.

Astra seems like an obviously better choice for applicants principally interested in:

  • AI governance: MATS has no AI governance mentors in the Winter 2023-24 Program, whereas Astra has Daniel Kokotajlo, Richard Ngo, and associated staff at ARC Evals and Open Phil;
  • Worldview investigations: Astra has Ajeya Cotra, Tom Davidson, and Lukas Finnvedan, whereas MATS has no Open Phil mentors;
  • ARC Evals: While both programs feature mentors working on evals, only Astra is working with ARC Evals;
  • AI ethics: Astra is working with Rob Long.
Comment by Ryan Kidd (ryankidd44) on Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter · 2023-11-08T00:10:58.900Z · LW · GW

MATS has the following features that might be worth considering:

  1. Empowerment: Emphasis on empowering scholars to develop as future "research leads" (think accelerated PhD-style program rather than a traditional internship), including research strategy workshops, significant opportunities for scholar project ownership (though the extent of this varies between mentors), and a 4-month extension program;
  2. Diversity: Emphasis on a broad portfolio of AI safety research agendas and perspectives with a large, diverse cohort (50-60) and comprehensive seminar program;
  3. Support: Dedicated and experienced scholar support + research coach/manager staff and infrastructure;
  4. Network: Large and supportive alumni network that regularly sparks research collaborations and AI safety start-ups (e.g., Apollo, Leap Labs, Timaeus, Cadenza, CAIP);
  5. Experience: Have run successful research cohorts of 30, 58, and 60 scholars, plus three extension programs with about half as many participants.