MATS Spring 2024 Extension Retrospective

post by HenningB (HenningBlue), Matthew Wearden (matthew-wearden), Cameron Holmes (cameron-holmes), Ryan Kidd (ryankidd44) · 2025-02-12T22:43:58.193Z · LW · GW · 0 comments

Contents

  Introduction and summary
  Extension program overview
    Office spaces
  Analysis of end-of-extension survey
    Top choices for next career steps
    Key uncertainties
  What went well
    Research output
    Career transitions
    Freedom to execute
    Research management (RM)
    Takeaways: what to continue
  What could be better
    Takeaways: what to improve
    Advice from past to future extension scholars
      Preparation
      Engagement
      Career plans
  Limitations
  Acknowledgements

Introduction and summary

This retrospective focuses on the 4-month MATS extension phase (referred to as "MATS 5.1") that ran from April 1 to July 25, 2024, and presents findings gathered from an end-of-extension survey as well as follow-up interviews and surveys ~5 months after the program.

Main changes from the 4.1 to 5.1 extension phase:

  1. Cohort grew from 26 to 36 scholars split across London, Berkeley and remote participants;
  2. MATS formalized research management for the London cohort and grew the team to 2 FTEs;
  3. The cohort visited Google DeepMind's London offices;
  4. The London team organized Tuesday lightning talks from scholars and MATS staff.

Key takeaways from MATS extension impact:

  1. Research success: 75% of scholars published results in some form (paper, LW/AF post, codebase), of which 57% got accepted to a conference.
  2. Career transitions:
    1. 61% of scholars are working full-time on AI safety 5 months after MATS, and 22% are doing some safety-related work;
    2. 33% currently pursue independent research, 6% are working at technical safety orgs, 6% are currently upskilling in industry (non-AIS), 22% are pursuing a PhD, and 17% have not found employment yet (as of December '24);
    3. One scholar co-founded a new safety-focused organization (Decode Research);
    4. We found no clear successes where someone planning to join a frontier lab actually achieved this, although most scholars aimed for this.
  3. Research management:
    1. Formalized research management for London cohort and grew the team to 2 FTEs;
    2. All scholars received regular research management and largely reported this as very helpful;
    3. Compared to the main program, RM was more helpful during the extension. This was substantially influenced by the decreased mentor engagement and increased independence scholars are expected to have during this phase of the program.
  4. Extension program
    1. Overall, scholars were very happy with the program and the services and environment provided by MATS. Many highlighted that it was one of the best professional experiences they have had.
    2. No scholar experienced a major challenge or inconvenience from the program, but some encountered challenges related to motivation and personal productivity throughout the extension.
    3. Almost all scholars were motivated to pursue the extension due to excitement to continue their research - they generally felt that they had just started to build momentum on their research during the 10 weeks of the main program, but were far from having strong results for publication. The extension provided them with a vehicle to carry this momentum through to producing high-quality research outputs, by giving them the time required to continue their work to completion.
    4. Many highlighted that the extension is often the only place where they produce legible work, with statements like “I think the extension is actually the most valuable part of MATS because it gives you more time to explore independently and also collaborate more broadly with people in other streams because it's a little less structured” (Sara Price).
    5. All scholars engaged full-time with their work; no one dropped out or left the program early.
    6. The scholars’ satisfaction with their output relative to their initial goals was mixed, divided equally between falling below, meeting, and exceeding their expectations.

Key changes MATS is considering for future iterations of the extension program:

  1. Better offboarding resources: Formalize processes and provide additional guidance for post-extension success, especially focusing on career coaching to develop robust career plans and optimize successful career transitions.
  2. More research management (RM): Improve the clarity of services and support that research managers can provide and how to best utilize them. Increase RM capacity especially for career coaching, paper writing support, and project management;
  3. Standardized program structure: Experiment with light structural elements to streamline the extension while preserving independence. Such elements may include formal start and end of program events, regular cohort-wide exchanges for sharing research progress, soft project milestones aligning with conference deadlines, and better support for and integration of remote scholars.

Extension program overview

Towards the end of the 10-week main program, many MATS scholars look to continue their research in structured ways and develop their careers. The extension phase allows scholars to pursue their work over 4 months while receiving continuous support from mentors and other MATS services.

The 5.1 extension phase officially ran from April 1 to July 25, 2024. However, scholars are highly flexible in their approach to the extension and may choose to start later. A central aim was to personalize the journey and provide services to a diverse set of scholars while imposing little to no formal structure. Certain conference deadlines fell within the extension, such as the NeurIPS submission deadline on May 22 and the ICML MechInterp workshop deadline on May 29. These tend to be important milestones for scholars and shape their approach and focus.

Scholars were accepted into the extension phase based on:

  1. Endorsements from their mentors;
  2. Securing independent funding, generally from the LTFF or Open Philanthropy;
  3. High-quality research plans, as evaluated by select MATS alumni and other contractors.

Of the 50 scholars who applied, 72% were accepted into the extension phase. The resulting 36 scholars pursued the extension at the London Initiative for Safe AI (LISA) office (45%), at FAR Labs in Berkeley (11%), and the remainder remotely (45%).

During the extension, scholars focus on advancing their research and formalizing findings into publications, as well as mapping out their future career paths. MATS continues to provide mentor support (usually less frequent), research management, and access to conducive office spaces. The programming at LISA and FAR Labs additionally offers seminars, workshops, and networking events.

This extension phase is designed to help scholars produce legible, high-quality output, further develop relevant technical and non-technical skills to become better researchers, create long-term plans, and build connections to advance their professional goals. It functions as a crucial bridge, allowing scholars to thoughtfully transition from the main program into the next chapter of their careers.

Main changes from the 4.1 to 5.1 extension program:

  1. Cohort grew from 26 to 36 scholars split across London, Berkeley, and remote participants;
  2. Research management:
    1. Building on the novel and experimental RM structure established in the 4.1 extension, the role and service were further formalized and the London team was extended to two people;
    2. RMs were generally more involved in the scholars’ projects, in particular through weekly meetings and a focus on paper support for conference submissions;
    3. RM was offered to all scholars by default and the London team collaborated more closely with mentors.
  3. Lab visit: a formal visit to the London DeepMind facilities was organized where scholars had the opportunity to connect with leading safety researchers, discuss and pitch their projects, and get to know the office space.
  4. Events at LISA:
    1. The London team launched a weekly lightning talk series. Ownership of this event was transferred to the LISA team. Talks were held by extension scholars, MATS staff, and various members of the LISA community;
    2. Additionally, LISA hosted regular talks on Thursdays given by organizations like Apollo Research, BlueDot Impact and external visitors;
    3. Scholars at FAR Labs could participate in events and talks organized there and extend their professional network.

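We collected data for this retrospective through three methods: an end-of-extension (offboarding) survey, and a follow-up survey and follow-up interviews conducted ~5 months after the program.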

In the following paragraphs we share important findings, analyze them, and derive actionable takeaways.

Office spaces

During the extension program, scholars can work from the LISA office in London, FAR Labs in Berkeley, or remotely. Most scholars come to London as it has traditionally been the central hub for the extension. This gives scholars a chance to become part of the largest AI safety hub outside of the Bay Area and to spend more time with their MATS cohort. It is also possible for scholars to obtain a visa to spend up to 6 months conducting independent research in London, compared to the relatively restrictive immigration regime in the US that typically limits non-citizens to visits of up to three months at a time. Additionally, some scholars can benefit from increased contact with mentors who are primarily based in UK-friendly time zones.

Scholars who worked from FAR Labs or LISA used the office spaces frequently and consistently. In London, 58% of scholars worked from LISA regularly, 25% did so less frequently, and 17% spent little or no time there despite signing up for it. Extension scholars were invoiced individually for their use of LISA, which added operational overhead for all parties.

Almost half of the scholars continued their research remotely, primarily citing visa-related and personal reasons. Some were based in the US but preferred working remotely elsewhere over being based in the Berkeley office; they did not elaborate on their reasons beyond personal preference. MATS does not typically offer financial support for external office spaces, but some remote scholars get access to co-working spaces such as FAR Labs and Constellation.

Analysis of end-of-extension survey

Towards the end of the extension, scholars were asked to complete a survey outlining their future career plans. We received 16 responses, primarily from scholars who were based at LISA for the extension.

Top choices for next career steps

Scholars shared their first and second choice plans following the extension:

In the evaluation of career transitions in the following paragraph, we examine data that connects the reported plans with actual outcomes.

Key uncertainties

Certainty about career plans is roughly bimodal, with one mode at relatively low certainty and another at relatively high certainty, and a median of 6.5/10. All scholars shared their main uncertainties in the offboarding survey.

Most scholars (≥60%):

Some scholars (30-50% of responses):

Few scholars (≤30% of responses):

Notable patterns:

What went well

Research output

Around 75% of extension scholars finished and published their research in some form (e.g., conference paper, arXiv paper, blog post, codebase). Of the total cohort, around 57% of scholars published a peer-reviewed paper.

Career transitions

The majority of scholars are working on AI safety roughly five months after the program; only 17% are not.

Over half of scholars continued research: 33% conducted independent research and 22% transitioned to academia (PhDs). Several found work at established orgs (~17%), and one person co-founded a new safety-focused organization (Decode Research).

While most people found positions after the extension, 17% are currently unemployed, a concerning finding that will be discussed below.

For cases where data is available, we can examine how reported career plans translated into actual outcomes:

There seems to be a general trend of difficulty in achieving frontier lab placements despite this being a common first choice. Several participants report being in transition periods or continuing independent research while seeking permanent positions. The available data suggests that achieving exact matches with first or second choice plans was relatively rare, with many participants either adapting their goals or still working toward their initial aims. Transitioning into academia appeared more achievable in some cases.

Freedom to execute

All scholars highlighted that the extension afforded them greater autonomy and dedicated time to pursue their research. While the main program was often characterized as a "melting pot of ideas" where scholars oriented to AI safety research, learned about "safety language and culture," and commenced research projects, the extension enabled scholars to finish conducting experiments and develop their results into papers. Most scholars appreciated this shift from a more exploratory phase to an execution-focused phase, with less programming and fewer events and commitments.

Research management (RM)

Many scholars described RM as incredibly helpful for personal support and structure, highlighting emotional support and active listening, accountability and project management, valuable coaching, improving interactions with mentors, as well as important opportunities for networking and community engagement.

For the research process, scholars valued assistance with setting and prioritizing directions, help for growing as a researcher and professional, and discussing and refining technical ideas.

It is important to note that (1) some scholars were highly autonomous and needed minimal support, and (2) the involvement of research managers depended on the person, project and phase. Still, scholars consistently engaged with and benefited from available RM services. “[My research manager helped me to] mature as a researcher, stupid/crackpot ideas to deeper understanding and solid results.” (Roman)

Takeaways: what to continue

What could be better

  1. Unemployed alumni
    1. As a talent development program, MATS aims to facilitate scholars’ transition into safety-relevant work, which is why the unemployed alumni are a point of concern. Note that the sample size was small (3/18 respondents were unemployed). While there are many confounding factors why people may end up without funding or employment, and MATS can’t guarantee successful transitions after the program - in part due to a gap between available safety positions and number of interested researchers - we seek to minimize these cases and consider this a partial failure of our offboarding processes.
    2. In this context, employment means any paid position whether in a company, academia or other program that provides an income stream and professional engagement. After the MATS extension it is reasonable that scholars need time for potential applications and orientation. But the cases mentioned above were unemployed 5 months after the program ended. We see lack of funding and professional environment as a high risk to an individual’s personal and professional well-being, which negatively affects their expected impact. To address risks of unemployment, MATS aims to improve offboarding processes, focusing on career coaching, robust career plans with backup options, and identifying risk candidates early.
    3. Common failure modes include limited backup plans when primary career paths don't materialize, transition challenges from a supervised research program to industry roles, and simply varying or insufficient levels of technical skills and research experience.
    4. Promising suggestions for process improvements are:
      1. Identifying at-risk scholars early (e.g., visa constraints, technical gaps, limited career plans);
      2. Regular career coaching sessions throughout extension, developing concrete, robust plans and backup options;
      3. Dedicated preparation for application processes and upskilling pathways.
  2. Post-extension success
    • While some alumni are well set up in terms of collaboration opportunities, research directions, career plans, and funding, others lack or find it difficult to get these things, which typically results in feeling stuck and not doing their best work. MATS could prepare graduating scholars better for leaving the program structure and support, including expectation management regarding performance and certain career outcomes.
    • The most promising avenues we should focus on are career coaching and stress testing their plans, collaborator matching, and equipping scholars with resources and tools to advance their research and career. Given the growing MATS alumni network and limited capacity of the support staff, it seems promising to form peer support and exchange groups.
    • Many alumni find it helpful to be in the MATS Slack space because of pointers for job and collaboration opportunities, contemporary safety discussions and intellectual exchange.
  3. Quality, clarity, and utilization of research management (RM) services
    • While most scholars found RM particularly helpful during the extension, for some it wasn’t clear how to best utilize the available services and management. MATS could create a clear overview of services and further develop processes for RM engagement including case studies of typical interactions and benefits.
    • From feedback and interviews it is evident that RMs were sufficiently available and approachable. But junior scholars in particular, who had limited experience working with managers, were uncertain or confused about how best to benefit from the available management capacity. RMs should better calibrate to this and consider being more explicit or directive about providing guidance and management. Ultimately, the scholars decide whether and how to utilize RM services.
    • Generally, research managers could provide more guidance on project management because scholars tend to focus on hands-on tasks. But to ensure consistent progress and legible outputs, many scholars would benefit from aligning project timelines with relevant conference deadlines and setting milestones.
    • Additional feedback for improving RM services included:
      • More proactive engagement to ensure scholars don't “fall through the cracks”;
      • Encourage scholars to utilize RM capacities;
      • Facilitate connection opportunities for remote scholars.
  4. Technical upskilling
    1. During the program, scholars mostly rely on projects for technical upskilling opportunities and on mentors for relevant feedback. However, we’re interested in which parts of the tech stack scholars might have struggled with or what materials they would have benefited from.
    2. In interpretability, people usually highlighted the available tooling and tutorials that exist in the space (e.g., TransformerLens) as well as Neel Nanda's training period for learning interpretability tools. For evals, the Inspect framework from the UK AISI has become more popular and provides decent documentation and tutorials as an entry point. Workshops like engineering best practices (by John Hughes) or Hydra during the main program proved useful later on.
    3. While most scholars didn’t encounter major blockers due to a lack of documentation, common challenges seemed to be more about research methodology and specific ML implementation details rather than basic programming skills.
    4. A notable pattern was that scholars often had to figure things out through trial and error, which most saw as a natural part of research and professional growth but others felt could be better supported through structured resources, technical mentorship or pair programming sessions.
    5. Specific wishes for resources included:
      1. Managing and storing experimental results;
      2. Managing packages and development environments, especially within teams;
      3. RL for LM training.
    6. Additionally, RMs could do a better job of connecting new and past scholars if they work on similar directions. One such opportunity was missed where new scholars worked on extending research done by a previous scholar, but they weren't aware of or connected with them initially; had they been introduced to discuss relevant prior work early on, they could have saved several weeks of research time. Practically, this connection seems feasible for scholars and projects managed by the same person, but seems difficult to do across entire cohorts. We could further develop infrastructure and workflows to track current and past projects in detail, but would also expect mentors to provide pointers to relevant work, whether from MATS alumni or other researchers.
    7. For future iterations, MATS may consider sourcing promising technical resources and reflecting on the role targeted technical upskilling should play in the extension.
  5. Extension structure
    1. While scholars largely appreciated the decreased structure and increased freedom during the extension, this independence comes with trade-offs: some scholars noted feeling more isolated or lacking the collaborative energy that characterized the main program. This suggests that while the independence and longer duration of the extension were valuable, providing some lightweight structure could help scholars make the most of this independence while staying connected to the broader community. MATS aims to balance imposing more structure, optional vs. mandatory activities, and maintaining independence.
    2. Specific suggestions included:
      1. Add soft deadlines or milestones;
      2. Align the program and project scopes to relevant conference deadlines;
      3. Facilitate further structured and technical exchanges within the cohort, especially for remote scholars;
      4. Facilitate more structured interaction opportunities with peers, organizations, and researchers around the co-working spaces;
      5. Facilitate optional regular group discussions, especially for remote scholars.

Takeaways: what to improve

Additionally, the 5.1 alumni mentioned the following things. However, we're uncertain about their cost-effectiveness given the available resources and takeaways from above which we expect to have higher leverage:

Advice from past to future extension scholars

In addition to the feedback shared above, we specifically asked alumni about advice they would give to future extension scholars.

Preparation

Engagement

Career plans

Limitations

Our findings were gathered primarily through surveys and interviews conducted at multiple time points. The initial data collection was an offboarding survey towards the end of the extension (16 respondents). A follow-up survey and interviews (18 respondents) were conducted ~5 months after the extension. While this approach provided valuable insights, we acknowledge several limitations in our data collection. The relatively small sample size and potential response bias should be considered when interpreting the results, as participants who chose to respond may not fully represent the entire alumni population. Nevertheless, the findings strongly aligned with the broader observations and reflections of team members and informal conversations with alumni. The several-month follow-up period proved beneficial in tracking initial career transitions, though we recognize that professional trajectories often take longer to fully materialize and that the data only extends to early December 2024. To address this limitation, we maintain longer-term tracking through comprehensive surveys, such as our 2024 alumni survey that features employment outcomes.

Acknowledgements

This retrospective was produced by the ML Alignment & Theory Scholars Program team. Henning Bartsch was the primary author of this report with valuable feedback and help from the MATS team, including Matthew Wearden, Bryce Woodworth, Cameron Holmes, Henry Sleight, and Ryan Kidd.

Thanks to the MATS 5.1 alumni for your time and feedback! We also thank our funders without whose donations we would be unable to run upcoming programs or retain team members essential to this report: Open Philanthropy, the Survival and Flourishing Fund, Foresight Institute, the Long-Term Future Fund, Craig Falls, and several donors via Manifund.
