Open Call for Research Assistants in Developmental Interpretability

post by Jesse Hoogland (jhoogland), Daniel Murfet (dmurfet), Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel), Stan van Wingerden (stan-van-wingerden) · 2023-08-30T09:02:59.781Z · LW · GW · 11 comments

Contents

  Background
  Position Details
    General info: 
  Who We Are
  Overview of Projects
  What We Expect
  FAQ
    Who is this for? 
    What’s the time commitment?
    What does the compensation mean?
    Do I need to be familiar with SLT and AI alignment? 
    What about compute?
    What are you waiting for? 

We are excited to announce multiple positions for Research Assistants to join our six-month research project assessing the viability of Developmental Interpretability [LW · GW] (DevInterp). 

This is a chance to gain expertise in interpretability, develop your skills as a researcher, build a network of collaborators and mentors, publish at major conferences, and open a path towards future opportunities, including potential permanent roles, recommendations, and further collaborations.

Background

Developmental interpretability [LW · GW] is a research agenda aiming to build tools for detecting, locating, and understanding phase transitions in the learning dynamics of neural networks. It draws on techniques from singular learning theory, mechanistic interpretability, statistical physics, and developmental biology. 
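
To make that a little more concrete, here is a minimal, self-contained sketch (in PyTorch) of the kind of measurement this agenda leans on: estimating a local learning coefficient at successive training checkpoints with SGLD, so that sudden jumps in the estimate can flag candidate phase transitions. This is not the team's actual tooling or the method from the preprint; the function names, the WBIC-style estimator, and all hyperparameters below are illustrative assumptions only.

```python
import copy

import torch
import torch.nn as nn


def sgld_lambda_hat(model, loss_fn, data, targets,
                    n_steps=200, lr=1e-4, elasticity=1.0, beta=None):
    """Crude local learning coefficient estimate at the model's current parameters.

    Runs SGLD in a quadratic well centred on the checkpoint and returns
    n * beta * (average sampled loss - loss at the checkpoint), a WBIC-style
    heuristic. Purely illustrative; not a tuned or validated estimator.
    """
    n = len(data)
    beta = beta if beta is not None else 1.0 / torch.log(torch.tensor(float(n)))
    anchor = [p.detach().clone() for p in model.parameters()]
    init_loss = loss_fn(model(data), targets).item()

    sampler = copy.deepcopy(model)  # sample around the checkpoint, leave it untouched
    losses = []
    for _ in range(n_steps):
        loss = loss_fn(sampler(data), targets)
        sampler.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p, a in zip(sampler.parameters(), anchor):
                noise = torch.randn_like(p) * (2 * lr) ** 0.5
                # SGLD step: tempered loss gradient + localising pull toward the anchor + noise
                p -= lr * (n * beta * p.grad + elasticity * (p - a)) - noise
        losses.append(loss.item())

    avg_loss = sum(losses[n_steps // 2:]) / (n_steps - n_steps // 2)  # drop burn-in
    return float(n * beta * (avg_loss - init_loss))


# Usage sketch: track lambda-hat over training and look for sudden jumps.
if __name__ == "__main__":
    torch.manual_seed(0)
    data, targets = torch.randn(512, 2), torch.randn(512, 1)
    model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for step in range(2001):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(data), targets)
        loss.backward()
        opt.step()
        if step % 500 == 0:  # treat these steps as "checkpoints"
            print(step, sgld_lambda_hat(model, nn.functional.mse_loss, data, targets))
```

In practice one would run something like this over saved checkpoints (with far more care about hyperparameters, batching, and estimator variance) and look for abrupt changes in the estimate over training.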

Position Details

General info: 

Timeline:

How to Apply: Complete the application form by the deadline. Further information on the application process will be provided in the form.

Who We Are

The developmental interpretability research team consists of experts across a number of areas of mathematics, physics, statistics and AI safety. The principal researchers:

We have a range of projects currently underway, each led by one of these principal researchers and involving a number of other PhD and MSc students from the University of Melbourne as well as collaborators from around the world. In an organizational capacity, you would also interact with Alexander Oldenziel and Stan van Wingerden.

You can find us and the broader DevInterp research community on our Discord. Beyond the Developmental Interpretability [LW · GW] research agenda, you can read our first preprint on scalable SLT invariants and check out the lectures from the SLT & Alignment summit.

Overview of Projects

Here's a selection of the projects currently underway, some of which you would be expected to contribute to. These tend to be on the more experimental side: 

Alongside these, we are also working on a number of more theoretical projects. (Though our focus is on the more applied projects, if one of these particularly excites you, you should definitely apply!)

Taken together, these projects complete the scoping phase of the DevInterp research agenda, ideally resulting in publications in venues like ICML and NeurIPS.

What We Expect

You will communicate about your research on the DevInterp Discord, write code and train models, attend research meetings over Zoom, and in general act as a productive contributor to a fast-moving research team of theoreticians and experimentalists working together to define the future of the science of interpretability.

Depending on interest and background, you may also read and discuss papers from ML or mathematics and contribute to writing papers on Overleaf. It's not mandatory, but you would also be invited to join virtual research seminars like the SLT seminar at metauni or the SLT reading group on the DevInterp Discord.

There will be a DevInterp conference [LW · GW] in November 2023 in Oxford, United Kingdom, and it would be great if you could attend. There will hopefully be a second opportunity to meet the team in person between November and the end of the employment period (possibly in Melbourne, Australia).

FAQ

Who is this for? 

We're looking mainly for people who can do engineering work, that is, people with software development and ML skills. It's not necessary to have a background in interpretability or AI safety, although that's a plus. Ideally, you have legible outputs or projects that demonstrate your ability as an experimentalist. 

What’s the time commitment?

We’re looking mainly for people who can commit full-time, but if you’re talented and only available part-time, don’t shy away from applying.

What does the compensation mean?

We've budgeted USD $70k in total to be spread across 1-4 research assistants over the next half year. By default we're expecting to pay RAs USD $17.50/hour. 
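
(For a rough sense of scale: at full time, USD $17.50/hour × roughly 40 hours/week × 26 weeks comes to about USD $18k per RA for the half year, so four full-time RAs would approximately exhaust the USD $70k budget.)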

Do I need to be familiar with SLT and AI alignment? 

No (though it’s obviously a plus).

We’re leaning towards taking on skilled general purpose experimentalists (without any knowledge of SLT) over less experienced programmers who know some SLT. That said, if you are a talented theorist, don’t shy away from applying. 

What about compute?

In the current phase of the research agenda, the projects are not extremely compute-intensive; the necessary cloud GPU access will be provided.

What are you waiting for? 

Apply now.

EDIT: The applications have closed. Take a look at this comment [LW(p) · GW(p)] for summary and feedback.

11 comments

Comments sorted by top scores.

comment by Jesse Hoogland (jhoogland) · 2023-09-16T07:32:03.249Z · LW(p) · GW(p)

Now that the deadline has arrived, I wanted to share some general feedback for the applicants and some general impressions for everyone in the space about the job market:

  • My number one recommendation for everyone is to work on more legible projects and outputs. A super low-hanging fruit for >50% of the applications would be to clean up your GitHub profiles or to create a personal site. Make it really clear to us which projects you're proud of, so we don't have to navigate through a bunch of old and out-of-use repos from classes you took years ago. We don't have much time to spend on every individual application, so you want to make it really easy for us to become interested in you. I realize most people don't even know how to create a GitHub profile page, so check out this guide.
  • We got 70 responses and will send out 10 invitations for interviews. 
  • We rejected a reasonable number of decent candidates outright because they were looking for part-time work. If this was you, don't feel discouraged.
  • There were quite a few really bad applications (...as always): poor punctuation/capitalization, much too informal, not answering the questions, totally unrelated background, etc. Two suggestions: (1) If you're the kind of person who is trying to application-max, make sure you actually fill in the application. A shitty application is actually worse than no application, and I don't know why I have to say that. (2) If English is not your first language, run your answers through ChatGPT. GPT-3.5 is free. (Actually, this advice is for everyone). 
  • Between 5 and 10 people expressed interest in an internship option. We're going to think about this some more. If this includes you, and you didn't mention it in your application, please reach out. 
  • Quite a few people came from a data science / analytics background. Using ML techniques is actually pretty different from researching ML techniques, so for many of these people I'd recommend you work on some kind of project in interpretability or related areas to demonstrate that you're well-suited to this kind of research. 
  • Remember that job applications are always noisy.  We almost certainly made mistakes, so don't feel discouraged!
comment by Thomas Kwa (thomas-kwa) · 2023-08-30T16:28:35.528Z · LW(p) · GW(p)

I would apply if the rate were more like USD $50/hour, considering the high cost of living in Berkeley. As it is, this would be something like a 76% pay cut compared to the minimum ML engineer pay at Google, below typical grad student pay, and would require an uncommon level of frugality.

Replies from: jhoogland, anonce
comment by Jesse Hoogland (jhoogland) · 2023-08-30T21:04:03.712Z · LW(p) · GW(p)

Hey Thomas, I wrote about our reasoning for this in response to Winston [LW(p) · GW(p)]:

All in all, we're expecting most of our hires to come from outside the US where the cost of living is substantially lower. If lower wages are a deal-breaker for anyone but you're still interested in this kind of work, please flag this in the form. The application should be low-effort enough that it's still worth applying.

Replies from: whitehatStoic
comment by MiguelDev (whitehatStoic) · 2023-08-31T00:55:34.401Z · LW(p) · GW(p)

Hello, I agree with Jesse, as the budget they have is really good for hiring capable alignment researchers here in Asia (I'm currently based in Chiang Mai, Thailand) or any other place where the cost of living is extremely low compared to the West. 

Good luck on this project, team DevInterp.

comment by anonce · 2023-08-30T18:24:54.252Z · LW(p) · GW(p)

More than a 76% pay cut, because a lot of the compensation at Google is equity+bonus+benefits; the $133k minimum listed at your link is just base salary.

comment by winstonBosan · 2023-08-30T15:45:20.318Z · LW(p) · GW(p)

Always welcome more optionality in the opportunity space!

Suggestion: potential improvement in narrative signalling by lowering the number of RAs to hire (thus increasing pay): 

  • If I were applying to this, I'd feel confused and slightly underappreciated if I had the right set of ML/Software Engineering skills but were paid barely subsistence level for my full-time work (in NY). 
  • It seems like the funding amount corresponds to the size of the grant. I am rather naive when it comes to how much ML/engineering talent should be paid in pursuit of alignment goals. But it seems like $70k spread across 4 people working full-time (for half a year each) is only slightly above minimum wage in many places. 
  • Comparisons: At $35k a year, it seems it might be considerably lower than the industry equivalent, even when compared to other programs 
    • Ex: Lightcone has a general policy of paying ~70% market pay for equivalent talent. 
      • Recalling from memory of LTFF/similar grants, experienced researchers were granted $70k ~ $200k for their individual research. 
      • A quick glance at the 80k job board for RAs nets me a range of $32,332 ~ $87,000. 
  • Of course... money is tight: The grant constraint is well acknowledged here. But potentially the number of RAs you expect to hire could be adjusted downwards, while potentially increasing the rate of applications from candidates who truly fit the requirements of the research program. 
  • Hiring more might not work as intended: Also, it might come as a surprise, but fewer people to manage can turn out to be a blessing rather than a curse - hiring one's way out of a problem is tempting but should usually be tempered with caution.
  • Things I might have got wrong: 
    • The intended audience and people with the right skill-set will not be in countries where the salary is barely above subsistence level. 
    • The Research Project has decided that people who possess an instinct for "I'd like to work here, but please give me X because I think I am worth that much and can offer at least that much value" are generally a poor fit for the project. 
    • A misunderstanding of the salary distribution and the rate of deterioration of the current financial climate within the scene. 

Overall, I am glad y'all exist! Good luck :)

Replies from: jhoogland
comment by Jesse Hoogland (jhoogland) · 2023-08-30T16:36:21.075Z · LW(p) · GW(p)

Hey Winston, thanks for writing this out. This is something we talked a lot about internally. Here are a few thoughts: 

Comparisons: At $35k a year, it seems it might be considerably lower than the industry equivalent, even when compared to other programs 

I think the more relevant comparison is academia, not industry. In academia, $35k is (unfortunately) well within the normal range for RAs and PhD students. This is especially true outside the US, where wages are easily 2x - 4x lower.

Often academics justify this on the grounds that you're receiving more than just monetary benefits: you're receiving mentorship and training. We think the same will be true for these positions. 

The actual reason is that you have to be somewhat crazy to even want to go into research. We're looking for somewhat crazy.

If I were applying to this, I'd feel confused and slightly underappreciated if I had the right set of ML/Software Engineering skills but were paid barely subsistence level for my full-time work (in NY).

If it helps, we're paying ourselves even less. As much as we'd like to pay the RAs (and ourselves) more, we have to work with what we have.  

Of course... money is tight: The grant constraint is well acknowledged here. But potentially the number of RAs you expect to hire could be adjusted downwards, while potentially increasing the rate of applications from candidates who truly fit the requirements of the research program.

For exceptional talent, we're willing to pay higher wages. 

The important thing is that both funding and open positions are exceptionally scarce. We expect there to be enough strong candidates who are willing to take the pay cut.

All in all, we're expecting most of our hires to come from outside the US where the cost of living is substantially lower. If lower wages are a deal-breaker for anyone but you're still interested in this kind of work, please flag this in the form. The application should be low-effort enough that it's still worth applying. 

Replies from: ariel-kwiatkowski
comment by Ariel Kwiatkowski (ariel-kwiatkowski) · 2023-08-30T23:39:30.268Z · LW(p) · GW(p)

Often academics justify this on the grounds that you're receiving more than just monetary benefits: you're receiving mentorship and training. We think the same will be true for these positions. 

 

I don't buy this. I'm actually going through the process of getting a PhD at ~40k USD per year, and one of the main reasons why I'm sticking with it is that after that, I have a solid credential that's recognized worldwide, backed by a recognizable name (i.e. my university and my supervisor). You can't provide either of those things.

This offer seems to take the worst of both worlds between academia and industry, but if you actually find someone good at this rate, good for you, I suppose.

comment by Chris_Leong · 2023-09-07T06:41:03.959Z · LW(p) · GW(p)

Would any of the involved parties be interested in having a fireside chat for AI Safety Australia and New Zealand about developmental interpretability and this position a few days before the application closes?

If so, please feel free to PM me.

comment by Chris_Leong · 2023-08-30T09:39:04.906Z · LW(p) · GW(p)

Could you clarify if the rate is in AUD or USD?

Replies from: stan-van-wingerden
comment by Stan van Wingerden (stan-van-wingerden) · 2023-08-30T09:45:02.132Z · LW(p) · GW(p)

It's in USD (this should be reflected in the announcement now)