Why I'm joining Anthropic

post by evhub · 2023-01-05T01:12:13.822Z · 4 comments

Personal blogpost. Previously: “What I'll be doing at MIRI”

For the last three years (since I left OpenAI), I've been a Research Fellow at MIRI. Starting next week, however, I'm going to be stepping down from that position and joining Anthropic as a safety researcher instead.[1]

To start with, some things that this does not mean:

So why do I think this is a good idea? The most basic reason is simply that I think things are heating up, and it will be valuable for me to be closer to the action. I think current large language models are getting quite scary, but there's a lot of work to be done in understanding exactly how and what to do about it; see e.g. the recent paper I collaborated with Anthropic on. I'll be splitting my time between theoretical and empirical work, with the idea that being closer to current models should improve my ability to do both.

I expect that a lot of my time will be spent on the Conditioning Predictive Models agenda I've been working on for the past ~6 months, which should be published in the next month or so. Until then, this post and this post probably contain the best current public writeups of some of the basic ideas. That being said, I won't be particularly tied down to it and might end up deciding to work on something completely different (as happened the last time I wrote up a big agenda).

Since I'm sure I'm going to be asked about it a bunch now, some of my thoughts on Anthropic as an organization (obviously all thoughts are my own):


  1. Though I will technically be keeping my MIRI affiliation as a Research Associate. ↩︎

4 comments


comment by Edward Kmett (edward-kmett) · 2023-01-05T04:58:26.750Z

Time to update my position on 

comment by DragonGod · 2023-01-05T19:01:24.000Z

What do you think MIRI is currently doing wrong? What should they change about their approach or general strategy?

comment by evhub · 2023-01-05T20:39:50.187Z

I thought I was pretty clear in the post that I don't have anything against MIRI. I guess if I were to provide feedback, the one thing I most wish MIRI would do more is hire additional researchers—I think MIRI currently has too high of a hiring bar.

comment by DragonGod · 2023-01-06T11:01:41.655Z

I did not think you had anything against MIRI. It's just that leaving your position there affords you more latitude to be candid when giving critical feedback.

I would probably have asked this question of any departing MIRI staff member. If there were ever a time to get opinions on what MIRI is doing wrong, it's now.