Posts

Offering AI safety support calls for ML professionals 2024-02-15T23:48:12.797Z
Retrospective on the AI Safety Field Building Hub 2023-02-02T02:06:52.722Z
Interviews with 97 AI Researchers: Quantitative Analysis 2023-02-02T01:01:32.087Z
“AI Risk Discussions” website: Exploring interviews from 97 AI Researchers 2023-02-02T01:00:01.067Z
Predicting researcher interest in AI alignment 2023-02-02T00:58:01.120Z
What AI Safety Materials Do ML Researchers Find Compelling? 2022-12-28T02:03:31.894Z
Announcing the AI Safety Field Building Hub, a new effort to provide AISFB projects, mentorship, and funding 2022-07-28T21:29:52.424Z
Resources I send to AI researchers about AI safety 2022-06-14T02:24:58.897Z
Vael Gates: Risks from Advanced AI (June 2022) 2022-06-14T00:54:25.448Z
Transcripts of interviews with AI researchers 2022-05-09T05:57:15.872Z
Self-studying to develop an inside-view model of AI alignment; co-studiers welcome! 2021-11-30T09:25:05.146Z

Comments

Comment by Vael Gates on Offering AI safety support calls for ML professionals · 2024-02-16T04:34:13.990Z · LW · GW

FAQ

This is cool! Why haven't I heard of this?
Arkose has been in soft-launch for a while, and we've been focused on email outreach more than public comms. But we're increasingly public, and are in communication with other AI safety fieldbuilding organizations! 

How big is the team?

3 people: Zach Thomas and Audra Zook are doing an excellent job in operations, and I'm the founder.

How do you pronounce "Arkose"? Where did the name come from?

I think any pronunciation is fine; it's the name of a rock. We have an SEO goal for arkose.org to surpass the rock's Wikipedia page.

Where does your funding come from?
The Survival and Flourishing Fund.


Are you kind of like the 80,000 Hours 1-1 team?
Yes, in that we also do 1-1 support calls, and that there are many people for whom it'd make sense to do a call with both 80,000 Hours and Arkose! One key difference is that Arkose is aiming to specifically support mid-career people interested in getting more involved in technical AI safety. 

I'm not a mid-career person, but I'd still be interested in a call with you. Should I request a call?
Regretfully no, since we're currently focusing on professors, PhD students, and industry researchers or engineers who have AI / ML experience. This may expand in the future, but we'll probably still be pretty focused on mid-career folks.

Is Arkose's Resource page special in any way?
Generally, our resources are selected to be most helpful to professors, PhD students, and industry professionals, which is a different focus than most other resource lists. We also think arkose.org/papers is pretty cool: it's a list of AI safety papers that you can filter by topic area. It's still in development and we'll be updating it over time (and if you'd like to help, please contact Vael!)

How can I help?
• If you know someone who might be a good fit for a call with Arkose, please pass along arkose.org to them! Or fill out our referral form.
• If you have machine learning expertise and would like to help us review our resources (for free or for pay), please contact vael@arkose.org.


Thanks everyone!

Comment by Vael Gates on My Assessment of the Chinese AI Safety Community · 2023-05-01T21:12:59.334Z · LW · GW

Does Anyuan (安远) have a website? I haven't heard of them and am curious. (I've heard of Concordia Consulting https://concordia-consulting.com/ and Tianxia https://www.tian-xia.com/.)

Comment by Vael Gates on “AI Risk Discussions” website: Exploring interviews from 97 AI Researchers · 2023-03-15T00:07:39.579Z · LW · GW

Small update: Two authors gave me permission to publish their transcripts non-anonymously!

Interview with Michael L. Littman (https://docs.google.com/document/d/1GoSIdQjYh21J1lFAiSREBNpRZjhAR2j1oI3vuTzIgrI/edit?usp=sharing)

Interview with David Duvenaud (https://docs.google.com/document/d/1lulnRCwMBkwD9fUL_QgyHM4mzy0al33L2s7eq_dpEP8/edit?usp=sharing)

Comment by Vael Gates on Transcripts of interviews with AI researchers · 2023-03-15T00:02:17.676Z · LW · GW

Two authors gave me permission to publish their transcripts non-anonymously! Thus:

Interview with Michael L. Littman (https://docs.google.com/document/d/1GoSIdQjYh21J1lFAiSREBNpRZjhAR2j1oI3vuTzIgrI/edit?usp=sharing)

Interview with David Duvenaud (https://docs.google.com/document/d/1lulnRCwMBkwD9fUL_QgyHM4mzy0al33L2s7eq_dpEP8/edit?usp=sharing)

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2023-01-01T23:56:18.869Z · LW · GW

Anonymous comment sent to me, with a request to be posted here:

"The main lede in this post is that pushing the materials that feel most natural for community members can be counterproductive, and that getting people on your side requires considering their goals and tastes. (This is not a community norm in rationalist-land, but the norm really doesn’t comport well elsewhere.)"

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-30T22:22:32.564Z · LW · GW

was this as helpful for you/others as expected?

I think these results, and the rest of the results from the larger survey that this content is a part of, have been interesting and useful to people, including Collin and me. I'm not sure what I expected beforehand in terms of helpfulness, especially since there's a question of "helpful with respect to /what/", and I expect we may have different "what"s here.

are you planning related testing to do next?

Good chance of it! There's some question about funding, and what kind of new design would be worth funding, but we're thinking it through.

I wonder if it would be valuable to first test predictions among communicators

Yeah, I think this is currently mostly done informally -- when Collin and I were choosing materials, we had a big list, and were choosing based on shared intuitions that EAs / ML researchers / fieldbuilders have, in addition to applying constraints like "shortness". Our full original plan was also much longer and included testing more readings -- this was a pilot survey. Relatedly, I don't think these results are very surprising to people (which I think you're alluding to in this comment) -- somewhat surprising, but we have a fair amount of information about researcher preferences already.

I do think that if we were optimizing for "value of new information to the EA community" this survey would have looked different.

I wonder about the value of trying to build an informal panel/mailing list of ML researchers

Instead of contacting a random subset of people who had papers accepted at ML conferences? I think it sort of depends on one's goals here, but it could be good. A few thoughts: I think this may already exist informally; I think this becomes more important as there are more people doing surveys who aren't coordinating with each other; and this doesn't feel like a major need from my perspective / goals, but it might be more of a bottleneck for yours!

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-29T07:03:27.744Z · LW · GW

My guess is that people were aware (my name was all over the survey this was a part of, and people were emailing with me). I think it was also easily inferred that the writers of the survey (Collin and I) supported AI safety work well before the participants reached the part of the survey with my talk. My guess is that my having written this talk didn't change the results much, though I'm not sure which way you expect the confound to go? If we're worried about them being biased towards me because they didn't want to offend me (the person who had not yet paid them), participants generally seemed pretty happy to be critical in the qualitative notes. More to the point, I think the qualitative notes for my talk seemed pretty content-focused and didn't seem unusual compared to the other talks when I skimmed through them, though I'd be interested to know if I'm wrong there.

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-29T00:40:13.666Z · LW · GW

Yeah, we were focusing on shorter essays for this pilot survey (and I think Richard's revised essay came out a little late in the development of this survey? Can't recall) but I'm especially interested in "The alignment problem from a deep learning perspective", since it was created for an ML audience.

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-28T21:48:56.745Z · LW · GW

Whoa, at least one of the respondents let me know that they'd chatted about it at NeurIPS -- did multiple people chat with you about it? (This pilot survey wasn't sent out to that many people, so curious how people were talking about it.)

Edited: talking via DM

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-28T10:44:47.386Z · LW · GW

Thanks! (credit also to Collin :))

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-28T10:40:24.827Z · LW · GW

Agreed that status / perceived in-field expertise seems pretty important here, especially as seen through the qualitative results (though the Gates talk did surprisingly well given that I'm not an AI researcher; the content reflects that). We probably won't have the [energy / time / money] + [access to researchers] to test something like this, but I think we can hold "status is important" as something pretty true given these results, Hobbhahn's (https://forum.effectivealtruism.org/posts/kFufCHAmu7cwigH4B/lessons-learned-from-talking-to-greater-than-100-academics), and a ton of anecdotal evidence from a number of different sources.

(I also think the Sam Bowman article is a great article to recommend, and in fact recommend that first a lot of the time.)

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-28T10:32:03.257Z · LW · GW

(Just a comment on some of the above, not all)

Agreed and thanks for pointing out here that each of these resources has different content, not just presentation, in addition to being aimed at different audiences. This seems important and not highlighted in the post.

We then get into what we want to do about that. One of the major tricky things is the ongoing debate over "how much researchers need to be thinking in the frame of xrisk to make useful progress in alignment", which seems like a pretty important crux. Another is "what do ML researchers think after consuming different kinds of content", where Thomas has some hypotheses in the paragraph "I'd guess..." but we don't actually have data on this and I can think of alternate hypotheses, which also seems quite cruxy.

Comment by Vael Gates on What AI Safety Materials Do ML Researchers Find Compelling? · 2022-12-28T10:18:35.594Z · LW · GW

These results were actually embedded in a larger survey and were grouped in sections, so I don't think it came off as particularly long within the survey. (I also assume most people watched the video at faster than 1x.) People also seemed to like this talk, so I'd guess they watched it as or more thoroughly than they did everything else. Regretfully, we don't have analytics. (I also forgot to add that we told people to skip the Q&A, so we had them watch the first 48m.)

Comment by Vael Gates on #4.3: Cryonics-friendly insurance agents · 2022-12-04T08:30:17.883Z · LW · GW

I told him I only wanted the bare-bones of interactions, and he's been much better to work with!

Comment by Vael Gates on Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · 2022-11-26T02:48:03.407Z · LW · GW

I see there's an associated talk now! https://www.youtube.com/watch?v=EIhE84kH2QI

Comment by Vael Gates on AI Safety field-building projects I'd like to see · 2022-09-12T03:02:13.660Z · LW · GW

Re: "Targeted Outreach to Experienced Researchers"

Please apply to work with the aforementioned AISFB Hub! I am actively trying to hire people who I think would be good fits for this type of role, and can offer mentorship / funding / access to, and models of, the space. Note that you'll need AI safety knowledge (for example, I want you to have read, or have a plan for reading, all of the main readings in the AGISF Technical Curriculum) and high generalist competence, as two of the most important qualifications.

I think most people will not be a good fit for this role (the status hierarchies and culture among experienced researchers are more complicated than they appear at first glance), and like Akash I caution against unilateral action here. I'm psyched about meeting people who are good fits, however, and urge you to apply to work with me if you think that could be you!

Comment by Vael Gates on #4.3: Cryonics-friendly insurance agents · 2022-09-05T03:01:11.511Z · LW · GW

Some Rudi communication style anecdotes:

  • Rudi: "Aren't you a beautiful young woman!" almost immediately when we saw each other on video call for the first time (I identify as nonbinary) (<-- this anecdote is a from a few years ago and from memory, though, so he might have just said something quite similar)
  • Rudi, in a Google Calendar invite note, as a closing: "Let's talk, dear Vael...as I recall, we liked each other a lot.:)"
    • Me in an email back:

      "(Ah, and just one other quick note: in the Google Calendar invite, you've included the line "Let's talk, dear Vael...as I recall, we liked each other a lot.:)". This feels like flirting to me, and I'm not sure but imagine you wouldn't include this in emails to men, so I just wanted to state a preference that I'd enjoy if sentences like this weren't included in the future! Many thanks, and looking forward to talking to you in June!)"
       
    • Rudi back: 

      "Hi Vael, 

      Of course, and thank you for nicely stating your preference, and just for the record I would include a phrase like this with men, women, or non-gendered individuals. (Also for the record, maybe I should re-think this.)  And I still appreciate your observation, and will endeavor to be more circumspect in the future. :)  

      Warm and decidedly professional regards, 

      Rudi :)"
       

I've similarly heard he doesn't do this with men. He also answered my questions when emailing back and forth. But yeah, be ready!
 

Comment by Vael Gates on Announcing the AI Safety Field Building Hub, a new effort to provide AISFB projects, mentorship, and funding · 2022-08-25T01:07:41.602Z · LW · GW

Seems like it's great to do one-on-ones with people who could be interested and skilled from all sorts of fields, and top researchers in similar fields could be a good group to prioritize! Alas, I feel like the current bottleneck is people who are good fits to do these one-on-ones (I'm looking to hire people, but not currently doing them myself); there are many people I'd ideally want to reach.

Comment by Vael Gates on Resources I send to AI researchers about AI safety · 2022-06-21T02:26:30.013Z · LW · GW

Thanks for doing that Kat!

Comment by Vael Gates on Resources I send to AI researchers about AI safety · 2022-06-14T23:50:37.414Z · LW · GW

Sure! This isn't novel content; the vast majority of it is drawn from existing lists, so it's not even particularly mine. I think just make sure the things within are referenced correctly, and you should be good to go!

Comment by Vael Gates on Resources I send to AI researchers about AI safety · 2022-06-14T23:45:06.606Z · LW · GW

With respect to the fact that I don't immediately point people at LessWrong or the Alignment Forum (I actually only very rarely include the "Rationalist" section in the email -- not unless I've decided to bring it up in person and they've reacted positively), there are different philosophies on AI alignment field-building. One of the active disagreements right now is how much we want new people coming into AI alignment to be the type of person who enjoys LessWrong, or whether it's good to be targeting a broader audience.

I'm personally currently of the opinion that we should be targeting a broader audience, where there's a place for people who want to work in academia or industry separate from the main Rationalist sphere, and the people who are drawn towards the Rationalists will find their way there either on their own (I find people tend to do this pretty easily when they start Googling), or with my nudging if they seem to be that kind of person. 

I don't think this is much "shying away from reality" -- it feels more like engaging with it, trying to figure out if and how we want AI alignment research to grow, and how to best make that happen given the different types of people with different motivations involved.

Comment by Vael Gates on Resources I send to AI researchers about AI safety · 2022-06-14T23:32:37.443Z · LW · GW

A great point, thanks! I've just edited the "There's also a growing community working on AI alignment" section to include MIRI, and also edited some of the academics' names and links.

I don't think it makes sense for me to list Eliezer's name in the part of that section where I'm listing names, since I'm only listing some subset of academics who (vaguely gesturing at a cluster) are sort of actively publishing in academia, mostly tenure track and actively recruiting students, and interested in academic field-building. I'm not currently listing names of researchers in industry or non-profits (e.g. I don't list Paul Christiano, or Chris Olah), though that might be a thing to do. 

Note that I didn't choose this list of names very carefully, so I'm happy to take suggestions! This doc came about because I had an email draft that I was haphazardly adding things to as I talked to researchers and needed to promptly send them resources, getting gradually refined when I spotted issues. I thus consider it a work-in-progress and appreciate suggestions. 

Comment by Vael Gates on Transcripts of interviews with AI researchers · 2022-05-18T03:15:38.566Z · LW · GW

I've been finding "A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]" to have a lot of content that would likely be interesting to the audience reading these transcripts. For example, the incentives section rhymes with the type of things interviewees would sometimes say. I think the post generally captures and analyzes a lot of the flavor / contextualizes what it was like to talk to researchers.

Comment by Vael Gates on Transcripts of interviews with AI researchers · 2022-05-09T22:18:00.041Z · LW · GW

It was formatted based on typical academic "I am conducting a survey on X, $Y for Z time", and notably didn't mention AI safety. The intro was basically this:

My name is Vael Gates, and I’m a postdoctoral fellow at Stanford studying how productive and active AI researchers (based on submissions to major conferences) perceive AI and the future of the field. For example:

- What do you think are the largest benefits and risks of AI?

- If you could change your colleagues’ perception of AI, what attitudes/beliefs would you want them to have?

My response rate was generally very low, which biased the sample towards... friendly, sociable people who wanted to talk about their work and/or help out and/or wanted money, and had time. I think it was usually <5% response rate for the NeurIPS / ICML sample off the top of my head. I didn't A/B test the email. I also offered more money for this study than the main academic study, and expect I wouldn't have been able to talk to the individually-selected researchers without the money component.

Comment by Vael Gates on Giving calibrated time estimates can have social costs · 2022-04-03T22:59:27.739Z · LW · GW

Thanks Alex :). Comment just on this section:
 

"The annoying thing here is that I believe the only difference between me and another task doer in this situation is that I have more accurate beliefs, or I have a higher belief threshold for making claims (or something similar, like that I only use statement for communicating beliefs and not for socially enforcing a commitment to myself)."

As someone who was in this situation with Alex recently (wanting a commitment from him, in order to make plans with other people that relied on this initial commitment), I think there's maybe an additional thing in my psychology, and not in Alex's, which is about self-forcing.

I'm careful about situations where I'm making a very strong commitment to something, because it means that if I've planned the timing wrong, I'll get the thing done but with high self-sacrifice. I'm committing to skipping sleep, or fun hangouts I otherwise had planned, or relaxing activities, to get the thing done by the date I said it'd be done. I'm capable of and willing to force myself to do this, if the other person wants a commitment from me enough. It's not 100% certain I'll succeed -- e.g. I might be hit by a car -- but I'm certain enough of success that people would expect me to succeed barring an emergency, which is mostly what I expect from other people when they're for-real-for-real committing to something.

So when I'm asking someone to for-real-for-real commit to me, I'm asking "are you ready to do self-sacrifice if you don't get it done by this date, barring an emergency? It's fine if it's a later date; I just want the certainty of being able to build on this plan". And I do think there are a bunch of different kinds of commitments in day-to-day life, where I make looser commitments all the time, but I do have a category for "for-real-for-real commitment", and will track other people's failures to meet my expectations when I believe they've made a "for-real-for-real" commitment to me. I might track this more carefully than other people do, though -- feels like it kinda rhymes with autism and high conscientiousness, maybe also high-performance environments, but idk?

Anyway, this all might be the same thing as "I only use statement for communicating beliefs and not for socially enforcing a commitment to myself". I'm not sure I'd use exactly the "socially enforcing a commitment to myself" phrase; in my mind, it feels like a social commitment and also feels like "I'm now putting my personal integrity on the line, since I'm making a for-real-for-real commitment, so I'd better do what I said I would, even if no one's looking".

Amusingly, I think Alex and I are both using self-integrity here, but one hypothesis is that maybe I'm very willing and able to force myself to do things, and this makes up the difference in which concepts we're referring to by "(strong) commitment"?

Always fun getting unduly detailed with very specific pieces of models :P.

Comment by Vael Gates on Discussion with Eliezer Yudkowsky on AGI interventions · 2021-11-14T08:47:56.919Z · LW · GW

"Alpha Zero scales with more computing power, I think AlphaFold 2 scales with more computing power, Mu Zero scales with more computing power. Precisely because GPT-3 doesn't scale, I'd expect an AGI to look more like Mu Zero and particularly with respect to the fact that it has some way of scaling."


I thought GPT-3 was the canonical example of a model type that people are worried will scale? (i.e., it's discussed in https://www.gwern.net/Scaling-hypothesis?)

Comment by Vael Gates on MichaelA's Shortform · 2021-09-23T01:00:33.598Z · LW · GW

Recently I was also trying to figure out what resources to send to an economist, and couldn't find a list that existed either! The list I came up with is subsumed by yours, except:
- Questions within Some AI Governance Research Ideas
- "Further Research" section within an OpenPhil 2021 report: https://www.openphilanthropy.org/could-advanced-ai-drive-explosive-economic-growth
- The AI Objectives Institute just launched, and they may have questions in the future