Akash's Shortform

post by Akash (akash-wasil) · 2024-04-18T15:44:25.096Z · LW · GW · 37 comments

Comments sorted by top scores.

comment by Akash (akash-wasil) · 2024-05-18T16:51:27.153Z · LW(p) · GW(p)

My current perspective is that criticism of AGI labs is an under-incentivized public good. I suspect there's a disproportionate amount of value that people could have by evaluating lab plans, publicly criticizing labs when they break commitments or make poor arguments, talking to journalists/policymakers about their concerns, etc.

Some quick thoughts:

  • Soft power– I think people underestimate how strong the "soft power" of labs is, particularly in the Bay Area. 
  • Jobs– A large fraction of people getting involved in AI safety are interested in the potential of working for a lab one day. There are some obvious reasons for this– lots of potential impact from being at the organizations literally building AGI, big salaries, lots of prestige, etc.
    • People (IMO correctly) perceive that if they acquire a reputation for being critical of labs, their plans, or their leadership, they will essentially sacrifice the ability to work at the labs. 
    • So you get an equilibrium where the only people making (strong) criticisms of labs are those who have essentially chosen to forgo their potential of working there.
  • Money– The labs and Open Phil (which has been perceived, IMO correctly, as investing primarily into metastrategies that are aligned with lab interests) have an incredibly large share of the $$$ in the space. When funding became more limited, this became even more true, and I noticed a very tangible shift in the culture & discourse around labs + Open Phil.
  • Status games/reputation– Groups who were more inclined to criticize labs and advocate for public or policymaker outreach were branded as “unilateralist”, “not serious”, and “untrustworthy” in core EA circles. In many cases, there were genuine doubts about these groups, but my impression is that these doubts got amplified/weaponized in cases where the groups were more openly critical of the labs.
  • Subjectivity of "good judgment"– There is a strong culture of people getting jobs/status for having “good judgment”. This is sensible insofar as we want people with good judgment (who wouldn’t?), but the standard is often so subjective that people become quite afraid to voice opinions that go against mainstream views and metastrategies (particularly those endorsed by labs + Open Phil).
  • Anecdote– Personally, I found my ability to evaluate and critique labs + mainstream metastrategies substantially improved when I spent more time around folks in London and DC (who were less closely tied to the labs). In fairness, I suspect that if I had lived in London or DC *first* and then moved to the Bay Area, it’s plausible I would’ve had a similar feeling but in the “reverse direction”.

With all this in mind, I find myself more deeply appreciating folks who have publicly and openly critiqued labs, even in situations where the cultural and economic incentives to do so were quite weak (relative to staying silent or saying generic positive things about labs).

Examples: Habryka, Rob Bensinger, CAIS, MIRI, Conjecture, and FLI. More recently, @Zach Stein-Perlman [LW · GW], and of course Jan Leike and Daniel K. 

Replies from: Zach Stein-Perlman
comment by Zach Stein-Perlman · 2024-05-18T17:00:08.975Z · LW(p) · GW(p)

Sorry for brevity, I'm busy right now.

  1. Noticing good stuff labs do, not just criticizing them, is often helpful. I wish you thought of this work more as "evaluation" than "criticism."
  2. It's often important for evaluation to be quite truth-tracking. Criticism isn't obviously good by default.

Edit:

3. I'm pretty sure OP likes good criticism of the labs; no comment on how OP is perceived. And I think I don't understand your "good judgment" point. Feedback I've gotten on AI Lab Watch from senior AI safety people has been overwhelmingly positive, and of course there's a selection effect in what I hear, but I'm quite sure most of them support such efforts.

4. Conjecture (not exclusively) has done things that frustrated me, including in dimensions like being "'unilateralist,' 'not serious,' and 'untrustworthy.'" I think most criticism of Conjecture-related advocacy is legitimate and not just because people are opposed to criticizing labs.

5. I do agree on "soft power" and some of "jobs." People often don't criticize the labs publicly because they're worried about negative effects on them, their org, or people associated with them.

Replies from: akash-wasil
comment by Akash (akash-wasil) · 2024-05-18T17:30:07.740Z · LW(p) · GW(p)

RE 1 & 2:

Agreed— my main point here is that the marketplace of ideas undervalues criticism.

I think one perspective could be “we should all just aim to do objective truth-seeking”, and as stated I agree with it.

The main issue with that frame, imo, is that it’s very easy to forget that the epistemic environment can be tilted in favor of certain perspectives.

EG I think it can be useful for “objective truth-seeking efforts” to be aware of some of the culture/status games that underincentivize criticism of labs & amplify lab-friendly perspectives.

RE 3:

Good to hear that responses to AI Lab Watch have been positive. My impression is that this is a mix of: (a) AI Lab Watch doesn’t really threaten the interests of labs (especially Anthropic, which is currently winning & currently the favorite lab among senior AIS ppl), (b) the tides have been shifting somewhat and it is genuinely less taboo to criticize labs than a year ago, and (c) EAs respond more positively to criticism that feels more detailed/nuanced (look, I have these 10 categories, let’s rate the labs on each dimension) than criticisms that are more about metastrategy (e.g., challenging the entire RSP frame or advocating for policymaker outreach).

RE 4: I haven’t heard anything about Conjecture that I’ve found particularly concerning. Would be interested in you clarifying (either here or via DM) what you’ve heard. (And to clarify: my original point was less “Conjecture hasn’t done anything wrong” and more “I suspect Conjecture will be more heavily scrutinized and examined and have a disproportionate amount of optimization pressure applied against it, given its clear push for things that would hurt lab interests.”)

comment by Akash (akash-wasil) · 2024-04-18T15:44:25.830Z · LW(p) · GW(p)

I think now is a good time for people at labs to seriously consider quitting & getting involved in government/policy efforts.

I don't think everyone should leave labs (obviously). But I would probably hit a button that does something like "everyone on a lab governance team and many technical researchers spend at least 2 hours thinking/writing about alternative options they have & very seriously consider leaving."

My impression is that lab governance is much less tractable (lab folks have already thought a lot more about AGI) and less promising (competitive pressures are dominating) than government-focused work. 

I think governments still remain unsure about what to do, and there's a lot of potential for folks like Daniel K to have a meaningful role in shaping policy, helping natsec folks understand specific threat models, and raising awareness about the specific kinds of things governments need to do in order to mitigate risks.

There may be specific opportunities at labs that are very high-impact, but I think if someone at a lab is "not really sure if what they're doing is making a big difference", I would probably hit a button that allocates them toward government work or government-focused comms work.

Written on a Slack channel in response to discussions about some folks leaving OpenAI. 

Replies from: alexander-gietelink-oldenziel, davekasten
comment by Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2024-04-18T17:25:17.512Z · LW(p) · GW(p)

I'd be worried about evaporative cooling. It seems that the net result of this would be that labs would be almost completely devoid of people earnest about safety.

I agree with you that government pathways to impact are most plausible and until recently undervalued. I also agree with you that there are weird competitive pressures at labs. 

Replies from: akash-wasil
comment by Akash (akash-wasil) · 2024-04-19T00:18:28.934Z · LW(p) · GW(p)

I do think evaporative cooling is a concern, especially if everyone (or a very significant fraction of people) left. But I think on the margin more people should be leaving to work in govt. 

I also suspect that a lot of systemic incentives will keep a greater-than-optimal proportion of safety-conscious people at labs as opposed to governments (labs pay more, labs are faster and have less bureaucracy, lab people are much more informed about AI, labs are more "cool/fun/fast-paced", lots of govt jobs force you to move locations, etc.)

I also think it depends on the specific lab– EG in light of the recent OpenAI departures, I suspect there's a stronger case for staying at OpenAI right now than for DeepMind or Anthropic. 

comment by davekasten · 2024-04-18T16:42:56.807Z · LW(p) · GW(p)

I largely agree, but think that given government hiring timelines, there's no dishonor in staying at a lab doing moderately risk-reducing work until you get a hiring offer with an actual start date. This problem is often less bad for the special hiring authorities being used for AI stuff, but it's still not ideal.

comment by Akash (akash-wasil) · 2024-05-30T16:01:07.356Z · LW(p) · GW(p)

I'm surprised that some people are so interested in the idea of liability for extreme harms. I understand that, from a legal/philosophical perspective, there are some nice arguments about how companies should have to internalize the externalities of their actions, etc.

But in practice, I'd be fairly surprised if liability approaches were actually able to provide a meaningful incentive shift for frontier AI developers. My impression is that frontier AI developers already have fairly strong incentives to avoid catastrophes (e.g., it would be horrible for Microsoft if its AI model caused $1B in harms, and it would be horrible for Meta and the entire open-source (OS) movement if an OS model was able to cause $1B in damages.)

And my impression is that most forms of liability would not affect this cost-benefit tradeoff by very much. This is especially true if the liability is only implemented post-catastrophe. Extreme forms of liability could require insurance, but this essentially feels like a roundabout and less effective way of implementing some form of licensing (you have to convince us that risks are below an acceptable threshold to proceed.)

I think liability also has the "added" problem of being quite unpopular, especially among Republicans. It is easy to attack liability regulations as anti-innovation, argue that it creates a moat (only big companies can afford to comply), and argue that it's just not how America ends up regulating things (we don't hold Adobe accountable for someone doing something bad with Photoshop.)

To be clear, I don't think "something is politically unpopular" should be a full-stop argument against advocating for it.

But I do think that "liability for AI companies" scores poorly both on "actual usefulness if implemented" and "political popularity/feasibility." I also think the "liability for AI companies" advocacy often ends up getting into abstract philosophy land (to what extent should companies internalize externalities) and ends up avoiding some of the "weirder" points (we expect AI has a considerable chance of posing extreme national security risks, which is why we need to treat AI differently than Photoshop.)

I would rather people just make the direct case that AI poses extreme risks & discuss the direct policy interventions that are warranted.

With this in mind, I'm not an expert in liability and admittedly haven't been following the discussion in great detail (partly because the little I have seen has not convinced me that this is an approach worth investing in). I'd be interested in hearing more from people who have thought about liability– particularly concrete stories for how liability would be expected to meaningfully shift lab incentives. (See also here). 

Stylistic note: I'd prefer replies along the lines of "here is the specific argument for why liability would significantly affect lab incentives and how it would work in concrete cases" rather than replies along the lines of "here is a thing you can read about the general legal/philosophical arguments about how liability is good."

Replies from: habryka4, NathanBarnard, Chris_Leong, LRudL, RedMan
comment by habryka (habryka4) · 2024-05-30T22:53:29.132Z · LW(p) · GW(p)

One reason I feel interested in liability is that it opens up a way to do legal investigations. The legal system has a huge number of privileges that you get to use if you have reasonable suspicion someone has committed a crime or is being negligent. I think it's quite likely that, if there were no direct liability, then even if Microsoft or OpenAI caused some huge catastrophe, we would never get a proper postmortem or analysis of the facts, and would never reach high confidence on the actual root causes.

So while I agree that OpenAI and Microsoft of course already want to avoid being seen as responsible for a large catastrophe, having legal liability makes it much more likely there will be an actual investigation where e.g. the legal system gets to confiscate servers and messages to analyze what happened, which then makes it more likely that if OpenAI and Microsoft are responsible, they will be found out to be responsible.

Replies from: akash-wasil
comment by Akash (akash-wasil) · 2024-06-02T01:07:19.064Z · LW(p) · GW(p)

I found this answer helpful and persuasive– thank you!

comment by NathanBarnard · 2024-06-06T13:39:55.988Z · LW(p) · GW(p)

I think liability-based interventions are substantially more popular with Republicans than other regulatory interventions - they're substantially more hands-off than, for instance, a regulatory agency. They also feature prominently in the Josh Hawley proposal. I've also been told by a Republican staffer that liability approaches are relatively popular amongst Rs. 

An important baseline point is that AI firms (if they're selling to consumers) are probably covered by product liability by default. If they're covered by product liability, then they'll be liable for damages if it can be shown that there was a not-excessively-costly alternative design that they could have implemented that would have avoided that harm. 

If AI firms aren't covered by product liability, they're liable according to standard tort law, which means they're liable if they're negligent under a reasonable person standard. 

Liability law also gives (some, limited) teeth to NIST standards. If a firm can show that it was following NIST safety standards, this gives it a strong argument that it wasn't being negligent. 

I share your scepticism of liability interventions as mechanisms for making important dents in the AI safety problem. Prior to the creation of the EPA, firms were still in principle liable for the harms their pollution caused, but the tort law system is generically a very messy way to get firms to reduce accident risks. It's expensive and time-consuming to go through the court system, courts are reluctant to award punitive damages (which means that externalities aren't internalised even in theory, in expectation for firms), and you need to find a plaintiff with standing to sue firms. 

I think there are still some potentially important use cases for liability for reducing AI risks:

  • Making clear the legal responsibilities of private sector auditors (I'm quite confident that this is a good idea)
  • Individual liability for individuals with safety responsibilities at firms (although this would be politically unpopular on the right I'd expect) 
  • Creating safe harbours from liability if firms fulfil some set of safety obligations (similarly to the California bill) - ideally safety obligations that are updated over time and tied to best practice
  • Requiring insurance to cover liability, and using this to create better safety practices as firms seek to reduce insurance premiums and satisfy insurers' requirements for coverage
  • Tying liability to specific failure modes that we expect to correlate with catastrophic failures, perhaps tied to a punitive damages regime - for instance, holding a firm liable, including for punitive damages, if a model causes harm via, say, goal misgeneralisation, or if the firm lacks industry-standard risk management practices 

To be clear, I'm still sceptical of liability-based solutions and reasonably strongly favour regulatory proposals (where specific liability provisions will still play an important role.)

I'm not a lawyer and have no legal training. 

comment by Chris_Leong · 2024-05-30T23:11:22.332Z · LW(p) · GW(p)

I think we should be talking more about potentially denying a frontier AI license to any company that causes a major disaster (within some future licensing regime), where a company’s record before the law passes will be taken into account.

comment by L Rudolf L (LRudL) · 2024-06-01T11:07:35.240Z · LW(p) · GW(p)

One alternative method to liability for the AI companies is strong liability for companies using AI systems. This does not directly address risks from frontier labs having dangerous AIs in-house, but helps with risks from AI system deployment in the real world. It indirectly affects labs, because they want to sell their AIs.

A lot of this is the default. For example, Air Canada recently lost a court case after claiming a chatbot promising a refund wasn't binding on them. However, there could be related opportunities. Companies using AI systems currently don't have particularly good ways to assess risks from AI deployment, and if models continue getting more capable while reliability continues lagging, they are likely to be willing to pay an increasing amount for ways to get information on concrete risks, guard against them, or derisk them (e.g. through insurance against their deployed AI systems causing harms). I can imagine a service that sells AI-using companies insurance against certain types of deployment risk, which could also double as a consultancy / incentive-provider for lower-risk deployments. I'd be interested to chat if anyone is thinking along similar lines.

comment by RedMan · 2024-05-30T21:34:21.727Z · LW(p) · GW(p)

There are analogies here to pollution. Some countries force industry to post bonds for damage to the local environment. This is a new innovation that may be working.

The reason the Superfund exists in the US is that liability for pollution can be so severe that a company would simply cease to operate, and the mess would not be cleaned up.

In practice, when it comes to taking environmental risks, it's better to burn the train cars of vinyl chloride, creating a catastrophe too expensive for anyone to clean up or even comprehend, than to allow a few gallons to leak, creating an expensive accident that you can actually afford.

comment by Akash (akash-wasil) · 2024-06-04T17:20:39.019Z · LW(p) · GW(p)

My rough ranking of different ways superintelligence could be developed:

  1. Least safe: Corporate Race. Superintelligence is developed in the context of a corporate race between OpenAI, Microsoft, Google, Anthropic, and Facebook.
  2. Safer (but still quite dangerous): USG race with China. Superintelligence is developed in the context of a USG project or "USG + Western allies" project with highly secure weights. The coalition hopefully obtains a lead of 1-3 years that it tries to use to align superintelligence and achieve a decisive strategic advantage. This probably relies heavily on deep learning and means we do not have time to invest in alternative paradigms ("provably safe" systems, human intelligence enhancement, etc.).
  3. Safest (but still not a guarantee of success): International coalition. Superintelligence is developed in the context of an international project with highly secure weights. The coalition still needs to develop superintelligence before rogue projects can, but the coalition hopes to obtain a lead of 10+ years that it can use to align a system that can prevent rogue AGI projects. This could buy us enough time to invest heavily in alternative paradigms. 

My own thought is that we should be advocating for option #3 (international coordination) unless/until there is enough evidence suggesting that it's actually not feasible, and then we should settle for option #2. I'm not yet convinced by people who say we have to settle for option #2 just because EG climate treaties have not gone well or international cooperation is generally difficult. 

But I also think people advocating #3 should be aware that there are some worlds in which international cooperation will not be feasible, and we should be prepared to do #2 if it's quite clear that the US and China are unwilling to cooperate on AGI development. (And again, I don't think we have that evidence yet– I think there's a lot of uncertainty here.)

Replies from: bogdan-ionut-cirstea, Dagon, davekasten, Oscar Delaney
comment by Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-06-04T18:56:35.200Z · LW(p) · GW(p)

I don't think the risk ordering is obvious at all, especially not between #2 and #3, and especially not if you also took into account tractability concerns and risks separate from extinction (e.g. stable totalitarianism, s-risks). Even if you thought coordinating with China might be worth it, I think it should be at least somewhat obvious why the US government [/ and its allies] might be very uncomfortable building a coalition with, say, North Korea or Russia. Even between #1 and #2, the probable increase in risks of centralization might make it not worth it, at least in some worlds, depending on how optimistic one might be about e.g. alignment or offense-defense balance from misuse of models with dangerous capabilities.

I also don't think it's obvious alternative paradigms would necessarily be both safer and tractable enough, even on 10-year timelines, especially if you don't use AI automation (using the current paradigm, probably) to push those forward.

Replies from: akash-wasil
comment by Akash (akash-wasil) · 2024-06-04T19:06:31.780Z · LW(p) · GW(p)

the probable increase in risks of centralization might make it not worth it

Can you say more about why the risk of centralization differs meaningfully between the three worlds?

IMO if you assume that (a) an intelligence explosion occurs at some point, (b) the leading actor uses the intelligence explosion to produce a superintelligence that provides a decisive strategic advantage, and (c) the superintelligence is aligned/controlled...

Then you are very likely (in the absence of coordination) to end up with centralization no matter what. It's just a matter of whether OpenAI/Microsoft (scenario #1), the USG and allies (scenario #2), or a broader international coalition (weighted heavily toward the USG and China) ends up wielding the superintelligence.

(If anything, it seems like the "international coalition" approach seems less likely to lead to centralization than the other two approaches, since you're more likely to get post-AGI coordination.)

especially if you don't use AI automation (using the current paradigm, probably) to push those forward.

In my vision, the national or international project would be investing in "superalignment"-style approaches; they would just (hopefully) have enough time/resources to invest in other approaches as well.

I typically assume we don't get "infinite time"– i.e., even the international coalition is racing against "the clock" (e.g., the amount of time it takes for a rogue actor to develop ASI in a way that can't be prevented, or the amount of time we have until a separate existential catastrophe occurs.) So I think it would be unwise for the international coalition to completely abandon DL/superalignment, even if one of the big hopes is that a safer paradigm would be discovered in time.

Replies from: bogdan-ionut-cirstea
comment by Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-06-04T19:10:06.429Z · LW(p) · GW(p)

IMO if you assume that (a) an intelligence explosion occurs at some point, (b) the leading actor uses the intelligence explosion to produce a superintelligence that provides a decisive strategic advantage, and (c) the superintelligence is aligned/controlled...


I don't think this is obvious; stably multipolar worlds seem at least plausible to me.

Replies from: ryan_greenblatt
comment by ryan_greenblatt · 2024-06-04T20:35:32.985Z · LW(p) · GW(p)

See also here [LW · GW] and here [LW(p) · GW(p)].

Replies from: akash-wasil
comment by Akash (akash-wasil) · 2024-06-04T22:28:12.241Z · LW(p) · GW(p)

@Bogdan, can you spell out a vision for a stably multipolar world with the above assumptions satisfied?

IMO assumption B is doing a lot of the work— you might argue that the IE will not give anyone a DSA, in which case things get more complicated. I do see some plausible stories in which this could happen but they seem pretty unlikely.

@Ryan, thanks for linking to those. Lmk if there are particular points you think are most relevant (meta: I think in general I find discourse more productive when it’s like “hey here’s a claim, also read more here” as opposed to links. Ofc that puts more communication burden on you though, so feel free to just take the links approach.)

Replies from: ryan_greenblatt, bogdan-ionut-cirstea
comment by ryan_greenblatt · 2024-06-05T01:29:27.343Z · LW(p) · GW(p)

(Yeah, I was just literally linking to things people might find relevant to read without making any particular claim. I think this is often slightly helpful, so I do it. Edit: when I do this, I should probably include a disclaimer like "Linking for relevance, not making any specific claim".)

comment by Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-06-05T00:00:34.757Z · LW(p) · GW(p)

Yup, I was thinking about worlds in which there is no obvious DSA, or where the parties involved are risk-averse enough (perhaps e.g. for reasons like in this talk).

Replies from: nathan-helm-burger
comment by Nathan Helm-Burger (nathan-helm-burger) · 2024-06-06T10:24:16.007Z · LW(p) · GW(p)

My expectation is that DSI can (and will) be achieved before ASI. In fact, I expect ASI to be about as useful as a bomb with a minimum effect size of destroying the entire solar system if deployed. In other words, useful only for Mutually Assured Destruction. DSI only requires a nuclear-armed state actor to have an effective global missile defense system. Whichever nuclear-armed state actor gets that, without any other group having it, can effectively demand the surrender and disarmament of all other nations, including confiscating their compute resources. Do you think missile defense is so difficult that only ASI can manage it? I don't. That seems like a technical discussion which would need more details to hash out. I'm pretty sure an explicitly designed tool AI and a large drone and satellite fleet could accomplish that.

comment by Dagon · 2024-06-05T14:42:36.179Z · LW(p) · GW(p)

Competition is fractal. There are multiple hierarchies (countries/departments/agencies/etc, corporations/divisions/teams/etc), with individual humans acting on their own behalf. Often, individuals have influence and goals in multiple hierarchies.

Your 1/2/3 delineation is not the important part. It’s going to be all 3, with chaotic shifts as public perception, funding, and regulation shift around.

comment by davekasten · 2024-06-04T22:09:34.401Z · LW(p) · GW(p)

Agree -- I think people need to be prepared for "try-or-die" scenarios.  

One unfun one I'll toss into the list: "Company A is 12 months from building Cthulhu, and governments truly do not care and there is extremely strong reason to believe that will not change in the next year.  All our policy efforts have failed, our existing technical methods are useless, and the end of the world has come.  Everyone report for duty at Company B, we're going to try to roll the hard six."

Replies from: mesaoptimizer
comment by mesaoptimizer · 2024-06-04T22:31:07.498Z · LW(p) · GW(p)

If Company A is 12 months from building Cthulhu, we fucked up upstream. Also, I don't understand why you'd want to play the AI arms race -- you have better options. They expect an AI arms race. Use other tactics. Get into their OODA loop.

Unsee the frontier lab.

Replies from: davekasten
comment by davekasten · 2024-06-04T22:41:13.855Z · LW(p) · GW(p)

...yes? I think my scenario explicitly assumes that we've fucked up upstream in many, many ways. 

Replies from: mesaoptimizer
comment by mesaoptimizer · 2024-06-04T22:44:27.915Z · LW(p) · GW(p)

Oh, by that I meant something like "yeah I really think it is not a good idea to focus on an AI arms race". See also Slack matters more than any other outcome. [LW · GW]

comment by Oscar (Oscar Delaney) · 2024-06-04T17:34:04.699Z · LW(p) · GW(p)

You are probably already familiar with this, but re option 3, the Multilateral AGI Consortium (MAGIC) proposal is, I assume, along the lines of what you are thinking.

Replies from: davekasten
comment by davekasten · 2024-06-04T21:56:17.281Z · LW(p) · GW(p)

Indeed,  Akash is familiar: https://arxiv.org/abs/2310.20563 :)

(I think it was a later paper he co-authored than the one you cite)

comment by Akash (akash-wasil) · 2024-06-30T16:43:08.386Z · LW(p) · GW(p)

Recommended reading:  A recent piece argues that the US-China crisis hotline doesn't work & generally raises some concerns about US-China crisis communication.

Some quick thoughts:

  • If the claims in the piece are true, there seem to be some (seemingly tractable) ways of substantially improving US-China crisis communication. 
  • The barriers seem more bureaucratic (understanding how the defense world works and getting specific agencies/people to do specific things) than political (I doubt this is something you need Congress to pass new legislation to improve.)
  • In general, I feel like "how do we improve our communication infrastructure during AI-related crises" is an important and underexplored area of AI policy. This isn't just true for US-China communication but also for "lab-government communication", "whistleblower-government communication", and "junior AI staffer-senior national security advisor" communication. 
    • Example: Suppose an eval goes off that suggests that an AI-related emergency might be imminent. How do we make sure this information swiftly gets to relevant people? To what extent do UKAISI and USAISI folks (or lab whistleblowers) have access to senior national security folks who would actually be able to respond in a quick or effective way?
  • I think IAPS' CDDC paper is a useful contribution here. I will soon be releasing a few papers in this broad space, with a focus on interventions that can improve emergency detection + emergency response.
  • One benefit of workshops/conferences/Track 2 dialogues might simply be that you get relevant people to meet each other, share contact information, build trust/positive vibes, and be more likely to reach out in the event of an emergency scenario.
  • Establishing things like the AI Safety and Security Board might also be useful for similar reasons. I think this has gotten a fair amount of criticism for being too industry-focused, and some of that is justified. Nonetheless, I think interventions along the lines of “make it easy for the people who might see the first signs of extreme risk to have super clear ways of advising/contacting government officials” seem great. 

comment by Akash (akash-wasil) · 2024-06-07T23:29:17.922Z · LW(p) · GW(p)

I've started reading the Report on the International Control of Atomic Energy and am finding it very interesting/useful.

I recommend this for AI policy people– especially those interested in international cooperation, US policy, and/or writing for policy audiences.

Replies from: akash-wasil
comment by Akash (akash-wasil) · 2024-06-07T23:33:53.558Z · LW(p) · GW(p)

@Peter Barnett [LW · GW] @Rob Bensinger [LW · GW] @habryka [LW · GW] @Zvi [LW · GW] @davekasten [LW · GW] @Peter Wildeford [LW · GW] you come to mind as people who might be interested. 

See also the Wikipedia page about the report (but IMO reading sections of the actual report is worth it).

comment by Akash (akash-wasil) · 2024-06-14T15:55:25.855Z · LW(p) · GW(p)

Recommended readings for people interested in evals work?

Someone recently asked: "Suppose someone wants to get into evals work. Is there a good reading list to send to them?" I spent ~5 minutes and put this list together. I'd be interested if people have additional suggestions or recommendations:

I would send them:

I would also encourage them to read stuff more on the "macrostrategy" of evals. Like, I suspect a lot of value will come from people who are able to understand the broader theory of change of evals and identify when we're "rowing" in bad directions. Some examples here might be:

Replies from: Jozdien, akash-wasil
comment by Jozdien · 2024-06-16T09:50:03.070Z · LW(p) · GW(p)

I'm obviously biased, but I would recommend my post on macrostrategy of evals: The case for more ambitious language model evals [LW · GW].

comment by Akash (akash-wasil) · 2024-06-14T16:00:43.410Z · LW(p) · GW(p)

@Ryan Kidd [LW · GW] @Lee Sharkey [LW · GW] I suspect you'll have useful recommendations here.

comment by Akash (akash-wasil) · 2024-04-24T20:14:37.346Z · LW(p) · GW(p)

I'm interested in writing out somewhat detailed intelligence explosion scenarios. The goal would be to investigate what kinds of tools the US government would have to detect and intervene in the early stages of an intelligence explosion. 

If you know anyone who has thought about these kinds of questions, whether from the AI community or from the US government perspective, please feel free to reach out via LessWrong.