My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI
post by Andrew_Critch · 2023-05-24T00:02:08.836Z · LW · GW · 39 comments
I have a mix of views on AI x-risk in general — and on OpenAI specifically — that no one seems to be able to remember, due to my views not being easily summarized as those of a particular tribe or social group or cluster. For some of the views I consider most neglected and urgently important at this very moment, I've decided to write them here, all-in-one-place to avoid presumptions that being "for X" means I'm necessarily "against Y" for various X and Y.
Probably these views will be confusing to read, especially if you're implicitly trying to pin down "which side" of some kind of debate or tribal affiliation I land on. As far as I can tell, I don't tend to choose my beliefs in a way that's strongly correlated with or caused by the people I affiliate with. As a result, I apologize in advance if I'm not easily remembered as "for" or "against" any particular protest or movement or trend, even though I in fact have pretty distinct views on most topics in this space... the views just aren't correlated according to the usual social-correlation-matrix.
- Regarding "pausing": I think pausing superintelligence development using collective bargaining agreements between individuals and/or states and/or companies is a good idea, along the lines of FLI's recent open letter, "Pause Giant AI Experiments", which I signed early and advocated for.
- Regarding OpenAI, I feel overall positively about them:
- I think OpenAI has been a net-positive influence for reducing x-risk from AI, mainly by releasing products in a sufficiently helpful-yet-fallible form that society is now able to engage in less-abstract more-concrete public discourse to come to grips with AI and (soon) AI-risk.
- I've found OpenAI's behaviors and effects as an institution to be well-aligned with my interpretations of what they've said publicly. That said, I'm also sympathetic to people other than me who expected more access to models or less access to models than what OpenAI has ended up granting; but my personal assessment, based on my prior expectations from reading their announcements, is "Yep, this is what I thought you told us you would do... thanks!". I've also found OpenAI's various public testimonies, especially to Congress, to move the needle on helping humanity come to grips with AI x-risk in a healthy and coordinated way (relative to what would happen if OpenAI made their testimony and/or products less publicly accessible, and relative to OpenAI not existing at all). I also like their charter, which creates tremendous pressure on them from their staff and the public to behave in particular ways. This leaves me, on-net, a fan of OpenAI.
- Given their recent post on Governance of Superintelligence, I can't tell if their approach to superintelligence is something I do or will agree with, but I expect to find that out over the next year or two, because of the openness of their communications and stance-taking. And, I appreciate the chance for me, and the public, to engage in dialogue with them about it.
- I think the world is vilifying OpenAI too much, and that doing so is probably net-negative for existential safety. Specifically, I think people are currently over-targeting OpenAI with criticism that's easy to formulate because of the broad availability of OpenAI's products, services, and public statements. This makes them more vulnerable to attack than other labs, and I think piling onto them for that is a mistake from an x-safety perspective, in the "shooting the messenger" category. I.e., over-targeting OpenAI with criticism right now is pushing present and future companies toward being less forthright in ways that OpenAI has been forthright, thereby training the world to have less awareness of x-risk and weaker collective orientation on addressing it.
- Regarding Microsoft, I feel quite negatively about their involvement in AI:
- (a) Microsoft should probably be subject to federal-agency-level sanctions (from existing agencies, and probably from a whole new AI regulatory agency) for their reckless deployment of AI models. Specifically, Microsoft should probably be banned from deploying AI models at scale going forward, and from training large AI models at all. I'm not picky about the particular compute thresholds used to define such a ban, as long as the ban would leave Microsoft completely out of the running as an institution engaged in AGI development.
- (b) I would like to see the world "buy back" OpenAI from Microsoft, in a way that would move OpenAI under the influence of more responsible investors, and leave Microsoft with some money in exchange for their earlier support of OpenAI (which I consider positive). I have no reason to think this is happening or will happen, but I hereby advocate for it, conditional on (a) (otherwise I'd worry the money would just pay for more AI research from Microsoft).
- I have some hope that (a) and (b) might be agreeable from non-x-risk perspectives as well, such as "Microsoft is ruining the industry for everyone by releasing scary AI systems" or "Microsoft clearly don't know what they're doing and they're likely to mess up and trigger over-regulation" or something like that. At the very least, it would be good to force a product recall of their most badly-behaved products. You know which ones I'm talking about, but I'm not naming them, to avoid showing up too easily in their search and upsetting them and/or their systems.
- FWIW, I also think Microsoft is more likely than most companies to treat future AI systems in abusive ways that are arguably intrinsically unethical irrespective of x-risk. Perhaps that's another good reason to push for sanctions against them, though it's probably not at present a broadly-publicly-agreeable reason.
- Regarding Facebook/Meta:
- Years ago, I used to find Yann LeCun's views on AI to be thoughtful and reasonable, even if different from mine. I often agreed with his views along the lines that AI-applicable and well-codified laws, not just "alignment" or "utility functions", would be crucial to making AI safe for humanity.
- Over the years roughly between 2015 and 2020 (though I might be off by a year or two), it seemed to me like numerous AI safety advocates were incredibly rude to LeCun, both online and in private communications.
- Now, LeCun's public opinions on AGI and AI x-risk seem to be of a much lower quality, and I feel many of his "opponents" are to blame for lowering the quality of discourse around him.
- As an AI safety advocate myself, I feel regretful for not having spoken up sooner in opposition to how people treated LeCun (even though I don't think I was ever rude to him myself), and I'm worried that more leaders in AI (such as Sam Altman, Demis Hassabis, or Dario Amodei) will be treated badly by the public in ways that turn out to degrade good-faith discourse between lab leaders and the public.
- Regarding AI x-risk in general, I feel my views are not easily clustered with a social group or movement. Here they are:
- Regarding my background: my primary professional ambition for the past ~12 years has been to reduce x-risk: co-founding CFAR, earning to give, working at MIRI, founding BERI, being full-time employee #1 at CHAI, and co-founding SFF, SFP, SFC, and Encultured. I became worried about x-risk in 2010 when Prof. Andrew Ng came to Berkeley and convinced me that AGI would be developed during our lifetimes. That was before people started worrying publicly about AGI, and before he started saying it was like overpopulation on Mars.
- Regarding fairness, bias-protections, and employment: they're important and crucial to x-safety, and should be unified with it rather than treated as distractions. In particular, I feel I care a lot more about unfairness, bias, and unemployment than (I think) most people who worry about x-risk, in large part because preventing the fabric of society from falling apart is crucial to preventing x-risk. I have always felt kinda gross using a "long term" vs "short term" dichotomy of AI concerns, in part because x-risk is a short term concern and should not be conflated with "longtermism", and in part because x-risk needs to be bundled with unfairness and bias and unemployment and other concerns relevant to the "fabric of society", which preserve the capacity of our species to work together as a team on important issues. These beliefs are summarized in an earlier post, Some AI research areas and their relevance to existential safety [LW · GW] (2020). Moreover, I think people who care about x-risk are often making it worse by reinforcing the dichotomy and dismissively using terms like "near termist" or "short termist". We should be bundling and unifying these concerns, not fighting each other for air-time.
- Regarding "pivotal acts": I think that highly strategic consequentialism from persons/institutions with a lot of power is likely to make x-risk worse rather than better, as opposed to trying-to-work-well-as-part-of-society-at-large, in most cases. This is why I have written in opposition to pivotal acts in my post Pivotal outcomes and pivotal processes. [LW · GW]
- My "p(doom)": I think humanity is fairly unlikely (p<10%) to survive the next 50 years unless there is a major international regulatory effort to control how AI is used. I also think the probability of an adequate regulatory effort is small but worth pursuing. Overall I think the probability of humanity surviving the next 50 years is somewhere around 20%, and that AI will probably be a crucial component in how humanity is destroyed. I find this tragic, ridiculous, and a silly thing for us to be doing; however, I don't personally think humanity has the wherewithal to stop itself from destroying itself.
- My "p(AGI)": I also think humanity will develop AGI sometime in the next 10 years and that we probably won't die immediately because of it, but will thereafter gradually lose control of how the global economy works in a way that gets us all killed from some combination of AI-accelerated pollution, resource depletion, and armed conflicts. My maximum-likelihood guess for how humanity goes extinct is here:
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs). [LW · GW]
That said, I do think an immediate extinction event (spanning 1-year-ish) following an AGI development this decade is not an absurd concern, and I continue to respect people who believe it will happen. In particular, I think an out-of-control AI singleton is also plausible and not-silly to worry about. I think our probability of extinction specifically from an out-of-control AI singleton is something like 15%-25%. That's higher than an earlier 10%-15% estimate I had in mind prior to observing Microsoft's recent behavior, but still lower than the ~50% extinction probability I'm expecting from multi-polar interaction-level effects coming some years after we get individually "safe" AGI systems up and running ("safe" in the sense that they obey their creators and users; see again my Multipolar Failure [LW · GW] post above for why that's not enough for humanity to survive as a species).
- Regarding how to approach AI risk, again I feel my views are not easily clustered with a social group or movement. I am:
- Positive on democracy. I feel good about and bullish on democratic processes for engaging people with diverse views on how AI should be used and how much risk is okay to take. That includes public discourse, free speech, and peaceful protests. I feel averse to and bearish on imposing my personal views on that outcome, beyond participating in good faith conversations and dialogue about how humanity should use AI, such as by writing this post.
- Laissez-faire on protests. I have strategic thoughts that tell me that protesting AI at this-very-moment probably constitutes poor timing in terms of the incentives created for AI labs that are making progress toward broader acknowledgement of x-risk as an issue. That said, I also think democracy hinges crucially on free speech, and I think the world will function better if people don't feel shut-down or clammed-up by people-like-me saying "the remainder of May 2023 probably isn't a great time for AI protests." In general, when people have concerns that have not been addressed by an adequately legible public record, peaceful protests are often a good response, so at a meta level I think protests often make sense to happen even when I disagree with their messages or timing (such as now).
- Somewhat-desperately positive on empathy. I would like to see more empathy between people on different sides of the various debates around AI right now. Lately I am highly preoccupied with this issue, in particular because I think weak empathy on both sides of various AI x-risk debates is increasing x-risk and other problems in tandem. I am not sure what to do about this, and would somewhat-desperately like to see more empathy in this whole space, but don't know as-yet what I or anyone can do to help, other than just trying to be more empathetic and encouraging the same from others where possible. I say "somewhat-desperately" because I don't actually feel desperate; I tend not to feel desperate about most things in general. Still, this is the issue that I think is most important-and-neglected in service of AI x-safety right now.
Thanks for reading. I appreciate it :) I just shared a lot of thoughts, which are maybe too much to remember. If I could pick just one idea to stick around from this post, it's this:
"Please try to be nice to people you disagree with, even if you disagree with them about how to approach x-risk, even though x-risk is real and needs to be talked about."