Great article, Garrison!
I found that the vitriolic debate between the people worried about extinction and those worried about AI’s existing harms hides the more meaningful divide — between those trying to make AI more profitable and those trying to make it more human.
Bravo.
Matthew, I think you're missing a pretty important consideration here, which is that all of these policy/governance actions did not "just happen" -- a huge amount of effort has been put into them, much of it by the extended AI safety community, without which I think we would be in a very different situation. So I take almost the opposite lesson from what's happening: concerted effort to govern AI might actually succeed -- but we should be doubling down on what we are doing right and learning from what we are doing wrong, not being complacent.
(comment crossposted from EA forum)
Very interesting post! But I'd like to push back. The important things about a pause, as envisaged in the FLI letter, for example, are that (a) it actually happens, and (b) it is not lifted until there is an affirmative demonstration that the risk has been addressed. The FLI pause call was not, in my view, on the basis of any particular capability or risk, but because of the out-of-control race to do ever-larger scaling experiments without any reasonable safety assurances. This pause should still happen, and it should not be lifted until there is a way in place to assure that safety. Many of the things FLI hoped could happen during the pause are happening — there is huge activity in the policy space developing standards, governance, and potentially regulations. It's just that now those efforts are racing the un-paused technology.
In the case of "responsible scaling" (for which I think "controlled scaling" or "safety-first scaling" would be better names), what I think is very important is that there not be a presumption that the pause will be temporary, and lifted "once" the right mitigations are in place. We may well hit a point (and may be there now) where it is pretty clear that we don't know how to mitigate the risks of the next generation of systems we are building (and it may not even be possible), and new, bigger ones should not be built until we can do so. An individual company pausing "until" it believes things are safe is subject to the exact same competitive pressures that are driving scaling now — both against pausing, and in favor of lifting a pause as quickly as possible. If the limitations on scaling come from the outside, via regulation or oversight, then we should ask for something stronger: before proceeding, demonstrate to those outside organizations that scaling is safe. The pause should not be lifted until or unless that is possible. And that's what the FLI pause letter asks for.
If anyone would like to be funded to do actual high-quality research on this topic, I strongly encourage applying to FLI's Humanitarian Impacts of Nuclear War grant program. For decades there have been barely any careful studies because there has been barely any research funding or support. It's quite possible the effects are not as bad as currently predicted, but it's quite possible they are worse — the modern nuclear winter studies find that things are worse than the early ones in the 80s did (though fortunately the arsenals are much smaller now).
It seems quite important to me to have a clear-eyed view of what the results of "small" and "large" nuclear wars are like. An all-out nuclear war between the US and Russia currently would probably involve on the order of 1,900 warheads on each side, which is still a stupendous number. (See the Bulletin of the Atomic Scientists' Nuclear Notebook for some pretty detailed arsenal numbers.) If something starts, I'm deeply pessimistic about maintaining a "limited" nuclear war between these two, much as I'd like to believe otherwise.
I think this depends a lot on the use case. I envision that, for the most part, this would be used on large, known clusters of computation, as an independent check on computation usage and a failsafe. In that case it will be pretty easy to distinguish from other uses like gaming or cryptocurrency mining. If we're in the regime where we're worried about sneaky efforts to assemble lots of GPUs under the radar and do ML with them, then I'd expect there would be pattern-analysis methods that could be used as you suggest, or the system could be set up to feed back more information than just computation usage.
The purpose of the COMPUTE token and blockchain here would be to provide a publicly verifiable ledger of the computation done by the computational cores. It would not be integral to the scheme but would be useful for separating the monitoring and control, as detailed in the post. I hope it is clear that a token as a tradeable asset is not at all important to the core idea.
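To make the "publicly verifiable ledger" part a bit more concrete, here is a minimal sketch of one way such a log could work, assuming each computational core periodically emits a usage report that gets hash-chained into an append-only record. The class and field names below are hypothetical illustrations of the general idea, not the actual scheme from the post:

```python
import hashlib
import json

def record_hash(record: dict, prev_hash: str) -> str:
    # Hash each report together with the previous hash so the log forms a chain.
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

class ComputeLedger:
    """Append-only log of computation-usage reports (illustrative only)."""
    def __init__(self):
        self.entries = []  # list of (record, hash) pairs

    def append(self, core_id: str, flop_count: int, period: str):
        record = {"core": core_id, "flops": flop_count, "period": period}
        prev = self.entries[-1][1] if self.entries else "genesis"
        self.entries.append((record, record_hash(record, prev)))

    def verify(self) -> bool:
        # Anyone holding a copy can recompute the chain and detect tampering.
        prev = "genesis"
        for record, h in self.entries:
            if record_hash(record, prev) != h:
                return False
            prev = h
        return True

ledger = ComputeLedger()
ledger.append("core-7", 3_200_000_000_000, "hour-001")
ledger.append("core-7", 2_900_000_000_000, "hour-002")
print(ledger.verify())  # True unless an entry has been altered
```

A real version would also sign each report with a key held by the monitoring hardware and publish the chain (or its head) somewhere external, which is roughly the role a blockchain could play here; the point is just that the ledger's integrity can be checked independently of whoever controls the cores.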
Very cool, thanks for the pointer!
There's no single metric or score that is going to capture everything. Metaculus points as the central platform metric were devised to — as danohu says — reward both participation and accuracy. Both are quite important. It's easy to get a terrific Brier score by cherry-picking questions. (Pick 100 questions that you think have 1% or 99% probability. You'll get a few wrong, but your mean Brier score will be ~(few)*0.01. Log score is less susceptible to this.) You can also get a fair number of points for just predicting the community prediction — but you won't get that many, because as a question's point value increases (which it does with the number of predictions), more and more of the score is relative rather than absolute.
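For the curious, here's a quick numerical check of that cherry-picking arithmetic (a toy sketch, assuming the standard Brier definition: the squared difference between forecast and outcome, averaged over questions):

```python
import math
import random

random.seed(0)
n_questions = 100
forecast = 0.99  # cherry-picked questions you believe are ~99% likely

# Simulate outcomes for questions whose true probability really is ~0.99.
outcomes = [1 if random.random() < 0.99 else 0 for _ in range(n_questions)]
n_wrong = outcomes.count(0)

mean_brier = sum((forecast - o) ** 2 for o in outcomes) / n_questions
mean_log = sum(math.log(forecast if o else 1 - forecast) for o in outcomes) / n_questions

print(f"wrong: {n_wrong}")
print(f"mean Brier: {mean_brier:.4f}  (~ wrong * 0.01 = {n_wrong * 0.01:.2f})")
print(f"mean log score: {mean_log:.3f}  (each miss costs log(0.01), about -4.6)")
```

Each miss costs about 0.98 Brier points spread over 100 questions, hence the ~(number wrong)*0.01 average; under a log score the same miss costs about 4.6, which is why cherry-picking buys you much less there.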
If you want to know how good a predictor is, points are actually pretty useful IMO, because someone who is near the top of the leaderboard is both accurate and highly experienced. Nonetheless more ways of comparing people to each other would be useful. You can look at someone's track record in detail, but we're also planning to roll out more ways to compare people with each other. None of these will be perfect; there's simply no single number that will tell you everything you might want — why would there be?
I am not an expert on the Outer Space Treaty either, but, also going by anecdotal evidence, I have always heard it described as being of considerable benefit and a remarkable achievement of diplomatic foresight during the Cold War. However, I would welcome any published criticisms of the Outer Space Treaty you wish to provide.
It's important to note that the treaty was originally ratified in 1967 (as in, ~two years before the Moon landing, ~5 years after the Cuban Missile Crisis). If you critique a policy for its effects long after its original passage (as with reference to space mining, or as others have with the effects of Section 230 of the CDA, passed in 1996), your critique is really about the government(s) failing to update and revise the policy, not about the enactment of the original policy. Likewise, it is important to run the counterfactual in which the policy was never enacted. In this circumstance, I'm not sure how you envision a breakdown in US-USSR (and other world powers) negotiations on the demilitarization of space in 1967 would have led to better outcomes.
You're certainly entitled to your (by conventional standards) pretty extreme anti-regulatory view of e.g. the FDA, IRBs, environmental regulations, etc., and to your prior that regulations are in general highly net negative. I don't share those views, but I think we can probably agree that there are regulations (like seatbelts and those governing CFCs, asbestos, leaded gasoline, etc.) that are highly net positive, and others (e.g. criminalization of some drugs, anti-cryptography laws, industry protections against class-action suits, etc.) that are nearly completely negative. What we can do to maximize the former and minimize the latter is a discussion worth having, and a very important one.
In the present case of autonomous weapons, I again think the right reference class is that of things like the bioweapons convention and the space treaty. I think these, also, have been almost unreservedly good: they made the world more stable, avoided potentially catastrophic arms races, and left industries (like biotech, pharma, the space industry, and the arms industry) perfectly healthy and arguably (especially for biotech) much better off than they would have been with a reputation mixed up in creating horrifying weapons. I also think in these cases, as with at least some AWs like antipersonnel WMDs, there is a pretty significant asymmetry, with the negative effects of no regulation having a tail into extremely bad outcomes, while the negative effects of well-structured regulations seem pretty mild at worst. Those are exactly the sorts of regulations/agreements I think we should be pushing on.
Very glad I wrote up the piece as I did, it's been great to share and discuss it here with this community, which I have huge respect for!
"Regulation," in the sense of a government limitation on otherwise "free" industry does indeed make a bit more sense, and you're certainly entitled to the view that many pieces of regulation of the free market are net negative — though again I think it is quite nuanced, as in many cases (DMCA would be one) regulation allows more free markets that might not otherwise exist.
In this case, though, I think the more relevant reference class is "international arms control agreements" like the bioweapons convention, the convention on conventional weapons, the space treaty, landmine treaty, the nuclear nonproliferation treaties, etc. These are not so much regulations as compacts not to develop and use certain weapons. They may also include some prohibitions on industry developing and selling those weapons, but the important part is that the militaries are not making or buying them. (For example, you can't just decide to build nuclear weapons, but I doubt it is illegal to develop or manufacture a laser-blinding weapon or landmine.)
The issue of regulation in the sense of putting limitations on AI developers (say, toward safety issues) is worth debating, but I think it is a relatively distinct one. It is absolutely important to carefully consider whether a given piece of policy or regulation is better than the alternatives (and not, I say again, "better than nothing," because in general the alternative is not "nothing"). And I think it's vital to try to improve existing legislation etc., which has, for example, been much of FLI's focus.
Thanks, Oliver, for this, which likewise very much helps me understand better where some of the ideological disagreements lie. Your statement “but then again, the vast majority of policy passed is strongly net-negative” encapsulates it well. Leaving aside that (even if we could agree on what “positive” and “negative” were) this seems almost impossible to evaluate, it indicates a view that the absence of a policy on something is “no policy”. Whereas in my view, in the vast majority of situations, the absence of some policy is some other policy, whether explicit or implicit. Certainly in the case of AWs, the US (and other militaries) will have some policy about AWs. That’s not at issue. At issue is what the policy will be. And some policies (like prohibiting autonomous WMDs) make much less sense outside the context of an international agreement. So creating that context can create the possibility for a wider range of policies, including better ones.
More generally, when you look at arenas where there is “no policy,” often there actually is one. For example, as I understand it, the modern social media ecosystem does not exist because there is no policy governing it, but because of the DMCA. Had that Act been different (or nonexistent), other policies would be governing things instead, for better or worse. Or, if there were no FDA, there would still be policy: it would govern advertising, and lawsuits, and independent market-based drug-testers, and so on. In a more abstract sense, I view policy as a general term for the basic legal structure governing how our society works, and there isn’t a “default” setting. There are settings in various situations that are “leave this to market forces” or “tightly regulate this with a strict government agency” and all manner of others, but those are generally choices implicitly or explicitly made. The US has made “leave it to market forces” much more of the norm and default (which has had a lot of great results!), but that is, at a higher level, still a policy choice. There are lots of other ways to organize a society, and we’ve tried some of them. When we’re talking about development of weapons and international diplomacy, I don’t think there is a reliably good default — at all.
So I think it’s quite reasonable to ask what the policy proposals are, and to evaluate the particular proposals — as Ben and you are doing. But I don’t think it’s fair or wise to assume that the policies that will be generated by existing actors, in AI weaponry, or AGI, or whatever else, are likely to be particularly good. Policies will exist whether we participate or not, and they will come into being due to the efforts of policymakers guided by various interests. That they are most likely better without the input of groups like FLI, who understand the issues and stakes, encompass a lot of expertise, and are working purely in the interest of humanity and its long-term flourishing, seems improbable even from an outside view. And of course, from the inside view, seeing the actual work we have done, I don’t think it’s the case.
I'd say you are summarizing at least part of the reasoning as I see it, but I'd add that AWs in general seem likely to significantly increase the number of conflicts and the risk of escalation into a full-scale war (very likely to then go nuclear).
I'm not sure what basis there is for thinking that there is some "finite supply" of goodwill toward international agreements. Indeed my intuition would be that more international agreements set precedents and mechanisms for further ones, in more of a virtuous than self-limiting cycle. If I had to choose between an AW treaty and some treaty governing powerful AI, the latter (if it made sense) is clearly more important. But I really doubt there is such a choice, and suspect that one actually helps with the other, though I could be wrong here. Possibly it's more like lawmaking, where a given party has some amount of political capital it is able to spend; I guess that depends on to what degree the parties see it as a zero-sum vs. positive-sum negotiation.
But it seems at least 30% probable to me that the first sufficiently-powerful AGI will be built in the course of trying to make systems that can (a) do science and engineering or (b) create profitable services in the economy. It doesn't seem obvious to me that treaties about warfare are an appropriate tool to deal with this.
I agree: it's quite possible that AGI will develop fairly slowly, or largely within the private sector, with military and government involvement being secondary and not that crucial to the dynamics. In that case the AW governance work would be less relevant and precedent-setting, and international governance work like that at the OECD would be much more relevant. But since we don't know, I think it makes sense to plan for multiple scenarios.
Just as it is not easy to draw a line between newsfeed algorithms that do and don't share fake news, it is not clear to me that it is easy to draw a line between machine-learning weapons systems that are autonomous and those that are not. I'm not certain.
I agree that this is an interesting analogy, and in both cases it's hard. But the fact that something is hard does not necessarily mean it isn't worth doing (in both cases). In the newsfeed case, I expect "outlawing fake news" is indeed unworkable. But in trying to figure out what might work, actually interesting solutions may well arise. Likewise, AW governance will be difficult, but our experience was that once we got some real experts into a room to think about what might actually be done, there were all sorts of good ideas.
As with chemical, biological, and nuclear weapons, it will/would be difficult to forestall determined people from getting their hands on them indefinitely — and probably more difficult than in any of those cases, since there's indeed lots of dual use from drones, and you (probably) won't fear for your life in building one.
Nonetheless I think there is a huge difference between weapons built by amateurs (and even by militaries in secret) versus an open, potentially arms-race-fueled effort by major military powers. No amateur is going to create a drone WMD, and we can hope that nation-state-level anti-AW defenses can keep up with a much less determined program of AW development.
Thanks for your comment. In terms of technical AI safety, I think an interesting research question is the dynamics of multiple adversarial agents — i.e. is there a way to square the need for predictability and control with the need for a system to be unexploitable by an adversary, or are these in hopeless tension? This is relevant for AWs, but also seems potentially quite relevant for any multipolar AI world with strongly competitive dynamics.
Thanks for this great piece! A few thoughts with my Metaculus hat on:
- We can think of a sort of "contest spectrum" where there is always a ranking, and there is a relationship between ranking and win probability. On one end of the spectrum the top N people win, and on the other end people are just paid in proportion to how well they predict. The latter end runs into legal problems, as it's effectively then just a betting market, while the former end runs into problems, as you say, if the number of questions in the contest is too low. Our current plan at Metaculus is to just make sure contests generally have enough questions (>20 is probably enough) to ensure that your chance of winning by taking extreme positions is vanishingly small (see the toy simulation after this list); then we hope to implement other sorts of "bounties" that are not directly in proportion to predictive success (and are hence not betting). The probabilistic contest is a good fix for a contest with too few questions, but I'm not sure I love it in general.
- In designing a scoring or reward system, it's very tricky to find the right level of transparency/simplicity. Some transparency and simplicity is important, as people need to know what the incentives are. But if it's too simple and transparent, then the metric becomes the goal rather than a measure of something else. The current Metaculus point system is complicated, but devised to incentivise certain things (lots of participation, frequent updates, and prediction of one's true estimate of the probability) while being complicated enough that it's kind of inscrutable and hence a pain to game. But there are lots of possible ways to do it, and it would be quite interesting to think of more metrics for assessing and comparing predictors. In addition, there's no reason for Metaculus or anyone else to stick to a single metric (indeed the Metaculus aggregation does not work on the basis of points, and bounties — someday — probably won't either).
- We have a possible idea of "microteaming" here, but have not gotten much feedback on it so far. We definitely need more ways of rewarding information sharing and collaboration.
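Here is the toy simulation mentioned in the first bullet above. It's only a sketch under simplified assumptions (log scoring, two entrants, questions whose true probability is 0.7, higher total score wins), not Metaculus's actual point system:

```python
import math
import random

random.seed(0)

def total_log_score(forecast, outcomes):
    # Sum of log scores over a set of binary outcomes for a fixed forecast.
    return sum(math.log(forecast if o else 1 - forecast) for o in outcomes)

def extremist_win_rate(n_questions, trials=20_000, true_p=0.7):
    # How often does an entrant who always predicts 0.99 beat a calibrated
    # entrant who predicts the true probability, over n_questions questions?
    wins = 0
    for _ in range(trials):
        outcomes = [random.random() < true_p for _ in range(n_questions)]
        if total_log_score(0.99, outcomes) > total_log_score(true_p, outcomes):
            wins += 1
    return wins / trials

for n in (1, 5, 20, 50):
    print(n, extremist_win_rate(n))
```

With a single question the extremist wins about 70% of the time; with 20 questions the win rate is already under 1%, and it keeps shrinking from there.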