Thoughts on "The Offense-Defense Balance Rarely Changes" 2024-02-12T03:26:50.662Z
Polio Lab Leak Caught with Wastewater Sampling 2023-04-07T01:06:35.245Z
Tracking Compute Stocks and Flows: Case Studies? 2022-10-05T17:57:55.408Z
Law-Following AI 4: Don't Rely on Vicarious Liability 2022-08-02T23:26:00.426Z
Law-Following AI 3: Lawless AI Agents Undermine Stabilizing Agreements 2022-04-27T17:30:25.915Z
Law-Following AI 2: Intent Alignment + Superintelligence → Lawless AI (By Default) 2022-04-27T17:27:24.210Z
Law-Following AI 1: Sequence Introduction and Structure 2022-04-27T17:26:57.004Z
FHI Report: How Will National Security Considerations Affect Antitrust Decisions in AI? An Examination of Historical Precedents 2020-07-28T18:34:23.537Z
AI Benefits Post 5: Outstanding Questions on Governing Benefits 2020-07-21T16:46:10.725Z
Parallels Between AI Safety by Debate and Evidence Law 2020-07-20T22:52:09.185Z
AI Benefits Post 4: Outstanding Questions on Selecting Benefits 2020-07-14T17:26:12.537Z
Antitrust-Compliant AI Industry Self-Regulation 2020-07-07T20:53:37.086Z
AI Benefits Post 3: Direct and Indirect Approaches to AI Benefits 2020-07-06T18:48:02.363Z
AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good 2020-06-29T17:00:42.150Z
AI Benefits Post 1: Introducing “AI Benefits” 2020-06-22T16:59:22.605Z


Comment by Cullen (Cullen_OKeefe) on Thoughts on "The Offense-Defense Balance Rarely Changes" · 2024-02-13T15:33:46.627Z · LW · GW

Good point!

Comment by Cullen (Cullen_OKeefe) on Thoughts on "The Offense-Defense Balance Rarely Changes" · 2024-02-12T17:58:11.166Z · LW · GW


The offensive technology to make the planet unsuitable for current human civilization ALREADY exists - the defense so far has consisted of convincing people not to use it.

I think this is true in the limit (assuming you're referring primarily to nukes). But I think offense-defense reasoning is still very relevant here: for example, to know when and how much to worry about AIs using nuclear technology to cause human extinction, you would want to ask under what circumstances humans can defend command and control of nuclear weapons from AIs that want to seize them.

We just can't learn much from human-human conflict, where at almost any scale, the victor hopes to have a hospitable environment remaining afterward.

I agree that the calculus changes dramatically if you assume that the AI does not need or want the earth to remain inhabitable by humans. I also agree that, in the limit, interspecies interactions are plausibly a better model than human-human conflicts. But I don't agree that either of these implies that offense-defense logic is totally irrelevant.

Humans, as incumbents, inherently occupy the position of defenders as against the misaligned AIs in these scenarios, at least if we're aware of the conflict (which I grant we might not be). The worry is that AIs will try to gain control in certain ways. Offense-defense thinking is important if we ask questions like:

  1. Can we predict how AIs might try to seize control? I.e., what does control consist in from their perspective, and how might they achieve it given the parties' starting positions?
  2. If we have some model of how AIs try to seize control, what does that imply about humanity's ability to defend itself?
Comment by Cullen (Cullen_OKeefe) on AI #12: The Quest for Sane Regulations · 2023-05-18T20:42:23.951Z · LW · GW

The opening statements made it clear that no one involved cared about or was likely even aware of existential risks.

I think this is a significant overstatement given, especially, these remarks from Sen. Hawley:

And I think my question is, what kind of an innovation is [AI] going to be? Is it gonna be like the printing press that diffused knowledge, power, and learning widely across the landscape, that empowered ordinary, everyday individuals, that led to greater flourishing, that led above all to greater liberty? Or is it gonna be more like the atom bomb: huge technological breakthrough, but the consequences severe, terrible, continue to haunt us to this day? I don’t know the answer to that question. I don’t think any of us in the room know the answer to that question. ’Cause I think the answer has not yet been written. And to a certain extent, it’s up to us here and to us as the American people to write the answer.

Obviously he didn't use the term "existential risk." But that's not the standard we should use to determine whether people are aware of risks that could be called, in our lingo, existential. Hawley clearly believes there is a real possibility that this could be an atomic-bomb-level invention, which is pretty good (but not decisive) evidence that, if asked, he would agree that it could cause something like human extinction.

Comment by Cullen (Cullen_OKeefe) on Polio Lab Leak Caught with Wastewater Sampling · 2023-04-07T17:43:47.016Z · LW · GW

Yes, same article. (I'm confused about what the question is.)

Comment by Cullen (Cullen_OKeefe) on Tracking Compute Stocks and Flows: Case Studies? · 2022-10-28T00:14:07.253Z · LW · GW

Thanks! I'm a bit confused by this though. Could you point me to some background information on the type of tracking that is done there?

Comment by Cullen (Cullen_OKeefe) on Trends in GPU price-performance · 2022-09-30T18:13:59.162Z · LW · GW

Is there a publicly accessible version of the dataset?

Comment by Cullen (Cullen_OKeefe) on «Boundaries», Part 1: a key missing concept from utility theory · 2022-08-12T21:13:46.792Z · LW · GW

ELI5-level question: Is this conceptually related to one of the key insights/corollaries of the Coase theorem, which is that efficient allocation of property requires clearly defined property rights? And to the behavioral-econ observation that irrational attachment to the status quo (e.g., the endowment effect) can prevent efficient transactions?

Comment by Cullen (Cullen_OKeefe) on Law-Following AI 4: Don't Rely on Vicarious Liability · 2022-08-04T18:23:22.027Z · LW · GW

Thanks, done. LW makes it harder than EAF to make sequences, so I didn't realize any community member could do so.

Comment by Cullen (Cullen_OKeefe) on Law-Following AI 1: Sequence Introduction and Structure · 2022-04-28T17:41:04.864Z · LW · GW

If some law is so obviously a good idea in all possible circumstances, the AI will do it whether it is law following or human preference following.

As explained in the second post, I don't agree that that's implied if the AI is intent-aligned but not aligned with some deeper moral framework like CEV.

The question isn't if there are laws that are better than nothing. It's whether we are better off encoding what we want the AI to do into laws, or into the terms of a utility function. Which format (or maybe some other format) is best for encoding our preferences?

I agree that that is an important question. I think we have a very long track record of embedding our values into law. The point of this sequence is to argue that we should therefore at a minimum explore pointing to (some subset of) laws, which has a number of benefits relative to trying to encode values directly into the utility function. I will defend that idea more fully in a later post, but to briefly motivate it: law (as compared to something like the values that would come from CEV) is more or less completely written down, much more agreed-upon, much more formalized, and has built-in processes for resolving ambiguities and contradictions.

If the human has never imagined mind uploading, does A go up to the human and explain what it is, asking if maybe that law should be changed?

A cartoon version of this may be that A says "It's not clear whether that's legal, and if it's not legal it would be very bad (murder), so I can't proceed until there's clarification." If the human still wants to proceed, they can try to:

  1. Change the law.
  2. Get a declaratory judgment that it's not in fact against the law.
Comment by Cullen (Cullen_OKeefe) on Law-Following AI 1: Sequence Introduction and Structure · 2022-04-28T00:54:54.202Z · LW · GW
  1. I haven't read all of Asimov, but in general, "the" law has a much richer body of interpretation and application than the Laws of Robotics did, and also has authoritative, external dispute-resolution processes.

  2. I don't think so. The Counselor function is just a shorthand for the process of figuring out how the law might fairly apply to X. An agent may or may not have the drive to figure that out by default, but the goal of an LFAI system is to give it that motivation. Whether it figures out the law by asking another agent or simply reasoning about the law itself is ultimately not that important.

Comment by Cullen (Cullen_OKeefe) on Law-Following AI 2: Intent Alignment + Superintelligence → Lawless AI (By Default) · 2022-04-28T00:42:17.078Z · LW · GW

(I realized the second H in that blockquote should be an A)

Comment by Cullen (Cullen_OKeefe) on Law-Following AI 1: Sequence Introduction and Structure · 2022-04-28T00:34:37.506Z · LW · GW

I appreciate your engagement! But I think your position is mistaken for a few reasons:

First, I explicitly define LFAI to be about compliance with "some defined set of human-originating rules ('laws')." I do not argue that AI should follow all laws, which does indeed seem both hard and unnecessary. But I should have been clearer about this. (I did have some clarification in an earlier draft, which I guess I accidentally excised.) So I agree that there should be careful thought about which laws an LFAI should be trained to follow, for the reasons you cite. That question itself could be answered ethically or legally, and the answer could vary from system to system. But to make this a compelling objection to LFAI, you would have to make, I think, a stronger claim: that the set of laws worth having AI follow is so small or unimportant as to not be worth the effort. That seems unlikely.

Second, you point to a lot of cases where the law would be underdetermined as to some out-of-distribution (from the caselaw/motivations of the law) action that the AI wanted to do, and say that:

I don't know about you, but I want such a decision made by humans seriously considering the issue, or by an AI's view of our best interests. I don't want it made by some pedantic letter-of-the-law interpretation of some act written hundreds of years ago, where the decision comes down to arbitrary phrasing decisions and linguistic quirks.

But I think LFAI would actually facilitate the result you want, not hinder it:

  1. As I say, the pseudocode would first ask whether the act X being contemplated is clearly illegal with reference to the set of laws the LFAI is bound to follow. If it is, then that seems to be some decent (but not conclusive) evidence that there has been a deliberative process that prohibited X.
  2. The pseudocode then asks whether X is maybe-illegal. If there has not been deliberation about analogous actions, that would suggest uncertainty, which would weigh in favor of not-X. If the uncertainty is substantial, that might be decisive against X.
  3. If the AI's estimation in either direction makes a mistake as to what humans' "true" preferences regarding X are, then the humans can decide to change the rules. The law is dynamic, and therefore the deliberative processes that shape it would/could shape an LFAI's constraints.
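
The two-step check described above could be sketched, very loosely, as follows (this is my own illustration, not the sequence's actual pseudocode; all names and the uncertainty threshold are hypothetical):

```python
from enum import Enum, auto

class Legality(Enum):
    CLEARLY_ILLEGAL = auto()
    MAYBE_ILLEGAL = auto()
    LEGAL = auto()

def may_proceed(action, classify, uncertainty_threshold=0.5):
    """Return True if an LFAI may take `action`.

    `classify` maps an action to a (Legality, uncertainty) pair, with
    uncertainty in [0, 1]. Clearly illegal acts are refused outright;
    maybe-illegal acts are refused when legal uncertainty is substantial,
    since that uncertainty weighs in favor of not acting.
    """
    status, uncertainty = classify(action)
    if status is Legality.CLEARLY_ILLEGAL:
        return False  # step 1: clear illegality blocks the act
    if status is Legality.MAYBE_ILLEGAL and uncertainty >= uncertainty_threshold:
        return False  # step 2: substantial uncertainty is decisive against X
    return True
```

Because the law is dynamic, the `classify` function here would be updated as humans change the rules, which is how the deliberative process would continue to shape an LFAI's constraints.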

Furthermore, all of this has to be considered against the backdrop of a non-LFAI system. An LFAI seems much more likely to facilitate the deliberative result than an AI that is totally ignorant of the law.

Your point about the laws being imperfect is well-taken, but I believe overstated. Certainly many laws are substantively bad or shaped by bad processes. But I would bet that most people, probably including you, would rather live among agents that scrupulously followed the law than agents who paid it no heed and simply pursued their objective functions.

Comment by Cullen (Cullen_OKeefe) on Ideal governance (for companies, countries and more) · 2022-04-05T23:23:50.445Z · LW · GW

"Constitutional design" may be a useful keyword. (Though it's obviously focused at the state-actor level, not governance generally.)

Comment by Cullen (Cullen_OKeefe) on Petrov Day 2021: Mutually Assured Destruction? · 2021-09-26T21:47:54.857Z · LW · GW

I had one of the EA Forum's launch codes, but I decided to permanently delete it as an arms-reduction measure. I no longer have access to my launch code, though I admit that I cannot convincingly demonstrate this.

Comment by Cullen (Cullen_OKeefe) on AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good · 2020-07-22T17:27:19.149Z · LW · GW

See the 5th post, where I talk about possibly delegating to governments, which would have a similar (or even stronger) such effect.

I think this illuminates two possible cruxes that could explain any disagreement here:

  1. One's level of comfort with having some AI Benefactor implement QALY maximization instead of a less controversial program of Benefits
  2. Whether and how strategic considerations should be addressed via Benefits planning

On (1), while at the object level I like QALY maximization, having a very large and powerful AI Benefactor unilaterally implement that as the global order seems suboptimal to me. On (2), I generally think strategic considerations should be addressed elsewhere, for classic gains-from-specialization reasons, but thinking about how certain Benefits plans will be perceived and received globally, including by powerful actors, is an important aspect of legitimacy that can't be fully segregated.

Comment by Cullen (Cullen_OKeefe) on AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good · 2020-07-17T22:45:45.785Z · LW · GW

At some point, some particular group of humans code the AI and press run. If all the people who coded it were totally evil, they will make an AI that does evil things.

The only place any kind of morality can affect the AI's decisions is if the programmers are somewhat moral.

(Note that I think any disagreement we may have here dissolves upon the clarification that I also—or maybe primarily for the purposes of this series—care about non-AGI but very profitable AI systems)

Comment by Cullen (Cullen_OKeefe) on AI Benefits Post 4: Outstanding Questions on Selecting Benefits · 2020-07-17T22:43:25.684Z · LW · GW

Why would humans be making these decisions? Why are we assuming that the AI can design vaccines, but not do this sort of reasoning to select how to benefit people by itself?

I don't think it's very hard to imagine AI of the sort that is able to superhumanly design vaccines but not govern economies.

I would avoid giving heuristics like that much weight. I would say to do QALY calculations, at least to the order of magnitude. The QALYs produced by different possible projects can differ by orders of magnitude. Which projects are on the table depends on how good the tech is and what's already been done. This is an optimisation that we can better make when we have the list of proposed beneficial AI projects in hand.

As I explained in a previous comment (referencing here for other readers), there are some procedural reasons I don't want to do pure EV maximization at the object level once the "pot" of benefits grows big enough to attract certain types of attention.

Comment by Cullen (Cullen_OKeefe) on AI Benefits Post 4: Outstanding Questions on Selecting Benefits · 2020-07-17T22:41:12.157Z · LW · GW

I agree that that is true for AGI systems.

Comment by Cullen (Cullen_OKeefe) on AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good · 2020-07-17T22:24:33.039Z · LW · GW

Democratization: Where possible, AI Benefits decisionmakers should create, consult with, or defer to democratic governance mechanisms.

Are we talking about decision-making in a pre- or post-superhuman-AI setting? In a pre-ASI setting, it is reasonable for the people building AI systems to defer somewhat to democratic governance mechanisms, where their demands are well considered and sensible. (At least some democratic leaders may be sufficiently lacking in technical understanding of AI for their requests to be impossible, nonsensical, or dangerous.)

In a post-ASI setting, you have an AI capable of tracking every neuron firing in every human brain. It knows exactly what everyone wants. Any decisions made by democratic processes will be purely entropic compared to the AI's. Just because democracy is better than dictatorship, monarchy, etc. doesn't mean we can attach positive affect to democracy and keep it around in the face of far better systems, like a benevolent superintelligence running everything.

Modesty: AI benefactors should be epistemically modest, meaning that they should be very careful when predicting how plans will change or interact with complex systems (e.g., the world economy).

Again, pre-ASI, this is sensible. I would expect an ASI to be very well calibrated. It will not need to be hard-coded with modesty; it can work out how modest to be by itself.

Thanks for asking for clarification on these. Yes, this is in general concerning pre-AGI systems.

[Medium confidence]: I generally agree that democracy has substantive value, not procedural value. However, I think there are very good reasons to have a skeptical prior towards any nondemocratic post-ASI order.

[Lower confidence]: I therefore suspect it's desirable to have a nontrivial period of time during which AGI will exist but humans will still retain governance authority over it. My view may vary depending on what we know about the AGI and its alignment/safety.

Comment by Cullen (Cullen_OKeefe) on AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good · 2020-07-17T22:01:06.463Z · LW · GW

I appreciate the comments here and elsewhere :-)

Equality: Benefits are distributed fairly and broadly.[2]

This sounds, at best, like a consequence of the fact that human utility functions are sublinear in resources.

I'm not sure that I agree that that is the best justification for this, although I do agree that it is an important one. Other reasons I think this is important include:

  • Many AI researchers/organizations have endorsed the Asilomar Principles, which include desiderata like "AI technologies should benefit and empower as many people as possible." To gain trust, such organizations should plan to follow through on such statements unless they discover and announce compelling reasons not to.
  • People place psychological value on equality.
  • For global stability reasons, and to reduce the risk of adverse governmental action, giving some benefits to rich people/countries seems prudent notwithstanding the fact that you can "purchase" QALYs more cheaply elsewhere.
Comment by Cullen (Cullen_OKeefe) on AI Benefits Post 1: Introducing “AI Benefits” · 2020-06-23T16:52:57.922Z · LW · GW

Thanks! Fixed.