Posts
Comments
criticism in LW comments is why he stopped writing Sequences posts
I wasn't aware of this and would like more information. Can anyone provide a source, or report their agreement or disagreement with the claim?
I second questions 1, 5, and 6 after listening to the Dwarkesh interview.
Re 6: At 1:24:30 in the Dwarkesh podcast, Leopold proposes that the US make an agreement with China to slow down (or pause) after the US has a 100GW cluster and is clearly going to win the race to build AGI, in order to buy time to get things right during the "volatile period" before AGI.
(Note: Regardless of whether it was worth it in this case, simeon_c's reward/incentivization idea may be worthwhile as long as there are expected to be some cases in the future where it's worth it, since the people in those future cases may not be as willing as Daniel to make the altruistic personal sacrifice, and so we'd want them to be able to retain their freedom to speak without it costing them as much personally.)
I'd be interested in hearing peoples' thoughts on whether the sacrifice was worth it, from the perspective of assuming that counterfactual Daniel would have used the extra net worth altruistically. Is Daniel's ability to speak more freely worth more than the altruistic value that could have been achieved with the extra net worth?
Retracted, thanks.
Retracted due to spoilers and not knowing how to use spoiler tags.
Received $400 worth of bitcoin. I confirm the bet.
@RatsWrongAboutUAP I'm willing to risk up to $20k at 50:1 odds (i.e. If you give me $400 now, I'll owe you $20k in 5 years if you win the bet) conditional on (1) you not being privy to any non-public information about UFOs/UAP and (2) you being okay with forfeiting any potential winnings in the unlikely event that I die before bet resolution.
Re (1): Could you state clearly whether you do or do not have non-public information pertaining to the bet?
Re (2): FYI The odds of me dying in the next 5 years are less than 3% by SSA base rates, and my credence is even less than that if we don't account for global or existential catastrophic risk. The reason I'd ask to not owe you any money in the worlds in which you win (and are still alive to collect money) and I'm dead is because I wouldn't want anyone else to become responsible for settling such a significant debt on my behalf.
If you accept, please reply here and send the money to this Bitcoin address: 3P6L17gtYbj99mF8Wi4XEXviGTq81iQBBJ
I'll confirm receipt of the money when I get notified of your reply here. Thanks!
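For anyone following along, here's a quick sketch of the bet's arithmetic (my own illustration, not part of the original offer):

```python
# Arithmetic of the 50:1 bet described above.
stake = 400        # dollars paid up front by the bettor
odds = 50          # 50:1 payout odds
payout = stake * odds
print(payout)      # 20000 dollars owed if the bettor wins

# Implied probability at which the bet is "fair" for the offerer:
implied_p = 1 / (odds + 1)
print(round(implied_p, 4))  # 0.0196, i.e. roughly a 2% chance of the UAP claim resolving true
```

So taking this side of the bet is profitable in expectation as long as you think the chance of losing is comfortably under ~2%.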
IMO the largest trade-offs of being vegan for most people aren't health trade-offs, but other things, like the increased time/attention cost of identifying non-vegan foods. Living in a place where there's a ton of non-vegan food available at grocery stores and restaurants makes getting food more of a pain than it is if you're not paying close attention to what's in your food. (I'm someone without any food allergies, and I imagine being vegan is about as annoying as having certain food allergies.)
That being said, it also seems to me that the vast majority of people's diets are not well optimized for health. Most people care about convenience, cost, taste, and other factors as well. My intuition is that if we took a random person and said "hey, you have to go vegan, let's try to find a vegan diet that's healthier than your current diet," we'd succeed the vast majority of the time simply because most people don't eat very healthily. That said, the random person would probably prefer a vegan diet optimized for factors beyond just health over one optimized for health alone.
I only read the title, not the post, but just wanted to leave a quick comment to say I agree that veganism entails trade-offs, and that health is one of the axes. Also note that I've been vegan since May 2019 and lacto-vegetarian since October 2017, for ethical reasons, not environmental or health or other preferences reasons.
It's long (since before I changed my diet) been obvious to me that your title statement is true, since a priori it seems very unlikely that the optimal diet for health contains exactly zero animal products, given that humans are omnivores. One doesn't need to be informed about nutrition to make that inference.
Probability that most humans die because of an AI takeover: 11%
This 11% is for "within 10 years" as well, right?
Probability that the AI we build doesn’t take over, but that it builds even smarter AI and there is a takeover some day further down the line: 7%
Does "further down the line" here mean "further down the line, but still within 10 years of building powerful AI"? Or do you mean it unqualified?
I made a visualization of Paul's guesses to better understand how they overlap:
https://docs.google.com/spreadsheets/d/1x0I3rrxRtMFCd50SyraXFizSO-VRB3TrCRxUiWe5RMU/edit#gid=0
I took issue with the same statement, but my critique is different: https://www.lesswrong.com/posts/mnCDGMtk4NS7ojgcM/linkpost-what-are-reasonable-ai-fears-by-robin-hanson-2023?commentId=yapHwa55H4wXqxyCT
But to my mind, such a scenario is implausible (much less than one percent probability overall) because it stacks up too many unlikely assumptions in terms of our prior experiences with related systems.
You mentioned 5-6 assumptions. I think at least one isn't needed (that the goal changes as it self-improves), and I disagree that the others are (all) unlikely. E.g. agentic, non-tool AIs are already here, and more will be coming (foolishly). Taking a point I just heard from Tegmark on his latest Lex Fridman podcast interview: once companies add APIs to systems like GPT-4 (I'm worried about open-sourced systems that are as powerful or more powerful in the next few years), it will be easy for people to create AI agents that use the LLM's capabilities by repeatedly calling it.
This is the fear of “foom,”
I think the popular answer to this survey also includes many slow takeoff, no-foom scenarios.
And then, when humans are worth more to the advance of this AI’s radically changed goals as mere atoms than for all the things we can do, it simply kills us all.
I agree with this, though again I think the "changed" can be omitted.
Secondly, I also think it's possible that rather than the unaligned superintelligence killing us all in the same second, as EY often says, it may kill us off in a manner like how humans kill off other species (i.e. we know we are doing it, but it doesn't look like a war).
Re my last point, see Ben Weinstein-Raun's vision here: https://twitter.com/benwr/status/1646685868940460032
Furthermore, the goals of this agent AI change radically over this growth period.
Noting that this part doesn't seem necessary to me. The agent may be misaligned before the capability gain.
Plausibly, such “ems” may long remain more cost-effective than AIs on many important tasks.
"Plausibly" (i.e. 'maybe') is not enough here to make the fear irrational ("Many of these AI fears are driven by the expectation that AIs would be cheaper, more productive, and/or more intelligent than humans.")
In other words, while it's reasonable to say "maybe the fears will all be for nothing", that doesn't mean it's not reasonable to be fearful and concerned due to the stakes involved and the nontrivial chance that things do go extremely badly.
And yes, even if AIs behave predictably in ordinary situations, they might act weird in unusual situations, and act deceptively when they can get away with it. But the same applies to humans, which is why we test in unusual situations, especially for deception, and monitor more closely when context changes rapidly.
"But the same applies to humans" doesn't seem like an adequate response when the AI system is superintelligent or past the "sharp left turn" capabilities threshold. Solutions that work for unaligned deceptive humans won't save us from a sufficiently intelligent/capable unaligned deceptive entity.
buy robots-took-most-jobs insurance,
I like this proposal.
If we like where we are and can’t be very confident of where we may go, maybe we shouldn’t take the risk and just stop changing. Or at least create central powers sufficient to control change worldwide, and only allow changes that are widely approved. This may be a proposal worth considering, but AI isn’t the fundamental problem here either.
I'm curious what you (Hanson) think(s) *is* the fundamental problem here if not AI?
Context: It seems to me that Toby Ord is right that the largest existential risks (AI being number one) are all anthropogenic risks, rather than natural risks. They also seem to be risks associated with the development of new technologies (AI, biologically engineered pandemics, (distant third and fourth:) nuclear risk, climate change). Any large unknown existential risk also seems likely to be a risk resulting from the development of a new technology.
So given that, I would think AI *is* the fundamental problem.
Maybe we can solve the AI problems with the right incentive structures for the humans making the AI, in which case perhaps one might think the fundamental problem is the incentive structure or the institutions that exist to shape those incentives, but I don't find this persuasive. This would be like saying that the problem is not nuclear weapons, it's that the Soviet Union would use them to cause harm. (Maybe this just feels like a strawman of your view, in which case feel free to ignore this part.)
Doomers worry about AIs developing “misaligned” values. But in this scenario, the “values” implicit in AI actions are roughly chosen by the organisations who make them and by the customers who use them.
There is reason to think "roughly" aligned isn't enough in the case of a sufficiently capable system.
Second, Robin's statement seems to ignore (or contradict without making an argument) the fact that even if it is true for systems not as smart as humans, there may be a "sharp left turn" at some point where, in Nate Soares' words, "as systems start to work really well in domains really far beyond the environments of their training" "it’s predictably the case that the alignment of the system will fail to generalize with it."
Yudkowsky and others might give different reasons why waiting until later to gain more information about the future systems doesn't make sense, including pointing out that that may lead us to missing our first "critical try."
Robin, I know you must have heard these points before--I believe you are more familiar with e.g. Eliezer's views than I am. But if that's the case, I don't understand why you would write a sentence like the last one in the quotation above. It sounds like a cheap rhetorical trick to say "but instead of waiting to deal with such problems when we understand them better and can envision them more concretely," especially without explaining why people who don't think we should wait consider that an insufficient reason, i.e. why they think there are pressing reasons to work on the problems now despite our relative state of ignorance compared to future AI researchers.
To clarify explicitly, people like Stuart Russell would point out that if future AIs are still built according to the "standard model" (a phrase I borrow from Russell) like the systems of today, then they will continue to be predictably misaligned.
This part doesn't seem to pass the ideological Turing test:
At the moment, AIs are not powerful enough to cause us harm, and we hardly know anything about the structures and uses of future AIs that might cause bigger problems. But instead of waiting to deal with such problems when we understand them better and can envision them more concretely, AI “doomers” want stronger guarantees now.
I strongly agree with this request.
If companies don't want to be the first to issue such a statement then I suggest they coordinate and share draft statements with each other privately before publishing simultaneously.
Demis Hassabis answered the question "Do you think DeepMind has a responsibility to hit pause at any point?" in 2022:
Question: Are innerly-misaligned (superintelligent) AI systems supposed to necessarily be squiggle maximizers, or are squiggle maximizers supposed to only be one class of innerly-misaligned systems?
It'd be nice if Hassabis made another public statement about his views on pausing AI development and thoughts on the FLI petition. If now's not the right time in his view, when is? And what can he do to help with coordination of the industry?
On the subject of DeepMind and pausing AI development, I'd like to highlight Demis Hassabis's remark on this topic in a DeepMind podcast interview a year ago:
'Avengers assembled' for AI Safety: Pause AI development to prove things mathematically
Hannah Fry (17:07):
You said you've got this sort of 20-year prediction and then simultaneously where society is in terms of understanding and grappling with these ideas. Do you think that DeepMind has a responsibility to hit pause at any point?
Demis Hassabis (17:24):
Potentially. I always imagine that as we got closer to the sort of gray zone that you were talking about earlier, the best thing to do might be to pause the pushing of the performance of these systems so that you can analyze down to minute detail exactly and maybe even prove things mathematically about the system so that you know the limits and otherwise of the systems that you're building. At that point I think all the world's greatest minds should probably be thinking about this problem. So that was what I would be advocating to you know the Terence Tao’s of this world, the best mathematicians. Actually I've even talked to him about this—I know you're working on the Riemann hypothesis or something which is the best thing in mathematics but actually this is more pressing. I have this sort of idea of like almost uh ‘Avengers assembled’ of the scientific world because that's a bit of like my dream.
Demis Hassabis didn't sign the letter, but has previously said that DeepMind potentially has a responsibility to hit pause at some point:
'Avengers assembled' for AI Safety: Pause AI development to prove things mathematically
Hannah Fry (17:07):
You said you've got this sort of 20-year prediction and then simultaneously where society is in terms of understanding and grappling with these ideas. Do you think that DeepMind has a responsibility to hit pause at any point?
Demis Hassabis (17:24):
Potentially. I always imagine that as we got closer to the sort of gray zone that you were talking about earlier, the best thing to do might be to pause the pushing of the performance of these systems so that you can analyze down to minute detail exactly and maybe even prove things mathematically about the system so that you know the limits and otherwise of the systems that you're building. At that point I think all the world's greatest minds should probably be thinking about this problem. So that was what I would be advocating to you know the Terence Tao’s of this world, the best mathematicians. Actually I've even talked to him about this—I know you're working on the Riemann hypothesis or something which is the best thing in mathematics but actually this is more pressing. I have this sort of idea of like almost uh ‘Avengers assembled’ of the scientific world because that's a bit of like my dream.
Feedback on the title: I don't like the title because it is binary.
Saying X is "good" or "bad" at something isn't very informative.
There are many degrees of goodness. Was it worse than you thought it would be before you played around with it a bit more? Was it worse than some popular article or tweet made you think? Was it worse than some relevant standard?
Loved the first paragraph:
In 2022, over 700 top academics and researchers behind the leading artificial intelligence companies were asked in a survey about future A.I. risk. Half of those surveyed stated that there was a 10 percent or greater chance of human extinction (or similarly permanent and severe disempowerment) from future AI systems. Technology companies building today’s large language models are caught in a race to put all of humanity on that plane.
Idea: Run a competition to come up with other such first paragraphs people can use in similar op eds, that effectively communicate important ideas like this that are good to propagate.
Then test the top answers like @Peter Wildeford said here:
If we want to know what arguments resonate with New York Times articles we can actually use surveys, message testing, and focus groups to check and we don't need to guess! (Disclaimer: My company sells these services.)
For general public, the Youtube posting is now up—it has 80 comments so far. There are also likely other news articles citing this interview that may have comment sections.
I'm not John, but if you interpret "epsilon precautions" as meaning "a few precautions" and "pre-galaxy-brained" as "before reading Zvi's Galaxy Brained Take interpretation of the film" I agree with his comment.
I just thought of a flaw in my analysis, which is that if it's intractable to make AI alignment more or less likely (and intractable to make the development of transformative AI more or less safe), then accelerating AI timelines actually seems good, because the benefits to people post-AGI if it goes well (utopian civilization for longer) seem to outweigh the harms to people pre-AGI if it goes badly (everyone on Earth dies sooner). Will think about this more.
Curious if you ever watched M3GAN?
I can't stand it, and I struggle to suspend my disbelief after lazy writing mistakes like this.
FWIW this sort of thing bothers me in movies a ton, but I was able to really enjoy M3GAN when going into it wanting it to be good and believing it might be due to reading Zvi's Mostly Spoiler-Free Review In Brief.
Yes, it's implausible that Gemma is able to build the prototype at home in a week. The writer explains that she's using data from the company's past toys, but this still doesn't explain why a similar AGI hasn't been built elsewhere in the world using some other data set. But I was able to look past this detail because the movie gets enough right in its depiction of AI (that other movies about AI don't get right) that it makes up for the shortcomings, making it one of the top 2 most realistic films on AI I've seen (the other being Colossus: The Forbin Project).
As Scott Aaronson says in his review:
Incredibly, unbelievably, here in the real world of 2023, what still seems most science-fictional about M3GAN is neither her language fluency, nor her ability to pursue goals, nor even her emotional insight, but simply her ease with the physical world: the fact that she can walk and dance like a real child, and all-too-brilliantly resist attempts to shut her down, and have all her compute onboard, and not break.
Dumb characters really put me off in most movies, but in this case I think it was fine. Gemma and her assistant's jobs are both on the line if M3GAN doesn't pan out, so they have an incentive to turn a blind eye to the warning signs. Also, their suspicions that M3GAN was dangerous weren't so blatantly obvious that people who lacked security mindsets (as some people do in real life) couldn't miss them.
I was thinking the characters were all being very stupid, taking big risks when they created this generally intelligent, agentic prototype M3GAN. But given that we live in a world where a whole lot of industry players are trying to create AGI while not even paying lip service to alignment concerns, I was willing to accept that the characters' actions, while stupid, were plausible enough to still feel realistic.
This review of M3GAN didn't get the attention it deserves!
I only just came across your review a few hours ago and decided to stop and watch the movie immediately after reading your Mostly Spoiler-Free Review In Brief section, before reading Aaronson's review and the rest of yours.
- In my opinion, the most valuable part of this review is your articulation of how the film illustrates ~10 AI safety-related problems (in the Don’t-Kill-Everyoneism Problems section).
- This is now my favorite post of yours, Zvi, thanks to the above and your amazing Galaxy Brained Take section. While I agree that it's unlikely the writer intended this interpretation, I took your interpretation to heart and decided to give this film a 10 out of 10 on IMDb, putting it in the top 4% of the 1,340+ movies I have now seen (and rated) in my life, and making it the most underrated movie in my opinion (measured by My Rating minus IMDB Rating).
- While objectively it's not as good as many films I've given 9s and 8s to, I really enjoyed watching it and think it's one of the best films on AI from a realism perspective I've seen (Colossus: The Forbin Project is my other top contender).
I agreed with essentially everything in your review, including your reaction to Aaronson's commentary re Asimov's laws.
This past week I read Nate's post on the sharp left turn (which emphasizes how people tend to ignore this hard part of the alignment problem) and recently watched Eliezer express hopelessness related to humanity not taking alignment seriously in his We're All Gonna Die interview on the Bankless podcast.
This put me in a state of mind such that when I saw Aaronson suggest that an AI system as capable as M3GAN could plausibly follow Asimov's First and Second Laws (and thereby be roughly aligned?), it was fresh on my mind to feel that people were downplaying the AI alignment problem and not taking it sufficiently seriously. This made me feel put off by Aaronson's comment even though he had just said "Please don’t misunderstand me here to be minimizing the AI alignment problem, or suggesting it’s easy" in his previous sentence.
So while I explicitly don't want to criticize Aaronson for this due to him making clear that he did not intend to minimize the alignment problem with his statement re Asimov's Laws, I do want to say that I'm glad you took the time to explain clearly why Asimov's Laws would not save the world from M3GAN.
I also appreciated your insights into the film's illustration of the AI safety-related problems.
Thanks for sharing. FWIW I would have preferred to read the A Way to Be Okay section first, and only reference the other sections if things didn't make sense (though I think I would have understood it all fine without those five sections) (though I didn't know this in advance so I just read the essay from beginning to end).
The main benefits of the project are presumably known to the engineer engaging in it. It was the harm of the project (specifically the harm arising from how the project accelerates AI timelines), which the engineer was skeptical was significant, that I wanted to look at more closely, to determine whether it was large enough to make it questionable whether engaging in the project was good for the world.
Given my finding that a 400-hour ML project (I stipulated the project takes 0.2 years of FTE work) would, via its effects on shortening AI timelines, shorten the lives of existing people by around 17 years in aggregate, it seems this harm is not only non-trivial, but likely dominates the expected value of engaging in the project. This works out to shortening people's lives by around 370 hours for every hour worked on the project.
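To make the conversion explicit, here's a back-of-the-envelope restatement of the numbers above (the input figures are the post's estimates, not independent data):

```python
# Sanity check: converting "~17 aggregate life-years lost" for a 400-hour
# project into hours of life lost per hour worked.
project_hours = 400                  # 0.2 years of FTE work, as stipulated
aggregate_years_lost = 17            # expected life-years lost across existing people
hours_per_year = 365.25 * 24
aggregate_hours_lost = aggregate_years_lost * hours_per_year
ratio = aggregate_hours_lost / project_hours
print(round(ratio))                  # ~373, consistent with the ~370 figure quoted
```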
If someone thinks the known benefits of working on the project are being drastically underestimated as well, I'd be interested in seeing an analysis of the expected value of those benefits, and am curious which benefits that person thinks are surprisingly huge. Given the lack of a safety angle to the project, I don't see what other benefit (or harm) would come close in magnitude to the harm caused via accelerating AI timelines and increasing extinction risk, but of course would love to hear if you have any idea.
Thanks for the response and for the concern. To be clear, the purpose of this post was to explore how much a typical, small AI project would affect AI timelines and AI risk in expectation. It was not intended as a response to the ML engineer, and as such I did not send it or any of its contents to him, nor comment on the quoted thread. I understand how inappropriate it would be to reply to the engineer's polite acknowledgment of the concerns with my long analysis of how many additional people will die in expectation due to the project accelerating AI timelines.
I also refrained from linking to the quoted thread specifically because again this post is not a contribution to that discussion. The thread merely inspired me to take a quantitative look at what the expected impacts of a typical ML project actually are. I included the details of the project for context in case others wanted to take them into account when forecasting the impact.
I also included Jim and Raymond's comments because this post takes their claims as givens. While I understand the ML engineer may have been skeptical of their claims, and so elaborating on why the project is expected to accelerate AI timelines (and therefore increase AI risk) would be necessary to persuade them that their project is bad for the world, again that aim is outside of the scope of this post.
I've edited the heading after "The trigger for this post" from "My response" to "My thoughts on whether small ML projects significantly affect AI timelines" to make clear that the contents are not intended as a response to the ML engineer, but rather are just my thoughts about the claim made by the ML engineer. I assume that heading is what led you to interpret this post as a response to the ML engineer, but if there's anything else that led you to interpret it that way, I'd appreciate you letting me know so I can improve it for others who might read it. Thanks again for reading and offering your thoughts.
IIRC Linch estimated in an EA Forum post that we should spend up to ~$100M to reduce x-risk by 1 basis point, i.e. ~$1M per microdoom. Maybe nanodooms would be a better unit.
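The unit conversion behind that figure, for anyone checking (numbers are as I recall them, so treat them as illustrative):

```python
# Converting "$100M per basis point of x-risk reduction" into dollars per microdoom.
usd_per_basis_point = 100e6       # ~$100M to reduce x-risk by 1 basis point (1e-4)
microdooms_per_basis_point = 100  # 1 basis point = 1e-4 = 100 * 1e-6
usd_per_microdoom = usd_per_basis_point / microdooms_per_basis_point
print(usd_per_microdoom)          # 1000000.0, i.e. ~$1M per microdoom
```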
Re: 1: Do Dane's Guestimate models ever yield >1 microdoom estimates for solo research projects? That sounds like a lot.
my unconditional median TAI timeline is now something like 2047, with a mode around 2035, defined by the first year we get >30% yearly GWP growth as measured from a prior peak, or an event of comparable significance.
Given it's about to be 2023, this means your mode is 12 years away and your median is 24 years away. I'd expect your mode to be nearer than your median, but probably not that much nearer.
I haven't forecasted when we might get >30% yearly GWP growth or an event of comparable significance (e.g. x-risk) specifically, but naively I'd guess that (for example) 2040 is more likely than 2035 to be the first year in which there is >30% annual GWP growth (or x-risk).
Also every one of the organizations you named is a capabilities company which brands itself based on the small team they have working on alignment off on the side.
I'm not sure whether OpenAI was one of the organizations named, but if so, this reminded me of something Scott Aaronson said on this topic in the Q&A of his recent talk "Scott Aaronson Talks AI Safety":
Maybe the one useful thing I can say is that, in my experience, which is admittedly very limited—working at OpenAI for all of five months—I’ve found my colleagues there to be extremely serious about safety, bordering on obsessive. They talk about it constantly. They actually have an unusual structure, where they’re a for-profit company that’s controlled by a nonprofit foundation, which is at least formally empowered to come in and hit the brakes if needed. OpenAI also has a charter that contains some striking clauses, especially the following:
We are concerned about late-stage AGI development becoming a competitive race without time for adequate safety precautions. Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project.
Of course, the fact that they’ve put a great deal of thought into this doesn’t mean that they’re going to get it right! But if you ask me: would I rather that it be OpenAI in the lead right now or the Chinese government? Or, if it’s going to be a company, would I rather it be one with a charter like the above, or a charter of “maximize clicks and ad revenue”? I suppose I do lean a certain way.
Source: 1:12:52 in the video, edited transcript provided by Scott on his blog.
In short, it seems to me that Scott would not have pushed back on a claim that OpenAI is an organization "that seem[s] like the AI research they're doing is safety research" in the way you did, Jim.
I assume that all the sad-reactions are sadness that all these people at the EAGx conference aren't noticing on their own that their work/organization seems bad for the world, and that these conversations are therefore necessary. (The sheer number of conversations like this you're having also suggests that it's a hopeless uphill battle, which is sad.)
So I wanted to bring up what Scott Aaronson said here to highlight that "systemic change" interventions are necessary also. Scott's views are influential; targeting talking to him and other "thought leaders" who aren't sufficiently concerned about slowing down capabilities progress (or who don't seem to emphasize enough concern for this when talking about organizations like OpenAI) would be helpful, or even necessary, for us to get to a world a few years from now where everyone studying ML or working on AI capabilities is at least aware of arguments about AI alignment and why increasing AI capabilities seems harmful.