Posts

Could Germany have won World War I with high probability given the benefit of hindsight? 2023-11-27T22:52:42.066Z
Could World War I have been prevented given the benefit of hindsight? 2023-11-27T22:39:15.866Z
“Why can’t you just turn it off?” 2023-11-19T14:46:18.427Z
On Overhangs and Technological Change 2023-11-05T22:58:51.306Z
Stuxnet, not Skynet: Humanity's disempowerment by AI 2023-11-04T22:23:55.428Z
Architects of Our Own Demise: We Should Stop Developing AI 2023-10-26T00:36:05.126Z
Roko's Shortform 2020-10-14T17:30:47.334Z
Covid-19 Points of Leverage, Travel Bans and Eradication 2020-03-19T09:08:28.846Z
Ubiquitous Far-Ultraviolet Light Could Control the Spread of Covid-19 and Other Pandemics 2020-03-18T12:44:42.756Z
$100 for the best article on efficient charity - the winner is ... 2010-12-12T15:02:06.007Z
$100 for the best article on efficient charity: the finalists 2010-12-07T21:15:31.102Z
$100 for the best article on efficient charity -- Submit your articles 2010-12-02T20:57:31.410Z
Superintelligent AI mentioned as a possible risk by Bill Gates 2010-11-28T11:51:50.475Z
$100 for the best article on efficient charity -- deadline Wednesday 1st December 2010-11-24T22:31:57.215Z
Competition to write the best stand-alone article on efficient charity 2010-11-21T16:57:35.003Z
Public Choice and the Altruist's Burden 2010-07-22T21:34:52.740Z
Politicians stymie human colonization of space to save make-work jobs 2010-07-18T12:57:47.388Z
Financial incentives don't get rid of bias? Prize for best answer. 2010-07-15T13:24:59.276Z
A proposal for a cryogenic grave for cryonics 2010-07-06T19:01:36.898Z
MWI, copies and probability 2010-06-25T16:46:08.379Z
Poll: What value extra copies? 2010-06-22T12:15:54.408Z
Aspergers Survey Re-results 2010-05-29T16:58:34.925Z
Shock Level 5: Big Worlds and Modal Realism 2010-05-25T23:19:44.391Z
The Tragedy of the Social Epistemology Commons 2010-05-21T12:42:38.103Z
The Social Coprocessor Model 2010-05-14T17:10:15.475Z
Aspergers Poll Results: LW is nerdier than the Math Olympiad? 2010-05-13T14:24:24.783Z
Do you have High-Functioning Asperger's Syndrome? 2010-05-10T23:55:45.936Z
What is missing from rationality? 2010-04-27T12:32:06.806Z
Report from Humanity+ UK 2010 2010-04-25T12:33:33.170Z
Ugh fields 2010-04-12T17:06:18.510Z
Anthropic answers to logical uncertainties? 2010-04-06T17:51:49.486Z
What is Rationality? 2010-04-01T20:14:09.309Z
David Pearce on Hedonic Moral realism 2010-02-03T17:27:31.982Z
Strong moral realism, meta-ethics and pseudo-questions. 2010-01-31T20:20:47.159Z
Simon Conway Morris: "Aliens are likely to look and behave like us". 2010-01-25T14:16:18.752Z
London meetup: "The Friendly AI Problem" 2010-01-19T23:35:47.131Z
Savulescu: "Genetically enhance humanity or face extinction" 2010-01-10T00:26:56.846Z
Max Tegmark on our place in history: "We're Not Insignificant After All" 2010-01-04T00:02:04.868Z
Help Roko become a better rationalist! 2009-12-02T08:23:37.643Z
11 core rationalist skills 2009-12-02T08:09:05.922Z
Being saner about gender and rationality 2009-07-20T07:17:13.855Z
How likely is a failure of nuclear deterrence? 2009-07-15T00:01:28.640Z
Our society lacks good self-preservation mechanisms 2009-07-12T09:26:23.365Z
The enemy within 2009-07-05T15:08:05.874Z
Richard Dawkins TV - Baloney Detection Kit video 2009-06-25T00:27:23.325Z
Shane Legg on prospect theory and computational finance 2009-06-21T17:57:09.235Z
Cascio in The Atlantic, more on cognitive enhancement as existential risk mitigation 2009-06-18T15:09:57.954Z
Intelligence enhancement as existential risk mitigation 2009-06-15T19:35:07.530Z
The Terrible, Horrible, No Good, Very Bad Truth About Morality and What To Do About It 2009-06-11T12:31:02.904Z
Expected futility for humans 2009-06-09T12:04:29.306Z

Comments

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-12-01T02:23:49.582Z · LW · GW

Yes, and also using radio with good encryption to communicate quickly.

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-29T14:22:03.885Z · LW · GW

sure.

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-28T17:08:40.971Z · LW · GW

It certainly seems that a mastery of tank warfare would have helped a lot. But the British experience with tanks shows that there was a huge amount of resistance within the military to new forms of warfare. Britain only had tanks because Winston Churchill made it his priority to support them.

New weapon systems are not impressive at first. The old ways are typically a local optimum. So the real question here is how to leave that local optimum!

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-28T13:04:58.511Z · LW · GW

That's a good point. I will clarify. I mean [a] - you win, the enemy surrenders.

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-28T03:07:45.033Z · LW · GW

I'm struggling to see why fun books would make any difference. Germany didn't lose because it ran out of light reading material.

As for troop morale and so on, I don't think that was a decisive element, as by the time it started to matter, defeat was already overdetermined.

In other words, I think Germany would have lost WWI even with infinite morale.

Comment by Roko on Apocalypse insurance, and the hardline libertarian take on AI risk · 2023-11-28T02:33:20.668Z · LW · GW

If it pays out in advance it isn't insurance.

A contract that relies on a probability to calculate payments is also a serious theoretical headache. If you are a Bayesian, there's no objective probability to use since probabilities are subjective things that only exist relative to a state of partial ignorance about the world. If you are a frequentist there's no dataset to use.

There's another issue.

As the threat of extinction gets higher and also closer in time, it can easily be the case that there's no possible payment that people ought to rationally accept.

Finally different people have different risk tolerances such that some people will gladly take a large risk of death for an upfront payment, but others wouldn't even take it for infinity money.

E.g. right now I would take a 16% chance of death for a $1M payment, but if I had $50M net worth I wouldn't take a 16% risk of death even if infinity money was being offered.

Since these x-risk companies must compensate everyone at once, even a single rich person in the world could make the risk uninsurable.
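
A toy model of why this happens (a sketch under my own assumptions: utility of wealth is bounded, and the particular function and constants are invented purely for illustration):

```python
# Toy model: with bounded utility of wealth, a fixed death risk can become
# uncompensable at high net worth, because even an infinite payment only
# pushes utility toward its upper bound. U(w) = w / (w + c) with c = $1M
# and U(death) = 0 are invented assumptions, not claims about real people.
def utility(wealth, c=1_000_000):
    return wealth / (wealth + c)

def compensable(wealth, p_death=0.16):
    # Accept a payment X iff (1 - p_death) * U(wealth + X) >= U(wealth).
    # The best case is X -> infinity, where U(wealth + X) -> 1.
    return (1 - p_death) * 1.0 > utility(wealth)

print(compensable(50_000))       # True: some finite payment is acceptable
print(compensable(50_000_000))   # False: no payment, however large, suffices
```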

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-28T01:58:50.977Z · LW · GW

I think you'd have a nerdy novel society and a loss of WWI, for the same reasons it was lost in our timeline.

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-27T23:59:31.069Z · LW · GW

But I don't see an actionable plan for winning here?

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-27T23:59:01.649Z · LW · GW

Sure, you can bring decision theory knowledge. All I'm disallowing is something like bringing back exact plans for a nuke.

Comment by Roko on Could World War I have been prevented given the benefit of hindsight? · 2023-11-27T23:41:15.905Z · LW · GW

Well, it turned out that attacking on the Western Front in WWI was basically impossible. The front barely moved over four years, and that was with far more opposing soldiers over a much wider front.

So the best strategy for Germany would have been to dig in really deep and just wait for France to exhaust itself.

At least that's my take as something of an amateur.

Comment by Roko on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-27T23:34:46.883Z · LW · GW

But the British could have entered the war anyway. After all, British war goals were to maintain a balance of power in Europe, and they didn't want France and Russia to fall or Germany to become too strong.

Comment by Roko on Could World War I have been prevented given the benefit of hindsight? · 2023-11-27T23:21:46.867Z · LW · GW

OK, but if I am roleplaying the German side, I might choose to still start WWI but just not attack through Belgium. I will hold the Western Front against France and attack Russia.

Comment by Roko on “Why can’t you just turn it off?” · 2023-11-26T21:06:17.519Z · LW · GW

True. I may in fact have been somewhat underconfident here.

Comment by Roko on Why is violence against AI labs a taboo? · 2023-11-24T22:09:35.687Z · LW · GW

I think violence helps unaligned AI more than it helps aligned AI.

If the research all goes underground it will slow it down but it will also make it basically guaranteed that there's a competitive, uncoordinated transition to superintelligence.

Comment by Roko on “Why can’t you just turn it off?” · 2023-11-24T22:01:23.105Z · LW · GW

Well, Altman is back in charge now.... I don't think I'm being overconfident

Comment by Roko on “Why can’t you just turn it off?” · 2023-11-23T11:36:01.906Z · LW · GW

It seems that I was mostly right on the specifics: there was a lot of resistance to getting rid of Altman, and he is back (for now).

Comment by Roko on “Why can’t you just turn it off?” · 2023-11-22T13:06:47.669Z · LW · GW

I didn't make anything up. Altman is now back in charge BTW.

Comment by Roko on “Why can’t you just turn it off?” · 2023-11-19T16:38:45.730Z · LW · GW

Well the new CEO is blowing kisses to him on Twitter

https://twitter.com/miramurati/status/1726126391626985793

Comment by Roko on “Why can’t you just turn it off?” · 2023-11-19T15:36:09.755Z · LW · GW

Well the board are in negotiations to have him back

https://www.theverge.com/2023/11/18/23967199/breaking-openai-board-in-discussions-with-sam-altman-to-return-as-ceo

"A source close to Altman says the board had agreed in principle to resign and to allow Altman and Brockman to return, but has since waffled — missing a key 5PM PT deadline by which many OpenAI staffers were set to resign. If Altman decides to leave and start a new company, those staffers would assuredly go with him."

Comment by Roko on A Cost- Benefit Analysis of Immunizing Healthy Adults Against Influenza · 2023-11-17T19:16:47.682Z · LW · GW

I think there's a pretty big mistake here - the value of not getting flu is a lot more than $200.

At a $5M value of life spread over roughly 70 remaining years (~25,000 days), each day is worth about $200, so 7 days of almost complete incapacitation is -$1400.

I would certainly pay $1400 upfront to make a bad flu just instantly stop.
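
A quick back-of-the-envelope check of those figures (the 70-year remaining lifespan is my own assumption; the $5M value of life is from the comment above):

```python
# Sanity check of the flu cost estimate under toy assumptions.
value_of_life = 5_000_000
remaining_days = 70 * 365                        # ~25,550 days
value_per_day = value_of_life / remaining_days   # ~$196, i.e. "about $200"
flu_cost = 7 * value_per_day                     # ~$1,370, roughly -$1400
print(f"${value_per_day:.0f}/day, ${flu_cost:.0f} for a week of flu")
```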

Comment by Roko on Concrete positive visions for a future without AGI · 2023-11-09T02:48:39.102Z · LW · GW

dath ilan is currently getting along pretty well without AGI

I hate to have to say it, but you are generalizing from fictional evidence

Dath ilan doesn't actually exist. It's a fantasy journey in Eliezer's head. Nobody has ever subjected it to the rigors of experimentation and attempts at falsification.

The world around us does exist. And things are not going well! We had a global pandemic that was probably caused by government labs that do research into pandemics, and then covered up by scientists who are supposed to tell us the truth about pandemics. THAT ACTUALLY HAPPENED! God sampled from the actual generating function for the universe, and apparently that outcome had a high enough probability to be picked.

A world without any advanced AI tech is probably not a good world; collapsing birthrates, rent extraction, dysgenics and biorisk are probably fatal.

Comment by Roko on On Overhangs and Technological Change · 2023-11-07T14:09:05.341Z · LW · GW

Yes, and I believe that the invention and spread of firearms was key to this, as they reduced the skill dependence of warfare, eroding the advantage that a dedicated warband has over a sedentary population.

Comment by Roko on On Overhangs and Technological Change · 2023-11-07T14:06:51.159Z · LW · GW

What happened in the 1200s is that Mongols had a few exceptionally good leaders

It's consistent with the overhang model that a new phase needs ingredients A, B, C, ... X, Y, Z. When you only have A, ... X it doesn't work. Then Y and Z arrive, it all falls into place, and there's a rapid and disruptive change. In this case maybe Y and Z were good leaders or something. I don't want to take too strong a position on this, as given my research it seems there is still debate among specialists about what exactly the key ingredients were.

Comment by Roko on The other side of the tidal wave · 2023-11-07T11:59:44.554Z · LW · GW

Most people, ultimately, do not care about something that abstract and will be happy living in their own little Truman Show realities that are customized to their preferences.

Personally I find The World to be dull and constraining, full of things you can't do because someone might get offended or some lost-purposes system might zap you. Did you fill in your taxes yet!? Did you offend someone with that thoughtcrime?! Plus, there are the practical downsides like ill health and so on.

I'd be quite happy to never see 99.9999999% of humanity ever again, to simply part ways and disappear off into our respective optimized Truman Shows.

And honestly I think anyone who doesn't take this point of view is being insane. Whatever it is you like, you can take with you. Including select other people who mutually consent.

Comment by Roko on On Overhangs and Technological Change · 2023-11-06T20:26:29.425Z · LW · GW

Could an alien observer have identified Genghis Khan's and the Mongol's future prospects

Well, probably not to that level of specificity, but I think the general idea of empires consuming vulnerable lands and smaller groups would have been obvious

Comment by Roko on Stuxnet, not Skynet: Humanity's disempowerment by AI · 2023-11-06T12:32:17.267Z · LW · GW

Thanks! Linked.

Comment by Roko on On Overhangs and Technological Change · 2023-11-06T12:19:01.183Z · LW · GW

Well, sometimes they can, because sometimes the impending consumption of the resource is sort of obvious. Imagine a room that's gradually filling with a thin layer of petrol on the floor, with a bunch of kids playing with matches in it.

Comment by Roko on Stuxnet, not Skynet: Humanity's disempowerment by AI · 2023-11-05T18:24:06.421Z · LW · GW

One possible way to kill humans

I suspect that drones + poison may be surprisingly effective. You only need one small-ish facility to make a powerful poison or bioweapon that drones can spread everywhere or just sneak into the water supply. Once 90% of humans are dead, the remainder can be mopped up.

Way harder to keep things running once we're gone.

Comment by Roko on Stuxnet, not Skynet: Humanity's disempowerment by AI · 2023-11-05T18:17:00.683Z · LW · GW

Yes, this is a great example of exploratory engineering.

Comment by Roko on AI Safety is Dropping the Ball on Clown Attacks · 2023-11-05T11:02:57.594Z · LW · GW

This post is way too long. Forget clown attacks, we desperately need LLMs that can protect us from verbosity attacks.

Comment by Roko on The other side of the tidal wave · 2023-11-05T10:33:55.621Z · LW · GW

You can just create personalized environments tailored to your preferences. Assuming that you have power/money in the post-singularity world.

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-27T16:34:47.611Z · LW · GW

a technical problem, around figuring out how to build an AGI that does what the builder wants

How does a solution to the above solve the coordination/governance problem?

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T21:24:48.683Z · LW · GW

Ah, I see. Yeah, that's a reasonable worry. Any ideas on how someone in those orgs could incentivize such behavior whilst discouraging poorly thought out pivotal acts?

the fact that we are having this conversation simply underscores how dangerous this is and how unprepared we are.

This is the future of the universe we're talking about. It shouldn't be a footnote!

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T21:22:43.300Z · LW · GW

researchers at big labs will not be forced to program an ASI to do bad things against the researchers' own will

Well, these systems aren't programmed. Researchers work on architecture and engineering; goal content is down to the RLHF that is applied and the wishes of the user(s), and the wishes of the user(s) are determined by market forces, user preferences, etc. And user preferences may themselves be influenced by other AI systems.

Closed source models can have RLHF and be delivered via an API, but open source models will not be far behind at any given point in time. And of course prompt injection attacks can bypass the RLHF on even closed source models.

The decisions about what RLHF to apply on contentious topics will come from politicians and from the leadership of the companies, not from the researchers. And politicians are influenced by the media and elections, and company leadership is influenced by the market and by cultural trends.

Where does the chain of control ultimately ground itself?

Answer: it doesn't. Control of AI in the current paradigm is floating. Various players can influence it, but there's no single source of truth for "what's the AI's goal".

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T18:24:12.639Z · LW · GW

AI researchers would be the ones in control

No. You have simplistic and incorrect beliefs about control.

If there are a bunch of companies (Deepmind, Anthropic, Meta, OpenAI, ...) and a bunch of regulation efforts and politicians who all get inputs, then the AI researchers will have very little control authority, as little perhaps as the physicists had over the use of the H-bomb.

Where does the control really reside in this system?

Who made the decision to almost launch a nuclear torpedo in the Cuban Missile Crisis?

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T12:37:27.960Z · LW · GW

I would question the idea of "control" being pivotal.

Even if every AI is controllable, there's still the possibility of humans telling those AIs to do bad things and thereby destabilizing the world and throwing it into an equilibrium where there are no more humans.

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T11:55:34.919Z · LW · GW

Global compliance is the sine qua non of regulatory approaches, and there is no evidence of the political will to make that happen being within our possible futures unless some catastrophic but survivable casus belli happens to wake the population up

Part of why I am posting this is in case that happens, so people are clear what side I am on.

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T10:50:03.866Z · LW · GW

Who is going to implement CEV or some other pivotal act?

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T10:10:03.797Z · LW · GW

Well, the AI technical safety work that's appropriate for neural networks is about 5-6 years old; if we go back before 2017, I don't think any relevant work was done.

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T10:08:46.478Z · LW · GW

yes

Comment by Roko on Architects of Our Own Demise: We Should Stop Developing AI · 2023-10-26T10:08:07.736Z · LW · GW

Conversely, if we had a complete technical solution, I don't see why we necessarily need that much governance competence.

As I said in the article, technically controllable ASIs are the equivalent of an invasive species which will displace humans from Earth politically, economically and militarily.

Comment by Roko on EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem · 2023-10-25T22:11:34.730Z · LW · GW

virtue signalling is generally used for insincerity

Virtue signalling can be sincere.

Comment by Roko on Thoughts on responsible scaling policies and regulation · 2023-10-25T21:46:36.007Z · LW · GW

"If the world were unified around the priority of minimizing global catastrophic risk, I think that we could reduce risk significantly further by implementing a global, long-lasting, and effectively enforced pause on frontier AI development—including a moratorium on the development and production of some types of computing hardware"

This really needs to be shouted from the rooftops. In the public sphere, people will hear "responsible scaling policy" as "It's maximally safe to keep pushing ahead with AI" rather than "We are taking on huge risks because politicians can't be bothered to coordinate".

Comment by Roko on RSPs are pauses done right · 2023-10-21T12:45:15.646Z · LW · GW

The problem with a naive implementation of RSPs is that we're trying to build a safety case for a disaster that we fundamentally don't understand and where we haven't even produced a single disaster example or simulation.

To be more specific, we don't know exactly which bundles of AI capabilities and deployments will eventually result in a negative outcome for humans. Worse, we're not even trying to answer that question - nobody has run an "end of the world simulator" and as far as I am aware there are no plans to do that.

Without such a model it's very difficult to do expected utility maximization with respect to AGI scaling, deployment, etc.
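
As a sketch of what's missing (all numbers invented): the expected-utility comparison for a scaling decision hinges entirely on a catastrophe probability that nobody can currently estimate, precisely because no such end-of-the-world model exists.

```python
# Toy expected-utility calculation for an AGI scaling decision. Every number
# here is invented; the point is that the sign of the answer flips on
# p_catastrophe, which is exactly the quantity no one can currently model.
def expected_utility(p_catastrophe, u_benefit=1.0, u_catastrophe=-100.0):
    return (1 - p_catastrophe) * u_benefit + p_catastrophe * u_catastrophe

print(expected_utility(0.001))  # ~0.9: scaling looks clearly worth it
print(expected_utility(0.05))   # ~-4.0: scaling looks clearly catastrophic
```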

Safety is a global property, not a local property. We have some surface-level understanding of this from events like the Arab Spring or World War I. Was Europe in 1913 "safe"? Apparently not, but it wasn't obvious to people.

What will happen if and when someone makes AI systems that are emotionally compelling to people and demand sentient rights for AIs? How do you run a safety eval for that? What are the consequences for humanity if we let AI systems vote in elections, run for office, start companies or run mainstream news orgs and popular social media accounts? What is the endgame of that world, and does it include any humans?

Comment by Roko on Arguments for optimism on AI Alignment (I don't endorse this version, will reupload a new version soon.) · 2023-10-18T14:48:19.774Z · LW · GW

You need a method of touching grass so that researchers have some idea of whether or not they're making progress on the real issues.

We already can't make MNIST digit recognizers secure against adversarial attacks. We don't know how to prevent prompt injection. Convnets in general are vulnerable to adversarial perturbations. RL agents that play Go at superhuman levels are vulnerable to simple strategies that exploit gaps in their cognition.

No, there's plenty of evidence that we can't make ML systems robust.

What is lacking is "concrete" evidence that this will result in blood and dead bodies.
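
For a concrete sense of the MNIST claim, here is a minimal sketch of a gradient-sign adversarial attack (assuming PyTorch; `model` is a placeholder for any trained image classifier, not a specific system):

```python
# Minimal sketch of the Fast Gradient Sign Method (Goodfellow et al., 2014).
# A tiny epsilon perturbation in the gradient-sign direction is often enough
# to flip a classifier's prediction while the image looks unchanged to us.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.1):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # loss w.r.t. the true labels
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()   # one step up the loss surface
    return x_adv.clamp(0, 1).detach()     # keep pixels in the valid range
```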

Comment by Roko on EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem · 2023-10-16T23:38:25.010Z · LW · GW

Sometimes one just has to cut the Gordian Knot:

  • Veganism is virtue signalling
  • It's unhealthy
  • It's not a priority in any meaningful value system
  • Vegan advocates are irrational about this because they are part of a secular religion
  • Saying otherwise is heresy

Conclusion: eat as much meat as you like and go work on something that actually matters. Note that EA has bad epistemics in general because it has religious aspects.

Addendum: I know this will get downvoted. But Truth is more important than votes. Enjoy!

Comment by Roko on Arguments for optimism on AI Alignment (I don't endorse this version, will reupload a new version soon.) · 2023-10-16T16:32:36.339Z · LW · GW

you only start handing out status points after someone has successfully demonstrated the security failure

Maybe you're right, we may need to deploy an AI system that demonstrates the potential to kill tens of millions of people before anyone really takes AI risk seriously. The AI equivalent of Trinity.

https://en.wikipedia.org/wiki/Trinity_(nuclear_test)

Comment by Roko on Arguments for optimism on AI Alignment (I don't endorse this version, will reupload a new version soon.) · 2023-10-16T05:12:10.617Z · LW · GW

Do you just like not believe that AI systems will ever become superhumanly strong? That once you really crank up the power (via hardware and/or software progress), you'll end up with something that could kill you?

Read what I wrote above: current systems are safe because they're weak, not safe because they're inherently safe.

Security mindset isn't necessary for weak systems because weak systems are not dangerous.

Comment by Roko on Arguments for optimism on AI Alignment (I don't endorse this version, will reupload a new version soon.) · 2023-10-15T20:53:54.888Z · LW · GW

Indeed. No mention of misuse, multipolar traps, etc!

Comment by Roko on Arguments for optimism on AI Alignment (I don't endorse this version, will reupload a new version soon.) · 2023-10-15T20:41:53.931Z · LW · GW

I believe the security mindset is inappropriate for AI

I think that's because AI today feels like a software project akin to building a website. If it works, that's nice, but if it doesn't work it's no big deal.

Weak systems have safe failures because they are weak, not because they are safe. If you piss off a kitten, it will not kill you. If you piss off an adult tiger...

The optimistic assumptions laid out in this post don't have to fail in every possible case for us to be in mortal danger. They only have to fail in one set of circumstances that someone actualizes. And as long as things keep looking like they are OK, people will continue to push the envelope of risk to get more capabilities.

We have already seen AI developers throw caution to the wind in many ways (releasing weights as open source, connecting AI to the internet, giving it access to a command prompt) and things seem OK for now so I imagine this will continue. We have already seen some psycho behavior from Sydney too. But all these systems are weak reasoners and they don't have a particularly solid grasp on cause and effect in the real world.

We are certainly in a better position with respect to winning than when I started posting on this website. To me the big wins are (1) that safety is a mainstream topic and (2) that the AIs learned English before they learned physics. But I don't regard those as sufficient for human survival.