Honoring Petrov Day on LessWrong, in 2019

post by Ben Pace (Benito) · 2019-09-26T09:10:27.783Z · LW · GW · 168 comments

Contents

  Unilateralist Action
  Not Destroying the World
  Petrov Day
    Footnotes

Just after midnight last night, 125 LessWrong users received the following email.

Subject Line: Honoring Petrov Day: I am trusting you with the launch codes

Dear {{username}},

Every Petrov Day, we practice not destroying the world. One particular way to do this is to practice the virtue of not taking unilateralist action. 

It’s difficult to know who can be trusted, but today I have selected a group of LessWrong users who I think I can rely on in this way. You’ve all been given the opportunity to show yourselves capable and trustworthy.

This Petrov Day, between midnight and midnight PST, if you, {{username}}, enter the launch codes below on LessWrong, the Frontpage will go down for 24 hours.

Personalised launch code: {{codes}}

I hope to see you on the other side of this, with our honor intact.

Yours, Ben Pace & the LessWrong 2.0 Team

P.S. Here is the on-site announcement [LW · GW].

Unilateralist Action

As Nick Bostrom has observed, society is making it cheaper and easier for small groups to end the world. We’re lucky it requires major initiatives to build a nuclear bomb, and that the world can’t be destroyed by putting sand in a microwave.

However, other dangerous technologies are becoming widely available, especially in the domain of artificial intelligence. Only 6 months after OpenAI created the state-of-the-art language model GPT-2 [LW · GW], others created similarly powerful versions and released them to the public. They disagreed about the dangers, and, because there was nothing stopping them, moved ahead.

I don’t think this example is at all catastrophic, but I worry about what this suggests about the future, when people will still have honest disagreements about the consequences of an action but where those consequences will be much worse.

And honest disagreements will happen. In the 1940s, the great physicist Niels Bohr met President Roosevelt and Prime Minister Churchill, to persuade them to give the instructions for building the atomic bomb to Russia. He wanted to bring in a new world order and establish global peace, and thought this would be necessary: he believed strongly that arms race dynamics could be prevented if only everyone shared their science. (Churchill did not allow it.) Our newest technologies do not yet have the bomb’s ability to transform the world in minutes, but I think it’s likely we’ll make powerful discoveries in the coming decades, and that publishing those discoveries will not require the permission of a president.

And then it will only take one person to end the world. Even in a group of well-intentioned people, natural disagreements will mean someone will think that taking a damaging action is actually the correct choice — Nick Bostrom calls this the “unilateralist’s curse”. In a world where dangerous technology is widely available, the greatest risk is unilateralist action.
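To make the arithmetic behind this worry concrete, here is a minimal sketch in Python. The probability and group size are invented for illustration; the point is only that the chance of someone acting grows quickly with the number of people who can act.

```python
# Minimal sketch of the unilateralist's curse (invented numbers).
# Suppose each well-intentioned actor independently concludes, with small
# probability p, that the harmful action is actually good - and one actor
# acting is enough for the harm to occur.

def p_someone_acts(p: float, n: int) -> float:
    """Probability that at least one of n independent actors takes the action."""
    return 1 - (1 - p) ** n

for n in (1, 10, 125):
    print(f"{n:>3} actors: {p_someone_acts(0.02, n):.2f}")
# 1 actors: 0.02
# 10 actors: 0.18
# 125 actors: 0.92
```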

Not Destroying the World

Stanislav Petrov once chose not to destroy the world.

As a Lieutenant Colonel in the Soviet Air Defence Forces, Petrov manned the system built to detect whether the US government had fired nuclear weapons at Russia. On September 26th, 1983, the system reported multiple such attacks. Petrov’s job was to report this as an attack to his superiors, who would launch a retaliatory nuclear response. But instead, contrary to all the evidence the systems were giving him, he called it in as a false alarm. This later turned out to be correct.

(For a more detailed story of how Stanislav Petrov saved the world, see the original LessWrong post by Eliezer [LW · GW], which started the tradition of Petrov Day.)

During the Cold War, many other people had the ability to end the world - presidents, generals, commanders of nuclear subs from many countries, and so on. Fortunately, none of them did. As the number of people with the ability to end the world increases, so too does the standard to which we must hold ourselves. We lived up to our responsibilities in the Cold War, but barely. (The Global Catastrophic Risks Institute has compiled an excellent list of 60 close calls.)

Petrov Day

On Petrov Day, we try to live up to this responsibility - we celebrate by not destroying the world.

Raymond Arnold has suggested many ways [LW · GW] of observing Petrov Day. You can discuss it with your friends. You can hold a quiet, dignified ceremony (for example, with the beautiful booklet Jim Babcock created). But you can also play on hard mode: "During said ceremony, unveil a large red button. If anybody presses the button, the ceremony is over. Go home. Do not speak."

In the comments of Ray's post, Zvi asked the following question (about a variant where a cake gets destroyed):

I still don't understand, in the context of the ceremony, what would cause anyone to push the button. Whether or not it would incinerate a cake, which would pretty much make you history's greatest monster.

To which I replied:

The point isn't that anyone sane would push the button. It's that we, as a civilisation, are just going around building buttons (cf. nukes, AGI, etc) and so it's good practice to put ourselves in the situation where any unilateralist can destroy something we all truly value. When I said the above, I was justifying why it was useful to have a ritual around Petrov Day, not why you would press the button. I can't think of any good reason to press the button, and would be angry at anyone who did - they're just decreasing trust and increasing fear of unilateralists. We still should have a ceremony where we all practice the art of sitting together and not pressing the button.

So this year on LessWrong, I thought we'd build ourselves a big red button. Instead of making everyone go home, this button (which you can find above the frontpage map) will shut down the LessWrong frontpage for 24 hours.

Now, this isn't a button for just anyone. I know there are people with internet access who will happily press buttons that do bad things. So today, I've emailed personalised launch codes to 125 LessWrong users, for us to practice the art of sitting together and not pressing harmful buttons[1]. If any users do submit a set of launch codes, tomorrow I’ll publish their username, and whose launch codes they were.

During Thursday 26th September, we will see whether the people with the codes can be trusted to not, unilaterally, destroy something valuable.

To all here on LessWrong today, I wish you a safe and stable Petrov Day.


Footnotes

[1] I picked the list quickly on Tuesday, mostly leaving out users I don’t really know, as well as a few people who I thought would press it (e.g. someone who has said in the past that they would). If this goes well we may do it again next year, with an expanded pool or more principled selection criteria. Though I think this is still a representative set - out of the 100+ users with over 1,000 karma who've logged in to LessWrong in the past month, the list includes 53% of them.


Added: Follow-Up to Petrov Day, 2019. [LW · GW]

168 comments

Comments sorted by top scores.

comment by quanticle · 2019-09-27T03:22:44.543Z · LW(p) · GW(p)

In a world where dangerous technology is widely available, the greatest risk is unilateralist action.

What Stanislav Petrov did was just as unilateralist as any of the examples linked in the OP. We must remember that when he chose to disregard the missile alert (based on his own intuition regarding the geopolitics of the world), he was violating direct orders. Yes, in this case everything turned out great, but let's think about the counterfactual scenario where the missile attack had been real. Stanislav Petrov would potentially have been on the hook for more deaths than Hitler, and for the utter destruction of his nation.

A unilateral choice not to act is as much of a unilateral choice as a unilateral choice to act.

Replies from: matthew-barnett, Benito, RobbBB, Slider, Benito, clone of saturn
comment by Matthew Barnett (matthew-barnett) · 2019-09-27T03:30:14.862Z · LW(p) · GW(p)

If one nation is confident that a rival nation will not retaliate in a nuclear conflict, then the selfish choice is to strike. By refusing orders, Petrov was being the type of agent who would not retaliate in a conflict. Therefore, in a certain sense, by being that type of agent, he arguably raised the risk of a nuclear strike. [Note: I still think his decision to not retaliate was the correct choice]
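(To make that deterrence logic concrete, here is a toy expected-payoff sketch. The payoffs and probabilities below are invented for illustration, not a claim about real doctrine:)

```python
# Toy deterrence sketch from the striking nation's perspective (invented payoffs).
PAYOFF = {
    ("strike", "retaliates"): -1000,  # mutual destruction
    ("strike", "holds fire"): 10,     # rival eliminated cheaply
    ("hold", "retaliates"):   0,      # status quo (no strike, so no retaliation)
    ("hold", "holds fire"):   0,      # status quo
}

def eu_of_striking(p_retaliate: float) -> float:
    """Expected payoff of striking, given the rival's probability of retaliating."""
    return (p_retaliate * PAYOFF[("strike", "retaliates")]
            + (1 - p_retaliate) * PAYOFF[("strike", "holds fire")])

print(eu_of_striking(0.99))  # -989.9: deterrence holds
print(eu_of_striking(0.0))   #   10.0: a known non-retaliator invites a strike
```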

Replies from: quanticle
comment by quanticle · 2019-09-27T05:16:31.095Z · LW(p) · GW(p)

Petrov's choice was obviously the correct one in hindsight. What I'm questioning is whether Petrov's choice was obviously correct in foresight. The rationality community takes as a given Petrov's assertion that it was obviously silly for the United States to attack the Soviet Union with a single ICBM. Was that actually as silly as Petrov suggested? There were scenarios where small numbers of ICBMs were launched in a surprise attack against an unsuspecting adversary in order to kill leadership and disrupt command-and-control systems. How confident was Petrov that this was not one of those scenarios?

Another assumption that the community makes is that Petrov choosing to report the detection would have immediately resulted in a nuclear "counterattack" by the Soviet Union. But Petrov was not a launch authority. The decision to launch or not was not up to him, it was up to the Politburo of the Soviet Union. We have to remember that when he chose to lie about the detection, by calling it a computer glitch when he didn't know for certain that it was one, Petrov was defecting against the system. He was deliberately feeding false data to his superiors, betting that his model of the world was more accurate than his commanders'. Is that the sort of behavior we really want to lionize?

Replies from: Aiyen, jmh, Benito
comment by Aiyen · 2019-09-27T20:36:17.634Z · LW(p) · GW(p)
But Petrov was not a launch authority. The decision to launch or not was not up to him, it was up to the Politburo of the Soviet Union.

This is obviously true in terms of Soviet policy, but it sounds like you're making a moral claim. That the Politburo was morally entitled to decide whether or not to launch, and that no one else had that right. This is extremely questionable, to put it mildly.

We have to remember that when he chose to lie about the detection, by calling it a computer glitch when he didn't know for certain that it was one, Petrov was defecting against the system.

Indeed. But we do not cooperate in prisoners' dilemmas "just because"; we cooperate because doing so leads to higher utility. Petrov's defection led to a better outcome for every single person on the planet; assuming this was wrong because it was defection is an example of the non-central fallacy [LW · GW].

Is that the sort of behavior we really want to lionize?

If you will not honor literally saving the world, what will you honor? If we wanted to make a case against Petrov, we could say that by demonstrably not retaliating, he weakened deterrence (but deterrence would have helped no one if he had launched), or that the Soviets might have preferred destroying the world to dying alone, and thus might be upset with a missileer unwilling to strike. But it's hard to condemn him for a decision that predictably saved the West, and had a significant chance (which did in fact occur) of saving the Soviet Union.

Replies from: quanticle
comment by quanticle · 2019-09-28T07:24:02.798Z · LW(p) · GW(p)

If you will not honor literally saving the world, what will you honor?

I find it extremely troubling that we're honoring someone defecting against their side in a matter as serious as global nuclear war, merely because in this case, the outcome happened to be good.

(but deterrence would have helped no one if he had launched)

That is exactly the crux of my disagreement. We act as if there were a direct lever between Petrov and the keys and buttons that launch a retaliatory counterstrike. But there wasn't. There were other people in the chain of command. There were other sensors. Do we really find it that difficult to believe that the Soviets would have attempted to verify Petrov's claim before retaliating? That there would have been practiced procedures to carry out this verification? From what I've read of the Soviet Union, their systems of positive control were far ahead of the United States', as a result of the much lower level of trust the Soviet Politburo had in their military. I find it exceedingly unlikely that the Soviets would have launched without conducting at least some kind of verification with a secondary system. They knew the consequences of nuclear attack just as well as we did.

In that context, Petrov's actions are far less justifiable. He threw away all of the procedures and training that he had... for a hunch. While everything did turn out okay in this instance, it's certainly not a mode of behavior I'd want to see established as a precedent. As I said above: Petrov's actions were just as unilateralist as those of the people releasing the GPT-2 models, and I find it discomfiting that a holiday opposing that sort of unilateral action is named after someone who, arguably, was maximally unilateralist in his thinking.

comment by jmh · 2019-09-27T13:14:39.299Z · LW(p) · GW(p)

I'm not entirely sure we can ever have a correct choice in foresight.

With regard to Petrov, he did seem to make a good and reasoned call: The US launching a first strike with 5 missiles just does not make much sense without some very serious assumptions that don't seem to be merited.

I do like the observation that Petrov was being just as unilateralist as what is feared in this thread.

Do we want to lionize such behavior? Perhaps. Your argument seems to lend itself to the lens of an AI problem -- with Petrov's behavior then serving as a control on that AI.

Replies from: quanticle
comment by quanticle · 2019-10-03T13:46:42.327Z · LW(p) · GW(p)

I also think it's weird that The Sequences, Thinking Fast and Slow, and other rationalist works such as Good and Real all emphasize gathering data and trusting data over intuition, because human intuition is fallible, subject to bias, taken in by narratives, etc... and then we're celebrating someone who did the opposite of all that and got away with it.

The steelman interpretation is that Petrov made a Bayesian assessment: he started with a prior that a nuclear attack (and especially a nuclear attack with five missiles) was an extremely unlikely scenario, appropriately discounted the evidence from the satellite detection system because the system was new and therefore prone to false alarms, and found that the posterior probability of attack did not justify passing the attack warning on. However, this seems to me like a post-hoc justification of a decision that was made on intuition.
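(For concreteness, the steelman can be written as a one-line Bayes update. All of the numbers below are invented for illustration; nothing is known about the actual quantities Petrov weighed:)

```python
# Toy Bayes update for the steelman reading of Petrov (invented numbers).
prior_attack = 1e-5          # prior probability of a US first strike that night
p_alarm_if_attack = 0.9      # warning system fires given a real attack
p_alarm_if_no_attack = 1e-3  # false-alarm rate of a new, unproven system

posterior = (p_alarm_if_attack * prior_attack) / (
    p_alarm_if_attack * prior_attack
    + p_alarm_if_no_attack * (1 - prior_attack)
)
print(f"{posterior:.3f}")  # 0.009: even after the alarm, an attack stays unlikely
```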

Replies from: TurnTrout
comment by TurnTrout · 2019-10-03T14:14:09.093Z · LW(p) · GW(p)

He thought it unlikely that the US would launch a strike with 5 ICBMs only, since a first strike would likely be comprehensive. As far as Bayesian reasoning goes, this seems pretty good.

Also, a big part of being good at Bayesian reasoning is refining your ability to reason even when you can't gather data, when you can't view the same scenario "redrawn" ten thousand times and gather statistics on it.

ETA: the satellite radar operators reported all-clear; however, instructions were to only make decisions based on the computer readouts.

Replies from: quanticle
comment by quanticle · 2019-10-03T16:16:26.550Z · LW(p) · GW(p)

I've replied below [LW(p) · GW(p)] with a similar question, but do you have a source on "satellite radar operators"? The published accounts of the incident imply that Petrov was the satellite radar operator. He followed up with the operators of the ground-based radar later, but at the time he made the decision to stay silent, he had no data that contradicted what the satellite sensors were saying.

As far as the Bayesian justification goes, I think this is bottom-line [LW · GW] reasoning. We're starting with, "Petrov made a good decision," and looking backwards in order to find reasons as to why his reasoning was reasonable and justifiable.

Replies from: TurnTrout
comment by TurnTrout · 2019-10-03T17:01:33.919Z · LW(p) · GW(p)

A group of satellite radar operators told him they had registered no missiles. BBC

I don’t see why this is bottom-line reasoning. It is in fact implausible that the US would first-strike with only five missiles, as that would leave the USSR able to respond.

comment by Ben Pace (Benito) · 2019-09-27T05:46:04.373Z · LW(p) · GW(p)

To quote Stanislav himself:

I imagined if I'd assume the responsibility for unleashing the third World War...
...and I said, no, I wouldn't. ... I always thought of it. Whenever I came on duty, I always refreshed it in my memory.

I don't think it's obvious that Petrov's choice was correct in foresight; I think he didn't know whether it was a false alarm - my current understanding is that he just didn't want to destroy the world, and that's why he disobeyed his orders. It's a fascinating historical case where someone actually got to make the choice, and made the right one. Real world situations are messy and it's hard to say exactly what his reasoning process was and how justifiable it was - it's really bad that decisions like these have to be made, and it doesn't seem likely to me that there's some simple decision rule that gets the right answer in all situations (or even most). I didn't make any explicit claim about his reasoning in the post. I simply celebrate that he managed to make the correct choice.

The rationality community takes as a given Petrov's assertion that it was obviously silly for the United States to attack the Soviet Union with a single ICBM.

I don't take it as a given. It seems like I should push back on claims about 'the rationality community' believing something before you first point to a single person who does, and when the person who wrote the post you're commenting on explicitly doesn't.

I agree with you that while LW's red button has some similarities with Petrov's situation, it doesn't reflect many parts of it. As I say in the exchange with Zvi, I think it is instead representative of the broader situation with nukes and other destructive technologies, where we're building them for little good reason and putting ourselves in increasingly precarious positions - which Petrov's 1983 incident illustrates. We honour Petrov Day by not destroying the world, and I think it's good to practice that in this way.

Replies from: Ruby, philh
comment by Ruby · 2019-09-27T17:04:23.293Z · LW(p) · GW(p)

I think we can celebrate that Petrov didn't want to destroy the world and this was a good impulse on his part. I think if we think it's doubtful that he made the correct decision, or that it's complicated, then we should be very, very upfront about that (your comment is upfront; the OP didn't make this fact stick with me). The fact the holiday is named after him made me think (implicitly if not explicitly) that people (including you, Ben) generally endorsed Petrov's reasoning/actions/etc., and so I did take the whole celebration as a claim about his reasoning. I mean, if Petrov reasoned poorly but happened to get a good result, we should celebrate the result yet condemn Petrov (or at least his reasoning). If Petrov reasoned poorly and took actions that were poor in expectation, doesn't that mean something like: in the majority of worlds, Petrov caused bad stuff to happen (or at least that the algorithm which is Petrov generally would)?

. . .

I think it is extremely, extremely weird to make a holiday about avoiding the unilateralist's curse and name it after someone who did exactly that. I hadn't thought about it, but if Quanticle is right [LW(p) · GW(p)], then Petrov was taking unilateralist action. (We could celebrate that his unilateralist action was good, but then the way Petrov Day is being themed here is weird.)

As an aside for those at home, I actually objected to Ben about the "unilateralist"/"unilateralist's curse" framing separately, because applying this to Petrov seemed a very non-obvious instance of Bostrom's original meaning*. The unilateralist's curse (Bostrom, 2013) is about when a group of people all have the same goals but have different estimates of whether an action which affects them all would be beneficial. The curse is that the more people in the group, the more likely it is that someone has a bad estimate of the value of the action and acts on it separately from everyone else. In the case of US vs Russia, this is an adversarial/enemy situation. If two people are enemies and one decides to attack the other, it is perhaps correct to say they do so "unilaterally", but it's not the phenomenon Bostrom was trying to point out with his paper/introduction of that term, and I'm the kind of person who dislikes when people's introduced terminology gets eroded.

I was thinking this at the time I objected, but we could say Petrov had the same values as the rest of the military command but had a different estimate of the value of a particular action (what to report), in which case we're back to the above, where he was taking unilateralist action.

Our button scenario is another question. I initially thought that someone would only press it if they were a troll (typical mind fallacy, you know?), in which case I'd call it more "defection" than "unilateralist" action, and so it wasn't a good fit for the framing either. If we consider that some people might actually believe the correct thing (by our true, joint underlying values) is to press the button, e.g. to save a life via donation, then that actually does fit the original intended meaning.

There are other lessons of:

  • Practice not pressing big red buttons that would have bad effects, and
  • Isn't it terrible that the world is making it easier and easier to do great harm, let's point this out by doing the same . . . (ironically, I guess)

*I somewhat dislike that the OP has the section header "Unilateralist Action", a term taken from Bostrom in one place, but then quotes something he said elsewhere, maybe implying that "building technologies that could destroy the world" was part of the original context for the unilateralist's curse.

. . .

Okay, those be the objections/comments I had brewing beneath the surface. Overall, I think our celebration of Petrov was pretty damn good with good effects and a lot of fun (although maybe it was supposed to be serious...). Ben did a tonne of work to make it happen (so did the rest of the team, especially Ray working hard to make the button).

Agree that it was a significant historical event and case study. My comments are limited mostly to the "unilateralist" angle, and a bit to the point that we should be clear which behavior/reasoning we're endorsing. I look forward to doing the overall thing again.

comment by philh · 2019-09-27T14:32:54.276Z · LW(p) · GW(p)

FWIW, I had taken that as a given.

comment by Ben Pace (Benito) · 2019-09-27T05:09:33.567Z · LW(p) · GW(p)

Indeed.

Perhaps the key problem with attempts to lift the unilateralist's curse is that it's very easy to enforce dangerous conformity - 'conformity' being a term I made sure not to use in the OP. It's crucial to be able to not do the thing that you're being told to do under the threat of immediate and strong social punishment, especially when there's a long time scale before finding out if your action is actually the right one. Consistently going against the grain because it's better in the long run, not because it brings immediate reward, is very difficult.

Being able to think and act for yourself, while not disregarding others so much that you break things, is a delicate balance, and many people end up too far on one end or the other. They find themselves punished for unilateralist action, and never speak up again; or they find that others are stopping them from being themselves, and then ignore all the costs they're imposing on their community. My current sense is that most people lean towards conformity, but also that the small number of unilateralists have caused an outsized harm.

(Then again, failures from conformity are often more silent, so I have wide error bars around the magnitude of their cost.)

comment by Rob Bensinger (RobbBB) · 2019-09-28T23:57:36.962Z · LW(p) · GW(p)

Seems like unilateralism and coordination failure are a good way of summing up humanity's general plight re nuclear weapons, which makes them relevant to a day called "Petrov Day" in a high-level way. Putting the emphasis here makes the holiday feel more like "a holiday about x-risk and a thanksgiving for our not having died to nuclear war", and less like "a holiday about the virtues of Stanislav Petrov and emulating his conduct".

If Petrov's decision was correct, or incorrect-but-reflecting-good-virtues, the relevant virtue is something like "heroic responsibility", not "refusal to be a unilateralist". I could imagine a holiday that focuses instead on heroic responsibility, or that has a dual focus. ('Lord, grant me the humility to cooperate in good equilibria, the audacity to defect from bad ones, and the wisdom to know the difference.') I'm not sure which of these options is most useful.

Replies from: quanticle
comment by quanticle · 2019-10-03T13:33:28.153Z · LW(p) · GW(p)

Well, that's one of the questions I'm raising. I'm not sure we want to encourage more "heroic responsibility" with AI technologies. Do we want someone like Stanislav Petrov to decide, "No, the warnings are false, and the AI is safe after all," and release a potentially unfriendly general AI? I would much rather not have AI at all than have it in the hands of someone who decides without consultation that their instruments are lying to them and that they know the correct thing to do based upon their judgment and intuition alone.

Replies from: TurnTrout, Idan Arye
comment by TurnTrout · 2019-10-03T15:23:40.970Z · LW(p) · GW(p)

Petrov did consult with the satellite radar operators, who said they detected nothing.

Replies from: quanticle
comment by quanticle · 2019-10-03T16:04:17.108Z · LW(p) · GW(p)

Do you have a source on Petrov consulting the radar operators? The Wikipedia article on the 1983 incident seems to imply that he did not.

Shortly after midnight, the bunker's computers reported that one intercontinental ballistic missile was heading toward the Soviet Union from the United States. Petrov considered the detection a computer error, since a first-strike nuclear attack by the United States was likely to involve hundreds of simultaneous missile launches in order to disable any Soviet means of a counterattack. Furthermore, the satellite system's reliability had been questioned in the past. Petrov dismissed the warning as a false alarm, though accounts of the event differ as to whether he notified his superiors or not after he concluded that the computer detections were false and that no missile had been launched. Petrov's suspicion that the warning system was malfunctioning was confirmed when no missile in fact arrived. Later, the computers identified four additional missiles in the air, all directed towards the Soviet Union. Petrov suspected that the computer system was malfunctioning again, despite having no direct means to confirm this. The Soviet Union's land radar was incapable of detecting missiles beyond the horizon.

From the passage above, it seems like, at the time of the decision, Petrov had no way of confirming whether the missile launches were real or not. He decided that the missile launch warnings were the result of equipment malfunction, and then followed up with land-based radar operators later to confirm that his decision had been correct.

Replies from: TurnTrout
comment by TurnTrout · 2019-10-03T16:59:17.244Z · LW(p) · GW(p)

A group of satellite radar operators told him they had registered no missiles. BBC

comment by Idan Arye · 2020-09-26T19:26:28.973Z · LW(p) · GW(p)

Petrov's choice was not about dismissing warnings; it's about picking which side to err on. Wrongfully alerting his superiors could cause a nuclear war, and wrongfully not alerting them would disadvantage his country in the nuclear war that had just started. I'm not saying he did all the numbers, used Bayes's law to figure the probability that there was an actual nuclear attack going on, assigned utilities to all four cases, and performed the final decision theory calculations - but his reasoning did take into account the possibility of error both ways. Though... it does seem like his intuition gave much more weight to utilities than to probabilities.

So, if we take that rule for deciding what to do with an AGI, it won't be just "ignore everything the instruments are saying" but "weigh the dangers of UFAI against the missed opportunities from not releasing it".

Which means the UFAI only needs to convince such a gatekeeper that releasing it is the only way to prevent a catastrophe, without having to convince the gatekeeper that the probabilities of the catastrophe are high or that the probabilities of the AI being unfriendly are low.
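(A toy rendering of the four-case calculation described above, with invented probabilities and utilities; only the structure is the point:)

```python
# Toy decision-theory sketch of the four cases (invented numbers).
p_attack = 0.01  # posterior probability that the detected attack is real

U = {
    ("report", True):  -100,     # real attack: retaliation in a war already lost
    ("report", False): -10_000,  # false alarm: wrongful report risks starting a war
    ("silent", True):  -200,     # real attack: country left at a disadvantage
    ("silent", False): 0,        # false alarm: correctly absorbed, nothing happens
}

def expected_utility(action: str) -> float:
    return p_attack * U[(action, True)] + (1 - p_attack) * U[(action, False)]

for action in ("report", "silent"):
    print(action, expected_utility(action))
# report -9901.0
# silent -2.0   -> staying silent wins unless p_attack is high
```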

comment by Slider · 2019-09-28T11:40:30.316Z · LW(p) · GW(p)

Dr. Strangelove, although fictional evidence, presents a unilateralist choice to act. A US nuclear bomber commander just ignores the upward chain of command and goes to bomb the Soviets. The case that the decision to nuke is the president's to make is way stronger and more intuitive there.

Replies from: quanticle
comment by quanticle · 2019-10-03T13:58:19.855Z · LW(p) · GW(p)

No, that's not what happens in Dr. Strangelove at all. In Dr. Strangelove, a legitimate launch order is given, the bombers take off, and then, while they're on their way to their destination, the launch order is rescinded. However, one bomber (due to equipment failure, I think) fails to receive the retraction of the launch order. The President, realizing that this bomber did not receive the order to turn back, authorizes the Soviets to shoot down the plane. The Soviets, however, are unable to do so, as the bomber has diverted from its primary target and is heading towards a nearer secondary target. The bomber crew, following their orders to the letter, undertake heroic efforts to get their bomb operational and drop it, even though that means sacrificing their commander.

In a sense, Dr. Strangelove is the very opposite of what Stanislav Petrov did. Rather than save humanity by disobeying orders, the crew dooms humanity by following its orders.

Replies from: Slider
comment by Slider · 2019-10-03T18:36:02.684Z · LW(p) · GW(p)

The downward chain of command holds appropriately, but a person who shouldn't be making such a call (I think the character is named Jack D. Ripper) is in a factual position to act as if he had received such an order. Part of the point is that it is surprising, and that the remedy of having them court-martialled is not comforting at all. Yes, he does not personally go nuke the Soviets, but he acts on his own, without cooperation, with the powers invested in him.

The points do not need to be in conflict. Ripper can doom humanity by doing unauthorised things while the bomber crew dooms them by doing authorised things.

The bomber crew equivalents also kept the Cold War cold, because it was plausible that they could be used for their trained purpose.

comment by Ben Pace (Benito) · 2019-10-03T18:53:41.730Z · LW(p) · GW(p)

Just FYI, I am planning to make another post in maybe two weeks to open further discussion, to nail down the specific details of what we want to celebrate and what is a fitting way to do that, because that seems like the correct way to build traditions.

comment by clone of saturn · 2019-09-27T03:41:52.081Z · LW(p) · GW(p)

Good point. We're unlucky that nuclear war didn't break out in 1983.

Replies from: quanticle
comment by quanticle · 2019-09-27T05:24:25.385Z · LW(p) · GW(p)

In 1983, Moscow was protected by the A-35 anti-ballistic missile system. This system was (in theory) capable of stopping either a single ICBM or six Pershing II IRBMs from West Germany. The threat that Petrov's computers reported was a single ICBM, coming from the United States. If the threat had been real, Petrov's actions would have prevented the timely activation of the ABM system, preventing the Soviets from even trying to shoot down the incoming nuke.

Replies from: ChristianKl
comment by ChristianKl · 2019-09-30T13:24:12.951Z · LW(p) · GW(p)

It seems like the official story, as you for example find it on Wikipedia, says that the system detected five ICBMs.

Replies from: quanticle
comment by quanticle · 2019-10-03T13:24:28.525Z · LW(p) · GW(p)

From https://en.wikipedia.org/wiki/1983_Soviet_nuclear_false_alarm_incident:

Shortly after midnight, the bunker's computers reported that one intercontinental ballistic missile was heading toward the Soviet Union from the United States. Petrov considered the detection a computer error, since a first-strike nuclear attack by the United States was likely to involve hundreds of simultaneous missile launches in order to disable any Soviet means of a counterattack. Furthermore, the satellite system's reliability had been questioned in the past.[12] Petrov dismissed the warning as a false alarm, though accounts of the event differ as to whether he notified his superiors[11] or not[8] after he concluded that the computer detections were false and that no missile had been launched. Petrov's suspicion that the warning system was malfunctioning was confirmed when no missile in fact arrived. Later, the computers identified four additional missiles in the air, all directed towards the Soviet Union. Petrov suspected that the computer system was malfunctioning again, despite having no direct means to confirm this.[13] The Soviet Union's land radar was incapable of detecting missiles beyond the horizon.[12]

The initial detection was one missile. Petrov dismissed this as a false alarm. Later four more missiles were detected, and Petrov also dismissed this as a false alarm. Other accounts combine both sub-incidents together and say that five missiles were detected.

I choose to focus on the first detection because that's when Petrov made the critical decision, in my mind, to not trust the satellite early warning network. The second detection of four missiles isn't as important to me, because at that point Petrov has already chosen to disregard warnings from the satellite network.

comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-26T14:28:37.246Z · LW(p) · GW(p)

Oh this is wild. This generated a strange emotion.

Anyone here know the word "Angespannt"? One of my team members taught it to me; it's a German word with no exact English equivalent. We talked about it —

https://www.ultraworking.com/podcast/big-project-angespannt

"It's a mix of tense and alert in a way. It's like the feeling you get before you go on stage."

Like, why should I care? I'm obviously not going to press the damn thing. And yet, simply knowing the button is there generates some tension and alertness.

Fascinating. Thank you for doing this.

(Well, sort of thank you, to be more precise...)

comment by Scott Garrabrant · 2019-09-26T21:32:05.699Z · LW(p) · GW(p)
If any users do submit a set of launch codes, tomorrow I’ll publish their identifying details.

If we make it through this, here are some ideas to make it more realistic next year:

1) Anonymous codes.

2) Karma bounty for the first person to press the button.

1+2) Randomly and publicly give some people the same code as each other, and give a karma bounty to everyone who had the code that took down the site.

3) Anyone with button rights can share button rights with anyone, and a karma bounty for sharing with the most other people that only pays out if nobody presses the button.

Replies from: lionhearted, Pattern
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:05:07.143Z · LW(p) · GW(p)

Or, if we want to go all max-Schelling at the risk of veering almost into Stalinism, tell people they'll get a karma bounty for pressing it, but then coordinate with LW, CFAR, MIRI, and various meetups to ban that person for life from everything if they actually do it. 😂

comment by Pattern · 2019-09-27T02:25:19.095Z · LW(p) · GW(p)

3 seems like an incentive to create sockpuppets. (It might make more sense to combine "button rights" and "codes".) Making limitations based, for example, on age of accounts moves the incentive from "create sockpuppets" to "have created sockpuppets".

comment by Said Achmiz (SaidAchmiz) · 2019-09-27T07:18:54.343Z · LW(p) · GW(p)

Well, it seems that no one has launched anything. However, skimming through the comments seems to indicate that this may at least partly be due to folks simply not having had enough time to coordinate any agreements about launching for some quid pro quo, or blackmail, or whatever. And, for that matter, not everyone has time to visit the site daily—I’d wager that at least some of the people who had launch codes simply didn’t have time to go to Less Wrong all day, or forgot, etc.

Perhaps, next time, there can be more warning? Send out the launch codes a week in advance, let’s say (though maintain only a one-day window for actually using them).

That way, we can be more certain of whether the outcome was due entirely to trustworthiness, self-restraint, and a cooperative spirit, or whether it was instead due to indecisiveness and the limitations of people’s busy schedules.

comment by Gordon Seidoh Worley (gworley) · 2019-09-26T17:37:34.112Z · LW(p) · GW(p)

the temptation, the call to infamy

button shining, yearning to be pressed

can we endure these sinuous fingers coiled?

only the hours know our hearts

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:39:36.565Z · LW(p) · GW(p)

Upvoted for poetry.

Commenting to underline it for "the call to infamy" — wonderful phrase.

comment by jmh · 2019-09-26T12:18:15.303Z · LW(p) · GW(p)

I was not aware of this story and am happy to hear it. While I think having the day of celebration and remembrance should be done, I wonder about the exercise with the button.

First, just not pushing the button and bringing the page down for a day seems not to fit the problem. The button should shut down someone else's site, with the realization that they will have some knowledge of that coming and have a button that shuts your page down. Perhaps next year the game could include other sites, particularly sites whose members do not really see eye-to-eye on things.

Second, it doesn't really tell others much about avoiding such situations. Reading Eliezer's post, the critical insight for me seems to be that of remaining calm and taking the time available to think a bit, rather than merely reacting and following the instructions of a mindless process. That Petrov realized launching 5 missiles just made no sense, and so concluded that there was a system error/false positive, is critical here.

Replies from: habryka4, lionhearted
comment by habryka (habryka4) · 2019-09-27T02:05:37.957Z · LW(p) · GW(p)

We had some original plans of coordinating with the EA Forum people on this, but didn't end up having enough time available to make all of that happen. Agree that the ideal reenactment scenario would include two forums (though with mutually assured destruction in the later parts of the Cold War, the outcome is ultimately the same).

Replies from: jmh
comment by jmh · 2019-09-27T13:34:25.369Z · LW(p) · GW(p)

A slightly different thought that might be easier to coordinate: have the button hide all the comments of a specific user on LW -- this adds the variance that the threat is not merely bilateral. We could also add something that might obscure the actor, though not entirely hide their action.

Additionally, we could have the button delete a selected subset of comments/posts, allowing a scenario where one needs to decide if an all-out attack was launched or something else is going on. That seems to be what Petrov faced. I would also add something that produced an almost identical signal even if no one pushed their button.

Though, now it's becoming more like a war game on LW than simply noting a (at least I think) positive event in history. Still, we might make it a good experiment and see what can be learned.

Maybe I'm in a dark mindset here....

Seems like today, even with (due to?) the advances in weapons and other technology, that MAD assumption may no longer be believed. I recall Putin claiming Russia would in fact survive an all-out war with the USA. I wonder how much that view might change the way the game plays out.

On a tangent here, part of the concern is the proliferation of the technology. What would a Guaranteed Assured Destruction (GAD) policy be for any country/group seeking such technology? Is that a better world than what we have now?

comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-26T15:23:33.333Z · LW(p) · GW(p)

If you've got launch codes, wait until tomorrow to read this eh? —

Lbhe pbzzrag znxrf zr jnag gb chfu gur ohggba.

uggcf://jjj.yrffjebat.pbz/cbfgf/hkLrzN2ggmwf8m8Ro/qevir-ol-ybj-rssbeg-pevgvpvfz

uggcf://jjj.yrffjebat.pbz/cbfgf/TT2egOErNz6b3zega/qrsrpgvat-ol-nppvqrag-n-synj-pbzzba-gb-nanylgvpny-crbcyr

V'z abg n zbq be nssvyvngrq va nal jnl, whfg pevgvpvmvat crbcyr qbvat n tbbq guvat jura jr'er nyy abzvanyyl ba gur fnzr grnz qevirf zr penml. Ohg gura ntnva, znlor gung'f gur Xvffvatre evtugrbhfarff dhbgr ntnva.

comment by Said Achmiz (SaidAchmiz) · 2019-09-27T03:17:29.289Z · LW(p) · GW(p)

In the comments of Ray’s post, Zvi asked the following question (about a variant where a cake gets destroyed):

I still don’t understand, in the context of the ceremony, what would cause anyone to push the button. Whether or not it would incinerate a cake, which would pretty much make you history’s greatest monster.

There are several obvious reasons why someone might push the button.

Reason one: spite. Pure, simple spite, nothing more. A very compelling reason, I assure you. (See also: “Some men just want to watch the world burn.”)

Reason two: desire for infamy. “History’s greatest monster” is much better (for many people) than being a nobody.

Reason three: personal antipathy for people who would be harmed.

I could think of more potential reasons, I suppose, but I think three examples are enough. Remember that being incapable of imagining why someone would do a bad thing is a weakness and a failure. Strive to do better.

Replies from: jacobjacob, lionhearted
comment by jacobjacob · 2019-09-27T09:54:18.650Z · LW(p) · GW(p)

All your reasons look like People Are Bad. I think it suffices that The World is Complex and Coordination is Hard.

Consider, for example:

  • Someone thinks Petrov day is not actually a good ritual and wants to make a statement about this
  • Someone thinks the reasoning exhibited in OP/comments is naïve and wouldn't stand up to the real test, and so wants to punish people/teach them a lesson about this
  • Someone comes up with a clever argument involving logical decision theories and Everett branches meaning they should push... but they made a mistake and the argument is wrong
  • Someone thinks promising but unstable person X is about to press the button, and that this would be really bad for X's community position, and so instead they take it upon themselves to press the button to enable the promising but unstable person to be redeemed and flourish
  • Someone accidentally gives away/loses their launch codes (e.g. just keeps their gmail inbox open at work)
  • A group of people tries to set up a scheme to reliably prevent a launch; however, this grows increasingly hard and confusing and eventually escalates into one of the above failure modes
  • Several people try to set up precommitments that will guarantee a stable equilibrium; however, one of them makes a mistake and interprets the result as launching being the only way for them to follow through on it
  • Someone who feels that "this is trivial if there's no incentive to press the button" tries to "make things more interesting" and sets off a chain of events that culminates in a failure mode like the above
  • ...

Generating this list only took a few minutes and wasn't that high effort. Lots of the examples have a lot of ways of being realised.

So overall, adding a karma bounty for launching could be cool, but I don't think it's as necessary as some people think.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2019-09-27T14:04:36.194Z · LW(p) · GW(p)

All your reasons look like People Are Bad.

I disagree, FWIW. It seems to me that “desire for infamy” may be rolled into “people are bad”, but not the other two. I do not consider either personal antipathy or spite to be necessarily negative qualities.

Replies from: gjm
comment by gjm · 2019-09-27T19:26:17.293Z · LW(p) · GW(p)

I would be interested to know how you see spite as "not necessarily negative".

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2019-09-27T23:08:34.753Z · LW(p) · GW(p)

Well, I could note that reactive spite is game-theoretically correct; this is well-documented and surely familiar to everyone here.

But that would not be the important reason. In fact I take spitefulness to be a terminal value, and a shard of godshatter [LW · GW] which is absolutely critical to what humans are (and, importantly, to what I take to be the ideal of what humans are and should be).

It is not always appropriate, of course; nor even usually, no. Someone who is spiteful all or most of the time, who is largely driven by spite in their lives—this is not a pleasant person to be around, and nor would I wish to be like this. But someone who is entirely devoid of spite—who does not even understand it, who has never felt it nor can imagine feeling spite—I must wonder whether such a one is fully human.

There is an old Soviet animated short, called “Baba Yaga Is Opposed” (which you may watch in its entirety on YouTube; link to first of three episodes; each is ~10 minutes).

The plot is: it’s the 1980 Olympics in Moscow. Misha the bear has been chosen as the event’s mascot. Baba Yaga—the legendary witch-crone of Russian folklore—is watching the announcement on TV. “Why him!” she exclaims; “why him and not me!” “The entire world is in favor!” proclaims the television announcer; whereupon the witch declares: “But Baba Yaga is opposed!”—and embarks on a mad scheme to kidnap Misha and … well, it’s not clear what her plan is, exactly; but hijinks predictably ensue.

After “Baba Yaga Is Opposed” was aired in the Soviet Union, the cartoon’s title passed into the vernacular, referring to someone who opposes something, or refuses something, for no reason but a contrarian nature; a refusal to conform, on general principles; in short—spite.

I think we need such people. I think that “Baba Yaga is opposed” is, at times, all that stands between humanity and utter catastrophe and horror; and, much more often, all that stands in the way of plans and schemes that threaten to make our lives more dull and grey. We need there to be, always, people who will simply not go along with our grand plans, no matter how well-intentioned; who refuse to conform, to participate, not from any specific principles, but simply because they don’t want to. We need to know that however reasonable our arguments, some people won’t agree with us, and nothing we can say will make them agree. We need to know that we will never be able to convince everyone or to get everyone to go along.

I fear to imagine what will happen on the day when there is no Baba Yaga to stubbornly and spitefully oppose our best-laid plans; and I can only hope that the stories are true, that say she is immortal.

Replies from: quanticle
comment by quanticle · 2019-09-28T02:32:20.129Z · LW(p) · GW(p)

Indeed, Eliezer has written extensively about this very phenomenon. No argument is universally compelling -- there is no sequence of propositions so self-evident that it will cause our opponents to either agree or spontaneously combust.

comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T05:02:22.358Z · LW(p) · GW(p)

Great comment.

Side note, I occasionally make a joke that I'm sent from another part of the multiverse (colloquially, "the future") to help fix this broken fucked up instance of the universe.

The joke goes — it's not a stupid teleportation thing like Terminator; it's a really expensive two-step process to edit even a tiny bit of information in another universe. So with the right CTC relays you can edit a tiny bit of information, creating some high-variance people in a dense area, and then the only people who get their orders are people who reach a sufficient level of maturity, competence, and duty. Not everyone who we give the evolved post-sapien genetics gets their orders; the overwhelming majority fail, actually.

Now, the reason we at the Agency — in the joke, I'm on the Solar Task Force — are trying to fix this universe is because it affects other parts of the multiverse. There's a lot of stuff, but here's a simple one — the coordinates of Earth are similar in many branches. Setting off tons of nukes and beaming random stuff into space calls attention to Earth's location. I believe a game theoretic solution to the Fermi Paradox was proposed recently in SciFi and no one was paying attention. I mean, did anyone check that out? Right? Don't let Earth's coordinates get out. Jeez guys. This isn't complicated. C'mon.

Now normally things work correctly, but this particular universe came about because you idiots — I mean, not you since you weren't alive — but collectively, this idiot branch of humans took a homeless bohemian artist who was a kinda-brave messenger soldier in World War One (already a disaster but then the error compounds) and they took this loser with a bad attitude and put him in charge of a major industrial power at one of the most leveraged moments in human history. He wasn't even German! He was Austrian! And he took over the Nazi Party as only the 55th member after he was sent in as a police officer to watch the group. (Look it up on Wikipedia, is true.) Then, he tries a putsch — a coup — and it fails, and the state semi-prosecutes him, making him famous, but then lets him off easily. He turns that fame (infamy, really) into wealth, that into political power, and takes over. Then he does a ton of damage, including invading and destroying the most important city in the world at the time. Right, where are all those physicists and mathematicians from? Starts with a "B"? Used to be a monarchy? Destroyed by the Nazis? And after those people aged out and had completed their work, we went through a stagnation period for quite a while? Right? Isn't that what happened?

What a comedy of fucking errors. So much emotionalism. This branch of the universe is so incredibly fucked, I hate being here, but I'm doing my best. I like you humans, some of you are marvelous and all of you I want to succeed but man I fucking hate it here. Anyway, the first time I made this joke I was worried my CO would be pissed at me since I'm breaking rule#1, but it's actually so bad here that I didn't even get paradox warnings. (A true paradox crashes the universe, which we actually do when things are sufficiently bad and the rot is liable to spread.)

Anyway, this is just a joke. But yes, "desire for infamy" — fucking homo sapien sapiens. Evolve faster, please.

Just kidding.

(If I wanted to continue the joke, I'd say I am certainly going to get in trouble sooner or later, but this amuses the hell out of me and this is a really high stress unpleasant job. Anyway, not joking, now I'll go back to building my peak performance tech company that prompts clear thinking, intentional action, and generally more eustress and joy while eliminating distress. I'll build that into one of the largest companies on Earth while also subtly-but-not-subtly producing useful media with a lot of subtext lessons and building an elite team that does a mix of internal inventing like Bell Labs as well as diffusion PayPal Mafia style, with those people going on to also start large important prosocial institutions. After the first few billion, I'll fund better sensors for asteroid defense and bring down the cost of regular testing/monitoring bloodwork and simple "already known best practices" in biochemical regulation. Anyway, I'm just joking around cuz this amuses me and working 90-110 hours per week while in a mostly human body is very tiring. I like this whole button thing btw, this is really good. It gives me a little bit of hope. I guess hope is dangerous too though. Anyway, back to work, I'm going to teach my brilliant junior team that "there is value in writing a clear agenda of what we want to accomplish in a meeting". I'd rather be developing new branches of mathematics — I already developed one for real, it blows people's minds when I show it to them (ask me in person whenever a whiteboard is around), and I'll write it up when I have some spare time — but yeah, "we shouldn't just fuck around for no purpose in meetings" is the current level of the job. So be it. Anyway, this button thing is good, I needed this. Thanks.)

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T05:16:46.002Z · LW(p) · GW(p)

Oh in case you missed the subtext, it's a SciFi joke.

It's funny cuz it's sort of almost plausibly true and gets people thinking about what if their life had higher stakes and their decisions mattered, eh?

Obviously, it's just a silly amusing joke. And it's obviously going to look really counterproductively weird if analyzed or discussed among normal people, since they don't get nerd humor. I recommend against doing that.

Just laugh and maybe learn something.

Don't be stupid and overthink it.

Replies from: Alexei
comment by Alexei · 2019-10-02T20:45:27.158Z · LW(p) · GW(p)

I’m confused why you got downvoted so much over a joke.... sorry.

Replies from: eigen
comment by eigen · 2019-10-02T22:42:17.341Z · LW(p) · GW(p)

The fact that it's a joke is non-important; the fact that it's a bad joke is.

Maybe don't make a bad joke and then think that people cannot take it; consider that maybe it's just bad.

comment by jefftk (jkaufman) · 2019-09-26T16:43:35.978Z · LW(p) · GW(p)

[EDIT: two people with codes below have objected, so I'm not up for this trade anymore, unless we figure out a way to make a broader poll]

I have launch codes. Would anyone be interested in offering counterfactual donations to https://www.givewell.org/charities/amf? I could also be interested in counterfactual donations to nuclear war-prevention organizations.

Replies from: Raemon, tcheasdfjkl, Larks, lahwran, tcheasdfjkl, peter_hurford, tcheasdfjkl, SaidAchmiz
comment by Raemon · 2019-09-26T16:45:37.168Z · LW(p) · GW(p)

oh geez

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:07:32.115Z · LW(p) · GW(p)

"Rae, this is a friendly reminder from the universe that you can only at best control the first-order effects of systems you create..."

comment by tcheasdfjkl · 2019-09-27T06:09:41.933Z · LW(p) · GW(p)

Since the day is drawing to a close and at this point I won’t get to do the thing I wanted to do, here are some scattered thoughts about this thing.

First, my plan upon obtaining the code was to immediately repeat Jeff’s offer. I was curious how many times we could iterate this; I had in fact found another person who was potentially interested in being another link in this chain (and who was also more interested in repeating the offer than nuking the site). I told Jeff this privately but didn’t want to post it publicly (reasons: thought it would be more fun if this was a surprise; didn’t think people should put that much weight on my claimed intentions anyway; thought it was valuable for the conversation to proceed as though nuking were the likely outcome).

(In the event that nobody took me up on the offer, I still wasn’t going to nuke the site.)

Other various thoughts:

  • Having talked to some people who take this exercise very seriously indeed and some who don't understand why anyone takes it seriously at all, I find both perspectives make a lot of sense, and yet I'm having trouble explaining either one to the other. Probably I should practice passing some ITTs (Ideological Turing Tests).
  • Of the arguments raised against the trade, the one I am most sympathetic to is TurnTrout's: that it's actually very important to hold to important principles even when there's a naive utilitarian argument in favor of abandoning them. I agree very strongly with this idea.
  • But it also seems to me there's a kind of… mixing levels here? The tradeoff here is between something symbolic and something very real. I think there's a limit to the extent to which this is analogous to, like, "maintain a bright line against torture even when torture seems like the least bad choice", which I think of as the canonical example of this idea.
  • (I realize some people made arguments that this symbolic thing is actually reflective or possibly determinative of probabilistic real consequences (in which case the "mixing levels" point above is wrong). (Possibly even the arguments that didn't state this explicitly relied on the implication of this?) I guess I just… don't find that very persuasive, because, again, the extent to which this exercise is analogous to anything of real-world importance is pretty limited; the vast majority of people who would nuke LW for shits and giggles wouldn't also nuke the world for shits and giggles. Rituals and intentional exercises like these do have some power, but I think I put less stock in them than some.)
  • Relatedly, I guess I feel like if the LW devs wanted me to take this more seriously they should've made it have actual stakes; having just the front page go down for 24 hours is not actually destroying something of real value. (I don't mean to insult the devs or even the button project - I think this has been pretty great actually - it's just great in more of a "this is a fun stunt/valuable discussion starter" way than a "oh shit this is a situation where trustworthiness and reliability matter" way. (I realize that doing this in a way that had stakes would have possibly been unacceptably risky; I don't really know how to calibrate the stakes such that they both matter and are an acceptable risk.))
  • Nevertheless I am actually pleased that we’ve made it through (most of) the day without the site going down (even when someone posted (what they claim is) their code on Facebook).
  • I am more pleased than that about the discussions that have happened here. I think the discussions would have been less active and less good without a specific actual possible deal on the table, so I’m glad to have spurred a concrete proposal which I think helped pin down some discussion points that would have remained nebulous or just gone unsaid otherwise.
  • If in fact the probability of someone nuking the site is entangled with the probability of someone nuking the world (or similar), I think it’s much more likely that both share common causes than that one causes the other. If this is so, then gaining more information about where we stand is valuable even if it involves someone nuking the site (perhaps especially then?).
  • In general I think a more eventful Petrov Day is probably more valuable and informative than a less eventful one.
comment by Larks · 2019-09-26T22:10:21.171Z · LW(p) · GW(p)

I would like to add that I think this is bad (and have the codes). We are trying to build social norms around not destroying the world; you are blithely defecting against that.

Replies from: jkaufman
comment by jefftk (jkaufman) · 2019-09-26T22:20:56.086Z · LW(p) · GW(p)

I'm not doing anything unilaterally. If I do anything at this point it will be after some sort of fair polling.

comment by the gears to ascension (lahwran) · 2019-09-26T18:17:52.293Z · LW(p) · GW(p)

This seems extremely unprincipled of you :/

Replies from: jkaufman
comment by jefftk (jkaufman) · 2019-09-26T18:22:32.661Z · LW(p) · GW(p)

Clarify?

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2019-09-26T19:11:21.614Z · LW(p) · GW(p)

I thought you were threatening extortion. As it is, given that people are being challenged to uphold morality, this response is still an offer to throw that away in exchange for money, under the claim that it's moral because of some distant effect. I'd encourage you to follow Jai's example and simply delete your launch codes.

comment by tcheasdfjkl · 2019-09-26T19:49:09.247Z · LW(p) · GW(p)

yesssss shenanigans

comment by Peter Wildeford (peter_hurford) · 2019-09-26T18:08:39.989Z · LW(p) · GW(p)

Are you offering to take donations in exchange for pressing the button or not pressing the button?

Replies from: jkaufman, ramiro-p, William_S
comment by jefftk (jkaufman) · 2019-09-26T18:21:14.162Z · LW(p) · GW(p)

I would give someone my launch codes in exchange for a sufficiently large counterfactual donation.

I haven't thought seriously about how large it would need to be, because I don't expect someone to take me up on this, but if you're interested we can talk.

comment by Ramiro P. (ramiro-p) · 2019-09-28T11:23:55.680Z · LW(p) · GW(p)

I thought he was being ambiguous on purpose, so as to maximize donations.

comment by William_S · 2019-09-26T18:22:32.665Z · LW(p) · GW(p)

I think the better version of this strategy would involve getting competing donations from both sides, using some weighting of total donations for/against pushing the button to set a probability of pressing the button, and tweaking the weighting of the donations such that you expect the probability of pressing the button to be low (because pressing the button threatens to lower the probability of future games of this kind; this is an iterated game rather than a one-shot).
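(For concreteness, here is a rough sketch of how such a donation-weighted lottery could work. The 10x weighting and all dollar figures are made-up illustrations of William_S's idea, not anything anyone actually implemented.)

```typescript
// Illustrative sketch of a donation-weighted launch lottery.
// The weighting factor and example amounts below are assumptions.

interface Pledges {
  forLaunch: number;     // total counterfactual dollars pledged for pressing
  againstLaunch: number; // total counterfactual dollars pledged against pressing
}

// Weight the "against" side more heavily, so that in the typical case the
// launch probability stays low (pressing endangers future iterations of the game).
function launchProbability(p: Pledges, againstWeight = 10): number {
  const weightedAgainst = p.againstLaunch * againstWeight;
  const total = p.forLaunch + weightedAgainst;
  return total === 0 ? 0 : p.forLaunch / total;
}

// Example: $1,672 pledged for launch vs. $500 against, at 10x weighting,
// gives a launch probability of roughly 0.25.
console.log(launchProbability({ forLaunch: 1672, againstLaunch: 500 }).toFixed(2));
```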

Replies from: Jacobian, Gurkenglas
comment by Jacob Falkovich (Jacobian) · 2019-09-26T22:10:24.990Z · LW(p) · GW(p)

Agreed. I have launch codes and will donate up to $100, without counting it against my EA budget, if that prevents the nuke from being launched.

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:14:41.154Z · LW(p) · GW(p)

Nooooo you're a good person but you're promoting negotiating with terrorists literally boo negative valence emotivism to highlight third-order effects, boo, noooooo................

Replies from: Jacobian
comment by Jacob Falkovich (Jacobian) · 2019-09-27T05:34:15.376Z · LW(p) · GW(p)

As they say in the KGB, one man's nuclear terrorism is another man's charity game show.

comment by Gurkenglas · 2019-09-27T13:21:28.784Z · LW(p) · GW(p)

Participants were selected based on whether they seem unlikely to press the button, so whoever would have cared about future extortions being possible CDT-doesn't need to, because they won't be a part of it.

comment by tcheasdfjkl · 2019-09-26T20:13:00.272Z · LW(p) · GW(p)

hey actually I'm potentially interested depending on what size of donation you would consider sufficient, can you give an estimate?

Replies from: jkaufman
comment by jefftk (jkaufman) · 2019-09-26T20:29:41.774Z · LW(p) · GW(p)

Maybe a fair value would be GiveWell's best guess cost per life saved equivalent? [1] There's some harm in releasing the codes entrusted to me, but not so much that it's better for someone to die.

I would want your assurance that it really was a counterfactually valid donation, though: money you would otherwise spend selfishly, and that you would not consider part of your altruistic impact on the world.

If two other people with launch codes tell me they don't think this is a good trade then I'll retract the offer.

[1] https://www.givewell.org/how-we-work/our-criteria/cost-effectiveness/cost-effectiveness-models gives $1,672.

Replies from: CarlShulman, Raemon, mingyuan, TurnTrout, jkaufman, johnswentworth, tcheasdfjkl
comment by CarlShulman · 2019-09-26T21:12:09.173Z · LW(p) · GW(p)

I have launch codes and don't think this is good. Specifically, I think it's bad.

Replies from: Scott Garrabrant, jkaufman
comment by Scott Garrabrant · 2019-09-26T21:49:59.604Z · LW(p) · GW(p)

Did you consider the unilateralist curse before making this comment?

Do you consider it to be a bad idea if you condition on the assumption that only one other person with launch access who sees this post in the time window chooses to say it was a bad idea?

comment by jefftk (jkaufman) · 2019-09-26T21:28:30.284Z · LW(p) · GW(p)

Is the objection over the amount (there's a higher number where it would be a good trade), being skeptical of the counterfactuality of the donation (would the money really be spent fully selfishly?), or something else?

comment by Raemon · 2019-09-26T22:42:47.167Z · LW(p) · GW(p)

(others have said part of what I wanted to say, but didn't quite cover the thing I was worried about)

I see two potential objections:

  • how valuable is trust among LW users? (this is hard to quantify, but I think it is potentially quite high)
  • how persuasive "it's better than someone dying" type arguments should be.

My immediate thoughts are mostly about the second argument.

I think it's quite dangerous to leave oneself vulnerable to the second argument (for reasons Julia discusses on givinggladly.com in various posts). Yes, you can reflect upon whether every given cup of coffee is worth the dead-child-currency it took to buy it. But taken naively this is emotionally and cognitively exhausting. (It also pushes people towards a kind of frugality that isn't actually that beneficial.) The strategy of "set aside a budget for charity, based on your values, and don't feel pressure to give more after that" seems really important for living sanely while altruistic.

(I don't have a robustly satisfying answer on how to deal with that exactly, but see this comment of mine for some more expanded thoughts [LW(p) · GW(p)] of mine on this)

Now, additional counterfactual donations still seem fine to be willing to make on the fly – I've derived fuzzy-pleasure-joy from donating based on weird schemes on the Dank EA Memes FB group. But I think it is quite dangerous to feel pressure to donate to weird Dank EA Meme schemes based on "a life is at stake."

A life is always at stake. I don't think most humans can or should live this way.

Replies from: rohinmshah, lionhearted
comment by Rohin Shah (rohinmshah) · 2019-09-27T01:12:10.613Z · LW(p) · GW(p)
The strategy of "set aside a budget for charity, based on your values, and don't feel pressure to give more after that" seems really important for living sanely while altruistic.

But this situation isn't like that.

I agree you don't want to always be vulnerable to the second argument, for the reasons you give. I don't think the appropriate response is to be so hard-set in your ways that you can't take advantage of new opportunities that arise. You can in fact compare whether or not a particular trade is worth it if the situation calls for it, and a one-time situation that has an upside of $1672 for ~no work seems like such a situation.

As a meta point directed more at the general conversation than this comment in particular, I would really like it if people stated monetary values at which they would think this was a good idea. At $10, I'm at "obviously not", and at $1 million, I'm at "obviously yes". I think the range of uncertainty is something like $500 - $20,000. Currently it feels like the building of trust is being treated as a sacred value; this seems bad.

Replies from: habryka4, TurnTrout, richard-yannow, mingyuan, lionhearted
comment by habryka (habryka4) · 2019-09-27T02:03:03.296Z · LW(p) · GW(p)

My sense is that it's very unlikely to be worth it at anything below $10k, and I might be a bit tempted at around $50k, though still quite hesitant. I agree that at $1M it's very likely worth it.

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:19:55.521Z · LW(p) · GW(p)

Firm disagree. Second-order and third-order effects go limit->infinity here.

Also btw, I'm running a startup that's now looking at — best case scenario — handling significant amounts of money over multiple years.

It makes me realize that the individual-level sense of "a lot of money" is a terrible heuristic. Seriously, it's hard to get one's mind around it, but a million dollars is decidedly not a lot of money on the global scale.

For further elaboration, this is relevant and incredibly timely:

https://slatestarcodex.com/2019/09/18/too-much-dark-money-in-almonds/

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-27T18:39:44.264Z · LW(p) · GW(p)

LW frontpage going down is also not particularly bad, so you don't need much money to compensate for it.

If you wanted to convince me, you could make a case that destroying trust is really bad, and that in this particular case pressing the button would destroy a lot of trust, but that case hasn't really been made.

Replies from: lionhearted, elityre
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T22:36:12.969Z · LW(p) · GW(p)
LW frontpage going down is also not particularly bad [...] If you wanted to convince me, you could make a case that destroying trust is really bad

Umm, respectfully, I think this is extremely arrogant. Dangerously so.

Anyways, I'm being blunt here, but I hope respectfully and usefully. Think about this. Reasoning follows —

The instructions, if you got launch codes (also in the above post), were as follows (emphasis added with underline) —

"Every Petrov Day, we practice not destroying the world. One particular way to do this is to practice the virtue of not taking unilateralist action.

It’s difficult to know who can be trusted, but today I have selected a group of LessWrong users who I think I can rely on in this way. You’ve all been given the opportunity to show yourselves capable and trustworthy.

[...]

This Petrov Day, between midnight and midnight PST, if you, {{username}}, enter the launch codes below on LessWrong, the Frontpage will go down for 24 hours.

I hope to see you on the other side of this, with our honor intact."

So, to Ben Pace at least (the developer who put a tremendous amount of hours and thought into putting this together), it represents...

*"practicing not destroying the world"

*"practicing the virtue of not taking unilateralist action"

*implications around his own uncertainty of who to trust

*de facto for Ben that he can't rely on you personally, by his standards, if you do it

*showing yourself not "capable and trustworthy" by his standards

*having the total group's "honor" "not be intact", under Ben's conception

And you want me to make a case for you on a single variable while ignoring the rather clear and straightforward written instructions for your own simple reductive understanding?

For Ben at least, the button thing was a symbolic exercise analogous to not nuking another country and he specifically asked you not to and said he's trusting you.

So, no, I don't want to "convince you" nor "make a case that destroying trust is really bad." You're literally stating you should set the burden of proof and others should "make a case."

In an earlier comment you wrote,

You can in fact compare whether or not a particular trade is worth it if the situation calls for it, and a one-time situation that has an upside of $1672 for ~no work seems like such a situation.

"No work"? You mean aside from the work that Ben and the team did (a lot) and demonstrating to the world at large that the rationality community can't press a "don't destroy our own website" button to celebrate a Soviet soldier who chose restraint?

I mean, I don't even want to put numbers on it, but if we gotta go to "least common denominator", then $1672 is less than a week's salary of the median developer in San Francisco. You'd be doing a hell of a lot more damage than that to morale and goodwill, I reckon, among the dev team here.

To be frank, I think the second-order and third-order effects of this project going well on Ben Pace alone are worth more than $1672 in "generative goodness" or whatever, and the potential disappointment and loss of faith in people he "thinks but is uncertain he can rely upon and trust" is... I mean, you know that one highly motivated person leading a community can make an immense difference, right?

Just so you can get $1672 for charity ("upside") with "~no work"?

And that's just productivity, ignoring any potential negative affect or psychological distress, and being forced to reevaluate who he can trust. I mean, to pick a more taboo example, how many really nasty personal insults would you shout at a random software developer for $1672 to charity? That's almost "no work" — it's just you shouting some words, and whatever psychological distress they feel, and I wager the distress of getting random insults from a stranger is much lower than that of having people you "are relying on and trusting" press a "don't nuke the world" simulator button.

Like, if you just read what Ben wrote, you'd realize that risking destroying goodwill and faith in a single motivated innovative person alone should be priced well over $20k. I wouldn't have done it for $100M going to charity. Seriously.

If you think that's insane, stop and think about why our numbers are four orders of magnitude apart — our priors must obviously be very different. And based on the comments, I'm taking into account more things than you, so you might be missing something really important.

(I could go on forever about this, but here's one more: what's the difference in your expected number of people discovering and getting into basic rationality, cognitive biases, and statistics between the "failed at 'not destroying the world day' commemoration" outcome and the alternative? Mine: high. What's the value of more people thinking and acting rationally? Mine: high. So multiply the delta by the value. That's just one more thing. There's a lot you're missing. I don't mean this disrespectfully, but maybe think more instead of "doing you" on a quick timetable?)

(Here's another one you didn't think about: we're celebrating a Soviet engineer. Run this headline in a Russian newspaper: "Americans try to celebrate Stanislav Petrov by not pressing 'nuke their own website' button, arrogant American pushes button because money isn't donated to charity.")

(Here's another one you didn't think about: I'll give anyone 10:1 odds this is cited in a mainstream political science journal within 15 years — journals that are read by people who both set and advise on policy — and "group of mostly American and European rationalists couldn't not nuke their own site" absolutely is the type of thing to shape policy discussions ever-so-slightly.)

(Here's another one you didn't think about: some fraction of the people here are active-duty or reserve military in various countries. How does this going one way or another shape their kill/no-kill decisions in ambiguous warzones? Have you ever read any military memoirs by people who had to make those calls quickly, e.g. overwatch snipers in Mogadishu? No?)

(Not meant to be snarky — Please think more and trust your own intuition less.)

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-28T01:04:28.000Z · LW(p) · GW(p)

Thanks for writing this up. It's pretty clear to me that you aren't modeling me particularly well, and that it would take a very long time to resolve this, which I'm not particularly willing to do right now.

I'll give anyone 10:1 odds this is cited in a mainstream political science journal within 15 years, which are read by people who both set and advise on policy

I'll take that bet. Here's a proposal: I send you $100 today, and in 15 years if you can't show me an article in a reputable mainstream political science journal that mentions this event, then you send me an inflation-adjusted $1000. This is conditional on finding an arbiter I trust (perhaps Ben) who will:

  • Adjudicate whether it is an "article in a reputable mainstream political science journal that mentions this event"
  • Compute the inflation-adjusted amount, should that be necessary (see the sketch after this list)
  • Vouch that you are trustworthy and will in fact pay in 15 years if I win the bet.
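(The inflation adjustment in the second bullet is mechanical; here is a minimal sketch with placeholder index values — an actual settlement would use the real CPI figures.)

```typescript
// Sketch of the inflation adjustment for the bet's $1,000 payout.
// The CPI values below are placeholders, not real data.
function inflationAdjusted(amount: number, cpiStart: number, cpiEnd: number): number {
  return amount * (cpiEnd / cpiStart);
}

// E.g. if the price index rose from 256 to 330 over the 15 years,
// the $1,000 owed becomes about $1,289.
console.log(inflationAdjusted(1000, 256, 330).toFixed(2)); // "1289.06"
```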
comment by Eli Tyre (elityre) · 2019-09-27T19:03:34.942Z · LW(p) · GW(p)
If you wanted to convince me, you could make a case that destroying trust is really bad, and that in this particular case pressing the button would destroy a lot of trust, but that case hasn't really been made.

This basically seems right to me.

Replies from: habryka4
comment by habryka (habryka4) · 2019-09-27T20:16:37.195Z · LW(p) · GW(p)

Which part of the two statements? That destroying trust is really bad, or that the case hasn't been made?

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-09-28T12:42:35.762Z · LW(p) · GW(p)

That this particular case would destroy a lot of trust.

This seemed to me like a fun game with stakes of social disapproval on one side, and basically no stakes on the other. This doesn't seem like it has much bearing on the trustworthiness of members of the rationality community in situations with real stakes, where there is a stronger temptation to defect or where defecting would impose more of a cost on the community.

I guess implicit to what I'm saying is that the front page being down for 24 hours doesn't seem that bad to me. I don't come to Less Wrong most days anyway.

comment by TurnTrout · 2019-09-27T01:52:05.916Z · LW(p) · GW(p)

But this is not a one-time situation. If you're a professional musician, would you agree to mess up at every dress rehearsal, because it isn't the real show?

More indirectly... the whole point of "celebrating and practicing our ability to not push buttons" is that we need to be able to not push buttons, even when it seems like a good idea (or necessary, or urgent that we defect while we can still salvage the perceived situation). The vast majority of people aren't tempted by pushing a button when pushing it seems like an obviously bad idea. I think we need to take trust building seriously, and practice the art of actually cooperating. Real life doesn't grade you on how well you understand TDT considerations and how many blog posts you've read on it; it grades you on whether you actually can make the cooperation equilibrium happen.

Replies from: jp, rohinmshah
comment by jp · 2019-09-27T03:03:05.538Z · LW(p) · GW(p)

Rohin argues elsewhere for taking a vote (at least in principle). If 50% vote in favor, then he has successfully avoided "falling into the unilateralist's curse" and has gotten $1.6k for AMF. He even gets some bonus for solving the unilateralist's curse in a way that's not just "sit on his hands". Now, it's probably worth subtracting points for "the LW team asked them not to blow up the site and the community decided to anyway." But I'd consider it fair play.

comment by Rohin Shah (rohinmshah) · 2019-09-27T18:03:57.953Z · LW(p) · GW(p)
If you're a professional musician, would you agree to mess up at every dress rehearsal, because it isn't the real show?

Depends on the upside.

I think we need to take trust building seriously, and practice the art of actually cooperating.

This comment of mine was meant to address the claim "people shouldn't be too easily persuaded by arguments about people dying" (the second claim in Raemon's comment above). I agree that intuitions like this should push up the size of the donation you require.

More indirectly... the whole point of "celebrating and practicing our ability to not push buttons" is that we need to be able to not push buttons, even when it seems like a good idea (or necessary, or urgent that we defect while we can still salvage the perceived situation). The vast majority of people aren't tempted by pushing a button when pushing it seems like an obviously bad idea.

As jp mentioned, I think the ideal thing to do is: first, each person figures out whether they personally think the plan is positive / negative, and then go with the majority opinion. I'm talking about the first step here. The second step is the part where you deal with the unilateralist curse.

Real life doesn't grade you on how well you understand TDT considerations and how many blog posts you've read on it, it grades you on whether you actually can make the cooperation equilibrium happen.

It seems to me like the algorithm people are following is: if an action would be unilateralist, and there could be disagreement about its benefit, don't take the action. This will systematically bias the group towards inaction. While this is fine for low-stakes situations, in higher-stakes situations where the group can invest effort, you should actually figure out whether it is good to take the action (via the two-step method above). We need to be able to take irreversible actions; the skill we should be practicing is not "don't take unilateralist actions", it's "take unilateralist actions only if they have an expected positive effect after taking the unilateralist curse into account".

We never have certainty, not for anything in this world. We must act anyway, and deciding not to act is also a choice. (Source)

Replies from: TurnTrout
comment by TurnTrout · 2019-09-28T01:37:35.268Z · LW(p) · GW(p)

It seems to me like the algorithm people are following is: if an action would be unilateralist, and there could be disagreement about its benefit, don't take the action. This will systematically bias the group towards inaction. While this is fine for low-stakes situations, in higher-stakes situations where the group can invest effort, you should actually figure out whether it is good to take the action (via the two-step method above). We need to be able to take irreversible actions; the skill we should be practicing is not "don't take unilateralist actions", it's "take unilateralist actions only if they have an expected positive effect after taking the unilateralist curse into account".

I don’t disagree with this, and am glad to see reminders to actually evaluate different courses of action besides the one expected of us. My comment was more debating your own valuation as being too low, it not being a one-off event once you consider scenarios either logically or causally downstream of this one, and just a general sense that you view the consequences of this event as quite isolated.

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-28T05:03:44.152Z · LW(p) · GW(p)
my comment was more debating your own valuation as being too low, it not being a one-off event once you consider scenarios either logically or causally downstream of this one

That makes sense. I don't think I'm treating it as a one-off event; it's more that it doesn't really seem like there's much damage to the norm. If a majority of people thought it was better to take the counterfactual donation, it seems like the lesson is "wow, we in fact can coordinate to make good decisions", as opposed to "whoops, it turns out rationalists can't even coordinate on not nuking their own site".

comment by Richard Yannow (richard-yannow) · 2019-09-27T04:33:51.436Z · LW(p) · GW(p)

jkaufman's initial offer was unclear. I read it (incorrectly) as "I will push the button (/release the codes) unless someone gives AMF $1672 counterfactually", not as "if someone is willing to pay me $1672, I will give them the codes". Read in the first way, Raemon's concerns about "pressure" as opposed to additional donations made on the fly may be clearer; it's not about jkaufman's opportunity to get $1672 in donations for no work, it's about everyone else being extorted for an extra $1672 to preserve their values.

comment by mingyuan · 2019-09-27T02:42:33.961Z · LW(p) · GW(p)

Perhaps a nitpick, but I feel like the building of trust is being treated less as a sacred value, and more as a quantity of unknown magnitude, with some probability that that magnitude could be really high (at least >$1672, possibly orders of magnitude higher). Doing a Fermi is a trivial inconvenience that I for one cannot handle right now; since it is a weekday, maybe others feel much the same.

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-27T18:10:24.908Z · LW(p) · GW(p)

I agree that your comment takes this (very reasonable) perspective. It didn't seem to me like any other comment was taking this perspective, but perhaps that was their underlying model.

comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:16:57.549Z · LW(p) · GW(p)

I wouldn't do it for $100M.

Seriously.

Because it increases the marginal chance that humanity goes extinct ever-so-slightly.

If you have launch codes, wait until tomorrow to read the last part eh? —

(V zrna, hayrff lbh guvax gur rkcrevzrag snvyvat frpergyl cebzbgrf pnhgvba naq qrfgeblf bcgvzvfz, juvpu zvtug or gehr.)

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-27T18:05:54.853Z · LW(p) · GW(p)

Why couldn't you use the $100M to fund x-risk prevention efforts?

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T23:04:56.485Z · LW(p) · GW(p)

Well, why stop there?

World GDP is $80.6 trillion.

Why doesn't the United States threaten to nuke everyone if they don't give a very reasonable 20% of their GDP per year to fund X-Risk — or whatever your favorite worthwhile projects are?

Screw it, why don't we set the bar at 1%?

Imagine you're advising the U.S. President (it's Donald Trump right now, incidentally). Who should President Trump threaten with nuking if they don't pay up to fund X-Risk? How much?

Now, let's say 193 countries do it, and $X trillion is coming in and doing massive good.

Only Switzerland and North Korea defect. What do you do? Or rather, what do you advise Donald Trump to do?

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-28T01:12:07.109Z · LW(p) · GW(p)

I never suggested threats, and in fact I don't think you should threaten to press the button unless someone makes a counterfactual donation of $1,672.

Jeff's original comment was also not supposed to be a threat, though it was ambiguous. All of my comments are talking about the non-threat version.

comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:11:59.292Z · LW(p) · GW(p)

Dank EA Memes? What? Really? How do I get in on this?

(Serious.)

(I shouldn't joke "I have launch codes" — that's grossly irresponsible for a cheap laugh — but umm, I just meta made the joke.)

Replies from: lionhearted, tetraspace-grouping
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:23:09.465Z · LW(p) · GW(p)

Note to self: Does lighthearted dark humor highlighting risk increase or decrease chances of bad things happening?

Initial speculation: it might have an inverted response curve. One or two people making the joke might increase gravity, everyone joking about it might change norms and salience.

Replies from: adele-lopez-1
comment by Adele Lopez (adele-lopez-1) · 2019-09-27T04:46:21.931Z · LW(p) · GW(p)

I noticed after playing a bunch of games of a mafia-type game with some rationalists that when people made edgy jokes about being in the mob or whatever, they were more likely to end up actually being in the mob.

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T23:07:12.804Z · LW(p) · GW(p)

There's rationalists who are in the mafia?

Whoa.

No insightful comment, just, like — this Petrov thread is the gift that keeps on giving.

Replies from: interstice
comment by interstice · 2019-09-28T00:50:54.455Z · LW(p) · GW(p)

Can't tell if joking, but they probably mean that they were "actually in the mafia" in the game, so not in the real-world mafia.

comment by Tetraspace (tetraspace-grouping) · 2019-09-28T01:09:36.183Z · LW(p) · GW(p)

Dank EA Memes is a Facebook group. It's pretty good.

comment by mingyuan · 2019-09-26T21:03:09.765Z · LW(p) · GW(p)

(I have launch codes and am happy to prove it to you if you want.)

Hmmm, I feel like the argument "There's some harm in releasing the codes entrusted to me, but not so much that it's better for someone to die" might prove too much? Like, death is really bad, I definitely grant that. But despite the dollar amount you gave, I feel like we're sort of running up against a sacred value thing. I mean, you could just as easily say, "There's some harm in releasing the codes entrusted to me, but not so much that it's better for someone to have a 10% chance of dying" - which would naïvely bring your price down to $167.20.

If you accept as true that that argument should be equally 'morally convincing', then you end up in a position where the only reasonable thing to do is to calculate exactly how much harm you actually expect to be done by you pressing the button. I'm not going to do this because I'm at work and it seems complicated (what is the disvalue of harm to the social fabric of an online community that's trying to save the world, and operates largely on trust? perhaps it's actually a harmless game, but perhaps it's not, hard to know - seems like the majority of effects would happen down the line).
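(A trivial sketch of the naive scaling in the paragraphs above — GiveWell's $1,672 figure comes from jefftk's footnote earlier in the thread; the rest is illustrative.)

```typescript
// GiveWell's best-guess cost per life saved, from jefftk's footnote above.
const costPerLifeSaved = 1672;

// Naive scaling: the "fair price" for accepting probability p of one death.
function fairPrice(pDeath: number): number {
  return costPerLifeSaved * pDeath;
}

console.log(fairPrice(1.0)); // 1672
console.log(fairPrice(0.1)); // 167.2 -- the 10% example above
```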

Additionally, I could just counter-offer a $1,672 counterfactual donation to GiveWell for you to not press the button. I'm not committing to do this, but I might do so if it came down to it.

Replies from: jkaufman
comment by jefftk (jkaufman) · 2019-09-26T21:25:58.912Z · LW(p) · GW(p)

Are you telling me you don't think this is a good trade?

Replies from: mingyuan
comment by mingyuan · 2019-09-26T21:55:15.493Z · LW(p) · GW(p)

Wasn't totally sure when I wrote it, but now firmly yes.

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:09:37.036Z · LW(p) · GW(p)

This whole thread is awesome. This is maybe the best thing that's happened on LessWrong since Eliezer more-or-less went on hiatus.

Huge respect to everyone. This is really great. Hard but great. Actually it's great because it's hard.

comment by TurnTrout · 2019-09-26T20:58:35.979Z · LW(p) · GW(p)

I'm leaning towards this not being a good trade, even though it's taxing to type that.

In the future, some people will find themselves in situations not too unlike this, where there are compelling utilitarian reasons for pressing the button.

Look, the system should be corrigible. It really, really should; the safety team's internal prediction market had some pretty lopsided results. There are untrustworthy actors with capabilities similar to or exceeding ours. If we press the button, it probably goes better than if they press it. And they can press it. Twenty people died since I started talking, more will die if we don't start pushing the world in a better direction, and do you feel the crushing astronomical weight of the entire future's eyes upon us? Even a small probability increase in a good outcome makes pressing the button worth it.

And I think your policy should still be to not press the button to launch a singleton from this epistemic state, because we have to be able to cooperate! You don't press buttons at will, under pressure, when the entire future hangs in the balance! If we can't even cooperate, right here, right now, under much weaker pressures, what do we expect of the "untrustworthy actors"?

So how about people instead donate to charity in celebration of not pressing the button?

ETA: I have launch codes btw.

comment by jefftk (jkaufman) · 2019-09-26T20:34:47.978Z · LW(p) · GW(p)

Oh: and to give those potential other people time to object, I won't accept an offer until 2 hours after I posted the parent comment (4:30 Boston time)

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-26T21:27:44.438Z · LW(p) · GW(p)

The normal way to resolve unilateralist curse effects is to see how many people agree / disagree, and go with the majority. (Even if the action is irreversible, as long as everyone knows that and has taken that into account, going with the majority seems fine.)

  • Pro: it saves an expected life.
  • Con: LW frontpage probably goes down for a day.
  • Con: it causes some harm to trust.
  • Pro: it reinforces the norm of actually considering consequences, and not holding any value too sacred.

Overall I lean towards the benefits outweighing the costs, so I support this offer.

ETA: I also have codes.

Replies from: John_Maxwell_IV, jkaufman, DanielFilan
comment by John_Maxwell (John_Maxwell_IV) · 2019-09-27T03:04:58.195Z · LW(p) · GW(p)

Pro: It reinforces the norm of actually considering consequences, and not holding any value too sacred.

Not an expert here, but my impression was sometimes it can be useful to have "sacred values" in certain decision-theoretic contexts (like "I will one-box in Newcomb's Problem even if consequentialist reasoning says otherwise"?) If I had to choose a sacred value to adopt, cooperating in epistemic prisoners' dilemmas actually seems like a relatively good choice?

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-27T18:29:55.181Z · LW(p) · GW(p)
I will one-box in Newcomb's Problem even if consequentialist reasoning says otherwise

I don't think of Newcomb's problem as being a disagreement about consequentialism; it's about causality. I'd mostly agree with the statement "I will one-box in Newcomb's Problem even if causal reasoning says otherwise" (though really I would want to add more nuance).

I feel relatively confident that most decision theorists at MIRI would agree with me on this.

If I had to choose a sacred value to adopt, cooperating in epistemic prisoners' dilemmas actually seems like a relatively good choice?

In a real prisoner's dilemma [LW · GW], you get defected against if you do that. You also need to take into account how the other player reasons. (I don't know what you mean by epistemic prisoner's dilemmas, perhaps that distinction is important.)

I also want to note that "take the majority vote of the relevant stakeholders" seems to be very much in line with "cooperating in epistemic prisoner's dilemmas", so if the offer did go through, I would expect this to strengthen that particular norm. See also this comment [LW(p) · GW(p)].

my impression was sometimes it can be useful to have "sacred values" in certain decision-theoretic contexts

I would not put it this way. It depends on what future situations you expect to be in. You might want to keep honesty as a sacred value, and tell an ax-murderer where your friend is, if you think that one day you will have to convince aliens that we do not intend them harm in order to avert a huge war. Most of us don't expect that, so we don't keep honesty as a sacred value. Ultimately it does all boil down to consequences.

comment by jefftk (jkaufman) · 2019-09-26T22:02:06.178Z · LW(p) · GW(p)

If we could figure out some reasonable way to poll people, I'd agree, but I don't see a good way to do that, especially not on this timescale?

Replies from: DanielFilan
comment by DanielFilan · 2019-09-26T22:03:20.959Z · LW(p) · GW(p)

Presumably you could take the majority vote of comments left in a 2 hour span?

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-26T22:26:18.261Z · LW(p) · GW(p)

^ Yeah, that.

The policy of "if two people object then the plan doesn't go through" sets up a unilateralist-curse scenario for the people against the plan -- after the first person says no, every future person is now able to unilaterally stop the plan, regardless of how many people are in favor of it. (See also Scott's comment [LW(p) · GW(p)].) Ideally we'd avoid that; majority vote of comments does so (and seems like the principled solution).

(Though at this point it's probably moot given the existing number of nays.)
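(For concreteness, a sketch contrasting the two decision rules under discussion — the vote representation and the example tallies are illustrative assumptions, not anything the LW team implemented.)

```typescript
type Vote = "yea" | "nay";

// Jeff's stated rule: any two objections kill the plan,
// no matter how many people are in favour.
function twoObjectionsRule(votes: Vote[]): boolean {
  return votes.filter((v) => v === "nay").length < 2;
}

// Rohin's proposed rule: a simple majority of those who voted.
function majorityRule(votes: Vote[]): boolean {
  const yeas = votes.filter((v) => v === "yea").length;
  return yeas > votes.length / 2;
}

// With 10 in favour and 2 against, the two rules disagree:
const votes: Vote[] = [...Array(10).fill("yea"), "nay", "nay"];
console.log(twoObjectionsRule(votes)); // false -- plan blocked
console.log(majorityRule(votes));      // true  -- plan proceeds
```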

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:29:26.119Z · LW(p) · GW(p)

Let's, for the hell of it, assume real money got involved. Like, it was $50M or something.

Now — who would you want to be able to vote on whether destruction happens if their values aren't met with that amount of money at stake?

If it's the whole internet, most people will treat it as entertainment or competition as opposed to considering what we actually care about.

But if we're going to limit it only to people who are thoughtful, that invalidates the point of majority vote, doesn't it?

Think about it — I'm not going to write out all the implications, but I think your faith in crowdsourced voting mechanisms, for things with a known short-term payoff set against unknown long-term costs that destroy unknown long-term gains, is perhaps misplaced...?

Most people are — factually speaking — not educated on all relevant topics, not fully numerate on statistics and payoff calculations, go with their feelings instead of analysis, and are short-term thinkers...

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-09-27T18:31:35.619Z · LW(p) · GW(p)

I agree that in general this is a problem, but I think in this particular case we have the obvious choice of the set of all people with launch codes.

(Btw, your counterargument also applies to the unilateralist curse itself.)

comment by DanielFilan · 2019-09-26T21:33:24.340Z · LW(p) · GW(p)

I'm surprised that LW being down for a day isn't on your list of cons. [ETA: or rather the LW home page]

Replies from: peter_hurford, rohinmshah, tcheasdfjkl
comment by Peter Wildeford (peter_hurford) · 2019-09-26T21:37:08.972Z · LW(p) · GW(p)

It could also be on the list of pros, depending on how one uses LW.

Replies from: Raemon
comment by Raemon · 2019-09-26T21:40:52.891Z · LW(p) · GW(p)

I feel obligated to note that it will in fact only destroy the frontpage of LW, not the rest of the site.

Replies from: jacobjacob
comment by jacobjacob · 2019-09-27T10:18:55.885Z · LW(p) · GW(p)

Ah. I thought it was the entire site. (Though it did say "Frontpage" in the post.)

comment by Rohin Shah (rohinmshah) · 2019-09-26T21:36:24.040Z · LW(p) · GW(p)

Good point, added, doesn't change the conclusion.

comment by tcheasdfjkl · 2019-09-26T22:36:39.201Z · LW(p) · GW(p)

I'll note that giving someone the launch codes merely increases the chance of the homepage going down.

comment by johnswentworth · 2019-09-26T20:55:59.097Z · LW(p) · GW(p)

I dunno, one life seems like a pretty expensive trade for the homepage staying up for a day. I bet a potential buyer could shop around and obtain launch codes for half a life.

Not saying I'd personally give up my launch code at the very reasonable cost of $836. But someone could probably be found. Especially if the buyer somehow found a way to frame someone else for the launch.

(Of course, now this comment is sitting around in plain view of everyone, the launch codes would have to come from someone other than me, even accounting for the framing.)

comment by tcheasdfjkl · 2019-09-26T20:53:06.895Z · LW(p) · GW(p)

this makes sense. I shall consider whether it makes sense for me to impulse-spend this amount of money on shenanigans (and lifesaving)

Replies from: jkaufman, tcheasdfjkl
comment by jefftk (jkaufman) · 2019-09-27T00:15:43.234Z · LW(p) · GW(p)

If you're considering it as spending on lifesaving then it doesn't sound counterfactual?

Replies from: tcheasdfjkl, jp
comment by tcheasdfjkl · 2019-09-27T00:48:09.047Z · LW(p) · GW(p)

I'm pretty sure it is? I had already decided on & committed to a donation amount for 2019, and this would be in addition to that. The lifesaving part is relevant insofar as I am happier about the prospect of this trade than I would be about paying the same amount to an individual.

The only way in which I could imagine this not being perfectly counterfactual is that, given that discretionary spending choices depend somewhat on my finances at any given point, and given that large purchases have some impact on my finances, it may be that if some other similar opportunity presented itself later on, my decision re: that opportunity could have some indirect causal connection to my current decision (not in the direct sense of "oh I already donated last month so I won't now" but just in the sense of "hmm how much discretionary-spending money do I currently have and, given that, do I want to spend $X on Y"). I'm not sure it's really ever possible to get rid of that, though.

comment by jp · 2019-09-27T00:31:55.244Z · LW(p) · GW(p)

It could be partially motivated by lifesaving but still such that they wouldn't have donated otherwise. Like, not if they're a perfectly rational agent, but hey.

comment by tcheasdfjkl · 2019-09-26T22:08:41.361Z · LW(p) · GW(p)

If someone else with codes wants to make this offer now that Jeff has withdrawn his, I'm now confident I am up for this.

Replies from: mingyuan
comment by mingyuan · 2019-09-26T23:03:52.088Z · LW(p) · GW(p)

I preemptively counter-offer whatever amount of money tcheasdfjkl would pay in order for this hypothetical person not to press the button.

Replies from: tcheasdfjkl
comment by tcheasdfjkl · 2019-09-26T23:24:47.499Z · LW(p) · GW(p)

To be clear I am NOT looking for people to press the button, I am looking for people to give me launch codes.

Replies from: mingyuan
comment by mingyuan · 2019-09-26T23:29:48.000Z · LW(p) · GW(p)

Oh wow, I did not realize how ambiguous the original wording was.

comment by Said Achmiz (SaidAchmiz) · 2019-09-27T03:05:52.694Z · LW(p) · GW(p)

Forgive me if I’m being dense, but just what in the world is a “counterfactual donation”?

Replies from: habryka4
comment by habryka (habryka4) · 2019-09-27T03:09:18.244Z · LW(p) · GW(p)

Jeff does conveniently have a blogpost on this: https://www.jefftk.com/p/what-should-counterfactual-donation-mean

Replies from: gjm
comment by gjm · 2019-09-27T19:33:42.476Z · LW(p) · GW(p)

It seems extremely unfortunate that the terminology apparently shifted from "counterfactually valid" (which means the right thing) to "counterfactual" (which means almost the opposite of the right thing).

Replies from: Raemon
comment by Raemon · 2019-09-27T20:32:23.358Z · LW(p) · GW(p)

Do you have a suggestion for terminology that properly truncates? (i.e. I think it's basically impossible to expect a long phrase to end up being the one people regularly use, so if you want to fix that issue you need a single word that does the job)

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2019-09-28T04:41:38.663Z · LW(p) · GW(p)

"Additional donation" seems like the obvious choice in place of "counterfactual donation", since we just mean "additional to what you would have donated anyway", right? (The very obviousness makes me think maybe there's a downside to the term that I'm not seeing, or I'm confused in some other way.)

Replies from: tcheasdfjkl
comment by tcheasdfjkl · 2019-09-28T04:46:29.927Z · LW(p) · GW(p)

Sounds pragmatically weird in the case where the person isn't known to already be donating.

comment by Tetraspace (tetraspace-grouping) · 2019-09-26T23:37:24.075Z · LW(p) · GW(p)

I.

Clicking on the button permanently switches it to a state where it's pushed-down, below which is a prompt to enter launch codes. When moused over, the pushed-down button has the tooltip "You have pressed the button. You cannot un-press it." Screenshot.

(On an unrelated note, on r/thebutton I have a purple flair that says "60s".)

Upon entering a string longer than 8 characters, a button saying "launch" appears below the big red button. Screenshot.
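(A minimal sketch consistent with the mechanics described above — the element ids, the storage key, and the exact threshold check are guesses, not the LW team's actual implementation.)

```typescript
// Sketch of the observed button behavior; assumes an HTML page with these ids.
const button = document.querySelector<HTMLButtonElement>("#petrov-button")!;
const codeField = document.querySelector<HTMLInputElement>("#launch-code")!;
const launch = document.querySelector<HTMLButtonElement>("#launch")!;

button.addEventListener("click", () => {
  // Pressing is one-way: record it so the pressed state survives reloads.
  button.classList.add("pressed");
  button.title = "You have pressed the button. You cannot un-press it.";
  localStorage.setItem("petrovButtonPressed", "true");
  codeField.hidden = false; // reveal the launch-code prompt
});

codeField.addEventListener("input", () => {
  // The "launch" button only appears once the entered string
  // is longer than 8 characters.
  launch.hidden = codeField.value.length <= 8;
});
```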

II.

I'm nowhere near the PST timezone, so I wouldn't be able to reliably pull a shenanigan whereby if I had the launch codes I would enter or not enter them depending on the amount of counterfactual money pledged to the Ploughshares Fund in the name of either launch-code-entry-state, but this sentence is not apophasis.

III.

Conspiracy theory: There are no launch codes. People who claim to have launch codes are lying. The real test is whether people will press the button at all. I have failed that test. I came up with this conspiracy theory ~250 milliseconds after pressing the button.

IV. (Update)

I can no longer see the button when I am logged in. Could this mean that I have won?

Replies from: Scott Garrabrant
comment by Scott Garrabrant · 2019-09-27T00:09:54.038Z · LW(p) · GW(p)
Conspiracy theory: There are no launch codes. People who claim to have launch codes are lying. The real test is whether people will press the button at all. I have failed that test. I came up with this conspiracy theory ~250 milliseconds after pressing the button.

Oh no! Someone is wrong on the internet, and I have the ability to prove them wrong...

Replies from: TurnTrout, tetraspace-grouping
comment by TurnTrout · 2019-09-27T00:22:26.628Z · LW(p) · GW(p)

Replies from: Raemon
comment by Raemon · 2019-09-27T00:22:56.625Z · LW(p) · GW(p)
Replies from: jp
comment by jp · 2019-09-27T00:25:01.358Z · LW(p) · GW(p)

Replies from: tetraspace-grouping
comment by Tetraspace (tetraspace-grouping) · 2019-09-27T00:38:46.380Z · LW(p) · GW(p)

To make sure I have this right and my LW isn't glitching: TurnTrout's comment is a Drake meme, and the two other replies in this chain are actually blank?

Replies from: Ruby
comment by Ruby · 2019-09-27T00:40:57.162Z · LW(p) · GW(p)








Replies from: jimrandomh
comment by jimrandomh · 2019-09-27T01:27:54.304Z · LW(p) · GW(p)




?


!


Replies from: jimrandomh
comment by jimrandomh · 2019-09-27T02:56:53.227Z · LW(p) · GW(p)

(This thread is our collective reenactment of the conversations about nuclear safety that happened during the cold war.)

comment by Tetraspace (tetraspace-grouping) · 2019-09-27T00:35:05.499Z · LW(p) · GW(p)

Well, at least we have a response to the doubters' "why would anyone even press the button in this situation?"

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T04:38:16.834Z · LW(p) · GW(p)

https://en.wikipedia.org/wiki/Pandora#Works_and_Days

The more famous version of the Pandora myth comes from another of Hesiod's poems, Works and Days. In this version of the myth, Hesiod expands upon her origin, and moreover widens the scope of the misery she inflicts on humanity. As before, she is created by Hephaestus, but now more gods contribute to her completion: Athena taught her needlework and weaving; Aphrodite "shed grace upon her head and cruel longing and cares that weary the limbs"; Hermes gave her "a shameful mind and deceitful nature"; Hermes also gave her the power of speech, putting in her "lies and crafty words"; Athena then clothed her; next Persuasion and the Charites adorned her with necklaces and other finery; the Horae adorned her with a garland crown. Finally, Hermes gives this woman a name: Pandora – "All-gifted" – "because all the Olympians gave her a gift". (In Greek, Pandora has an active rather than a passive meaning; hence, Pandora properly means "All-giving." The implications of this mistranslation are explored in "All-giving Pandora: mythic inversion?" below.) In this retelling of her story, Pandora's deceitful feminine nature becomes the least of humanity's worries. For she brings with her a jar (which, due to textual corruption in the sixteenth century, came to be called a box) containing "burdensome toil and sickness that brings death to men", diseases and "a myriad other pains". Prometheus had (fearing further reprisals) warned his brother Epimetheus not to accept any gifts from Zeus. But Epimetheus did not listen; he accepted Pandora, who promptly scattered the contents of her jar. As a result, Hesiod tells us, "the earth and sea are full of evils".

What's in the box? What's in the box? Don't open it! Oh, shit...

(Grace, longing and care, and being gifted causes the box to be opened. It's like history just keeps repeating itself or something...)

comment by Ben Pace (Benito) · 2019-09-27T07:01:30.961Z · LW(p) · GW(p)

"And on that day, the curse was lifted."

comment by Error · 2019-09-26T18:20:56.688Z · LW(p) · GW(p)

How did you implement the button? I run a small site, love the idea, and would like to do something similar.

comment by tcheasdfjkl · 2019-09-27T19:42:31.544Z · LW(p) · GW(p)

Can we have a recap from the mods of how Petrov Day went? How many people pressed the button, how many people tried entering anything in the launch code field, how many people tried the fake launch code posted on Facebook in particular?

Replies from: Benito
comment by Ben Pace (Benito) · 2019-09-27T19:43:50.885Z · LW(p) · GW(p)

Currently writing that post :)

Added: Will post it sometime today, but probably later on.

Replies from: lionhearted
comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-27T23:11:59.117Z · LW(p) · GW(p)

You guys are total heroes. Full stop. In the 1841 "On Heroes" sense of the word, which is actually pretty well-defined. (Good book, btw.)

comment by ryan_b · 2019-09-27T18:03:19.692Z · LW(p) · GW(p)

Generic feedback:

I had launch codes. I had hidden the map previously in my settings, which also had the effect of hiding the button, which in turn was enough to screen off any "buttons should be pressed" and "would this really work?" temptations.

I did keep checking the site to see if it went down, though.

comment by Quirinus_Quirrell · 2019-09-26T18:17:06.933Z · LW(p) · GW(p)

I have the launch codes. I'll take the site down unless Eliezer Yudkowsky publicly commits to writing a sequel chapter to HPMoR, in which I get an acceptably pleasant ending, by 9pm PST.

Replies from: Benito, DanielFilan, None
comment by Ben Pace (Benito) · 2019-09-26T19:05:13.779Z · LW(p) · GW(p)
The enemy is smart.
"The enemy knew perfectly well that you'd check whose launch codes were entered, especially since the nukes being set off at all tells us that someone can appear falsely trustworthy." Ben shut his eyes, thinking harder, trying to put himself into the enemy's shoes. Why would he, or his dark side, have done something like - "We're meant to conclude that the enemy has the launch codes. But that's actually something the enemy can only do with difficulty, or under special conditions; they're trying to create a false appearance of omnipotence." Like I would. "Later, hypothetically, the nukes actually get fired. We think it was Quirinus_Quirrell firing it, but really, it was just someone firing it independently."
"Unless that is precisely what Quirinus_Quirrell expects us to think," said Jim Babcock, his brow furrowed in concentration. "In which case he does have the launch codes, as well as the other person."
"Does Quirinus_Quirrell really use plots with that many levels of meta -"
"Yes," said Habryka and Jim.
Ben nodded distantly. "Then this could be a setup to either make us think the personalised launch codes are telling the truth about his identity when they're lying, or a setup to make us think the codes are lying when they're telling the truth, depending on what level the enemy expects us to reason at. But if the enemy is planning to make us trust the personalised codes - we would have trusted the personalised codes anyway, if we'd been given no reason to distrust them. So there's no need to go to all the work of framing another user in a way that we would realize we were intended to discover, just to trick us into going meta -"
comment by DanielFilan · 2019-09-26T18:24:32.596Z · LW(p) · GW(p)

(FYI California is currently in the PDT time zone, not PST)

comment by [deleted] · 2019-09-27T01:49:34.592Z · LW(p) · GW(p)

.

Replies from: habryka4
comment by habryka (habryka4) · 2019-09-27T02:01:36.384Z · LW(p) · GW(p)

The site will go down for a full 24 hours after the button is pressed and correct launch codes are entered (not that that is the most important aspect of this situation, but I figured I would clarify anyways)

comment by johnswentworth · 2019-09-26T16:06:18.104Z · LW(p) · GW(p)

That is a very shiny button.

Replies from: bgold
comment by bgold · 2019-09-26T16:33:04.737Z · LW(p) · GW(p)

so shiny. It's like, it's begging to be pressed.

http://www.scp-wiki.net/scp-001-j

comment by gjm · 2019-09-26T22:34:21.985Z · LW(p) · GW(p)

I don't see the big shiny red button on the front page. If I visit LW in private mode, it's there. I have the map turned off. I haven't tried logging out or turning the map back on. I'm guessing that when Ben says it's "over the frontpage map" that means it's implemented in a way that makes it disappear if the map isn't there. That seems a bit odd, though it probably isn't worth the effort of fixing.

(I have a launch code but hereby declare my intention not to use it. I am intrigued by the discussions of trading launch codes, or promises to use or not use them, for valuable things like effective charitable donations, but am not interested in taking either side of any such trade.)

Replies from: habryka4
comment by habryka (habryka4) · 2019-09-26T22:40:28.667Z · LW(p) · GW(p)

Yep, you have to activate the map to see it. Just turned out to be the most convenient way of implementing it, and also worked well aesthetically.

comment by lionhearted (Sebastian Marshall) (lionhearted) · 2019-09-26T14:57:20.879Z · LW(p) · GW(p)

Rot13 comment; if you have launch codes, I recommend you wait until tomorrow to read this, eh?

(1) V'z phevbhf ubj znal crbcyr jvgu ynhapu pbqrf pyvpxrq gur ohggba "gb purpx vg bhg" jvgubhg ragrevat ynhapu pbqrf. V qvqa'g qb fb, npghnyyl, fb V pna bayl cerfhzr lbh'q unir gb ragre pbqrf.

(2) V jbaqre vs gur yvfg bs anzrf jnf znqr choyvp vs crbcyr jbhyq or zber yvxryl be yrff yvxryl gb cerff vg. Anvir nafjre vf yrff yvxryl, ohg vg zvtug unir n fgenatr "lbh pna'g pbageby zr ivn funzr" serrqbz rssrpg sbyybjrq ol xnobbz.

(3) Qrfver sbe yvoregl — be frys-rkcerffvba — ner obgu tbbq guvatf, lrf? Naq LRG, V guvax nzbat n uvtu-yriry pebjq, gung'f zber yvxryl gb pnhfr n ohggba chfu guna abezny obevat gebyy znyvpr. V'z erzvaqrq bs Urael Xvffvatre'f dhbgr, "Gur zbfg shaqnzragny ceboyrz bs cbyvgvpf vf abg gur pbageby bs jvpxrqarff ohg gur yvzvgngvba bs evtugrbhfarff."

Ebg13 urer gb abg zrff hc gur qngn cyhf naq ubcrshyyl abg rssrpg erfhygf gbb zhpu. V guvax pbzzragvat V jnf srryvat "Natrfcnaag" vf jvguva obhaqf naq vf abezny orunivbe — yvxr n "url sbyxf, jubn, guvf vf vagrafr ru?" — ohg pbzzragvat ba fgngvfgvpf naq orunivbe cnggreaf zber yvxryl gb rssrpg bhgpbzr.

comment by Ramiro P. (ramiro-p) · 2019-09-28T12:29:10.509Z · LW(p) · GW(p)

So far, LW is still online. It means one of:

a) nobody used their launch codes, and you can trust 125 nice & smart individuals not to take unilateralist action - so we can avoid armageddon if we just have coordinated communities with the right people;

b) nobody used their launch codes because these 125 are very like-minded people (selection bias): there's no immediate incentive to blow it up (except for some offers of counterfactual donations), but some incentive to avoid it (honor!... hope? Prove EDT, UDT...?). This doesn't model the problem of MAD, and it surely doesn't model Petrov's dilemma - he went against express orders to minimize the chance of nuclear war, thus risking his career (and possibly his life);

c) this is a hoax. That's what I would do; I wouldn't risk a day of LW just to prove our honor (sorry, I grew up in a tough neighborhood and have problems trusting others).

My point is: I think (b) and (c) are way more likely than (a), so if I had the launch codes, I'd use them and take the risk of ostracism. I think it would yield higher expected utility; as I said, I wouldn't risk a day of LW to prove our honor, but I'd do it to prove you shouldn't play Petrov lightly.

Please correct me if I'm wrong.

P.S.: (d) this allows anyone to claim to have launch codes and mug others into counterfactual donations - which is brilliant.

comment by Slider · 2019-09-28T12:03:45.610Z · LW(p) · GW(p)

I hovered over the button, thinking that the button appearing meant I was one of the chosen ones. Afterwards it seemed I was reckless. I was curious, and thought I could just choose not to press my mouse button (I did manage that). On the other hand, I was hazy on the mechanics of how things worked, and I knew that moving the mouse over the button shrank the distance between bad things and the present. The tooltip popup was unexpected and somewhat startled me. A mechanism could have gone off on hover, and I wasn't considering that possibility. A full schmuck-bait button would make hovering over it equivalent to pushing it.

In AI thinking I have sometimes used the line "Well, we live in a world where there are nuclear buttons" to argue that learning by accident is not a reliable way to learn. Living through the experience of "what is this curious red thing?" gives a lot more detail to an abstract edge case. Although buttons are a shorthand for choice affordance (i.e. you know that buttons do stuff), the argument relies more on making bad choices without even realising that you are making a choice, or that that kind of choice is possible.

comment by [deleted] · 2019-09-26T15:05:32.272Z · LW(p) · GW(p)

.

Replies from: habryka4
comment by habryka (habryka4) · 2019-09-26T15:50:12.377Z · LW(p) · GW(p)

Everyone has access to the button, but only 125 users were sent launch codes (after you press the button, the launch-codes panel appears).
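
A minimal sketch of that flow, again with hypothetical names: pressing the public button launches nothing by itself; it only opens the panel where the 125 code-holders could enter their personalised codes.

```typescript
// Hypothetical sketch - made-up names, not the real code.
type FrontpageUi = "button-visible" | "codes-panel-open";

function pressButton(ui: FrontpageUi): FrontpageUi {
  // Pressing only reveals the code-entry panel.
  return ui === "button-visible" ? "codes-panel-open" : ui;
}

let ui: FrontpageUi = "button-visible";
ui = pressButton(ui); // now "codes-panel-open" - still no launch
```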

Replies from: jmh
comment by jmh · 2019-09-26T17:28:59.136Z · LW(p) · GW(p)

Certainly good to hear. I almost accidentally pressed it earlier! No codes, so it was a good fail-safe for me.