Posts

Against boots theory 2020-09-14T13:20:04.056Z · score: 30 (17 votes)
Classifying games like the Prisoner's Dilemma 2020-07-04T17:10:01.965Z · score: 73 (21 votes)
Short essays on various things I've watched 2020-06-12T22:50:01.957Z · score: 9 (2 votes)
Haskenthetical 2020-05-19T22:00:02.014Z · score: 18 (4 votes)
Chris Masterjohn on Coronavirus, Part 2 2020-04-28T21:50:01.430Z · score: 6 (5 votes)
In my culture: the responsibilities of open source maintainers 2020-04-13T13:40:01.174Z · score: 41 (15 votes)
Chris Masterjohn on Coronavirus, Part 1 2020-03-29T11:00:00.819Z · score: 14 (6 votes)
My Bet Log 2020-03-19T21:10:00.929Z · score: 17 (3 votes)
Tapping Out In Two 2019-12-05T23:10:00.935Z · score: 18 (7 votes)
The history of smallpox and the origins of vaccines 2019-12-01T20:51:29.618Z · score: 15 (5 votes)
The Effect pattern: Transparent updates in Elm 2019-10-20T20:00:01.101Z · score: 8 (2 votes)
London Rationalish meetup (part of SSC meetups everywhere) 2019-09-12T20:32:52.306Z · score: 7 (1 votes)
Is this info on zinc lozenges accurate? 2019-07-27T22:05:11.318Z · score: 31 (11 votes)
A reckless introduction to Hindley-Milner type inference 2019-05-05T14:00:00.862Z · score: 17 (5 votes)
"Now here's why I'm punching you..." 2018-10-16T21:30:01.723Z · score: 29 (18 votes)
Pareto improvements are rarer than they seem 2018-01-27T22:23:24.206Z · score: 58 (21 votes)
2017-10-08 - London Rationalish meetup 2017-10-04T14:46:50.514Z · score: 9 (2 votes)
Authenticity vs. factual accuracy 2016-11-10T22:24:38.810Z · score: 5 (9 votes)
Costs are not benefits 2016-11-03T21:32:07.811Z · score: 5 (6 votes)
GiveWell: A case study in effective altruism, part 1 2016-10-14T10:46:23.303Z · score: 0 (1 votes)
Six principles of a truth-friendly discourse 2016-10-08T16:56:59.994Z · score: 4 (7 votes)
Diaspora roundup thread, 23rd June 2016 2016-06-23T14:03:32.105Z · score: 5 (6 votes)
Diaspora roundup thread, 15th June 2016 2016-06-15T09:36:09.466Z · score: 24 (27 votes)
The Sally-Anne fallacy 2016-04-11T13:06:10.345Z · score: 34 (28 votes)
Meetup : London rationalish meetup - 2016-03-20 2016-03-16T14:39:40.949Z · score: 0 (1 votes)
Meetup : London rationalish meetup - 2016-03-06 2016-03-04T12:52:35.279Z · score: 0 (1 votes)
Meetup : London rationalish meetup, 2016-02-21 2016-02-20T14:09:42.635Z · score: 0 (1 votes)
Meetup : London Rationalish meetup, 7/2/16 2016-02-04T16:34:13.317Z · score: 1 (2 votes)
Meetup : London diaspora meetup: weird foods - 24/01/2016 2016-01-21T16:45:10.166Z · score: 1 (2 votes)
Meetup : London diaspora meetup, 10/01/2016 2016-01-02T20:41:05.950Z · score: 2 (3 votes)
Stupid questions thread, October 2015 2015-10-13T19:39:52.114Z · score: 3 (4 votes)
Bragging thread August 2015 2015-08-01T19:46:45.529Z · score: 3 (4 votes)
Meetup : London meetup 2015-05-14T17:35:18.467Z · score: 2 (3 votes)
Group rationality diary, May 5th - 23rd 2015-05-04T23:59:39.601Z · score: 7 (8 votes)
Meetup : London meetup 2015-05-01T17:16:12.085Z · score: 1 (2 votes)
Cooperative conversational threading 2015-04-15T18:40:50.820Z · score: 25 (26 votes)
Open Thread, Apr. 06 - Apr. 12, 2015 2015-04-06T14:18:34.872Z · score: 4 (5 votes)
[LINK] Interview with "Ex Machina" director Alex Garland 2015-04-02T13:46:56.324Z · score: 6 (7 votes)
[Link] Eric S. Raymond - Me and Less Wrong 2014-12-05T23:44:57.913Z · score: 23 (23 votes)
Meetup : London social meetup in my flat 2014-11-19T23:55:37.211Z · score: 2 (2 votes)
Meetup : London social meetup 2014-09-25T16:35:18.705Z · score: 2 (2 votes)
Meetup : London social meetup 2014-09-07T11:26:52.626Z · score: 2 (2 votes)
Meetup : London social meetup - possibly in a park 2014-07-22T17:20:28.288Z · score: 2 (2 votes)
Meetup : London social meetup - possibly in a park 2014-07-04T23:22:56.836Z · score: 2 (2 votes)
How has technology changed social skills? 2014-06-08T12:41:29.581Z · score: 16 (16 votes)
Meetup : London social meetup - possibly in a park 2014-05-21T13:54:16.372Z · score: 2 (2 votes)
Meetup : London social meetup - possibly in a park 2014-05-14T13:27:30.586Z · score: 2 (2 votes)
Meetup : London social meetup - possibly in a park 2014-05-09T13:37:19.129Z · score: 1 (2 votes)
May Monthly Bragging Thread 2014-05-04T08:21:17.681Z · score: 10 (10 votes)
Meetup : London social meetup 2014-04-30T13:34:43.181Z · score: 2 (2 votes)

Comments

Comment by philh on The Darwin Game - Rounds 0 to 10 · 2020-10-26T23:27:01.122Z · score: 2 (1 votes) · LW · GW

[Blue] Clone Army. 10 players pledged to submit clone bots. 8 followed through, 1 didn’t and Multicore submitted a [Red] mimic bot.

To clarify, the 8 all successfully recognize each other as clones, and the one who didn't follow through submitted nothing? Relevant for scoring my predictions on the last comment thread.

Comment by philh on The Darwin Game - Rounds 0 to 10 · 2020-10-26T23:16:59.183Z · score: 6 (3 votes) · LW · GW

So, uh. Unless I made a silly mistake somewhere, or the version in the tournament is different from what you posted in the thread... I specifically tested to make sure incomprehensibot would get ASTBot disqualified if we both survived that long. Sorry.

(Some of my requested changes to the CloneBot common code were to route around a bug in ASTBot that made it crash before I wanted it to, in ways it could recover from. ASTBot can't really handle top-level import statements due to details I don't really understand about python's namespace handling. So I requested that CloneBot not include any of those.)

Comment by philh on The Darwin Game · 2020-10-26T15:06:41.239Z · score: 2 (1 votes) · LW · GW

  • At least one bot tries to simulate me after the showdown and doesn’t get disqualified: 10%.

  • At least one bot tries to simulate me after the showdown and succeeds: 5%.

I now think these were overconfident. I think it would be fairly easy to simulate incomprehensibot safely; but hard to simulate incomprehensibot in a way that would be both safe and generically useful.

The difficulty with simulating is that you need to track your opponent's internal state. If you just call MyOpponentBot(round).move(myLastMove) you'll be simulating "what does my opponent do on the first turn of the game, if it gets told that my last move was...". If you do this against incomprehensibot, and myLastMove is not None, incomprehensibot will figure out what's up and try to crash you.

So at the beginning, you initialize self.opponent = MyOpponentBot(round). And then every turn, you call self.opponent.move(myLastMove) to see what it's going to do next. I don't expect incomprehensibot to figure this out.

But if your opponent has any random component to it, you want to do that a couple of times at least to see what's going to happen. But if you call that function multiple times, you're instead simulating "what does my opponent do if it sees me play myLastMove several times in a row". And so you need to reset state using deepcopy or multiprocessing or something, and incomprehensibot has ways to figure out if you've done that. (Or I suppose you can just initialize a fresh copy and loop over the game history running .move(), which would be safe.)
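A rough sketch of the "fresh copy, replay the history" option, in Python. `EchoBot` is a made-up stand-in for an opponent class, and the constructor and `move` signatures simply mirror the `MyOpponentBot(round).move(myLastMove)` call above; none of this is the tournament's actual harness.

```python
class EchoBot:
    """Hypothetical stateful opponent: opens with 2, then plays 5 minus what it last saw."""

    def __init__(self, round=0):
        self.round = round
        self.turn = 0  # internal state that a naive simulator would corrupt

    def move(self, previous):
        self.turn += 1
        if previous is None:
            return 2
        return 5 - previous


def predict_next(opponent_cls, round, my_moves):
    """Predict the opponent's next move by replaying my whole move history
    into a fresh instance, so no state leaks between predictions."""
    sim = opponent_cls(round)
    prediction = sim.move(None)      # first turn: the opponent has seen nothing yet
    for mine in my_moves:            # each later turn, it sees my previous move
        prediction = sim.move(mine)
    return prediction
```

Calling `predict_next(EchoBot, 0, history)` repeatedly gives the same answer for a deterministic opponent, because each call starts from a clean instance. The cost is that every prediction replays the whole game so far, which may matter under the time limits.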

But actually, the simple "initialize once and call .move() once per turn" isn't obviously terrible? Especially if you keep track of your predictions and stop paying attention to them once they deviate more than a certain amount (possibly zero) from reality. And Taleuntum's bot might catch that, I'm not sure, but I think Incomprehensibot wouldn't.

I think at some point I decided basically no one would do that, and then at some other point I forgot that it was even a possibility? But I now think that was silly of me, and someone might well do that and last until the showdown round. Trying to put a number on that would involve thinking in more depth than I did for any of my other predictions, so I'm not going to try, just leave this note.

Comment by philh on PredictIt: Presidential Market is Increasingly Wrong · 2020-10-25T21:47:59.654Z · score: 4 (2 votes) · LW · GW

I take it you mean "people might be betting on the possibility that Trump wins the election, as forecasts predict, but remains president by refusing to concede"?

Betfair's fine print excludes that possibility from the market:

This market will be settled according to the candidate that has the most projected Electoral College votes won at the 2020 presidential election. Any subsequent events such as a ‘faithless elector’ will have no effect on the settlement of this market. In the event that no Presidential candidate receives a majority of the projected Electoral College votes, this market will be settled on the person chosen as President in accordance with the procedures set out by the Twelfth Amendment to the United States Constitution.

I don't have PredictIt's fine print in front of me, but IIRC it's similar but less explicit.

Comment by philh on PredictIt: Presidential Market is Increasingly Wrong · 2020-10-22T17:05:42.510Z · score: 5 (3 votes) · LW · GW

A related crazy-seeming market here is Trump's exit date. Will he complete his first term? PredictIt says 85% yes, Betfair says 90% yes. The only way I can see that not happening is if he loses the election and quits out of spite or whatever, and I wouldn't be confident enough to dismiss that out of hand. But Good Judgment Open doesn't think it's likely: 99% he completes his term. (I guess unless he quits on inauguration day?)

Comment by philh on The Darwin Game · 2020-10-20T23:44:51.150Z · score: 2 (1 votes) · LW · GW

Oh yeah, that's true as far as I know. I guess it depends how much we trust ourselves to find all instances of this hole. A priori I would have thought "python sees a newline where splitlines doesn't" was just as likely as the reverse. (I'm actually not sure why we don't see it, I looked up what I thought was the source code for the function and it looked like it should only split on \n, \r and \r\n. But that's not what it does. Maybe there's a C implementation of it and a python implementation?)
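For reference, a quick Python check of the mismatch. As far as I know, `str.splitlines` breaks on a longer list of characters (form feed, `\x1c`, `\u2028` and so on), while `bytes.splitlines` only breaks on `\n`, `\r` and `\r\n`. So one possible explanation, and this is a guess, is that the source code in question was for the bytes version.

```python
text = "first\x0csecond\u2028third\nfourth"

# str.splitlines treats form feed and U+2028 as line boundaries too.
str_lines = text.splitlines()

# Splitting on "\n" alone sees only two lines.
newline_only = text.split("\n")

# bytes.splitlines only knows the ASCII newline conventions.
byte_lines = b"first\x0csecond\nthird".splitlines()

print(str_lines)
print(newline_only)
print(byte_lines)
```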

Comment by philh on The Darwin Game · 2020-10-20T21:02:57.328Z · score: 2 (1 votes) · LW · GW

If we don't use splitlines we instead need to use something similar, right? Like, even if we don't need to worry about LF versus CRLF (which was a genuine suggestion I made), we still need to figure out if someone's got any de-indents after the start of the payload. And I don't expect us to do better without splitlines than with it.

Comment by philh on PredictIt: Presidential Market is Increasingly Wrong · 2020-10-20T16:10:56.725Z · score: 4 (2 votes) · LW · GW

What I cannot explain, at all, is how this can be true if on June 20, three months ago, the market was 63-39, and now it’s 65-40, all but unchanged. Trump improved a bit, then got worse again, with Biden’s low being at 55.

Note that on June 20, the 538 model had Trump at 22% to win (it began at 30% on June 1). The progression he offers makes sense and is consistent. The market’s doesn’t, and isn’t.

Just as a note, three months ago was July 20th, and that seems to be the date you used for the market. (The market was more like 55-45 on June 20th.) The 538 forecast was indeed 22% for Trump on June 20th, and had actually gone up to 25% by July 20th, but I don't think that changes much.

Comment by philh on Moloch games · 2020-10-20T13:17:22.116Z · score: 2 (1 votes) · LW · GW

Intuition. A Moloch game is a game such that there is a utility function, called “the Moloch’s utility function”, such that if the agents behave individually rationally, then they collectively behave as a “Moloch” that controls all players simultaneously and optimizes that function. In particular, the Nash equilibria correspond to local optima of that function.

Minor, but this tripped me up. My read of "controls all players simultaneously" would be that there's no such thing as a local optimum, it can just move directly to the global optimum from any other state. I'm not sure what would be a better wording though, and your non-intuitive definition was clear enough to set me right.

Comment by philh on The Darwin Game · 2020-10-20T08:44:35.293Z · score: 2 (1 votes) · LW · GW

I do think it would be hard to obfuscate in a way that wasn't fairly easy to detect as obfuscation. Throw out anything that uses import, any variables with __ or a handful of builtin functions and you should be good. (There's only a smallish list of builtins, I couldn't confidently say which ones to blacklist right now but I do think someone could figure out a safe list without too much trouble.) In fact, I can't offhand think of any reason a simple bot would use strings except docstrings, maybe throw out anything with those, too.

(Of course my "5% a CloneBot manages to act out" was wrong, so take that for what it's worth.)
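A minimal sketch of that kind of filter, using Python's `ast` module. The contents of `SUSPICIOUS_BUILTINS` are a guess for illustration only (working out a properly safe list is exactly the part left open above), and `looks_obfuscated` is a hypothetical helper name.

```python
import ast

# Hypothetical blacklist: which builtins are actually dangerous is an assumption.
SUSPICIOUS_BUILTINS = {"eval", "exec", "compile", "getattr", "setattr",
                       "globals", "locals", "vars", "open"}


def looks_obfuscated(source):
    """Return a list of reasons to distrust `source`; empty means nothing was flagged."""
    reasons = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            reasons.append("import statement")
        elif isinstance(node, ast.Name) and (
            "__" in node.id or node.id in SUSPICIOUS_BUILTINS
        ):
            reasons.append(f"suspicious name: {node.id}")
        elif isinstance(node, ast.Attribute) and "__" in node.attr:
            reasons.append(f"dunder attribute: {node.attr}")
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            reasons.append("string literal")
    return reasons
```

Method definitions such as `def __init__` are not flagged, since those are function definitions rather than variable references. Docstrings are flagged along with every other string literal, which matches the stricter "throw out anything with strings" option.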

The iterated prisoner’s dilemma with shared source code tournament a few years ago had a lot of simulators, so I assume their rules were more friendly to simulators.

I know of two such - one (results - DMRB was mine) in Haskell where you could simulate but not see source, and an earlier one (results) in Scheme where you could see source.

I think in the Haskell one it would have been hard to figure out you were being simulated. I'm not sure about the scheme one.

Comment by philh on The Darwin Game · 2020-10-19T21:37:02.700Z · score: 2 (1 votes) · LW · GW

I confess I'm a bit confused, I thought in our PM conversation I was fairly explicit that that's what I was asking about, and you were fairly explicit that it was forbidden?

It's not a big deal - even if this was forbidden I'd think it would be totally fine not to disqualify simon, and I still don't actually expect it to have been useful for me.

Comment by philh on The Darwin Game · 2020-10-19T21:27:43.519Z · score: 3 (2 votes) · LW · GW

Clever! I looked for holes in mostly the same directions as you and didn't find anything. I think I either didn't think of "things splitlines will split on but python won't", or if I did I dismissed it as being not useful because I didn't consider comments.

Comment by philh on The Darwin Game · 2020-10-19T17:59:14.871Z · score: 2 (1 votes) · LW · GW

Updated with a link to my code. I also put yours in to see how we'd fare against each other one-on-one - from quick experimentation, looks like we both get close to 2.5 points/turn, but I exploit you for approximately one point every few hundred turns, leaving me the eventual victor. :D I haven't looked closely to see where that comes from.

Of course too much depends on what other bots are around.

Comment by philh on The Darwin Game · 2020-10-19T15:58:01.639Z · score: 3 (2 votes) · LW · GW

Conditioned on "any CloneBot wins" I've given myself about 25%.

10% in that conditional would definitely be too low - I think I have above-baseline chances on all of "successfully submit a bot", "bot is a CloneBot" and "don't get disqualified". I think I expect at least three to fall to those hurdles, and five wouldn't surprise me. And of the rest, I still don't necessarily expect most of them to be very serious attempts.

By "act out" I mean it's a bot that's recognized as a CloneBot by the others but doesn't act like one - most likely cooperating with non-clones, but not-cooperating with clones would also count, it would just be silly as far as I can tell. I also include such a bot as a CloneBot for the 75%.

Comment by philh on The Darwin Game · 2020-10-19T15:31:29.406Z · score: 3 (2 votes) · LW · GW

Well played!

Comment by philh on The Darwin Game · 2020-10-19T15:30:57.864Z · score: 4 (3 votes) · LW · GW

Putting data in global state is forbidden, yeah, even if you don't do anything with it. I was a bit surprised.

Just to be clear, this would only be forbidden if you put it at the top level. If you put it in your class it would be fine. So

class CloneBot():
    ...
    def payload(self) :
        ...

    foo = 'bar' # allowed

foo = 'bar' # forbidden

Comment by philh on Coronavirus Justified Practical Advice Summary · 2020-10-19T12:56:35.900Z · score: 2 (1 votes) · LW · GW

Note that my only mention of zinc in that comment was relating to zinc lozenges and the common cold. It seems like you're talking about dietary zinc and Covid-19.

Comment by philh on The Darwin Game · 2020-10-19T12:44:35.572Z · score: 4 (3 votes) · LW · GW

by defining the __new__() method of the class after the payload

Incidentally, you could also just redefine existing methods, which was how I planned to do it. Like,

class Foo():
    def __init__(self):
        self.x = 1

    def __init__(self):
        self.x = 2

Foo().x # 2

Comment by philh on PredictIt: Presidential Market is Increasingly Wrong · 2020-10-19T12:02:57.881Z · score: 6 (3 votes) · LW · GW

Super not an expert, saying it loud so I can be corrected if wrong:

I don't think time value of money is the main thing here. The observed pattern seems to be that as the election draws closer, people get more information but the market stubbornly refuses to do so. If that pattern continues, then people get more edge as time goes on, meaning future bets will be more advantageous than current bets.

If your strategy is something like "put $100 on Biden as long as I think his odds are more than 5% better than the market thinks" this might not make much difference; waiting only helps in case Biden's odds-according-to-you suddenly drop a lot. But if you're going to bet different amounts depending on the gap, then waiting also helps in case Biden's odds-according-to-you drop a little. (I think if they go up, you can just put more money in, so waiting hasn't gained you anything. But you have to have some probability that they drop.)

Comment by philh on PredictIt: Presidential Market is Increasingly Wrong · 2020-10-19T11:54:36.091Z · score: 2 (1 votes) · LW · GW

That might explain a recent sudden divergence. I don't think it explains the trend Zvi describes in the post.

Comment by philh on The Darwin Game · 2020-10-19T08:25:27.434Z · score: 4 (2 votes) · LW · GW

I lied that I’ve already submitted one program detecting and crashing simulators. ... I added another lie that the method of detecting simulators was my friend’s idea (hopefully suggesting that there is another contestant with the same method outside the clique). I’m curious how believable my lies were, I felt them to be pretty weak, hopefully it’s only because of my inside view.

I believed both of these lies, though if I'd come to rely on them at all I might have questioned them. But I assumed your friend was in the clique.

Comment by philh on The Darwin Game · 2020-10-19T08:21:54.317Z · score: 7 (5 votes) · LW · GW

Will post a link to a github repo with my code later today (when I'm not meant to be working), but for now, here's my thought processes.

(Edit: my code is here. My entry is in incomprehensibot.)

General strategy:

I was undecided on joining the clique, but curious. I didn't want to be a sucker if someone (possibly the clique organizer) found a way to betray it. I sent out feelers, and Vanilla_Cabs shared the clique bot code with me.

I saw a way to defect against the clique. I think that's when I decided to do so, though I may have had the idea in my head beforehand. I would call my entry "bastardBot" and it would be glorious. I told Vanilla_Cabs I was in. They asked if the code was airtight. "I don't see anything I want to flag."

Someone else found that same bug, and was more honest than I. I spent some time trying to work around the fix, but couldn't see anything. I tried to get Vanilla_cabs to put a new hole in, under the pretext that I wanted some code at the top level - this was true, but it was only marginally useful. I couldn't think of any new holes that wouldn't be really freaking obvious, so instead I tried being vague about my requirements to see if they'd suggest something I could use, but they didn't. Eventually we just settled on "the exact line foo = 'bar' is permitted as an exception", and I didn't see what I could do with that.

Later, lsusr told me that that line would get me disqualified. I didn't say anything, in the hopes some clique member would wonder what it was for, include it in their bot just in case, and get disqualified.

I feel a little bad about all this, and hope Vanilla_cabs has no hard feelings.

My backup plan was: don't let anyone simulate me, and get them disqualified if they try. (New name: "incomprehensibot".) "jailbreaker.py" shows my tricks here. Defense seemed more effective than offense, and I didn't think I could safely simulate my opponent, and especially not do so safely and usefully within the time limits, so I gave up on the idea. As for my actual moves, I didn't have much in mind. After rereading (or at least reskimming) "the Darwin pregame" I settled on this:

After the showdown round, start off with something like Zvi's "I'll let you do better than me, but you could do even better by cooperating with me". Gradually move towards "I won't let you do better than me" as time progresses; if my opponent was more than a certain number of points ahead of me, I'd refuse to play less than 3. (I chose the number of points based on expecting 550 turns on average, and gave it a minimum of 5 to allow some coordination dance early on.) Early on, skew towards playing 2 initially; if opponents randomize between 2 and 3, and pick 3 with probability >= 1/3, then 2 maximizes my score. Later, skew towards playing 3 initially, which increases my probability of beating my opponent.
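The 1/3 threshold can be checked with a few lines of Python. This assumes the Darwin Game payoff as I understand it (each player scores their own number if the two numbers sum to 5 or less, otherwise both score 0); `payoff` and `expected_score` are just illustrative helpers.

```python
from fractions import Fraction


def payoff(mine, theirs):
    """Assumed rule: I score my own number if the total is at most 5, else nothing."""
    return mine if mine + theirs <= 5 else 0


def expected_score(mine, p_three):
    """Expected score against an opponent who plays 3 with probability p_three, else 2."""
    return p_three * payoff(mine, 3) + (1 - p_three) * payoff(mine, 2)


for p in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    print(p, expected_score(2, p), expected_score(3, p))
```

Playing 2 always scores 2, while playing 3 scores 3(1 - p) in expectation, and the two are equal exactly at p = 1/3.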

"payload.py" shows my approach here. I modelled it off the early-round CloneBot moves against non-clones. If last round had been a total of 5, I'd play their last move. If it had been 4, I'd do the same, but maybe add a little. If it had been more, I'd repeat my own move, but maybe subtract one. In the 5 and >5 cases, I had some "pushing" behaviour to see if I could exploit them: if I haven't had a chance to push yet, or if pushing had worked well (or seemed like it would have worked well) in the past, or if I just hadn't tried it recently, I'd try to take one more point than I really had any right to. I didn't do that in the <5 case because that situation was my only source of randomness, which seemed important somehow.

(I'm a bit confused about, if the last-round total isn't five, should I base off my own previous move or theirs? My decisions here weren't principled.)

If this made me play 0 (I dunno if it ever would), I'd make it a 1. If it made me play less than 3, and I was too far behind (depending on round), I'd make it a 3.

Just before I went to bed Saturday night, someone sent a message to the clique group saying not to try to break simulators. Because if a clique member simulates us and gets disqualified, the tournament is restarted and the clique is smaller. That was completely right, and I felt stupid for not thinking of it sooner.

I still decided to ignore it, because I thought the game would be small enough that "fewer opponents" was better than "bigger clique". Overnight someone else said they'd already submitted a simulation-breaker, so I dunno if anyone ended up playing a simulator.

Right towards the end I started doing numerical analysis, because early on I was too enamoured with my own cleverness to notice what a good idea it was. I didn't have time to do anything thoroughly, but based on running my payload against ThreeBot (which gets 148-222 early, 10-15 late) I reduced my exploitability ramp-down from 100 rounds (chosen fairly arbitrarily) to 20 (still fairly arbitrary). Come to think of it, I don't think I compared "what proportion of the pool do I need to eventually win" between my early and late game behaviors.

It would have been interesting to have some kind of logging such that my bot could report "I think I'm being simulated right now, and this is how I know" and afterwards lsusr could tell me how often that happened. I assume that would be significant work for lsusr to set up though, and it adds attack surface.

Predictions:

  • I win: 20%.
  • A CloneBot wins: 75%.
  • At least one clique member submits a non-CloneBot (by accident or design): 60%.
  • At least one clique member fails to submit anything: 60%.
  • At least one bot tries to simulate me after the showdown and doesn't get disqualified: 10%.
  • At least one bot tries to simulate me after the showdown and succeeds: 5%.
  • At least one CloneBot manages to act out: 5%.
  • I get disqualified: 5%.

Comment by philh on The Darwin Game · 2020-10-16T21:55:47.498Z · score: 4 (3 votes) · LW · GW

To check, what timezone is the deadline in?

Comment by philh on The Darwin Game · 2020-10-15T09:36:34.644Z · score: 4 (2 votes) · LW · GW

Oh, geez. I figured it would be too long, but I didn't think about just how much too long. Yeah, with these constraints, even 5s per hundred moves I agree is unreasonable.

Caching seems easy enough to implement independently, I think. No need for you to add it.

Comment by philh on The Darwin Game · 2020-10-14T22:03:48.987Z · score: 4 (2 votes) · LW · GW

Thanks. I confess I'd been hoping for more like 100x that, but not really expecting it :p

Comment by philh on Fermi Challenge: Trains and Air Cargo · 2020-10-14T10:12:05.775Z · score: 5 (2 votes) · LW · GW

Comment on q1:

It looks like your calculations are giving you square miles of track. If a track is 1/1000 of a mile wide (1.6 meters? sure, close enough, judging by the height of a damsel in distress), you'd have 2.5 million linear miles from your first estimate, and 100 million linear miles from your second.

Comment by philh on The Darwin Game · 2020-10-14T08:57:44.302Z · score: 3 (2 votes) · LW · GW

Hm. Can we get a "you can use at least this amount of time per move and not be disqualified"? Without wanting to say too much, I have a strategy in mind that would rely on knowing a certain runtime is safe. (Allowing for some amount of jankiness, so that if the limit was 5s I'd consider 4s safe.)

Comment by philh on The Darwin Game · 2020-10-13T23:09:47.302Z · score: 2 (1 votes) · LW · GW

I don't know how likely it is to make a difference, but what version of python 3?

Comment by philh on Fermi Challenge: Trains and Air Cargo · 2020-10-06T13:23:46.864Z · score: 2 (1 votes) · LW · GW

Another attempt at q2:

Suppose air freight has decupled every decade, starting at one metric ton in 1909. Then we get 10^10 metric tons in 2009 and 10^11 in 2019. That's 4½ orders of magnitude more than my other answer. :/
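The arithmetic, as a small Python sketch. The only inputs are the ones stated here plus the 3,000,000-ton 2019 figure from the other answer; `tons_in` is a hypothetical helper name.

```python
import math


def tons_in(year, start_year=1909, start_tons=1.0):
    """Freight if it multiplies by ten every decade from start_tons in start_year."""
    return start_tons * 10 ** ((year - start_year) / 10)


other_answer_2019 = 3_000_000  # the 2019 figure from the boat-based estimate
gap = math.log10(tons_in(2019) / other_answer_2019)

print(f"2009: {tons_in(2009):.0e} t, 2019: {tons_in(2019):.0e} t, gap: {gap:.1f} orders of magnitude")
```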

I currently suspect this one is too high and that one is too low, but that one is closer.

Comment by philh on Fermi Challenge: Trains and Air Cargo · 2020-10-06T12:31:32.212Z · score: 2 (1 votes) · LW · GW

q2:

I think a lot more freight goes by boat than plane. Let's say plane is 1% of boat.

I think an aircraft carrier displaces, what, 100,000 metric tons? So it's maybe reasonable to guess that a respectable bulk transport can carry 100,000 metric tons of cargo.

Let's say at any given time there are 100 of those underway, on journeys lasting 30 days. That makes about 100,000,000 metric tons shipped annually by boat, and 1,000,000 by plane.

Between 2009 and 2019 I'm gonna guess it went up by enough to count as one order of magnitude. So let's split the difference and call it 300,000 in 2009 and 3,000,000 in 2019.
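The same estimate as a Python sketch, using only the guesses above. Reading "split the difference" as a geometric midpoint is my interpretation, not something the estimate spells out.

```python
import math

ships_underway = 100
tons_per_ship = 100_000
journey_days = 30
air_share = 0.01  # plane freight as a fraction of boat freight

journeys_per_year = 365 / journey_days
boat_tons = ships_underway * tons_per_ship * journeys_per_year
air_tons = boat_tons * air_share

# On a log scale, the midpoint of the 2009 and 2019 guesses is their
# geometric mean, which should land near the single-year air estimate.
midpoint = math.sqrt(300_000 * 3_000_000)

print(f"boat ~{boat_tons:,.0f} t/yr, air ~{air_tons:,.0f} t/yr, midpoint ~{midpoint:,.0f} t")
```

The unrounded boat figure comes out a little above 100 million tons, and the midpoint of the two yearly guesses sits just under the 1,000,000-ton air estimate.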

Comment by philh on Fermi Challenge: Trains and Air Cargo · 2020-10-05T22:34:28.414Z · score: 11 (2 votes) · LW · GW

Attempt at q1:

The earth has a radius of 6400 km. Surface area is 4πr² and about 1/3 of that is land, giving around 200 million square kilometers of land surface.

I think... most of that is barely populated, and probably has few train tracks? Let's say 1% of it is densely populated, 10% is sparsely populated, and 90% is basically unpopulated. In the densely populated bit, I could believe 1 km track per square km area. In the sparsely populated let's say a tenth of that, and ignore the rest. That gives... 2 million km in dense, plus another 2 million in sparse for 4 million km in total.
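Spelled out as a Python sketch, with every number taken from the guesses above. The unrounded land area comes out nearer 170 million square kilometers, so the 200 million figure is a generous rounding.

```python
import math

radius_km = 6400
surface_km2 = 4 * math.pi * radius_km ** 2
land_km2 = surface_km2 / 3           # unrounded land area

rounded_land_km2 = 200e6             # the rounded figure used in the estimate
dense_km = 0.01 * rounded_land_km2 * 1.0    # 1% of land at 1 km of track per square km
sparse_km = 0.10 * rounded_land_km2 * 0.1   # 10% of land at a tenth of that density
total_km = dense_km + sparse_km

print(f"land ~{land_km2 / 1e6:.0f}M km^2, track ~{total_km / 1e6:.0f}M km")
```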

That feels low, I think? But I'll stick with it unless I come up with something better.

Comment by philh on Postmortem to Petrov Day, 2020 · 2020-10-05T09:37:00.395Z · score: 4 (2 votes) · LW · GW

He said:

Beyond that, loyalty and trust are also very important to me. If the admins had trusted me with the launch codes, I wouldn’t have nuked the site (intentionally).

But, well. While it seems plausible to me that he's telling the truth... I also think the actions he took are evidence that he's the kind of person who would say that kind of thing whether or not it's true.

Comment by philh on On Destroying the World · 2020-10-03T13:03:55.111Z · score: 5 (3 votes) · LW · GW

what is GreaterWrong anyway?

https://greaterwrong.com is an alternate interface to LessWrong, implemented by... I think Clone of Saturn does most of the coding and Said Achmiz does most of the design work?

Same content, different design, slightly different set of features. (E.g. no karma change notification, no voting on tags, but comment navigation is improved.) I tend to use it over LW because it's faster.

You can generally just replace lesswrong with greaterwrong in a URL.

Comment by philh on "Zero Sum" is a misnomer. · 2020-10-02T19:42:18.024Z · score: 11 (2 votes) · LW · GW

In order for the standard rationality assumptions used in game theory to apply, the payouts of a game must be utilities, not resources such as money, power, or personal property. Zero-sum transfer of resources is often far from zero-sum in utility.

Hm, I feel like when I talk about game theory I don't usually use those assumptions? Admittedly I've never studied game theory in depth. But in particular, the concept of a Nash equilibrium only seems to rely on "each player has a preference order for payouts".

Actually, I'm not really sure what assumptions you mean. I assume "the players are indifferent between a certain payout of x and a 50% chance of 2x" is one, but I don't know if there's anything missing. More questions about these assumptions:

IIUC, if utility is logarithmic in a resource, then it's roughly linear in small changes of that resource. If I have £100 then I value a 50% chance of an extra £100 noticeably differently from a certain chance of an extra £50, but if I have £10000 it's about the same. Is it mostly reasonable to act as though the axioms work for resources, provided the amounts at stake are "small" for all players? (And when people talk about game theory over resources, does that tend to be the case, implicitly or explicitly?)
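A small Python sketch of that intuition, assuming utility is exactly the natural log of total wealth. `certainty_equivalent` is an illustrative helper: it gives the sure gain that is worth the same as the gamble.

```python
import math


def certainty_equivalent(wealth, prize, p=0.5):
    """Sure gain worth the same, under log utility, as a p chance of `prize`."""
    expected_log = p * math.log(wealth + prize) + (1 - p) * math.log(wealth)
    return math.exp(expected_log) - wealth


for wealth in (100, 10_000):
    ce = certainty_equivalent(wealth, 100)
    print(f"wealth £{wealth}: a 50% chance of £100 is worth about £{ce:.2f} for sure")
```

Starting from £100 the gamble is worth roughly £41 rather than £50, while starting from £10000 it is worth within a few pence of £50, which is the "roughly linear for small changes" behaviour.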

What do you lose if the assumptions are violated? Broadly speaking I assume many theorems about mixed and iterated games no longer apply.

Comment by philh on What are examples of Rationalist fable-like stories? · 2020-10-01T21:12:21.702Z · score: 8 (6 votes) · LW · GW

My own Parable of the Clock, which I guess is short enough to just copy here:

The monk Dawa had a clock that had stopped, and he was content. When he wished to know the hour, he would glance at the clock, and discover that it was noon.

One day a visiting friend commented on the clock. "Why does your clock say that the hour is noon, when I am quite sure that it is six in the evening?"

Dawa found this unlikely, for the hour had always been noon in his experience. But he had been instilled with the virtues of curiosity and empiricism. If the hour is noon, I desire to believe it is noon. If the hour is six in the evening, I desire to believe it is six in the evening. Let me not become attached to beliefs I may not want. Thus fortified, he sought out other clocks.

The time was indeed six in the evening. In accordance with the virtue of relinquishment, and gently laughing inside at his past foolishness, Dawa serenely set his broken clock forwards by six hours.

Comment by philh on Puzzle Games · 2020-10-01T15:38:05.364Z · score: 15 (4 votes) · LW · GW

My brother is the developer, so I passed this on to him. More spoilers:

Most of the inconsistencies in the reset behaviour are to prevent players getting stuck in a fail state with no way to escape (especially players who aren't trying to break things).

  1. You can get to the ending without resetting. You cannot hit 100% without resetting.
  2. This depends on where you draw the line between reasonable/unreasonable. If I've done my job right, you shouldn't need to do anything that feels like a bug.

Comment by philh on On Destroying the World · 2020-09-30T17:48:23.068Z · score: 13 (5 votes) · LW · GW

To the extent that you need a community member to blame for this, it is me. When doing this, I was operating under the belief that the community would be judging me personally

As a note, to the extent that you're trying to actively shoulder the blame here (rather than simply describing where you think it falls), this isn't a call you get to make. I'm not saying here that Chris does deserve blame; just that to the extent he does, you can't take that away from him onto yourself.

And... having this expectation seems like kind of the same sort of thing that went wrong with the admins' messaging? Like, on a high level you could describe what led to the site blowing up as: "the admins expected people to feel one way about a thing, and acted on that expectation, but some people felt a different way, and acted in ways that surprised the admins". Similarly, you may have expected us to feel one way about your actions, such that we judge you personally; but if some of us feel a different way, and judge differently, well...

You said you wanted this to be a learning opportunity for the community, and I think (despite varying levels of annoyance) we're overall taking it as such. To the extent that it's a learning opportunity for you as well, I hope you take it as such.

Comment by philh on On Destroying the World · 2020-09-29T20:46:51.308Z · score: 7 (4 votes) · LW · GW

I read that as "make a second account to say anonymously why you would have done it".

Comment by philh on On Destroying the World · 2020-09-29T08:42:58.439Z · score: 27 (7 votes) · LW · GW

Did a quick google. The only statistic I could find for how successful phishing attacks are is https://www.helpnetsecurity.com/2019/09/04/sme-phishing-attacks/:

43% of UK SMEs have experienced a phishing attempt through impersonation of staff in the last 12 months. Of those impersonation phishing attempts, it was discovered that two-thirds (66%) had suffered a successful attack, according to CybSafe.

66% is still way more than I expect, but there's no verifiable source. (Looks like CybSafe has incentive to exaggerate the numbers.) And it's not clear whether this is "66% of phishing attempts were successful" or "of organizations targeted, 66% suffered at least one successful attack". Certainly it doesn't support "you would have likely entered the codes as well".
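The two readings can be very far apart. A rough sketch, assuming every attempt succeeds independently with the same probability, and using made-up attempt counts (neither the independence assumption nor the counts come from the article):

```python
def per_attempt_rate(org_level_rate, attempts):
    """Per-attempt success probability implied by an organization-level
    'at least one success' rate, assuming independent, identical attempts."""
    # P(at least one success) = 1 - (1 - p) ** attempts, solved for p.
    return 1 - (1 - org_level_rate) ** (1 / attempts)

# Hypothetical numbers of attempts per organization over the year.
for attempts in (1, 10, 100):
    p = per_attempt_rate(0.66, attempts)
    print(f"{attempts} attempts per org -> {p:.1%} success per attempt")
```

With a single attempt per organization the two readings coincide at 66%. With ten attempts, the same headline figure would only need each attempt to succeed about 10% of the time.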

Strong-downvoted pdaa's comment pending source.

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-28T23:30:25.273Z · score: 14 (4 votes) · LW · GW

Honestly, I kind of think that would be a straightforwardly silly thing to worry about, if one were to think about it for a few moments. (And I note that it's not Chris' stated reasoning.)

Like, leave aside that the PM was indistinguishable from a phishing attack. Pretend that it had come through both email and PM, from Ben Pace, with the codes repeated. All the same... LW just isn't the kind of place where we're going to socially shame someone for

  • Not taking action
  • ...within 30 minutes of an unexpected email being sent to them
  • ...whether or not they even saw the email
  • ...in a game they didn't agree to play.

Comment by philh on On Destroying the World · 2020-09-28T17:03:08.659Z · score: 19 (5 votes) · LW · GW

Phishing attacks can often have in excess of 80% success rate. If you had received this, you would have likely entered the codes as well, even though everyone thinks that they wouldn’t. Which is just one of the reasons why it doesn’t make sense to punish recipients for making this kind of mistake.

Seconding Daniel's request for a source. But also, to clarify, does your attempt here count as one phishing attack in total, or one per message you sent?

If it's one per message, then 80% is double-plus-super-higher-than-predicted. But if it's one in total, then "you would have likely entered the codes as well" needs further justification. I said in the other thread that I wasn't super confident I wouldn't have fallen for it; but I don't think it's actively likely that I would have done, even taking your claim into account.

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-28T16:49:22.918Z · score: 4 (2 votes) · LW · GW

Note that one thing you can do is punish people for failing to provide information. It's not necessarily easy to get that right, but it's an option that's available.

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-28T09:59:37.274Z · score: 10 (2 votes) · LW · GW

I don't have strong feelings about this particular comment. But in general I think this is a tricky question. On the one hand you don't want to disincentivize providing the information; on the other hand you do want to be able to react to the information, and sometimes the appropriate reaction is to punish them. Maybe you want to punish them less for providing the information, but punishing them zero would also be really bad incentives.

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-28T09:45:09.338Z · score: 12 (3 votes) · LW · GW

Even after receiving that message, it still seems like the "do not engage" action is to not enter the codes?

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-26T20:45:25.075Z · score: 4 (2 votes) · LW · GW

That's... a bit surprising. If I were behind this, I wouldn't have sent a message to you because you're likely to know the plan. Anyone who receives the message but doesn't fall for it is an extra chance for the scheme to fail, because if nothing else they can post in this thread where someone is most likely to see it before entering codes. (Chris, I'm curious if you did look in this thread before you put them in?) In your case you could react with admin powers too, though I dunno if you would have considered that fair game.

I feel like this gives us a small amount of evidence about the identity of the adversary, but not enough to do any real speculation with.

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-26T15:55:34.783Z · score: 8 (5 votes) · LW · GW

My partner says that as a kid, their school did something similar as part of "don't talk to strangers" teaching. The "stranger" in question was someone the class had been working with all day, introduced by their teacher.

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-26T14:05:40.611Z · score: 9 (5 votes) · LW · GW

I'd like to offer some combination of consoling hugs and ಠ_ಠ.

(edit: but to be clear I'm not super-confident I wouldn't have fallen for it myself. Especially if I saw that message 25 minutes after it was sent.)

Comment by philh on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-26T14:02:33.294Z · score: 8 (6 votes) · LW · GW

If Chris could be confident it came from the admins I'd agree, but with my current knowledge (and assuming the admins would have been honest had Chris messaged them on their normal accounts) it feels more like pentesting.

Comment by philh on The Haters Gonna Hate Fallacy · 2020-09-24T23:11:35.245Z · score: 2 (1 votes) · LW · GW

Broadly agree, but:

“I think [term X] in your post is going to cause misunderstandings, I’d suggest phrasing it differently.”

“Oh, haters are gonna hate, there’s no amount of rephrasing that’s going to prevent this from being misinterpreted if people want to.”

I think I feel differently about this depending on whether the first person has a specific suggestion. Without it, I think that sometimes the second person isn't "not bothering to put in the effort" so much as "doesn't think the effort will pay off", or even "has already put in the effort and this was the result". It may be much easier to notice that something will be misunderstood than to phrase it differently such that it won't be.