Is there a "coherent decisions imply consistent utilities"-style argument for non-lexicographic preferences? 2021-06-29T19:14:21.436Z
Recently I bought a new laptop 2021-04-10T20:29:45.685Z
You Can Do Futarchy Yourself 2020-06-14T00:16:20.823Z
Tetraspace Grouping's Shortform 2019-08-02T01:37:14.859Z


Comment by Tetraspace (tetraspace-grouping) on Conditional prediction markets are evidential, not causal · 2024-02-08T01:26:24.567Z · LW · GW

This ends up being pretty important in practice for decision markets ("if I choose to do X, will Y?"), where by default you might e.g. only make a decision if it's a good idea (as evaluated by the market), and therefore all traders will condition on the market having a high probability, which is obviously quite distortionary.

Comment by Tetraspace (tetraspace-grouping) on Tamsin Leake's Shortform · 2023-12-24T14:51:54.918Z · LW · GW

I replied on discord that I feel there's maybe something more formalisable that's like:

  • reality runs on math because, and is the same thing as, there's a generalised-state-transition function
  • because reality has a notion of what happens next, realityfluid has to give you a notion of what happens next, i.e. it normalises
  • the idea of a realityfluid that doesn't normalise only comes to mind at all because you learned about R^n first in elementary school instead of S^n

which I do not claim confidently because I haven't actually generated that formalisation, and am posting here because maybe there will be another Lesswronger's eyes on it that's like "ah, but...". 

Comment by Tetraspace (tetraspace-grouping) on Shutdown-Seeking AI · 2023-06-01T18:54:55.534Z · LW · GW

Not unexpected! I think we should want AGI to, at least until it has some nice coherent CEV target, explain at each self-improvement step exactly what it's doing, to ask for permission for each part of it, to avoid doing anything in the process that's weird, to stop when asked, and to preserve these properties. 

Comment by Tetraspace (tetraspace-grouping) on Tetraspace Grouping's Shortform · 2023-04-28T16:02:07.675Z · LW · GW

Even more recently I bought a new laptop. This time, I made the same sheet, multiplied the score from the hard drive by  because 512 GB is enough for anyone and that seemed intuitively the amount I prioritised extra hard drive space compared to RAM and processor speed, and then looked at the best laptop before sharply diminishing returns set in; this happened to be the HP ENVY 15-ep1503na 15.6" Laptop - Intel® Core™ i7, 512 GB SSD, Silver. This is because I have more money now, so I was aiming to maximise consumer surplus rather than minimise the amount I was spending.[1]

Surprisingly, it came with a touch screen! That's just the kind of nice thing that laptops do nowadays, because as I concluded in my post, everything nice about laptops correlates with everything else so high/low end is an axis it makes sense to sort things on. Less surprisingly, it came with a graphics card, because ditto.

Unfortunately this high-end laptop is somewhat loud; probably my next one will be less loud, up to and including an explicit penalty for noise.

  1. ^

    It would have been predictable, however, at the time that I bought that new laptop, that I would have had that much money at a later date. Which means that I should have just skipped straight to consumer surplus maxxing.

Comment by Tetraspace (tetraspace-grouping) on Is the fact that we don't observe any obvious glitch evidence that we're not in a simulation? · 2023-04-26T16:58:11.501Z · LW · GW

It would be evidence at all. Simple explanation: if we did observe a glitch, that would pretty clearly be evidence we were in a simulation. So by conservation of expected evidence, non-glitches are evidence against.
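
A small numeric sketch of that argument. The prior and the two glitch likelihoods below are made-up numbers chosen only for illustration, not estimates of anything; the point is just that if seeing a glitch raises the probability of a simulation, then seeing no glitch has to lower it, and the probability-weighted average of the two posteriors comes back to the prior.

```python
def posterior(prior, p_obs_given_h, p_obs_given_not_h):
    """Bayes' rule for a binary hypothesis H and a single observation."""
    joint_h = prior * p_obs_given_h
    joint_not_h = (1 - prior) * p_obs_given_not_h
    return joint_h / (joint_h + joint_not_h)


# Hypothetical inputs (assumptions for illustration only).
prior_sim = 0.5            # P(simulation) before looking for glitches
p_glitch_if_sim = 0.01     # P(observe a glitch | simulation)
p_glitch_if_real = 0.0001  # P(observe a glitch | not a simulation)

post_glitch = posterior(prior_sim, p_glitch_if_sim, p_glitch_if_real)
post_no_glitch = posterior(prior_sim, 1 - p_glitch_if_sim, 1 - p_glitch_if_real)

# Conservation of expected evidence: the expected posterior equals the prior.
p_glitch = prior_sim * p_glitch_if_sim + (1 - prior_sim) * p_glitch_if_real
expected_posterior = p_glitch * post_glitch + (1 - p_glitch) * post_no_glitch

print(post_glitch, post_no_glitch, expected_posterior)
```

With numbers like these the downward update from "no glitch" is tiny, which matches the comment's framing: it is evidence at all, not strong evidence.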

Comment by Tetraspace (tetraspace-grouping) on Pausing AI Developments Isn't Enough. We Need to Shut it All Down · 2023-04-18T22:34:34.687Z · LW · GW

I don't think it's quite that; a more central example I think would be something like a post about extrapolating demographic trends to 2070 under the UN's assumptions, where then justifying whether or not 2070 is a real year is kind of a different field. 

Comment by Tetraspace (tetraspace-grouping) on Tetraspace Grouping's Shortform · 2023-04-10T14:03:33.097Z · LW · GW

, as a mathematical structure, is smarter than god and perfectly aligned to ; the value of  will never actually be  because  is more objectively rational, or because you made a typo and it knows you meant to say ; and no matter how complicated the mapping is from  to  it will never fall short of giving the  that gives the highest value of .

Which is why in principle you can align a superior being, like , or maybe like a superintelligence.

Comment by Tetraspace (tetraspace-grouping) on Tetraspace Grouping's Shortform · 2023-04-03T19:54:14.492Z · LW · GW

"The AI does our alignment homework" doesn't seem so bad - I don't have much hope for it, but because it's a prosaic alignment scheme so someone trying to implement it can't constrain where Murphy shows up, rather than because it's an "incoherent path description".

A concrete way this might be implemented is 

  • A language model is trained on a giant text corpus to learn a bunch of adaptations that make it good at math, and then fine-tuned for honesty. It's still being trained at a safe and low level of intelligence where honesty can be checked, so this gets a policy that produces things that are mostly honest on easy questions and sometimes wrong and sometimes gibberish and never superhumanly deceptive.[1]
  • It's set to work producing conceptually crisp pieces of alignment math, things like expected utility theory or logical inductors, slowly on inspectable scratchpads and so on, with the dumbest model that can actually factor scientific research[1], with human research assistants to hold their hand if that lets you make the model dumber. It does this, rather than engineering, because this kind of crisp alignment math is fairly uniquely pinned down so it can be verified, and it's easier to generate compared to any strong pivotal engineering task where you're competing against humans on their own ground so you need to be smarter than humans, so while it's operating in a more dangerous domain it's using a safer level of intelligence.[1]
  • The human programmers then use this alignment math to make a corrigible thingy that has dangerous levels of intelligence that does difficult engineering and doesn't know about humans, while this time knowing what they're doing. Getting the crisp alignment math from parallelisable language models helps a lot and gives them a large lead time, because a lot of it's the alignment version of backprop where it would have taken a surprising amount of time to discover otherwise.

This all happens at safe-ish low-ish levels of intelligence (such a model would probably be able to autonomously self-replicate on the internet, but probably not reverse protein folding, which means that all the ways it could be dangerous are "well don't do that"s as long as you keep the code secret[1]), with the actual dangerous levels of optimisation being done by something made by the humans using pieces of alignment math which are constrained down to a tiny number of possibilities.

EDIT 2023-07-25: A longer debate that I think is worth reading about the model that leads it to being an incoherent path description between Holden Karnofsky (pro) and Nate Soares (against) is here; I hadn't read this as of writing this.

  1. ^

    Unless it isn't; it's a giant pile of tensors, how would you know? But this isn't special to this use case.

Comment by Tetraspace (tetraspace-grouping) on "Dangers of AI and the End of Human Civilization" Yudkowsky on Lex Fridman · 2023-03-30T20:36:24.417Z · LW · GW

The solanine poisoning example was originally posted to Reddit here, the picture of Sydney Bing from a text description was posted on Twitter here.

Comment by Tetraspace (tetraspace-grouping) on The Overton Window widens: Examples of AI risk in the media · 2023-03-25T12:00:09.398Z · LW · GW

The alignment, safety and interpretability work is continuing at full speed, but if all the efforts of the alignment community are only sufficient to get enough of this to avoid the destruction of the world in 2042, and AGI is created in 2037, then at the end you get a destroyed world.

It might not be possible in real life (List of Lethalities: "we can't just decide not to build AGI"), and even if possible it might not be tractable enough to be worth focusing any attention on, but it would be nice if there was some way to make sure that AGI happens after alignment is sufficient at full speed (EDIT: or, failing that, to happen later, so if alignment goes quickly that takes the world from bad outcomes to good outcomes, instead of bad outcomes to bad outcomes).

Comment by Tetraspace (tetraspace-grouping) on Alignment-related jobs outside of London/SF · 2023-03-23T20:09:55.104Z · LW · GW

80,000 Hours' job board lets you filter by city. As of the time of writing, roles in their AI Safety & Policy tag are 61/112 San Francisco, 16/112 London, 35/112 other (including remote).

Comment by Tetraspace (tetraspace-grouping) on My Objections to "We’re All Gonna Die with Eliezer Yudkowsky" · 2023-03-22T22:48:37.288Z · LW · GW

There are about 8 billion people, so your 24,000 QALYs should be 24,000,000.

Comment by Tetraspace (tetraspace-grouping) on An Appeal to AI Superintelligence: Reasons to Preserve Humanity · 2023-03-22T22:32:24.145Z · LW · GW

Comment by Tetraspace (tetraspace-grouping) on Are we too confident about unaligned AGI killing off humanity? · 2023-03-22T17:57:18.973Z · LW · GW

I don't mean to say that it's additional reason to respect him as an authority or accept his communication norms above what you would have done for other reasons (and I don't think people particularly are here), just that it's the meaning of that jokey aside.

Comment by Tetraspace (tetraspace-grouping) on Are we too confident about unaligned AGI killing off humanity? · 2023-03-21T17:48:02.171Z · LW · GW

Maybe you got into trouble for talking about that because you are rude and presumptive?

I think this is just a nod to how he's literally Roko, for whom googling "Roko simulation" gives a Wikipedia article on what happened last time. 

Comment by Tetraspace (tetraspace-grouping) on AGI will know: Humans are not Rational · 2023-03-20T21:50:29.474Z · LW · GW

What, I wonder, shall such an AGI end up "thinking" about us?

IMO: "Oh look, undefended atoms!" (Well, not in that format. But maybe you get the picture.)

You kind of mix together two notions of irrationality:

I think only the first one is really deserving of the name "irrationality". I want what I want, and if what I want is a very complicated thing that takes into account my emotions, well, so be it. Humans might be bad at getting what they want, they might be mistaken a lot of the time about what they want and constantly step on their own toes, but there's no objective reason why they shouldn't want that.

Still, when up against a superintelligence, I think that both value being fragile and humans being bad at getting what they want count against humans getting anything they want out of the interaction: 

  • Superintelligences are good at getting what they want (this is really what it means to be a superintelligence)
  • Superintelligences will have whatever goal they have, and I don't think that there's any reason why this goal would be anything to do with what humans want (the orthogonality thesis; the goals that a superintelligence has are orthogonal to how good it is at achieving them)

Together, this adds up to a superintelligence that sees humans using resources that it could be using for something else (and it would want to use them for something else, not just for what the humans are trying to do but more, because it has its own goals), and because it's good at getting what it wants it gets those resources, which is very unfortunate for the humans.

Comment by Tetraspace (tetraspace-grouping) on Why not just boycott LLMs? · 2023-03-19T13:10:56.331Z · LW · GW

Boycotting LLMs reduces the financial benefit of doing research that is (EDIT: maybe) upstream to AGI in the tech tree.

Comment by Tetraspace (tetraspace-grouping) on Tetraspace Grouping's Shortform · 2023-03-03T20:23:00.428Z · LW · GW

Arbital gives a distinction between "logical decision theory" and "functional decision theory" as: 

  • Logical decision theories are a class of decision theories that have a logical counterfactual (vs. the causal counterfactual that CDT has and the evidential counterfactual EDT has).
  • Functional decision theory is the type of logical decision theory where the logical counterfactual is fully specified, and correctly gives the logical consequences of "decision function X outputs action A".

More recently, I've seen in Decision theory does not imply that we get to have nice things:

  • Logical decision theory is the decision theory where the logical counterfactual is fully specified.
  • Functional decision theory is the incomplete variant of logical decision theory where the logical consequences of "decision function X outputs action A" have to be provided by the setup of the thought experiment.

Any preferences? How have you been using it? 

Comment by Tetraspace (tetraspace-grouping) on (Cryonics) can I be frozen before being near-death? · 2023-03-01T12:29:55.744Z · LW · GW

Further to it being legally considered murder, tricky plans to get around this are things that appear to the state like possibly a tricky plan to get around murder, and result in an autopsy which at best and only if the cryonics organisation cooperates leaves one sitting around warm for over a day with no chance of cryoprotectant perfusion later.

Comment by Tetraspace (tetraspace-grouping) on Three Fables of Magical Girls and Longtermism · 2022-12-09T23:46:17.432Z · LW · GW

@ESYudkowsky on Twitter:

Rereading a bit of Hieronym's PMMM fanfic "To The Stars" and noticing how much my picture of dath ilan's attempt at competent government was influenced / inspired by Governance there, including the word itself.

Comment by Tetraspace (tetraspace-grouping) on Cleaning a Spoon is Complex · 2022-10-09T12:04:41.809Z · LW · GW

For some inspiration, put both memes side by side and listen to Landsailor. (The mechanism by which one listens to it, in turn, is also complex. I love civilisation.)

Comment by Tetraspace (tetraspace-grouping) on Russia will do a nuclear test · 2022-10-04T19:06:25.781Z · LW · GW

Relevant Manifold: Will Russia conduct a nuclear test during 2022?, currently at 26%.

Comment by Tetraspace (tetraspace-grouping) on Is there a beeminder without the punishment? · 2021-09-14T17:59:57.675Z · LW · GW

Beemium (the subscription tier that allows pledgeless goals) is $40/mo currently, increased in January 2021 from $32/mo and in 2014 from the original $25/mo.

Comment by Tetraspace (tetraspace-grouping) on Some phrases in The Map that... Confuse me- help please, to make my review of it better! · 2021-08-16T20:08:49.388Z · LW · GW

The essay What Motivated Rescuers During the Holocaust is on Lesswrong under the title Research: Rescuers during the Holocaust - it was renamed because all of the essay titles in Curiosity are questions, which I just noticed now and is cute. I found it via the URL, which is listed in the back of the book.

The bystander effect is an explanation of the whole story:

  • Because of the bystander effect, most people weren't rescuers during the Holocaust, even though that was obviously the morally correct thing to do; they were in a large group of people who could have intervened but didn't.
  • The standard way to break the bystander effect is by pointing out a single individual in the crowd to intervene, which is effectively what happened to the people who became rescuers by circumstances that forced them into action.

Comment by Tetraspace (tetraspace-grouping) on Is there a "coherent decisions imply consistent utilities"-style argument for non-lexicographic preferences? · 2021-06-30T17:41:21.176Z · LW · GW

Why would you wait until ? It seems like at any time  the expected payoff will be , which is strictly decreasing with .

Comment by Tetraspace (tetraspace-grouping) on 2 innovative life extension approaches using cryonics technology · 2021-04-02T20:56:08.392Z · LW · GW

One big advantage of getting a hemispherectomy for life extension is that, if you don't tell the Metaculus community before you do it, you can predict much higher than the community median of 16% - I would have 71 Metaculus points to gain from this, for example, much greater than the 21 in expectation I would get if the community median was otherwise accurate.

Comment by Tetraspace (tetraspace-grouping) on Rafael Harth's Shortform · 2021-02-14T17:04:37.713Z · LW · GW

This looks like the hyperreal numbers, with your  equal to their .

Comment by Tetraspace (tetraspace-grouping) on 0 And 1 Are Not Probabilities · 2020-12-27T14:20:01.161Z · LW · GW

The real number 0.20 isn't a probability, it's just the same odds but written in a different way to make it possible to multiply (specifically you want some odds product * such that A:B * C:D = AC:BD). You are right about how you would convert the odds into a probability at the end.
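
A minimal sketch of that odds product, and of the conversion to a probability at the end. The specific odds (1:5 and 3:1) and the function names are hypothetical, picked so that the prior odds written as a single real number come out to 0.20.

```python
from fractions import Fraction


def odds_product(odds1, odds2):
    """Multiply odds componentwise: A:B * C:D = AC:BD."""
    (a, b), (c, d) = odds1, odds2
    return (a * c, b * d)


def odds_to_probability(odds):
    """Convert odds A:B (for:against) into the probability A / (A + B)."""
    a, b = odds
    return Fraction(a, a + b)


prior_odds = (1, 5)        # hypothetical; as a single real number, 1/5 = 0.20
likelihood_ratio = (3, 1)  # hypothetical evidence, three times likelier if true

posterior_odds = odds_product(prior_odds, likelihood_ratio)

print(Fraction(*prior_odds))                # the odds as one number: 1/5
print(odds_to_probability(prior_odds))      # the probability is 1/6, not 1/5
print(posterior_odds)                       # (3, 5)
print(odds_to_probability(posterior_odds))  # 3/8
```

Odds of 1:5 written as 0.20 and the probability 1/6 are different numbers for the same belief; only the first form multiplies cleanly.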

Comment by Tetraspace (tetraspace-grouping) on Hermione Granger and Newcomb's Paradox · 2020-12-15T17:53:24.217Z · LW · GW

Just before she is able to open the envelope, a freak magical-electrical accident sends a shower of sparks down, setting it alight. Or some other thing necessitated by Time to ensure that the loop is consistent. Similar kinds of problems to what would happen if Harry was more committed to not copying "DO NOT MESS WITH TIME".

Comment by Tetraspace (tetraspace-grouping) on Coherent decisions imply consistent utilities · 2020-12-13T02:15:30.728Z · LW · GW

I have used this post quite a few times as a citation when I want to motivate the use of expected utility theory as an ideal for making decisions, because it explains how it's not just an elegant decisionmaking procedure from nowhere but a mathematical inevitability of the requirements to not leave money on the table or to accept guaranteed losses. I find the concept of coherence theorems a better foundation than the normal way this is explained, by pointing at the von Neumann-Morgenstern axioms and saying "they look true".

Comment by tetraspace-grouping on [deleted post] 2020-12-12T23:55:37.171Z

The number of observers in a universe is solely a function of the physics of that universe, so the claim that a theory that implies 2Y observers is a third as likely as a theory that implies Y observers (even before the anthropic update) is just a claim that the two theories don't have an equal posterior probability of being true.

Comment by Tetraspace (tetraspace-grouping) on Humans Who Are Not Concentrating Are Not General Intelligences · 2020-12-09T23:41:47.218Z · LW · GW

This post uses the example of GPT-2 to highlight something that's very important generally - that if you're not concentrating, you can't distinguish GPT-2 generated text that is known to be gibberish from non-gibberish.

And hence it gives the important lesson, which might be hard to learn for oneself when not concentrating, that you can't really get away with not concentrating.

Comment by tetraspace-grouping on [deleted post] 2020-12-07T21:38:50.948Z

This is self-sampling assumption-like reasoning: you are reasoning as if experience is chosen from a random point in your life, and since most of an immortal's life is spent being old, but most of a mortal's life is spent being young, you should hence update away from being immortal. 

You could apply self-indication assumption-like reasoning to this: as if your experience is chosen from a random point in any life. Then, since you are also conditioning on being young, and both immortals and mortals have one youthhood each, just being young doesn't give you any evidence for or against being immortal that you don't already have. (This is somewhat in line with your intuitions about civilisations: immortal people live longer, so they have more Measure/prior probability, and this cancels out with the unlikelihood of being young given you're immortal)

Comment by Tetraspace (tetraspace-grouping) on Yes Requires the Possibility of No · 2020-12-04T23:17:52.502Z · LW · GW

Yes requiring the possibility of no has been something I've intuitively been aware of in social situations (anywhere where one could claim "you would have said that anyway"). 

This post does a good job of applying more examples and consequences of this (the examples cover a wide range of decisions), and tying it to the mathematical law of conservation of evidence.

Comment by Tetraspace (tetraspace-grouping) on The next AI winter will be due to energy costs · 2020-11-30T23:24:16.630Z · LW · GW

In The Age of Em, I was somewhat confused by the talk of reversible computing, since I assumed that the Landauer limit was some distant sci-fi thing, probably derived by doing all your computation on the event horizon of a black hole. That we're only three orders of magnitude away from it was surprising and definitely gives me something to give more consideration to. The future is reversible!

I did a back-of-the-envelope calculation about what a Landauer limit computer would look like to rejiggle my intuitions with respect to this, because "amazing sci-fi future" to "15 years at current rates of progress" is quite an update.

Then, the lower limit is  with  or  [...] A current estimate for the number of transistor switches per FLOP is .

The peak of human computational ingenuity is of course the games console. When doing something very intensive, the PS5 consumes 200 watts and does 10 teraFLOPs (10^13 FLOPs). At the Landauer limit, that power would do roughly 10^23 bit erasures per second. The difference is about 10^10 - 6 orders of magnitude from FLOPs to bit erasure conversion, 1 order of magnitude from inefficiency, 3 orders of magnitude from physical limits, perhaps.
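
The back-of-the-envelope calculation can be redone in a few lines. This is a sketch under one assumption the comment does not state: the chip is taken to run at room temperature, 300 K. The 200 W and 10 teraFLOPs figures are the ones quoted above; the constant and variable names are mine.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)
TEMPERATURE_K = 300.0             # assumption: room temperature


def landauer_energy_per_bit(temperature_k):
    """Minimum energy in joules to erase one bit: k * T * ln(2)."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


power_w = 200.0            # PS5 power draw quoted in the comment
flops_per_second = 10e12   # 10 teraFLOPs, as quoted in the comment

erasures_per_second = power_w / landauer_energy_per_bit(TEMPERATURE_K)
gap_orders = math.log10(erasures_per_second / flops_per_second)

print(f"Landauer energy per bit: {landauer_energy_per_bit(TEMPERATURE_K):.3e} J")
print(f"Bit erasures per second at 200 W: {erasures_per_second:.3e}")
print(f"Orders of magnitude between erasures and FLOPs: {gap_orders:.1f}")
```

Under that temperature assumption the gap comes out just under ten orders of magnitude, in line with the 6 + 1 + 3 breakdown.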

Comment by Tetraspace (tetraspace-grouping) on The Darwin Game - Rounds 10 to 20 · 2020-11-20T13:06:23.221Z · LW · GW

Indeed, OscillatingTwoThreeBot does behave like that. Thanks for the cooperation LiamGoddard!

Comment by Tetraspace (tetraspace-grouping) on Open & Welcome Thread – November 2020 · 2020-11-15T17:57:20.612Z · LW · GW

:0, information on the original AI box games!

In that round, the ASI convinced me that I would not have created it if I wanted to keep it in a virtual jail.

What's interesting about this is that, despite the framing of Player B being the creator of the AGI, they are not. They're still only playing the AI box game, in which Player B loses by saying that they lose, and otherwise they win.

For a time I suspected that the only way that Player A could win a serious game is by going meta, but apparently this was done just by keeping Player B swept up in their role enough to act how they would think the creator of the AGI would act. (Well, saying "take on the role of [someone who would lose]" is meta, in a sense.)

Comment by Tetraspace (tetraspace-grouping) on Tetraspace Grouping's Shortform · 2020-11-12T12:46:50.247Z · LW · GW

Smarkets is currently selling shares in Trump conceding if he loses at 57.14%. The Good Judgement Project's superforecasters predict that any major presidential candidate will concede with probability 88%. I assign <30% probability to Biden conceding* (scenarios where Biden concedes are probably overwhelmingly ones where court cases/recounts mean states were called wrong, which Betfair assigns ~10% probability to, and FTX kind of** assigns 15% probability to, and even these seem high), so I think it's a good bet to take.

* I think that the Trump concedes if he loses market is now unconditional, because by Smarkets' standards (projected electoral votes from major news networks) Biden has won.

** Kind of, because some TRUMP shares expired at 1 TRUMPFEB share - $0.10, rather than $0 as expected, and some TRUMP shares haven't expired yet, because TRUMP holders asked. So it's possible that the value of a TRUMPFEB share might also include the value of a hypothetical TRUMPMAR share, or that TRUMPFEB trades will be nullified at some point, or some other retrospective rule change on FTX's part.

UPDATE 2020-11-16: Trump... kind of conceded? Emphasis mine:

He won because the Election was Rigged. NO VOTE WATCHERS OR OBSERVERS allowed, vote tabulated by a Radical Left privately owned company, Dominion, with a bad reputation & bum equipment that couldn’t even qualify for Texas (which I won by a lot!), the Fake & Silent Media, & more!

While he has retracted this, it met Smarkets' standards, so I'm £22.34 richer.
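
Why this looked like a good bet can be sketched as an expected-value calculation. Two assumptions here are not from the comment: the superforecasters' 88% is adopted as the believed probability (the author's own number is not stated), and exchange commission is ignored. The function name is hypothetical.

```python
def expected_profit_per_pound(market_price, believed_probability):
    """Expected profit from staking 1 pound on a binary contract.

    A contract bought at `market_price` returns 1 / market_price per pound
    staked if it resolves yes, and nothing otherwise. Commission is ignored.
    """
    payout_if_yes = 1.0 / market_price
    return believed_probability * payout_if_yes - 1.0


market_price = 0.5714        # Smarkets price quoted in the comment
believed_probability = 0.88  # assumption: take the superforecasters' 88%

edge = expected_profit_per_pound(market_price, believed_probability)
print(f"Expected profit per pound staked: {edge:.2f}")
```

Any believed probability above the market price gives a positive edge; the further above, the larger it is.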

Comment by Tetraspace (tetraspace-grouping) on Share your personal stories of prediction markets · 2020-11-08T17:13:28.635Z · LW · GW

I bet £10 on Biden winning on Smarkets upon reading the GJP prediction, because I trust superforecasters more than prediction markets. I bet another £10 after reading Demski's post on Kelly betting - my bankroll is much larger than £33 (!! Kelly bets are enormous!) but as far as my System 1 is concerned I'm still a broke student who would have to sheepishly ask their parents to cover any losses.

Very pleased about the tenner I won, might spend it on a celebratory beer.
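
For a sense of why Kelly bets feel enormous, here is a sketch of the Kelly fraction for a binary market. The price (0.60) and belief (0.72) are hypothetical, chosen so the fraction comes out near 0.3; that is one possible reading of how a £10 stake relates to a £33 bankroll, not something the comment spells out. Commission is ignored.

```python
def kelly_fraction(market_price, believed_probability):
    """Kelly-optimal fraction of bankroll to stake on a binary contract.

    For a contract priced at q that you believe resolves yes with
    probability p, the fraction is (p - q) / (1 - q), floored at zero
    when there is no edge.
    """
    if believed_probability <= market_price:
        return 0.0
    return (believed_probability - market_price) / (1.0 - market_price)


def kelly_stake(bankroll, market_price, believed_probability):
    """Stake in currency units implied by the Kelly fraction."""
    return bankroll * kelly_fraction(market_price, believed_probability)


# Hypothetical numbers for illustration.
fraction = kelly_fraction(0.60, 0.72)
stake = kelly_stake(33.0, 0.60, 0.72)

print(f"Kelly fraction: {fraction:.2f}")
print(f"Stake on a 33 pound bankroll: {stake:.2f}")
```

A modest-looking edge of twelve percentage points already calls for staking about a third of the bankroll, which is why a System 1 that still thinks in student budgets balks at it.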

Comment by Tetraspace (tetraspace-grouping) on Babble challenge: 50 ways of solving a problem in your life · 2020-10-28T00:19:23.575Z · LW · GW

The problem I have and wish to solve is, of course, the accurséd Akrasia that stops me from working on AI safety.

Let's begin with the easy ones:

1 Stop doing this babble challenge early and go try to solve AI safety.

2 Stop doing this babble challenge early; at 11 pm, specifically, and immediately sleep, in order to be better able to solve AI safety tomorrow.

In fact generally sleep seems to be a problem, I spend 10 hours doing it every day (could be spent solving AI safety) and if I fall short I am tired. No good! So working on this instrumental goal.

3 Get blackout curtains to improve sleep quality

4 Get sleep mask to improve sleep quality

5 Get better mattress to improve sleep quality

6 Find a beverage with more caffeine to reduce the need for sleep

7 Order modafinil online to reduce the need for sleep

And heck while we're on the topic of stimulants

8 Order adderall online or from a friend to increase ability to focus

9 Look up good nootropics stacks to improve cognitive ability and hence ability to do AI safety

Now another constraint when doing AI safety is that I don't have a good shovel-ready list of things to try, and it's easy for me to get distracted if I can't just pick something from the task list

10 Check if complice solves this problem

11 Check if some ordinary getting-things-done (that I can stick into roam) solves this problem

12 Make a giant checklist and go down this list

13 Make a personal kanban board of things that would be nice for solving AI safety

And instrumentally useful for creating these task lists?

14 Ask friends who know about AI safety for things to do

15 Apophatically ask for suggestions for things to do via an entry on a list of 50 items for a lesswrong babble challenge

Anyway, I digress. I'm here to solve akrasia, not make a checklist. Unless I need more items on this list, in which case I will go back to checklist construction. Is this pruning? Never mind. Back to the point:

16 Set up some desktop shortcut macro thing in order to automatically start pomodoros when I open my laptop

17 Track time spent doing things useful to AI safety on a spreadsheet

18 Hey, I said "laptop"! Get a better mouse to make using the laptop more fun so I'm more likely to do hard things when using it

19 Get a better desk for more space for notes and to require less expensive shifting into/out of AI safety mode

20 On notes, use the index cards I have to make a proper zettelkasten as a cognitive aid

(Does this solve akrasia? Well, if I have better cognitive aids, then doing cognitively expensive things is easier, so I'm less likely to fail even with my current levels of willpower)

21 Start doing accountability things like promising to review a paper every X time period

22 I said levels of willpower - Google for interventions that increase conscientiousness (there's gotta be some dodgy big-5 based things) and do those?

Back to the top of the tree

23 Quit my job because it's using up energy that I could be using to do AI safety

24 Instead of doing my job, pretend to do my job while actually doing AI safety

25 Set up an AI safety screen on work laptop so it's easy to switch over to doing AI safety during breaks or lunches

Hey, I said lunch

26 Use nutritionally complete meal replacements to save time/willpower that would be spent on food preparation

27 Use nutritionally complete meal replacements to ensure that nutrient intake keeps me in top physical form

28 Exercise (this improves everything, apparently) by running on a treadmill

29 By lifting weights

30 By jogging in a large circle

31 Become a monk and live an austere lifestyle without the distractions of rich food, wine, and lust

32 Become an anti-monk and live a rich lifestyle to ensure that no willpower is wasted on distractions

33 Specifically in vice use nicotine as a performance enhancing stimulant by smoking. Back to stimulants again I guess

34 ... or by using nicotine patches or gum or something

35 By using nicotine only if I do AI safety things, in order to develop an addiction to AI safety

Hey, develop an addiction to doing AI safety! People go to serious lengths for addictions, so why not gate it on math?

36 Do so with something very addictive, like opioids

37 Use electric shocks to do classical conditioning

etc. there was a short sci-fi story about this kind of thing let me see if I can find it. Hey, actually, since I said sci-fi, and this is a babble challenge:

38 Promise very hard to time travel back to this exact point in time, meet future self, receive advice

(They're not here :( Oh well) Back on that akrasia-solving:

39 Make up a far-future person who I am specifically working to save (they're called Dub See Wun). Get invested in their internal life (they want to make their own star!). Feel an emotional connection to them. I'm doing it for them!

40 Specifically put up a "do it for them" poster modelled off the one in the Simpsons

41 DuckDuckGo "how to beat akrasia" and do the top suggestion

42 Adopt strategic probably false beliefs (the world will end in 1 year!! :0) in order to encourage a more aggressive search for strategies

"Aggressive search for strategies" is the virtue that the Sequences call "actually trying", so in the Sequences-sphere

43 Go to a CFAR workshop, which I heard might be kind of useful towards this sort of thing

44 Or just read the CFAR booklet and apply the wisdom found in there

45 Or some sequence on Lesswrong with exercises that applies some CFARy wisdom

Of course all this willpower boosting and efficiency and stuff wouldn't help if I was just doing the wrong thing faster (like that one Shen comic, you know the one). So:

46 Consider how much of what I think is working on AI safety is actually just self-actualisy math/CS stuff, throw that out, and actually try to solve the problem

47 Deliberately create and encourage a subagent in my mind that wants to do AI safety (call em Dub See Wun)

48 Adopt strategic infohazards in order to encourage a more focused and aggressive search for strategies

49 Post a lot about AI safety in public forums like Lesswrong so that I feel compelled to do AI safety in my private life in order to maintain the illusion that I'm some kind of AI-safety-doing-person

50 Stop doing this babble challenge at the correct time, and continue to do AI safety or sleep as in 1) or 2). Hey, this one seems good. Think I might try it now!

Comment by Tetraspace (tetraspace-grouping) on Introduction to Cartesian Frames · 2020-10-26T20:22:53.033Z · LW · GW

This means you can build an action that says something like "if I am observable, then I am not observable. If I am not observable, I am observable" because the swapping doesn't work properly.

Constructing this more explicitly: Suppose that and . Then must be empty. This is because for any action in the set , if was in then it would have to equal which is not in , and if was not in it would have to equal which is in .

Since is empty, is not observable.

Comment by Tetraspace (tetraspace-grouping) on The Darwin Game - Rounds 0 to 10 · 2020-10-26T17:34:56.229Z · LW · GW

Because the best part of a sporting event is the betting, I ask Metaculus: [Short-Fuse] Will AbstractSpyTreeBot win the Darwin Game on Lesswrong?

Comment by Tetraspace (tetraspace-grouping) on The Darwin Game - Rounds 0 to 10 · 2020-10-24T23:12:26.886Z · LW · GW

How does your CooperateBot work (if you want to share?). Mine is OscillatingTwoThreeBot which IIRC cooperates in the dumbest possible way by outputting the fixed string "2323232323...".

Comment by Tetraspace (tetraspace-grouping) on Tetraspace Grouping's Shortform · 2020-10-23T20:06:05.511Z · LW · GW

I have two questions on Metaculus that compare how good elements of a pair of cryonics techniques are: preservation by Alcor vs preservation by CI, and preservation using fixatives vs preservation without fixatives. They are forecasts of the value (% of people preserved with technique A who are revived by 2200)/(% of people preserved with technique B who are revived by 2200), which barring weird things happening with identity is the likelihood ratio of someone waking up if you learn that they've been preserved with one technique vs the other.

Interpreting these predictions in a way that's directly useful requires some extra work - you need some model for turning the ratio P(revival|technique A)/P(revival|technique B) into plain P(revival|technique X), which is the thing you care about when deciding how much to pay for a cryopreservation.

One toy model is to assume that one technique works (P(revival) = x), but the other technique may be flawed (P(revival) < x). If r < 1, it's the technique in the numerator that's flawed, and if r > 1, it's the technique in the denominator that's flawed. This is what I guess is behind the trimodality in the Metaculus community median: there are peaks at the high end, the low end, and at exactly 1, perhaps corresponding to one working, the other working, and both working.

For the current community medians (as of 2021-04-18), using that model, using the Ergo library, normalizing the working technique to 100%, I find:

Alcor vs CI:

  • EV(Preserved with Alcor) = 69%
  • EV(Preserved with Cryonics Institute) = 76%

Fixatives vs non-Fixatives

  • EV(Preserved using Fixatives) = 83%
  • EV(Preserved without using Fixatives) = 34%

(here's the Colab notebook)
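A minimal sketch of the toy model above, assuming the forecast is available as a list of samples of the ratio r = P(revival|A)/P(revival|B). It does not use Ergo or the real Metaculus community distribution; the function name `expected_revival` and the sample values are hypothetical, chosen only for illustration. With the working technique normalised to 100%, technique A is worth min(r, 1) and technique B is worth min(1/r, 1) in each sampled world:

```python
def expected_revival(ratio_samples):
    """Toy model: in each sampled world one technique works (normalised to 1.0)
    and the other may be flawed.

    ratio_samples: draws of r = P(revival | A) / P(revival | B), each > 0.
    Returns (EV_A, EV_B), the expected revival chance relative to a working technique.
    """
    if not ratio_samples:
        raise ValueError("need at least one sample")
    if any(r <= 0 for r in ratio_samples):
        raise ValueError("ratios must be positive")

    n = len(ratio_samples)
    # r < 1: the numerator technique (A) is the flawed one, so A is worth r and B is worth 1.
    ev_a = sum(min(r, 1.0) for r in ratio_samples) / n
    # r > 1: the denominator technique (B) is the flawed one, so B is worth 1/r and A is worth 1.
    ev_b = sum(min(1.0 / r, 1.0) for r in ratio_samples) / n
    return ev_a, ev_b


# Hypothetical samples, NOT the community forecast.
samples = [0.5, 1.0, 2.0, 4.0]
ev_a, ev_b = expected_revival(samples)
print(f"EV(A) = {ev_a:.1%}, EV(B) = {ev_b:.1%}")
```

Averaging over samples rather than plugging in the median matters here, since a trimodal forecast puts weight on both r < 1 and r > 1, and both techniques can come out below 100%.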

Comment by Tetraspace (tetraspace-grouping) on Babble challenge: 50 ways of hiding Einstein's pen for fifty years · 2020-10-16T21:12:45.845Z · LW · GW

The annotations that some other people have put on their lists to show their thinking process, as well as the list of assumptions at the start, have been interesting - I haven't done this this time, but it seems like something worth trying next time.

Keep it in my pocket the whole time.

Locked safe down the Marianas trench.

Am I a time traveller? Is that how I know? If so, hide it in dinosaur times, long before the evil forces lived.

Or hide it in the far future, long after the evil forces lived.

Send it into orbit.

Land it on the moon. Can't quite think of a way to achieve this, though. Any ideas?

Bury it in a geologically stable location and dig it up later as if it were nuclear waste.

Hide it in a gangster's treasure box hidden under some foliage, a la 20200.

Start a pen manufacturing company and create many, many identical pens. They won't be able to tell which one it is.

Eat the pen. Repeatedly, each time it passes through. For 50 years.

Find the guy with 10 years' worth of energy. Lock them in a room. Offer them their freedom if and only if they vow to protect the pen.

Surgically implant the pen under my skin (hope it's not made of biologically active materials).

Hidden safe in the walls of the house.

Hidden safe in the attic of the house.

Swiss bank vault (we had those in 1855, right?).

Inside a bottle of wine that will be aged to become a 50-year vintage in 1905.

Write a book on effective altruism (using the pen, of course) - there are probably some good cause areas around in 1855 to use as examples. They will read it, and cease to be evil, thus removing their motivation to acquire the pen.

Give Babbage some pointers on making his difference engine not suck, beginning an early steampunk cybersingularity, and ask the Great Brass Mind how to hide the pen.

Give the pen to my well-connected close friend, [famous person who lived in 1855], providing them with the same evidence I used to find that Einstein would need it.

Select, completely randomly, a point on the surface of the Earth. Bury it under a small amount of earth. Security through obscurity!

Replace each component of the pen, one at a time, until you have two pens: the old pen, and a new pen that's atom-for-atom identical to the original pen. Let the evil forces find the new pen.

Create a replica of the first pen and let the evil forces find it, so that they stop looking.

Bribe every grunt of the evil forces who comes looking for your pen.

Like 10), but the other end; at that point they won't want to find it, even if they know where it is.

Find Einstein's parents. Offer them this treasured family heirloom. They will keep it safe and Einstein will inherit it.

Paint the pen black and put it in a soot-filled chimney.

Find Oliver Twist and Fagin, or some other group of Victorian urchins, who are ubiquitous in this age. Hire Fagin's street urchins to come up with and then red-team test 50-year security plans for the pen.

Become a miserly industrialist, refusing even to give my workers a day off for Christmas. When three ghosts come to visit, use information from the Ghost of Christmas Future to divine the manner in which the evil forces retrieve the pen, and make countermeasures.

All of these plans have some chance of failing, so I can obviously tolerate that. Hence, bet my money at very, very long odds - in the small sliver of timelines in which I succeed, use my money to buy out the evil forces entirely.

Call my friends at the time commission for backup. C'mon, we can't just forget about protocol here.

Go on an expedition to the Arctic and hide it in the inhospitable ice; I could probably talk some guys in pith helmets into giving me backup.

Or to the deepest jungles of the dark continent of Africa; likewise with the pith helmets.

Or to the source of the Nile.

Or to the summit of Mt. Everest or K2 or whatever's going to be most awkward for the evil forces.

Or to the Antarctic, which is colder than the Arctic in the middle part.

Or to the deserts of Australia.

Found a cult of Defending the Pen, perhaps using song lyrics from the future as substitute mystical wisdom.

Ask the longer-haired, wiser, and older version of myself who just gave me this quest for advice, since they're still standing there. Follow their advice.

Bury the pen deep in a coal mine.

Keep your head down and don't tell anyone that it's -you- who has the pen - it's not like the evil forces have any reason to suspect that, unless you give them a good reason to, like bootstrapping the world to nanotech using future knowledge or something. Haha. Heh.

Hide the pen under my top hat; since it's 1855, that won't look unusual.

Dismantle the pen and hide the seven components throughout the world using techniques described above and below; being smaller, they'll be harder to find.

Join the evil forces as a simple masked minion; working for them, they won't suspect you have the pen, until one day as the second-in-command you usurp the leader (as is tradition).

Message in a bottle to the North Sentinel Island, who will repel outsiders including the evil forces.

Give a speech that's something like "evil forces, you really want to mess with me? I can leap to the moon in a single bound, and that's just to save me pulling it to ground, which I can also do. You once tried to trap me in a room and I took down your mothership's entire network before tearing it to shreds. This planet, and this pen specifically, is under my protection. Return to your galaxy," probably with dramatic orchestral music playing in the background, and then the evil forces will leave.

Check your Messing-with-Time-Wongle, standard issue equipment for all time travellers with missions to defend artefacts that are important to the timeline. Notice that the LED on it flashes green. Precommit to only sending a "green" signal to your MwTW in 50 years if the pen reaches Einstein successfully. Now Time will bend to ensure the pen is not found.

Freeze the pen in liquid nitrogen. It will now be too cold for the evil forces to touch.

The evil forces that I'm leader of, remember. Obviously my disloyal second-in-command will take umbrage if I seem not to be looking for the pen at all - I'm fairly sure they're a time traveller here to prevent Einstein from laying the physics foundations for the nuclear weapons that will destroy the world in the mid-20th century or something like that, and they keep scribbling notes on this list of about 50 items - but I can still direct them to the wrong place for 50 years. Hey, I think I saw the pen-keeper go into the middle of the Antarctic to launch a rocket!

Bury the pen in a large heap of explosives that only I know how to disarm - WWII mines are still dangerous so them being stable for 50 years should work.

Tie the pen to my ankle, everywhere I go - the traditional mores of the 19th century would make it scandalous for the evil forces to retrieve it from there!

Melt down the pen into a block of ordinary looking gunk. Remake the pen when needed years later.

Comment by Tetraspace (tetraspace-grouping) on Babble challenge: 50 ways to escape a locked room · 2020-10-12T17:54:16.635Z · LW · GW

The added resource constraints (I don't have a space elevator with me in the room... yet) made this a bit more difficult, which is very nice.

Ask someone for help via the phone

Punch through the door

Unlock the door, go through it

Punch through the wall

Punch through the window

Unlock the window, go through it

Wait for someone to help

Wait for the room to be demolished

Climb up through the ceiling...

...or through one of the missing walls (does it still count as a room?)

Create a series of Lesswrong posts disguised as babble exercises to try to come up with a way out of this room; use the best suggestion

Wait for a friendly GPT-derived AGI to rescue you (admittedly a longshot)

Quantum tunnel out of the room (rare but possible)

Release all of the energy stored in your body in a single burst to destroy walls (10 years! That's a lot!)

Release all of the energy stored in the phone's battery in a single burst to destroy walls

Use friction from rubbing clothes against wall to wear through it

Hang self with clothes (morbid, but "I" am no longer in the room)

Wait ten years, starve to death (don't worry; the GPT-derived AGI can read off my brain structures and revive me later)

Lifelog very accurately online via the phone; have myself be reconstructed outside of the room

I am already outside of the room, 10^10^100 light years away. No problem.

Release all energy stored in body in a single burst to jump through the ceiling and several miles into the sky - this might also allow me to bring a small object to the moon

Punch through the wall, but using phone to protect hands

Punch through the wall using shirt wrapped around to protect hands

Use the power armour that I am wearing as clothes to dismantle room

Wait sufficiently long that my personality is different enough that I am not in the room

Escape mentally via escapism (with help of phone games?)

Astral project

Use my cool utility-fog based sci-fi clothes to convert wall into nanobots

Redefine "inside" as "outside", like that SCP that lets you do that

Is this a real room, or a metaphorical "you" video game character? Type the console command to teleport out.

Ask the server admin to teleport me out.

Ask the real life server admin of the simulation we are embedded in to teleport me out (Elon Musk does this with Tesla stock prices)

Tap on the wall of the room to send a Morse code message asking for help.

Use phone's wifi to connect to the door's bluetooth and unlock it via the app.

Run at the door really hard.

The phone is a Nokia. Drop it on the ground and the room crumbles.

The phone is that Samsung phone that has batteries that set on fire (with 10 years of charge, that might be bad news for me?) Do so, then use the automatic door unlocking (that happens as a fire safety measure) to leave the room.

Pull off a bit of the phone's casing and use it as a lockpick.

The phone is that iPhone that can bend easily. Bend it into a shape that can prise the door open. Exit through door.

As above, but prise the window open. Exit through window.

Stop imagining the room.

Use lucid dream powers to escape the room.

Go to sleep and dream of a different place

Grow large enough to break through the room's walls

The walls are made of air so I can walk through them.

The walls are made of antimatter and annihilate with the surrounding environment.

The walls are made of ice and will melt soon.

Rub together two stick-like objects (derived from my phone, probably) to start a fire, as fire safety measure the door unlocks, etc

Do the five movements to travel to another dimension where we are not trapped

Hack the wi-fi. As an expert hacker, my captors will thus have to recruit me in order to fix their wifi. As they open the door, slip past them.

The room is completely empty. The air pressure outside causes the walls to immediately buckle and break.

Comment by Tetraspace (tetraspace-grouping) on Babble challenge: 50 ways of sending something to the moon · 2020-10-01T11:40:05.086Z · LW · GW

About halfway through I forgot that I was only meant to be bringing something to the moon rather than having to visit it myself, and some of my items are very broad (the first one could make up a whole list in itself).

This was very fun!


space elevator

jump really, really hard

electromagnetic cannon

accelerate the spin of the earth until it falls apart

decelerate the orbit of the moon until it falls, by flying comets past it

or by painting one side of the moon black

or by using a giant rocket

or by detonating enough antimatter weaponry

flap your arms, again really, really hard

shine a torch at the moon (photons reach there)

start in space and use an ion drive

project orion nuclear bomb detonated below you

program an AGI and ask the AGI how to get to the moon

build a very tall ladder


wings made of wax

throw it really, really hard

spin around and let go

stand under an asteroid strike and join the ejecta

wait for quantum fluctuations to teleport you there

wait for random gravitational solar system perturbations to bring the moon to you

wait for another civilisation to bring you to the moon

time travel to before Theia hit and join the original moon


add mass to the moon until it becomes the planet and you are on the moon

find the space rocks the apollo astronauts brought back and stand on them

project orion but with fusion

project orion but with antimatter

trigger false vacuum collapse with particle accelerator and use new physics to develop as yet unknowable way of travelling to moon

astral projection

bird with a spacesuit

space helicopter

vacuum-filled zeppelin

submarine with reactionless thruster inside

perpetual motion machine

buy a ticket on musk's starship

invest in dogecoin, use billions from dogecoin to start space program

stand above a supervolcano and hope ejecta takes you high enough

run very very fast reaching orbital velocity

very long space elevator reaching down from moon

very very long space elevator reaching down from mars

create microscopic black hole and use gravitational slingshot

carefully warp space to make a staircase built from the metric

make a normal staircase


very, very fast bicycle with a ramp

add mass to moon until gravitational tide from moon lifts you from the surface of the earth

deorbit the earth-moon system into the sun and join it in the molten iron in the sun's core

apollo 11 mission

Comment by Tetraspace (tetraspace-grouping) on Comparing LICDT and LIEDT · 2020-07-24T16:21:58.425Z · LW · GW

The statement of the law of logical causality is:

Law of Logical Causality: If conditioning on any event changes the probability an agent assigns to its own action, that event must be treated as causally downstream.

If I'm interpreting things correctly, this is just because anything that's upstream gets screened off, because the agent knows what action it's going to take.

You say that LICDT pays the blackmail in XOR blackmail because it follows this law of logical causality. Is this because, conditioned on the letter being sent, if there is a disaster the agent assigns  to sending money, and if there isn't a disaster the agent assigns  to sending money, so the disaster must be causally downstream of the decision to send money if the agent is to know whether or not it sends money?

Comment by Tetraspace (tetraspace-grouping) on Smoking Lesion Steelman · 2020-07-21T02:43:38.359Z · LW · GW

I didn't find the conclusion about the smoke-lovers and non-smoke-lovers obvious in the EDT case at first glance, so I added in some numbers and ran through the calculations that the robots will do to see for myself and get a better handle on what not being able to introspect but still gaining evidence about your utility function actually looks like.

Suppose that, out of the  robots that have ever been built,  are smoke-lovers and  are non-smoke-lovers. Suppose also the smoke-lovers end up smoking with probability  and non-smoke-lovers end up smoking with probability .

Then  robots smoke, and  robots don't smoke. So by Bayes' theorem, if a robot smokes, there is a   chance that it's killed, and if a robot doesn't smoke, there's a chance that it's killed.

Hence, the expected utilities are:

  • An EDT non-smoke-lover looks at the possibilities. It sees that if it smokes, it expects to get utilons, and that if it doesn't smoke, it expects to get  utilons.
  • An EDT smoke-lover looks at the possibilities. It sees that if it smokes, it expects to get  utilons, and if it doesn't smoke, it expects to get  utilons.

Now consider some equilibria. Suppose that no non-smoke-lovers smoke, but some smoke-lovers smoke. So  and . So (taking limits as  along the way):

  • non-smoke-lovers expect to get  utilons if they smoke, and  utilons if they don't smoke.  so non-smoke-lovers will choose not to smoke.
  • smoke-lovers expect to get  utilons if they smoke, and  utilons if they don't smoke. Smoke-lovers would be indifferent between the two if . This works fine if at least 90% of robots are smoke lovers, and equilibrium is achieved. But if less than 90% of robots are smoke-lovers, then there is no point at which they would be indifferent, and they will always choose not to smoke.

But wait! This is fine if more than 90% are smoke-lovers, but if fewer than 90% are smoke-lovers, then they would always choose not to smoke, which is inconsistent with the assumption that  is much larger than . So instead suppose that  is only a little bit bigger than , say that . Then:

  • non-smoke-lovers expect to get  utilons if they smoke, and  utilons if they don't smoke. They will choose to smoke if , i.e. if smoke-lovers smoke so rarely that not smoking would make them believe they're a smoke-lover about to be killed by the blade runner.
  • smoke-lovers expect to get   utilons if they smoke, and  utilons if they don't smoke. They are indifferent between these two when . This means that, when  is at the equilibrium point, non-smoke-lovers will not choose to smoke when fewer than 90% of robots are smoke-lovers, which is exactly when this regime applies.

I wrote a quick Python simulation to check these conclusions, and it was the case that  for , and  for  there as well.
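A minimal sketch of the Bayes' theorem step used above (this is not the simulation just mentioned). The parameter names and example numbers are hypothetical, chosen only for illustration. It computes how likely a robot is to be a smoke-lover given that it does or doesn't smoke; turning that into a chance of being killed depends on the rules of the original post, which are not reproduced here:

```python
def posterior_smoke_lover(frac_lovers, p_smoke_lover, p_smoke_other):
    """Bayes' theorem for the robot population.

    frac_lovers:   fraction of all robots that are smoke-lovers (hypothetical name)
    p_smoke_lover: probability that a smoke-lover ends up smoking
    p_smoke_other: probability that a non-smoke-lover ends up smoking
    Returns (P(smoke-lover | smokes), P(smoke-lover | doesn't smoke));
    an entry is None when the conditioning event has probability zero.
    """
    # Total fraction of robots that smoke, and that don't.
    smokers = frac_lovers * p_smoke_lover + (1 - frac_lovers) * p_smoke_other
    non_smokers = 1 - smokers

    given_smoke = frac_lovers * p_smoke_lover / smokers if smokers > 0 else None
    given_no_smoke = (
        frac_lovers * (1 - p_smoke_lover) / non_smokers if non_smokers > 0 else None
    )
    return given_smoke, given_no_smoke


# Hypothetical numbers: half the robots are smoke-lovers, who smoke 80% of the time,
# while non-smoke-lovers smoke 20% of the time.
print(posterior_smoke_lover(0.5, 0.8, 0.2))
```

The None case corresponds to the limits taken in the equilibrium analysis: when nobody in a group ever smokes (or everybody does), one of the conditionals is undefined unless you approach it as a limit.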

Comment by Tetraspace (tetraspace-grouping) on Reductive Reference · 2020-06-25T13:13:16.203Z · LW · GW

Your reliable thermometer doesn't need to be well-calibrated - it only has to show the same value whenever it's used to measure boiling water, regardless of what that value is. So the dependence isn't quite so circular, thankfully.