Tetraspace Grouping's Shortform 2019-08-02T01:37:14.859Z · score: 2 (1 votes)


Comment by tetraspace-grouping on Grue_Slinky's Shortform · 2019-10-01T10:36:48.703Z · score: 1 (1 votes) · LW · GW

Are we allowed to I-am-Groot the word "cake" to encode several bits per word, or do we have to do something like repeat "cake" until the primes that it factors into represent a desired binary string?

(edit: ah, only nouns, so I can still use whatever I want in the other parts of speech. Or should I say that the naming cakes must be "cake", and that any other verbal cake may be whatever this speaking cake wants)

Comment by tetraspace-grouping on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-28T01:09:36.183Z · score: 2 (2 votes) · LW · GW

Dank EA Memes is a Facebook group. It's pretty good.

Comment by tetraspace-grouping on Follow-Up to Petrov Day, 2019 · 2019-09-28T00:59:41.532Z · score: 18 (9 votes) · LW · GW

If anyone asks, I entered a code that I knew was incorrect as a precommitment to not nuke the site.

Comment by tetraspace-grouping on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-27T00:38:46.380Z · score: 30 (9 votes) · LW · GW

To make sure I have this right and my LW isn't glitching: TurnTrout's comment is a Drake meme, and the two other replies in this chain are actually blank?

Comment by tetraspace-grouping on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-27T00:35:05.499Z · score: 12 (4 votes) · LW · GW

Well, at least we have a response to the doubters' "why would anyone even press the button in this situation?"

Comment by tetraspace-grouping on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-26T23:37:24.075Z · score: 17 (7 votes) · LW · GW


Clicking on the button permanently switches it to a state where it's pushed-down, below which is a prompt to enter launch codes. When moused over, the pushed-down button has the tooltip "You have pressed the button. You cannot un-press it." Screenshot.

(On an unrelated note, on r/thebutton I have a purple flair that says "60s".)

Upon entering a string of longer than 8 characters, a button saying "launch" appears below the big red button. Screenshot.


I'm nowhere near the PST timezone, so I wouldn't be able to reliably pull a shenanigan whereby if I had the launch codes I would enter or not enter them depending on the amount of counterfactual money pledged to the Ploughshares Fund in the name of either launch-code-entry-state, but this sentence is not apophasis.


Conspiracy theory: There are no launch codes. People who claim to have launch codes are lying. The real test is whether people will press the button at all. I have failed that test. I came up with this conspiracy theory ~250 milliseconds after pressing the button.

(Update)

I can no longer see the button when I am logged in. Could this mean that I have won?

Comment by tetraspace-grouping on Novum Organum: Preface · 2019-09-24T01:44:38.049Z · score: 16 (3 votes) · LW · GW

At the start of the Sequences, you are told that rationality is a martial art: it amplifies the power of the unaided mind the same way a martial art amplifies the body, not by making you stronger but by teaching you to use what you already have properly.

Bacon, on the other hand, throws the prospect of using the unaided mind right out; Baconian rationality is a machine, like a pulley or a lever, where you apply your mind however feebly to one end and by its construction the other end moves a great distance or applies a great force (either would do for the metaphor).

If I have my history right, Bacon's machine is Science. Its function is to accumulate a huge mountain of evidence, so big that even a human could be persuaded by it, and instruction in the use of science is instruction in being persuaded by that mountain of evidence. Philosophers of old simply ignored the mountain of evidence (failed to use the machine) and maybe relied on syllogisms and definitions and hence failed to move the stone column.

And later, with the aid of Bacon's machine, one discovers that you don't really need this huge mountain of evidence or the systematic stuff: an ideal reasoner could simply perform a Bayesian update on each bit of evidence as it comes in and get to the truth far faster, avoiding the slowness and the mistakes that come from insisting on setting up the machine every single time. At your own risk, of course: get your stance slightly wrong lifting a stone column, and you throw your back out.

Comment by tetraspace-grouping on A Critique of Functional Decision Theory · 2019-09-15T14:43:55.051Z · score: 5 (3 votes) · LW · GW

An agent also faces a guaranteed payoffs problem in Parfit's hitchhiker: the driver has already made their prediction (the agent knows they're safe in town), so the agent's choice is between losing $1,000 and losing $0. Is it also a bad idea for the agent to pay the $1,000 in this problem?

Comment by tetraspace-grouping on ozziegooen's Shortform · 2019-09-10T15:09:12.386Z · score: 1 (1 votes) · LW · GW

There's something of a problem with sensitivity; if the x-risk from AI is ~0.1, and the difference in x-risk from some grant is ~10^-6, then any difference in the forecasts is going to be completely swamped by noise.

(while people in the market could fix any inconsistency between the predictions, they would only be able to look forward to 0.001% returns over the next century)

Comment by tetraspace-grouping on Open & Welcome Thread - September 2019 · 2019-09-06T21:43:49.344Z · score: 1 (1 votes) · LW · GW

Is the issue that it's pain-based and hence makes my life worse (probably false for me: maths is fun and gives me a sense of pride and accomplishment when I do it, it's just that darn System 1 always saying "better for you if you play Kerbal Space Program"), or that social punishment isn't always available and therefore ought not to be relied on (this is probably an issue for me), or some third thing?

Comment by tetraspace-grouping on Open & Welcome Thread - September 2019 · 2019-09-04T20:17:40.376Z · score: 6 (4 votes) · LW · GW

Previously: August.

Dear Diary,

In the intervening month I have done chapters 8 and 9 of Tao's Analysis I, which feels terribly slow. Two chapters in a month? I could do the whole book in that time if I tried! And I know that I can because I have, like I'm getting a physics degree and it definitely feels like I've done at least one textbook worth of learning per term.

One of the active ingredients seems to be time pressure, which is present but not salient here - if I fail, all that happens is the wrong math is deployed to steer the future of the lightcone, which doesn't hold a candle to me losing a little bit of status. Ah, to be a brain.

Thus: by October I'll have finished Analysis I; think less of me if I haven't.

(And perhaps I'll have done even more!)

UPDATE SEP 26: You can rest easy now; I have completed the book.

Comment by tetraspace-grouping on I think I came up with a good utility function for AI that seems too obvious. Can you people poke holes in it? · 2019-08-30T15:34:21.095Z · score: 1 (1 votes) · LW · GW

This AI wouldn't be trying to convince a human to help it, just that it's going to succeed.

So instead of convincing humans that a hell-world is good, it would convince the humans that it was going to create a hell-world (and they would all disapprove, so it would score low).

I think what this ends up doing is having everyone agree to a world that sounds superficially good but is actually terrible in a way that's difficult for unaided humans to realise. E.g. the AI convinces everyone that it will create an idyllic natural world where people live forager lifestyles in harmony etc. etc.; everyone approves, because they like nature and harmony and stuff; it proceeds to create such an idyllic natural world; and wild animal suffering outweighs human enjoyment forevermore.

Comment by tetraspace-grouping on I think I came up with a good utility function for AI that seems too obvious. Can you people poke holes in it? · 2019-08-29T13:14:00.142Z · score: 4 (3 votes) · LW · GW

One thing I'd be concerned about is that there are a lot of possible futures that sound really appealing, and that a normal human would sign off on, but are actually terrible (similar concept: siren worlds).

For example, in a world of Christians the AI would score highly on a future where they get to eternally rest and venerate God, which would get really boring after about five minutes. In a world of Rationalists the AI would score highly on a future where they get to live on a volcano island with catgirls, which would also get really boring after about five minutes.

There are potentially lots of futures like this (that might work for a wider range of humans), and because the metric (inferred approval after it's explained) is different from the goal (whether the future is good) and there's optimisation pressure increasing with the number of futures considered, I would expect it to be Goodharted.

Some possible questions this raises:

  • On futures: I can't store the entire future in my head, so the AI would have to only describe some features. Which features? How to avoid the selection of features determining the outcome?
  • On people: What if the future involves creating new people, most of whom would want to live in that future? What about animals? What about babies?

Comment by tetraspace-grouping on Tetraspace Grouping's Shortform · 2019-08-26T19:15:41.503Z · score: 2 (2 votes) · LW · GW

Here are three statements I believe with a probability of about 1/9:

  • The two 6-sided dice on my desk, when rolled, will add up to 5.
  • An AI system will kill at least 10% of humanity before the year 2100.
  • Starvation was a big concern in ancient Rome's prime (claim borrowed from Elizabeth's Epistemic Spot Check post).

Except I have some feeling that the "true probability" of the 6-sided die question is pretty much bang on exactly 1/9, but that the "true probability" of the Rome and AI xrisk questions could be quite far from 1/9 and to say the probability is precisely 1/9 seems... overconfident?

From a straightforward Bayesian point of view, there is no true probability. It's just my subjective degree of belief! I'd be willing to make a bet at 8/1 odds on any of these, but not at worse odds, and that's all there really is to say on the matter. It's the number I multiply by the utilities of the outcomes to make decisions.

One thing you could do is imagine a set of hypotheses that I have that involve randomness, and then I have a probability distribution over which of these hypotheses is the true one, and by mapping each hypothesis to the probability it assigns to the outcome my probability distribution over hypotheses becomes a probability distribution over probabilities. This is sharply around 1/9 for the dice rolls, and widely around 1/9 for AI xrisk, as expected, so I can report 50% confidence intervals just fine. Except sensible hypotheses about historical facts probably wouldn't be random, because either starvation was important or it wasn't, that's just a true thing that happens to exist in my past, maybe.
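The construction in the previous paragraph can be sketched numerically. The dice part is exact; the historical-claim hypotheses and credences below are invented purely for illustration:

```python
# Dice case: essentially all my credence is on the fair-dice hypothesis,
# which assigns P(sum = 5) = 4/36 = 1/9 exactly, so the induced
# distribution over probabilities is a sharp spike at 1/9.
fair_dice_prob = sum(
    1 for a in range(1, 7) for b in range(1, 7) if a + b == 5
) / 36

# Historical-claim case: credence is spread over hypotheses that assign
# quite different probabilities, so the induced distribution is wide
# even though its mean is still about 1/9.
hypotheses = [
    (0.3, 0.03),  # (credence in hypothesis, probability it assigns)
    (0.4, 0.10),
    (0.3, 0.20),
]
mean = sum(w * p for w, p in hypotheses)  # the single number I'd report

print(fair_dice_prob, mean)  # both close to 1/9; only the first is sharp
```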

I like jacobjacob's interpretation of a probability distribution over probabilities as an estimate of what your subjective degree of belief would be if you thought about the problem for longer (e.g. 10 hours). The specific time horizon seems a bit artificial (extreme case: I'm going to chat with an expert historian in 10 hours and 1 minute) but it does work and gives me the kind of results that makes sense. The advantage of this is that you can quite straightforwardly test your calibration (there really is a ground truth) - write down your 50% confidence interval, then actually do the 10 hours of research, and see how often the degree of belief you end up with lies inside the interval.

Comment by tetraspace-grouping on Epistemic Spot Check: The Fate of Rome (Kyle Harper) · 2019-08-24T23:11:37.520Z · score: 4 (4 votes) · LW · GW

What do the probability distributions listed below the claims mean specifically?

Comment by tetraspace-grouping on Tetraspace Grouping's Shortform · 2019-08-24T18:08:33.448Z · score: 4 (2 votes) · LW · GW

Imagine two prediction markets, both with shares that give you $1 if they pay out and $0 otherwise.

One is predicting some event in the real world (and pays out if this event occurs within some timeframe) and has shares currently priced at $X.

The other is predicting the behaviour of the first prediction market. Specifically, it pays out if the price of the first market exceeds an upper threshold $T before it goes below a lower threshold $R.

Is there anything that can be said in general about the price of the second prediction market? For example, it feels intuitively like if T >> X, but R is only a little bit smaller than X, then assigning a high price to shares of the second prediction market violates conservation of evidence - is this true, and can it be quantified?
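One way to make this quantitative, under a hedged assumption: if the first market's price is a martingale (no expected profit from trading it), the optional stopping theorem pins down the second market's fair price as (X - R)/(T - R). That confirms the intuition: with T far above X and R just below X, the second market's shares should be cheap. A quick simulation with illustrative numbers:

```python
import random

def hits_upper_first(x, upper, lower, step=0.01):
    """Symmetric random walk (a simple martingale) started at x.
    Returns True if it reaches `upper` before `lower`."""
    while lower < x < upper:
        x += step if random.random() < 0.5 else -step
    return x >= upper

random.seed(0)
X, T, R = 0.30, 0.90, 0.25  # current price, upper threshold, lower threshold
trials = 2000
estimate = sum(hits_upper_first(X, T, R) for _ in range(trials)) / trials
analytic = (X - R) / (T - R)  # optional stopping: 0.05 / 0.65, about 0.077

print(estimate, analytic)  # the two should agree up to sampling noise
```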

Comment by tetraspace-grouping on Time Travel, AI and Transparent Newcomb · 2019-08-22T22:51:21.588Z · score: 1 (1 votes) · LW · GW

We would also expect destroying time machines to be a convergent instrumental goal in this universe, since any agent that does this would be more likely to have been created. So by default powerful enough optimization processes would try to prevent time travel.

Comment by tetraspace-grouping on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-07T15:24:56.193Z · score: 3 (3 votes) · LW · GW

The counterfactual oracle can answer questions for which you can evaluate answers automatically (and might be safe because it doesn't care about being right in the case where you read the prediction, so it won't manipulate you), and the low-bandwidth oracle can answer multiple-choice questions (and might be safe because none of the multiple-choice options are unsafe).

My first thought for this is to ask the counterfactual oracle for an essay on the importance of coffee, and in the case where you don't see its answer, you get an expert to write the best essay on coffee possible, and score the oracle by the similarity between what it writes and what the expert writes. Though this only gives you human levels of performance.
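The automatic-scoring step described above could be sketched like this. `SequenceMatcher` is a stand-in for illustration only; a real setup would need a far more robust text-similarity metric, and the essays here are invented:

```python
import difflib

def score_oracle(oracle_essay: str, expert_essay: str) -> float:
    """Reward the oracle (in the unread branch) by the similarity
    between its essay and the expert-written one. Returns a score
    in [0, 1]."""
    return difflib.SequenceMatcher(None, oracle_essay, expert_essay).ratio()

expert = "Coffee matters because caffeine improves alertness and focus."
oracle = "Coffee matters because caffeine improves mood and focus."
print(score_oracle(oracle, expert))  # high similarity, but below 1.0
```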

Comment by tetraspace-grouping on Open & Welcome Thread - August 2019 · 2019-08-04T23:04:35.201Z · score: 15 (9 votes) · LW · GW

I might as well post a monthly update on my doing things that might be useful for me doing AI safety.

I decided to just continue with what I was doing last year before I got distracted, and learn analysis, from Tao's Analysis I, on the grounds that it's maths which is important to know and that I will climb the skill tree analysis -> topology -> these fixed point exercises. Have done chapters 5, 6 and 7.

My question on what it would be most useful for me to be doing remains if anyone has any input.

Comment by tetraspace-grouping on Occam's Razor: In need of sharpening? · 2019-08-04T22:28:21.753Z · score: 11 (3 votes) · LW · GW

The formalisation used in the Sequences (and in algorithmic information theory) is that the complexity of a hypothesis is the length of the shortest computer program that specifies that hypothesis.

An illustrative example: when explaining lightning, Maxwell's equations are simpler in this sense than the hypothesis that Thor is angry, because the shortest computer program that implements Maxwell's equations is much shorter than one that emulates a humanlike brain and its associated emotions.

In the case of many-worlds vs. the Copenhagen interpretation, a computer program implementing either would start with the same algorithm (Schrödinger's equation), but the claim is that the program for Copenhagen would need an extra section specifying how collapse upon observation works, which many-worlds wouldn't need.

Comment by tetraspace-grouping on Tetraspace Grouping's Shortform · 2019-08-02T16:50:01.941Z · score: 5 (3 votes) · LW · GW

In Against Against Billionaire Philanthropy, Scott says

The same is true of Google search. I examined the top ten search results for each donation, with broadly similar results: mostly negative for Zuckerberg and Bezos, mostly positive for Gates.

With Gates' philanthropy being about malaria, Zuckerberg's being about Newark schools, and Bezos' being about preschools.

Also, as far as I can tell, Moskovitz' philanthropy is generally considered positively, though of course I would be in a bubble with respect to this. Also also, though I say this without really checking, it seems that people are pretty much all against the Sacklers' donations to art galleries and museums.

Squinting at these data points, I can kind of see a trend: people favour philanthropy that's buying utilons, and are opposed to philanthropy that's buying status. They like billionaires funding global development more than they like billionaires funding local causes, and they like them funding art galleries for the rich least of all.

Which is basically what you'd expect if people were well-calibrated and correctly criticising those who need to be taken down a peg.

Comment by tetraspace-grouping on Tetraspace Grouping's Shortform · 2019-08-02T15:01:08.524Z · score: 2 (2 votes) · LW · GW

It was inspired by yours - when I read your post I remembered that there was this thing about Solomonoff induction that I was still confused about - though I wasn't directly trying to answer your question so I made it its own thread.

Comment by tetraspace-grouping on Tetraspace Grouping's Shortform · 2019-08-02T01:37:15.009Z · score: 5 (3 votes) · LW · GW

The simplicity prior is that you should assign a prior probability 2^-L to the description of length L. This sort of makes intuitive sense, since it's what you'd get if you generated the description through a series of coinflips...

... except there are 2^L descriptions of length L, so the total prior probability you're assigning is sum over L of (2^L * 2^-L) = sum over L of 1, which diverges: unnormalisable.

You can kind of recover this by noticing that not all bitstrings correspond to an actual description, and for some encodings their density is low enough that the sum can be normalised (I think the threshold is that fewer than a 1/L fraction of the strings of length L are "valid" descriptions)...

...but if that's the case, you're being fairly information inefficient because you could compress descriptions further, and why are you judging simplicity using such a bad encoding, and why 2^-L in that case if it doesn't really correspond to complexity properly any more? And other questions in this cluster.

I am confused (and maybe too hung up on something idiosyncratic to an intuitive description I heard).
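One standard resolution, which may be what the intuitive description was eliding: require the descriptions to form a prefix-free code, so that no valid description is a prefix of another. The Kraft inequality then guarantees that the sum of 2^-L over valid descriptions is at most 1, so the prior normalises. A toy check (the codewords are arbitrary illustrations):

```python
# A toy prefix-free code: no codeword is a prefix of another. By the
# Kraft inequality, sum(2^-len(w)) over the codewords is at most 1,
# so assigning prior 2^-L to a length-L description is normalisable.
codewords = ["0", "10", "110", "1110", "1111"]

# Verify the prefix-free property.
for i, a in enumerate(codewords):
    for j, b in enumerate(codewords):
        assert i == j or not b.startswith(a), (a, b)

total = sum(2 ** -len(w) for w in codewords)
print(total)  # 1/2 + 1/4 + 1/8 + 1/16 + 1/16 = 1.0
```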

Comment by tetraspace-grouping on AI Safety Debate and Its Applications · 2019-07-25T01:15:06.613Z · score: 1 (1 votes) · LW · GW

In the case of MNIST, how good is the judge itself - for example, if you were to pick the six pixels optimally to give it the most information, how well would it perform?

Comment by tetraspace-grouping on Open Thread July 2019 · 2019-07-16T22:33:48.141Z · score: 13 (5 votes) · LW · GW

I'm off from university (3rd year physics undergrad) for the summer and hence have a lot of free time, and I want to use this to make as much progress as possible towards the goal of getting a job in AI safety technical research. I have found that I don't really know how to do this.

Some things that I can do:

  • work through undergrad-level maths and CS textbooks
  • basic programming (since I do physics, this is at the level required to implement simple numerical methods in MATLAB)
  • the stuff in Andrew Ng's machine learning Coursera course

Thus far I've worked through the first half of Hutton's Programming in Haskell on the grounds that functional programming maybe teaches a style of thought that's useful and opens doors to more theoretical CS stuff.

I'm optimising for something slightly different from purely becoming good at AI safety, in that at the end I'd like to have some legible things to point to or list on a CV or something (or become better-placed to later acquire such legible things).

I'd be interested to hear from people who know more about what would be helpful for this.

Comment by tetraspace-grouping on Open Thread July 2019 · 2019-07-13T21:25:21.690Z · score: 22 (9 votes) · LW · GW

There's no official, endorsed CFAR handbook that's publicly available for download. The CFAR handbook from summer 2016, which I found on libgen, warns

While you may be tempted to read ahead, be forewarned - we've often found that participants have a harder time grasping a given technique if they've already anchored themselves on an incomplete understanding. Many of the explanations here are intentionally approximate or incomplete, because we believe this content is best transmitted in person. It helps to think of this handbook as a companion to the workshop, rather than as a standalone resource.

which I think is still their view on the matter.

I have heard that they would be more comfortable with people learning rationality techniques in-person from a friend, so if you know any CFAR alumni you could ask them (they'd probably also have a better answer to your question).

Comment by tetraspace-grouping on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-07-03T16:58:40.734Z · score: 5 (4 votes) · LW · GW

Submission. Counterfactual oracle. Give the oracle the set of questions on Metaculus that have a resolve date before some future date T, and receive output in the form of ordered pairs of question IDs and predictions. The score of the Oracle in the case where we don't see its answers is the number of Metaculus points that it would have earned by T if it had made a prediction on those questions at the time when we asked it.

Comment by tetraspace-grouping on Recommendation Features on LessWrong · 2019-06-15T13:22:17.340Z · score: 8 (4 votes) · LW · GW

Is there any way to mark a post as unread? It's recommending me a lot of sequences that it believes I'm halfway through when in fact I've just briefly checked a couple of posts in it, and it would be nice if I could start it again from the beginning.

Comment by tetraspace-grouping on What should rationalists think about the recent claims that air force pilots observed UFOs? · 2019-05-28T18:12:46.749Z · score: 1 (1 votes) · LW · GW

Cutting-edge modern military AI seems to all be recently developed; the first flight of the F-22 Raptor was in 1997, while the first deployment of Sea Hunter was in 2016. I also think there are strong incentives for civilian organisations to develop AI that aren't present for fighter jets.

Comment by tetraspace-grouping on Open Thread May 2019 · 2019-05-11T18:11:40.450Z · score: 5 (3 votes) · LW · GW

While "rationality" claims to be defined as "stuff that helps you win" (and on paper, if it turned out that the Sequences didn't help you arrive at correct conclusions, we'd stop calling that "rationality" and call something else "rationality"), in practice the word "rationality" points at "the stuff in the Sequences" rather than at "stuff that helps you win". People whose win-helping stuff isn't the kind of thing you'd find in the Sequences have to call it something else to be unambiguous. Such is language.

Comment by tetraspace-grouping on Why the AI Alignment Problem Might be Unsolvable? · 2019-03-28T15:36:07.923Z · score: 1 (1 votes) · LW · GW

You cannot program a general intelligence with a fundamental drive to ‘not intervene in human affairs except when things are about to go drastically wrong otherwise, where drastically wrong is defined as either rape, torture, involuntary death, extreme debility, poverty or existential threats’ because that is not an optimization function.

In the extreme limit, you could create a horribly gerrymandered utility function where you assign 0 utility to universes where those bad things are happening, 1 utility to universes where they aren't, and some reduced impact thing which means that it usually prefers to do nothing.

Comment by tetraspace-grouping on Fixed Point Exercises · 2019-01-03T02:33:11.557Z · score: 1 (1 votes) · LW · GW

Do you have any recommended reading for learning enough math to do these exercises? I'm sort of using these as a textbook-list-by-proxy (e.g. google "Intermediate value theorem", check which area of math it's from, oh hey it's Analysis, get an introductory textbook in Analysis, repeat), though I also have little knowledge of the field and don't want to wander down suboptimal paths.

Comment by tetraspace-grouping on Embedded Agents · 2018-10-30T00:12:09.569Z · score: 20 (10 votes) · LW · GW

Very nice! I like the colour-coding scheme, and the way it ties together those bullet points in MIRI's research agenda.

Looks like these sequences are going to be a great (content-wise and aesthetically) introduction to a lot of the ideas behind agent foundations; I'm excited.

Comment by tetraspace-grouping on Open Thread August 2018 · 2018-08-07T13:05:04.908Z · score: 3 (3 votes) · LW · GW

Why does this line of reasoning not apply to friendly AIs?

Why would the unfriendly AI halt? Is there really no better way for it to achieve its goals?

Comment by tetraspace-grouping on Anthropics made easy? · 2018-06-14T14:13:56.440Z · score: 1 (2 votes) · LW · GW

The argument at the start just seems to move the anthropics problem one step back - how do we know whether we "survived"* the cold war?

*Not sure how to succinctly state this better; I mean if Omega told me that the True Probability of surviving the Cold War was 1%, I would update on the safety of the Cold War in a different direction than if it told me 99%, even though both entail me, personally, surviving the Cold War.

Comment by tetraspace-grouping on Why Universal Comparability of Utility? · 2018-05-13T23:15:27.592Z · score: 3 (1 votes) · LW · GW

Ahead of time, you can't really tell precisely what problems you'll be faced with - reality is allowed to throw pretty much anything at you. It's a useful property, then, if decisions are possible to make in all situations, so you can guarantee that e.g. new physics won't throw you into undefined behavior.

Comment by tetraspace-grouping on April Fools: Announcing: Karma 2.0 · 2018-04-01T14:09:39.584Z · score: 38 (9 votes) · LW · GW

I'd like to report a bug. My comments aren't larger than worlds, which is a pity, because the kind of content I produce is clearly the most insightful and intelligent of all. I'm also humble to boot - more humble than you could ever believe - which is one of the rationalist virtues that any non-tribal fellow would espouse.

Comment by tetraspace-grouping on Torture vs. Dust Specks · 2016-07-13T17:39:18.587Z · score: 1 (1 votes) · LW · GW

New situation: 3^^^3 people being tortured for 50 years, or one person getting tortured for 50 years and getting a single speck of dust in their eye.

By do unto others, I should, of course, torture the innumerably vast number of people, since I'd rather be tortured for 50 years than be tortured for 50 years and get dust in my eye.