Posts

Announcement: AI alignment prize round 4 winners 2019-01-20T14:46:47.912Z · score: 80 (19 votes)
Announcement: AI alignment prize round 3 winners and next round 2018-07-15T07:40:20.507Z · score: 102 (29 votes)
How to formalize predictors 2018-06-28T13:08:11.549Z · score: 16 (5 votes)
UDT can learn anthropic probabilities 2018-06-24T18:04:37.262Z · score: 63 (19 votes)
Using the universal prior for logical uncertainty 2018-06-16T14:11:27.000Z · score: 0 (0 votes)
Understanding is translation 2018-05-28T13:56:11.903Z · score: 133 (44 votes)
Announcement: AI alignment prize round 2 winners and next round 2018-04-16T03:08:20.412Z · score: 153 (45 votes)
Using the universal prior for logical uncertainty (retracted) 2018-02-28T13:07:23.644Z · score: 39 (10 votes)
UDT as a Nash Equilibrium 2018-02-06T14:08:30.211Z · score: 34 (11 votes)
Beware arguments from possibility 2018-02-03T10:21:12.914Z · score: 13 (9 votes)
An experiment 2018-01-31T12:20:25.248Z · score: 32 (11 votes)
Biological humans and the rising tide of AI 2018-01-29T16:04:54.749Z · score: 55 (18 votes)
A simpler way to think about positive test bias 2018-01-22T09:38:03.535Z · score: 34 (13 votes)
How the LW2.0 front page could be better at incentivizing good content 2018-01-21T16:11:17.092Z · score: 38 (19 votes)
Beware of black boxes in AI alignment research 2018-01-18T15:07:08.461Z · score: 70 (29 votes)
Announcement: AI alignment prize winners and next round 2018-01-15T14:33:59.892Z · score: 166 (63 votes)
Announcing the AI Alignment Prize 2017-11-04T11:44:19.000Z · score: 1 (1 votes)
Announcing the AI Alignment Prize 2017-11-03T15:47:00.092Z · score: 155 (67 votes)
Announcing the AI Alignment Prize 2017-11-03T15:45:14.810Z · score: 7 (7 votes)
The Limits of Correctness, by Bryan Cantwell Smith [pdf] 2017-08-25T11:36:38.585Z · score: 3 (3 votes)
Using modal fixed points to formalize logical causality 2017-08-24T14:33:09.000Z · score: 3 (3 votes)
Against lone wolf self-improvement 2017-07-07T15:31:46.908Z · score: 30 (28 votes)
Steelmanning the Chinese Room Argument 2017-07-06T09:37:06.760Z · score: 5 (5 votes)
A cheating approach to the tiling agents problem 2017-06-30T13:56:46.000Z · score: 3 (3 votes)
What useless things did you understand recently? 2017-06-28T19:32:20.513Z · score: 7 (7 votes)
Self-modification as a game theory problem 2017-06-26T20:47:54.080Z · score: 10 (10 votes)
Loebian cooperation in the tiling agents problem 2017-06-26T14:52:54.000Z · score: 5 (5 votes)
Thought experiment: coarse-grained VR utopia 2017-06-14T08:03:20.276Z · score: 16 (16 votes)
Bet or update: fixing the will-to-wager assumption 2017-06-07T15:03:23.923Z · score: 26 (26 votes)
Overpaying for happiness? 2015-01-01T12:22:31.833Z · score: 32 (33 votes)
A proof of Löb's theorem in Haskell 2014-09-19T13:01:41.032Z · score: 29 (30 votes)
Consistent extrapolated beliefs about math? 2014-09-04T11:32:06.282Z · score: 6 (7 votes)
Hal Finney has just died. 2014-08-28T19:39:51.866Z · score: 33 (35 votes)
"Follow your dreams" as a case study in incorrect thinking 2014-08-20T13:18:02.863Z · score: 29 (31 votes)
Three questions about source code uncertainty 2014-07-24T13:18:01.363Z · score: 9 (10 votes)
Single player extensive-form games as a model of UDT 2014-02-25T10:43:12.746Z · score: 21 (12 votes)
True numbers and fake numbers 2014-02-06T12:29:08.136Z · score: 19 (29 votes)
Rationality, competitiveness and akrasia 2013-10-02T13:45:31.589Z · score: 14 (15 votes)
Bayesian probability as an approximate theory of uncertainty? 2013-09-26T09:16:04.448Z · score: 16 (18 votes)
Notes on logical priors from the MIRI workshop 2013-09-15T22:43:35.864Z · score: 18 (19 votes)
An argument against indirect normativity 2013-07-24T18:35:04.130Z · score: 1 (14 votes)
"Epiphany addiction" 2012-08-03T17:52:47.311Z · score: 52 (56 votes)
AI cooperation is already studied in academia as "program equilibrium" 2012-07-30T15:22:32.031Z · score: 36 (37 votes)
Should you try to do good work on LW? 2012-07-05T12:36:41.277Z · score: 36 (41 votes)
Bounded versions of Gödel's and Löb's theorems 2012-06-27T18:28:04.744Z · score: 32 (33 votes)
Loebian cooperation, version 2 2012-05-31T18:41:52.131Z · score: 13 (14 votes)
Should logical probabilities be updateless too? 2012-03-28T10:02:09.575Z · score: 12 (15 votes)
Common mistakes people make when thinking about decision theory 2012-03-27T20:03:08.340Z · score: 51 (46 votes)
An example of self-fulfilling spurious proofs in UDT 2012-03-25T11:47:16.343Z · score: 20 (21 votes)
The limited predictor problem 2012-03-21T00:15:26.176Z · score: 10 (11 votes)

Comments

Comment by cousin_it on Why Are So Many Rationalists Polyamorous? · 2019-10-22T08:57:14.197Z · score: 7 (3 votes) · LW · GW

Yeah, I think the data in Jacob's post supports network effects as a more likely explanation than common factors.

Comment by cousin_it on What's your big idea? · 2019-10-22T06:43:14.215Z · score: 5 (2 votes) · LW · GW

Taxes that would slow down urbanisation (by making the state complicit in increases in urban land price/costs of urban services) sound like a real bad idea.

AFAIK the claim is that taxing land value would lead to lower rents overall, not higher. There's some econ reasoning behind that.

Comment by cousin_it on What's your big idea? · 2019-10-21T13:59:43.143Z · score: 10 (3 votes) · LW · GW

The usual Georgist story is that the problem of allocating land can be solved by taxing away all unimproved land value (or equivalently, by the government owning all land and renting it out to the highest bidder); that this won't distort the economy; and that the people who profit from the current land allocation are disproportionately powerful and will block the proposal. Is that related to the problem you're trying to solve?

Comment by cousin_it on Polyamory is Rational(ist) · 2019-10-18T19:48:40.016Z · score: 3 (1 votes) · LW · GW

Would it be fair to summarize that being more rationalist doesn't much increase your chance of choosing poly on your own, but strongly increases your chance of being convinced to become poly by other rationalists?

Comment by cousin_it on The best of the www, in my opinion · 2019-10-18T09:25:35.633Z · score: 10 (6 votes) · LW · GW

I'm fascinated by some individual people on the internet who have managed to build their own small universes of content: Cosma Shalizi's notebooks, Bill Beaty's science site, Kragen Sitaker's mailing list, John Baez's weekly finds. I could name more, but this should already keep you busy for a year or two :-)

Comment by cousin_it on Strong stances · 2019-10-16T10:01:52.216Z · score: 9 (4 votes) · LW · GW

Are successful people unusually confident and optimistic (less rational than average) or unusually good at noticing and taking opportunities (more rational than average)?

Comment by cousin_it on What's going on with "provability"? · 2019-10-13T08:49:03.388Z · score: 8 (5 votes) · LW · GW

Is this just because there are some statements such that neither P nor not-P can be proven, yet one of them must be true? If so, I would (stubbornly) contest that perhaps P and not-P really are both non-true.

Sure, truth is delicate. To move forward in studying this topic, just go with the interpretation that "neither P nor not-P can be proven".

Löb: How can a system of arithmetic prove anything? Much less prove things about proofs?

Arithmetization. If you have a bunch of axioms (which are finite strings in a finite alphabet) and a bunch of rules of inference (which are mechanistic rules for deriving new strings from old ones), then the axioms can be encoded as integers and the rules as arithmetical relations on those integers. Then something like "there exists a sequence of deductions leading to such-and-such theorem" becomes "there exists an integer such that..." followed by a bunch of arithmetic.
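
As a toy sketch of the encoding step (nothing like the real construction, just to make "strings become integers" concrete):

```python
# Toy arithmetization: encode strings over a finite alphabet as integers, so
# that claims about axioms and deduction sequences (which are strings) become
# claims about numbers. The alphabet here is just an illustration.
ALPHABET = "01+*=()<>-~&|ESA"

def encode(s):
    """Encode a string as an integer in base len(ALPHABET)+1 (digit 0 reserved)."""
    n = 0
    for ch in s:
        n = n * (len(ALPHABET) + 1) + ALPHABET.index(ch) + 1
    return n

def decode(n):
    digits = []
    while n:
        n, d = divmod(n, len(ALPHABET) + 1)
        digits.append(ALPHABET[d - 1])
    return "".join(reversed(digits))

assert decode(encode("0=0")) == "0=0"
# "There exists a proof of T" then becomes "there exists an integer n such that
# Proof(n, encode(T))", where Proof is itself an arithmetical relation.
```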

For example, I’ve heard that there are proofs that PA is consistent. Let’s say one of those proofs is set in Proof System X. Now how do we know that Proof System X is consistent? Perhaps it can be proven consistent by using Proof System Y? Do we just end up making an infinite chain of appeals up along a tower of proof systems?

Yeah, these proofs just appeal to something stronger and are pretty pointless.

Oh, speaking of ZFC. There seems to be a debate about whether we should accept the Axiom of Choice. Isn’t it...obviously true?

Well, it doesn't follow from the other axioms, and it has some counterintuitive consequences. That's enough to make it debatable :-)

Comment by cousin_it on Sets and Functions · 2019-10-11T13:15:10.270Z · score: 11 (8 votes) · LW · GW

You're clearly putting a lot of effort into this, but I have doubts about the exposition style. For someone who doesn't know multiple math fields, cat theory is useless. For someone who does, a more compact intro should suffice, like these slides or the SEP entry.

Comment by cousin_it on Thoughts on "Human-Compatible" · 2019-10-11T09:51:24.015Z · score: 3 (1 votes) · LW · GW

Yeah, then I agree with both points. Sneaky!

Comment by cousin_it on Thoughts on "Human-Compatible" · 2019-10-11T08:03:11.042Z · score: 3 (1 votes) · LW · GW

Why would such "dual purpose" plans have higher approval value than some other plan designed purely to maximize approval?

Comment by cousin_it on Thoughts on "Human-Compatible" · 2019-10-11T07:15:27.077Z · score: 4 (2 votes) · LW · GW

You can read about counterfactual oracles in this paper. Stuart also ran a contest on LW about them.

Comment by cousin_it on Thoughts on "Human-Compatible" · 2019-10-10T10:47:49.654Z · score: 22 (9 votes) · LW · GW

Reading this made me notice a pretty general idea, which we can call "decoupling action from utility".

Consequentialist AI: figure out which action, if carried out, would maximize paperclips; then carry out that action.

Decoupled AI 1: figure out which action, if carried out, would maximize paperclips; then print a description of that action.

Decoupled AI 2: figure out which action, if described to a human, would be approved; then carry out that action. (Approval-directed agent)

Decoupled AI 3: figure out which prediction would be true if it were erased (unread) due to a low-probability event; then print that prediction. (Counterfactual oracle)

Any other ideas for "decoupled" AIs, or risks that apply to this approach in general?
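
A schematic sketch of the pattern in Python; every predictor and actuator here is a hypothetical placeholder, and the only point is that the search target and the output channel are chosen independently:

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")  # action (or plan) type

# Consequentialist: search target = predicted utility, output channel = acting.
def consequentialist(actions: Iterable[A], predicted_utility: Callable[[A], float],
                     act: Callable[[A], None]) -> None:
    act(max(actions, key=predicted_utility))

# Decoupled AI 1: same search target, but the output channel is a description.
def decoupled_1(actions: Iterable[A], predicted_utility: Callable[[A], float],
                describe: Callable[[A], str]) -> str:
    return describe(max(actions, key=predicted_utility))

# Decoupled AI 2: search target = predicted human approval of a described action,
# output channel = acting on it.
def decoupled_2(actions: Iterable[A], predicted_approval: Callable[[A], float],
                act: Callable[[A], None]) -> None:
    act(max(actions, key=predicted_approval))

# Toy usage with made-up stand-ins for the predictor and the describer:
plans = ["make 10 paperclips", "turn the planet into paperclips"]
print(decoupled_1(plans, predicted_utility=len, describe=lambda a: "I would: " + a))
```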

Comment by cousin_it on Reflections on Premium Poker Tools: Part 2 - Deciding to call it quits · 2019-10-09T09:09:51.683Z · score: 13 (7 votes) · LW · GW

Yeah, if you're building a money-making funnel, it's a good idea to make numerical guesses about how each step of the funnel will perform, and sanity-check them. Reddit or YouTube audience size is only the top of the funnel; you also need to estimate click-through rate, conversion rate, etc. There are many articles online about that, and many cheap experiments you can run.
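
For instance, a back-of-the-envelope check might look like this (all numbers made up):

```python
# Hypothetical funnel numbers - the point is the multiplication, not the values.
audience        = 50_000   # e.g. monthly reach of a subreddit or channel
click_through   = 0.02     # fraction who visit the landing page
signup_rate     = 0.10     # fraction of visitors who sign up
paid_conversion = 0.05     # fraction of signups who pay
monthly_price   = 15.00    # dollars

monthly_revenue = audience * click_through * signup_rate * paid_conversion * monthly_price
print(f"${monthly_revenue:.0f}/month")   # 50,000 * 0.02 * 0.10 * 0.05 * $15 = $75/month
```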

Comment by cousin_it on Noticing Frame Differences · 2019-10-04T09:29:08.525Z · score: 9 (4 votes) · LW · GW

Seconding everything that Rohin said.

More generally, if you want to talk in an informed way about any science topic that's covered on LW (game theory, probability theory, computational complexity, mathematical logic, quantum mechanics, evolutionary biology, economics...) and you haven't read some conventional teaching materials and done at least a few conventional exercises, there's a high chance you'll be kidding yourself and others. Eliezer gives the impression of getting away with it, but a) he does read stuff and solve stuff, b) cutting corners has burned him a few times.

Comment by cousin_it on What are we assuming about utility functions? · 2019-10-03T13:39:19.256Z · score: 6 (3 votes) · LW · GW

It seems to me that every program behaves as if it were maximizing some utility function. You could try to restrict this by saying the utility function has to be "reasonable", but how?

  • If you say the utility function must have low complexity, that doesn't work - human values are pretty complex.
  • If you say the utility function has to be about world states, that doesn't work - human values are about entire world histories, you'd prevent suffering in the past if you could.
  • If you say the utility function has to be comprehensible to a human, that doesn't work - an AI extrapolating octopus values could give you something pretty alien.

So I'm having trouble spelling out precisely, even to myself, how AIs that satisfy the "utility hypothesis" differ from those that don't. How would you tell, looking at the AI and what it does?
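
To make the first sentence concrete, here's a minimal sketch of the construction I have in mind: for any program there's a (history-based, very complex) utility function that it maximizes by definition.

```python
# For any policy (program mapping observation histories to actions), define a
# utility over interaction histories that is 1 exactly when every action in the
# history matches the policy's output, and 0 otherwise. The policy maximizes
# this utility by construction - which is why the bare "utility hypothesis"
# doesn't constrain behavior at all.

def rationalizing_utility(policy):
    def utility(history):  # history: list of (observation, action) pairs
        return float(all(
            action == policy(tuple(obs for obs, _ in history[:i + 1]))
            for i, (_, action) in enumerate(history)))
    return utility

# Toy usage: a "policy" that always repeats its last observation.
parrot = lambda observations: observations[-1]
u = rationalizing_utility(parrot)
print(u([("hi", "hi"), ("bye", "bye")]))   # 1.0 - exactly what the parrot does
print(u([("hi", "hi"), ("bye", "hi")]))    # 0.0 - any deviation scores zero
```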

Comment by cousin_it on Noticing Frame Differences · 2019-10-01T13:54:33.048Z · score: 16 (8 votes) · LW · GW

I think Vanessa is right. You're looking for a term to describe games where threats or cooperation are possible. The term for such games is non-zero-sum.

There are two kinds of games: zero-sum (or fixed-sum), where the sum of payoffs to the players is always the same regardless of what they do, and non-zero-sum (or variable-sum), where the sum can vary based on what the players do. In the first kind, threats and cooperation don't exist, because anything that helps one player automatically hurts the other by the same amount. In the second kind, threats and cooperation are possible, e.g. there can be a button that nukes both players, which represents both a threat and an opportunity for cooperation.

Calling a game "zero or negative sum" just confuses the issue. You can give everyone a ton of money unconditionally, making the game "positive sum", and the strategic picture of the game won't change at all. The strategic feature you're interested in isn't the sign but the variability of the sum; games with a variable sum are the ones called "non-zero-sum".
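
A quick numerical illustration with matching pennies as the zero-sum example; adding a constant to every payoff makes the game "positive sum" but leaves everyone's best responses unchanged:

```python
import numpy as np

# Payoffs as (row player, column player). Matching pennies is zero-sum.
matching_pennies = np.array([[( 1, -1), (-1,  1)],
                             [(-1,  1), ( 1, -1)]])

def best_responses(game):
    """Each player's best pure response to each pure strategy of the other."""
    row_payoffs = game[:, :, 0]
    col_payoffs = game[:, :, 1]
    row_br = row_payoffs.argmax(axis=0)   # best row against each column
    col_br = col_payoffs.argmax(axis=1)   # best column against each row
    return row_br, col_br

print(best_responses(matching_pennies))
print(best_responses(matching_pennies + 10))   # "positive sum" now, same strategies
```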

If you're thinking about strategic behavior, LW-ish folk knowledge of prisoner's dilemmas and such is really not much to go on. Going through a game theory textbook and solving exercises would be literally the best time investment. My favorite is Ken Binmore's "Fun and Games"; it's on my desk now. An updated version, "Playing for Real", can be downloaded for free.

Comment by cousin_it on The first step of rationality · 2019-09-29T20:49:13.333Z · score: 10 (6 votes) · LW · GW

I feel that focusing too much on myself is kind of a bad habit, or at least it makes me unhappy. (Bertrand Russell noticed the same.) It's more fun to do things or hang out with people.

Comment by cousin_it on A simple environment for showing mesa misalignment · 2019-09-26T11:42:27.617Z · score: 6 (4 votes) · LW · GW

That's a great illustration of what "off-distribution performance" means, thank you for writing this! Just to complete the last paragraph: some things that were abundant in the ancestral environment (like sunlight, or in-person socializing) are scarce now, so many people end up kind of missing these things, because they are important to development, while not spending very much effort to get them.

Comment by cousin_it on How can I reframe my study motivation? · 2019-09-25T08:09:41.072Z · score: 5 (2 votes) · LW · GW

Books I’ve been eyeing / trying to read include The Sequences, The Selfish Gene, and Superintelligence.

These works are written for a popular audience; they only teach you to talk the talk. I think it's better to read textbooks and solve exercises; that way you learn to walk the walk. For some topics, like AI risk, there aren't any textbooks with exercises; but you'll still do well by learning adjacent topics like logic/computation/ML, for which there are good textbooks.

A good strategy is to read a chapter, then solve all exercises not marked "very hard" before moving to the next. Otherwise - no reading ahead. If some exercise is blocking you, you can peek at the answer, but only after spending 5 uninterrupted minutes trying to solve the exercise.

Comment by cousin_it on Build a Causal Decision Theorist · 2019-09-24T08:54:29.213Z · score: 5 (3 votes) · LW · GW

If that’s the case, then I assume that you defect in the twin prisoner’s dilemma.

I do. I would rather be someone who didn't. But I don't see a path to becoming that person without lobotomizing myself.

You could just cooperate, without taking such drastic measures, no?

Comment by cousin_it on Bíos brakhús · 2019-09-21T07:47:39.856Z · score: 7 (3 votes) · LW · GW

Consider an agent that wants to maximize the number of paperclips produced next week. Under the usual formalism, it has stable preferences. Under your proposed formalism, it has changing preferences: on Tuesday it no longer cares about the amount produced on Monday. So it seems like this formalism loses information about stability, and I don't see the point.

Comment by cousin_it on Interview with Aella, Part I · 2019-09-20T08:25:05.346Z · score: 14 (9 votes) · LW · GW

She's probably alright in person, but the ideas don't seem to merit an interview.

Comment by cousin_it on Interview with Aella, Part I · 2019-09-20T05:35:42.284Z · score: 3 (1 votes) · LW · GW

Jacob is just really into these topics. His blog has polyamory interviews, a tutorial on online dating, a tutorial on being an escort, etc.

Comment by cousin_it on Matthew Barnett's Shortform · 2019-09-12T09:39:35.570Z · score: 7 (3 votes) · LW · GW

Huh? Most 8-year-olds can't even make themselves study instead of playing Fortnite, and certainly don't understand the issues with unplanned pregnancies. I'd say 16-18 is about the age when people can start relying on internal structure instead of external. Many take even longer, and need to join the army or something.

Comment by cousin_it on Matthew Barnett's Shortform · 2019-09-12T06:50:35.956Z · score: 6 (4 votes) · LW · GW

Yeah, that's one argument for tradition: it's simply not the pit of misery that its detractors claim it to be. But for parenting in particular, I think I can give an even stronger argument. Children aren't little seeds of goodness that just need to be set free. They are more like little seeds of anything. If you don't shape their values, there's no shortage of other forces in the world that would love to shape your children's values, without having their interests at heart.

Comment by cousin_it on [Link] Truth-telling is aggression in zero-sum frames (Jessica Taylor) · 2019-09-11T09:03:23.604Z · score: 3 (1 votes) · LW · GW

The usual story is that couching truth in politeness is positive-sum - it helps everyone, compared to a world where everyone's truth thorns are out all the time. Jessica does take a passing shot at that story ("non-zero-sum frames, of course, usually interpret truth-telling positively"), but it doesn't seem conclusive to me.

Comment by cousin_it on Political Violence and Distraction Theories · 2019-09-07T06:34:14.450Z · score: 3 (1 votes) · LW · GW

It would be really nice (for both this transcript and previous ones) if you put speaker names in bold not at the start of every paragraph, but only at the start of paragraphs where the speaker changes. As it is, it's interfering with my reading quite a bit.

Comment by cousin_it on How to Throw Away Information · 2019-09-06T08:38:53.523Z · score: 5 (2 votes) · LW · GW

Yeah, it seems that if Y isn't uniquely determined by X, we can't reversibly erase Y from X. Let's say we flip two coins, X = the first, Y = the sum. Since S depends only on X and some randomness, knowing S is equivalent to knowing some distribution over X. If that distribution is non-informative about Y, it must be (1/2,1/2). But then we can't reconstruct X from S and Y.
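
Here's a brute-force check of that in a stripped-down setting (one extra random bit, S limited to four symbols); it finds no S that is both independent of Y and lets you reconstruct X from S and Y:

```python
# Two fair coins X, X2 in {0,1}, Y = X + X2, one random bit R, and S = f(X, R)
# for a deterministic f into a 4-symbol alphabet. We search for an f such that
# S is independent of Y but X is recoverable from (S, Y). None exists here.
from itertools import product
from fractions import Fraction

def check(f):  # f maps (X, R) -> S
    joint = {}  # (s, x, y) -> probability
    for x, x2, r in product(range(2), repeat=3):
        key = (f[(x, r)], x, x + x2)
        joint[key] = joint.get(key, Fraction(0)) + Fraction(1, 8)
    p_s, p_y, p_sy = {}, {}, {}
    for (s, x, y), p in joint.items():
        p_s[s] = p_s.get(s, 0) + p
        p_y[y] = p_y.get(y, 0) + p
        p_sy[(s, y)] = p_sy.get((s, y), 0) + p
    independent = all(p_sy.get((s, y), 0) == p_s[s] * p_y[y]
                      for s in p_s for y in p_y)
    recoverable = all(len({x for (s2, x, y2) in joint if (s2, y2) == (s, y)}) == 1
                      for (s, y) in p_sy)
    return independent and recoverable

all_fs = [dict(zip(product(range(2), range(2)), values))
          for values in product(range(4), repeat=4)]
print(any(check(f) for f in all_fs))   # False: no such S in this toy setting
```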

Comment by cousin_it on The Power to Judge Startup Ideas · 2019-09-05T10:14:34.065Z · score: 9 (4 votes) · LW · GW

If we're talking about the power to say "this startup will fail", I feel like I had that power all along, and it's not very useful unless you can make money by shorting startups or something. It would be more useful to have a power to say "this startup will succeed" - that would actually make you rich - but this post doesn't seem to provide it.

Comment by cousin_it on Book Review: Ages Of Discord · 2019-09-03T13:30:42.579Z · score: 15 (14 votes) · LW · GW

Many of the books you review seem to fall into a specific pattern: big speculative narratives for laypeople. I don't know what you find in them. To me such books are much less interesting than specialist literature, memoirs, or fiction.

Comment by cousin_it on How to Make Billions of Dollars Reducing Loneliness · 2019-08-31T23:13:56.773Z · score: 17 (8 votes) · LW · GW

Sociologists think there are three conditions necessary for making friends: proximity; repeated, unplanned interactions; and a setting that encourages people to let their guard down and confide in each other.

To me these sound more like conditions for making acquaintances, which leads to exactly the emptiness you describe. A true friend isn't just someone you spend time with; it's someone who won't betray you. Maybe true friendship requires going through some hardship together, learning from experience that the other person won't betray you. And maybe there's even a startup idea in setting up such experiences :-)

Comment by cousin_it on Embedded Agency via Abstraction · 2019-08-29T20:04:36.031Z · score: 5 (2 votes) · LW · GW

Yeah. I guess I was assuming that the agent knows the list of tuples and also knows that they came from the procedure I described; the distribution follows from that :-)

Comment by cousin_it on Six AI Risk/Strategy Ideas · 2019-08-29T14:07:15.416Z · score: 6 (3 votes) · LW · GW

It seems to me that AI will need to think about impossible worlds anyway - for counterfactuals, logical uncertainty, and logical updatelessness/trade. That includes worlds that are hard to simulate, e.g. "what if I try researching theory X and it turns out to be useless for goal Y?" So "logical doors" aren't that unlikely.

Comment by cousin_it on Embedded Agency via Abstraction · 2019-08-29T10:45:57.690Z · score: 5 (2 votes) · LW · GW

Hadn’t seen the dice example, is it from Jaynes? (I don’t yet see why you’re better off randomising)

Well, one way to forget the sum is to generate a random pair of dice for each possible sum, then replace the pair at your actual sum with your actual pair. For example, if your dice came up (3 5), you can rewrite your memory with something like "the result was one of (1 1) (2 1) (3 1) (4 1) (4 2) (2 5) (3 5) (4 5) (6 4) (6 5) (6 6)". Is there a simpler way?
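
In code, the procedure I mean is something like this (a sketch, assuming fair six-sided dice):

```python
import random

# One candidate pair per possible sum, with the true pair slotted in at its sum.
def forget_sum(actual_pair):
    memory = []
    for s in range(2, 13):
        candidates = [(a, b) for a in range(1, 7) for b in range(1, 7) if a + b == s]
        memory.append(random.choice(candidates))
    memory[sum(actual_pair) - 2] = actual_pair   # keep the real roll at its own sum
    return memory

# Someone who later learns the sum can recover the roll...
def recover(memory, known_sum):
    return memory[known_sum - 2]

# ...while the list by itself contains one pair for every sum, so its
# distribution doesn't depend on which sum actually occurred.
print(forget_sum((3, 5)))
print(recover(forget_sum((3, 5)), 8))   # -> (3, 5)
```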

Comment by cousin_it on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-28T11:52:17.553Z · score: 5 (2 votes) · LW · GW

Ah, sorry, you're right. To prevent bucket brigades, it's enough to stop using oracles for N days whenever an N-day oracle has an erasure event, and the money from "erasure insurance" can help with that. When there are no erasure events, we can use oracles as often as we want. That's a big improvement, thanks!

Comment by cousin_it on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-28T11:00:09.430Z · score: 5 (2 votes) · LW · GW

Sure, in case of erasure you can decide to use oracles less, and compensate your clients with money you got from "erasure insurance" (since that's a low probability event). But that doesn't seem to solve the problem I'm talking about - UFAI arising naturally in erasure-worlds and spreading to non-erasure-worlds through oracles.

Comment by cousin_it on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-27T22:54:05.333Z · score: 3 (1 votes) · LW · GW

Yeah, agreed on both points.

Comment by cousin_it on A Personal Rationality Wishlist · 2019-08-27T22:52:27.494Z · score: 3 (1 votes) · LW · GW

I just googled a bit, and apparently there are many kinds of "stepper bikes" that you ride standing up while the pedals move up and down; it looks pretty fun. Not sure if they're better at climbing than regular bikes, though.

Comment by cousin_it on Matt Goldenberg's Short Form Feed · 2019-08-27T16:19:28.203Z · score: 5 (2 votes) · LW · GW

Yeah, I guess I wasn't separating these things. A belief like "capitalists take X% of the value created by workers" can feel important both for its moral urgency and for its explanatory power - in politics that's pretty typical.

Comment by cousin_it on Matt Goldenberg's Short Form Feed · 2019-08-27T15:06:35.167Z · score: 3 (1 votes) · LW · GW

Yeah. This problem is especially bad in politics. I've been calling it "importance disagreements", e.g. here and here. There's no definitive blogpost; you're welcome to write one :-)

Comment by cousin_it on A Personal Rationality Wishlist · 2019-08-27T14:19:07.246Z · score: 9 (5 votes) · LW · GW

Yeah. Many people say bikes are more efficient at transforming power into movement, so biking should always be easier according to physics, but in reality walking is sometimes easier. I can think of a couple of explanations: 1) biking doesn't give the best leverage to your strongest muscles, so you end up tiring out the weaker ones; 2) at slow speeds, balancing the bike takes extra effort comparable to walking. I suspect both could be fixed by changing the construction of the bike while still allowing high speed on level roads.

Comment by cousin_it on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-27T12:47:50.761Z · score: 5 (2 votes) · LW · GW

Submission?: high-bandwidth counterfactual oracles are dangerous and shouldn't be used. Explained in this comment.

Comment by cousin_it on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-27T12:06:11.027Z · score: 17 (6 votes) · LW · GW

Thinking about this some more, all high-bandwidth oracles (counterfactual or not) risk receiving messages crafted by future UFAI to take over the present. If the ranges of oracles overlap in time, such messages can colonize their way backwards from decades ahead. It's especially bad if humanity's FAI project depends on oracles - that increases the chance of UFAI in the world where oracles are silent, which is where the predictions come from.

One possible precaution is to use only short-range oracles, and never use an oracle while still in the prediction range of any other oracle. But that has drawbacks: 1) it requires worldwide coordination, 2) it only protects the past. The safety of the present depends on whether you'll follow the precaution in the future. And people will be tempted to bend it, using longer or overlapping ranges to get more power.

In short, if humanity starts using high-bandwidth oracles, that will likely increase the chance of UFAI and hasten it. So such oracles are dangerous and shouldn't be used. Sorry, Stuart :-)

Comment by cousin_it on A Personal Rationality Wishlist · 2019-08-27T10:47:00.051Z · score: 5 (4 votes) · LW · GW

I'm confused. I can run up stairs from a standing start, but can't achieve the same acceleration at the same incline on a bike.

Comment by cousin_it on Six AI Risk/Strategy Ideas · 2019-08-27T08:59:54.315Z · score: 7 (3 votes) · LW · GW

Many such intuitions seem to rely on "doors" between worlds. That makes sense - if we have two rooms of animals connected by a door, then killing all animals in one room will just lead to it getting repopulated from the other room, which is better than killing all animals in both rooms with probability 1/2. So in that case there's indeed a difference between the two kinds of risk.

The question is, how likely is a door between two Everett branches, vs. a door connecting a possible world with an impossible world? With current tech, both are impossible. With sci-fi tech, both could be possible, and based on the same principle (simulating whatever is on the other side of the door). But maybe "quantum doors" are more likely than "logical doors" for some reason?

Comment by cousin_it on A Personal Rationality Wishlist · 2019-08-27T07:56:26.792Z · score: 4 (4 votes) · LW · GW

Biking uphill is harder than walking uphill, though. I wonder if there's a simple mechanical fix (apart from getting off your bike and walking it uphill).

Comment by cousin_it on Six AI Risk/Strategy Ideas · 2019-08-27T07:30:20.114Z · score: 8 (3 votes) · LW · GW

Multiple simultaneous DSAs under CAIS

Taking over the world is a big enough prize, compared to the wealth of a typical agent, that even a small chance of achieving it should already be enough to act. And waiting is dangerous if there's a chance of other agents outrunning you. So multiple agents having DSA but not acting for uncertainty reasons seems unlikely.

Logical vs physical risk aversion

Imagine you care about the welfare of two koalas living in separate rooms. Given a choice between both koalas dying with probability 1/2 and a randomly chosen koala dying with probability 1, why is the latter preferable?

You could say our situation is different because we're the koala. Fine. Imagine you're choosing between a 1/2 physical risk and a 1/2 logical risk to all of humanity, but both of them will happen in 100 years when you're already dead, so the welfare of your copies isn't in question. Why is the physical risk preferable? How is that different from the koala situation?

Comment by cousin_it on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-26T21:37:39.337Z · score: 8 (3 votes) · LW · GW

Yeah. And low-bandwidth oracles can have a milder version of the same problem. Consider your "consequentialist" idea: if UFAI is about to arise, and one of the offered courses of action leads to UFAI getting stopped, then the oracle will recommend against that course of action (and for some other course where UFAI wins and maxes out the oracle's reward).

Comment by cousin_it on Contest: $1,000 for good questions to ask to an Oracle AI · 2019-08-26T12:44:13.327Z · score: 12 (5 votes) · LW · GW

Submission for a counterfactual oracle: precommit that, if the oracle stays silent, a week from now you'll try to write the most useful message to your past self, based on what happens in the world during that week. Ask the oracle to predict that message. This is similar to existing solutions, but slightly more meta, because the content of the message is up to your future self - it could be lottery numbers, science papers, disaster locations, or anything else that fits within the oracle's size limit. (If there's no size limit, just send the whole internet.)
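
A toy sketch of the setup, where the "oracle", the scoring rule, and the message-writing step are all stand-ins; it's only meant to show the precommitment structure:

```python
import random

class ToyOracle:
    """Stand-in oracle; a real one would be trained on the erasure episodes."""
    def predict_message(self) -> str:
        return "sell everything on Tuesday"          # placeholder prediction
    def receive_reward(self, reward: float) -> None:
        print("oracle reward:", reward)

def week_passes_and_we_write_message() -> str:
    # In the erasure branch we live through the week unaided, then compose
    # the most useful message we can for our past selves.
    return "sell everything on Tuesday"              # placeholder content

def run_episode(oracle: ToyOracle, erasure_probability: float = 0.01) -> None:
    prediction = oracle.predict_message()
    if random.random() < erasure_probability:
        # Erasure: nobody reads the prediction; the oracle is scored only on
        # how well it matched the message we actually ended up writing.
        message = week_passes_and_we_write_message()
        oracle.receive_reward(float(prediction == message))
    else:
        # Normal case: we read the prediction now; it approximates what we
        # would have written after the week, had the erasure happened.
        print("prediction received:", prediction)

run_episode(ToyOracle())
```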

You could also form a bucket brigade to relay messages from further ahead, but that's a bad idea. If the oracle's continued silence eventually leads to an unfriendly AI, it can manipulate the past by hijacking your chain of messages and thus make itself much more likely. The same is true for all high-bandwidth counterfactual oracles - they aren't unfriendly in themselves, but using them creates a thicket of "retrocausal" links that can be exploited by any potential future UFAI. The more UFAI risk grows, the less you should use oracles.

Comment by cousin_it on Can we really prevent all warming for less than 10B$ with the mostly side-effect free geoengineering technique of Marine Cloud Brightening? · 2019-08-24T19:48:10.637Z · score: 9 (4 votes) · LW · GW

The 2015 report "Climate Intervention: Reflecting Sunlight to Cool Earth" says existing instruments aren't precise enough to measure the albedo change from such a project, and measuring its climate impact is even trickier. That also makes small-scale experimentation difficult. Basically you'd have to go to 100% and then hope that it worked. As someone who has run many A/B tests, I'm hesitant to press the button until we have better ways to measure the impact.