My take on higher-order game theory 2021-11-30T05:56:00.990Z
Nisan's Shortform 2021-09-12T06:05:04.965Z
April 15, 2040 2021-05-04T21:18:08.912Z
What is a VNM stable set, really? 2021-01-25T05:43:59.496Z
Why you should minimax in two-player zero-sum games 2020-05-17T20:48:03.770Z
Book report: Theory of Games and Economic Behavior (von Neumann & Morgenstern) 2020-05-11T09:47:00.773Z
Conflict vs. mistake in non-zero-sum games 2020-04-05T22:22:41.374Z
Beliefs at different timescales 2018-11-04T20:10:59.223Z
Counterfactuals and reflective oracles 2018-09-05T08:54:06.303Z
Counterfactuals, thick and thin 2018-07-31T15:43:59.187Z
An environment for studying counterfactuals 2018-07-11T00:14:49.756Z
Logical counterfactuals and differential privacy 2018-02-04T00:17:43.000Z
Oracle machines for automated philosophy 2015-02-17T15:10:04.000Z
Meetup : Berkeley: Beta-testing at CFAR 2014-03-19T05:32:26.521Z
Meetup : Berkeley: Implementation Intentions 2014-02-27T07:06:29.784Z
Meetup : Berkeley: Ask vs. Guess (vs. Tell) Culture 2014-02-19T20:16:30.017Z
Meetup : Berkeley: The Twelve Virtues 2014-02-12T19:56:53.045Z
Meetup : Berkeley: Talk on communication 2014-01-24T03:57:50.244Z
Meetup : Berkeley: Weekly goals 2014-01-22T18:16:38.107Z
Meetup : Berkeley meetup: 5-minute exercises 2014-01-15T21:02:26.223Z
Meetup : Meetup at CFAR, Wednesday: Nutritionally complete bread 2014-01-07T10:25:33.016Z
Meetup : Berkeley: Hypothetical Apostasy 2013-06-12T17:53:40.651Z
Meetup : Berkeley: Board games 2013-06-04T16:21:17.574Z
Meetup : Berkeley: The Motivation Hacker by Nick Winter 2013-05-28T06:02:07.554Z
Meetup : Berkeley: To-do lists and other systems 2013-05-22T01:09:51.917Z
Meetup : Berkeley: Munchkinism 2013-05-14T04:25:21.643Z
Meetup : Berkeley: Information theory and the art of conversation 2013-05-05T22:35:00.823Z
Meetup : Berkeley: Dungeons & Discourse 2013-03-03T06:13:05.399Z
Meetup : Berkeley: Board games 2013-01-29T03:09:23.841Z
Meetup : Berkeley: CFAR focus group 2013-01-23T02:06:35.830Z
A fungibility theorem 2013-01-12T09:27:25.637Z
Proof of fungibility theorem 2013-01-12T09:26:09.484Z
Meetup : Berkeley meetup: Board games! 2013-01-08T20:40:42.392Z
Meetup : Berkeley: How Robot Cars Are Near 2012-12-17T19:46:33.980Z
Meetup : Berkeley: Boardgames 2012-12-05T18:28:09.814Z
Meetup : Berkeley meetup: Hermeneutics! 2012-11-26T05:40:29.186Z
Meetup : Berkeley meetup: Deliberate performance 2012-11-13T23:58:50.742Z
Meetup : Berkeley meetup: Success stories 2012-10-23T22:10:43.964Z
Meetup : Different location for Berkeley meetup 2012-10-17T17:19:56.746Z
[Link] "Fewer than X% of Americans know Y" 2012-10-10T16:59:38.114Z
Meetup : Different location: Berkeley meetup 2012-10-03T08:26:09.910Z
Meetup : Pre-Singularity Summit Overcoming Bias / Less Wrong Meetup Party 2012-09-24T14:46:05.475Z
Meetup : Vienna meetup 2012-09-22T13:14:23.668Z
Meetup report: How harmful is cannabis, and will you change your habits? 2012-09-09T04:50:10.943Z
Meetup : Berkeley meetup: Cannabis, Decision-Making, And A Chance To Change Your Mind 2012-08-29T03:50:23.867Z
Meetup : Berkeley meetup: Operant conditioning game 2012-08-21T15:07:36.431Z
Meetup : Berkeley meetup: Discussion about startups 2012-08-14T17:09:10.149Z
Meetup : Berkeley meetup: Board game night 2012-08-01T06:40:27.322Z
Meetup : Berkeley meetup: Rationalist group therapy 2012-07-25T05:50:53.138Z
Meetup : Berkeley meetup: Argument mapping software 2012-07-18T19:50:27.973Z


Comment by Nisan on Jimrandomh's Shortform · 2022-01-13T05:31:45.420Z · LW · GW

Suppose Alice has a crush on Bob and wants to sort out her feelings with Carol's help. Is it bad for Alice to inform Carol about the crush on condition of confidentiality?

Comment by Nisan on Animal welfare EA and personal dietary options · 2022-01-05T19:28:58.959Z · LW · GW

Your Boycott-itarianism could work just through market signals. As long as your diet makes you purchase less high-cruelty food and more low-cruelty food, you'll increase the average welfare of farm animals, right? Choosing a simple threshold and telling everyone about it is additionally useful for coordination and maybe sending farmers non-market signals, if you believe those work.

If you really want the diet to be robustly good with respect to the question of whether farm animals' lives are net-positive, you'd want to tune the threshold so as not to change the number of animals consumed (per person per year, compared to a default diet, over the whole community). One would have to estimate price elasticities and dig into the details of "cage-free", etc.
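To illustrate the threshold-tuning idea, here's a toy model with entirely made-up numbers (the animals-per-unit figures, welfare levels, and demand shift are all assumptions, not estimates):

```python
def diet_outcome(shift, animals_per_unit=(1.0, 0.8), welfare=(-10.0, -2.0),
                 units=100.0):
    """Animals consumed (per person per year) and their average welfare when
    a fraction `shift` of food units moves from high-cruelty (index 0) to
    low-cruelty (index 1) products. All parameter values are illustrative."""
    units_each = (units * (1 - shift), units * shift)
    animals = [u * a for u, a in zip(units_each, animals_per_unit)]
    total = sum(animals)
    avg_welfare = sum(n * w for n, w in zip(animals, welfare)) / total
    return total, avg_welfare
```

In this toy model a shift toward low-cruelty food raises average welfare but also changes the total number of animals, which is exactly why the threshold would need tuning if you care about the net-positive-lives question.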

Comment by Nisan on My take on higher-order game theory · 2021-12-02T04:48:20.614Z · LW · GW

Yep, I skimmed it by looking at the colorful plots that look like Ising models and reading the captions. Those are always fun.

Comment by Nisan on My take on higher-order game theory · 2021-12-01T18:59:31.031Z · LW · GW

No, I just took a look. The spin glass stuff looks interesting!

Comment by Nisan on My take on higher-order game theory · 2021-12-01T07:23:54.501Z · LW · GW

I think you're saying , right? In that case, since embeds into , we'd have embedding into . So not really a step up.

If you want to play ordinal games, you could drop the requirement that agents are computable / Scott-continuous. Then you get the whole ordinal hierarchy. But then we aren't guaranteed equilibria in games between agents of the same order.

I suppose you could have a hybrid approach: Order is allowed to be discontinuous in its order- beliefs, but higher orders have to be continuous? Maybe that would get you to .

Comment by Nisan on Tears Must Flow · 2021-11-30T21:52:35.384Z · LW · GW

And as a matter of scope, your reaction here is incorrect. [...] Reacting to it as a synecdoche of the agricultural system does not seem useful.

On my reading, the OP is legit saddened by that individual turkey. One could argue that scope demands she be a billion times sadder all the time about poultry farming in general, but that's infeasible. And I don't think that's a reductio against feeling sad about an individual turkey.

Sometimes, sadness and crying are about integrating one's beliefs. There's an intuitive part of your mind that doesn't understand your models of big, global problems. But, like a child, it understands the small tragedies you encounter up close. If it's shocked and surprised, then it is still learning what the rest of you knows about the troubles of the world. If it's angry and outraged, then there's a sense in which those feelings are "about" the big, global problems too.

Comment by Nisan on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-24T01:11:03.255Z · LW · GW

I apologize, I shouldn't have leapt to that conclusion.

Comment by Nisan on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-23T21:56:21.042Z · LW · GW

it legitimately takes the whole 4 years after that to develop real AGI that ends the world. FINE. SO WHAT. EVERYONE STILL DIES.

By Gricean implicature, "everyone still dies" is relevant to the post's thesis. Which implies that the post's thesis is that humanity will not go extinct. But the post is about the rate of AI progress, not human extinction.

This seems like a bucket error, where "will takeoff be fast or slow?" and "will AI cause human extinction?" are put in the same bucket.

Comment by Nisan on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-23T21:50:22.515Z · LW · GW

The central hypothesis of "takeoff speeds" is that at the time of serious AGI being developed, it is perfectly anti-Thielian in that it is devoid of secrets

No, the slow takeoff model just precludes there being one big secret that unlocks both 30%/year growth and Dyson spheres. It's totally compatible with a bunch of medium-sized $1B secrets that different actors discover, adding up to hyperbolic economic growth in the years leading up to "rising out of the atmosphere".

Rounding off the slow takeoff hypothesis to "lots and lots of little innovations adding up to every key AGI threshold, which lots of actors are investing $10 million in at a time" seems like black-and-white thinking, demanding that the future either be perfectly Thielian or perfectly anti-Thielian. The real question is a quantitative one — how lumpy will takeoff be?

Comment by Nisan on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-23T21:45:02.283Z · LW · GW

"Takeoff Speeds" has become kinda "required reading" in discussions on takeoff speeds. It seems like Eliezer hadn't read it until September of this year? He may have other "required reading" from the past four years to catch up on.

(Of course, if one predictably won't learn anything from an article, there's not much point in reading it.)

Comment by Nisan on Transcript: "You Should Read HPMOR" · 2021-11-02T20:37:48.174Z · LW · GW

I don't think "viciousness" is the word you want to use here.

Comment by Nisan on Self-Integrity and the Drowning Child · 2021-10-27T03:50:05.210Z · LW · GW

Ah, great! To fill in some of the details:

  • Given agents A and B and numbers p and q such that p + q = 1, there is an aggregate agent called pA + qB, which means "agents A and B acting together as a group, in which the relative power of A versus B is the ratio of p to q". The group does not make decisions by combining their utility functions, but instead by negotiating or fighting or something.

  • Aggregation should be associative, so p(qA + rB) + (1−p)C = pqA + prB + (1−p)C.

  • If you spell out all the associativity relations, you'll find that aggregation of agents is an algebra over the operad of topological simplices. (See Example 2.)

  • Of course we still have the old VNM-rational utility-maximizing agents. But now we also have aggregates of such agents, which are "less Law-aspiring" than their parts.

  • In order to specify the behavior of an aggregate, we might need more data than the component agents and their relative powers p and q. In that case we'd use some other operad.
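For concreteness, here is one way to spell out the operad these bullets refer to (my notation; the original discussion may set it up differently):

```latex
% The n-ary operations of the operad are the points of the topological
% (n-1)-simplex:
%   \Delta^{n-1} = \{(p_1,\dots,p_n) \mid p_i \ge 0,\ \textstyle\sum_i p_i = 1\},
% and operadic composition multiplies weights:
\gamma\bigl((p_1,\dots,p_n);\,(q_{1,1},\dots,q_{1,k_1}),\dots,(q_{n,1},\dots,q_{n,k_n})\bigr)
  = (p_1 q_{1,1},\,\dots,\,p_1 q_{1,k_1},\,\dots,\,p_n q_{n,k_n}).
% An algebra over this operad assigns to each weight vector an aggregation map
%   (p_1,\dots,p_n)\cdot(A_1,\dots,A_n) = p_1 A_1 + \dots + p_n A_n,
% and associativity of aggregation is exactly the algebra axiom.
```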

Comment by Nisan on Lies, Damn Lies, and Fabricated Options · 2021-10-20T23:34:11.196Z · LW · GW

I like that you glossed the phrase "have your cake and eat it too":

It's like a toddler thinking that they can eat their slice of cake, and still have that very same slice of cake available to eat again the next morning.

I also like that you explained the snowclone "lies, damned lies, and statistics". I'm familiar with both of these cliches, but they're generally overused to the point of meaninglessness. It's clear you used them with purpose.

Comment by Nisan on My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage) · 2021-10-17T21:21:30.885Z · LW · GW

The psychotic break you describe sounds very scary and unpleasant, and I'm sorry you experienced that.

Comment by Nisan on There's No Fire Alarm for Artificial General Intelligence · 2021-09-22T23:57:03.158Z · LW · GW

Typo: "common, share, agreed-on" should be "...shared...".

Comment by Nisan on Quantum Non-Realism · 2021-09-15T05:18:47.150Z · LW · GW


Comment by Nisan on Quantum Non-Realism · 2021-09-14T17:09:03.572Z · LW · GW

"Shut up (−1/3)i and calculate." is a typo that isn't present in the original post.

Comment by Nisan on Nisan's Shortform · 2021-09-12T06:05:06.375Z · LW · GW

People are fond of using the neologism "cruxy", but there's already a word for that: "crucial". Apparently this sense of "crucial" can be traced back to Francis Bacon.

Comment by Nisan on Final Words · 2021-08-25T08:36:02.062Z · LW · GW

This story originally had a few more italicized words, and they make a big difference:

"Don't," Jeffreyssai said. There was real pain in it.


"I do not know," said Jeffreyssai, from which the overall state of the evidence was obvious enough.

Some of the formatting must have been lost when it was imported to LessWrong 2.0. You can see the original formatting at and in Rationality: AI to Zombies.

Comment by Nisan on Mildly against COVID risk budgets · 2021-08-17T22:41:57.496Z · LW · GW

It seems to me that if I make some reasonable-ish assumptions, then 2 micromorts is equivalent to needing to drive for an hour at a random time in my life. I expect the value of my time to change over my life, but I'm not sure in which direction. So equating 2 micromorts with driving for an hour tonight is probably not a great estimate.

How do you deal with this? Have you thought about it and concluded that the value of your time today is a good estimate of the average value over your life? Or are you assuming that the value of your time won't change by more than, say, a factor of 2 over your life?
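For reference, one way to get an equivalence like "2 micromorts ≈ 1 hour" is to price the risk via a value of a statistical life and divide by the value of an hour. The two numbers below are placeholder assumptions chosen to reproduce the equivalence, not endorsed figures:

```python
VSL = 10_000_000        # value of a statistical life, in dollars (assumed)
value_of_hour = 20.0    # dollar value of one hour of my time (assumed)

def micromorts_in_dollars(micromorts):
    """Dollar cost of a small mortality risk, linearized from the VSL."""
    return micromorts * 1e-6 * VSL

def micromorts_in_hours(micromorts):
    """Hours of time whose value equals the given mortality risk."""
    return micromorts_in_dollars(micromorts) / value_of_hour
```

Note that the result scales inversely with `value_of_hour`, which is why the equivalence drifts if the value of your time changes over your life.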

Comment by Nisan on The Landmark Forum — a rationalist's first impression · 2021-08-10T19:31:41.085Z · LW · GW

Abstract and pdf. Wow, 500 pages!

Comment by Nisan on [Link] Musk's non-missing mood · 2021-07-13T21:40:39.940Z · LW · GW

Agreed that it shouldn't be hard to do that, but I expect that people will often continue to do what they find intrinsically motivating, or what they're good at, even if it's not overall a good idea. If this article can be believed, a senior researcher said that they work on capabilities because "the prospect of discovery is too sweet".

Comment by Nisan on Daniel Kokotajlo's Shortform · 2021-07-08T00:21:49.703Z · LW · GW

It's fine to say that if you want the conversation to become a discussion of AI timelines. Maybe you do! But not every conversation needs to be about AI timelines.

Comment by Nisan on Confusions re: Higher-Level Game Theory · 2021-07-02T17:49:03.979Z · LW · GW

I feel excited about this framework! Several thoughts:

I especially like the metathreat hierarchy. It makes sense because if you completely curry it, each agent sees the foe's action, policy, metapolicy, etc., which are all generically independent pieces of information. But it gets weird when an agent sees an action that's not compatible with the foe's policy.

You hinted briefly at using hemicontinuous maps of sets instead of or in addition to probability distributions, and I think that's a big part of what makes this framework exciting. Maybe if one takes a bilimit of Scott domains or whatever, you can have an agent that can be understood simultaneously on multiple levels, and so evade commitment races. I haven't thought much about that.

I think you're right that the epiphenomenal utility functions are not good. I still think using reflective oracles is a good idea. I wonder if the power of Kakutani fixed points (magical reflective reasoning) can be combined with the power of Kleene fixed points (iteratively refining commitments).

Comment by Nisan on Is there a "coherent decisions imply consistent utilities"-style argument for non-lexicographic preferences? · 2021-06-30T18:43:53.386Z · LW · GW

Oh you're right, I was confused.

Comment by Nisan on Is there a "coherent decisions imply consistent utilities"-style argument for non-lexicographic preferences? · 2021-06-29T21:10:01.514Z · LW · GW

I've no idea if this example has appeared anywhere else. I'm not sure how seriously to take it.

Comment by Nisan on Is there a "coherent decisions imply consistent utilities"-style argument for non-lexicographic preferences? · 2021-06-29T21:08:08.539Z · LW · GW

Consider the following game: At any time , you may say "stop!", in which case you'll get the lottery that resolves to an outcome you value at with probability , and to an outcome you value at with probability . If you don't say "stop!" in that time period, we set .

Let's say at every instant in you can decide to either say "stop!" or to wait a little longer. (A dubious assumption, as it lets you make infinitely many decisions in finite time.) Then you'll naturally wait until and get a payoff of . It would have been better for you to say "stop!" at , in which case you'd get .

You can similarly argue that it's irrational for your utility to be discontinuous in the amount of wine in your glass: Otherwise you'll let the waiter fill up your glass and then be disappointed the instant it's full.

Comment by Nisan on Sam Altman and Ezra Klein on the AI Revolution · 2021-06-27T22:04:53.027Z · LW · GW

I haven't seen a writeup anywhere of how it was trained.

Comment by Nisan on Sam Altman and Ezra Klein on the AI Revolution · 2021-06-27T07:03:51.769Z · LW · GW

The instruction-following model Altman mentions is documented here. I hadn't noticed it had been released!

Comment by Nisan on How one uses set theory for alignment problem? · 2021-05-30T07:37:20.938Z · LW · GW

See section 2 of this Agent Foundations research program and citations for discussion of the problems of logical uncertainty, logical counterfactuals, and the Löbian obstacle. Or you can read this friendly overview. Gödel-Löb provability logic has been used here.

I don't know of any application of set theory to agent foundations research. (Like large cardinals, forcing, etc.)

Comment by Nisan on Dario Amodei leaves OpenAI · 2021-05-28T20:02:25.556Z · LW · GW

Ah, 90% of the people discussed in this post are now working for Anthropic, along with a few other ex-OpenAI safety people.

Comment by Nisan on The Homunculus Problem · 2021-05-28T19:28:01.372Z · LW · GW

Here's a fun and pointless way one could rescue the homunculus model: There's an infinite regress of homunculi, each of which sees a reconstructed image. As you pass up the chain of homunculi, the shadow gets increasingly attenuated, approaching but never reaching complete invisibility. Then we identify "you" with a suitable limit of the homunculi, and what you see is the entire sequence of images under some equivalence relation which "forgets" how similar A and B were early in the sequence, but "remembers" the presence of the shadow.

Comment by Nisan on The Homunculus Problem · 2021-05-28T19:06:45.603Z · LW · GW

The homunculus model says that all visual perception factors through an image constructed in the brain. One should be able to reconstruct this image by asking a subject to compare the brightness of pairs of checkerboard squares. A simplistic story about the optical illusion is that the brain detects the shadow and then adjusts the brightness of the squares in the constructed image to exactly compensate for the shadow, so the image depicts the checkerboard's inferred intrinsic optical properties. Such an image would have no shadow, and since that's all the homunculus sees, the homunculus wouldn't perceive a shadow.

That story is not quite right, though. Looking at the picture, the black squares in the shadow do seem darker than the dark squares outside the shadow, and similarly for the white squares. I think if you reconstructed the virtual image using the above procedure you'd get an image with an attenuated shadow. Maybe with some more work you could prove that the subject sees a strong shadow, not an attenuated one, and thereby rescue Abram's argument.

Edit: Sorry, misread your comment. I think the homunculus theory is that in the real image, the shadow is "plainly visible", but the reconstructed image in the brain adjusts the squares so that the shadow is no longer present, or is weaker. Of course, this raises the question of what it means to say the shadow is "plainly visible"...

Comment by Nisan on The Homunculus Problem · 2021-05-27T23:29:33.753Z · LW · GW

This is the sort of problem Dennett's Consciousness Explained addresses. I wish I could summarize it here, but I don't remember it well enough.

It uses the heterophenomenological method, which means you take a dataset of earnest utterances like "the shadow appears darker than the rest of the image" and "B appears brighter than A", and come up with a model of perception/cognition to explain the utterances. In practice, as you point out, homunculus models won't explain the data. Instead the model will say that different cognitive faculties will have access to different pieces of information at different times.

Comment by Nisan on The Argument For Spoilers · 2021-05-22T20:30:14.045Z · LW · GW

Very interesting. I would guess that to learn in the presence of spoilers, you'd need not only a good model of how you think, but also a way of updating the way you think according to the model's recommendations. And I'd guess this is easiest in domains where your object-level thinking is deliberate rather than intuitive, which would explain why the flashcard task would be hardest for you.

When I read about a new math concept, I eventually get the sense that my understanding of it is "fake", and I get "real" understanding by playing with the concept and getting surprised by its behavior. I assumed the surprise was essential for real understanding, but maybe it's sufficient to track which thoughts are "real" vs. "fake" and replace the latter with the former.

Comment by Nisan on The Argument For Spoilers · 2021-05-21T21:31:20.755Z · LW · GW

Have you had any success learning the skill of unseeing?

  • Are you able to memorize things by using flashcards backwards (looking at the answer before the prompt) nearly as efficiently as using them the usual way?
  • Are you able to learn a technical concept from worked exercises nearly as well as by trying the exercises before looking at the solutions?
  • Given a set of brainteasers with solutions, can you accurately predict how many of them you would have been able to solve in 5 minutes if you had not seen the solutions?
Comment by Nisan on Reflexive Oracles and superrationality: prisoner's dilemma · 2021-05-13T21:31:04.970Z · LW · GW

See also this comment from 2013 that has the computable version of NicerBot.

Comment by Nisan on Prisoner's Dilemma (with visible source code) Tournament · 2021-05-13T21:26:54.011Z · LW · GW

This algorithm is now published in "Robust program equilibrium" by Caspar Oesterheld, Theory and Decision (2019) 86:143–159, which calls it ϵGroundedFairBot.

The paper cites this comment by Jessica Taylor, which has the version that uses reflective oracles (NicerBot). Note also the post by Stuart Armstrong it's responding to, and the reply by Vanessa Kosoy. The paper also cites a private conversation with Abram Demski. But as far as I know, the parent to this comment is older than all of these.
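From memory, the published construction is roughly the following (a sketch, not the paper's exact formulation; the helper names and ε value are mine):

```python
import random

def make_fairbot(epsilon=0.05):
    """epsilon-grounded fair bot: with probability epsilon, cooperate
    outright; otherwise run the opponent against our own program and copy
    its move. The unconditional cooperation grounds the recursion, so
    nested simulations terminate with probability 1."""
    def fairbot(opponent):
        if random.random() < epsilon:
            return "C"
        return opponent(fairbot)
    return fairbot
```

In self-play the mutual simulation bottoms out at some level in an unconditional "C", which then propagates up, so two such bots cooperate with probability 1; against an unconditional defector the bot defects with probability 1 − ε.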

Comment by Nisan on Challenge: know everything that the best go bot knows about go · 2021-05-11T10:05:42.256Z · LW · GW

Or maybe it means we train the professional in the principles and heuristics that the bot knows. The question is if we can compress the bot's knowledge into, say, a 1-year training program for professionals.

There are reasons to be optimistic: We can discard information that isn't knowledge (lossy compression). And we can teach the professional in human concepts (lossless compression).

Comment by Nisan on Challenge: know everything that the best go bot knows about go · 2021-05-11T05:44:58.846Z · LW · GW

This sounds like a great goal, if you mean "know" in a lazy sense; I'm imagining a question-answering system that will correctly explain any game, move, position, or principle as the bot understands it. I don't believe I could know all at once everything that a good bot knows about go. That's too much knowledge.

Comment by Nisan on April 15, 2040 · 2021-05-06T03:39:09.354Z · LW · GW

The assistant could have a private key generated by the developer, held in a trusted execution environment. The assistant could invoke a procedure in the trusted environment that dumps the assistant's state and cryptographically signs it. It would be up to the assistant to make a commitment in such a way that it's possible to prove that a program with that state will never try to break the commitment. Then to trust the assistant you just have to trust the datacenter administrator not to tamper with the hardware, and to trust the developer not to leak the private key.
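A minimal sketch of the dump-and-sign step, with HMAC standing in for the asymmetric signature a real trusted execution environment would use (the key and state format are invented for illustration):

```python
import hashlib
import hmac
import json

# Stand-in for the key provisioned into the trusted environment (assumed).
DEVELOPER_KEY = b"key-provisioned-into-the-trusted-environment"

def dump_and_sign(assistant_state: dict) -> dict:
    """Serialize the assistant's state and sign it, so a third party with
    the verification key can check that this exact state made the commitment."""
    blob = json.dumps(assistant_state, sort_keys=True).encode()
    return {
        "state": blob.decode(),
        "sha256": hashlib.sha256(blob).hexdigest(),
        "signature": hmac.new(DEVELOPER_KEY, blob, hashlib.sha256).hexdigest(),
    }

def verify(dump: dict) -> bool:
    """Check that the signed state dump has not been tampered with."""
    blob = dump["state"].encode()
    expected = hmac.new(DEVELOPER_KEY, blob, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, dump["signature"])
```

The hard part the comment points at is not the signing but the other half: proving that a program with the signed state will never try to break the commitment.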

Comment by Nisan on Psyched out · 2021-05-06T03:24:30.063Z · LW · GW

Welcome! I recommend checking out the Sequences. It's what I started with.

Comment by Nisan on April 15, 2040 · 2021-05-05T06:42:54.158Z · LW · GW

Yep, that is a good question and I'm glad you're asking it!

I don't know the answer. One part of it is whether the assistant is able and willing to interact with me in a way that is compatible with how I want to grow as a person.

Another part of the question is whether people in general want to become more prosocial or more cunning, or whatever. Or if they even have coherent desires around this.

Another part is whether it's possible for the assistant to follow instructions while also helping me reach my personal growth goals. I feel like there's some wiggle room there. What if, after I asked whether I'd be worse off if the government collapsed, the assistant had said "Remember when we talked about how you'd like to get better at thinking through the consequences of your actions? What do you think would happen if the government collapsed, and how would that affect people?"

Comment by Nisan on April 15, 2040 · 2021-05-05T06:28:01.552Z · LW · GW

Yeah, I would be very nervous about making an exception to my assistant's corrigibility. Ultimately, it would be prudent to be able to make some hard commitments after thinking very long and carefully about how to do that. In the meantime, here are a couple corrigibility-preserving commitment mechanisms off the top of my head:

  • Escrow: Put resources in a dumb incorrigible box that releases them under certain conditions.
  • The AI can incorrigibly make very short-lived commitments during atomic actions (like making a purchase).

Are these enough to maintain competitiveness?

Comment by Nisan on April 15, 2040 · 2021-05-04T23:18:51.231Z · LW · GW

Yeah, I spend at least as much time interacting with my phone/computer as with my closest friends. So if my phone were smarter, it would affect my personal development as much as my friends do, which is a lot.

Comment by Nisan on Dario Amodei leaves OpenAI · 2021-05-03T08:33:27.117Z · LW · GW

Also Daniela Amodei, Nicholas Joseph, and Amanda Askell apparently left in December, January, and February, according to their LinkedIn profiles.

Comment by Nisan on Homeostatic Bruce · 2021-04-09T06:04:05.434Z · LW · GW

Probably this one: Stuck in the middle with Bruce (first mentioned on Less Wrong here).

Comment by Nisan on My research methodology · 2021-03-24T05:37:15.915Z · LW · GW

Red-penning is a general problem-solving method that's kinda similar to this research methodology.

Comment by Nisan on Dario Amodei leaves OpenAI · 2021-02-05T17:43:13.850Z · LW · GW

Also Jacob Jackson left, saying his new project is

Comment by Nisan on Dario Amodei leaves OpenAI · 2021-02-05T17:06:58.753Z · LW · GW

Gwern reports that Tom Brown, Sam McCandlish, Tom Henighan, and Ben Mann have also left.