Being a Robust Agent

post by Raemon · 2018-10-04T21:58:25.522Z · score: 67 (30 votes) · LW · GW · 7 comments



Epistemic status: not adding anything new, but figured there should be a clearer reference post for this concept.

There's a concept which many LessWrong essays have pointed at, but I don't think there's a single post really spelling it out. I've built up an understanding of it through conversations with Zvi and Critch, and reading particular posts by Eliezer such as Meta-Honesty [LW · GW]. (Note: none of them necessarily endorse this post, it's just my own understanding)

The idea is: you might want to become a more robust agent.

By default, humans are a kludgy bundle of ad-hoc impulses. But we have the ability to reflect upon our decision making, and the implications thereof, and derive better overall policies.

I don't think this is quite the same thing as instrumental rationality (although it's tightly entwined). If your goals are simple and well-understood, and you're interfacing with a social domain that has clear rules, the most instrumentally rational thing might be to not overthink it and follow common wisdom.

But it's particularly important if you want to coordinate with other agents, over the long term. Especially on ambitious, complicated projects in novel domains.

Some examples of this:

Game Theory in the Rationalsphere

The EA and Rationality worlds include lots of people with ambitious, complex goals. They have a bunch of common interests and probably should be coordinating on a bunch of stuff. But:

Being a robust agent means taking that into account, and executing strategies that work in a messy, mixed environment with confused allies, active adversaries, and sometimes people who are a little bit of both. (Although this includes creating credible incentives and punishments to deter adversaries from bothering, and encouraging allies to become less confused).

I'm still mulling over exactly how to translate any of this into actionable advice (for myself, let alone others). But all the other posts I wanted to write felt like they'd be easier if I could reference this concept in an off-the-cuff fashion without having to explain it in detail.


Comments sorted by top scores.

comment by habryka (habryka4) · 2018-10-04T23:24:36.315Z · score: 11 (6 votes) · LW · GW
If there isn't enough incentive for others to cooperate with you, don't get upset at them if they defect (or "hit the neutral button [LW · GW].") BUT maybe try to create a coordination mechanism so that there is enough incentive.

It seems like "getting upset" is often a pretty effective way of creating exactly the kind of incentive that leads to cooperation. I am reminded of the recent discussion on investing in the commons, where introducing a way to punish defectors greatly increased total wealth. Generalizing that to more everyday scenarios, it seems that being angry at someone is often (though definitely not always, and probably not in the majority of cases) a way to align incentives better.

(Note: I am not arguing in favor of people getting more angry more often, just saying that not getting angry doesn't seem like a core aspect of the "robust agent" concept that Raemon is trying to point at here)
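The commons dynamic habryka references can be made concrete with a toy public-goods game. This is only an illustrative sketch (the endowment, multiplier, and fine sizes are made-up numbers, not taken from the linked discussion): without a punishment option, defecting strictly beats contributing; adding a costly way to fine defectors flips that.

```python
# Toy public-goods game with an optional punishment stage. All numbers
# are illustrative. Contributors pay their endowment into a pot, the pot
# is multiplied and split evenly; defectors keep their endowment AND
# take a share. With punishment on, each contributor pays a small cost
# per defector, and each defector is fined.

ENDOWMENT = 10    # wealth each player can contribute
MULTIPLIER = 1.6  # pooled contributions are multiplied by this
FINE = 8          # fine levied on each defector when punishment is on
FINE_COST = 1     # cost each contributor pays per defector punished

def payoffs(contributes, punish=False):
    """Payoff per player, given a list of contribute (True) / defect (False) choices."""
    n = len(contributes)
    defectors = contributes.count(False)
    share = ENDOWMENT * (n - defectors) * MULTIPLIER / n
    out = []
    for c in contributes:
        if c:
            p = share - (FINE_COST * defectors if punish else 0)
        else:
            p = ENDOWMENT + share - (FINE if punish else 0)
        out.append(p)
    return out

# Without punishment, a lone defector out-earns the contributors...
print(payoffs([True, True, True, False]))
# ...but with punishment, unilateral defection earns less than the
# all-contribute outcome, so cooperation becomes individually rational.
print(payoffs([True, True, True, False], punish=True))
print(payoffs([True, True, True, True], punish=True))
```

At these numbers the lone defector earns 22 without punishment (versus 16 each if everyone contributes), but only 14 with punishment on, which is the sense in which "getting upset" can align incentives rather than just being a cost.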

comment by Raemon · 2018-10-05T00:17:30.106Z · score: 3 (2 votes) · LW · GW

Ah. The thing I was trying to point at here was the "Be Nice, At Least Until You Can Coordinate Meanness" thing.

The world is full of people who get upset at you for not living up to the norms they prefer. There are, in fact, so many people who will get upset over so many contradictory norms that it just doesn't make much sense to try to live up to them all, and you shouldn't be that surprised when getting upset doesn't work.

The motivating examples were something like "Bob gets upset at people for doing thing X. A little while later, people are still doing thing X. Bob gets upset again. Repeat a couple times. Eventually it (should, according to me) become clear that a) getting upset isn't having the desired effect, or at most is producing the effect of "superficially avoid behavior X when Bob is around". And meanwhile, getting upset is sort of emotionally exhausting and the cost doesn't seem worth it."

I do agree that "get upset" (or more accurately "perform upset-ness") works reasonably well as a localized strategy, and can scale up a bit if you can rally more people to get upset on your behalf. But the post was motivated by people who seemed to get upset... unreflectively?

comment by Richard_Kennaway · 2018-10-05T19:07:54.645Z · score: 3 (2 votes) · LW · GW
Eventually it (should, according to me) become clear that a) getting upset isn't having the desired effect, or at most is producing the effect of "superficially avoid behavior X when Bob is around".

Or "avoid Bob", "drop Bob as a friend", "leave Bob out of anything new", etc. What, if anything, becomes clear to Bob or to those he gets angry with is very underdetermined.

comment by Raemon · 2018-10-05T00:38:08.048Z · score: 2 (1 votes) · LW · GW

(I updated the wording a bit but am not quite happy with it. I do think the underlying point was fairly core to the robust agent thing: you want policies for achieving your goals that actually work. "Getting upset in situation X" might be a good policy, but if you're enacting it as an adaptation-executer rather than as a considered policy, it may not actually be adaptive in your circumstance)

comment by shminux · 2018-10-05T03:42:57.455Z · score: 0 (2 votes) · LW · GW

Seems like you are trying to elaborate on Eliezer's maxim Rationality is Systematized Winning [LW · GW]. Some of what you mentioned implies shedding any kind of ideology, though sometimes wearing a credible mask of having one. Also being smarter than most people around you, both intellectually and emotionally. Of course, if you are already one of those people, then you don't need rationality, because, in all likelihood, you have already succeeded in what you set out to do.

comment by Raemon · 2018-10-05T23:00:38.854Z · score: 5 (3 votes) · LW · GW


I think the thing I'm gesturing at here is related to, but distinct from, the systemized winning thing.

Some distinctions that I think make sense. (But I would defer to people who seem further along this path than I am.)

  • Systemized Winning – The practice of identifying and doing the thing that maximizes your goal (or, if you're not a maximizer, ensures a good distribution of satisfactory outcomes)

  • Law Thinking – (i.e. Law vs Tools [LW · GW]) – Lawful thinking is having a theoretical understanding of what would be the optimal action for maximizing utility, given various constraints. This is a useful idea for a civilization to have. Whether it's directly helpful for you to maximize your utility depends on your goals, environment, and shape-of-your-mental-faculties.

  • For most humans (I'd guess anyone of average intelligence), what you want is for someone else to do the Law thinking: figure out the best thing, figure out the best practical approximation of that, and then distill it down to something you can easily learn.

  • Being a Robust Agent – Particular strategies for pursuing your goals, wherein you strive to have rigorous policy-making, consistent preferences (or consistent ways to resolve inconsistency), ways to reliably trust yourself and others, etc.

  • You might summarize this as "the strategy of embodying lawful thinking to achieve your goals." (not sure if that quite makes sense)

  • I expect this to be most useful for people who:

  • find rigorous policy-level, consistency-driven thinking easy, such that it's just the most natural way for them to approach their problems

  • have a preference for ensuring that their solutions to problems don't break down in edge cases (i.e. nerds often like having explicit understandings of things independent of how useful it is)

  • have goals that will likely cause them to run into edge cases, such that it's more valuable to have figured out in advance how to handle them.

When you look at the Meta-Honesty [LW · GW] post... I don't think the average person will find it a particularly valuable tool for achieving their goals. But I expect there to be a class of person who actually needs it as a tool to figure out how to trust people in domains where it's often necessary to hide or obfuscate information.

Whether you want your decision theory to be robust enough that Omega, simulating you, will give you a million dollars depends a lot on whether you expect Omega to actually be simulating you and making that decision. I know at least some people who are actually arranging their lives with that sort of concern in mind.
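For readers unfamiliar with the Omega scenario (Newcomb's problem), the stakes can be sketched with a quick expected-value calculation. The payoffs are the standard ones from the thought experiment; the accuracy parameter is my own framing of "how much you expect Omega to actually be simulating you correctly":

```python
# A quick expected-value sketch of Newcomb's problem, using the standard
# payoffs: box A always holds $1,000; box B holds $1,000,000 iff Omega
# predicted you would take only box B. `acc` is Omega's prediction accuracy.

def ev_one_box(acc):
    # You get the $1,000,000 only when Omega correctly predicted one-boxing.
    return acc * 1_000_000

def ev_two_box(acc):
    # You always get the $1,000; you also get the $1,000,000 in the
    # cases where Omega wrongly predicted you would one-box.
    return acc * 1_000 + (1 - acc) * 1_001_000

# With a highly accurate predictor, one-boxing wins by a wide margin;
# with a coin-flip predictor, two-boxing wins.
print(ev_one_box(0.9), ev_two_box(0.9))
print(ev_one_box(0.5), ev_two_box(0.5))
```

The crossover sits just above 50% accuracy, so whether robust-to-Omega decision theory pays off really does hinge on how seriously you take the scenario.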

comment by Raemon · 2018-10-05T23:08:56.837Z · score: 5 (3 votes) · LW · GW

I do think there's an alternate frame where you just say "no, rationality is specifically about being a robust agent. There are other ways to be effective, but rationality is the particular way of being effective where you try to have cognitive patterns with good epistemology and robust decision theory."

This is in tension with the "rationalists should win" thing. Shrug.

I think it's important to have at least one concept that is "anyone with goals should ultimately be trying to solve them the best way possible", and at least one concept that is "you might consider specifically studying cognitive patterns and policies and a cluster of related things, as a strategy to pursue particular goals."