A reply to Agnes Callard 2020-06-28T03:25:27.378Z · score: 91 (25 votes)
Public Positions and Private Guts [Transcript] 2020-06-26T23:00:52.838Z · score: 21 (6 votes)
How alienated should you be? 2020-06-14T15:55:24.043Z · score: 35 (18 votes)
Outperforming the human Atari benchmark 2020-03-31T19:33:46.355Z · score: 59 (23 votes)
Mod Notice about Election Discussion 2020-01-29T01:35:53.947Z · score: 63 (22 votes)
Circling as Cousin to Rationality 2020-01-01T01:16:42.727Z · score: 72 (35 votes)
Self and No-Self 2019-12-29T06:15:50.192Z · score: 47 (17 votes)
T-Shaped Organizations 2019-12-16T23:48:13.101Z · score: 51 (14 votes)
ialdabaoth is banned 2019-12-13T06:34:41.756Z · score: 31 (18 votes)
The Bus Ticket Theory of Genius 2019-11-23T22:12:17.966Z · score: 66 (20 votes)
Vaniver's Shortform 2019-10-06T19:34:49.931Z · score: 10 (1 votes)
Vaniver's View on Factored Cognition 2019-08-23T02:54:00.915Z · score: 41 (9 votes)
Conversation on forecasting with Vaniver and Ozzie Gooen 2019-07-30T11:16:58.633Z · score: 43 (11 votes)
Commentary On "The Abolition of Man" 2019-07-15T18:56:27.295Z · score: 65 (15 votes)
Is there a guide to 'Problems that are too fast to Google'? 2019-06-17T05:04:39.613Z · score: 49 (15 votes)
Steelmanning Divination 2019-06-05T22:53:54.615Z · score: 152 (61 votes)
Public Positions and Private Guts 2018-10-11T19:38:25.567Z · score: 95 (30 votes)
Maps of Meaning: Abridged and Translated 2018-10-11T00:27:20.974Z · score: 54 (22 votes)
Compact vs. Wide Models 2018-07-16T04:09:10.075Z · score: 32 (13 votes)
Thoughts on AI Safety via Debate 2018-05-09T19:46:00.417Z · score: 88 (21 votes)
Turning 30 2018-05-08T05:37:45.001Z · score: 75 (24 votes)
My confusions with Paul's Agenda 2018-04-20T17:24:13.466Z · score: 90 (22 votes)
LW Migration Announcement 2018-03-22T02:18:19.892Z · score: 139 (37 votes)
LW Migration Announcement 2018-03-22T02:17:13.927Z · score: 2 (2 votes)
Leaving beta: Voting on moving to 2018-03-11T23:40:26.663Z · score: 6 (6 votes)
Leaving beta: Voting on moving to 2018-03-11T22:53:17.721Z · score: 139 (42 votes)
LW 2.0 Open Beta Live 2017-09-21T01:15:53.341Z · score: 23 (23 votes)
LW 2.0 Open Beta starts 9/20 2017-09-15T02:57:10.729Z · score: 24 (24 votes)
Pair Debug to Understand, not Fix 2017-06-21T23:25:40.480Z · score: 8 (8 votes)
Don't Shoot the Messenger 2017-04-19T22:14:45.585Z · score: 11 (11 votes)
The Quaker and the Parselmouth 2017-01-20T21:24:12.010Z · score: 6 (7 votes)
Announcement: Intelligence in Literature Prize 2017-01-04T20:07:50.745Z · score: 9 (9 votes)
Community needs, individual needs, and a model of adult development 2016-12-17T00:18:17.718Z · score: 12 (13 votes)
Contra Robinson on Schooling 2016-12-02T19:05:13.922Z · score: 4 (5 votes)
Downvotes temporarily disabled 2016-12-01T17:31:41.763Z · score: 17 (18 votes)
Articles in Main 2016-11-29T21:35:17.618Z · score: 3 (4 votes)
Linkposts now live! 2016-09-28T15:13:19.542Z · score: 27 (30 votes)
Yudkowsky's Guide to Writing Intelligent Characters 2016-09-28T14:36:48.583Z · score: 4 (5 votes)
Meetup : Welcome Scott Aaronson to Texas 2016-07-25T01:27:43.908Z · score: 1 (2 votes)
Happy Notice Your Surprise Day! 2016-04-01T13:02:33.530Z · score: 14 (15 votes)
Posting to Main currently disabled 2016-02-19T03:55:08.370Z · score: 22 (25 votes)
Upcoming LW Changes 2016-02-03T05:34:34.472Z · score: 46 (47 votes)
LessWrong 2.0 2015-12-09T18:59:37.232Z · score: 92 (96 votes)
Meetup : Austin, TX - Petrov Day Celebration 2015-09-15T00:36:13.593Z · score: 1 (2 votes)
Conceptual Specialization of Labor Enables Precision 2015-06-08T02:11:20.991Z · score: 10 (11 votes)
Rationality Quotes Thread May 2015 2015-05-01T14:31:04.391Z · score: 9 (10 votes)
Meetup : Austin, TX - Schelling Day 2015-04-13T14:19:21.680Z · score: 1 (2 votes)
Sapiens 2015-04-08T02:56:25.114Z · score: 42 (36 votes)
Thinking well 2015-04-01T22:03:41.634Z · score: 28 (29 votes)
Rationality Quotes Thread April 2015 2015-04-01T13:35:48.660Z · score: 7 (9 votes)


Comment by vaniver on (answered: yes) Has anyone written up a consideration of Downs's "Paradox of Voting" from the perspective of MIRI-ish decision theories (UDT, FDT, or even just EDT)? · 2020-07-08T22:27:20.376Z · score: 2 (1 votes) · LW · GW
In an election with two choices, in a model where everybody has 50% chance of voting for either side, I don't think the claim is true.

I also think that in that case, the odds of a tie don't decrease faster than linearly, but you need to take into account symmetry arguments and precision arguments. That is:

Suppose there are 2N other voters and everyone else votes by flipping a coin. Then the number of votes A for side A will be distributed as Binomial(2N, 0.5) with mean N, the votes for side B will be 2N - A, and the net margin A - B will be 2A - 2N, with an expected value of 0.

But how likely is it to be 0 exactly (i.e. a tie that you flip to a win)? Well, that's the probability that A is exactly N, which is a decreasing function of N. Suppose N is 1,000 (i.e. there are 2,000 voters); then it's 1.7%. Suppose it's 1,000,000; then it's 0.05%. But 1.7% divided by a thousand is 0.0017%, well below 0.05%: a thousandfold increase in voters cut the tie probability by only a factor of about thirty, so it decreases much more slowly than linearly.

But from the perspective of everyone in the election, it's not clear why 'you chose last.' Presumably everyone on the side with one extra vote would think "aha, it would have been a tied election if I hadn't voted," and splitting that up gives us our linear factor.

As well, this hinged on the probability being 0.5 exactly. If instead it was 50.1% favor for A, the odds of a tie are basically unchanged for the 2,000 voter election (we've only shifted the expected number of A voters by 2), but drop to about 1e-5 for the 2M voter election, which is more than a thousand times smaller than the 2,000-voter figure; with even this slight bias, the tie probability now falls faster than linearly in the electorate size. (The expected number of A voters now exceeds N by 2,000, which is a much higher barrier to overcome by chance.)

However, symmetry doesn't help us here. Suppose you have a distribution over the 'bias' of the coin the other voters are flipping; a tie is just as unlikely if A is favored as if B is favored, and the more spread out our distribution over the bias is, the worse the odds of a tie are, because for large elections only biases very close to p=0.5 contribute any meaningful chance of a tie.
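The figures above are easy to check numerically. A minimal sketch (the `tie_prob` helper is mine, not from the discussion; it evaluates the binomial pmf at the midpoint via log-gamma to avoid overflow for large electorates):

```python
import math

def tie_prob(n, p=0.5):
    """P(exactly n of 2n coin-flip voters choose side A), i.e. an exact tie.

    Computes the Binomial(2n, p) pmf at k = n in log space for stability.
    """
    log_pmf = (math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)
               + n * math.log(p) + n * math.log(1 - p))
    return math.exp(log_pmf)

print(f"{tie_prob(1_000):.4f}")             # ~0.0178, i.e. 1.7%
print(f"{tie_prob(1_000_000):.6f}")         # ~0.000564, i.e. 0.05%
print(f"{tie_prob(1_000_000, p=0.501):.2e}")  # ~1e-5: slight bias crushes tie odds
```

The first two lines show the 1/sqrt(pi*N) scaling (slower than linear); the third shows how sensitive the large-election tie probability is to the coin being exactly fair.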

Comment by vaniver on Antitrust-Compliant AI Industry Self-Regulation · 2020-07-07T22:55:01.172Z · score: 8 (5 votes) · LW · GW

My summary / commentary:

Often, AI safety proponents talk about things that might be nice, like agreements to not do dangerous things, and focus on the questions of how to make those agreements in everyone's interest, or to measure compliance with them, or so on. Often these hopes take the shape of voluntary agreements adopted by professional organizations, or by large companies that jointly dominate a field. [In my personal view, it seems more likely we can convince AI engineers and researchers than legislators to adopt sensible policies, especially in the face of potentially rapid change.]

This paper asks the question: could such agreements even be legal? What underlying factors drive legality, so that we could structure the agreements to maximize the probability that they would hold up in court?

Overall, I appreciated the groundedness of the considerations, and the sense of spotting a hole that I might otherwise have missed. [I'm too used to thinking of antitrust in the context of 'conspiracy against the public' that it didn't occur to me that a 'conspiracy for the public' might run afoul of the prohibitions, and yet once pointed out it seems definitely worth checking.]

An obvious followup question that occurs to me: presumably in order to be effective, these agreements would have to be international. [Some sorts of unsafe AI, like autonomous vehicles, mostly do local damage, but other sorts of unsafe AI, like autonomous hackers, can easily do global damage, and creators can preferentially seek out legal environments favorable to their misbehavior.] Are there similar sorts of obstacles that would stand in the way of global coordination?

Comment by vaniver on Antitrust-Compliant AI Industry Self-Regulation · 2020-07-07T22:22:36.418Z · score: 4 (3 votes) · LW · GW

Direct pdf link.

Comment by vaniver on A reply to Agnes Callard · 2020-06-30T23:45:56.423Z · score: 7 (3 votes) · LW · GW

Also relevant: The Asshole Filter.

Comment by vaniver on A reply to Agnes Callard · 2020-06-30T23:44:37.556Z · score: 6 (3 votes) · LW · GW
Thanks for bringing this problem to my attention.

You're welcome, and I'm curious to see what you end up thinking here.

I think if you disagree with what someone thinks, or plans to do, the rational response is an argument to persuade them that they are wrong. (This is true irrespectively of whether they were, themselves, arguing, and it goes for the fruit-seller, the wrestler, etc. too.)

As pointed out by Raemon in a sibling comment, here I think we want to start using a more precise word than "rational." [Up until this point, I think I've been using "engage rationally" in a 'standard' way instead of in a 'Less Wrong specific way'.]

I'm going to say the 'argumentative' response is an 'argument to persuade them that they are wrong', and agree that purely argumentative responses are important for communicative rationality. The thing that's good about argumentative responses (as opposed to, say, purely persuasive ones) is that they attempt to be weaker when the claims they favor are not true than when they are true; and this helps us sort our beliefs and end up with truer ones.

I think for many disagreements, however, I want to do a thing that doesn't quite feel like argumentation; I want to appeal to reality. This involves two steps: first, an 'argument' over what observations imply about our beliefs, and second, an observation of reality that then shifts our beliefs. The first is an argument, and we do actually have to agree on the relationship between observations and beliefs for the second step to do anything useful. This doesn't help us establish logical truths, or things that would be true in any world, except indirectly; what it does help us do is establish empirical truths, or things that are true in our world (but could be false in others). Imagine a portal across universes that allows us to communicate with aliens who live under different physics than we do; it would be a tremendous surprise for our mathematicians and their mathematicians to disagree, whereas our chemists and their chemists disagreeing wouldn't be surprising at all.

I think that the wrestling match falls into this category; if a rival claims "I could out-wrestle Plato", then while Plato could respond with theories of wrestling and other logic, the quickest path to truth seems to be Plato responding with "let's settle that question in the ring." There's the same two-part structure of "agree on what test bears on the question" and then "actually running the test." I don't think buying fruit falls into this category. ["Quickest path to truth" might not be the right criterion, here, but it feels likely to be close.]

Continuing to flesh out this view, besides "appeal to logic" and "appeal to reality" there's something like "assertion of influence." This seems like the category that buying fruit falls into; I have some ability to change the external world to be more to my liking, and I trade some of that ability to the merchant for fruit. There seem to be ethical ways to do this (like free commerce) and unethical ways to do this (like stealing), and in particular there seem to be many ways for assertion of influence to choke off good things.

I think 'ethical' and 'unethical' look more like 'congruent with values X' or 'incongruent with values X' than it does like 'logically valid' or 'logically invalid'. [In this way, it more resembles the category of empirical truth, in which things are 'congruent with world X' or 'incongruent with world X' as opposed to 'congruent with all possible worlds' or not.]

And so we end up with questions that look like "how do we judge what influence is congruent with our values, and what influence is incongruent with our values?", and further questions upstream like "what do our meta-values imply our values should be?", and so on.

[There's much more to say here, but I think I'll leave it at this for now.]

Comment by vaniver on A reply to Agnes Callard · 2020-06-29T23:35:13.372Z · score: 8 (3 votes) · LW · GW

A quick comment on just this part:

I also am confused by the assertion that the petition would benefit in some way from a "if after consideration you decide we're wrong, we'll support you" clause. It does not seem necessary or wise, before one attempts to persuade another that they are wrong, to agree to support their conclusion if they engage in careful consideration. Even when incentives are aligned.

I think what this does is separate out the "I think you should update your map" message and the "I am shifting your incentives" message. If I think that someone would prefer the strawberry ice cream to the vanilla ice cream, I can simply offer that information to them, or I can advise them to get the strawberry, and make it a test of our friendship whether they follow my advice, rewarding them if they try it and like it, and punishing them in all other cases.

In cases where you simply want to offer information, and not put any 'undue' pressure on their decision-making, it seems good to be able to flag that. The broader question is something like "what pressures are undue, and why?"; you could imagine that there are cases where I want to shift the incentives of others, like telling a would-be bicycle thief that I think they would prefer not stealing the bike to stealing it, and part of that is because I would take actions against them if they did.

Comment by vaniver on Public Positions and Private Guts [Transcript] · 2020-06-29T06:55:33.869Z · score: 4 (2 votes) · LW · GW

I've edited this a bit to make it clearer and delete some filler words or awkward phrasing. In general, I think the post is better as a reading experience than the talk, except for maybe the Q&A at the end.

Comment by vaniver on A reply to Agnes Callard · 2020-06-28T19:39:50.677Z · score: 21 (6 votes) · LW · GW
Our petition should have a clause talking about how terrible it is for the NYT to bow to mobs of enraged internet elites but that it would be hypocritical of them to choose now as their moment to grow a spine. At least this gets the right ideas across.

Another way to look at this is that it's offered information; our culture has some rules, and their culture has some rules, and they're proposing a massive rule violation in our culture, and in the interest of mutual understanding we're telling them that we would view it as hostile.

Now, you might say "this is a symmetric weapon!"; the people who claimed that Bennet's decision to print Tom Cotton's op-ed was a massive rule violation in their culture are doing basically the same thing. I reply that we have to represent our culture if we want it to be present; competing views are more reason to defend the core principles of our society, not less.

[Of course, I am not arguing for doing anything against your conscience, except insofar as I think your conscience is mistaken about what should be unethical.]

We take some steps to make our petition a non-mob. Like, maybe we require that everyone who signs it restate it in their own words or something, or that everyone who signs it be someone initially skeptical who changed their mind as a result of hearing both sides.

Petitions allow for intellectual specialization of labor; specialists create a position, and then others choose whether or not to sign on. This allows for compression and easy communication; forcing everyone to restate it taxes participation and makes the result harder to comprehend. (Suppose many of the comments actually include disagreement with planks of the petition; how then should it be interpreted?)

Similarly, restricting it to people who are "initially skeptical" is selection on beliefs, not methodology, and is adverse selection (as people who initially picked the right answer are now barred).

Comment by vaniver on A reply to Agnes Callard · 2020-06-28T19:09:59.179Z · score: 8 (4 votes) · LW · GW
To me, both the original tweet and your reply seem to miss the point entirely. I didn't sign this petition out of some philosophical position on what petitions should or shouldn't be used for. I did it because I see something very harmful happening and think this is a way to prevent it.

I think it is very important to have things that you will not do, even if they are effective at achieving your immediate goals. That is, I think you do have a philosophical position here, it's just a shallow one.

I disagree with the position Callard has staked out that petitions are inconsistent with being a philosophical hero, but for reasons that presumably we could converge on; hence the reply, and a continuing conversation in the comments.

Comment by vaniver on A reply to Agnes Callard · 2020-06-28T19:03:28.624Z · score: 30 (8 votes) · LW · GW
I see the central issue--also raised in replies to my tweet--as: if you believe someone's arguing in bad faith, isn't it ok to engage non-rationally w them?

I agree the question "isn't it okay to engage non-rationally w them?" is the central question. I disagree on the first half, though; my main question is: what makes you think the NYT is arguing?

If, say, you put forward your argument for why petitions are bad, and it was broadly ignored, that would be bad; if there were arguments against pseudonyms, and we crushed them rather than responding to them, that would be bad. But this is a place where someone is exercising arbitrary judgment, and presenting petitions is an old and peaceful way of influencing the arbitrary exercise of power.

I think that when Plato goes to the agora and sees someone selling fruit for drachmae, he does not think "it would be unreasonable to settle an argument by paying my interlocutor; is it ok to engage non-rationally with the merchant?" I think when Plato goes to the wrestling ring, he does not think "physical strength does not determine correctness of arguments, is it ok to engage non-rationally with my opponent?"

Now, probably at one point he's engaged rationally with the questions of whether and how to engage with commerce and sport, and that seems good to me. But the Plato who tries to wrestle with arguments instead of with his body is confused, not heroic.

Comment by vaniver on A reply to Agnes Callard · 2020-06-28T06:29:29.972Z · score: 3 (2 votes) · LW · GW


Comment by vaniver on A reply to Agnes Callard · 2020-06-28T03:28:26.043Z · score: 5 (3 votes) · LW · GW

Changing sites with a reply is always a fraught business; my defense is that I don't have a Twitter account, and Twitter is horrible for longform discussion. If exactly one person wants to post a link to this post in that Twitter thread, I'd appreciate it.

Comment by vaniver on SlateStarCodex deleted because NYT wants to dox Scott · 2020-06-24T21:19:12.018Z · score: 10 (6 votes) · LW · GW

I think it makes sense to be precise and polite, and to make allowances for misunderstandings. I also think it makes sense to have boundaries and to keep the hypothesis of malice in mind (with a low prior, both because malice is rare and because it's easy to see it where none exists).

That said, my prior for malice from the NYT was pretty high, and various details have updated me further towards that hypothesis.

Comment by vaniver on Are Humans Fundamentally Good? · 2020-06-21T20:42:19.796Z · score: 8 (5 votes) · LW · GW

My favorite treatment of this question (as a question of philosophy) comes from Xunzi, who wrote an essay called "Human Nature Is Bad," which begins:

People’s nature is bad. Their goodness is a matter of deliberate effort.

The good things that make up civilization, he claims, come from deliberate adherence to codes of conduct and from principles that are adopted through deliberate effort, instead of listening to one's nature.

He, of course, is drawing a distinction between one's reasoned habits and instinctual emotions as if reasoning were not itself an instinctual process, but this seems like the right call to me; the decision relevance of whether humans are fundamentally good is whether, in times of uncertainty, they should trust their 'base nature' or their 'cultivated disposition,' and whether people can be trusted to do the right thing without instruction or systems, or whether those need to be carefully constructed so that things go well. (Hence the founders working carefully on the social contract and institutional design.)

As a question of history, it depends a lot on what you think is "good," and what you mean by "fundamentally." Pre-state peoples varied widely in their customs, habits, and ability to leave records. What records we do have--like the fraction of excavated corpses who died from human violence--suggest humans now are much less violent than humans then. The development of human civilization does seem to have been progress; if humans 'sprang into existence' as 'good', we would expect things to look quite different.

In real live examples of anarchy, does society devolve because humans are not fundamentally good or because of some other reason?

Many of the things that we think of as markers of civilization, like careful planning and investment in the future, grow much rarer in times of significant uncertainty, like periods of anarchy. The categories you propose feel a bit strange in trying to make sense of this situation. Like, if I decide not to plant a tree because it's a bunch of work for me now, and I don't know who will eat the fruits in the future (since someone else might take the tree from me), one could say the absence of investment is due to my rational pessimism. Or one might say this is because humans don't fundamentally respect the property rights of others, which is a sign that human nature is bad. Or one might say this is because humans don't naturally believe in the lie of private property, which is a sign that human nature is good. Or one might say that this is not because of deficiencies in human nature broadly construed, but because of the actions of a handful of assholes who ruin everything for everyone else.

Comment by vaniver on Possible takeaways from the coronavirus pandemic for slow AI takeoff · 2020-06-17T22:14:23.463Z · score: 8 (4 votes) · LW · GW
However, in AI alignment, the hope is to learn from failures of narrow AI systems, and use that to prevent failures in more powerful AI systems.

This also jumped out at me as being only a subset of what I think of as "AI alignment"; like, ontological collapse doesn't seem to have been a failure of narrow AI systems. [By 'ontological collapse', I mean the problem where the AI knows how to value 'humans', and then it discovers that 'humans' aren't fundamental and 'atoms' are fundamental, and now it's not obvious how its preferences will change.]

Perhaps you mean "AI alignment in the slow takeoff frame", where 'narrow' is less a binary judgment and more of a continuous judgment; then it seems more compelling, but I still think the baseline prediction should be doom if we can only ever solve problems after encountering them.

Comment by vaniver on Mod Notice about Election Discussion · 2020-06-17T03:27:57.610Z · score: 14 (4 votes) · LW · GW

Suppose I want to talk about how ideological factions often align themselves on epistemic grounds instead of moral grounds. To give an example, I talk about how you might expect factions to be the "prefers strawberry" faction and the "prefers vanilla" faction, but in fact when you look at the world the faction membership tests are "thinks vanilla causes cancer" and "thinks vanilla doesn't cause cancer." And perhaps the actual cause is upstream, and is closer to "doesn't trust agribusiness-funded research" vs. "does trust agribusiness-funded research."

I could have instead given the example that had me actually thinking about ideological factions, and how much they are based on epistemic grounds vs. moral grounds. But likely the discussion then would be about the object-level point, of which faction is more correct, or perhaps even about which faction can consider LW part of its territory, independent of which is correct.

When Eliezer talked about this, in Belief as Attire, he used a real example, although one that was not quite contemporary at the time, and was called out for it in the comments.

And if your goal is to figure out whether LW is territory for faction A or faction B, this rule is here to say: Don't.

Comment by vaniver on How alienated should you be? · 2020-06-16T05:09:28.701Z · score: 6 (4 votes) · LW · GW

Also, a thing worth mentioning; in January when I wrote this, most of my attention was on how to convince people that current institutions were inadequate, while not making them give up on humanity. And now the situation seems reversed, where it seems like people need to be reminded of the overwhelming awe and love for the human race.

Comment by vaniver on You Can Do Futarchy Yourself · 2020-06-15T20:02:25.093Z · score: 7 (3 votes) · LW · GW
Futarchy by definition is a system for group decision making. That means that you can't use it for decisions that involve one person.

Why can't an individual be the trivial group?

Like, it seems like I could set my lunch orders by having a prediction market on how I would rate lunches conditional on ordering them, ordering the item with the highest predicted rating, and then rating it afterwards. The main challenge is populating the prediction market with actors who expect the profit to be high enough to be worth the cognitive work, but that's only indirectly related.
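As an illustration of that decision rule (a toy sketch; the menu and the conditional predictions are hypothetical numbers standing in for an actual market's aggregated bets):

```python
# Self-futarchy for lunch: traders predict my post-lunch rating conditional
# on each possible order; I order whatever has the highest predicted rating,
# then settle the market against the rating I actually report afterwards.
# Markets conditional on non-chosen options are voided.

market_predictions = {  # hypothetical predicted ratings out of 10
    "ramen": 7.5,
    "tacos": 8.2,
    "salad": 6.1,
}

order = max(market_predictions, key=market_predictions.get)
print(order)  # tacos
```

The mechanism is unchanged whether the "group" supplying predictions is a crowd of traders or a single forecaster, which is the point: nothing in the decision rule requires more than one decision-maker.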

Comment by vaniver on How alienated should you be? · 2020-06-15T00:20:33.113Z · score: 6 (4 votes) · LW · GW

Yes, tho in the spirit of "less wrong" instead of "not wrong" I think of "less alien" instead of "not alien." (I think in order to actually get rid of alienation, you would need to get rid of individuality.)

The practical consideration is from the x-risk angle. If you're pro-humanity and buy the basic scientific picture of the universe, almost all of the value looks like it's in the future, when there can be many more humans than there are now experiencing more joy and less misery. Even if you look at it from a selfish perspective, almost all of your expected lifeyears come from the chance of living an extremely long time, tho you have to be patient for discounting to not wipe that out.

But if you buy the basic social picture of humanity, almost all of your contribution comes from 'staying in your lane' and being a cog in a complicated machine much more subtle than you could have designed on your own. Perhaps you could become an expert in a narrow field of study and slightly shift things, but asking the big questions is 'above your pay grade,' and you should mostly expect to make things worse instead of better by looking at those questions or taking actions in response.

And so it seems like people need to let go of many parts of that picture to be highly effective; but also, I think the basic social picture is one of the main reasons many people have to be pro-humanity in the first place. [As opposed to in favor of themselves or their specific friends, in opposition to a hostile world.]

Comment by vaniver on What are some Civilizational Sanity Interventions? · 2020-06-14T04:51:44.941Z · score: 7 (4 votes) · LW · GW
My understanding is that part of the reason our government is apparently so dysfunctional is that the electoral system is biased toward polarization.

While I think better voting systems would be better (score voting or approval voting seem like clear improvements over the status quo), the electoral system has been this way for a long time, while polarization has increased dramatically only recently. That suggests to me it's not downstream of the voting system, and simple fixes to the voting system won't solve it.

Comment by vaniver on What past highly-upvoted posts are overrated today? · 2020-06-10T01:20:13.401Z · score: 13 (5 votes) · LW · GW

You might be interested in the 2018 Review, which spurred discussion of this sort, both as reviews on the posts and in new posts that were replies.

Comment by vaniver on What past highly-upvoted posts are overrated today? · 2020-06-09T22:03:33.669Z · score: 23 (9 votes) · LW · GW

If there are strong but not widely publicized criticisms of highly upvoted posts, it might make sense to have those criticisms more widely publicized (so people don't just take those things at face value). But this feels like a special case of your point, where it's really the criticisms that are underrated.

Comment by vaniver on Quarantine Bubbles Require Directness, and Tolerance of Rudeness · 2020-06-08T18:46:44.140Z · score: 6 (3 votes) · LW · GW
As a background assumption, I'm focused on the societal costs of getting infected, rather than the personal costs, since in most places the latter seem negligible unless you have pre-existing health conditions.

But, of course, any 12-person bubble that contains someone with a pre-existing health condition can't rest on 11 of the people thinking "oh, but I'm healthy!".

From a social perspective, I think it's quite clear that the average person is far from being effectively isolated, since R is around 0.9 and you can only get to around half of that via only household infection.

I think 'the average person' is the wrong thing to think about here. When the infection is rare, R will be driven by the actions of the riskiest people, since they're the ones who predominantly have it, spread it, and catch it. If 50% of the population has an actual risk of 0, and there aren't any graph connections between them and the other 50% of the population, then the whole population R will be driven by the connected half (and will only have been slowed by whatever connections got severed to the hermit half).

On the one hand, this is a message of hope ("you can probably relax to 'normal human' standards and only have an R of 1"), but 'normal human' standards might also be incompatible in other ways (someone who lives with 0 or 1 other person has much less to fear from a household secondary attack rate of 0.3 than someone who lives in a house of 12 people).

From a personal perspective, I think the real thing to care about is whether the other people are about as careful as you. ... But by the same logic there's nothing special about a 12 person bubble

Sure, 12 is a magic number, and actually weighing the tradeoffs should lead to different thresholds in different situations. But the overall thing you're trying to balance is "risk cost" against "socialization gains", and even if costs are linear, sublinear benefits scuttle these sorts of symmetry analyses.

I think the bit of this that I'm having the hardest time wrapping my head around is something like "if you accept people that are as careful as you, then you are less careful than you used to be." Like, suppose you have a 12-person bubble, all of whom don't interact with the outside world. Then if you say "we are open to all bubbles with at most 12 people, all of whom don't interact with the outside world", you now potentially have a bubble whose size is measured in the hundreds, which is a pretty different situation than the one you started in.
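The blow-up here is just transitive closure on the contact graph: your effective exposure is the connected component your bubble sits in, not the nominal 12-person group. A toy sketch with made-up links (union-find over bubbles):

```python
from collections import Counter

def component_sizes(n_bubbles, links, bubble_size=12):
    """Effective bubble sizes when linked bubbles transitively merge.

    Each bubble that accepts "any equally careful bubble" shares exposure
    with every bubble it touches, directly or indirectly.
    """
    parent = list(range(n_bubbles))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    counts = Counter(find(i) for i in range(n_bubbles))
    return [c * bubble_size for c in counts.values()]

# Hypothetical: 30 bubbles where each one links to one neighbor.
sizes = component_sizes(30, [(i, i + 1) for i in range(29)])
print(max(sizes))  # 360 people in one effective bubble
```

So a rule that sounds local ("12 careful people") can quietly produce an exposure pool measured in the hundreds.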

Comment by vaniver on Reply to Paul Christiano's “Inaccessible Information” · 2020-06-07T21:24:37.083Z · score: 7 (4 votes) · LW · GW
But if on some absolute scale you say that AlphaZero is a design / search hybrid, then presumably you should also say the OpenAI Five is a design / search hybrid, since it uses PPO at the outer layer, which is a designed algorithm. This seems wrong.

I think I'm willing to bite that bullet; like, as far as we know the only stuff that's "search all the way up" is biological evolution.

But 'hybrid' seems a little strange; like, I think design normally has search as a subcomponent (in imaginary space, at least, and I think often also search through reality), and so in some sense any design that isn't a fully formed vision from God is a design/search hybrid. (If my networks use RELU activations 'by design', isn't that really by the search process of the ML community as a whole? And yet it's still useful to distinguish networks which determine what nonlinearity to use from local data from networks which have it determined for them by an external process, which potentially has a story for why that's the right thing to do.)

Total horse takeover seems relevant as another way to think about intervening to 'control' things at varying levels of abstraction.

[The core thing about design that seems important and relevant here is that there's a "story for why the design will work", whereas search is more of an observational fact of what was out there when you looked. It seems like it might be easier to build a 'safe design' out of smaller sub-designs, whereas trying to search for a safe algorithm using search runs into all the anthropic problems of empiricism.]

Comment by vaniver on Reply to Paul Christiano's “Inaccessible Information” · 2020-06-06T03:13:37.745Z · score: 7 (4 votes) · LW · GW
But for now such approaches are being badly outperformed by search (in AI).

I suspect the edge here depends on the level of abstraction. That is, Go bots that use search can badly outperform Go bots that don't use any search, but using search at the 'high level' (like in MuZero) only somewhat outperforms using design at that level (like in AlphaZero).

It wouldn't surprise me if search always has an edge (at basically any level, exposing things to adjustment by gradient descent makes performance on key metrics better), but if the edge is small it seems plausible to focus on design.

Comment by vaniver on GPT-3: a disappointing paper · 2020-06-01T18:54:24.756Z · score: 30 (8 votes) · LW · GW
I take a critical tone here in an effort to cut that hype off at the pass.

Maybe this is just my AI safety focus, or something, but I find myself annoyed by 'hype management' more often than not; I think the underlying root cause of the frustration is that it's easier to reach agreement on object-level details than interpretations, which are themselves easier than interpretations of interpretations.

Like, when I heard "GPT-3", I thought "like GPT-2, except one more," and from what I can tell that expectation is roughly accurate. The post agrees, and notes that since "one" doesn't correspond to anything here, the main thing this tells you is that this transformer paper came from people who feel like they own the GPT name instead of people who don't feel that. It sounds like you expected "GPT" to mean something more like "paradigm-breaker" and so you were disappointed, but this feels like a ding on your expectations more than a ding on the paper.

But under the hype management goal, the question of whether we should celebrate it as "as predicted, larger models continue to perform better, and astoundingly 175B parameters for the amount of training we did still hasn't converged" or criticize it as "oh, it is a mere confirmation of a prediction widely suspected" isn't a question of what's in the paper (as neither take disagrees about that), or even of your personal take, but of what you expect the social distribution of takes to be, so that your statement is the right pull on the group beliefs.


Maybe putting this another way, when I view this as "nostalgebraist the NLP expert who is following and sharing his own research taste", I like the post, as expert taste is useful even if you the reader disagree; and when I view it as "nostalgebraist the person who has goals for social epistemology around NLP" I like it less.

Comment by vaniver on OpenAI announces GPT-3 · 2020-05-31T23:28:00.527Z · score: 4 (2 votes) · LW · GW

But does it ever hallucinate the need to carry the one when it shouldn't?

Comment by vaniver on Speculations on the Future of Fiction Writing · 2020-05-29T21:11:20.378Z · score: 9 (5 votes) · LW · GW
The movie industry has been around long enough, and is diverse enough, that I'd be very surprised if there were million-dollar bills lying around waiting to be picked up like this.

Prediction markets for box office results are more than a million-dollar bill, I think, and yet they would reduce the power of the very people who decide whether or not they get used.

Also, speaking of people caring about accuracy, it reminds me of the story Neil deGrasse Tyson tells about confronting James Cameron about the lazy fake sky in Titanic, to which Cameron responded:

Last I checked, Titanic has grossed a billion dollars worldwide. Imagine how much more it would have grossed had I gotten the sky correct.

But the ending of the story is that they later hired him to make an accurate sky for their director's cut, and he now has a company that provides that service.

It wouldn't shock me if a firm of smart rational-fic writers could do this sort of 'script doctoring' cheaply enough to be worth it to filmmakers, and the main problem is that the buyers don't know what to ask for and the sellers don't know how to find the buyers.

Comment by vaniver on Studies On Slack · 2020-05-29T20:58:48.270Z · score: 2 (1 votes) · LW · GW

From The Sources of Economic Growth by Richard Nelson, but I think it's a quote from James Fisk, Bell Labs President:

If the new work of an individual proves of significant interest, both scientifically and in possible communications applications, then it is likely that others in the laboratory will also initiate work in the field, and that people from the outside will be brought in. Thus a new area of laboratory research will be started. If the work does not prove of interest to the Laboratories, eventually the individual in question will be requested to return to the fold, or leave. It is hoped the pressure can be informal. There seems to be no consensus about how long to let someone wander, but it is clear that young and newly hired scientists are kept under closer rein than the more senior scientists. However even top-flight people, like Jansky, have been asked to change their line of research. But, in general, the experience has been that informal pressures together with the hiring policy are sufficient to keep AT&T and Western Electric more than satisfied with the output of research.

[Most recently brought to my attention by this post from a few days ago]

Comment by vaniver on Predictions/questions about conquistadors? · 2020-05-29T20:47:59.757Z · score: 2 (1 votes) · LW · GW

My (weakly held) take is that a category of 'usual medieval weaponry' obscures a lot of detail that turns out to be relevant. Like even talking about 'swords', a 3 foot sword made of Toledo steel is a very different beast from a macuahuitl. They're about equally sharp and long, but the steel sword is lighter, allows fighting more closely together (note that, at this time, a lot of the successful European tactics require people somewhat tightly packed working in concert), and is more durable. (The obsidian blades, while they could slice clean through people and horses, weren't very effective against mail and would break on impact with another sword.)

Comment by vaniver on Predictions/questions about conquistadors? · 2020-05-29T18:11:09.272Z · score: 2 (1 votes) · LW · GW
I predict that guns weren't that big a deal; they probably were useful as surprise weapons (shocking and demoralizing enemies not used to dealing with them) but that most of the fighting would be done by swords, bows, etc.

I think you should count pikes and swords differently, here, especially if the Spaniards are using the pike square.

Comment by vaniver on Trust-Building: The New Rationality Project · 2020-05-29T17:27:55.733Z · score: 7 (4 votes) · LW · GW
What we want isn't a lack of factionalism, it's unity. ... You have high trust in this network, and believe the evidence you receive from it by default.

I think one of the ways communities can differ is the local communication norms. Rather than saying something like "all communities have local elders whose word is trusted by the community", and then trying to figure out who the elders are in various communities, you can try to measure something like "how do people react to the statements of elders, and how does that shape the statements elders make?". In some communities, criticism of elders is punished, and so they can make more loose or incorrect statements and the community can coordinate more easily (in part because they can coordinate around more things). In other communities, criticism of elders is rewarded, and so they can only make more narrow or precise statements, and the community can't coordinate as easily (in part because they can coordinate around fewer, perhaps higher quality, things).

It seems to me like there's a lot of value in looking at specific mechanisms there, and trying to design good ones. Communities where people do more reflexive checking of things they read, more pedantic questions, and so on do mean "less trust" in many ways and do add friction to the project of communication, but that friction does seem asymmetric in an important way.

Comment by vaniver on Trust-Building: The New Rationality Project · 2020-05-29T17:11:13.882Z · score: 5 (3 votes) · LW · GW
When it feels like there's no need to explore, and all you need to do is practice your routine and enjoy what you have, the right assumption is that you are missing an opportunity. This is when exploration is most urgent.

I think good advice is often of the form "in situation X, Y is appropriate"; from a collection of such advice you can build a flowchart of observations to actions, and end up with a full policy.

Whenever there is a policy that is "regardless of the observation, do Y", I become suspicious. Such advice is sometimes right--it may be the case that Y strictly dominates all other options, or it performs well enough that it's not worth the cost of checking whether you're in the rare case where something else is superior.

Is the intended reading of this "exploitation and routine is never correct"? Is exploration always urgent?
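The flowchart-of-advice idea above can be rendered as a toy policy (situations and actions here are invented):

```python
# Situations and actions are invented for illustration.
ADVICE = {
    "stuck_on_problem": "explore",
    "routine_going_well": "keep_practicing",
    "unfamiliar_environment": "explore",
}

def policy(observation, default="ask_for_advice"):
    """A full policy assembled from 'in situation X, Y is appropriate' rules.
    A policy that ignored `observation` entirely would be the suspicious
    'regardless of the observation, do Y' case."""
    return ADVICE.get(observation, default)

print(policy("routine_going_well"))  # keep_practicing
print(policy("novel_situation"))    # ask_for_advice
```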

Comment by vaniver on Cortés, Pizarro, and Afonso as Precedents for Takeover · 2020-05-26T23:54:26.743Z · score: 2 (1 votes) · LW · GW

Tercios were very strong during the era Conn Nugent is pointing at; "nobody in Europe could stand up to them" is probably an exaggeration but not by much. They had a pretty good record under Ferdinand II, and then for various dynastic reasons, Spain was inherited by a Habsburg who became Holy Roman Emperor, and then immediately faced coalitions against him as the 'most powerful man in Christendom.' So we don't really get to see what would have happened had they tried to fight their way to continental prominence, since they inherited it.

It's also not obvious that, if you have spare military capacity in 1550 (or whenever), you would want to use it conquering bits of Europe instead of conquering bits elsewhere, if the difficulty for the latter is sufficiently lower and the benefits not sufficiently higher.

Comment by vaniver on Why aren’t we testing general intelligence distribution? · 2020-05-26T17:51:08.839Z · score: 44 (18 votes) · LW · GW

First, you might be interested in tests like the Wonderlic, which are not transformed to a normal variable, and instead use raw scores. [As a side note, the original IQ test was not normalized--it was a quotient!--and so the name continues to be a bit wrong to this day.]

Second, when we have variables like height, there are obvious units to use (centimeters). Looking at raw height distributions makes sense. When we discover that the raw height distribution (split by sex) is a bell curve, that tells us something about how height works.

When we look at intelligence, or results on intelligence tests, there aren't obvious units to use. You can report raw scores (i.e. number of questions correctly answered), but in order for the results to be comparable the questions have to stay the same (the Wonderlic has multiple forms, and differences between the forms do lead to differences in measured test scores). For a normalized test, you normalize each version separately, allowing you to have more variable questions and be more robust to the variation in questions (which is useful as an anti-cheating measure).

But 'raw score' just pushes the problem back a step. Why the 50 questions of the Wonderlic? Why not different questions? Replace the ten hardest questions with easier ones, and the distribution looks different. Replace the ten easiest questions with harder ones, and the distribution looks different. And for any pair of tests, we need to construct a translation table between them, so we can know what a 32 on the Wonderlic corresponds to on the ASVAB.

Using a normal distribution sidesteps a lot of this. If your test is bad in some way (like, say, 5% of the population maxing out the score on a subtest), then your resulting normal distribution will be a little wonky, but all sufficiently expressive tests can be directly compared. Because we think there's this general factor of intelligence, this also means tests are more robust to inclusion or removal of subtests than one might naively expect. (If you remove 'classics' from your curriculum, the people who would have scored well on classics tests will still be noticeable on average, because they're the people who score well on the other tests. This is an empirical claim; the world didn't have to be this way.)

"Sure," you reply, "but this is true of any translation." We could have said intelligence is uniformly distributed between 0 and 100 and used percentile rank (easier to compute and understand than a normal distribution!) instead. We could have thought the polygenic model was multiplicative instead of additive, and used a lognormal distribution instead. (For example, the impact of normally distributed intelligence scores on income seems multiplicative, but if we had lognormally distributed intelligence scores it would be linear instead.) It also matters whether you get the splitting right--doing a normal distribution on height without splitting by sex first gives you a worse fit.

So in conclusion, for basically as long as we've had intelligence testing there have been normalized and non-normalized tests, and today the normalized tests are more popular. From my read, this is mostly for reasons of convenience, and partly because we expect the underlying distribution to be normal. We don't do everything we could with normalization, and people aren't looking for Gaussian mixture models in a way that might make sense.
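The normalization mechanism discussed above can be sketched in a few lines: map each raw score to its within-test percentile rank, then through the inverse normal CDF onto an IQ-style scale. All scores below are invented; the point is that two tests with different difficulties and ranges become directly comparable.

```python
# All scores here are invented; the point is the mechanism, not the numbers.
from statistics import NormalDist

def normalize(raw_scores, mean=100.0, sd=15.0):
    """Map raw scores to a normal scale via within-test percentile rank
    (midpoint convention; ties share the rank of their first occurrence)."""
    order = sorted(raw_scores)
    dist = NormalDist(mean, sd)
    n = len(raw_scores)
    return [dist.inv_cdf((order.index(s) + 0.5) / n) for s in raw_scores]

# Two hypothetical tests with different difficulty and score ranges:
test_a = [12, 15, 19, 22, 30, 31, 35, 40, 41, 48]  # out of 50 questions
test_b = [3, 4, 6, 6, 7, 9, 10, 11, 12, 14]        # out of 15, harder items

# After normalization, "3rd best of 10" means the same thing on either test:
print(round(normalize(test_a)[7], 1), round(normalize(test_b)[7], 1))  # 110.1 110.1
```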

Comment by vaniver on Get It Done Now · 2020-05-22T21:08:53.771Z · score: 11 (7 votes) · LW · GW

OHIO has also been a useful corrective for me, as I've had a lot of success 'processing things subconsciously', where if I think about a problem, ignore it for a while, and then come back, the problem will have been solved by my subconscious in the meantime. But while this is quite appropriate for math problems, there's a huge category of logistical, administrative, and coordinative tasks for which it doesn't make sense, and nevertheless I have some impulse to try it.

Comment by vaniver on Extracting Value from Inadequate Equilibria · 2020-05-22T21:03:36.419Z · score: 2 (1 votes) · LW · GW

That's it, thanks!

Comment by vaniver on Extracting Value from Inadequate Equilibria · 2020-05-19T19:38:44.453Z · score: 4 (2 votes) · LW · GW

There's also a quote, which I don't remember the provenance of and can't quickly find, which was something like "the main purpose of think tanks is to generate ideas that are ready to be deployed in times of crisis." 

Comment by vaniver on Extracting Value from Inadequate Equilibria · 2020-05-19T19:34:40.758Z · score: 8 (4 votes) · LW · GW

See A Key Power of the President is to Coordinate the Execution of Existing Concrete Plans.

Comment by vaniver on Isn't Tesla stock highly undervalued? · 2020-05-18T19:07:05.127Z · score: 7 (4 votes) · LW · GW

The point I haven't seen addressed in the comments is I think Tesla has unusually potent ingredients for a more than 10% chance of a 10x+ upside. Just scaling up its gigafactories and dominating battery production across all industries seems like a sufficient ingredient to tell a disjunction of such stories. 

IMO this is addressed by the "market is already pricing it at 10x growth" point. To unroll that, consider three cases: company grows to 100x, company grows to 10x, and company stays at 1x. In the world where those are the only options, pricing the stock at 10x its "stay the same size" value means that the 1x case is roughly 9 times more likely than the 100x case (and otherwise doesn't constrain things). Someone who thinks it's 10%/0%/90% should have the same EV as someone who thinks it's 5%/50%/45% or 0%/100%/0%.

Now, you can argue it's 10%/50%/40%, or whatever, and so it should be priced at 20x instead of 10x, but this is more in the "the usual kind of high-priced high-quality stock" territory.
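The arithmetic behind "roughly 9 times more likely" can be checked directly; the three-outcome model is of course a deliberate simplification.

```python
# Three-outcome toy model: stock ends at 100x, 10x, or 1x its "same size" value.

def expected_multiple(p100, p10, p1):
    """Expected value of the stock, in multiples of its 'stay the same size' value."""
    assert abs(p100 + p10 + p1 - 1.0) < 1e-9, "probabilities must sum to 1"
    return 100 * p100 + 10 * p10 + 1 * p1

# Three belief profiles that all justify roughly the same ~10x price:
print(round(expected_multiple(0.10, 0.00, 0.90), 2))  # 10.9
print(round(expected_multiple(0.05, 0.50, 0.45), 2))  # 10.45
print(round(expected_multiple(0.00, 1.00, 0.00), 2))  # 10.0

# The more bullish 10%/50%/40% split implies a meaningfully higher fair value:
print(round(expected_multiple(0.10, 0.50, 0.40), 2))  # 15.4
```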

Making an equity bet where the maximum loss is 1x therefore still seems attractive to me.

This also seems slightly off to me; all bets have a direct maximum loss of 1x, in some meaningful sense, and the "real loss" is going to be in the opportunity cost. That is, if I buy $10 of Amazon and you buy $10 of Tesla, and mine becomes worth $100 and yours becomes worth $1, we can look at this as you choosing to be $9 poorer or $99 poorer depending on where we put the baseline.

Comment by vaniver on Isn't Tesla stock highly undervalued? · 2020-05-18T18:48:58.250Z · score: 2 (1 votes) · LW · GW

Specifically, Musk doesn't think LiDAR will help, and Waymo and others use it heavily, and from what I know of how this stuff works, the more sensors the better, for now. (I wouldn't be surprised if Musk turns out to be right in the long run, but also wouldn't be surprised if Tesla starts quietly adding LiDAR.)

Comment by vaniver on The EMH Aten't Dead · 2020-05-16T07:33:55.437Z · score: 59 (19 votes) · LW · GW

I think this post is missing the important part of actually doing this well / being a chosen one, from my perspective. That is, it seems to treat the EMH as something like an on/off switch: either you think the market always knows better than you, and so you just blindly trust it, or you think you're always better than the market, and so you should be actively trading.

But my experience has been about every five years, an opportunity comes by and I think "oh, this is a major opportunity," and each time I've been right. [FWIW I didn't have this for COVID, because of a general plan to focus on work instead of the markets leading up to the pandemic, and then when I started thinking "oh this will actually be as bad as 1918" I was so busy trying to solve practical problems related to it that I didn't think seriously about the question of "should I be trading on this?"; I think the me that had thought about trading based off it would have mostly made the right calls.]

This is, of course, a small sample size, and I've made many active trades that weren't associated with that feeling, whose records have been much worse. Each of the 'edge' investments has also had another side to it, and the other side didn't have the same feeling to guide me. For example, I correctly timed BP's bottom during the Deepwater Horizon crisis, but then when it recovered my decision of when to sell it was essentially random. I think most of the people who predicted COVID a week early were then not able to outperform the market on the other side, and various things that I've seen people say about why they expect a continued edge seemed wrong to me. (For example, someone mentioned that they could evaluate potential treatments better than the market--which I think is true, because I think this person is actually a world-class expert at that specific problem--but I think that ability won't actually give them an edge when it comes to asset prices. I don't think anyone thinking just about biology would have correctly predicted the recent bottom or where we'd recover to, for example.)

Nevertheless, I'm pretty convinced that I sometimes have an edge, and more importantly can tell when I have an edge and when I'm just guessing. I think something like 1-10% of rationalists are in this category, or could be if they believed it, much like I think a comparable number of rationalists could be superforecasters if they tried. And historically, knowing to take the "oh, this is a major opportunity" signal seriously, instead of treating it as "just another good idea", would have made a huge difference, and I think I've under-updated each time on how much to move things to be more ready the next time one comes along. Which is the main reason I think this is worth bringing up.

[Like, inspired by his weakened faith in the EMH, Eliezer attempted to time the bottom of the market, and succeeded. It seems better if more people attempt this sort of thing, at an appropriately humble frequency.]

Comment by vaniver on Studies On Slack · 2020-05-13T20:23:57.382Z · score: 9 (5 votes) · LW · GW

If management just funds research indiscriminately, then they'll end up with random research directions, and the exponentially-vast majority of random research directions suck. Xerox and Bell worked in large part because they successfully researched things targeted toward their business applications - e.g. programming languages and solid-state electronics.

That said, I think there's still a compelling point in slack's favor here; my impression is that Bell Labs (and probably Xerox?) put some pressure on people to research things that would eventually be helpful, but put most of its effort into hiring people with good taste and high ability in the first place. 

Comment by vaniver on Project Proposal: Gears of Aging · 2020-05-11T03:17:45.866Z · score: 6 (3 votes) · LW · GW

Agreed on the general point that having an overall blueprint is sensible, and that any particular list of targets implies an underlying model.

Note the inclusion of senescent cells. Today, it is clear that senescent cells are not a root cause of aging, since they turn over on a timescale of days to weeks. Senescent cells are an extraneous target. Furthermore, since senescent cell counts do increase with age, there must also be some root cause upstream of that increase - and it seems unlikely to be any of the other items on the original SENS list. Some root cause is missing. If we attempted to address aging by removing senescent cells (via senolytics), whatever root cause induces the increase in senescent cells in the first place would presumably continue to accumulate, requiring ever-larger doses of senolytics until the senolytic dosage itself approached toxicity - along with whatever other problems the root cause induced.

I think this paper ends up supporting this conclusion, but the reasoning as summarized here is wrong. That they turn over on a timescale of days to weeks is immaterial; the core reason to be suspicious of senolytics as an actual cure is that this paper finds that the production rate increases linearly with age and the removal rate doesn't keep up. (In their best-fit model, the removal rate just depends on the fraction of senescent cells.) Under that model, if you take senolytics and clear out all of your senescent cells, the removal rate bounces back, but the production rate is steadily increasing.

You wouldn't have this result for different models--if, for example, the production rate didn't depend on age and the removal rate did. You would still see senescent cells turning over on a timescale of days to weeks, but you would be able to use senolytics to replace the natural removal process, and that would be sustainable at steady state.
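A toy simulation of the age-linear-production model (parameters entirely invented) shows why a one-time clearance washes out: the system relaxes back toward the age-driven level within a few years.

```python
# Invented parameters, illustration only: X = senescent cell count, t = age.

def simulate(production, removal_rate, years=80.0, clear_at=None, dt=0.01):
    """Euler integration of dX/dt = production(t) - removal_rate * X, with an
    optional one-time senolytic clearance (X -> 0) at age clear_at."""
    X, t, cleared = 0.0, 0.0, False
    while t < years:
        if clear_at is not None and not cleared and t >= clear_at:
            X, cleared = 0.0, True
        X += (production(t) - removal_rate * X) * dt
        t += dt
    return X

# Best-fit-style model: production grows linearly with age, removal stays
# proportional to the current count, so removal can't keep up.
linear_production = lambda t: 0.1 * t
untreated = simulate(linear_production, removal_rate=0.5)
treated = simulate(linear_production, removal_rate=0.5, clear_at=60)

# By age 80 the clearance at 60 has washed out almost completely:
print(round(untreated, 2), round(treated, 2))
```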

Comment by vaniver on Assessing Kurzweil predictions about 2019: the results · 2020-05-07T18:40:47.288Z · score: 12 (6 votes) · LW · GW

One of the things I find really hard about tech forecasting is that most of tech adoption is driven by market forces / comparative economics ("is solar cheaper than coal?"), but raw possibility / distance in the tech tree is easier to predict ("could more than half of schools be online?"). For about the last ten years we could have had the majority of meetings and classes online if we wanted to, but we didn't want to--until recently. Similarly, people correctly called that the Internet would enable remote work, in a way that could make 'towns' the winners and 'big cities' the losers--but they incorrectly called that people would prefer remote work to in-person work, and towns to big cities. 

[A similar thing happened to me with music-generation AI; for years I think we've been in a state where people could have taken off-the-shelf method A and done something interesting with it on a huge music dataset, but I think everyone with a huge music dataset cares more about their relationship with music producers than they do about making the next step of algorithmic music.]

Comment by vaniver on Stop saying wrong things · 2020-05-02T17:44:06.892Z · score: 16 (9 votes) · LW · GW

See Paul Graham on Robert Morris. I also remember a blog post (discussed on LW) that I thought was called "stop making stupid mistakes", which wasn't this one, but instead was about someone who was okay at chess talking to a friend who was good at chess about how to get better, and getting the unpalatable lesson that it wasn't about learning cool new tricks, but slowly ironing out all of the mistakes that he was currently making.

Comment by vaniver on SlateStarCodex 2020 Predictions: Buy, Sell, Hold · 2020-05-02T06:01:16.943Z · score: 4 (2 votes) · LW · GW

97. I eat at/from Sliver more than any other restaurant in Q4 2020: 50%

Given the substantial chance that things have changed a lot or there is equal amounts of eating at all restaurants, I’ll sell this to 30%.

I get you're a NY pizza partisan, but I think you're underweighting how good Sliver is.

Comment by vaniver on ricraz's Shortform · 2020-05-01T23:49:40.348Z · score: 2 (1 votes) · LW · GW

To be pedantic: we care about "consequence-desirability-maximisers" (or in Rohin's terminology, goal-directed agents) because they do backwards assignment.

Valid point.

But I think the pedantry is important, because people substitute utility-maximisers for goal-directed agents, and then reason about those agents by thinking about utility functions, and that just seems incorrect.

This also seems right. Like, my understanding of what's going on here is we have:

  • 'central' consequence-desirability-maximizers, where there's a simple utility function that they're trying to maximize according to the VNM axioms
  • 'general' consequence-desirability-maximizers, where there's a complicated utility function that they're trying to maximize, which is selected because it imitates some other behavior

The first is a narrow class, and depending on how strict you are with 'maximize', quite possibly no physically real agents will fall into it. The second is a universal class, which instantiates the 'trivial claim' that everything is utility maximization.

Put another way, the first is what happens if you hold utility fixed / keep utility simple, and then examine what behavior follows; the second is what happens if you hold behavior fixed / keep behavior simple, and then examine what utility follows.

Distance from the first is what I mean by "the further a robot's behavior is from optimal"; I want to say that I should have said something like "VNM-optimal" but actually I think it needs to be closer to "simple utility VNM-optimal." 

I think you're basically right in calling out a bait-and-switch that sometimes happens, where anyone who wants to talk about the universality of expected utility maximization in the trivial 'general' sense can't get it to do any work, because it should all add up to normality, and in normality there's a meaningful distinction between people who sort of pursue fuzzy goals and ruthless utility maximizers.

Comment by vaniver on ricraz's Shortform · 2020-04-30T23:45:51.293Z · score: 2 (1 votes) · LW · GW

Which seems very very complicated.

That's right.

I realized my grandparent comment is unclear here:

but need a very complicated utility function to make a utility-maximizer that matches the behavior.

This should have been "consequence-desirability-maximizer" or something, since the whole question is "does my utility function have to be defined in terms of consequences, or can it be defined in terms of arbitrary propositions?". If I want to make the deontologist-approximating Innocent-Bot, I have a terrible time if I have to specify the consequences that correspond to the bot being innocent and the consequences that don't, but if you let me say "Utility = 0 - badness of sins committed" then I've constructed a 'simple' deontologist. (At least, about as simple as the bot that says "take random actions that aren't sins", since both of them need to import the sins library.)

In general, I think it makes sense to not allow this sort of elaboration of what we mean by utility functions, since the behavior we want to point to is the backwards assignment of desirability to actions based on the desirability of their expected consequences, rather than the expectation of any arbitrary property.


Actually, I also realized something about your original comment which I don't think I had the first time around; if by "some reasonable percentage of an agent's actions are random" you mean something like "the agent does epsilon-exploration" or "the agent plays an optimal mixed strategy", then I think it doesn't at all require a complicated utility function to generate identical behavior. Like, in the rock-paper-scissors world, and with the simple function 'utility = number of wins', the expected utility maximizing move (against tough competition) is to throw randomly, and we won't falsify the simple 'utility = number of wins' hypothesis by observing random actions.
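The rock-paper-scissors point can be checked in a few lines, scoring win=1, tie=0.5, loss=0 as a stand-in for 'utility = number of wins':

```python
# Scoring win=1, tie=0.5, loss=0 as a stand-in for 'utility = number of wins'.
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def utility(mine, theirs):
    if mine == theirs:
        return 0.5
    return 1.0 if BEATS[mine] == theirs else 0.0

def expected_utility(my_mixed, their_move):
    """EV of a mixed strategy (dict of move -> probability) against a pure move."""
    return sum(p * utility(m, their_move) for m, p in my_mixed.items())

uniform = {m: 1 / 3 for m in MOVES}
# Uniform random gets the same EV against everything, so observing random
# throws never falsifies the simple 'utility = number of wins' hypothesis.
print([round(expected_utility(uniform, o), 3) for o in MOVES])  # [0.5, 0.5, 0.5]
```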

Instead I read it as something like "some unreasonable percentage of an agent's actions are random", where the agent is performing some simple-to-calculate mixed strategy that is either suboptimal or only optimal by luck (when the optimal mixed strategy is the maxent strategy, for example), and matching the behavior with an expected utility maximizer is a challenge (because your target has to be not some fact about the environment, but some fact about the statistical properties of the actions taken by the agent).


I think this is where the original intuition becomes uncompelling. We care about utility-maximizers because they're doing their backwards assignment, using their predictions of the future to guide their present actions to try to shift the future to be more like what they want it to be. We don't necessarily care about imitators, or simple-to-write bots, or so on. And so if I read the original post as "the further a robot's behavior is from optimal, the less likely it is to demonstrate convergent instrumental goals", I say "yeah, sure, but I'm trying to build smart robots (or at least reasoning about what will happen if people try to)."

Comment by vaniver on ricraz's Shortform · 2020-04-29T23:27:41.768Z · score: 2 (1 votes) · LW · GW

If a reasonable percentage of an agent's actions are random, then to describe it as a utility-maximiser would require an incredibly complex utility function (because any simple hypothesised utility function will eventually be falsified by a random action).

I'd take a different tack here, actually; I think this depends on what the input to the utility function is. If we're only allowed to look at 'atomic reality', or the raw actions the agent takes, then I think your analysis goes through, that we have a simple causal process generating the behavior but need a very complicated utility function to make a utility-maximizer that matches the behavior.

But if we're allowed to decorate the atomic reality with notes like "this action was generated randomly", then we can have a utility function that's as simple as the generator, because it just counts up the presence of those notes. (It doesn't seem to me like this decorator is meaningfully more complicated than the thing that gave us "agents taking actions" as a data source, so I don't think I'm paying too much here.)
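As a toy illustration of the decoration idea (the trajectory format and all names here are invented):

```python
# Everything here is invented for illustration. Each step of a trajectory is
# (action, notes), where notes decorate 'atomic reality' with how the action
# was produced.
trajectory = [
    ("left",    {"generated_randomly"}),
    ("right",   {"generated_randomly"}),
    ("forward", set()),
]

# Utility over decorated reality: as simple as the generator, it just counts notes.
def u_decorated(traj):
    return sum("generated_randomly" in notes for _, notes in traj)

# Utility over atomic reality only sees raw actions; it can't tell a random
# "left" from a deliberate one, so matching a random generator's behavior
# would need a far more complicated function of action histories.
def u_atomic(traj):
    return sum(1 for action, _ in traj if action == "left")  # arbitrary stand-in

print(u_decorated(trajectory), u_atomic(trajectory))  # 2 1
```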

This can lead to a massive explosion in the number of possible utility functions (because there's a tremendous number of possible decorators), but I think this matches the explosion that we got by considering agents that were the outputs of causal processes in the first place. That is, consider reasoning about python code that outputs actions in a simple game, where there are many more possible python programs than there are possible policies in the game.