## Posts

## Comments

**aspera**on 2013 Less Wrong Census/Survey · 2013-11-26T04:08:28.351Z · LW · GW

It's true: if you're optimizing for altruism, cooperation is clearly better.

I guess it's not really a "dilemma" as such, since the optimal solution doesn't depend at all on what anyone else does. If you're trying to maximize EV, defect. If you're trying to maximize other people's EV, cooperate.

**aspera**on 2013 Less Wrong Census/Survey · 2013-11-25T18:05:49.030Z · LW · GW

My confidence bounds were 75% and 98% for defect, so my estimate was diametrically opposed to yours. If the admittedly low sample size of these comments is any indication, we were both way off.

Why do you think most would cooperate? I would expect this demographic to do a consequentialist calculation, and find that an isolated cooperation has almost no effect on expected value, whereas an isolated defection almost quadruples expected value.

**aspera**on 2013 Less Wrong Census/Survey · 2013-11-25T18:02:23.279Z · LW · GW

Nice job on the survey. I loved the cooperate/defect problem, with calibration questions.

I defected, since a quick expected value calculation makes it the overwhelmingly obvious choice (assuming no communication between players, which I am explicitly violating right now). Judging from comments, it looks like my calibration lower bound is going to be way off.

**aspera**on Bayesianism for Humans · 2013-10-29T15:58:56.631Z · LW · GW

I agree that the statement is not crystal clear. It makes it possible to confuse the (change in the average) with the (average of the change).

Mathematically speaking, we represent our beliefs as a probability distribution on the possible outcomes, and change it upon seeing the result of a test (possibly for every outcome). The statement is that “if we *average* the possible posterior probability distributions weighted by how likely they are, we will end up with our original probability distribution.”

If that were not the case, it would imply that we were failing to make use of all of the prior information we have in our original distribution.

A mistaken reading of the statement is that “the average of the absolute *change* in the probability distribution on measurement is zero.” This is not the case, as you rightly point out. It would imply that we expect the test to yield no information.
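To make the distinction concrete, here is a minimal sketch with a binary hypothesis and a binary test. All of the numbers are hypothetical and chosen only for illustration. It computes both quantities: the likelihood-weighted average of the posteriors (which should equal the prior) and the likelihood-weighted average of the absolute change (which should not be zero).

```python
# Hypothetical prior and likelihoods, for illustration only.
prior = {"H": 0.3, "not_H": 0.7}
likelihood = {  # P(test result | hypothesis)
    "H": {"pos": 0.9, "neg": 0.1},
    "not_H": {"pos": 0.2, "neg": 0.8},
}
results = ["pos", "neg"]


def marginal(result):
    """P(result), summed over both hypotheses."""
    return sum(prior[h] * likelihood[h][result] for h in prior)


def posterior(h, result):
    """P(h | result) by Bayes' rule."""
    return prior[h] * likelihood[h][result] / marginal(result)


# Average of the possible posteriors, weighted by how likely each result is.
expected_posterior = sum(marginal(r) * posterior("H", r) for r in results)

# Average of the absolute change in P(H), weighted the same way.
expected_abs_change = sum(
    marginal(r) * abs(posterior("H", r) - prior["H"]) for r in results
)

print(expected_posterior)   # matches prior["H"] up to rounding
print(expected_abs_change)  # strictly positive: the test is informative
```

The first quantity recovers the prior whatever likelihoods are chosen, while the second is zero only if the test result is independent of the hypothesis.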

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-16T15:06:42.970Z · LW · GW

For the moment, I'm going to strike the comment from the post. I don't want to ascribe a viewpoint to VincentYu that he doesn't actually hold.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-15T19:03:33.543Z · LW · GW

I added a section called "Deciding how to decide" that (hopefully) deals with this issue appropriately. I also amended the conclusion, and added you as an acknowledgement.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-15T03:45:35.701Z · LW · GW

I'm not sure why it got moved: maybe not central to the thesis of LW, or maybe not high enough quality. I'm going to add some discussion of counter-arguments to the limit method. Maybe that will make a difference.

I noticed that the discussion picked up when it got moved, and I learned some useful stuff from it, so I'm not complaining.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-14T03:11:52.364Z · LW · GW

Ok, I think I've got it. I'm not familiar with VNM utility, and I'll make sure to educate myself.

I'm going to edit the post to reflect this issue, but it may take me some time. It is clear (now that you point it out) that we can think of the ill-posedness coming from our insistence that the solution conform to aggregative utilitarianism, and it may be possible to sidestep the paradox if we choose another paradigm of decision theory. Still, I think it's worth working as an example, because, as you say, AU is a good general standard, and many readers will be familiar with it. At the minimum, this would be an interesting finite AU decision problem.

Thanks for all the time you've put into this.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-13T20:23:40.188Z · LW · GW

I would like to include this issue in the post, but I want to make sure I understand it first. Tell me if this is right:

It is possible mathematically to represent a countably infinite number of immortal people, as well as the process of moving them between spheres. Further, we should not expect *a priori* that a problem involving such infinities would have a solution equivalent to those solutions reached by taking infinite limits of an analogous finite problem. Some confusion arises when we introduce the concept of “utility” to determine which of the two choices is better, since utility only serves as a basis on which to make decisions for finite problems.

If that’s what you’re saying, I have a couple of questions.

Do you view the paradox as therefore unresolvable as stated, or would you claim that a different resolution is correct?

If I carefully restricted my claim about ill-posedness to the question of which choice is better from a utilitarian sense, would you agree with it?

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-13T16:16:09.996Z · LW · GW

The final section has been edited to reflect the concerns of some of the commenters.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-13T03:29:33.804Z · LW · GW

Thanks to whoever moved this to Discussion. From the FAQ, I wasn't sure where to put it. This is better, in retrospect.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-13T03:27:39.380Z · LW · GW

Thanks! Do you guys want to copy edit my journal papers? ;)

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-13T03:27:14.162Z · LW · GW

You're completely right! As stated, the problem is ill posed, i.e. it has no unique solution, so we didn't solve it.

Instead, we solved a *similar* problem by introducing a new parameter, \alpha. It was useful because we gained a mathematical description that works for very large n and s, and which matches our intuition about the problem.

It is important to recognize, as you point out, that taking limits does not *solve* the problem. It just elucidates why we can't solve it as stated.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-13T03:23:32.197Z · LW · GW

I agree that it's a lot to cover, but I wanted to work a full example. We talk a lot on LW about decision analysis and paradoxes in the abstract, but I'm coming from a math/physics background, and it's much more helpful for me to see concrete examples. I assume some other people feel the same way.

Self-referential problems would be an interesting area to study, but I'm not familiar with the techniques. I suspect you're right, though.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-12T22:36:48.209Z · LW · GW

Fixed. Thanks for reading so closely. It's amazing how many little mistakes can survive after 10 read-throughs.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-10T22:58:30.857Z · LW · GW

By the way, are you talking about this meme, or is there another problem with monkeys and bananas?

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-10T22:55:39.064Z · LW · GW

Great problem, thanks for mentioning it!

I think the questions "How many balls did you put in the vase as T->\infty?" and "How many balls have been destroyed as T->\infty?" both have well-defined answers. It's just a fallacy to assume that the "total number of balls in the vase as T->\infty" is equal to the difference between these quantities in their limits.
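A rough sketch of the point, assuming the usual version of the vase problem (an assumption on my part: at step k, ten numbered balls go in and the lowest-numbered ball still in the vase is destroyed). The function name `vase_after` is hypothetical.

```python
def vase_after(steps, added_per_step=10):
    """Return (added, removed, contents) after `steps` rounds.

    Assumed rule: round k adds balls 10k-9 .. 10k and destroys ball k.
    """
    added = steps * added_per_step
    removed = steps
    # Balls 1..added have gone in; balls 1..steps have been destroyed.
    contents = set(range(steps + 1, added + 1))
    return added, removed, contents


for T in (1, 10, 100):
    added, removed, contents = vase_after(T)
    # The finite difference added - removed grows without bound...
    print(T, added, removed, len(contents))

# ...yet any particular ball k is gone from step k onward.
print(50 in vase_after(49)[2], 50 in vase_after(50)[2])
```

At every finite T the count is the difference of the two totals, but the two totals each diverge, so their limits cannot simply be subtracted to get the contents "at infinity".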

**aspera**on Advice for a smart 8-year-old bored with school · 2013-10-10T21:40:21.294Z · LW · GW

My parents stopped me from skipping a grade, and apart from a few math tricks, we didn't work on additional material at home. I fell into a trap of "minimum effort for maximum grade," and got really good at guessing the teacher's password. The story didn't change until graduate school, when I was unable to meet the minimum requirements without working, and that eventually led me to seek out fun challenges on my own.

I now have a young son of my own, and will not make the same mistake. I'm going to make sure he *expects* to fail sometimes, and that I praise his efforts to go beyond what's required. No idea if it will work.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-10T20:18:05.917Z · LW · GW

MrMind explains in better language below.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-10T16:56:24.916Z · LW · GW

The plots were done in Mathematica 9, and then I added the annotations in PowerPoint, including the dashed lines. I had to combine two color functions for the density plot, since I wanted to highlight the fact that the line s=n represented indifference. Here's the code:

r = 1; ua = 1;ub = -1;
f1[n_, s_] := (n*s - s^2*r) (ua - ub);
Show[DensityPlot[-f1[n, s], {n, 0, 20}, {s, 0, 20}, ColorFunction -> "CherryTones", Frame -> False, PlotRange -> {-1000, 0}], DensityPlot[f1[n, s], {n, 0, 20}, {s, 0, 20}, ColorFunction -> "BeachColors", Frame -> False, PlotRange -> {-1000, 0}]]
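For anyone without Mathematica, the plotted expression can be checked in a few lines of Python. This is only a sketch of the same formula with the same constants, not the plotting code; it confirms that the line s = n is where the expression changes sign.

```python
R = 1.0
UA, UB = 1.0, -1.0


def f1(n, s, r=R, ua=UA, ub=UB):
    """Same expression as the Mathematica definition of f1."""
    return (n * s - s**2 * r) * (ua - ub)


print(f1(5, 5))    # on the line s = n
print(f1(10, 5))   # n > s
print(f1(5, 10))   # n < s
```

With r = 1 the expression factors as s (n - s) (ua - ub), which is why the two color functions meet along s = n.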

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-10T01:13:42.774Z · LW · GW

No, I mean a function whose limit doesn't equal its defined value at infinity. As a trivial example, I could define a utility function to be 1 for all real numbers in [-inf,+inf) and 0 for +inf. The function could never actually be evaluated at infinity, so I'm not sure what it would mean, but I couldn't claim that the limit was giving me the "correct" answer.
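The trivial example can be written down directly. This is a sketch only, using Python's floating-point infinity as a stand-in for "+inf"; the name `utility` is just for illustration.

```python
import math


def utility(x):
    """1 for every finite x, 0 at +infinity."""
    return 0.0 if x == math.inf else 1.0


# Evaluate along a sequence that heads toward infinity.
samples = [utility(10.0**k) for k in range(1, 300, 50)]
print(samples)             # every finite evaluation gives 1.0
print(utility(math.inf))   # the defined value at infinity is 0.0
```

Any limit taken over finite arguments sees only the value 1, so it cannot recover the value assigned at infinity.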

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-10T00:31:23.928Z · LW · GW

Thanks for clearing up the countability. It's clear that there are some cases where taking limits will fail (like when the utility is discontinuous at infinity), but I don't have an intuition about how that issue is related to countability.

**aspera**on On the importance of taking limits: Infinite Spheres of Utility · 2013-10-09T23:56:31.962Z · LW · GW

In the above example, the number of people and the number of days they live were uncountable, if I'm not mistaken. The take-home message is that you *do not* get an answer if you just evaluate the problem for sets like that, but you *might* if you take a limit.

Conclusions that involve infinity don't map uniquely onto finite solutions because they don't supply enough information. Above, "infinite immortal people" refers to a concept that encapsulates three different answers. We had to invent a new parameter, alpha, which was not supplied in the original problem, to come up with a well-defined result. In essence, we didn't actually answer the question. We made up our own problem that was *similar* to the original one.

**aspera**on The best 15 words · 2013-10-07T21:03:16.712Z · LW · GW

Here is some clarification from Zinsser himself (*ibid*.):

"Who am I writing for? It's a fundamental question, and it has a fundamental answer: You're writing for yourself. Don't try to visualize the great mass audience. There is no such audience - every reader is a different person.

This may seem to be a paradox. Earlier I warned that the reader is... impatient... . Now I'm saying you must write for yourself and not be gnawed by worry over whether the reader is tagging along. I'm talking about two different issues. One is craft, the other is attitude. The first is a question of mastering a precise skill. The second is a question of how you use the skill to express your personality.

In terms of craft, there's no excuse for losing readers through sloppy workmanship. ... But on the larger issue of whether the reader likes you, or likes what you are saying or how you are saying it, or agrees with it, or feels an affinity for your sense of humor or your vision of life, don't give him a moment's worry. You are who you are, he is who he is, and either you'll get along or you won't.

N.B: These paragraphs are not contiguous in the original text.

**aspera**on The best 15 words · 2013-10-03T21:58:44.064Z · LW · GW

*On Writing Well*, by William Zinsser

Every word should do useful work. Avoid cliché. Edit extensively. Don’t worry about people liking it. There is more to write about than you think.

**aspera**on Math is Subjunctively Objective · 2013-10-03T16:38:35.399Z · LW · GW

It makes no sense to call something “true” without specifying prior information. That would imply that we could never update on evidence, which we know not to be the case for statements like “2 + 3 = 5.” Much of the confusion comes from different people meaning different things by the proposition “2 + 3 = 5,” which we can resolve as usual by tabooing the symbols.

Consider the propositions:

A = “The next time I put two sheep and three sheep in a pen, I will end up with five sheep in the pen.”

B = “The universe works as if in all cases, combining two of something with three of something results in five of that thing.”

C = “The symbolic expression 2 + 3 = 5 is consistent with mathematical formalism.”

These are a few examples of what we might mean when we ask “Is ‘2+3=5’ true?” In all cases, we can in principle perform the computation of P(A|Q), or P(B|Q), etc, where Q represents prior information including what I know about sheep and mathematical formalism.

**aspera**on Can Counterfactuals Be True? · 2013-10-03T06:42:45.948Z · LW · GW

As usual, I'm late to the discussion.

The probability that a counterfactual is true should be handled with the same probabilistic machinery we always use. Once the set of prior information is defined, it can be computed as usual with Bayes. The confusing point seems to be that the prior information is *contrary to what actually occurred*, but there's no reason this should be different than any other case with limited prior information.

For example, suppose I drop a glass above a marble floor. Define:

sh = “my glass shattered”

f = “the glass fell to the floor under the influence of gravity”

and define sh_0 and f_0 as the negations of these statements. We wish to compute

P(sh_0|f_0,Q) = P(sh_0|Q)P(f_0|sh_0,Q)/P(f_0|Q),

where Q is all other prior information, including my understanding of physics. As long as these terms exist, we have no problem. The confusion seems to stem from the assumption that P(f_0|sh_0,Q) = P(f_0|Q) = 0, since f_0 is contrary to our observations, and in this case seemingly mutually exclusive with Q.

But probability is in the mind. From the perspective of an observer at the moment the glass is dropped, P(f_0|Q) at least includes cases in which she is living in the Matrix, or aliens have harnessed the glass in a tractor beam. Both of these cases hold finite probability consistent with Q. From the perspective of someone remembering the observed event, P(f_0|Q) might include cases in which her memory is not trustworthy.

In the usual colloquial case, we’re taking the perspective of someone running a thought experiment on a historical event with limited information about history and physics. The glass-dropping case limits the possible cases covered by P(f_0|Q) considerably, but the Kennedy-assassination case leaves a good many of them open. All terms are well defined in Bayes’ rule above, and I see no problem with computing in principle the probability of the counterfactual being true.
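As a sanity check that the terms in Bayes' rule above are well defined once P(f_0|Q) is small but nonzero, here is a sketch with a made-up joint distribution over (fell or not, shattered or not). Every number is hypothetical; only the structure of the calculation matters.

```python
# Hypothetical joint probabilities P(fall state, shatter state | Q).
joint = {
    ("f", "sh"): 0.949999,
    ("f", "sh_0"): 0.05,
    ("f_0", "sh"): 1e-8,      # e.g. Matrix or tractor-beam scenarios
    ("f_0", "sh_0"): 9.9e-7,
}


def prob(predicate):
    """Sum the joint probability over outcomes matching the predicate."""
    return sum(p for outcome, p in joint.items() if predicate(outcome))


p_sh0 = prob(lambda o: o[1] == "sh_0")           # P(sh_0 | Q)
p_f0 = prob(lambda o: o[0] == "f_0")             # P(f_0 | Q), tiny but nonzero
p_f0_given_sh0 = joint[("f_0", "sh_0")] / p_sh0  # P(f_0 | sh_0, Q)

# Bayes' rule exactly as written in the comment.
p_sh0_given_f0 = p_sh0 * p_f0_given_sh0 / p_f0
print(p_sh0_given_f0)
```

Nothing breaks as long as the denominator P(f_0|Q) is not exactly zero, which is the point about probability being in the mind.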

**aspera**on Newcomb's Problem and Regret of Rationality · 2012-11-26T23:23:50.756Z · LW · GW

I'm confused about why this problem is different from other decision problems.

Given the problem statement, this is not an acausal situation. No physics is being disobeyed - Kramers-Kronig still works, relativity still works. It's completely reasonable that my choice could be predicted from my source code. Why isn't this just another example of prior information being appropriately applied to a decision?

Am I dodging the question? Does EY's new decision theory account for truly acausal situations? If I based my decision on the result of, say, a radioactive decay experiment performed after Omega left, could I still optimize?

**aspera**on Meetup : Weekly meetup, Champaign IL: Cafe Paradiso · 2012-11-26T22:13:14.318Z · LW · GW

Ha - thanks. Fixed. But I guess if other people want to Skype in from around the world, they're welcome to.

**aspera**on Torture vs. Dust Specks · 2012-11-23T05:40:16.850Z · LW · GW

Yes, we are running on corrupted hardware at about 100 Hz, and I agree that defining broad categories to make first-cut decisions is necessary.

But if we were designing a morality program for a super-intelligent AI, we would want to be as mathematically consistent as possible. As shminux implies, we can construct pathological situations that exploit the particular choice of discontinuities to yield unwanted or inconsistent results.

**aspera**on Where Recursive Justification Hits Bottom · 2012-11-23T05:27:11.926Z · LW · GW

I think it would be possible to have an anti-Occam prior if the total complexity of the universe is bounded.

Suppose we list integers according to an unknown rule, and we favor rules with high complexity. Given the problem statement, we should take an anti-Occam prior to determine the rule given the list of integers. It doesn't diverge because the list has finite length, so the complexity is bounded.

Scaling up, the universe presumably has a finite number of possible configurations given any prior information. If we additionally had information that led us to take an Anti-Occam prior, it would not diverge.
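A toy version of the bounded case, as a sketch: take integer "complexities" 1..K and weight each by 2^k (my own stand-in for favoring complex rules; nothing above specifies the weighting). With a finite cap the weights normalize; without one the sum grows without limit.

```python
def anti_occam_prior(max_complexity):
    """Normalized prior over complexities 1..max_complexity, weight 2**k."""
    weights = {k: 2.0**k for k in range(1, max_complexity + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


prior = anti_occam_prior(20)
print(sum(prior.values()))   # normalizes, up to rounding
print(prior[20] > prior[1])  # more complex rules are favored

# The unnormalized total keeps growing as the bound is raised.
print([sum(2.0**k for k in range(1, K + 1)) for K in (10, 20, 40)])
```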

**aspera**on Seeking a "Seeking Whence 'Seek Whence'" Sequence · 2012-11-16T20:20:48.271Z · LW · GW

I'm also looking for a discussion of the symmetry related to conservation of probability through Noether's theorem. A quick Google search only finds quantum mechanics discussions, which relate it to spatial invariances, etc.

If there's no symmetry, it's not a conservation law. Surely someone has derived it carefully. Does anyone know where?

**aspera**on Torture vs. Dust Specks · 2012-11-16T20:06:02.323Z · LW · GW

The idea that the utility should be continuous is mathematically equivalent to the idea that an infinitesimal change on the discomfort/pain scale should give an infinitesimal change in utility. If you don't use that axiom to derive your utility function, you can have sharp jumps at arbitrary pain thresholds. That's perfectly OK - but then you have to choose where the jumps are.

**aspera**on The Quotation is not the Referent · 2012-11-09T20:17:19.150Z · LW · GW

I think that in physics we would deal with this as a mapping problem. John's and Mary's beliefs about the planet live in different spaces, and we need to pick a basis on which to project them in order to compare them. We use language as the basis. But then when we try to map between concepts, we find that the problem is ill posed: it doesn't have a unique solution because the maps are not all 1:1.

**aspera**on 2012 Less Wrong Census/Survey · 2012-11-09T00:02:21.687Z · LW · GW

Nice job writing the survey - fun times. I kind of want to hand it out to my non-LW friends, but I don't want to corrupt the data.

**aspera**on Meetup : Meetup, Champaign IL, · 2012-11-08T22:57:35.462Z · LW · GW

Thanks, I'll check it out.

**aspera**on Torture vs. Dust Specks · 2012-11-07T22:41:19.364Z · LW · GW

Bravo, Eliezer. Anyone who says the answer to this is obvious is either WAY smarter than I am, or isn't thinking through the implications.

Suppose we want to define Utility as a function of pain/discomfort on the continuum of [dust speck, torture] and including the number of people afflicted. We can choose whatever desiderata we want (e.g. positive real valued, monotonic, commutative under addition).

But what if we choose as one desideratum, "There is no number *n* large enough such that Utility(n dust specks) > Utility(50 yrs torture)." What does that imply about the function? It can't be analytic in *n* (even if *n* were continuous). That rules out multiplicative functions trivially.

Would it have singularities? If so, how would we combine utility functions at singular values? Take limits? How, exactly?

Or must dust specks and torture live in different spaces, and is there no basis that can be used to map one to the other?

The bottom line: is it possible to consistently define utility using the above desideratum? It seems like it must be so, since the answer is obvious. It seems like it must not be so, because of the implications for the utility function as the arguments change.

Edit:
After discussing with my local meetup, this is somewhat resolved. The above desiderata require the utility to be *bounded* in the number of people, *n*. For example, it could be a saturating exponential function. This is self-consistent, but inconsistent with the notion that because experience is independent, utilities should add.

Interestingly, it puts strict mathematical rules on how utility can scale with n.
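A sketch of what such a bounded function might look like. The constants and names are hypothetical, and I am writing it as a *disutility* so that larger means worse, matching the inequality in the desideratum.

```python
import math

TORTURE_DISUTILITY = 1.0  # hypothetical units
SPECK_CEILING = 0.5       # hypothetical bound on total dust-speck disutility
SCALE = 1e6               # hypothetical number of people at which it saturates


def speck_disutility(n):
    """Saturating exponential: rises with n but never exceeds SPECK_CEILING."""
    return SPECK_CEILING * (1.0 - math.exp(-n / SCALE))


for n in (1, 10**3, 10**9, 10**100):
    # Always below the torture value, no matter how large n gets.
    print(n, speck_disutility(n) < TORTURE_DISUTILITY)

# The price of boundedness: disutilities no longer add across people.
print(speck_disutility(2 * 10**6), 2 * speck_disutility(10**6))
```

The last line shows the inconsistency with additivity mentioned above: doubling the number of people less than doubles the total.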

**aspera**on Meetup : Meetup, Champaign IL, · 2012-11-07T21:48:06.307Z · LW · GW

Also, I suggest you read Torture vs Dust Specks. I found it to be very troubling, and would love to talk about it at the meeting.

**aspera**on Conservation of Expected Evidence · 2012-11-01T16:34:35.895Z · LW · GW

Is this the same as Jaynes' method for construction of a prior using transformation invariance on acquisition of new evidence?

Does conservation of expected evidence always uniquely determine a probability distribution? If so, it should eliminate a bunch of extraneous methods of construction of priors. For example, you would immediately know if an application of MaxEnt was justified.

**aspera**on Every Cause Wants To Be A Cult · 2012-10-15T19:55:07.053Z · LW · GW

That thought occurred to me too, and then I decided that EY was using "entropy" as "the state to which everything naturally tends." But after all, I think it's possible to usefully extend the metaphor.

There *is* a higher number of possible cultish microstates than non-cultish microstates, because there are fewer logically consistent explanations for a phenomenon than logically inconsistent ones. In each non-cultish group, rational argument and counter-argument should naturally push the group toward one describing observed reality. By contrast, cultish groups can fill up the rest of concept-space.

**aspera**on Signs you're on LW too much · 2012-10-12T03:20:46.458Z · LW · GW

You can't remember whether or not bleggs exist in real life.

**aspera**on Where to Draw the Boundary? · 2012-10-11T05:18:45.159Z · LW · GW

Maybe this is covered in another post, but I'm having trouble cramming this into my brain, and I want to make sure I get this straight:

Consider a thingspace. We can divide the thingspace into any number of partially-overlapping sets that don’t necessarily span the space. Each set is assigned a word, and the words are not unique.

Our job is to compress mental concepts in a lossy way into short messages to send between people, and we do so by referring to the words. Inferences drawn from the message have associated uncertainties that depend on the characteristics we believe members of the sets to have, word redundancy, etc.

In principle, we can draw whichever boundaries we like in thingspace (and, I suppose, they don’t need to be hard boundaries). But EY is saying that it’s *wise* to draw the boundaries in a way that "feels" right, which presumably means that the members have certain things in common. Then when we make inferences, the pdfs are sharply peaked (since we required that for set membership), and the calculation is simpler to do.

He also says that it’s possible to make a "mistake" in defining the sets. Does this result from the failure to be consistent in our definitions, a failure to assign uncertainties correctly, or a failure to define the sets in a wise way?

**aspera**on Welcome to Less Wrong! · 2012-10-10T21:05:20.404Z · LW · GW

That's very helpful, thanks. I'm trying to shove everything I read here into my current understanding of probability and estimation. Maybe I should just read more first.

**aspera**on Welcome to Less Wrong! · 2012-10-10T18:38:00.642Z · LW · GW

There are a couple things I still don't understand about this.

Suppose I have a bent coin, and I believe that P(heads) = 0.6. Does that belief pay rent? Is it a "floating belief?" It is not, in principle, falsifiable. It's not a question of measurement accuracy in this case (unless you're a frequentist, I guess). But I can gather *some* evidence for or against it, so it's not uninformative either. It is useful to have something between grounded and floating beliefs to describe this belief.

Second, when LWers talk about beliefs, or "the map," are they referring to a model of *what we expect to observe*, or *how things actually happen*? This would dictate how we deal with measurement uncertainties. In the first case, they must be included in the map, trivially. In the second case, the map still has an uncertainty associated with it that *results* from back-propagation of measurement uncertainty in the updating process. But then it might make sense to talk only about grounded or floating beliefs, and to attribute the fuzzy stuff in between to our inability to observe without uncertainty.

Your distinction makes sense - I'm just not sure how to apply it.
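To illustrate what "gathering some evidence for or against" the bent-coin belief could look like, here is a sketch using a Beta distribution over the coin's bias. The Beta model, the prior parameters, and the flip counts are all my own assumptions for the example.

```python
def update(alpha, beta, heads, tails):
    """Conjugate Beta update after observing coin flips."""
    return alpha + heads, beta + tails


def mean(alpha, beta):
    """Expected P(heads) under Beta(alpha, beta)."""
    return alpha / (alpha + beta)


alpha, beta = 6.0, 4.0                 # hypothetical prior, mean 0.6
print(mean(alpha, beta))

alpha, beta = update(alpha, beta, heads=30, tails=70)  # hypothetical data
print(mean(alpha, beta))               # shifted well below 0.6
```

No finite run of flips strictly falsifies P(heads) = 0.6, but the expected bias still moves with the data, which is the in-between status the question is about.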

**aspera**on Mysterious Answers to Mysterious Questions · 2012-10-10T15:57:11.876Z · LW · GW

I think this is the kind of causal loop he has in mind. But a key feature of the hypothesis is that you *can't* predict what's meant to happen. In that case, he's equally good at predicting any outcome, so it's a perfectly uninformative hypothesis.

**aspera**on My Wild and Reckless Youth · 2012-10-10T00:45:52.518Z · LW · GW

Is this what CFAR is trying to do?

I would be interested to hear what other members of the community think about this. I accidentally found Bayes after being trained as a physicist, which is not entirely unlike traditional rationality. But I want to teach my brother, who doesn't have any science or rationality background. Has anyone had success with starting at Bayes and going from there?

**aspera**on Mysterious Answers to Mysterious Questions · 2012-10-10T00:16:07.826Z · LW · GW

I jest, but the sense of the question is serious. I really do want to teach the people I'm close to how to get started on rationality, and I recognize that I'm not perfect at it either. Is there a serious conversation somewhere on LW about being an aspiring rationalist living in an irrational world? Best practices, coping mechanisms, which battles to pick, etc?

**aspera**on Mysterious Answers to Mysterious Questions · 2012-10-09T23:07:45.189Z · LW · GW

My mother's husband professes to believe that our actions have no control over the way in which we die, but that "if you're meant to die in a plane crash and avoid flying, then a plane will end up crashing into you!" for example.

After explaining how I would expect that belief to constrain experience (like how it would affect plane crash statistics), as well as showing that he himself was demonstrating his unbelief every time he went to see a doctor, he told me that you "just can't apply numbers to this," and "Well, you shouldn't *tempt* fate."

My question to the LW community is this: How do you avoid kicking people in the nuts all of the time?

**aspera**on Guessing the Teacher's Password · 2012-10-09T04:36:56.770Z · LW · GW

I agree with you, a year and a half late. In fact, the idea can be extended to EY's concept of "floating beliefs," webs of code words that are only defined with respect to one another, and not with respect to evidence. It should be noted that if at any time, a member of the web is correlated in some way with evidence, then so is the entire web.

In that sense, it doesn't seem like wasted effort to maintain webs of "passwords," as long as we're responsible about updating our best guesses about reality based on only those beliefs that are evidence-related. In the long term, given enough memory capacity, it should speed our understanding.

**aspera**on Fake Explanations · 2012-10-09T04:22:39.321Z · LW · GW

Unless I misunderstand, this story is a parable. EY is communicating with a handwaving example that the effectiveness of a code doesn't depend on the alphabet used. In the code used to describe the plate phenomenon, “magic” and “heat conduction” are interchangeable symbols which formally carry zero information, since the coder doesn't use them to discriminate among cases.

I’m sincerely confused as to why comments center on the motivations of the students and the professor. Isn't that irrelevant? Or did EY mean for the discussion to go this way? Does it matter?
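The "zero information" claim can be checked with a small mutual-information calculation. This is a sketch with made-up observation labels: if the same explanation is attached to every case, the explanation tells you nothing about which case occurred, whatever word is used for it.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information in bits between the two slots of equally likely pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((left[a] / n) * (right[b] / n)))
        for (a, b), c in joint.items()
    )


cases = ["observation_1", "observation_2"]  # hypothetical outcomes

# Same symbol for every case: swap in "magic" and nothing changes.
same_label = [("heat conduction", c) for c in cases]
# A code that actually discriminates between the cases.
discriminating = [("explanation_A", cases[0]), ("explanation_B", cases[1])]

print(mutual_information(same_label))      # 0 bits
print(mutual_information(discriminating))  # 1 bit
```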