Since utility functions are only unique up to affine transformation, I don't know what to make of this comment. Do you have some sort of canonical representation in mind or something?
See also Kreps, Notes on the Theory of Choice. Note that one of these two restrictions is required specifically to prevent infinite expected utility. So if a lottery spits out infinite expected utility, you broke something in the VNM axioms.
For anyone who's interested, a quick and dirty explanation is that the preference relation is primitive, and we're trying to come up with an index (a utility function) that reproduces the preference relation. In the case of certainty, we want a function U:O->R, where O is the outcome space and R is the real numbers, such that U(o1) > U(o2) if and only if o1 is preferred to o2. In the case of uncertainty, U is defined on the set of probability distributions over O, i.e. U:M(O) -> R. With the VNM axioms, we get U(L) = E_L[u(o)], where L is some lottery (i.e. a probability distribution over O). U is strictly prohibited from taking the value of infinity in these definitions. Now you could probably extend them a little bit to allow for such infinities (at the cost of VNM utility, perhaps), but you would need every lottery with infinite expected value to be tied for the best lottery according to the preference relation.
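A minimal R sketch of the representation, with invented utilities and lotteries (the exact numbers don't matter, only the ordering U induces):

```r
# Minimal sketch of U(L) = E_L[u(o)] over a finite outcome space O.
# Utilities and lotteries are made up for illustration.
u <- c(o1 = 0, o2 = 5, o3 = 10)  # u: O -> R, finite by construction
U <- function(L) sum(L * u)      # U(L) = E_L[u(o)]
L1 <- c(0.5, 0.3, 0.2)           # a lottery: a probability distribution over O
L2 <- c(0.0, 1.0, 0.0)           # a degenerate lottery
U(L1) > U(L2)                    # FALSE here: L2 is preferred under this u
```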
I do this too, though in smaller bites. Fifths? Fourths? I'm not sure, actually, but it seems to work.
Good point.
Funny, I read your post and my initial reaction was that this evidence cuts against PUA. (Now I'm not sure whether it supports PUA or not, but I lean towards support).
PUA would predict that this phrase
...while I devote myself to worshiping the ground she walks on.
is unattractive.
Well based on your track record there, it seems like a prudent move to avoid making bets with you ;)
(Though I agree with you and should be shaming them rather than defending them.)
If the basilisk is correct* it seems any indirect approach is doomed, but I don't see how it prevents a direct approach. But that has its own set of probably-insurmountable problems, I'd wager.
* I remain highly uncertain about that, but it's not something I can claim to have a good grasp on or to have thought a lot about.
I think I understand X, and it seems like a legitimate problem, but the comment I think you're referring to here seems to contain (nearly) all of X and not just half of it. So I'm confused and think I don't completely understand X.
Edit: I think I found the missing part of X. Ouch.
I also have this problem and would like to know how to fix it / if dual n-back might help.
Politics is the mind-killer for a variety of reasons besides ridiculously strong priors that are never swayed by evidence. Strong priors aren't even the entirety of the phenomenon to be explained (though they are a big part), let alone a fundamental explanation.
Also, I really like Noah's post (and was about to post it in the current open thread before I found your post). Not only did Noah attach a word to a pretty commonly occurring phenomenon, the word seems to have a great set of connotations attached to it, given some goals about improving discourse.
What do you mean by 'content' here? The basic narrative each model tells about the economy?
I think I agree with you. The big difference between the models I learned in undergrad and the models I learned in grad school was that in undergrad, everything was static. In grad school, the models were dynamic - i.e. a sequence of equilibria over time instead of just one.
FWIW I'm a grad student in econ, and in my experience the undergrad and graduate macro are completely different. I recall Greg Mankiw sharing a similar sentiment on his blog at some point, but can't be bothered to look it up.
That was like, half the point of my post. I obviously suck at explaining myself.
I think the combination of me skimming and thinking in terms of the underlying preference relation instead of intertheoretic weights caused me to miss it, but yeah, it's clear you already said that.
Thanks for throwing your brain into the pile.
No problem :) Here are some more thoughts:
It seems correct to allow the probability distribution over ethical theories to depend on the outcome - there are facts about the world which would change my probability distribution over ethical theories, e.g. facts about the brain or human psychology. Not all meta-ethical theories would allow this, but some do.
I'm nearly certain that if you use the preference-relation-over-sets framework, you'll recover a version of each ethical theory's utility function, and this happens even if you allow the true ethical theory to be correlated with the outcome of the lottery by using a conditional distribution P(m|o) instead of P(m). Implicitly, this will define your k_m's and c_m's, given a version of each m's utility function, U_m(.).
It seems straightforward to add uncertainty over meta-preferences into the mix, though now we'll need meta-meta-preferences over M2xM1xO. In general, you can always add uncertainty over meta^n-preferences, and the standard VNM axioms should get you what you want, but in the limit the space becomes infinite-dimensional and thus infinite, so the usual VNM proof doesn't apply to the infinite tower of uncertainty.
It seems incorrect to have M be a finite set in the first place, since competing ethical theories will say things like "1 human life = X dog lives", and X could be any real number. This means, once again, we blow up the VNM proof. On the other hand, I'm not sure this is any different from complaining that O is finite; if you're going to simplify and assume O is finite, you may as well do the same for M.
This strikes me as the wrong approach. I think you probably need to go down to the level of meta-preferences and apply VNM-type reasoning to this structure rather than working with the higher-level construct of utility functions. What do I mean by that? Well, let M denote the model space and O denote the outcome space. What I'm talking about is a preference relation > on the space MxO. If we simply assume such a > is given (satisfying the constraint that (m1, o1) > (m1, o2) iff o1 >_m1 o2, where >_m1 is model m1's preference relation), then the VNM axioms applied to (>, MxO) and the distribution on M are probably sufficient to give a utility function, and it should have some interesting relationship with the utility functions of each competing ethical model. (I don't actually know this, it just seems intuitively plausible. Feel free to do the actual math and prove me wrong.)
On the other hand, we'd like to allow the set of >_m's to determine > (along with P(m)), but I'm not optimistic. It seems like this should only happen when the utility functions associated with each >_m, U_m(o), are fully unique rather than unique up to affine transformation. Basically, we need our meta-preferences over the relative badness of doing the wrong thing under competing ethical theories to play some role in determining >, and that information simply isn't present in the >_m's.
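To make that last point concrete, here's a toy R sketch (everything invented) of the expected-utility construction over MxO. The >_m's fix each U_m only up to affine transformation, and the ranking of lotteries turns on which k_m's you pick:

```r
# Toy example: two ethical theories (m1, m2), two outcomes (o1, o2).
# Each U_m is only unique up to positive affine transformation k_m*U_m + c_m,
# so the chosen k_m's act as intertheoretic weights. All numbers invented.
U_m <- matrix(c(1, 0,   # U_m1(o1), U_m1(o2)
                0, 1),  # U_m2(o1), U_m2(o2)
              nrow = 2, byrow = TRUE)
P_m <- c(0.5, 0.5)  # credence in each theory (independent of o, for simplicity)
meta_U <- function(lottery, k, c) sum(P_m * (k * (U_m %*% lottery) + c))
L_A <- c(1, 0); L_B <- c(0, 1)          # two degenerate lotteries over O
meta_U(L_A, k = c(1, 1),  c = c(0, 0))  # 0.5: ties with L_B under equal weights
meta_U(L_B, k = c(1, 1),  c = c(0, 0))  # 0.5
meta_U(L_A, k = c(10, 1), c = c(0, 0))  # 5.0: rescaling m1's utilities breaks the tie
meta_U(L_B, k = c(10, 1), c = c(0, 0))  # 0.5
```

Nothing in the >_m's privileges one choice of k over the other, which is exactly the missing information.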
(Even though my comment is a criticism, I still liked the post - it was good enough to get me thinking at least)
Edit: clarity and fixing _'s
One interesting fact from Chapter 4 (on weather predictions) that seems worth mentioning: weather forecasters are also very good at manually and intuitively (i.e. without some rigorous mathematical method) fixing the predictions of their models. E.g. they might know that model A always predicts rain a hundred miles or so too far west of the Rocky Mountains. To fix this, they take the computer output and manually redraw the lines (demarcating level sets of precipitation) about a hundred miles east, and this significantly improves their forecasts.
Also: the National Weather Service gives the most accurate weather predictions. Everyone else exaggerates to a greater or lesser degree in order to avoid getting flak from consumers about, e.g., rain on their wedding day (because not-rain on their wedding day is far less of a problem).
I just started a research project with my adviser developing new posterior sampling algorithms for dynamic linear models (linear Gaussian discrete-time state space models). Right now I'm in the process of writing up the results of some simulations testing a couple of known algorithms, and am about to start some simulations testing some AFAIK unknown algorithms. There are a couple of interesting divergent threads coming off this project, but I haven't really gotten into those yet.
Off the cuff: it's probably a random walk.
Edit: It's now pretty clear to me that's false, but plotting the ergodic means of several "chains" seems like a good way to figure it out.
Edit 2: In retrospect, I should have predicted that. If anyone is interested, I can post some R code so you can see what happens.
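For the curious, a quick sketch of the kind of R code I mean. Since the actual quantity isn't specified here, independent random walks stand in for the "chains":

```r
# Plot the ergodic (running) means of several independent "chains".
# For a random walk, the running means visibly fail to settle down,
# which is the point of the diagnostic.
set.seed(1)
n_steps <- 5000
n_chains <- 4
erg_means <- sapply(seq_len(n_chains), function(i) {
  chain <- cumsum(rnorm(n_steps))   # a random-walk "chain"
  cumsum(chain) / seq_len(n_steps)  # running mean at each iteration
})
matplot(erg_means, type = "l", lty = 1,
        xlab = "iteration", ylab = "ergodic mean")
```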
The book Wicked is based on The Wizard of Oz and has some related themes, IIRC. (I really didn't like the musical based on the book, though. But I might just dislike musicals in general; FWIW I also didn't like the only other musical I've seen in person - Rent.)
Experimental economists use Mechanical Turk sometimes. At least, we were encouraged to use it in the experimental economics class I just took.
In response to:
I believe the hard-line Bayesian response to that would be that model checking should itself be a Bayesian process.
and
"But," the soft Bayesians might say, "how do you expand that 'something else' into new models by Bayesian means? You would need a universal prior, a prior whose support includes every possible hypothesis. Where do you get one of those? Solomonoff? Ha! And if what you actually do when your model doesn't fit looks the same as what we do, why pretend it's Bayesian inference?"
I think a hard line needs to be drawn between statistics and epistemology. Statistics is merely a method of approximating epistemology - though a very useful one. The best statistical method in a given situation is the one that best approximates correct epistemology. (I'm not saying this is the only use for statistics, but I can't seem to make sense of it otherwise)
Now suppose Bayesian epistemology is correct - i.e. let's say Cox's theorem + Solomonoff prior. The correct answer to any induction problem is to do the true Bayesian update implied by this epistemology, but that's not computable. Statistics gives us some common ways to get around this problem. Here are a few:
1) Bayesian statistics approach: restrict the class of possible models and put a reasonable prior over that class, then do the Bayesian update. This has exactly the same problem that Mencius and Cosma pointed out.
2) Frequentist statistics approach: restrict the class of possible models and come up with a consistent estimate of which model in that class is correct. This has all the problems that Bayesians constantly criticize frequentists for, but it typically allows for a much wider class of possible models in some sense (crucially, you often don't have to assume distributional forms).
3) Something hybrid: e.g., Bayesian statistics with model checking. Empirical Bayes (where the prior is estimated from the data). Etc.
Now superficially, 1) looks the most like the true Bayesian update - you don't look at the data twice, and you're actually performing a Bayesian update. But you don't get points for looking like the true Bayesian update, you get points for giving the same answer as the true Bayesian update. If you do 1), there's always some chance that the class of models you've chosen is too restrictive for some reason. Theoretically you could continue to do 1) by just expanding the class of possible models and putting a prior over that class, but at some point that becomes computationally infeasible. Model checking is a computationally feasible way of approximating this process. And, a priori, I see no reason to think that some frequentist method won't give the best computationally feasible approximation in some situation.
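As an invented illustration of what model checking buys you here, a quick posterior predictive check in R against a deliberately too-restrictive model class:

```r
# Invented illustration: a posterior predictive check catching a model class
# that's too restrictive (a normal likelihood fit to skewed data).
set.seed(1)
y <- rexp(100)   # "real" data: skewed, so the normal model class is wrong
n_rep <- 1000
# Crude approximate posterior draws for the normal model's mean:
post_mu <- rnorm(n_rep, mean(y), sd(y) / sqrt(length(y)))
# Posterior predictive datasets, one column per draw:
y_rep <- sapply(post_mu, function(m) rnorm(length(y), m, sd(y)))
skew_stat <- function(x) mean((x - mean(x))^3)  # a statistic the model can fail on
mean(apply(y_rep, 2, skew_stat) >= skew_stat(y))  # ~0: flags the misfit
```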
So, basically, a "hardline Bayesian" should do model checking and sometimes even frequentist statistics. (Similarly, a "hardline frequentist" in the epistemological sense should sometimes do Bayesian statistics. And, in fact, they do this all the time in econometrics.)
Even then, this might be plausibly explained by the changing information of the bookstore rather than actual intransitivity.
It's a Schelling point, er, joke isn't the right word, but it's funny because the day was supposed to be a Schelling point. And you forgot about it.
This is simultaneously hilarious and weak evidence that the holiday isn't working as intended (though I think repeating the holiday every year will do the trick).
I notice that I'm confused: the maximum score on the Quantitative section was 800 at that time, and Ph.D. econ programs won't even consider you if you're under a 780. The quantitative exam is actually really easy for math types. When you sign up for the GRE, you get a free CD with 2 practice exams. When I took it, I took the first practice exam without studying at all and got a 760 or so on the quantitative section (within 10 pts). After studying I got an 800 on the second practice exam, and on the actual exam I got a 790. The questions were basic algebra for the most part, with a bit of calculus and basic stats at the top end and a tricky question here and there. The exam was easy - really easy. I was a math major at a tiny / terrible liberal arts school; nothing like MIT or any self-respecting state school. So it seems like it should be easy for anyone with a halfway decent mathematics background.
Now you're telling me people intending to major in econ in grad school average a 706, and people intending to major in math average a 733? That's low. Really low relative to my expectations. I would have expected a 730 in econ and maybe a 760 in math.
Possible explanations:
1) Tons of applicants who don't want to believe that they aren't cut out for their field create a long tail on the low side while the high side is capped at 800.
2) Master's programs are, in general, more lenient and there are a large number of people who only intend to go to them, creating the same sort of long tail effect as above in 1).
3) There are way more low-tier graduate programs in both fields than I thought, willing to accept the average or even below-average student.
4) Weirdness in how these fields are classified (e.g., I don't see statistics there anywhere - is that included in math?)
5) The quantitative section of the standard GRE actually doesn't matter if you're headed to a math or physics program (someone in that field care to comment?). Note: the quantitative section of the standard GRE does matter in econ, but typically only as a way to make the first cut (usually at 760 or 780, depending on the school). I don't know much of the details here though.
6) Very few people actually study for the GRE like I did - i.e. buy a prep book and work through it. This depresses their scores even though they're much better quantitatively than I am.
Unsurprisingly, since these are in when-I-thought-of-them order, 1)-3) appeal to me the most, but 5) and 6) also seem plausible. I don't see why 4) would bias the scores down instead of up, so it seems unlikely a priori.
Not surprising, given my experience. Most religion majors I've met were relatively smart and often made fun of the more fundamentalist/evangelical types who typically were turned off by their religion classes. Religion majors seemed like philosophy-lite majors (which is consistent with the rankings).
Edit: Also, relative to Religion, econ has a bunch of poor English speakers who pull the other two categories down. (Note: the "analytical" section is/was actually a couple of very short essays.)
That seems to explain why econ majors get a premium, but it doesn't seem to explain why they don't rank higher - or am I missing something?
I didn't look at the data. I was commenting on your assessment of what they did, which showed that you didn't know how the F test works. Your post made it seem as if all they did was run an F test that compared the average response of the control and treatment groups and found no difference.
Ok, yeah, translating what the researchers did into a Bayesian framework isn't quite right either. Phil should have translated what they did into a frequentist framework - i.e. he still strawmanned them. See my comment here.
Both the t-test and the F-test work by assuming that every subject has the same response function to the intervention:
response = effect + normally distributed error
where the effect is the same for every subject.
The F test / t test doesn't quite say that. It makes statements about population averages. More specifically, if you're comparing the mean of two groups, the t or F test says whether the average response of one group is the same as the other group. Heterogeneity just gets captured by the error term. In fact, econometricians define the error term as the difference between the true response and what their model says the mean response is (usually conditional on covariates).
The fact that the authors ignored potential heterogeneity in responses IS a problem for their analysis, but their result is still evidence against heterogeneous responses. If there really are heterogeneous responses, we should see that show up in the population average unless:
- The positive and negative effects cancel each other out exactly once you average across the population. (this seems very unlikely)
- The population average effect size is nonzero but very small, possibly because the effect only occurs in a small subset of the population (even if it's large when it does occur) or something similar but more complicated. In this case, a large enough sample size would still detect the effect.
Now it might not be very strong evidence - this depends on sample size and the likely nature of the heterogeneity (or confounders, as Cyan mentions). And in general there is merit in your criticism of their conclusions. But I think you've unfairly characterized the methods they used.
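A made-up toy simulation of the point about heterogeneity getting absorbed into the error term:

```r
# Made-up numbers: heterogeneous individual effects get absorbed into the
# error term, and a two-sample t test still detects the nonzero average effect.
set.seed(1)
n <- 2000
effect <- rnorm(n, mean = 0.3, sd = 1)  # heterogeneous effects, average 0.3
control <- rnorm(n)
treated <- rnorm(n) + effect            # response = effect + noise
t.test(treated, control)$p.value        # tiny: the average effect shows up
# With mean = 0 the effects cancel on average and the test sees nothing,
# even though individual effects are large -- the unlikely case noted above.
```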
It's just not an argument against Phil that someone might take some of the data in the paper and do a Bayesian analysis that the authors did not do.
That's not what I'm saying. I'm saying that what the authors did do IS evidence against the hypothesis in question. Evidence against a homogeneous response is evidence against any response (it makes some response less likely).
They do, but did the paper he dealt with write within a Bayesian framework? I didn't read it, but it sounded like standard "let's test a null hypothesis" fare.
You don't just ignore evidence because someone used a hypothesis test instead of your favorite Bayesian method. P(null | p value) != P(null).
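A toy illustration of this (all numbers assumed, nothing from the paper): the same z-statistic that produces the p-value also updates P(null) via Bayes' rule.

```r
# Assumed setup: H0: mu = 0 vs a point alternative H1: mu = 0.5,
# known sd = 1, n = 25. The z-statistic behind the p-value updates P(H0).
z <- 2.0
p_value <- 2 * pnorm(-abs(z))            # ~0.046, the frequentist summary
lik0 <- dnorm(z, mean = 0)               # likelihood of z under H0
lik1 <- dnorm(z, mean = 0.5 * sqrt(25))  # under H1, z centers at mu * sqrt(n)
prior0 <- 0.5
post0 <- prior0 * lik0 / (prior0 * lik0 + (1 - prior0) * lik1)
c(p_value = p_value, prior = prior0, posterior = post0)  # P(H0) falls to ~0.13
```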
This is a lot like evaporative cooling of group beliefs.
My advisor, Jarad Niemi, has posted a bunch of lectures on Bayesian statistics to YouTube, most of them short and all pretty good IMHO. The lectures are made for Stat 544, a course at Iowa State University. They assume a decent familiarity with probability theory - most students in the course have seen most of chapters 1-5 of Casella and Berger in detail - and some knowledge of R.
If it is indeed a megameetup, I'd like to attend (from Ames, IA, so in the 7-hour range).
EDIT: FWIW I'm also willing to carpool with anyone (nearly) passing through or (nearly) on the way.
I agree, but I'm not sure it was intended as an insult. The effect in (some) readers is similar though, so maybe I'm splitting hairs.
The best way not to do something is to do the best thing you could be doing instead in the best way.
I think so.
Sure, I'm just saying that personal usefulness shouldn't be the only reason you upvote.
This honestly made me smile in a "man do I love LW" sort of way.
Upvote comments that you think are useful on LW in general, not just comments you found personally useful. (A note to myself as I read this thread).
You're right - and I think this is a common failure mode of the population at large, but my most common failure mode is not finding something in a quick Google search and then failing to just ask someone who probably knows, instead either wasting too much time searching or giving up. At the risk of the typical mind fallacy, perhaps this is the most common failure mode of the average LW member as well. If the grandparent could somehow be changed to target people like me better, I think that would improve it the most.
Also related (specifically, getting offended by people who are acting, gasp, irrationally): The problem with too many rational memes
See especially the comments. There are some good strategies in there for dealing with offense in this specific context, some of which may generalize.
Wei Dai suggests that offense is experienced when people feel they are being treated as being low status.
I would generalize this and say that offense is experienced when people feel they are being treated as being lower status than they feel they are/deserve.
The reason for the generalization: some people get offended by just about everything, it seems, and one way to explain it is a blatant grab for status. It's not that they think they're being treated as low status in an absolute sense necessarily, they just think they should be treated as higher status relative to however they're being treated.
Does anyone know if there are any negative effects of drinking Red Bull or similar energy drinks regularly?
I typically use tea (caffeine) as my stimulant of choice on a day-to-day basis, but the effects aren't that large. During large Magic: the Gathering tournaments, I typically drink a Red Bull or two (depending on how deep into the tournament I go) in order to stay energetic and focused - usually pretty important/helpful, since working on around 4 hours of sleep is the norm for these things.
Red Bull works so well that I'm considering promoting it to semi-daily use, but I'd like to know exactly what I'm buying if I do this.
Edit: After saying it out loud, I just realized that if I use Red Bull regularly, it might lose its effects due to caffeine/whatever dependency. TANSTAAFL strikes again :-/ Still interested in any evidence, though.
(what, you were expecting large RCTs? dream on)
Ahem.
You may say I'm a dreamer,
but I'm not the only one
I hope someday we'll randomize
and control, then we'll have fun
(Mediocre, but it took me two minutes. I'm satisfied.)
What Luke said. Also, signalling "don't mess with me" though perhaps that use isn't relevant here.
Great post!
One potential problem is having too many maximal probability moments at once, depending on the nature of the hacks you're trying to implement. It's an embarrassment of riches, honestly.
For example, I had a maximal probability moment for about 7 or 8 life-hacks after I came back from minicamp, and there was no way I could implement all of them at once because each one would require some amount of concerted effort, so I was better off focusing on a couple at first. When this comes up, often there is a best hack or two to focus on, but the trivial inconvenience of figuring out which ones to focus on may just prevent you from implementing any. I know it's happened to me. When in this situation, just pick something. Anything. It doesn't matter, really. Implementing something is much better than wanting to implement the best something but actually implementing nothing at all, and the marginal gain from implementing the best thing probably isn't worth the risk of implementing nothing.
But then I have to interact with peo- NO BRAIN!!! SHUT UP!!! INTERACTING WITH PEOPLE ISN'T BAD!! AND WE'LL HAVE TO INTERACT WITH THEM ANYWAY WHEN THEY DON'T SHOW UP AND IT WILL BE MUCH WORSE!!! WE ARE MAKING THAT PHONE CALL!!!
The conversation I have with myself every time I implement this strategy. Yes, I yell at my brain. Otherwise the insolent bastard won't listen.