My weekly review habit 2020-06-21T14:40:02.607Z · score: 60 (22 votes)
Wireless is a trap 2020-06-07T15:30:02.352Z · score: 102 (49 votes)
Learning to build conviction 2020-05-16T23:30:01.272Z · score: 31 (16 votes)
College advice for people who are exactly like me 2020-04-13T01:30:02.265Z · score: 44 (15 votes)
College discussion thread 2014-04-01T03:21:58.126Z · score: 5 (6 votes)
Channel factors 2014-03-12T04:52:40.026Z · score: 17 (18 votes)
A critique of effective altruism 2013-12-02T16:53:35.360Z · score: 70 (75 votes)


Comment by benkuhn on My weekly review habit · 2020-06-21T19:45:07.494Z · score: 5 (3 votes) · LW · GW

I have almost no discipline, I've just spent a lot of time making my habits take so little effort that that doesn't matter :) Figuring out how to make it easy for myself to prioritize, and stick to those priorities, every day is actually a common recurring weekly review topic!

(I considered laying out my particular set of todo-related habits, but I don't think they'd be very helpful to anyone else because of how personal they are; the important part is thinking about it a lot from the perspective of "how can I turn this into a habit that doesn't require discipline for me," not whatever idiosyncratic system you end up with.)

Comment by benkuhn on College advice for people who are exactly like me · 2020-04-14T22:29:54.246Z · score: 1 (1 votes) · LW · GW

Thanks! Glad you enjoyed them!

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-12-08T03:45:31.142Z · score: 3 (2 votes) · LW · GW

Thanks, this comment is really useful!

It is generally accepted that you do not need to go to direct sunlight type lux levels indoors to get most of the benefits.... I am not an expert on the latest studies, but if you want to build an indoor experimental setup to get to the bottom of what you really like, my feeling is that installing more than 4000 lux, as a peak capacity in selected areas, would definitely be a waste of money and resources.

Do you have any pointers to where I might go to read the latest studies?

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-12-08T03:43:18.010Z · score: 3 (2 votes) · LW · GW

A typical 60W equivalent LED bulb draws 7.5W and is 90% efficient

Where are you getting this number? As far as I know, the most efficient LEDs today are around 50% efficient.

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-11-27T05:24:28.842Z · score: 3 (3 votes) · LW · GW

I've done a bit of research on this. I think something along these lines is practical. My biggest uncertainty is what a "usable form factor" is (in particular I don't know how much diffusion you'd need, or in what shape, with a very small emitter like this).

FWIW, the Yuji chips are insanely expensive per lumen and seem to be on the low end of efficiency (actually they seem like such a bad deal that I'm worried I'm missing something). The chip that came out on top in my spreadsheet was this Bridgelux chip which is about 1/10 as expensive per lumen and 2x as efficient (but has a larger light-emitting surface and a CRI of 90 instead of 95).

With a low-profile, silent fan and efficient (120+ lm/W) emitters, you shouldn't need that much of a heat sink to keep the fixtures below, say, 60° C.

Disclaimer: I haven't actually finished building my DIY lamp attempt, so this could all turn out to be wrong :)

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-11-27T05:13:35.642Z · score: 16 (7 votes) · LW · GW

I'm one of the friends mentioned. Here's some more anecdata, most importantly including what I think is the current easiest way to try a lumenator (requiring only one fixture instead of huge numbers of bulbs):

I don't have seasonal depression, but after spending a winter in a tropical country, it was extremely noticeable that it's harder for me to focus and I have less willpower when it's dark out (which now starts at 4:15). I bought an extremely bright light and put it right next to my desk, in my peripheral vision while I work. It was an immediate and very noticeable improvement; I estimate it buys me 30-120 minutes of focus per day, depending on how overcast it is.

You can see a before-and-after here, although my phone camera's dynamic range is not good enough to really capture the difference.

Everyone who has visited my house since I got the lightbulb has remarked on how nice it feels, which I was initially surprised by since the bulb is 5600K and not particularly high-CRI.

My current setup is honestly kinda crappy (but still amazing). I'm working on a much nicer DIY version, but in the meantime, here's the stuff I bought:

  • 250-watt corn bulb (~= 40 60W-equivalent bulbs; $100)
    • This bulb has a pretty loud fan (~50 dB at close range); if you don't like noise, you can buy two of the 120-watt version.
  • this E39 fixture ($15)
    • the clamp is too weak to hold the bulb, but you can jerry-rig a support by embedding the socket into the styrofoam packaging that the light comes in :P
    • Also if you use this you'll need to turn it off and on by unplugging as there is no switch on the fixture.
  • these E39 to E26 adapters ($10 for 4)
    • buy if you want to put in an overhead light or traditional lamp
    • note that the bulb does not fit well in many fixtures because it is very large and heavy

(Amazon links are affiliate so I can see whether they are useful to people)

Comment by benkuhn on Effective Altruism from XYZ perspective · 2015-07-12T03:23:49.127Z · score: 7 (7 votes) · LW · GW

Every time I pay for electricity for my computer rather than sending the money to a third world peasant is, according to EA, a failure to maximize utility.

I'm sad that people still think EAers endorse such a naive and short-time-horizon type of optimizing utility. It would obviously not optimize any reasonable utility function over a reasonable timeframe for you to stop paying for electricity for your computer.

More generally, I think most EAers have a much more sophisticated understanding of their values, and the psychology of optimizing them, than you give them credit for. As far as I know, nobody who identifies with EA routinely makes individual decisions between personal purchases and donating. Instead, most people allocate a "charity budget" periodically and make sure they feel ok about both the charity budget and the amount they spend on themselves. Very few people, if any, cut personal spending to the point where they have to worry about, e.g., electricity bills.

Comment by benkuhn on A Proposal for Defeating Moloch in the Prison Industrial Complex · 2015-06-04T07:07:57.292Z · score: 0 (0 votes) · LW · GW

Yes, I glossed over the possibility of prisons bribing judges to screw up the data set. That's because the extremely small influence of marginal data points and the cost of bribing judges would make such a strategy incredibly expensive.

Comment by benkuhn on A Proposal for Defeating Moloch in the Prison Industrial Complex · 2015-06-03T22:49:53.922Z · score: 0 (0 votes) · LW · GW

Yep. Concretely, if you take one year to decide that each negative reform has been negative, the 20-80 trade that the OP posts is a net positive to society if you expect the improvement to stay around for 4 years.
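A rough version of that arithmetic, under the simplifying assumptions (not stated in the OP) that a good and a bad reform have the same yearly magnitude $v$, that 20% of reforms are good and 80% are bad, that bad reforms are reverted after one year, and that good reforms stay in place for $T$ years:

```latex
\mathbb{E}[\text{net benefit per reform}]
  = 0.2 \cdot v \cdot T \;-\; 0.8 \cdot v \cdot 1 \;\ge\; 0
\quad\Longleftrightarrow\quad
T \ge 4
```

Under these assumptions four years is the break-even point, and anything longer is a net gain.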

Comment by benkuhn on A Proposal for Defeating Moloch in the Prison Industrial Complex · 2015-06-03T22:48:52.747Z · score: 2 (2 votes) · LW · GW

To increase p'-p, prisons need to incarcerate prisoners who are less prone to recidivism than predicted. Given that past criminality is an excellent predictor of future criminality, this leads to a perverse incentive towards incarcerating those who were unfairly convicted (wrongly convicted innocents or over-convicted lesser offenders).

If past criminality is a predictor of future criminality, then it should be included in the state's predictive model of recidivism, which would fix the predictions. The actual perverse incentive here is for the prisons to reverse-engineer the predictive model, figure out where it's consistently wrong, and then lobby to incarcerate (relatively) more of those people. Given that (a) data science is not the core competency of prison operators; (b) prisons will make it obvious when they find vulnerabilities in the model; and (c) the model can be re-trained faster than the prison lobbying cycle, it doesn't seem like this perverse incentive is actually that bad.

Comment by benkuhn on LW survey: Effective Altruists and donations · 2015-05-19T04:16:56.671Z · score: 5 (5 votes) · LW · GW

Gwern has a point that it's pretty trivial to run this robustness check yourself if you're worried. I ran it. Changing the $1 to $100 reduces the coefficient of EA from about 1.8 to 1.0 (1.3 sigma), and moving to $1000 reduces it from 1.0 to 0.5 (about two sigma). The coefficient remains highly significant in all cases, and in fact becomes more significant with the higher constant in the log.
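For anyone who wants to see the mechanics of this kind of check, here is a minimal sketch. It is not the regression that was actually run: it assumes a single binary "EA" regressor (in which case the OLS coefficient is just the difference in group means of log(donation + constant)), and the donation amounts are made up for illustration.

```python
import math

def ea_coefficient(ea_donations, other_donations, constant):
    """Difference in mean log(donation + constant) between the two groups.

    With a single binary regressor, this equals the OLS coefficient on it.
    """
    def mean_log(values):
        return sum(math.log(v + constant) for v in values) / len(values)

    return mean_log(ea_donations) - mean_log(other_donations)

# Hypothetical donation amounts in dollars, NOT the survey data.
ea_donations = [0, 50, 500, 2000, 10000]
other_donations = [0, 0, 20, 100, 1000]

# Re-estimate the coefficient for each choice of the constant inside the log.
coefficients = {
    c: ea_coefficient(ea_donations, other_donations, c) for c in (1, 100, 1000)
}
for c, coef in coefficients.items():
    print(f"constant={c}: coefficient={coef:.2f}")
```

A larger constant compresses the gap between small and large donations, so the coefficient shrinks even though the ordering of the two groups does not change.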

Comment by benkuhn on LW survey: Effective Altruists and donations · 2015-05-18T00:00:41.775Z · score: 0 (0 votes) · LW · GW

What do you mean by "dollar amounts become linear"? I haven't seen a random variable referred to as "linear" before (on its own, without reference to another variable as in "y is linear in x").

Comment by benkuhn on Effective effective altruism: Get $400 off your next charity donation · 2015-04-18T21:06:26.058Z · score: 0 (0 votes) · LW · GW

For people who would otherwise not have multiple credit cards, the increase in credit score can be fairly substantial.

In addition to Dorikka's comment, you are not liable for fraudulent charges; usually the intermediating bank is.

Comment by benkuhn on Effective effective altruism: Get $400 off your next charity donation · 2015-04-18T02:39:15.543Z · score: 0 (0 votes) · LW · GW

If you don't want to bother signing up for a bunch of cards, the US Bank Cash+ card gives 5% cash back for charitable donations, up to I think $2000 per quarter. This is a worse percentage but lower-effort and does not ding your credit (as long as you don't miss payments, obvs).

Also, as I understand, it's actually better not to cancel the cards you sign up for (unless they have an annual fee), because "average age of credit line" is a factor in the FICO score. Snip them up, set up auto-pay and fraud alerts and forget about them, but don't cancel them.

Comment by benkuhn on Bitcoin value and small probability / high impact arguments · 2015-04-01T17:59:50.758Z · score: 1 (1 votes) · LW · GW

Of course, the case of Beanie Babies is more comparable to Dogecoin than Bitcoin, and the Dutch tulip story has in reality been quite significantly overblown (see , scrolling down to "Legal Changes"). But then I suppose the reference class of "highly unique things" will necessarily include things each of which has unique properties... :)

I think the way to go here is to assemble a larger set of potentially comparable cases. If you keep finding yourself citing different idiosyncratic distinctions (e.g. Bitcoin was the only member to be not-overblown AND have a hard cap on its supply AND get over 3B market cap AND ...), this suggests that you need to be more inclusive about your reference class in order to get a good estimate.

Comment by benkuhn on Bitcoin value and small probability / high impact arguments · 2015-04-01T16:43:45.678Z · score: 2 (2 votes) · LW · GW

The difference is that it's easy to make more tulips or Beanie Babies, but the maximum number of Bitcoins is fixed.

Yes, this is what I mean by reference class tennis :)

Actually, according to Wikipedia, it's hypothesized that part of the reason that tulip prices rose as quickly as they did was that it took 7-12 years to grow new tulip bulbs (and many new bulb varieties had only a few bulbs in existence). And the Beanie Baby supply was controlled by a single company. So the lines are not that sharp here, though I agree they exist.

Comment by benkuhn on Bitcoin value and small probability / high impact arguments · 2015-04-01T01:56:15.945Z · score: 3 (3 votes) · LW · GW

Is my general line of reasoning correct here, and is the style of reasoning a good style in the general case? I am aware that Eliezer raises points against "small probability multiplied by high impact" reasoning, but the fact is that a rational agent has to have a belief about the probability of any event, and inaction is itself a form of action that could be costly due to missing out on everything; privileging inaction is a good heuristic but only a moderately strong one.

Sometimes, especially in markets and other adversarial situations, inaction is secretly a way to avoid adverse selection.

Even if you're a well-calibrated agent--so that if you randomly pick 20 events with a 5% subjective probability, one of them will happen--the set "all events where someone else is willing to trade on odds more favorable than 5%" is not a random selection of events.

Whether the Bitcoin markets are efficient enough to worry about this is an open question, but it should at least be a signal for you to make your case more robust than pulling a 5% number out of thin air, before you invest. I think the Reddit commenters were reasonable (a sentence I did not expect to type) for pointing this out, albeit uncharitably.

Is "take the inverse of the size of the best-fitting reference class" a decent way of getting a first-order approximation? If not, why not? If yes, what are some heuristics for optimizing it?

In my experience, this simply shifts the debate to which reference class is the best-fitting one, aka reference-class tennis. For instance, a bitcoin detractor could argue that the reference class should also include Beanie Babies, Dutch tulips, and other similar stores of value.

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T08:37:04.475Z · score: 0 (0 votes) · LW · GW

I was told that you only run into severe problems with model accuracy if the base rates are far from 50%. Accuracy feels pretty interpretable and meaningful here as the base rates are 30%-50%.

It depends on how much signal there is in your data. If the base rate is 60%, but there's so little signal in the data that the Bayes-optimal predictions only vary between 55% and 65%, then even a perfect model isn't going to do any better than chance on accuracy. Meanwhile the perfect model will have a poor AUC but at least one that is significantly different from baseline.
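A small sketch of that scenario, assuming (for illustration only) that the Bayes-optimal probabilities are spread evenly between 55% and 65%, and reading "chance" as the always-guess-the-majority-class baseline. The function name `expected_auc` is made up for this example.

```python
# True P(y = 1) for each example: base rate 60%, Bayes-optimal predictions in [0.55, 0.65].
probs = [0.55 + 0.001 * i for i in range(101)]

base_rate = sum(probs) / len(probs)

# The perfect model predicts exactly these probabilities. At a 0.5 threshold it
# labels every example positive, so its expected accuracy equals the base rate.
perfect_accuracy = sum(p if p >= 0.5 else 1 - p for p in probs) / len(probs)
baseline_accuracy = max(base_rate, 1 - base_rate)  # always guess the majority class

def expected_auc(probs):
    """Expected ROC AUC when the scores equal the true probabilities.

    AUC = P(score of a random positive > score of a random negative), ties count half.
    An example with probability p is a positive with weight p and a negative with weight 1 - p.
    """
    pos_total = sum(probs)
    neg_total = sum(1 - p for p in probs)
    auc = 0.0
    for p_pos in probs:
        for p_neg in probs:
            weight = (p_pos / pos_total) * ((1 - p_neg) / neg_total)
            if p_pos > p_neg:
                auc += weight
            elif p_pos == p_neg:
                auc += 0.5 * weight
    return auc

auc = expected_auc(probs)
print(f"perfect model accuracy:     {perfect_accuracy:.3f}")
print(f"majority baseline accuracy: {baseline_accuracy:.3f}")
print(f"perfect model expected AUC: {auc:.3f}")
```

The two accuracies coincide, while the AUC sits a little above 0.5: low, but distinguishable from an uninformative model given enough data.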

[ROC AUC] penalises you for having poor prediction even when you set the sensitivity (the threshold) to a bad parameter. The F Score is pretty simple, and doesn't have this drawback - it's just a combination of some fixed sensitivity and specificity.

I'm not really sure what you mean by this. There's no such thing as an objectively "bad parameter" for sensitivity (well, unless your ROC curve is non-convex); it depends on the relative cost of type I and type II errors.

The F score isn't comparable to AUC since the F score is defined for binary classifiers and the ROC AUC is only really meaningful for probabilistic classifiers (or I guess non-probabilistic score-based ones like uncalibrated SVMs). To get an F score for a binary classifier you have to pick a single threshold, which seems even worse to me than any supposed penalization for picking "bad sensitivities."

there is ongoing research and discussion of this, which is confusing because as far as math goes, it doesn't seem like that hard of a problem.

Because different utility functions can rank models differently, the problem "find a utility-function-independent model statistic that is good at ranking classifiers" is ill-posed. A lot of debates over model scoring statistics seem to cash out to debates over which statistics seem to produce model selection that works well robustly over common real-world utility functions.

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T06:11:07.026Z · score: 0 (0 votes) · LW · GW

I would beware the opinions of individual people on this, as I don't believe it's a very settled question. For instance, my favorite textbook author, Prof. Frank Harrell, thinks 22k is "just barely large enough to do split-sample validation." The adequacy of leave-one-out versus 10-fold depends on your available computational power as well as your sample size. 200 seems certainly not enough to hold out 30% as a test set; there's way too much variance.

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T06:04:56.428Z · score: 0 (0 votes) · LW · GW

Is this ambiguous?

It wasn't clear that this applied to the statement "we couldn't improve on using these" (mainly because I forgot you weren't considering interactions).

I excluded the rater and ratee from the averages.

Okay, that gets rid of most of my worries. I'm not sure it accounts for covariance between correlation estimates of different averages, so I'd be interested in seeing some bootstrapped confidence intervals. But perhaps I'm preempting future posts.
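For reference, a minimal sketch of a percentile bootstrap for a single correlation. The ratings below are invented, and the sketch resamples rows as if they were independent; with the real data you would probably want to resample whole participants instead, since each person contributes many correlated ratings.

```python
import random

def pearson(xs, ys):
    """Pearson correlation, or None if either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5

def bootstrap_ci(xs, ys, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the correlation of xs and ys."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        r_star = pearson([xs[i] for i in idx], [ys[i] for i in idx])
        if r_star is not None:  # skip degenerate resamples with no variance
            estimates.append(r_star)
    estimates.sort()
    lower = estimates[int(alpha / 2 * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper

# Hypothetical ratings on a 1-10 scale, NOT the speed dating data.
attr = [6, 7, 5, 8, 4, 9, 6, 7, 3, 8, 5, 7]
like = [5, 7, 6, 8, 3, 9, 5, 6, 4, 7, 6, 8]

r = pearson(attr, like)
lower, upper = bootstrap_ci(attr, like)
print(f"r = {r:.2f}, 95% bootstrap CI = ({lower:.2f}, {upper:.2f})")
```

Comparing two correlations would use the same loop, but computing the difference between the two correlations inside each resample, so that their covariance is accounted for.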

Also, thinking about it more, you point out a number of differences between correlations, and it's not clear to me that those differences are significant as opposed to just noise.

I'm not sure whether this answers your question, but I used log loss as a measure of accuracy.

I was using "accuracy" in the technical sense, i.e., one minus what you call "Total Error" in your table. (It's unfortunate that Wikipedia says scoring rules like log-loss are a measure of the "accuracy" of predictions! I believe the technical usage, that is, percentage properly classified for a binary classifier, is a more common usage in machine learning.)

The total error of a model is in general not super informative because it depends on the base rate of each class in your data, as well as the threshold that you choose to convert your probabilistic classifier into a binary one. That's why I generally prefer to see likelihood ratios, as you just reported, or ROC AUC scores (which integrate over a range of thresholds).

(Although apparently using AUC for model comparison is questionable too, because it's noisy and incoherent in some circumstances and doesn't penalize miscalibration, so you should use the H measure instead. I mostly like it as a relatively interpretable, utility-function-independent rough index of a model's usefulness/discriminative ability, not a model comparison criterion.)

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T01:46:40.501Z · score: 5 (5 votes) · LW · GW

Nice writeup! A couple comments:

If the dataset contained information on a sufficiently large number of dates for each participant, we could not improve on using [frequency with which members of the opposite sex expressed interest in seeing them again, and the frequency with which the participant expressed interest in seeing members of the opposite sex again].

I don't think this is true. Consider the following model:

  • There is only one feature, eye color. The population is split 50-50 between brown and blue eyes. People want to date other people iff they are of the same eye color. Everyone's ratings of eye color are perfect.

In this case, with only selectivity ratings, you can't do better than 50% accuracy (any person wants to date any other person with 50% probability). But with eye-color ratings, you can get it perfect.
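The toy model can be checked by brute force. This is a sketch under the stated assumptions, with a made-up population of 100 people; because people do not rate themselves, the selectivity-only figure comes out at 50/99 rather than exactly 50%.

```python
from itertools import permutations

# Hypothetical population: 50 brown-eyed and 50 blue-eyed people.
eye_color = ["brown"] * 50 + ["blue"] * 50
people = range(len(eye_color))

def wants_to_date(rater, ratee):
    """In the toy model, interest depends only on matching eye color."""
    return eye_color[rater] == eye_color[ratee]

pairs = list(permutations(people, 2))  # ordered (rater, ratee) pairs, no self-pairs
n_others = len(eye_color) - 1

# The two selectivity features: how often each person says yes, and how often they get a yes.
yes_given = [sum(wants_to_date(i, j) for j in people if j != i) / n_others for i in people]
yes_received = [sum(wants_to_date(j, i) for j in people if j != i) / n_others for i in people]

# Every pair has identical selectivity features, so any model built on them is a constant guess.
distinct_feature_values = {(yes_given[i], yes_received[j]) for i, j in pairs}

base_rate = sum(wants_to_date(i, j) for i, j in pairs) / len(pairs)
selectivity_accuracy = max(base_rate, 1 - base_rate)  # best constant guess
eye_color_accuracy = sum(
    (eye_color[i] == eye_color[j]) == wants_to_date(i, j) for i, j in pairs
) / len(pairs)

print(f"distinct selectivity feature values: {len(distinct_feature_values)}")
print(f"best accuracy from selectivity only: {selectivity_accuracy:.3f}")
print(f"accuracy using eye color:            {eye_color_accuracy:.3f}")
```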

[correlation heatmap]

My impression is that there are significant structural correlations in your data that I don't really understand the impact of. (For instance, at least if everyone rates everyone, I think the correlation of attr with attrAvg is guaranteed to be positive, even if attr is completely random.)

As a result, I'm having a hard time interpreting things like the fact that likeAvg is more strongly correlated with attr than with like. I'm also having a hard time verifying your interpretations of the observations that you make about this heatmap, because I'm not sure to what extent they are confounded by the structural correlations.

It seems implausible to me that each of the 25 correlations between the five traits of attractiveness, fun, ambition, intelligence and sincerity is positive.

Nitpick: There are only 10 distinct such correlations that are not 1 by definition.
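The count follows from the symmetry of the 5×5 correlation matrix:

```latex
25 \;=\; \underbrace{5}_{\text{diagonal entries, all equal to } 1}
   \;+\; 2 \cdot \underbrace{\binom{5}{2}}_{=\,10 \text{ distinct off-diagonal pairs}}
```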

The predictive power that we obtain

Model accuracy actually isn't a great measure of predictive power, because it's sensitive to base rates. (You at least mentioned the base rates, but it's still hard to know how much to correct for the base rates when you're interpreting the goodness of a classifier.)

As far as I know, if you don't have a utility function, scoring classifiers in an interpretable way is still kind of an open problem, but you could look at ROC AUC as a still-interpretable but somewhat nicer summary statistic of model performance.

Comment by benkuhn on Effective Writing · 2014-07-19T06:17:43.941Z · score: 4 (4 votes) · LW · GW

I can't speak to "best," but I suggest reading Style: Lessons in Clarity and Grace by Joseph M. Williams, which crystallizes lots of non-trivial components of "good writing." (The link is to an older, less expensive edition which I used.)

I'll also second "write a lot" and "read a lot." Reading closely and with purpose in mind will speed up the latter (as opposed to the default of throwing books at your brain and hoping to pick up good writing by osmosis). Also, read good writers.

Comment by benkuhn on Too good to be true · 2014-07-12T01:14:35.235Z · score: 7 (7 votes) · LW · GW

In your "critiquing bias" section you allege that 3/43 studies supporting a link is "still surprisingly low". This is wrong; it is actually surprisingly high. If B ~ Binom(43, 0.05), then P(B > 2) ~= 0.36.*

*As calculated by the following Python code:

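from scipy.stats import binom

# B ~ Binom(43, 0.05): studies out of 43 that are significant at the 5% level by chance
b = binom(43, 0.05)
# P(B > 2) = 1 - P(B <= 2)
p_less_than_3 = sum(b.pmf(i) for i in [0, 1, 2])
print(1 - p_less_than_3)

Comment by benkuhn on Relative and Absolute Benefit · 2014-06-19T02:55:30.860Z · score: 4 (4 votes) · LW · GW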

I think you're being a little uncharitable to people who promote interventions that seem positional (e.g. greater educational attainment). It may be true that college degrees are purely for signalling and hence positional goods, but:

(a) it improves aggregate welfare for people to be able to send costly signals, so we shouldn't just get rid of college degrees;

(b) if an intervention improves college graduation rate, it (hopefully) is not doing this by handing out free diplomas, but rather by effecting some change in the subjects that makes them more capable of sending the costly signal of graduating from college, which is an absolute improvement.

Similarly, while height increase has no plausible mechanism for improving absolute wellbeing, some mechanisms for improving absolute wellbeing are measured using height as a proxy (most prominently nutritional status in developing countries).

It should definitely be a warning sign if an intervention seems only to promote a positional good, but it's more complex than it seems to determine what's actually positional.

Comment by benkuhn on Request for concrete AI takeover mechanisms · 2014-04-28T02:03:19.782Z · score: 3 (3 votes) · LW · GW

Fun question.

The takeover vector that leaps to mind is remote code execution vulnerabilities on websites connected to important/sensitive systems. This lets you bootstrap from ability to make HTTP GET requests, to (partial) control over any number of fun targets, like banks or Amazon's shipping.

The things that are one degree away from those (via e.g. an infected thumb drive) are even more exciting:

  • Iranian nuclear centrifuges
  • US nuclear centrifuges
  • the electrical grid
  • hopefully not actual US nuclear weapons, but this should be investigated...

Plausible first attempt: get into a defense contractor's computers and install Thompson's compiler backdoor. Now the AI can stick whatever code it wants on various weapons and blackmail anyone it wants or cause havoc in any number of other ways.

Comment by benkuhn on Open Thread April 16 - April 22, 2014 · 2014-04-17T16:47:00.033Z · score: 1 (1 votes) · LW · GW

Yes, definitely agree that politicians can dupe people into hiring them. Just wanted to raise the point that it's very workplace-dependent. The takeaway is probably "investigate your own corporate environment and figure out whether doing your job well is actually rewarded, because it may not be".

Comment by benkuhn on Open Thread April 16 - April 22, 2014 · 2014-04-17T01:34:33.007Z · score: 7 (7 votes) · LW · GW

I'd beware conflating "interpersonal skills" with "playing politics." For CEO at least (and probably CTO as well), there are other important factors in job performance than raw engineering talent. The subtext of your comment is that the companies you mention were somehow duped into promoting these bad engineers to executive roles, but they might have just decided that their CEO/CTO needed to be good at managing or recruiting or negotiating, and the star engineer team lead didn't have those skills.

Second, I think that the "playing politics" part is true at some organizations but not at others. Perhaps this is an instance of All Debates are Bravery Debates.

My model is something like: having passable interpersonal/communication skills is pretty much a no-brainer, but beyond that there are firms where it just doesn't make that much of a difference, because they're sufficiently good at figuring out who actually deserves credit for what that they can select harder for engineering ability than for politics. However, there are other organizations where this is definitely not the case.

Comment by benkuhn on Beware technological wonderland, or, why text will dominate the future of communication and the Internet · 2014-04-13T21:20:27.810Z · score: 4 (4 votes) · LW · GW

I would expect the relevant factor to be mental, not physical, exertion. Unfortunately that's a lot harder to measure.

Comment by benkuhn on Beware technological wonderland, or, why text will dominate the future of communication and the Internet · 2014-04-13T20:04:02.181Z · score: 1 (1 votes) · LW · GW

Do you have actual data on this? Otherwise I'm very tempted to call typical mind.

Comment by benkuhn on Supply, demand, and technological progress: how might the future unfold? Should we believe in runaway exponential growth? · 2014-04-11T21:36:17.216Z · score: 4 (4 votes) · LW · GW

One story for exponential growth that I don't see you address (though I didn't read the whole post, so forgive me if I'm wrong) is the possibility of multiplicative costs. For example, perhaps genetic sequencing would be a good case study? There seem to be a lot of multiplicative factors there: amount of coverage, time to get one round of coverage, amount of DNA you need to get one round of coverage, ease of extracting/preparing DNA, error probability... With enough such multiplicative factors, you'll get exponential growth in megabases per dollar by applying the same amount of improvement to each factor sequentially (whereas if the factors were additive you'd get linear improvement).
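A toy version of this story, as a sketch: it assumes ten hypothetical factors that all start at 1 and are each doubled once, one after another. The function name and the numbers are illustrative, not taken from any sequencing data.

```python
import math

def improvement_paths(n_factors, gain=2.0):
    """Improve each factor once, in sequence, by the same gain.

    Returns the overall performance after each step when the factors
    combine multiplicatively and when they combine additively.
    """
    factors = [1.0] * n_factors
    multiplicative = [math.prod(factors)]
    additive = [sum(factors)]
    for i in range(n_factors):
        factors[i] *= gain  # same-sized improvement applied to the next factor
        multiplicative.append(math.prod(factors))
        additive.append(sum(factors))
    return multiplicative, additive

multiplicative, additive = improvement_paths(10)
for step, (m, a) in enumerate(zip(multiplicative, additive)):
    print(f"step {step:2d}: multiplicative={m:7.0f}  additive={a:4.0f}")
```

Each step doubles the multiplicative total (exponential in the number of steps) but only adds a constant to the additive total (linear).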

Comment by benkuhn on College discussion thread · 2014-04-01T03:24:22.698Z · score: 5 (5 votes) · LW · GW

If anyone's admitted/visiting Harvard, let me know! I go there and would be happy to meet up and/or answer your questions. There are some other students on here as well.

Comment by benkuhn on Increasing the pool of people with outstanding accomplishments · 2014-03-29T18:57:27.205Z · score: 1 (1 votes) · LW · GW

"outstanding" still has some of the same connotations to me, although less so. But I may be in the minority here.

Comment by benkuhn on Increasing the pool of people with outstanding accomplishments · 2014-03-29T03:43:36.062Z · score: 6 (6 votes) · LW · GW

This is a side-note, but I find it off-putting when people use "impressive" when they want something that's like "awesome" but more formal-sounding. (This usage seems to be fairly common in the EA community, for some reason.) I'm sure that you understand the difference between the two, and that you actually terminally care about people doing awesome things, not manufacturing resume items, but to a casual reader it might sound like the latter.

Comment by benkuhn on Effective Altruism Summit 2014 · 2014-03-20T17:51:27.698Z · score: 3 (3 votes) · LW · GW

I don't think they're comparable? I enjoyed the EA summit for different reasons than I enjoyed CFAR. (I had already been to a CFAR workshop by the time of the summit, and was still very happy I went.)

Comment by benkuhn on Effective Altruism Summit 2014 · 2014-03-20T00:57:11.001Z · score: 9 (9 votes) · LW · GW

Definitely attend this if you can! I went last year and it was an amazing experience. Highly, highly recommended. (Yes, even more highly than donating the price of tickets to your favorite effective charity.)

EDIT: also, feel free to ask me stuff about the participant experience, either here or by private message.

Comment by benkuhn on FiveThirtyEight (Nate Silver) rolls out new blog today, and attempts to teach people Bayes' rule. · 2014-03-19T05:01:50.312Z · score: 0 (0 votes) · LW · GW

Yes, that's what I meant by "very rare": there are situations where it happens, like the model that you gave, but I don't think the ones that happen in real life are likely to contribute a very large effect. You need really insane publication bias to get a large effect there.

Comment by benkuhn on FiveThirtyEight (Nate Silver) rolls out new blog today, and attempts to teach people Bayes' rule. · 2014-03-18T16:25:11.344Z · score: 2 (2 votes) · LW · GW

Yeah. This is an example where using the actual formula is helpful rather than just speaking heuristically. It's actually somewhat difficult to translate from the author's hand-wavy model to the real Bayes' Theorem (and it would be totally opaque to someone who hadn't seen Bayes before).

"Study support for headline" is supposed to be the Bayes factor P(study supports headline | headline is true) / P(study supports headline | headline is false). (Well actually, everything is also conditioned on you hearing about the study.) If you actually think about that, it's clear that it should be very rare to find a study that is more likely to support its conclusion if that conclusion is not true.

EDIT: the author is not actually Nate Silver.

Comment by benkuhn on FiveThirtyEight (Nate Silver) rolls out new blog today, and attempts to teach people Bayes' rule. · 2014-03-18T16:19:16.416Z · score: 1 (1 votes) · LW · GW

No. The odds that the study supports the headline in the second example are 1/16. The formula he gives is

(final opinion on headline) = (initial gut feeling) * (study support for headline)

where the latter two are odds ratios. From context, "final opinion on headline" is pretty clearly supposed to be "opinion on whether the headline is true."
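As a worked instance of that formula, here is a small sketch. The 1/16 study support is the figure from above; the initial gut feeling of 2:1 is a made-up number for illustration, and the function names are not from the article.

```python
from fractions import Fraction

def final_odds(initial_odds, study_support):
    """Odds form of Bayes' rule: posterior odds = prior odds * Bayes factor."""
    return initial_odds * study_support

def odds_to_probability(odds):
    """Convert odds in favor (e.g. 1/8, meaning 1:8) to a probability."""
    return odds / (1 + odds)

initial_gut_feeling = Fraction(2, 1)  # hypothetical prior odds that the headline is true
study_support = Fraction(1, 16)       # Bayes factor from the second example

final_opinion = final_odds(initial_gut_feeling, study_support)
probability_true = odds_to_probability(final_opinion)
print(f"final odds: {final_opinion}, probability headline is true: {probability_true}")
```

With these numbers the final odds are 1:8, i.e. a probability of 1/9 that the headline is true.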

Comment by benkuhn on Channel factors · 2014-03-15T03:11:23.238Z · score: 1 (1 votes) · LW · GW

Yes; I found that linked from Scott's blog today and hadn't previously read it. Hopefully the more explicit angle of "these are happening to you and you should do something about them" is still helpful to people.

Comment by benkuhn on On not diversifying charity · 2014-03-14T16:14:27.172Z · score: 3 (5 votes) · LW · GW

This post has a number of misconceptions that I would like to correct.

It is a truism within the Effective Altruism movement that you should not diversify charity donations.

Not really. Timeless decision theory considerations suggest that you actually should be splitting your donations, because globally we should be splitting our options. I think many other effective altruists take this stance as well. (See below for explanation.)

Nonlinear Utility Function:

If your utility function is nonlinear, this is fine as long as it's differentiable.

Not necessarily. You are leaving out an important point here, which is that this argument only goes through when differentiability implies that on the margin your utility can be well-approximated by a linear function. That only works if the charity's budget is large relative to your contribution. If you're donating large amounts of money (say over $10k to a small org like MIRI or CEA), this may not hold and it's not necessarily irrational to donate to multiple organizations.
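A rough sketch of why the size of the donation relative to the budget matters. Everything here is an assumption for illustration: the logarithmic utility curve, the two budget figures, and the function names. It only compares the exact utility gain from a donation against the linear (marginal) approximation.

```python
import math


def exact_gain(budget, donation):
    """Utility gain under an assumed log utility: log(budget + donation) - log(budget)."""
    return math.log1p(donation / budget)


def linear_gain(budget, donation):
    """First-order approximation: donation times marginal utility, 1 / budget."""
    return donation / budget


def relative_error(budget, donation):
    """How far the linear approximation overshoots the exact gain, as a fraction."""
    exact = exact_gain(budget, donation)
    return (linear_gain(budget, donation) - exact) / exact


donation = 10_000
small_org_budget = 100_000        # hypothetical small organization
large_org_budget = 100_000_000    # hypothetical large organization

print(relative_error(small_org_budget, donation))
print(relative_error(large_org_budget, donation))
```

Under these assumptions the linear approximation is off by several percent for the small organization and by a negligible amount for the large one, which is the sense in which "differentiable" only buys you linearity on a small enough margin.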


Also seconding what Squark pointed out, which is that risk-aversion is a feature of your utility function, not something that you tack onto your utility function. Wikipedia has a good description/picture, but basically, it's rational to be risk averse for gains in widgets if and only if your utility function has diminishing returns to widgets. (It's also worth pointing out, since by your phrasing I'm not sure whether you know this, that literally everyone's utility function is nonlinear in literally everything.)

Since this is the case when "widgets" are donations to a particular charity, the socially optimal global allocation of funds does not have all the funds going to the single highest-EV charity. If you follow timeless decision theory, this suggests that you should make the split that you think would be optimal if everyone else who followed TDT made that split (which accounts for a sizeable amount of donation).
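The risk-aversion point can be checked numerically. This is only a sketch: the square-root utility function and the 50/50 gamble are arbitrary choices standing in for "diminishing returns to widgets", not anything from the post.

```python
import math


def expected_utility(utility, outcomes):
    """Expected utility of a gamble given as (probability, widgets) pairs."""
    return sum(p * utility(w) for p, w in outcomes)


def concave(widgets):
    """Assumed utility with diminishing returns to widgets."""
    return math.sqrt(widgets)


def linear(widgets):
    """Utility with constant returns to widgets."""
    return float(widgets)


gamble = [(0.5, 0), (0.5, 100)]   # 50/50 between 0 and 100 widgets
sure_thing = [(1.0, 50)]          # same expected number of widgets

# With diminishing returns, the sure thing beats the gamble (risk aversion).
print(expected_utility(concave, sure_thing) > expected_utility(concave, gamble))  # True
# With linear utility, the two are exactly tied (risk neutrality).
print(expected_utility(linear, sure_thing) == expected_utility(linear, gamble))   # True
```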

Comment by benkuhn on Channel factors · 2014-03-12T13:15:17.630Z · score: 0 (0 votes) · LW · GW

Poorly. I'm still not very good at it. "Being done with browser tabs" is not a very concrete trigger. I'm actually considering writing a Chrome extension that will do this for me.

Comment by benkuhn on A vote against spaced repetition · 2014-03-10T22:26:53.846Z · score: 38 (40 votes) · LW · GW

Good information! This is really more "a vote against flashcards" than "a vote against spaced repetition", though, at least given your concrete issues with flashcards. Spaced repetition is an algorithm for figuring out when to review material that you want to memorize; flashcards are one thing that spaced repetition is applied to, because it's easy to stick flashcards in a computer. As far as I know, no matter what object-level mnemonic devices you're using, spaced repetition is still strictly better than "when I feel like I'm forgetting" or "right before a test" or any of the other obvious review strategies, if you can deal with the cognitive load of scheduling things, or get a computer to do it for you.

Is there space for some sort of SRS that allows for input of the more helpful types of memorizations that you listed (pictures, Venn diagrams, etc.)?
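To illustrate that the scheduling is separate from what is being reviewed, here is a minimal sketch of a scheduler. The doubling rule is a simplifying assumption, not the algorithm any real SRS program uses, and `Item`, `review`, and `due_items` are hypothetical names. The `content` field is just a string, so it could as easily point at a picture or diagram as hold flashcard text.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Item:
    content: str            # flashcard text, or a path to a picture or diagram
    due: date               # next date on which the item should be reviewed
    interval_days: int = 1  # current gap between reviews


def review(item, remembered, today):
    """Reschedule an item: double the gap on success, reset it on failure."""
    if remembered:
        item.interval_days *= 2
    else:
        item.interval_days = 1
    item.due = today + timedelta(days=item.interval_days)
    return item


def due_items(items, today):
    """Return the items whose review date has arrived."""
    return [item for item in items if item.due <= today]


today = date(2014, 3, 10)
card = Item(content="venn_diagram.png", due=today)  # hypothetical file name
review(card, remembered=True, today=today)
print(card.interval_days, card.due)  # 2 2014-03-12
```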

Comment by benkuhn on [LINK] Latinus rationalior ist. · 2014-03-06T19:11:28.064Z · score: 8 (8 votes) · LW · GW

Nitpick: Lingua latina rationalior est.

Comment by benkuhn on Proportional Giving · 2014-03-05T04:54:16.604Z · score: 0 (0 votes) · LW · GW

It doesn't matter how the utility is distributed, just that it averages out to 10 utils per dollar (such that having the kid is a good buy overall), while the costs rise.

Comment by benkuhn on Strategic choice of identity · 2014-03-04T17:19:00.090Z · score: 3 (3 votes) · LW · GW

You don't actually answer Kaj's criticism, though, which is that the statistical concept of "heritability" does not mean the same thing as the English word "heritability". See the Wiki article for details on how it can be confounded.

Comment by benkuhn on Proportional Giving · 2014-03-04T05:39:02.678Z · score: 0 (0 votes) · LW · GW

Suppose that I weight my own utility such that I'm willing to buy utility at 10 utils per dollar. As gjm noted, this exchange rate should stay constant unless my utility weightings change. But suppose that there are a number of things that provide utility at this rate:

  • playing video games
  • drinking alcohol
  • renting fancy cars
  • owning a house
  • having children

These things become available increasingly late in life, so my consumption would increase even though I spent money rationally (well, rationally_{weighted utilitarianism}) throughout.
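A toy version of this, with entirely made-up ages and costs: every option buys utility at the same 10 utils per dollar, yet spending still climbs with age simply because more options unlock.

```python
UTILS_PER_DOLLAR = 10  # the exchange rate from the comment above

# (option, hypothetical age it becomes available, hypothetical annual cost in dollars)
OPTIONS = [
    ("playing video games", 10, 200),
    ("drinking alcohol", 21, 500),
    ("renting fancy cars", 25, 1_000),
    ("owning a house", 30, 15_000),
    ("having children", 32, 12_000),
]


def annual_spending(age):
    """Total spent per year on every option available at this age."""
    return sum(cost for _, min_age, cost in OPTIONS if age >= min_age)


def annual_utility(age):
    """Utility bought per year; the rate per dollar never changes."""
    return UTILS_PER_DOLLAR * annual_spending(age)


for age in (20, 26, 35):
    print(age, annual_spending(age), annual_utility(age))
```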

Comment by benkuhn on Proportional Giving · 2014-03-04T05:20:46.241Z · score: 1 (1 votes) · LW · GW

Will MacAskill does, and I think many other CEA employees do too. I think Jeff Kaufman and Julia Wise do something more ad hoc, but similar in that they mostly don't treat the percentage donated as relevant--they set their personal allowance based on making their best effort, without taking into account how much they're currently earning. (I'm not 100% sure this is accurate though.) I don't know the giving habits of many other EtGers but I wouldn't be surprised if they used a broadly similar method to Jeff and Julia.

Comment by benkuhn on Proportional Giving · 2014-03-02T23:55:44.682Z · score: 19 (15 votes) · LW · GW

Proportional giving was designed for people who didn't even necessarily want to be intrinsically motivated to give money (e.g. people paying taxes or perhaps tithing to a church). If you want to raise money from such people, proportional donation aligns the incentives much better than a threshold does.

That said, there are a couple reasons why it's still useful for effective altruists:

  • The thing you mentioned about near mode.

  • As you get older, you gain more ability to buy utility at good prices: for instance, kids become increasingly expensive as they age.

  • It sets a norm that's easier for people to follow. For instance, fewer people would join Giving What We Can if the pledge were "give everything above $X" instead of "give 10% of your income".

  • It's more inclusive. Not everyone can give away everything above (e.g.) US$36k. A lot more people can give away 10% of income.

Nevertheless, many effective altruists (e.g. Toby Ord) do practice the fixed-income approach.

Comment by benkuhn on Lifestyle interventions to increase longevity · 2014-03-01T04:17:41.283Z · score: 1 (1 votes) · LW · GW

At least according to Val, activating System 2 requires SNS activity.

Comment by benkuhn on Lifestyle interventions to increase longevity · 2014-02-28T00:28:29.989Z · score: 4 (4 votes) · LW · GW

For clarity, I don't trust Wiseman since I've never read anything by him and my prior for pop-sci is low. Luke's endorsement is a positive update to Wiseman's credibility.

Fully verifying is expensive, but spot-checking is cheap (this post took me about 10 minutes, e.g.). Similarly, most people barely check GiveWell's research at all, but it still matters a lot that it's so transparent, because it's a hard-to-fake signal, and facilitates spot-checking.

Re: music--it looks like you were referring to a different study on the benefits of listening to music than the one I found in Amazon's preview of Wiseman. "Listen to classical music" would have been another high-VoI addition to the OP.

Further studies indicate that "self-selected relaxing music" has the same effect, and that it's probably mediated by general reduction of SNS arousal. This suggests that (a) if you're doing an SNS-heavy task, like difficult math, you may not want to listen to music at the same time; (b) anything else you would expect to move you around the autonomic spectrum should work the same way (e.g. meditation). On the other hand, neither of the studies asked subjects to do anything while listening to music, so it's unclear whether the effect would stay visible. A possibly interesting meta-analysis is here. If doing anything while listening to music makes the effect go away, then I would guess that meditation or the autonomic-spectrum navigation that CFAR teaches is a more efficient way to reduce blood pressure.

I don't know if Wiseman went into any of those in his book, but my take-away is to do some research before installing any new habit.