Posts

How I build and run behavioral interviews 2024-02-26T05:50:05.328Z
What should we do about network-effect monopolies? 2023-03-06T00:50:07.554Z
Why and how to write things on the Internet 2022-12-29T22:40:04.636Z
Staring into the abyss as a core life skill 2022-12-22T15:30:05.093Z
Be less scared of overconfidence 2022-11-30T15:20:07.738Z
Searching for outliers 2022-03-21T02:40:17.296Z
What I've been doing instead of writing 2021-09-12T12:40:14.532Z
My favorite essays of life advice 2020-12-23T23:00:10.832Z
To listen well, get curious 2020-12-13T00:20:09.608Z
Tips for the most immersive video calls 2020-09-27T20:36:51.422Z
Tools for keeping focused 2020-08-05T02:10:08.707Z
Attention is your scarcest resource 2020-07-30T01:00:13.705Z
Be impatient 2020-07-24T01:30:06.341Z
What should we do about network-effect monopolies? 2020-07-06T00:40:02.413Z
My weekly review habit 2020-06-21T14:40:02.607Z
Wireless is a trap 2020-06-07T15:30:02.352Z
Learning to build conviction 2020-05-16T23:30:01.272Z
College advice for people who are exactly like me 2020-04-13T01:30:02.265Z
College discussion thread 2014-04-01T03:21:58.126Z
Channel factors 2014-03-12T04:52:40.026Z
A critique of effective altruism 2013-12-02T16:53:35.360Z

Comments

Comment by benkuhn on What I've been doing instead of writing · 2021-09-13T01:19:43.504Z · LW · GW

Oops—I forgot that this was going to be auto-crossposted to LW and probably would have prevented that if I remembered, since it's weirdly meta (it was intended mostly as a placeholder for people visiting my website and wondering why there were no recent posts). I guess I'll leave it here now since it got a lot more upvotes than I expected though :)

Comment by benkuhn on My favorite essays of life advice · 2020-12-25T14:12:47.436Z · LW · GW

I think it's still important to note that "not giving up" can lead not just to lack of success, but also to value destruction (Pets.com; Theranos; WeWork). 

If you're going to interpret the original "don't give up" advice so literally and blindly that "no matter what the challenges are I'm going to figure them out" includes committing massive fraud, then yes, it will be bad advice for you. That's a really remarkably uncharitable interpretation.

Comment by benkuhn on My favorite essays of life advice · 2020-12-25T03:09:52.270Z · LW · GW

Not sure if this is your typo or a LW bug, but "essay" appears not to actually be hyperlinked?

Comment by benkuhn on My favorite essays of life advice · 2020-12-24T15:38:28.101Z · LW · GW

I don't think founder/investor class conflict makes that much sense as an explanation for that. It's easy to imagine a world in which investors wanted their money returned when the team updates downwards on their likelihood of success. (In fact, that sometimes happens! I don't know whether Sam would do that but my guess is only if the founders want to give up.)

I also don't think Sam, at least, glorifies pivots or ignores opportunity cost. For instance, from the first lecture of his startup course:

And pivots are supposed to be great, the more pivots the better. So this isn't totally wrong, things do evolve in ways you can't totally predict.... But the pendulum has swung way out of whack. A bad idea is still bad and the pivot-happy world we're in today feels suboptimal.... There are exceptions, of course, but most great companies start with a great idea, not a pivot.... [I]f you look at the track record of pivots, they don't become big companies. I myself used to believe ideas didn't matter that much, but I'm very sure that's wrong now.

---

More generally, I agree that this claim clashes strongly with some rationalists' worldviews, and it's plausible that it just increases the variance of outcomes and not the mean. But given that outcomes are power-law distributed (mean is proportional to variance!), the number of people endorsing it from on top of a giant pile of utility, and the perhaps surprisingly low number of highly successful rationalists, I'd recommend rationalists treat it with curiosity instead of dismissiveness.

Comment by benkuhn on My favorite essays of life advice · 2020-12-24T14:48:18.524Z · LW · GW

Oops, thanks! Added a link to the signup form on my site. (And fixed my RSS rendering to not do forms like that in the future.)

Comment by benkuhn on To listen well, get curious · 2020-12-13T18:47:04.854Z · LW · GW

Yes, and I think the different words were useful!

You're repeating / elaborating on things that are in the post, but were not particularly emphasized. I didn't emphasize them because I've personally had the "deeply internalized felt sense of how easy it is for humans to misunderstand each-other" that you describe for a long time, and only more recently got the "be curious" part, and so I emphasized that because it was the missing piece for me (and didn't totally realize the degree to which the other part was load-bearing / could be the missing piece for others).

Comment by benkuhn on Tips for the most immersive video calls · 2020-10-08T01:25:55.303Z · LW · GW

Yeah, if you don't want to DIY it, you can apparently also use a teleprompter for a similar effect. I haven't tried one, but am curious to! The only trade-off I can think of (other than cost and being cumbersome) is that you probably sacrifice some image quality from all the reflection shenanigans, but not sure how big of a deal that would be.

Comment by benkuhn on Tips for the most immersive video calls · 2020-09-29T21:43:13.871Z · LW · GW

  1. Even if decent, I'd be surprised if the microphone compares well to the BoomPro (it's farther from your mouth so will pick up more noise, and most non-standalone mics are optimized for low cost not quality)
  2. I think most earbuds are a lot worse than Pixel USB-C buds for calls: it looks like these have a relatively nice microphone, and relatively poor noise isolation (= allow you to hear your own voice much better)

Comment by benkuhn on Tips for the most immersive video calls · 2020-09-28T11:26:24.983Z · LW · GW

Oh OK, then the audio improvements the post describes will work fine as-is (except for the non-headset mic) and the video ones are probably not worth it. And I guess the networking advice becomes "invest in properly debugging your wifi" instead of running a cable.

Comment by benkuhn on Tips for the most immersive video calls · 2020-09-28T01:25:44.017Z · LW · GW

I assume you don't care about video if you'll be walking around? (I think you lose a lot by giving up video, but de gustibus non disputandum.)

For audio, you could possibly wireless-ify the audio setup in the article with something like an AptX Low Latency to 3.5mm adapter and switching from the BoomPro to the Antlion ModMic Wireless (since I don't think those adapters support mics).

Note that Bluetooth sucks and I haven't tested this so it may not work as well as it sounds like it should. But if you want to keep open-back headphones and a good mic, that's probably your best bet.

Comment by benkuhn on Tools for keeping focused · 2020-08-31T22:29:21.403Z · LW · GW

Yay, happy to hear it was helpful!

Comment by benkuhn on Tools for keeping focused · 2020-08-07T02:01:13.252Z · LW · GW

FYI, you can also disable the red circle from within the Slack preferences (maybe you already knew this, but if not, sorry the post wasn't more explicit!)

Comment by benkuhn on Tools for keeping focused · 2020-08-05T14:00:13.850Z · LW · GW

Thanks, fixed!

Comment by benkuhn on Developmental Stages of GPTs · 2020-07-28T11:32:44.217Z · LW · GW

I'm confused about the "because I could not stop for death" example. You cite it as an example of GPT-3 developing "the sense of going somewhere, at least on the topic level," but it seems to have just memorized the Dickinson poem word for word; the completion looks identical to the original poem except for some punctuation.

(To be fair to GPT-3, I also never remember where Dickinson puts her em dashes.)

Comment by benkuhn on What should we do about network-effect monopolies? · 2020-07-13T10:05:46.794Z · LW · GW

Grubhub used to be over 50% but is now behind Doordash, so maybe doesn't qualify.

Apple has a monopoly on iOS app distribution (aside from rooted phones) and is using it to extract rents, which is what the link is about.

Firefox has 4% market share compared to Chrome's 65%.

Amazon has 40-50% of the ecommerce market depending on which stats you trust.

Google Search has 85%+ market share.

Comment by benkuhn on What should we do about network-effect monopolies? · 2020-07-13T09:58:43.524Z · LW · GW

https://www.ftc.gov/tips-advice/competition-guidance/guide-antitrust-laws/single-firm-conduct/monopolization-defined

Courts do not require a literal monopoly before applying rules for single firm conduct; that term is used as shorthand for a firm with significant and durable market power — that is, the long term ability to raise price or exclude competitors. That is how that term is used here: a "monopolist" is a firm with significant and durable market power. Courts look at the firm's market share, but typically do not find monopoly power if the firm (or a group of firms acting in concert) has less than 50 percent of the sales of a particular product or service within a certain geographic area. Some courts have required much higher percentages.
Comment by benkuhn on What should we do about network-effect monopolies? · 2020-07-13T09:55:37.829Z · LW · GW

Oops this was super unclear, sorry—the thing that ties together all of these crappy websites isn't money issues, just that they're the winner in a network-effect-based business, thus have no plausible competitors and no incentive to become more useful / less crappy.

Comment by benkuhn on What should we do about network-effect monopolies? · 2020-07-06T17:42:59.970Z · LW · GW

What is a "real average"?

Comment by benkuhn on My weekly review habit · 2020-06-21T19:45:07.494Z · LW · GW

I have almost no discipline, I've just spent a lot of time making my habits take so little effort that that doesn't matter :) Figuring out how to make it easy for myself to prioritize, and stick to those priorities, every day is actually a common recurring weekly review topic!

(I considered laying out my particular set of todo-related habits, but I don't think they'd be very helpful to anyone else because of how personal it is—the important part is thinking about it a lot from the perspective of "how can I turn this into a habit that doesn't require discipline for me," not whatever idiosyncratic system you end up with.)

Comment by benkuhn on College advice for people who are exactly like me · 2020-04-14T22:29:54.246Z · LW · GW

Thanks! Glad you enjoyed them!

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-12-08T03:45:31.142Z · LW · GW

Thanks, this comment is really useful!

It is generally accepted that you do not need to go to direct sunlight type lux levels indoors to get most of the benefits.... I am not an expert on the latest studies, but if you want to build an indoor experimental setup to get to the bottom of what you really like, my feeling is that installing more than 4000 lux, as a peak capacity in selected areas, would definitely be a waste of money and resources.

Do you have any pointers to where I might go to read the latest studies?

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-12-08T03:43:18.010Z · LW · GW

A typical 60W equivalent LED bulb draws 7.5W and is 90% efficient

Where are you getting this number? As far as I know, the most efficient LEDs today are around 50% efficient.

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-11-27T05:24:28.842Z · LW · GW

I've done a bit of research on this. I think something along these lines is practical. My biggest uncertainty is what a "usable form factor" is (in particular I don't know how much diffusion you'd need, or in what shape, with a very small emitter like this).

FWIW, the Yuji chips are insanely expensive per lumen and seem to be on the low end of efficiency (actually they seem like such a bad deal that I'm worried I'm missing something). The chip that came out on top in my spreadsheet was this Bridgelux chip which is about 1/10 as expensive per lumen and 2x as efficient (but has a larger light-emitting surface and a CRI of 90 instead of 95).

With a low-profile, silent fan and efficient (120+ lm/W) emitters, you shouldn't need that much of a heat sink to keep the fixtures below, say, 60° C.

Disclaimer: I haven't actually finished building my DIY lamp attempt, so this could all turn out to be wrong :)

Comment by benkuhn on Could someone please start a bright home lighting company? · 2019-11-27T05:13:35.642Z · LW · GW

I'm one of the friends mentioned. Here's some more anecdata, most importantly including what I think is the current easiest way to try a lumenator (requiring only one fixture instead of huge numbers of bulbs):

I don't have seasonal depression, but after spending a winter in a tropical country, it was extremely noticeable that it's harder for me to focus and I have less willpower when it's dark out (which now starts at 4:15). I bought an extremely bright light and put it right next to my desk, in my peripheral vision while I work. It was an immediate and very noticeable improvement; I estimate it buys me 30-120 minutes of focus per day, depending on how overcast it is.

You can see a before-and-after here, although my phone camera's dynamic range is not good enough to really capture the difference.

Everyone who has visited my house since I got the lightbulb has remarked on how nice it feels, which I was initially surprised by since the bulb is 5600k and not particularly high-CRI.

My current setup is honestly kinda crappy (but still amazing). I'm working on a much nicer DIY version, but in the meantime, here's the stuff I bought:

  • 250-watt corn bulb (~= 40 60W-equivalent bulbs; $100)
    • This bulb has a pretty loud fan (~50 dB at close range); if you don't like noise, you can buy two of the 120-watt version.
  • this E39 fixture ($15)
    • the clamp is too weak to hold the bulb, but you can jerry-rig a support by embedding the socket into the styrofoam packaging that the light comes in :P
    • Also if you use this you'll need to turn it off and on by unplugging as there is no switch on the fixture.
  • these E39 to E26 adapters ($10 for 4)
    • buy if you want to put in an overhead light or traditional lamp
    • note that the bulb does not fit well in many fixtures because it is very large and heavy

(Amazon links are affiliate so I can see whether they are useful to people)

Comment by benkuhn on Effective Altruism from XYZ perspective · 2015-07-12T03:23:49.127Z · LW · GW

Every time I pay for electricity for my computer rather than sending the money to a third world peasant is, according to EA, a failure to maximize utility.

I'm sad that people still think EAers endorse such a naive and short-time-horizon type of optimizing utility. It would obviously not optimize any reasonable utility function over a reasonable timeframe for you to stop paying for electricity for your computer.

More generally, I think most EAers have a much more sophisticated understanding of their values, and the psychology of optimizing them, than you give them credit for. As far as I know, nobody who identifies with EA routinely makes individual decisions between personal purchases and donating. Instead, most people allocate a "charity budget" periodically and make sure they feel ok about both the charity budget and the amount they spend on themselves. Very few people, if any, cut personal spending to the point where they have to worry about, e.g., electricity bills.

Comment by benkuhn on A Proposal for Defeating Moloch in the Prison Industrial Complex · 2015-06-04T07:07:57.292Z · LW · GW

Yes, I glossed over the possibility of prisons bribing judges to screw up the data set. That's because the extremely small influence of marginal data points and the cost of bribing judges would make such a strategy incredibly expensive.

Comment by benkuhn on A Proposal for Defeating Moloch in the Prison Industrial Complex · 2015-06-03T22:49:53.922Z · LW · GW

Yep. Concretely, if you take one year to decide that each negative reform has been negative, the 20-80 trade that the OP posts is a net positive to society if you expect the improvement to stay around for 4 years.
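A rough version of the arithmetic behind that figure, as a sketch under assumptions the thread does not state: every reform changes yearly welfare by the same magnitude $b$, the "20-80 trade" means 20% of reforms are good and 80% are bad, a bad reform is reverted after one year, and a good reform persists for $T$ years.

```latex
\underbrace{0.2 \cdot b \cdot T}_{\text{gain from good reforms}}
\;-\;
\underbrace{0.8 \cdot b \cdot 1}_{\text{loss from bad reforms}}
\;\ge\; 0
\quad\Longleftrightarrow\quad
T \;\ge\; \frac{0.8}{0.2} = 4
```

Under those assumptions the trade breaks even at 4 years and is strictly positive if improvements last longer; unequal magnitudes for good and bad reforms would shift the threshold.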

Comment by benkuhn on A Proposal for Defeating Moloch in the Prison Industrial Complex · 2015-06-03T22:48:52.747Z · LW · GW

To increase p'-p, prisons need to incarcerate prisoners which are less prone to recidivism than predicted. Given that past criminality is an excellent predictor of future criminality, this leads to a perverse incentive towards incarcerating those who were unfairly convicted (wrongly convicted innocents or over-convicted lesser offenders).

If past criminality is a predictor of future criminality, then it should be included in the state's predictive model of recidivism, which would fix the predictions. The actual perverse incentive here is for the prisons to reverse-engineer the predictive model, figure out where it's consistently wrong, and then lobby to incarcerate (relatively) more of those people. Given that (a) data science is not the core competency of prison operators; (b) prisons will make it obvious when they find vulnerabilities in the model; and (c) the model can be re-trained faster than the prison lobbying cycle, it doesn't seem like this perverse incentive is actually that bad.

Comment by benkuhn on LW survey: Effective Altruists and donations · 2015-05-19T04:16:56.671Z · LW · GW

Gwern has a point that it's pretty trivial to run this robustness check yourself if you're worried. I ran it. Changing the $1 to $100 reduces the coefficient of EA from about 1.8 to 1.0 (1.3 sigma), and moving to $1000 reduces it from 1.0 to 0.5 (about two sigma). The coefficient remains highly significant in all cases, and in fact becomes more significant with the higher constant in the log.

Comment by benkuhn on LW survey: Effective Altruists and donations · 2015-05-18T00:00:41.775Z · LW · GW

What do you mean by "dollar amounts become linear"? I haven't seen a random variable referred to as "linear" before (on its own, without reference to another variable as in "y is linear in x").

Comment by benkuhn on Effective effective altruism: Get $400 off your next charity donation · 2015-04-18T21:06:26.058Z · LW · GW

For people who would otherwise not have multiple credit cards, the increase in credit score can be fairly substantial.

In addition to Dorikka's comment, you are not liable for fraudulent charges; usually the intermediating bank is.

Comment by benkuhn on Effective effective altruism: Get $400 off your next charity donation · 2015-04-18T02:39:15.543Z · LW · GW

If you don't want to bother signing up for a bunch of cards, the US Bank Cash+ card gives 5% cash back for charitable donations, up to I think $2000 per quarter. This is a worse percentage but lower-effort and does not ding your credit (as long as you don't miss payments, obvs).

Also, as I understand, it's actually better not to cancel the cards you sign up for (unless they have an annual fee), because "average age of credit line" is a factor in the FICO score. Snip them up, set up auto-pay and fraud alerts and forget about them, but don't cancel them.

Comment by benkuhn on Bitcoin value and small probability / high impact arguments · 2015-04-01T17:59:50.758Z · LW · GW

Of course, the case of Beanie Babies is more comparable to Dogecoin than Bitcoin, and the Dutch tulip story has in reality been quite significantly overblown (see http://en.wikipedia.org/wiki/Tulip_mania#Modern_views , scrolling down to "Legal Changes"). But then I suppose the reference class of "highly unique things" will necessarily include things each of which has unique properties... :)

I think the way to go here is to assemble a larger set of potentially comparable cases. If you keep finding yourself citing different idiosyncratic distinctions (e.g. Bitcoin was the only member to be not-overblown AND have a hard cap on its supply AND get over 3B market cap AND ...), this suggests that you need to be more inclusive about your reference class in order to get a good estimate.

Comment by benkuhn on Bitcoin value and small probability / high impact arguments · 2015-04-01T16:43:45.678Z · LW · GW

The difference is that it's easy to make more tulips or Beanie Babies, but the maximum number of Bitcoins is fixed.

Yes, this is what I mean by reference class tennis :)

Actually, according to Wikipedia, it's hypothesized that part of the reason that tulip prices rose as quickly as they did was that it took 7-12 years to grow new tulip bulbs (and many new bulb varieties had only a few bulbs in existence). And the Beanie Baby supply was controlled by a single company. So the lines are not that sharp here, though I agree they exist.

Comment by benkuhn on Bitcoin value and small probability / high impact arguments · 2015-04-01T01:56:15.945Z · LW · GW

Is my general line of reasoning correct here, and is the style of reasoning a good style in the general case? I am aware that Eliezer raises points against "small probability multiplied by high impact" reasoning, but the fact is that a rational agent has to have a belief about the probability of any event, and inaction is itself a form of action that could be costly due to missing out on everything; privileging inaction is a good heuristic but only a moderately strong one.

Sometimes, especially in markets and other adversarial situations, inaction is secretly a way to avoid adverse selection.

Even if you're a well-calibrated agent--so that if you randomly pick 20 events with a 5% subjective probability, one of them will happen--the set "all events where someone else is willing to trade on odds more favorable than 5%" is not a random selection of events.
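A toy simulation can make the selection effect concrete. Everything in it is an assumption for illustration: each event gets a hidden true probability drawn from a Beta(1, 19) distribution (mean 5%), the bettor assigns 5% to every event, and a hypothetical better-informed counterparty only takes the other side when the true probability is below 5%.

```python
import numpy as np

rng = np.random.default_rng(0)
n_events = 200_000

# Hypothetical setup: hidden true probabilities with mean 0.05.
# Assigning 5% to every event is therefore calibrated on average.
true_prob = rng.beta(1, 19, size=n_events)
happened = rng.random(n_events) < true_prob

# Assumed counterparty: knows the true probability and only agrees to
# take the other side of the bet when the event is less likely than 5%.
offered = true_prob < 0.05

overall_rate = happened.mean()           # frequency over all events
offered_rate = happened[offered].mean()  # frequency over events you can trade on

print(f"hit rate over all events:     {overall_rate:.3f}")
print(f"hit rate over offered events: {offered_rate:.3f}")
```

In this toy model the 5% estimate holds up across all events but overstates the frequency among the events where a trade is actually available, which is the adverse-selection worry in miniature.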

Whether the Bitcoin markets are efficient enough to worry about this is an open question, but it should at least be a signal for you to make your case more robust than pulling a 5% number out of thin air, before you invest. I think the Reddit commenters were reasonable (a sentence I did not expect to type) for pointing this out, albeit uncharitably.

Is "take the inverse of the size of the best-fitting reference class" a decent way of getting a first-order approximation? If not, why not? If yes, what are some heuristics for optimizing it?

In my experience, this simply shifts the debate to which reference class is the best-fitting one, aka reference-class tennis. For instance, a bitcoin detractor could argue that the reference class should also include Beanie Babies, Dutch tulips, and other similar stores of value.

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T08:37:04.475Z · LW · GW

I was told that you only run into severe problems with model accuracy if the base rates are far from 50%. Accuracy feels pretty interpretable and meaningful here as the base rates are 30%-50%.

It depends on how much signal there is in your data. If the base rate is 60%, but there's so little signal in the data that the Bayes-optimal predictions only vary between 55% and 65%, then even a perfect model isn't going to do any better than chance on accuracy. Meanwhile the perfect model will have a poor AUC but at least one that is significantly different from baseline.
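A minimal simulation of that scenario, assuming synthetic data in which the Bayes-optimal probabilities are uniform between 0.55 and 0.65 (the uniform shape, the sample size, and the helper name `roc_auc` are illustrative choices, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data: the Bayes-optimal probability of each example lies
# between 0.55 and 0.65 (average 0.60), and labels are drawn from it.
p_true = rng.uniform(0.55, 0.65, size=n)
y = rng.random(n) < p_true

# The "perfect" model outputs p_true. Thresholded at 0.5 it always predicts
# the positive class, so its accuracy equals the majority-class baseline.
perfect_accuracy = ((p_true > 0.5) == y).mean()
baseline_accuracy = max(y.mean(), 1 - y.mean())


def roc_auc(scores, labels):
    """ROC AUC via the rank-sum (Mann-Whitney U) identity; assumes no tied scores."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


auc = roc_auc(p_true, y)
print(perfect_accuracy, baseline_accuracy, auc)
```

With this setup the accuracy of the perfect model is identical to always guessing the majority class, while its AUC sits modestly but measurably above 0.5.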

[ROC AUC] penalises you for having poor prediction even when you set the sensitivity (the threshold) to a bad parameter. The F Score is pretty simple, and doesn't have this drawback - it's just a combination of some fixed sensitivity and specificity.

I'm not really sure what you mean by this. There's no such thing as an objectively "bad parameter" for sensitivity (well, unless your ROC curve is non-convex); it depends on the relative cost of type I and type II errors.

The F score isn't comparable to AUC since the F score is defined for binary classifiers and the ROC AUC is only really meaningful for probabilistic classifiers (or I guess non-probabilistic score-based ones like uncalibrated SVMs). To get an F score for a binary classifier you have to pick a single threshold, which seems even worse to me than any supposed penalization for picking "bad sensitivities."

there is ongoing research and discussion of this, which is confusing because as far as math goes, it doesn't seem like that hard of a problem.

Because different utility functions can rank models differently, the problem "find a utility-function-independent model statistic that is good at ranking classifiers" is ill-posed. A lot of debates over model scoring statistics seem to cash out to debates over which statistics seem to produce model selection that works well robustly over common real-world utility functions.

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T06:11:07.026Z · LW · GW

I would beware the opinions of individual people on this, as I don't believe it's a very settled question. For instance, my favorite textbook author, Prof. Frank Harrell, thinks 22k is "just barely large enough to do split-sample validation." The adequacy of leave-one-out versus 10-fold depends on your available computational power as well as your sample size. 200 seems certainly not enough to hold out 30% as a test set; there's way too much variance.

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T06:04:56.428Z · LW · GW

Is this ambiguous?

It wasn't clear that this applied to the statement "we couldn't improve on using these" (mainly because I forgot you weren't considering interactions).

I excluded the rater and ratee from the averages.

Okay, that gets rid of most of my worries. I'm not sure it accounts for covariance between correlation estimates of different averages, so I'd be interested in seeing some bootstrapped confidence intervals. But perhaps I'm preempting future posts.

Also, thinking about it more, you point out a number of differences between correlations, and it's not clear to me that those differences are significant as opposed to just noise.

I'm not sure whether this answers your question, but I used log loss as a measure of accuracy.

I was using "accuracy" in the technical sense, i.e., one minus what you call "Total Error" in your table. (It's unfortunate that Wikipedia says scoring rules like log-loss are a measure of the "accuracy" of predictions! I believe the technical usage, that is, percentage properly classified for a binary classifier, is a more common usage in machine learning.)

The total error of a model is in general not super informative because it depends on the base rate of each class in your data, as well as the threshold that you choose to convert your probabilistic classifier into a binary one. That's why I generally prefer to see likelihood ratios, as you just reported, or ROC AUC scores (which integrate over a range of thresholds).

(Although apparently using AUC for model comparison is questionable too, because it's noisy and incoherent in some circumstances and doesn't penalize miscalibration, so you should use the H measure instead. I mostly like it as a relatively interpretable, utility-function-independent rough index of a model's usefulness/discriminative ability, not a model comparison criterion.)

Comment by benkuhn on Using machine learning to predict romantic compatibility: empirical results · 2014-12-18T01:46:40.501Z · LW · GW

Nice writeup! A couple comments:

If the dataset contained information on a sufficiently large number of dates for each participant, we could not improve on using [the frequency with which members of the opposite sex expressed interest in seeing them again, and the frequency with which the participant expressed interest in seeing members of the opposite sex again].

I don't think this is true. Consider the following model:

  • There is only one feature, eye color. The population is split 50-50 between brown and blue eyes. People want to date other people iff they are of the same eye color. Everyone's ratings of eye color are perfect.

In this case, with only selectivity ratings, you can't do better than 50% accuracy (any person wants to date any other person with 50% probability). But with eye-color ratings, you can get it perfect.
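The toy model is small enough to check directly. The sketch below assumes a hypothetical population of 1,000 people split exactly 50-50; the variable names are illustrative.

```python
import numpy as np

n = 1000  # hypothetical population size, split exactly 50-50
eye = np.repeat([0, 1], n // 2)  # 0 = brown, 1 = blue

# wants[i, j] is True when person i wants to date person j (same eye color).
wants = eye[:, None] == eye[None, :]
off_diag = ~np.eye(n, dtype=bool)  # ignore self-pairs

# Frequency features: how often each person says yes, and is said yes to.
selectivity = (wants & off_diag).sum(axis=1) / (n - 1)
desirability = (wants & off_diag).sum(axis=0) / (n - 1)

# Both features are identical for every person, so any rule built only on
# them makes the same prediction for every pair: always yes or always no.
yes_rate = wants[off_diag].mean()
frequency_only_accuracy = max(yes_rate, 1 - yes_rate)

# A rule that compares eye colors reproduces the decisions exactly,
# by construction of the toy model.
eye_rule = eye[:, None] == eye[None, :]
eye_accuracy = (eye_rule == wants)[off_diag].mean()

print(frequency_only_accuracy, eye_accuracy)
```

Since the frequency features carry no information that distinguishes one pair from another, the best they can do here is the constant prediction, at roughly 50% accuracy.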

[correlation heatmap]

My impression is that there are significant structural correlations in your data that I don't really understand the impact of. (For instance, at least if everyone rates everyone, I think the correlation of attr with attrAvg is guaranteed to be positive, even if attr is completely random.)
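One way to see this kind of structural correlation is with pure-noise ratings. The sketch assumes that the average for each ratee includes the rater's own rating; the sizes and names are hypothetical, and a leave-one-out average is computed for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n_raters, n_ratees = 20, 500  # hypothetical sizes

# Pure-noise ratings: attr[i, j] is rater i's rating of ratee j.
attr = rng.normal(size=(n_raters, n_ratees))

# Average rating per ratee, INCLUDING each rater's own rating (assumption).
attr_avg = np.broadcast_to(attr.mean(axis=0), attr.shape)
corr_incl = np.corrcoef(attr.ravel(), attr_avg.ravel())[0, 1]

# Leave-one-out average: exclude the rater's own rating from the mean.
loo_avg = (attr.sum(axis=0) - attr) / (n_raters - 1)
corr_loo = np.corrcoef(attr.ravel(), loo_avg.ravel())[0, 1]

print(f"including own rating: {corr_incl:.3f}")
print(f"leave-one-out:        {corr_loo:.3f}")
```

With the rater's own rating included, the sample covariance reduces to the variance of the per-ratee means, so it cannot be negative even for random data; excluding the rater removes that built-in component.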

As a result, I'm having a hard time interpreting things like the fact that likeAvg is more strongly correlated with attr than with like. I'm also having a hard time verifying your interpretations of the observations that you make about this heatmap, because I'm not sure to what extent they are confounded by the structural correlations.

It seems implausible to me that each of the 25 correlations between the five traits of attractiveness, fun, ambition, intelligence and sincerity is positive.

Nitpick: There are only 10 distinct such correlations that are not 1 by definition.

The predictive power that we obtain

Model accuracy isn't actually a great measure of predictive power, because it's sensitive to base rates. (You at least mentioned the base rates, but it's still hard to know how much to correct for the base rates when you're interpreting the goodness of a classifier.)

As far as I know, if you don't have a utility function, scoring classifiers in an interpretable way is still kind of an open problem, but you could look at ROC AUC as a still-interpretable but somewhat nicer summary statistic of model performance.

Comment by benkuhn on Effective Writing · 2014-07-19T06:17:43.941Z · LW · GW

I can't speak to "best," but I suggest reading Style: Lessons in Clarity and Grace by Joseph M. Williams, which crystallizes lots of non-trivial components of "good writing." (The link is to an older, less expensive edition which I used.)

I'll also second "write a lot" and "read a lot." Reading closely and with purpose in mind will speed up the latter (as opposed to the default of throwing books at your brain and hoping to pick up good writing by osmosis). Also, read good writers.

Comment by benkuhn on Too good to be true · 2014-07-12T01:14:35.235Z · LW · GW

In your "critiquing bias" section you allege that 3/43 studies supporting a link is "still surprisingly low". This is wrong; it is actually surprisingly high. If B ~ Binom(43, 0.05), then P(B > 2) ~= 0.36.*

*As calculated by the following Python code:

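from scipy.stats import binom
# B ~ Binom(43, 0.05): 43 studies, each with a 5% chance of a positive result
b = binom(43, 0.05)
# P(B <= 2), summed from the probability mass function
p_less_than_3 = sum(b.pmf(i) for i in [0, 1, 2])
# P(B > 2); print() as a function so this runs on Python 3
print(1 - p_less_than_3)

Comment by benkuhn on Relative and Absolute Benefit · 2014-06-19T02:55:30.860Z · LW · GW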

I think you're being a little uncharitable to people who promote interventions that seem positional (e.g. greater educational attainment). It may be true that college degrees are purely for signalling and hence positional goods, but:

(a) it improves aggregate welfare for people to be able to send costly signals, so we shouldn't just get rid of college degrees;

(b) if an intervention improves college graduation rate, it (hopefully) is not doing this by handing out free diplomas, but rather by effecting some change in the subjects that makes them more capable of sending the costly signal of graduating from college, which is an absolute improvement.

Similarly, while height increase has no plausible mechanism for improving absolute wellbeing, some mechanisms for improving absolute wellbeing are measured using height as a proxy (most prominently nutritional status in developing countries).

It should definitely be a warning sign if an intervention seems only to promote a positional good, but it's more complex than it seems to determine what's actually positional.

Comment by benkuhn on Request for concrete AI takeover mechanisms · 2014-04-28T02:03:19.782Z · LW · GW

Fun question.

The takeover vector that leaps to mind is remote code execution vulnerabilities on websites connected to important/sensitive systems. This lets you bootstrap from ability to make HTTP GET requests, to (partial) control over any number of fun targets, like banks or Amazon's shipping.

The things that are one degree away from those (via e.g. an infected thumb drive) are even more exciting:

  • Iranian nuclear centrifuges
  • US nuclear centrifuges
  • the electrical grid
  • hopefully not actual US nuclear weapons, but this should be investigated...

Plausible first attempt: get into a defense contractor's computers and install Thompson's compiler backdoor. Now the AI can stick whatever code it wants on various weapons and blackmail anyone it wants or cause havoc in any number of other ways.

Comment by benkuhn on Open Thread April 16 - April 22, 2014 · 2014-04-17T16:47:00.033Z · LW · GW

Yes, definitely agree that politicians can dupe people into hiring them. Just wanted to raise the point that it's very workplace-dependent. The takeaway is probably "investigate your own corporate environment and figure out whether doing your job well is actually rewarded, because it may not be".

Comment by benkuhn on Open Thread April 16 - April 22, 2014 · 2014-04-17T01:34:33.007Z · LW · GW

I'd beware conflating "interpersonal skills" with "playing politics." For CEO at least (and probably CTO as well), there are important factors in job performance other than raw engineering talent. The subtext of your comment is that the companies you mention were somehow duped into promoting these bad engineers to executive roles, but they might have just decided that their CEO/CTO needed to be good at managing or recruiting or negotiating, and the star engineer team lead didn't have those skills.

Second, I think that the "playing politics" part is true at some organizations but not at others. Perhaps this is an instance of All Debates are Bravery Debates.

My model is something like: having passable interpersonal/communication skills is pretty much a no-brainer, but beyond that there are firms where it just doesn't make that much of a difference, because they're sufficiently good at figuring out who actually deserves credit for what that they can select harder for engineering ability than for politics. However, there are other organizations where this is definitely not the case.

Comment by benkuhn on Beware technological wonderland, or, why text will dominate the future of communication and the Internet · 2014-04-13T21:20:27.810Z · LW · GW

I would expect the relevant factor to be mental, not physical, exertion. Unfortunately that's a lot harder to measure.

Comment by benkuhn on Beware technological wonderland, or, why text will dominate the future of communication and the Internet · 2014-04-13T20:04:02.181Z · LW · GW

Do you have actual data on this? Otherwise I'm very tempted to call typical mind.

Comment by benkuhn on Supply, demand, and technological progress: how might the future unfold? Should we believe in runaway exponential growth? · 2014-04-11T21:36:17.216Z · LW · GW

One story for exponential growth that I don't see you address (though I didn't read the whole post, so forgive me if I'm wrong) is the possibility of multiplicative costs. For example, perhaps genetic sequencing would be a good case study? There seem to be a lot of multiplicative factors there: amount of coverage, time to get one round of coverage, amount of DNA you need to get one round of coverage, ease of extracting/preparing DNA, error probability... With enough such multiplicative factors, you'll get exponential growth in megabases per dollar by applying the same amount of improvement to each factor sequentially (whereas if the factors were additive you'd get linear improvement).
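
To make the arithmetic concrete, here is a minimal sketch, assuming five hypothetical factors that each start at 1 and are each doubled once, in sequence. The factor count, the starting values, and the 2x gain are all made up for illustration; none of this is real sequencing data.

```python
from math import prod


def improve_sequentially(n_factors, gain=2.0):
    """Improve one factor per step by `gain`; track the product and the sum.

    All values are hypothetical: every factor starts at 1.0.
    """
    factors = [1.0] * n_factors
    products, sums = [], []
    for i in range(n_factors):
        factors[i] *= gain              # same-sized improvement, one factor at a time
        products.append(prod(factors))  # total if the factors combine multiplicatively
        sums.append(sum(factors))       # total if the factors combine additively
    return products, sums


products, sums = improve_sequentially(5)
print(products)  # [2.0, 4.0, 8.0, 16.0, 32.0]
print(sums)      # [6.0, 7.0, 8.0, 9.0, 10.0]
```

The product doubles at every step while the sum only goes up by one, which is the exponential-versus-linear contrast I have in mind.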

Comment by benkuhn on College discussion thread · 2014-04-01T03:24:22.698Z · LW · GW

If anyone's admitted/visiting Harvard, let me know! I go there and would be happy to meet up and/or answer your questions. There are some other students on here as well.

Comment by benkuhn on Increasing the pool of people with outstanding accomplishments · 2014-03-29T18:57:27.205Z · LW · GW

"outstanding" still has some of the same connotations to me, although less so. But I may be in the minority here.