Let Us Do Our Work As Well 2021-09-17T00:40:18.443Z
Economic AI Safety 2021-09-16T20:50:50.335Z
Film Study for Research 2021-09-14T18:53:25.831Z
Does Diverse News Decrease Polarization? 2021-09-11T02:30:16.583Z
Measurement, Optimization, and Take-off Speed 2021-09-10T19:30:57.189Z
Model Mis-specification and Inverse Reinforcement Learning 2018-11-09T15:33:02.630Z
Latent Variables and Model Mis-Specification 2018-11-07T14:48:40.434Z
[link] Essay on AI Safety 2015-06-26T07:42:11.581Z
The Power of Noise 2014-06-16T17:26:30.329Z
A Fervent Defense of Frequentist Statistics 2014-02-18T20:08:48.833Z
Another Critique of Effective Altruism 2014-01-05T09:51:12.231Z
Macro, not Micro 2013-01-06T05:29:38.689Z
Beyond Bayesians and Frequentists 2012-10-31T07:03:00.818Z
Recommendations for good audio books? 2012-09-16T23:43:31.596Z
What is the evidence in favor of paleo? 2012-08-27T07:07:07.105Z
PM system is not working 2012-08-02T16:09:06.846Z
Looking for a roommate in Mountain View 2012-08-01T19:04:59.872Z
Philosophy and Machine Learning Panel on Ethics 2011-12-17T23:32:20.026Z
Help me fix a cognitive bug 2011-06-25T22:22:31.484Z
Utility is unintuitive 2010-12-09T05:39:34.176Z
Interesting talk on Bayesians and frequentists 2010-10-23T04:10:27.684Z


Comment by jsteinhardt on Where do your eyes go? · 2021-09-20T02:13:15.056Z · LW · GW

I enjoyed this quite a bit. Vision is very important in sports as well, but I hadn't thought to apply it to other areas, despite generally being into applying sports lessons to research (i.e.

In sports, you have to choose between watching the person you're guarding and watching the ball / center of play. Or if you're on offense, between watching where you're going and watching the ball. Eye contact is also important for (some) passing.

What's most interesting is the second-level version of this, where good players watch their opponent's gaze (for instance, making a move exactly when the opponent's gaze moves somewhere else). I wonder if there's an analog in video games / research?

Comment by jsteinhardt on Let Us Do Our Work As Well · 2021-09-17T14:34:10.557Z · LW · GW

Thanks, really appreciate the references!

Comment by jsteinhardt on Economic AI Safety · 2021-09-17T00:04:45.552Z · LW · GW

If there was a feasible way to make the algorithm open, I think that would be good (of course FB would probably strongly oppose this). As you say, people wouldn't directly design / early adopt new algorithms, but once early adopters found an alternative algorithm that they really liked, word of mouth would lead many more people to adopt it. So I think you could eventually get widespread change this way.

Comment by jsteinhardt on Film Study for Research · 2021-09-15T02:07:26.748Z · LW · GW

Thanks for the feedback!

I haven't really dug into Gelman's blog, but the format you mention is a perfect example of the expertise of understanding some research. Very important skill, but not the same as actually conducting the research that goes into a paper.

Research consists of many skills put together. Understanding prior work and developing the taste to judge it is one of the more important individual skills in research (moreso than programming, at least in most fields). So I think the blog example is indeed a central one.

In research, especially in a weird new field like alignment, it's rare to find another researcher who wants to conduct precisely the same research. But that's the basis of every sport and game: people want to win the same game. It makes the whole "learning from others" slightly more difficult IMO. You can't just look for what works, you constantly have to repurpose ideas that work in slightly different fields and/or approaches and check for the loss in translation.

I agree with this, although I think creative new ideas often come from people who have also mastered the "standard" skills. And indeed, most research is precisely about coming up with new ideas, which is a skill that you can cultivate by studying how others generate ideas.

More tangentially, you may be underestimating the amount of innovation in sports. Harden and Jokic both innovate in basketball (among others), but I am pretty sure they also do lots of film study. Jokic's innovation probably comes from having mastered other sports like water polo and the resulting skill transfer. I would guess that mastery of fruitfully adjacent fields is a productive way to generate ideas.

Comment by jsteinhardt on Measurement, Optimization, and Take-off Speed · 2021-09-11T02:39:06.210Z · LW · GW

Thanks, sounds good to me!

Comment by jsteinhardt on Experimentally evaluating whether honesty generalizes · 2021-07-13T01:19:55.142Z · LW · GW

Actually, another issue is that unsupervised translation isn't "that hard" relative to supervised translation--I think that you can get pretty far with simple heuristics, such that I'd guess making the model 10x bigger matters more than making the objective more aligned with getting the answer right (and that this will be true for at least a couple more 10x-ing of model size, although at some point the objective will matter more).

This might not matter as much if you're actually outputting explanations and not just translating from one language to another. Although it is probably true that for tasks that are far away from the ceiling, "naive objective + 10x larger model" will outperform "correct objective".

Comment by jsteinhardt on Experimentally evaluating whether honesty generalizes · 2021-07-13T01:12:50.121Z · LW · GW

Thanks Paul, I generally like this idea.

Aside from the potential concerns you bring up, here is the most likely way I could see this experiment failing to be informative: rather than having checks and question marks in your tables above, really the model's ability to solve each task is a question of degree--each table entry will be a real number between 0 and 1. For, say, tone, GPT-3 probably doesn't have a perfect model of tone, and would get <100% performance on a sentiment classification task, especially if done few-shot.

The issue, then, is that the "fine-tuning for correctness" and "fine-tuning for coherence" processes are not really equivalent--fine-tuning for correctness is in fact giving GPT-3 additional information about tone, which improves its capabilities. In addition, GPT-3 might not "know" exactly what humans mean by the word tone, and so fine-tuning for correctness also helps GPT-3 to better understand the question.

Given these considerations, my modal expectation is that fine-tuning for correctness will provide moderately better results than just doing coherence, but it won't be clear how to interpret the difference--maybe in both cases GPT-3 provides incoherent outputs 10% of the time, and then additionally coherent but wrong outputs 10% of the time when fine-tuned for correctness, but 17% of the time when fine-tuned only for coherence. What would you conclude from a result like that? I would still have found the experiment interesting, but I'm not sure I would be able to draw a firm conclusion.

So perhaps my main feedback would be to think about how likely you think such an outcome is, how much you mind that, and if there are alternative tasks that avoid this issue without being significantly more complicated.

Comment by jsteinhardt on AI x-risk reduction: why I chose academia over industry · 2021-03-15T05:52:28.441Z · LW · GW

This doesn't seem so relevant to capybaralet's case, given that he was choosing whether to accept an academic offer that was already extended to him.

Comment by jsteinhardt on Covid 2/18: Vaccines Still Work · 2021-02-19T16:16:25.082Z · LW · GW

I think if you account for undertesting, then I'd guess 30% or more of the UK was infected during the previous peak, which should reduce R by more than 30% (the people most likely to be infected are also most likely to spread further), and that is already enough to explain the drop.
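
As a rough check on the arithmetic, here is a minimal sketch assuming homogeneous mixing, where R simply scales with the fraction of people still susceptible. The function name and the starting value of R are illustrative assumptions, not figures from the comment; only the 30% comes from the text above.

```python
def effective_r(r_before: float, immune_fraction: float) -> float:
    """Effective R after a fraction of the population becomes immune.

    Assumes homogeneous mixing (everyone equally likely to be infected
    and to infect others), which is a simplifying assumption.
    """
    if not 0.0 <= immune_fraction <= 1.0:
        raise ValueError("immune_fraction must be between 0 and 1")
    return r_before * (1.0 - immune_fraction)


# r_before is a hypothetical starting value; 0.30 is the guess made above.
r_after = effective_r(r_before=1.2, immune_fraction=0.30)
print(f"R falls from 1.20 to {r_after:.2f}")
```

Under this simplification the reduction is exactly 30%. If the people infected first are also the most likely to spread further, the real reduction would be larger, which is the point made above.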

Comment by jsteinhardt on Making Vaccine · 2021-02-06T01:18:27.894Z · LW · GW

I wasn't sure what you meant by more dakka, but do you mean just increasing the dose? I don't see why that would necessarily work--e.g. if the peptide just isn't effective.

I'm confused because we seem to be getting pretty different numbers. I asked another bio friend (who is into DIY stuff) and they also seemed pretty skeptical, and Sarah Constantin seems to be as well:

Not disbelieving your account, just noting that we seem to be getting pretty different outputs from the expert-checking process and it seems to be more than just small-sample noise. I'm also confused because I generally trust stuff from George Church's group, although I'm still near the 10% probability I gave above.

I am certainly curious to see whether this does develop measurable antibodies :).

Comment by jsteinhardt on Making Vaccine · 2021-02-05T02:52:45.316Z · LW · GW

Ah got it, thanks!

Comment by jsteinhardt on Making Vaccine · 2021-02-05T02:24:30.096Z · LW · GW

Have you run this by a trusted bio expert? When I did this test (picking a bio person who I know personally, who I think of as open-minded and fairly smart), they thought that this vaccine is pretty unlikely to be effective and that the risks in this article may be understated (e.g. food grade is lower-quality than lab grade, and it's not obvious that inhaling food is completely safe). I don't know enough biology to evaluate their argument, beyond my respect for them.

I'd be curious if the author, or others who are considering trying this, have applied this test.

My (fairly uninformed) estimates would be:
 - 10% chance that the vaccine works in the abstract
 - 4% chance that it works for a given LW user
 - 3% chance that a given LW user has an adverse reaction
 - 12% chance at least 1 LW user has an adverse reaction
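
One way to relate the last two estimates is sketched below. It assumes each user's adverse reaction is independent, which the comment does not state; the 12% figure may instead reflect uncertainty about how many people try this at all. The function names are illustrative.

```python
import math


def p_at_least_one(p_each: float, n_users: int) -> float:
    """Probability that at least one of n independent users has the event."""
    return 1.0 - (1.0 - p_each) ** n_users


def implied_users(p_each: float, p_any: float) -> float:
    """Number of independent users for which P(at least one) equals p_any."""
    return math.log(1.0 - p_any) / math.log(1.0 - p_each)


p_each = 0.03  # per-user adverse reaction estimate from the list above
p_any = 0.12   # "at least 1 LW user" estimate from the list above

print(f"Implied number of users: {implied_users(p_each, p_any):.1f}")
for n in (1, 4, 10):
    print(f"n={n:2d}: P(at least one) = {p_at_least_one(p_each, n):.3f}")
```

Under the independence assumption, the two estimates are consistent with roughly four users trying the vaccine.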

Of course, from a selfish perspective, I am happy for others to try this. In the 10% of cases where it works I will be glad to have that information. I'm more worried that some might substantially overestimate the benefit and underestimate the risks, however.

Comment by jsteinhardt on Making Vaccine · 2021-02-05T02:18:13.308Z · LW · GW

I don't think I was debating the norms, but clarifying how they apply in this case. Most of my comment was a reaction to the "pretty important" and "timeless life lessons", which would apply to Raemon's comment whether or not he was a moderator.

Comment by jsteinhardt on Making Vaccine · 2021-02-05T02:16:28.006Z · LW · GW

Often. For a recent and related example, see Stanford profs claiming that COVID is less deadly than the flu.

Comment by jsteinhardt on Making Vaccine · 2021-02-04T19:38:54.276Z · LW · GW

Hmm, important as in "important to discuss", or "important to hear about"?

My best guess based on talking to a smart open-minded biologist is that this vaccine probably doesn't work, and that the author understates the risks involved. I'm interpreting the decision to frontpage as saying that you think I'm wrong with reasonably high confidence, but I'm not sure if I should interpret it that way.

Comment by jsteinhardt on Covid 12/24: We’re F***ed, It’s Over · 2021-01-16T06:13:56.568Z · LW · GW

That seems irrelevant to my claim that Zvi's favored policy is worse than the status quo.

Comment by jsteinhardt on Covid 12/24: We’re F***ed, It’s Over · 2021-01-16T06:11:45.627Z · LW · GW

This isn't based on personal anecdote; studies that try to estimate this come up with 3x. See e.g. the MicroCovid page:

Comment by jsteinhardt on Covid 12/31: Meet the New Year · 2021-01-03T07:32:32.957Z · LW · GW

You may well be right. I guess we don't really know what the sampling bias is (it would have to be pretty strongly skewed towards incoming UK cases though to get to a majority, since the UK itself was near 50%).

Comment by jsteinhardt on Covid 12/31: Meet the New Year · 2021-01-01T07:54:58.249Z · LW · GW

See here:

Comment by jsteinhardt on Covid 12/31: Meet the New Year · 2021-01-01T00:40:34.258Z · LW · GW

I don't think it's correct to say that it remains stable at 0.5-1% of samples in Denmark. There were 13 samples of the new variant last week, vs. only 3 two weeks ago, if I understood the data correctly. If it went from 0.5% to 1% in a week then you should be alarmed. (3 and 13 are both small enough that it's hard to compute a growth rate, but it certainly seems consistent with the UK data to me.)

I think better evidence against non-infectiousness would be Italy and Israel, where the variant seems to be dominant but there isn't runaway growth. But:
 - Italy was on a downtick and then imposed a stronger lockdown, yet the downtick switched to being flat. So R does seem to have increased in Italy.
 - Israel is vaccinating everyone fairly quickly right now.

Comment by jsteinhardt on Covid 12/31: Meet the New Year · 2020-12-31T21:57:31.301Z · LW · GW

Zvi, I still think that your model of vaccination ordering is wrong, and that the best read of the data is that frontline essential workers should be very highly prioritized from a DALY / deaths averted perspective. I left this comment on the last thread that explains my reasoning in detail, looking at both of the published papers I've seen that model vaccine ordering: link. I'd be happy to elaborate on it but I haven't yet seen anyone provide any disagreement.

More minor, but regarding rehab facilities, from a bureaucratic perspective they are "congregate living facilities" and in the same category as retirement homes. I don't think New York is doing anything exceptional by having them high on the list, for instance California is doing the same thing if I understand correctly. We can of course argue over whether it's good for them to be high on the list; I personally think of them as 20-person group houses and so feel reasonably good prioritizing them highly, though I'm not confident in that conclusion.

Comment by jsteinhardt on Covid 12/24: We’re F***ed, It’s Over · 2020-12-25T05:00:31.523Z · LW · GW

Zvi, I agree with you that the CDC's reasoning was pretty sketchy, but I think their actual recommendation is correct while everyone else (e.g. the UK) is wrong. I think the order should be something like:

Nursing homes -> HCWs -> 80+ -> frontline essential workers -> ...

(Possibly switching the order of HCWs and 80+.)

The public analyses saying that we should start with the elderly are these two papers:

Notably, neither paper even considers vaccinating essential workers as a potential intervention. The only option categories are by age, comorbidities, and whether you're a healthcare worker. The first paper only considers age and concludes unsurprisingly that if your only option is to order by age, you should start with the oldest. In the second paper, which includes HCWs as a category (modeling them as having higher susceptibility but not higher risk of transmitting to others), HCWs jump up on the queue to right after the 80+ age group (!!!). Since the only factor being considered is susceptibility, presumably many types of essential workers would also have a higher susceptibility and fall into the same group.

If we apply the Zvi cynical lens here, we can ask why these papers perform an analysis that suggests prioritizing healthcare workers but don't bother to point out that the same analysis applies to 10% of the population (hint: the available vaccines cover less than 10% of the population and the authors are in the healthcare profession).

The actual problem with the original CDC recommendations was that essential workers is so broad a category that it encompasses lots of people who aren't actually at increased risk (because their jobs don't require much contact). The new recommendations revised this to focus on frontline essential workers, a more-focused category that is about half of all essential workers. This is a huge improvement but I think even the original recommendations are better than the UK approach of prioritizing only based on age.

Remember, we should focus on results. If the CDC is right while everyone else is wrong, even if the stated reasoning is bad, pressuring them to conform to everyone else's worse approach is even worse.

Comment by jsteinhardt on Why are young, healthy people eager to take the Covid-19 vaccine? · 2020-12-02T17:58:58.939Z · LW · GW

Mo Bamba (NBA) and Cody Garbrandt (UFC) are both pro athletes who are still out of commission months later. I found this looking for NBA information, and only about 50 NBA players have gotten Covid, so this suggests at least 2% chance of pretty bad long term symptoms.

Comment by jsteinhardt on Pain is not the unit of Effort · 2020-12-02T08:02:32.345Z · LW · GW

I think that the right level of effort leaves you tired but warm inside, like you look forward to doing this again, rather than just feeling you HAVE to do this again.


This is probably true in a practical sense (otherwise you won't sustain it as a habit), but I'm not sure it describes a well-defined level of effort. For me an extreme effort could still lead to me looking forward to it, if I have a concrete sense of what that effort bought me (maybe I do some tedious and exhausting footwork drills, but I understand the sense in which this will carry over into a game-like situation, so it feels rewarding; but I wouldn't be able to sustainably put in that same level of effort if I couldn't visualize the benefits).

It seems to me like to calibrate the right level of effort requires some other principle (for physical activity this would be based on rates of adaptation to avoid overtraining), and then you should perform visualization or other mental exercises to align your psychology with that level of effort. 

Comment by jsteinhardt on Pain is not the unit of Effort · 2020-12-02T07:52:26.976Z · LW · GW

If most workouts are painful, then I agree you are probably overtraining. But if no workouts at all are painful, you're probably missing opportunities to improve. And many workouts should at least be uncomfortable for parts of it. E.g. when lifting, for the last couple deadlift sets I often feel incredibly gassed and don't feel like doing another one. But this can be true even when I'm far away from my limits (like, a month later I'll be lifting 30 pounds more and feel about as tired, rather than failing to do the lift).

My guess is that on average 1-2 workouts a week should feel uncomfortable in some way, and 1-2 workouts a month should feel painful, if you're training optimally. But it probably varies by sport (I'm mostly thinking sports like soccer or basketball that are high on quickness and lateral movement, but only moderate on endurance).

ETA: Regarding whether elite athletes are performing optimally, it's going to depend on the sport, but in, say, basketball, where players have 10+ year careers, teams generally have a lot of incentive to not destroy players' bodies. Most of the wear and tear comes from games, while training outside of games is often aimed at preventing injuries by preparing the body for high and erratic levels of contact in games. (I could imagine that in, say, gymnastics, or maybe even American football, the training incentives are misaligned with long-term health, but I don't know much about either.)

Comment by jsteinhardt on Why are young, healthy people eager to take the Covid-19 vaccine? · 2020-11-29T16:53:40.895Z · LW · GW

You could look at papers published on medrxiv rather than news articles, which would resolve the clickbait issue, though you'd still have to assess the study quality.

Comment by jsteinhardt on Why are young, healthy people eager to take the Covid-19 vaccine? · 2020-11-29T04:29:12.129Z · LW · GW

Have you tried googling yourself and were unable to find them? (Sorry that I'm too lazy to re-look them up myself, but given that LW is mostly leisure for me I don't feel like doing it, and I'd be somewhat surprised if you googled for stuff and didn't find it.)

Comment by jsteinhardt on Why are young, healthy people eager to take the Covid-19 vaccine? · 2020-11-22T16:17:58.051Z · LW · GW

I also think you are probably overestimating vaccine risks (the main risk is that its effectiveness wanes, and that it interferes with future antibody responses from similar vaccines; not that you'll get horrible side effects) but that isn't necessary to explain why people want the vaccine now.

Comment by jsteinhardt on Why are young, healthy people eager to take the Covid-19 vaccine? · 2020-11-22T16:14:02.864Z · LW · GW

I think cutting the IFR by a factor of 25 on the basis of one study is a mistake; the chance of the study being fatally flawed is greater than 1 in 25. On the other hand, 0.5% is the overall CFR and would be lower for young people.

I think it's hard to cut the risk of long term effects by more than a factor of 10 from published estimates. Note there is evidence of long term effects contrary to your claim, i.e. studies that do 6 week follow ups and find people still with some symptoms. This isn't 6 months but is still surprisingly long and should shift our belief about 6 months at least somewhat. Also, the fact that this is a novel disease that attacks many parts of the body is some evidence. I agree the evidence is exaggerated to scare us, but it feels like a different situation from reinfection, where it actually is almost impossible to find instances except in the immunocompromised.

But I think perhaps the most important is that even young people are currently limiting their activities in many undesirable ways in accordance with local government ordinances (which apply equally to old and young). Vaccination allows one to end or partially end these limitations--even if not in a legal sense, probably at least in a moral sense.

Comment by jsteinhardt on Why Boston? · 2020-10-13T06:50:04.150Z · LW · GW

I noticed the prudishness, but "rudeness" to me parses as people actually telling you what's on their mind, rather than the passive-aggressive fake niceness that seems to dominate in the Bay Area. I'll personally take the rudeness :).

Comment by jsteinhardt on Why Boston? · 2020-10-13T06:46:42.306Z · LW · GW

On the other hand, the second-best place selects for people who don't care strongly about optimizing for legible signals, which is probably a plus. (An instance of this: In undergrad the dorm that, in my opinion, had the best culture was the run-down dorm that was far from campus.)

Comment by jsteinhardt on Why Boston? · 2020-10-11T05:29:57.228Z · LW · GW

Many of the factors affecting number of deaths are beyond a place's control, such as how early on the pandemic spread to that place, and how densely populated the city is. I don't have a strong opinion about MA but measuring by deaths per capita isn't a good way of judging the response.

Comment by jsteinhardt on What's Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers · 2020-09-17T02:02:03.053Z · LW · GW

That's not really what a p-value means though, right? The actual replication rate should depend on the prior and the power of the studies.
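
To make the dependence on prior and power concrete, here is a minimal sketch using a textbook two-hypothesis model (each tested effect is either real or null). It ignores p-hacking, publication bias, and variation in effect sizes; the function names and example inputs are illustrative assumptions.

```python
def positive_predictive_value(prior: float, power: float, alpha: float = 0.05) -> float:
    """P(effect is real | original study was significant)."""
    true_pos = prior * power
    false_pos = (1.0 - prior) * alpha
    return true_pos / (true_pos + false_pos)


def expected_replication_rate(prior: float, power: float, alpha: float = 0.05) -> float:
    """P(replication is significant | original study was significant).

    Assumes the replication has the same power and threshold as the original.
    """
    ppv = positive_predictive_value(prior, power, alpha)
    return ppv * power + (1.0 - ppv) * alpha


# Hypothetical scenarios: (prior probability the effect is real, power).
for prior, power in [(0.1, 0.5), (0.5, 0.8)]:
    rate = expected_replication_rate(prior, power)
    print(f"prior={prior:.1f}, power={power:.1f}: replication rate ~ {rate:.2f}")
```

Both scenarios use the same p < 0.05 threshold, yet the expected replication rate differs widely, so the threshold alone does not determine it.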

Comment by jsteinhardt on What's Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers · 2020-09-12T18:03:46.264Z · LW · GW

What are some of the recommendations that seem most off base to you?

Comment by jsteinhardt on Covid-19 6/11: Bracing For a Second Wave · 2020-06-13T19:58:18.847Z · LW · GW

My prediction: infections will either go down or only slowly rise in most places, with the exception of one or two metropolitan areas. If I had to pick one it would be LA, not sure what the second one will be. The places where people are currently talking about spikes won't have much correlation with the places that look bad two weeks from now (i.e. people are mostly chasing noise).

I'm not highly confident in this, but it's been a pretty reliable prediction for the past month at least...

Comment by jsteinhardt on Estimating COVID-19 Mortality Rates · 2020-06-13T08:00:32.737Z · LW · GW

Here is a study that a colleague recommends: Tweet version:

Their point estimate is 0.64% but with likely heterogeneity across settings.

Comment by jsteinhardt on Quarantine Bubbles Require Directness, and Tolerance of Rudeness · 2020-06-10T02:33:59.701Z · LW · GW

I don't think bubble size is the right thing to measure; instead you should measure the amount of contact you have with people, weighted by time, distance, indoor/outdoor, mask-wearing, and how likely the other person is to be infected (i.e. how careful they are).

An important part of my mental model is that infection risk is roughly linear in contact time.

Comment by jsteinhardt on Quarantine Bubbles Require Directness, and Tolerance of Rudeness · 2020-06-08T07:50:07.485Z · LW · GW

As a background assumption, I'm focused on the societal costs of getting infected, rather than the personal costs, since in most places the latter seem negligible unless you have pre-existing health conditions. I think this is also the right lens through which to evaluate Alameda's policy, although I'll discuss the personal calculation at the end.

From a social perspective, I think it's quite clear that the average person is far from being effectively isolated, since R is around 0.9 and you can only get to around half of that via household infection alone. So a 12 person bubble isn't really a bubble... It's 12 people who each bring in nontrivial risk from the outside world. On the other hand they're also not that likely to infect each other.

From a personal perspective, I think the real thing to care about is whether the other people are about as careful as you. By symmetry there's no reason to think that another house that practices a similar level of precaution is more likely to get an outside infection than your house is. But by the same logic there's nothing special about a 12 person bubble: you should be trying to interact with people with the same or better risk profile as you (from a personal perspective; from a societal perspective you should interact with riskier people, at least if you're low risk, because bubbles full of risky people are the worst possible configuration and you want to help break those up).

Comment by jsteinhardt on Quarantine Bubbles Require Directness, and Tolerance of Rudeness · 2020-06-08T04:52:54.259Z · LW · GW

I think the biggest issue with the bubble rule is that the math doesn't work out. The secondary attack rate between house members is ~30% and probably much lower between other contacts. At that low of a rate, these games with the graph structure buy very little and may be harmful because they increase the fraction of contact occurring between similar people (which is bad because the social cost of a pair of people interacting is roughly the product of their infection risks).
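
The product-of-risks point can be illustrated with a small sketch. It takes the cost model stated above at face value (cost of a pair interacting is the product of their infection risks); the risk values and names are hypothetical.

```python
def total_pair_cost(pairs):
    """Sum of risk products over interacting pairs (the cost model above)."""
    return sum(a * b for a, b in pairs)


low, high = 0.01, 0.10  # hypothetical infection risks for two types of people

# Like-with-like: the two low-risk people pair up, as do the two high-risk people.
assortative = [(low, low), (high, high)]
# Mixed: each low-risk person pairs with a high-risk person.
mixed = [(low, high), (low, high)]

print(f"like-with-like: {total_pair_cost(assortative):.4f}")
print(f"mixed:          {total_pair_cost(mixed):.4f}")
```

With the same four people and the same number of interactions, the like-with-like arrangement has the higher total cost, because the high-risk pair dominates the sum.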

Comment by jsteinhardt on Estimating COVID-19 Mortality Rates · 2020-06-07T20:14:30.925Z · LW · GW

I'm not trying to intimidate; I'm trying to point out that I think you're making errors that could be corrected by more research, which I hoped would be helpful. I've provided one link (which took me some time to dig up). If you don't find this useful that's fine, you're not obligated to believe me and I'm not obligated to turn a LW comment into a lit review.

Comment by jsteinhardt on Estimating COVID-19 Mortality Rates · 2020-06-07T18:58:05.422Z · LW · GW

The CFR will shift substantially over time and location as testing changes. I'm not sure how you would reliably use this information. IFR should not change much and tells you how bad it is for you personally to get sick.

I wouldn't call the model Zvi links expert-promoted. Every expert I talked to thought it had problems, and the people behind it are economists not epidemiologists or statisticians.

For IFR you can start with seroprevalence data here and then work back from death rates:

Regarding back-of-the-envelope calculations, I think we have different approaches to evidence/data. I started with back-of-the-envelope calculations 3 months ago. But I would have based things on a variety of BOTECs and not a single one. Now I've found other sources that are taking the BOTEC and doing smarter stuff on top of it, so I mostly defer to those sources, or to experts with a good track record. This is easier for me because I've worked full-time on COVID for the past 3 months; if I weren't in that position I'd probably combine some of my own BOTECs with opinions of people I trusted. In your case, I predict Zvi if you asked him would also say the IFR was in the range I gave.

Comment by jsteinhardt on Estimating COVID-19 Mortality Rates · 2020-06-07T17:04:38.882Z · LW · GW

Ben, I think you're failing to account for under-testing. You're computing the case fatality rate when you want the infection fatality rate. Most experts, as well as the well-done meta analyses, place the IFR in the 0.5%-1% range. I'm a little bit confused why you're relying on this back of the envelope rather than the pretty extensive body of work on this question.
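
A minimal sketch of the distinction, using made-up numbers purely for illustration (none of them come from the post or from any dataset). The ascertainment rate, i.e. the fraction of infections that are ever confirmed by a test, is the assumed parameter that captures under-testing.

```python
def case_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Deaths divided by confirmed cases."""
    return deaths / confirmed_cases


def infection_fatality_rate(deaths: int, confirmed_cases: int, ascertainment: float) -> float:
    """Deaths divided by estimated total infections.

    ascertainment: assumed fraction of infections that get confirmed.
    """
    if not 0.0 < ascertainment <= 1.0:
        raise ValueError("ascertainment must be in (0, 1]")
    estimated_infections = confirmed_cases / ascertainment
    return deaths / estimated_infections


# Hypothetical inputs, chosen only to show the size of the gap.
deaths, confirmed, ascertainment = 5_000, 100_000, 0.15

print(f"CFR: {case_fatality_rate(deaths, confirmed):.2%}")
print(f"IFR: {infection_fatality_rate(deaths, confirmed, ascertainment):.2%}")
```

This ignores the lag between infection and death, among other things, but it shows how the same death count can give a CFR several times larger than the IFR when most infections go untested.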

Comment by jsteinhardt on Ben Hoffman's donor recommendations · 2018-07-30T17:59:04.732Z · LW · GW

I don't understand why this is evidence that "EA Funds (other than the global health and development one) currently funges heavily with GiveWell recommended charities", which was Howie's original question. It seems like evidence that donations to OpenPhil (which afaik cannot be made by individual donors) funge against donations to the long-term future EA fund.

Comment by jsteinhardt on RFC: Philosophical Conservatism in AI Alignment Research · 2018-05-15T04:24:03.648Z · LW · GW

I like the general thrust here, although I have a different version of this idea, which I would call "minimizing philosophical pre-commitments". For instance, there is a great deal of debate about whether Bayesian probability is a reasonable philosophical foundation for statistical reasoning. It seems that it would be better, all else equal, for approaches to AI alignment to not hinge on being on the right side of this debate.

I think there are some places where it is hard to avoid pre-commitments. For instance, while this isn't quite a philosophical pre-commitment, it is probably hard to develop approaches that are simultaneously optimized for short and long timelines. In this case it is probably better to explicitly do case splitting on the two worlds and have some subset of people pursuing approaches that are good in each individual world.

Comment by jsteinhardt on [deleted post] 2018-03-19T19:43:11.984Z

FWIW I understood Zvi's comment, but feel like I might not have understood it if I hadn't played Magic: The Gathering in the past.

EDIT: Although I don't understand the link to Sir Arthur's green knight, unless it was a reference to the fact that M:tG doesn't actually have a green knight card.

Comment by jsteinhardt on Takeoff Speed: Simple Asymptotics in a Toy Model. · 2018-03-06T13:54:41.342Z · LW · GW

Thanks for writing this Aaron! (And for engaging with some of the common arguments for/against AI safety work.)

I personally am very uncertain about whether to expect a singularity/fast take-off (I think it is plausible but far from certain). Some reasons that I am still very interested in AI safety are the following:

  • I think AI safety likely involves solving a number of difficult conceptual problems, such that it would take >5 years (I would guess something like 10-30 years, with very wide error bars) of research to have solutions that we are happy with. Moreover, many of the relevant problems have short-term analogues that can be worked on today. (Indeed, some of these align with your own research interests, e.g. imputing value functions of agents from actions/decisions; although I am particularly interested in the agnostic case where the value function might lie outside of the given model family, which I think makes things much harder.)
  • I suppose the summary point of the above is that even if you think AI is a ways off (my median estimate is ~50 years, again with high error bars) research is not something that can happen instantaneously, and conceptual research in particular can move slowly due to being harder to work on / parallelize.
  • While I have uncertainty about fast take-off, that still leaves some probability that fast take-off will happen, and in that world it is an important enough problem that it is worth thinking about. (It is also very worthwhile to think about the probability of fast take-off, as better estimates would help to better direct resources even within the AI safety space.)
  • Finally, I think there are a number of important safety problems even from sub-human AI systems. Tech-driven unemployment is I guess the standard one here, although I spend more time thinking about cyber-warfare/autonomous weapons, as well as changes in the balance of power between nation-states and corporations. These are not as clearly an existential risk as unfriendly AI, but I think in some forms would qualify as a global catastrophic risk; on the other hand I would guess that most people who care about AI safety (at least on this website) do not care about it for this reason, so this is more idiosyncratic to me.

Happy to expand on/discuss any of the above points if you are interested.



Comment by jsteinhardt on Takeoff Speed: Simple Asymptotics in a Toy Model. · 2018-03-06T13:32:48.176Z · LW · GW

Very minor nitpick, but just to add, FLI is as far as I know not formally affiliated with MIT. (FHI is in fact a formal institute at Oxford.)

Comment by jsteinhardt on Zeroing Out · 2017-11-05T22:19:45.863Z · LW · GW

Hi Zvi,

I enjoy reading your posts because they often consist of clear explanations of concepts I wish more people appreciated. But I think this is the first instance where I feel I got something that I actually hadn't thought about before at all, so I wanted to convey extra appreciation for writing it up.



Comment by jsteinhardt on Seek Fair Expectations of Others’ Models · 2017-10-20T03:53:12.702Z · LW · GW

I think the conflation is "decades out" and "far away".

Comment by jsteinhardt on [deleted post] 2017-10-17T03:04:59.264Z

Galfour was specifically asked to write his thought up in this thread:

It seems either this was posted to the wrong place, or there is some disagreement within the community (e.g. between Ben in that thread and the people downvoting).