Posts

Shall we count the living or the dead? 2021-06-14T00:38:22.968Z
Effect heterogeneity and external validity in medicine 2019-10-25T20:53:53.012Z
Counterfactual outcome state transition parameters 2018-07-27T21:13:12.014Z
The New Riddle of Induction: Neutral and Relative Perspectives on Color 2017-12-02T16:15:08.912Z
Odds ratios and conditional risk ratios 2017-01-25T03:55:04.420Z
Is Caviar a Risk Factor For Being a Millionaire? 2016-12-09T16:27:14.760Z
Link: The Economist on Paperclip Maximizers 2016-06-30T12:40:33.942Z
Link: Evidence-Based Medicine Has Been Hijacked 2016-03-16T19:57:49.294Z
Clearing An Overgrown Garden 2016-01-29T22:16:56.620Z
Meetup : Palo Alto Meetup: Lightning Talks 2016-01-20T20:04:34.593Z
Meetup : Palo Alto Meetup: Introduction to Causal Inference 2016-01-03T02:22:37.793Z
Meetup : Palo Alto Meetup: The Economics of AI 2016-01-03T02:20:41.540Z
Post-doctoral Fellowships at METRICS 2015-11-12T19:13:12.419Z
On stopping rules 2015-08-02T21:38:08.617Z
Meetup : Boston: Trigger action planning 2015-05-24T20:00:31.195Z
Meetup : Boston: Making space in Interpersonal Interactions 2015-05-24T19:58:56.123Z
Meetup : Boston: How to Beat Perfectionism 2015-05-08T17:42:02.809Z
Meetup : Boston: Unconference 2015-03-19T16:58:45.803Z
Prediction Markets are Confounded - Implications for the feasibility of Futarchy 2015-01-26T22:39:33.638Z
Meetup : Boston: Antifragile 2015-01-02T20:04:48.211Z
Meetup : Boston: Self Therapy 2014-11-13T17:20:19.375Z
Meetup : The Design Process 2014-10-24T03:37:24.473Z
Meetup : Boston Meetup - New Location 2014-10-15T04:39:13.533Z
Meetup : Meta Meetup 2014-10-02T16:49:44.048Z
Meetup : Social Skills 2014-09-10T00:55:23.241Z
Meetup : Passive Investing and Financial Independence 2014-09-10T00:53:05.455Z
Meetup : Prediction Markets and Futarchy 2014-09-02T14:13:33.018Z
Meetup : Nick Bostrom Talk on Superintelligence 2014-09-02T14:09:43.925Z
Meetup : The Psychology of Video Games 2014-08-11T04:58:57.663Z
Ethical Choice under Uncertainty 2014-08-10T22:13:38.756Z
Causal Inference Sequence Part II: Graphical Models 2014-08-04T23:10:02.285Z
Causal Inference Sequence Part 1: Basic Terminology and the Assumptions of Causal Inference 2014-07-30T20:56:31.866Z
Sequence Announcement: Applied Causal Inference 2014-07-30T20:55:41.741Z

Comments

Comment by Anders_H on The New Riddle of Induction: Neutral and Relative Perspectives on Color · 2022-07-04T11:11:10.102Z · LW · GW

I don't think the existence of lawlike phenomena is controversial, at least not on this forum. Otherwise, how do you account for the remarkable patterns to our observations?  Of course, it is not possible to determine what those phenomena are, but I don't think my solution requires this. It just requires that our sensory algorithm responds the same way every time. 

Comment by Anders_H on Why did no LessWrong discourse on gain of function research develop in 2013/2014? · 2021-06-21T14:20:49.547Z · LW · GW

I found the original website for Prof. Lipsitch's "Cambridge Working Group" from 2014 at http://www.cambridgeworkinggroup.org/ . While the website does not focus exclusively on gain-of-function research, it was certainly a recurring theme in his public talks on the subject.

The list of signatories (which I believe has not been updated since 2016) includes several members of our community (apologies to anyone who I have missed):

  • Toby Ord, Oxford University
  • Sean O hEigeartaigh, University of Oxford
  • Daniel Dewey, University of Oxford
  • Anders Sandberg, Oxford University
  • Anders Huitfeldt, Harvard T.H. Chan School of Public Health
  • Viktoriya Krakovna, Harvard University PhD student
  • Dr. Roman V. Yampolskiy, University of Louisville
  • David Manheim, 1DaySooner

Interestingly, there was an opposing group arguing in favor of this kind of research, at http://www.scientistsforscience.org/. I do not recognize a single name on their list of signatories.

Comment by Anders_H on Why did no LessWrong discourse on gain of function research develop in 2013/2014? · 2021-06-19T13:31:35.344Z · LW · GW

Here is a video of Prof. Lipsitch at EA Global Boston in 2017. I haven't watched it yet, but I would expect him to discuss gain-of-function research:  https://forum.effectivealtruism.org/posts/oKwg3Zs5DPDFXvSKC/marc-lipsitch-preventing-catastrophic-risks-by-mitigating

Comment by Anders_H on Why did no LessWrong discourse on gain of function research develop in 2013/2014? · 2021-06-19T13:24:24.227Z · LW · GW

Here is a data point not directly relevant to Less Wrong, but perhaps to the broader rationality community:  

Around this time, Marc Lipsitch organized a website and an open letter warning publicly about the dangers of gain-of-function research. I was a doctoral student at HSPH at the time, and shared this information with a few rationalist-aligned organizations. I remember making an offer to introduce them to Prof. Lipsitch, so that maybe he could give a talk. I got the impression that the Future of Life Institute had some communication with him, and I see from their 2015 newsletter that there is some discussion of his work, but I am not sure whether anything more concrete came out of this.

My impression was that while they considered this important, this was more of a catastrophic risk than an existential risk, and therefore outside their core mission. 

Comment by Anders_H on Shall we count the living or the dead? · 2021-06-15T12:25:54.404Z · LW · GW

This comment touches on the central tension between the current paradigm in medicine, i.e. "evidence-based medicine", and an alternative and intuitively appealing approach based on a biological understanding of the mechanism of disease.

In evidence-based medicine, decisions are based on statistical analysis of randomized trials; what matters is whether we can be confident that the medication probabilistically improved outcomes when tested on humans as a unit. We don't really care too much about the mechanism behind the causal effect, just whether we can be sure it is real.

The exaggerated strawman alternative approach to EBM would be Star Trek medicine, where the ship's doctor can reliably scan an alien's biology, determine which molecule is needed to correct the pathology, synthesize that molecule and administer it as treatment. 

If we have a complete understanding of what Nancy Cartwright calls "the nomological machine", Star Trek medicine should work in theory. However, you are going to need a very complete, accurate and  detailed map of the human body to make it work. Given the complexity of the human body, I think we are very far from being able to do this in practice. 

There have been many cases in recent history where doctors believed they understood biology well enough to predict the consequences, yet were proved wrong by randomized trials. See for example Vinay Prasad's book "Ending Medical Reversal". 

My personal view is that we are very far from being able to ground clinical decisions in mechanistic knowledge instead of randomized trials. Trying to do so would probably be dangerous given the current state of biological understanding. However, we can probably improve on naive evidence-based medicine by carving out a role for mechanistic knowledge to complement data analysis. Mechanisms seem particularly important for reasoning correctly about extrapolation; the purpose of my research program is to clarify one way such mechanisms can be used. It doesn't always work perfectly, but I am not aware of any examples where an alternative approach works better.

Comment by Anders_H on Shall we count the living or the dead? · 2021-06-15T10:35:11.350Z · LW · GW

Thank you so much for writing this! Yes, this is mostly an accurate summary of my views (although I would certainly phrase some things differently). I just want to point out two minor disagreements:

  1. I don't think the problem is that doctors are too rushed to do a proper job. I think the patient-specific data that you would need is in many cases theoretically unobservable, or at least that we would need a much more complete understanding of biological mechanisms in order to know what to test patients for in order to make a truly individualized decision. At least for the foreseeable future, I think it will be impossible for doctors to determine which patients will benefit on an individual level; they will be constrained to using the patient's observables to put them in a reference group, and then using observations from other patients in that reference group to predict risk.
  2. I am not entirely convinced that the Pearlian approach is the most natural way to handle this. In the manuscript, I use "modern causal models" as a more general term that also includes other types of counterfactual causal models. Of course, all these models are basically isomorphic, and Cinelli/Pearl did show in response to my last paper that it is possible to do the same thing using DAGs. I am just not at all convinced that the easiest way to capture the relevant intuition is to use Pearl's graphical representation of causal models.

Comment by Anders_H on Shall we count the living or the dead? · 2021-06-15T10:24:10.205Z · LW · GW

You are correct that someone who has one allergy may be more likely to have another allergy, and that this violates the assumptions of our model. Our model relies on a strong independence assumption, and there are many realistic cases where this assumption will not hold. I also agree that the video uses an example where the assumption may not hold. The video is oversimplified on purpose, in an attempt to get people interested enough to read the arXiv preprint.

If there is a small correlation between baseline risk and the effect of treatment, this will have a negligible impact on the analysis. If there is a moderate correlation, you will probably be able to bound the true treatment effect using partial identification methods. If there is a strong correlation, this may invalidate the analysis completely.

The point we are making is not that the model will always hold exactly.  Any model is an approximation. Let's suppose we have three choices:

  1. Use a template for a causal model that "counts the living", think about all the possible biological reasons that this model could go wrong, represent them in the model if possible, and account for them as best you can in the analysis
  2. Use a template for a causal model that "counts the dead", think about all the possible biological reasons that this model could go wrong, represent them in the model if possible, and account for them as best you can in the analysis
  3. Use a model that is invariant to whether you count the living or the dead. This cannot be based on a multiplicative (relative risk) parameter. 

The third approach will not be sensitive to the particular problems that I am discussing, but all the suggested methods of this type have their own problems. As I have written earlier, my view is that those problems are more troubling than the problems with the relative risk models.

What we are arguing in this preprint is that if you decide to go with a relative risk model, you should choose between (1) and (2) based on the principles suggested by Sheps, and then reason about problems with this model and how they can be addressed in the analysis, based on the principles that you have correctly outlined in your comment.

I can assure you that if you decide to go with a multiplicative model but choose the wrong "base case", then all of the problems you have discussed in your comments will be orders of magnitude more difficult to deal with in any meaningful way. In other words, it is only after you make the choice recommended by Sheps that it even becomes possible to meaningfully analyze the reasons for deviation from effect homogeneity.

Comment by Anders_H on Shall we count the living or the dead? · 2021-06-14T20:18:57.173Z · LW · GW

I very emphatically disagree with this. 

You are right that once you have a prediction for risk if untreated, and a prediction for risk if treated, you just need a cost/benefit analysis. However, you won't get to that stage without a paradigm for extrapolation, whether implicit or explicit. I prefer making that paradigm explicit.

If you want to plug in raw experimental data, you are going to need data from people who are exactly like the patient in every way.  Then, you will be relying on a paradigm for extrapolation which claims that the conditional counterfactual risks (rather than the magnitude of the effect) can be extrapolated from the study to the patient. It is a different paradigm, and one that can only be justified if the conditioning set includes every cause of the outcome.  

In my view,  this is completely unrealistic. I prefer a paradigm for extrapolation that aims to extrapolate the scale-specific magnitude of the effect. If this is the goal, our conditioning set only needs to include those covariates that predict the magnitude of the effect of treatment, which is a small subset of all covariates that cause the outcome. 

On this specific point, my view is consistent with almost all thinking in medical statistics, with the exception of some very recent work in causal modeling (whose authors prefer the approach based on counterfactual risks). My disagreement with this work in causal modeling is at the core of my last discussion about this on Less Wrong. See for example "Effect Heterogeneity and External Validity in Medicine" and the European Journal of Epidemiology paper that it links to.

Comment by Anders_H on Shall we count the living or the dead? · 2021-06-14T18:45:45.487Z · LW · GW

Suppose you summarize the effect of a drug using a relative risk (a multiplicative effect parameter relating the probability of the event if treated to the probability of the event if untreated), and consider this multiplicative parameter to represent the "magnitude of the effect".

The natural thing for a clinician to do will be to assume that the magnitude of the effect is the same in their own patients. They will therefore rely on this specific scale for extrapolation from the study to their patients. However, those patients may have a different risk profile.

When clinicians do this, they will make different predictions depending on whether the relative risk is based on the probability of the event, or the probability of the complement of the event. 

Sheps' solution to this problem is the same as mine: if the intervention decreases the risk of the outcome, you should use the probability of the event to construct the relative risk, whereas if the intervention increases the risk of the event, you should use the probability of the complement of the event.
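
To make the asymmetry concrete, here is a toy calculation (the numbers are hypothetical, not from the paper) showing that the two choices of relative risk give different extrapolated predictions for a patient whose baseline risk differs from the trial population's:

```python
# Toy sketch: extrapolating a trial result to a patient with a different
# baseline risk, on two different multiplicative scales.

def extrapolate_event_rr(trial_p0, trial_p1, patient_p0):
    # Assume the risk ratio of the event itself transfers ("counting the dead").
    rr = trial_p1 / trial_p0
    return patient_p0 * rr

def extrapolate_complement_rr(trial_p0, trial_p1, patient_p0):
    # Assume the risk ratio of the complement transfers ("counting the living").
    srr = (1 - trial_p1) / (1 - trial_p0)
    return 1 - (1 - patient_p0) * srr

# Trial: risk falls from 0.20 untreated to 0.10 treated.
# Patient: baseline risk 0.50.
print(extrapolate_event_rr(0.20, 0.10, 0.50))       # ~0.25
print(extrapolate_complement_rr(0.20, 0.10, 0.50))  # ~0.4375
```

The two scales disagree (roughly 0.25 versus 0.44 for the same patient); since this hypothetical intervention reduces risk, Sheps' rule says to use the first.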

Comment by Anders_H on Shall we count the living or the dead? · 2021-06-14T13:14:15.916Z · LW · GW

No. This is not about interpretation of probabilities. It is about choosing what aspect of reality to rely on for extrapolation. You will get different extrapolations depending on whether you rely on a risk ratio, a risk difference or an odds ratio. This will lead to real differences in predictions for what happens under intervention.
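
A toy calculation (hypothetical numbers of my choosing) makes the point concrete: the same trial result, extrapolated to a patient with a different baseline risk, yields a different prediction depending on which scale is assumed to transfer.

```python
# Sketch: one trial result, three effect scales, three different predictions
# for a patient whose baseline risk differs from the trial population's.

def predict(trial_p0, trial_p1, patient_p0, scale):
    if scale == "risk_ratio":
        return patient_p0 * (trial_p1 / trial_p0)
    if scale == "risk_difference":
        return patient_p0 + (trial_p1 - trial_p0)
    if scale == "odds_ratio":
        odds_ratio = (trial_p1 / (1 - trial_p1)) / (trial_p0 / (1 - trial_p0))
        patient_odds = (patient_p0 / (1 - patient_p0)) * odds_ratio
        return patient_odds / (1 + patient_odds)
    raise ValueError(scale)

# Trial: risk 0.20 untreated vs 0.10 treated; patient baseline risk 0.50.
for scale in ("risk_ratio", "risk_difference", "odds_ratio"):
    print(scale, predict(0.20, 0.10, 0.50, scale))
# Three different predictions: ~0.25, ~0.40, and ~0.31.
```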

Even if clinical decisions are entirely left to an algorithm, the algorithm will need to select a mathematical object to rely on for extrapolation. The person who writes the algorithm needs to tell the algorithm what to use, and the answer to that question is contested. This paper contributes to that discussion, and proposes a concrete solution. One that has been known for 65 years, but never used in practice. 

Comment by Anders_H on Effect heterogeneity and external validity in medicine · 2019-10-29T09:17:07.733Z · LW · GW

> Some time in the next week I'll write up a post with a few full examples (including the one from Robins, Hernan and Wasserman), and explain in a bit more detail.

I look forward to reading it. To be honest: Knowing these authors, I'd be surprised if you have found an error that breaks their argument.

We are now discussing questions that are so far outside of my expertise that I do not have the ability to independently evaluate the arguments, so I am unlikely to contribute further to this particular subthread (i.e. to the discussion about whether there exists an obvious and superior Bayesian solution to the problem I am trying to solve).

Comment by Anders_H on Effect heterogeneity and external validity in medicine · 2019-10-27T13:15:56.300Z · LW · GW

I don't have a great reference for this.

A place to start might be Judea Pearl's essay "Why I'm only half-Bayesian" at https://ftp.cs.ucla.edu/pub/stat_ser/r284-reprint.pdf . If you look at his Twitter account at @yudapearl, you will also see numerous tweets where he refers to Bayes Theorem as a "trivial identity" and where he talks about Bayesian statistics as "spraying priors on everything". See for example https://twitter.com/yudapearl/status/1143118757126000640 and his discussions with Frank Harrell.

Another good read may be Robins, Hernan and Wasserman's letter to the editor at Biometrics, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4667748/ . While that letter is not about graphical models, the propensity score/marginal structural model methods it discusses are mathematically very closely related. The main argument in that letter (which was originally a blog post) has been discussed on Less Wrong before; I am trying to find the discussion, and it may be this link: https://www.lesswrong.com/posts/xdh5FPMYYGGX7PBKj/the-trouble-with-bayes-draft

From my perspective, as someone who is not well trained in Bayesian methods and does not pretend to understand the issue well, I just observe that methodological work on causal models very rarely uses Bayesian statistics, that I myself do not see an obvious way to integrate it, and that most of the smart people working on causal inference appear to be skeptical of such attempts.

Comment by Anders_H on Effect heterogeneity and external validity in medicine · 2019-10-27T09:37:53.698Z · LW · GW

> Ok, I think that's the main issue here. As a criticism of Pearl and Bareinboim, I agree this is basically valid. That said, I'd still say that throwing out DAGs is a terrible way to handle the issue - Bayesian inference with DAGs is the right approach for this sort of problem.

I am not throwing out DAGs. I am just claiming that the particular aspect of reality that I think justifies extrapolation cannot be represented on a standard DAG. While I formalized my causal model for these aspects of reality without using graphs, I am confident that there exists a way to represent the same structural constraints in a DAG model. It is just that nobody has done it yet.

As for combining Bayesian inference and DAGs: This is one of those ideas that sounds great in principle, but where the details get very messy. I don't have a good enough understanding of Bayesian statistics to make the argument in full, but I do know that very smart people have tried to combine it with causal models and concluded that it doesn't work. Bayesianism therefore plays essentially no role in the causal modelling literature. If you believe you have an obvious solution to this, I recommend you write it up and submit it to a journal, because you will get a very impactful publication out of it.

> The equality of this parameter is not sufficient to make the prediction we want to make - the counterfactual is still underspecified. The survival ratio calculation will only be correct if a particular DAG and counterfactual apply, and will be incorrect otherwise.

In a country where nobody plays Russian roulette, you have valid data on the distribution of outcomes under the scenario where nobody plays Russian roulette (due to simple consistency). In combination with knowledge about the survival ratio, this is sufficient to make a prediction for the distribution of outcomes in a counterfactual where everybody plays Russian roulette.
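
A minimal sketch of that calculation (the baseline survival figure is made up; only the 5/6 survival ratio comes from the structure of the game):

```python
# Assume the survival ratio for one round of Russian roulette (5/6) is the
# same in both countries; extrapolate from the observed Norwegian baseline.

survival_ratio = 5 / 6  # P(survive | play) / P(survive | don't play)

# Observed one-year survival in Norway, where nobody plays (consistency).
# This number is hypothetical.
p_survive_norway = 0.99

# Predicted one-year survival if everyone in Norway played once:
p_survive_if_play = p_survive_norway * survival_ratio
print(p_survive_if_play)  # ~0.825
```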

Comment by Anders_H on Effect heterogeneity and external validity in medicine · 2019-10-26T19:55:41.363Z · LW · GW

> Identifiability, sure. But latents still aren't a problem for either extrapolation or model testing, as long as we're using Bayesian inference. We don't need identifiability.

I am not using Bayesian inference, and neither are Pearl and Bareinboim. Their graphical framework ("selection diagrams") is very explicitly set up as model for reasoning about whether the causal effect in the target population is identified in terms of observed data from the study population and observed data from the target population. Such identification may succeed or fail depending on latent variables and depending on the causal structure of the selection diagram.

I am confident that Pearl and Bareinboim would not disagree with me about the preceding paragraph. The point of disagreement is whether there are realistic ways to substantially reduce the set of variables that must be measured, by using background knowledge about the causal structure that cannot be represented on selection diagrams.

> The obvious causal model for the Russian roulette example is one with four nodes:
> first node indicating whether roulette is played
> second node, child of first, indicating whether roulette killed
> third node, child of second, indicating whether some other cause killed (can only happen if the person survived roulette)
> fourth node, death, child of second and third node
> This makes sense physically, has a well-defined counterfactual for Norway, and produces the risk difference calculation from the post. What information is missing?

In my model of reality (and I am sure, in most other people's model of reality), the third node has a wide range of unobserved latent ancestors. If the goal is to make inferences about the effect of Russian roulette in Russia using data from Russia, your analytic objective will be to find a set of nodes that d-separate the first node from the fourth node. You do not need to condition on the latent causes of the third node to achieve this (because those latent variables are not also causes of the first node; they cannot be, because the first node was randomized). The identification formula for the effect in Russia is therefore invariant to whether the latent causes of the third node are represented on the graph or not, and you therefore do not have to show them. The DAG model then represents a huge equivalence class of causal models; you can be agnostic between causal models within this equivalence class because the inferences are invariant between them.

But if the goal is to make predictions about the effect in Norway using data from Russia, these latent variables suddenly become relevant. The goal is no longer to d-separate the fourth node from the first node, but to d-separate the fourth node from an indicator for whether a person lives in Russia or Norway. In the true data generating mechanism (i.e. in the reality that the model is trying to represent), there almost certainly are a substantial number of open paths between the indicator for whether a person lives in Norway or Russia and their risk of death. The only possible identification formula for the effect in Russia includes terms for distributions that are conditional on the latent variables. The effect in Norway is therefore not identified from the Russian data.

> The underlying structure of reality is still a DAG, it's only our information about reality which will be non-DAG-shaped. DAGs show the causal structure

I agree that reality is generated by a structure that looks something like a directed acyclic graph. But that does not mean that all significant aspects of reality can be modeled using Pearl's specific operationalization of causal DAGs/selection diagrams.

Any attempt to extrapolate from Russia to Norway is going to depend on a background belief that some aspect of the data generating structure is equal between the countries. In the case of Russian roulette, I argue that the natural choice of mathematical object to hang our claims to structural equality on, is the parameter that takes the value 5/6 in both countries.

In DAG terms, you can think of the data generating mechanism for node 4 as responding to a property of the path 1->2->4. In particular, this path forces the quantities Pr(Fourth node =0 | do(First node=1)) and Pr(Fourth node =0 | do(First node=0)) to be related by a factor of 5/6 in both countries. Reality still has a DAG structure, but you won't find a way to encode the figure 5/6 in a causal model based only on selection diagrams. Without a way to encode a parameter that takes the value 5/6, you have to take a long detour where you collect a truckload of data and measure all the latent variables.

Comment by Anders_H on Effect heterogeneity and external validity in medicine · 2019-10-26T13:13:48.046Z · LW · GW

> The key issue is that we're asking a counterfactual question. The question itself will be underdefined without the context of a causal model. The Russian roulette hypothetical is a good example: "Our goal is to find out what happens in Norway if everyone took up playing Russian roulette once a year". What does this actually mean? Are we asking what would happen if some mad dictator forced everyone to play Russian roulette? Or if some Russian roulette social media craze caught on? Or if people became suicidal en-masse and Russian roulette became popular accordingly? These are different counterfactuals, and the answer will be different depending on which of these we're talking about. We need the machinery of counterfactuals - and therefore the machinery of causal models - in order to define what we mean at all by "what happens in Norway if everyone took up playing Russian roulette once a year". That counterfactual only makes sense at all in the context of a causal model, and is underdefined otherwise.

I absolutely agree that this is a counterfactual question. I am using the machinery of counterfactuals and causal models, just a different causal model from the one you and Pearl prefer. In this case, I had in mind a situation that is roughly equivalent to a mad dictator forcing everyone to play Russian roulette, but the underspecified details are not all that important to the argument I am making.

> I assume by "unmeasured causes" you mean latent variables - i.e. variables in the causal graph which happened to not be observed. A causal diagram framework can handle latent variables just fine; there is no fundamental reason why every variable needs to be measured. Latent variables are a pain computationally, but they pose no fundamental problem mathematically.

This is straight up wrong, and on this particular point the causal inference establishment is on my side, not yours. For example, if there are backdoor paths that cannot be closed without conditioning on a latent variable, then the causal effect is not identified and there is no amount of computation that can get around this.

> Indeed, much of machine learning consists of causal models with latent variables.

Much of machine learning gets causality wrong.

> Whether the treatment has an effect does not seem relevant here at all.

It is relevant because it allows me to construct a very simple scenario where we have very strong intuition that extrapolation should work; yet Pearl's selection diagram fails to make a prediction for the target population.

> No. My intuition very strongly says that 100% of the relevant structural information/model can be directly captured by causal models, and that you're just not used to encoding these sorts of intuitions into causal models. Indeed, counterfactuals are needed even to define what we mean, as in the Russian roulette example. The individual counterfactual distributions really are the thing we care about, and everything else is relevant only insofar as it approximates those counterfactual distributions in some situations.

I agree that you can encode all structural information in causal models. I do not agree that all structural information can be encoded in DAGs, which are one particular type of causal model. There are several examples of background information about the causal structure, which are essential for identifiability and which cannot be encoded on standard DAGs. For example, monotonicity is necessary for instrumental variable identification.

I am arguing that there is a special type of background information that is crucial for generalizability, and which cannot be encoded in Pearl/Bareinboim's causal diagrams for transportability. I therefore proposed a non-DAG causal model which is able to use this background structural knowledge. The Russian roulette example is an attempt to illustrate the nature of this class of background knowledge.

This does not mean that it is impossible to make an extension of the causal DAG framework to encode the same information. I am just arguing that this is not what the Pearl/Bareinboim selection diagram framework does.

> Overall, my impression is that you don't actually understand how to build causal models, and you are very confused about their applicability and limitations.

I did specifically invoke Crocker's Rules, so I'd like to thank you for this feedback.

Of course, I think you are wrong about this. I dislike appeals to authority, but I would like to point out that I have a doctoral degree in epidemiologic methodology from Harvard, and that my thesis advisors were genuine thought leaders in causal modelling. I also want to point out that both my papers on this topic have been reviewed by editors and peer-reviewers with a deep understanding of causal models.

This does, of course, not necessarily mean that you are wrong. It does, however, mean that I think you should adjust your priors and truly try to understand my argument before you reach such a strong posterior.

If you genuinely have found a flaw in my argument, I'd like you to state it explicitly rather than just claim that I don't understand causal models. In a hypothetical world in which I am wrong, I would very much like to know about it, as it would allow me to move on and work on something else.

Comment by Anders_H on Effect heterogeneity and external validity in medicine · 2019-10-26T08:41:48.538Z · LW · GW

I am curious why you think the approach based on causal diagrams is obviously correct. Would you be able to unpack this for me?

Does it not bother you that this approach fails to find a solution (i.e. won't make any predictions at all) if there are unmeasured causes of the outcome, even if treatment has no effect?

Does it not bother you that it fails to find a solution to the Russian roulette example, because the approach insists on treating "what happens if treated" and "what happens if untreated" as separate problems, and therefore fails to make use of information about how much the outcome differs between the two treatment options?

Does it not seem useful to have an alternative approach that is able to make use of all the intuition that says we should be able to make such extrapolations? An alternative approach that formalizes the intuition that led all the pre-Pearl literature to consider the problem in terms of the magnitude of the effect, not in terms of the individual counterfactual distributions?

Comment by Anders_H on The New Riddle of Induction: Neutral and Relative Perspectives on Color · 2017-12-02T18:56:46.422Z · LW · GW

In my view, "the problem of induction" is just a bunch of philosophers obsessing over the fact that induction is not deduction, and that you therefore cannot predict the future with logical certainty. This is true, but not very interesting. We should instead spend our energy thinking about how to make better predictions, and how we can evaluate how much confidence to have in our predictions. I agree with you that the fields you mention have made immense progress on that.

I am not convinced that computer programs are immune to Goodman's point. AI agents have ontologies, and their predictions will depend on that ontology. Two agents with different ontologies but the same data can reach different conclusions, and unless they have access to their source code, it is not obvious that they will be able to figure out which one is right.

Consider two humans who are both writing computer functions. Both the "green" and the "grue" programmer will believe that their perspective is the neutral one, and therefore write a simple program that takes light wavelength as input and outputs a constant color predicate. The difference is that one of them will be surprised after time t, when suddenly the computer starts outputting colors different from their programmer's experienced qualia. At that stage, we know which one of the programmers was wrong, but the point is that it might not be possible to predict this in advance.

Comment by Anders_H on The New Riddle of Induction: Neutral and Relative Perspectives on Color · 2017-12-02T17:20:29.841Z · LW · GW

I am not sure I fully understand this comment, or why you believe my argument is circular. It is possible that you are right, but I would very much appreciate a more thorough explanation.

In particular, I am not "concluding" that humans were produced by an evolutionary process; but rather using it as background knowledge. Moreover, this statement seems uncontroversial enough that I can bring it in as a premise without having to argue for it.

Since "humans were produced by an evolutionary process" is a premise and not a conclusion, I don't understand what you mean by circular reasoning.

Comment by Anders_H on Odds ratios and conditional risk ratios · 2017-02-02T15:03:19.943Z · LW · GW

Update: The editors of the Journal of Clinical Epidemiology have now rejected my second letter to the editor, and thus helped prove Eliezer's point about four layers of conversation.

Comment by Anders_H on Odds ratios and conditional risk ratios · 2017-01-25T06:02:48.605Z · LW · GW

"Why do you think two senior biostats guys would disagree with you if it was obviously wrong? I have worked with enough academics to know that they are far far from infallible, but curious on your analysis of this question."

Good question. I think a lot of this is due to a cultural difference between those of us who have been trained in the modern counterfactual causal framework, and an old generation of methodologists who felt the old framework worked well enough for them and never bothered to learn about counterfactuals.

Comment by Anders_H on Odds ratios and conditional risk ratios · 2017-01-25T03:55:43.955Z · LW · GW

I wrote this on my personal blog; I was reluctant to post this to Less Wrong since it is not obviously relevant to the core interests of LW users. However, I concluded that some of you may find it interesting as an example of how the academic publishing system is broken. It is relevant to Eliezer's recent Facebook comments about building an intellectual edifice.

Comment by Anders_H on Is Caviar a Risk Factor For Being a Millionaire? · 2017-01-25T02:32:53.938Z · LW · GW

VortexLeague: Can you be a little more specific about what kind of help you need?

A very short, general introduction to Less Wrong is available at http://lesswrong.com/about/

Essentially, Less Wrong is a reddit-type forum for discussing how we can make our beliefs more accurate.

Comment by Anders_H on Choosing prediction over explanation in psychology: Lessons from machine learning · 2017-01-18T13:58:16.346Z · LW · GW

Thank you for the link, that is a very good presentation and it is good to see that ML people are thinking about these things.

There certainly are ML algorithms that are designed to make the second kind of predictions, but generally they only work if you have a correct causal model.

It is possible that there are some ML algorithms that try to discover the causal model from the data. For example, /u/IlyaShpitser works on these kinds of methods. However, these methods only work to the extent that they are able to discover the correct causal model, so it seems disingenuous to claim that we can ignore causality and focus on "prediction".

Comment by Anders_H on Choosing prediction over explanation in psychology: Lessons from machine learning · 2017-01-18T01:23:22.918Z · LW · GW

I skimmed this paper and plan to read it in more detail tomorrow. My first thought is that it is fundamentally confused. I believe the confusion comes from the fact that the word "prediction" is used with two separate meanings: Are you interested in predicting Y given an observed value of X (i.e. Pr[Y | X=x]), or are you interested in predicting Y given an intervention on X (i.e. Pr[Y | do(X=x)])?

The first of these may be useful for certain purposes, but if you intend to use the research for decision making and optimization (i.e. you want to intervene to set the value of X in order to optimize Y), then you really need the second type of predictive ability, in which case you need to extract causal information from the data. This is only possible if you have a randomized trial, or if you have a correct causal model.

You can use the word "prediction" to refer to the second type of research objective, but this is not the kind of prediction that machine learning algorithms are designed to do.

In the conclusions, the authors write:

"By contrast, a minority of statisticians (and most machine learning researchers) belong to the “algorithmic modeling culture,” in which the data are assumed to be the result of some unknown and possibly unknowable process, and the primary goal is to find an algorithm that results in the same outputs as this process given the same inputs. "

The definition of "algorithmic modeling culture" is somewhat circular, as it just moves the ambiguity surrounding "prediction" to the word "input". If by "input" they mean that the algorithm observes the value of an independent variable and makes a prediction for the dependent variable, then you are talking about a true prediction model, which may be useful for certain purposes (diagnosis, prognosis, etc.) but which is unusable if you are interested in optimizing the outcome.

If you instead claim that the "input" can also include observations about interventions on a variable, then your predictions will certainly fail unless the algorithm was trained in a dataset where someone actually intervened on X (i.e. someone did a randomized controlled trial), or unless you have a correct causal model.

Machine learning algorithms are not magic, they do not solve the problem of confounding unless they have a correct causal model. The fact that these algorithms are good at predicting stuff in observational datasets does not tell you anything useful for the purposes of deciding what the optimal value of the independent variable is.
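As a toy illustration of this point (a hedged sketch with made-up numbers, not taken from the paper under discussion): suppose treatment X has no causal effect on Y, but both share an unmeasured cause Z. A model fit to observational data predicts Y from X quite well, yet the association it finds is pure confounding, as becomes obvious once X is set by intervention.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Structural model: Z causes both X and Y; X has NO effect on Y.
Z = rng.normal(size=n)
X = Z + rng.normal(size=n)
Y = 2 * Z + rng.normal(size=n)

# Observational regression of Y on X: slope near 1.0, driven
# entirely by the unmeasured common cause Z.
obs_slope = np.polyfit(X, Y, 1)[0]

# Intervention do(X=x): X is assigned independently of Z, and the
# slope collapses toward 0, the true causal effect.
X_int = rng.normal(size=n)
Y_int = 2 * Z + rng.normal(size=n)
int_slope = np.polyfit(X_int, Y_int, 1)[0]

print(obs_slope, int_slope)
```

An algorithm trained only on the observational data would "predict" accurately while being useless (indeed actively misleading) for choosing the optimal value of X.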

In general, this paper is a very good example to illustrate why I keep insisting that machine learning people need to urgently read up on Pearl, Robins or Van der Laan. The field is in danger of falling into the same failure mode as epidemiology, i.e. essentially ignoring the problem of confounding. In the case of machine learning, this may be more insidious because the research is dressed up in fancy math and therefore looks superficially more impressive.

Comment by Anders_H on Triple or nothing paradox · 2017-01-05T23:59:28.417Z · LW · GW

Thanks for catching that, I stand corrected.

Comment by Anders_H on Triple or nothing paradox · 2017-01-05T22:52:14.657Z · LW · GW

The rational choice depends on your utility function. Your utility function is unlikely to be linear in money. For example, if your utility function is log(X), then you will accept the first bet, be indifferent to the second bet, and reject the third bet. Any risk-averse utility function (i.e. any monotonically increasing function with negative second derivative) reaches a point where the agent stops playing the game.

A VNM-rational agent with a linear utility function over money will indeed always take this bet. From this, we can infer that linear utility functions do not represent the utility of humans.

(EDIT: The comments by Satt and AlexMennen are both correct, and I thank them for the corrections. I note that they do not affect the main point, which is that rational agents with standard utility functions over money will eventually stop playing this game)
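A minimal sketch of that last point (my own toy numbers, not from the original post): a log-utility agent with some outside wealth W, offered a repeated triple-or-nothing gamble on an accumulating pot, keeps playing while the pot is small relative to W and walks away once it is not.

```python
import math

def rounds_played(outside_wealth=100.0, pot=1.0):
    """Count how many triple-or-nothing rounds a log-utility agent accepts.

    Each round: gamble the whole pot for a 50% chance of tripling it,
    50% chance of losing it. Accept iff expected log-wealth goes up.
    """
    rounds = 0
    while True:
        keep = math.log(outside_wealth + pot)
        gamble = 0.5 * (math.log(outside_wealth + 3 * pot)
                        + math.log(outside_wealth))
        if gamble <= keep:
            return rounds
        rounds += 1
        pot *= 3  # assume the agent keeps winning

print(rounds_played())  # stops once the pot is large relative to outside wealth
```

With these assumed parameters the agent accepts a handful of rounds and then declines; increasing the outside wealth pushes the stopping point later, but it always arrives.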

Comment by Anders_H on A quick note on weirdness points and Solstices [And also random other Solstice discussion] · 2016-12-23T17:36:32.640Z · LW · GW

"Because I didn't perceive a significant disruption to the event, I was mentally bucketing you with people I know who severely dislike children and would secretly (or not so secretly) prefer that they not attend events like this at all; or that they should do so only if able to remain silent (which in practice means not at all.) I suspect Anders_H had the same reaction I did."

Just to be clear, I did not attend Solstice this year, and I was mentally reacting to a similar complaint that was made after last year's Solstice event. At last year's event, I did not perceive the child to be at all noteworthy as a disturbance. From reading this thread, it seems that the situation may well have been different this year, and that my reaction might have been different if I had been there. I probably should not have commented without being more familiar with what happened at this year's event.

I also note that my thinking around this may very well be biased, as I used to live in a group house with this child.

Comment by Anders_H on A quick note on weirdness points and Solstices [And also random other Solstice discussion] · 2016-12-22T06:00:51.168Z · LW · GW

While I understand that some people may feel this way, I very much hope that this sentiment is rare. The presence of young children at the event only adds to the sense of belonging to a community, which is an important part of what we are trying to "borrow" from religions.

Comment by Anders_H on Feature Wish List for LessWrong · 2016-12-19T07:57:05.056Z · LW · GW

I'd like each user to have their own subdomain (i.e. such that my top-level posts can be accessed either from Anders_h.lesswrong.com or from LW discussion). If possible it would be great if users could customize the design of their subdomain, such that posts look different when accessed from LW discussion.

Comment by Anders_H on This one equation may be the root of intelligence · 2016-12-12T02:56:28.887Z · LW · GW

Given that this was posted to LW, you'd think this link would be about a different equation...

Comment by Anders_H on Open thread, Nov. 21 - Nov. 27 - 2016 · 2016-11-22T20:58:25.516Z · LW · GW

The one-year embargo on my doctoral thesis has been lifted, it is now available at https://dash.harvard.edu/bitstream/handle/1/23205172/HUITFELDT-DISSERTATION-2015.pdf?sequence=1 . To the best of my knowledge, this is the first thesis to include a Litany of Tarski in the introduction.

Comment by Anders_H on On Trying Not To Be Wrong · 2016-11-11T22:08:48.111Z · LW · GW

Upvoted. I'm not sure how to phrase this without sounding sycophantic, but here is an attempt: Sarah's blog posts and comments were always top quality, but the last couple of posts seem like the beginning of something important, almost comparable to when Scott moved from squid314 to Slatestarcodex.

Comment by Anders_H on Open Thread, Sept 5. - Sept 11. 2016 · 2016-09-07T23:07:05.720Z · LW · GW

Today, I uploaded a sequence of three working papers to my website at https://andershuitfeldt.net/working-papers/

This is an ambitious project that aims to change fundamental things about how epidemiologists and statisticians think about choice of effect measure, effect modification and external validity. A link to an earlier version of this manuscript was posted to Less Wrong half a year ago, the manuscript has since been split into three parts and improved significantly. This work was also presented in poster form at EA Global last month.

I want to give a heads up before you follow the link above: Compared to most methodology papers, the mathematics in these manuscripts is definitely unsophisticated, almost trivial. I do however believe that the arguments support the conclusions, and that those conclusions have important implications for applied statistics and epidemiology.

I would very much appreciate any feedback. I invoke "Crocker's Rules" (see http://sl4.org/crocker.html) for all communication regarding these papers. Briefly, this means that I ask you, as a favor, to please communicate any disagreement as bluntly and directly as possible, without regards to social conventions or to how such directness may affect my personal state of mind.

I have made a standing offer to give a bottle of Johnnie Walker Blue Label to anyone who finds a flaw in the argument that invalidates the paper, and a bottle of 10-year-old Single Malt Scotch to anyone who finds a significant but fixable error, or makes a suggestion that substantially improves the manuscript.

If you prefer giving anonymous feedback, this can be done through the link http://www.admonymous.com/effectmeasurepaper .

Comment by Anders_H on Secret Rationality Base in Europe · 2016-06-17T19:47:36.910Z · LW · GW

This is almost certainly a small minority view, but from my perspective as a European based in the Bay Area who may be moving back to Europe next summer, the most important aspect would be geographical proximity to a decent university where staff and faculty can get away with speaking only English.

Comment by Anders_H on Why you should consider buying Bitcoin right now (Jan 2015) if you have high risk tolerance · 2016-06-14T17:57:25.679Z · LW · GW

What do you mean by "no risk"? This sentence seems to imply that your decisions are influenced by the sunk cost fallacy.

Try to imagine an alien who has been teleported into your body, who is trying to optimize your wealth. The fact that the coins were worth a third of their current price 18 months ago would not factor into the alien's decision.

Comment by Anders_H on Open thread, Jun. 13 - Jun. 19, 2016 · 2016-06-14T01:05:37.629Z · LW · GW

There may be an ethically relevant distinction between a rule that tells you to avoid being the cause of bad things, and a rule that says you should cause good things to happen. However, I am not convinced that causality is relevant to this distinction. As far as I can tell, these two concepts are both about causality. We may be using words differently, do you think you could explain why you think this distinction is about causality?

Comment by Anders_H on The Valentine’s Day Gift That Saves Lives · 2016-05-18T17:30:51.862Z · LW · GW

It would seem that the existence of such contractors follows logically from the fact that you are able to hire people despite the fact that you require contractors to volunteer 2/3 of their time.

Comment by Anders_H on Open Thread May 16 - May 22, 2016 · 2016-05-17T21:07:59.908Z · LW · GW

The Economist published a fascinating blog entry where they use evidential decision theory to establish that tattoo removal results in savings to the prison system. See http://www.economist.com/blogs/freeexchange/2014/08/tattoos-jobs-and-recidivism . Temporally, this blog entry corresponds roughly to the time I lost my respect for the Economist. You can draw your own causal conclusions from this.

Comment by Anders_H on How do you learn Solomonoff Induction? · 2016-05-17T18:21:35.480Z · LW · GW

Solomonoff Induction is uncomputable, and implementing it will not be possible even in principle. It should be understood as an ideal which you should try to approximate, rather than something you can ever implement.

Solomonoff Induction is just bayesian epistemology with a prior determined by information theoretic complexity. As an imperfect agent trying to approximate it, you will get most of your value from simply grokking Bayesian epistemology. After you've done that, you may want to spend some time thinking about the philosophy of science of setting priors based on information theoretic complexity.
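To make that concrete, here is a toy sketch (the description lengths and data are made up, purely illustrative): weight each hypothesis by 2^(-description length in bits), then update on observed coin flips in the ordinary Bayesian way. The simpler hypothesis starts with almost all the prior mass, and the data must pay off the complexity penalty before the more complex hypothesis overtakes it.

```python
# Hypotheses: (name, description length in bits, P(heads)).
# The lengths are invented for illustration; a real Solomonoff prior
# weights programs, not parameterized hypotheses like these.
hypotheses = [("fair", 2, 0.5), ("biased", 10, 0.9)]

data = "H" * 10  # ten heads in a row

posterior = {}
for name, length, p in hypotheses:
    prior = 2.0 ** (-length)  # simplicity prior
    heads = data.count("H")
    tails = len(data) - heads
    likelihood = p ** heads * (1 - p) ** tails
    posterior[name] = prior * likelihood

total = sum(posterior.values())
posterior = {k: v / total for k, v in posterior.items()}
print(posterior)
```

With these numbers, ten straight heads is just enough evidence for "biased" to edge past "fair" despite its 8-bit handicap; with fewer heads, the simplicity prior still dominates.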

Comment by Anders_H on Lesswrong 2016 Survey · 2016-03-26T22:10:21.611Z · LW · GW

I took the survey.

Comment by Anders_H on Open thread, Mar. 14 - Mar. 20, 2016 · 2016-03-21T17:28:37.253Z · LW · GW

Thanks. Good points. Note that many of those words are already established in the literature with the same meaning. For the particular example of "doomed", this is the standard term for this concept, and was introduced by Greenland and Robins (1986). I guess I could instead use "response type 1", but the word "doomed" will be much more effective at pointing to the correct concept, particularly for people who are familiar with the previous literature.

The only new term I introduce is "flip". I also provide a new definition of effect equality, and it therefore seems correct to use quotation marks in the new definition. Perhaps I should remove the quotation marks for everything else since I am using terms that have previously been introduced.

Comment by Anders_H on Open thread, Mar. 14 - Mar. 20, 2016 · 2016-03-20T17:00:36.339Z · LW · GW

"Do you mean probability instead of probably?"

Yes. Thanks for noticing. I changed that sentence after I got the rejection letter (in order to correct a minor error that the reviewers correctly pointed out), and the error was introduced at that time. So that is not what they were referring to.

"If the reviewers don't succeed in understanding what you are saying you might have explained yourself in casual language but still failed."

I agree, but I am puzzled by why they would have misunderstood. I spent a lot of effort over several months trying to be as clear as possible. Moreover, the ideas are very simple: The definitions are the only real innovation: Once you have the definitions, the proofs are trivial and could have been written by a high school student. If the reviewers don't understand the basic idea, I will have to substantially update my beliefs about the quality of my writing. This is upsetting because being a bad writer will make it a lot harder to succeed in academia. The primary alternative hypotheses for why they misunderstood are either (1) that they are missing some key fundamental assumption that I take for granted or (2) that they just don't want to understand.

Comment by Anders_H on Open thread, Mar. 14 - Mar. 20, 2016 · 2016-03-20T05:32:34.388Z · LW · GW

Three days ago, I went through a traditional rite of passage for junior academics: I received my first rejection letter on a paper submitted for peer review. After I received the rejection letter, I forwarded the paper to two top professors in my field, who both confirmed that the basic arguments seem to be correct and important. Several top faculty members have told me they believe the paper will eventually be published in a top journal, so I am actually feeling more confident about the paper than before it got rejected.

I am also very frustrated with the peer review system. The reviewers found some minor errors, and some of their other comments were helpful in the sense that they reveal which parts of the paper are most likely to be misunderstood. However, on the whole, the comments do not change my belief in the soundness of the idea, and in my view they mostly show that the reviewers simply didn’t understand what I was saying.

One comment does stand out, and I’ve spent a lot of energy today thinking about its implications: Reviewer 3 points out that my language is “too casual”. I would have had no problem accepting criticism that my language is ambiguous, imprecise, overly complicated, grammatically wrong or idiomatically weird. But too casual? What does that even mean? I have trouble interpreting the sentence to mean anything other than an allegation that I fail at a signaling game where the objective is to demonstrate impressiveness by using an artificially dense and obfuscating academic language.

From my point of view, “understanding” something means that you are able to explain it in a casual language. When I write a paper, my only objective is to allow the reader to understand what my conclusions are and how I reached them. My choice of language is optimized only for those objectives, and I fail to understand how it is even possible for it to be “too casual”.

Today, I feel very pessimistic about the state of academia and the institution of peer review. I feel stronger allegiance to the rationality movement than ever, as my ideological allies in what seems like a struggle about what it means to do science. I believe it was Tyler Cowen or Alex Tabarrok who pointed out that the true inheritors of intellectuals like Adam Smith are not people publishing in academic journals, but bloggers who write in a casual language. I can't find the quote, but today it rings more true than ever.

I understand that I am interpreting the reviewers' choice of words in a way that is strongly influenced both by my disappointment in being rejected, and by my pre-existing frustration with the state of academia and peer review. I would very much appreciate it if anybody could steelman the sentence "the writing is too casual", or otherwise help me reach a less biased understanding of what just happened.

The paper is available at https://rebootingepidemiology.files.wordpress.com/2016/03/effect-measure-paper-0317162.pdf . I am willing to send a link to the reviewers’ comments by private message to anybody who is interested in seeing it.

Comment by Anders_H on Link: Evidence-Based Medicine Has Been Hijacked · 2016-03-16T22:43:42.576Z · LW · GW

I think the evidence for the effectiveness of statins is very convincing. The absolute risk reduction from statins will depend primarily on your individual baseline risk of coronary disease. From the information you have provided, I don't think your baseline risk is extraordinarily high, but it is also not negligible.

You will have to make a trade-off where the important considerations are (1) how bothered you are by the side-effects, (2) what absolute risk reduction you expect based on your individual baseline risk, (3) the marginal price (in terms of side effects) that you are willing to pay for slightly better chance at avoiding a heart attack. I am not going to tell you how to make that trade-off, but I would consider giving the medications a try simply because it is the only way to get information on whether you get any side effects, and if so, whether you find them tolerable.

(I am not licensed to practice medicine in the United States or on the internet, and this comment does not constitute medical advice.)

Comment by Anders_H on If there was one element of statistical literacy that you could magically implant in every head, what would it be? · 2016-02-26T07:40:28.484Z · LW · GW

Why do you want to be able to do that? Do you mean that you want to be able to look at a spreadsheet and move around numbers in your head until you know what the parameter estimates are? If you have access to a statistical software package, this would not give you the ability to do anything you couldn't have done otherwise. However, that is obvious, so I am going to assume you are more interested in grokking some part of the underlying epistemic process. But if that is indeed your goal, the ability to do the parameter estimation in your head seems like a very low priority, almost more of a party trick than actually useful.

Comment by Anders_H on Open Thread, Feb 8 - Feb 15, 2016 · 2016-02-09T22:38:10.445Z · LW · GW

I disagree with this. In my opinion QALYs are much superior to DALYs for reasons that are inherent to how the measures are defined. I wrote a Tumblr post in response to Slatestarscratchpad a few weeks ago, see http://dooperator.tumblr.com/post/137005888794/can-you-give-me-a-or-two-good-article-on-why .

Comment by Anders_H on The Fable of the Burning Branch · 2016-02-08T18:51:30.203Z · LW · GW

Richard, I don't think Less Wrong can survive losing both Ilya and you in the same week. I hope both of you reconsider. Either way, we definitely need to see this as a wake-up call. This forum has been in decline for a while, but this week I definitely think it hit a breaking point.

Comment by Anders_H on Lesswrong Survey - invitation for suggestions · 2016-02-08T18:49:04.965Z · LW · GW

How about asking "What is the single most important change that would make you want to participate more frequently on Less Wrong?"

This question would probably not be useful for the census itself, but it seems like a great opportunity to brainstorm.

Comment by Anders_H on Disguised Queries · 2016-02-07T19:45:59.705Z · LW · GW

I run the Less Wrong meetup group in Palo Alto. Since we started announcing the events on Meetup.com, we often get a lot of guests who are interested in rationality but who have not read the LW sequences. I have an idea for an introductory session where we have the participants do a sorting exercise. Therefore, I am interested in getting 3D-printed versions of rubes, bleggs and the other items referenced in this post.

Does anyone have any thoughts on how to do this cheaply? Is there sufficient interest in this to get a Kickstarter running? I expect that these items may be of interest to other Less Wrong meetup groups, and possibly to CFAR workshops and/or schools.