[Link] "The madness of reduced medical diagnostics" by Dynomight
post by Kenny · 2022-06-16T19:20:36.556Z · LW · GW · 25 comments
Link:
This is (or now seems to me to be) an obvious-in-hindsight point, and I'm sad that I've never encountered it before (or don't remember having done so); at least not put so succinctly.
I'd like to try putting this 'advice' into practice myself, e.g. demanding that doctors share relevant base rates, but not otherwise avoiding seeing a doctor at all or refusing diagnostic tests (even if I expect the doctor's subsequent decisions to be bad).
25 comments
Comments sorted by top scores.
comment by gwern · 2022-06-16T21:34:52.987Z · LW(p) · GW(p)
This seems straightforward: "the Value Of Information is always positive" - for a rational agent. Thus, if you think that there's a situation where getting information is bad, the easiest way to reconcile this is that you aren't dealing with fully-rational agents. Which is hardly news.
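For readers who want the decision-theoretic claim spelled out, here is a minimal toy calculation (all numbers are invented for illustration) showing why, for an agent that updates and acts optimally, a free but noisy test can never lower expected utility:

```python
# Toy value-of-information check (all numbers are made up for illustration).
# An agent chooses "treat" or "wait" for a possible disease, with or without
# the result of a free but imperfect test. With optimal (Bayesian) use of the
# test result, expected utility can never be lower than ignoring the test.

prior_sick = 0.05          # hypothetical base rate
sensitivity = 0.9          # P(test+ | sick)
false_positive_rate = 0.2  # P(test+ | healthy)

# Utilities (again invented): keys are (true state, action).
utility = {
    ("sick", "treat"): -10, ("sick", "wait"): -100,
    ("healthy", "treat"): -20, ("healthy", "wait"): 0,
}

def expected_utility(p_sick, action):
    return p_sick * utility[("sick", action)] + (1 - p_sick) * utility[("healthy", action)]

def best_eu(p_sick):
    return max(expected_utility(p_sick, a) for a in ("treat", "wait"))

# Without the test: act on the prior alone.
eu_without = best_eu(prior_sick)

# With the test: Bayes-update on each possible result, then act optimally.
p_pos = prior_sick * sensitivity + (1 - prior_sick) * false_positive_rate
posterior_pos = prior_sick * sensitivity / p_pos
posterior_neg = prior_sick * (1 - sensitivity) / (1 - p_pos)
eu_with = p_pos * best_eu(posterior_pos) + (1 - p_pos) * best_eu(posterior_neg)

print(f"EU without test: {eu_without:.2f}")  # -5.00 with these numbers
print(f"EU with test:    {eu_with:.2f}")     # -4.75: never lower, for any parameters
```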
Replies from: anonymousaisafety, Kenny
↑ comment by anonymousaisafety · 2022-06-17T00:22:55.848Z · LW(p) · GW(p)
I disagree with this take and with the linked article. "The value of information is always positive" is brushing over the actual problem described by the healthcare studies: that taking a measurement is not guaranteed to accurately capture the world state, because a sensor can be faulty, and it is not always possible to distinguish a faulty sensor from a reliable sensor.
Me: If I went to talk to him, he’d probably lie. And probably it would be impossible to check his story without spending huge amounts of time and exposing myself to danger. But I’d feel obligated to do it anyway, and while I was distracted, the true criminal would get away. That risk outweighs the chance that he’d give me something useful.
This reasoning is claimed to be incorrect in the linked article, and is further addressed in its conclusion:
It’s a fact that if you make decisions correctly, then putting more information into the system can’t hurt you.
Consider sensors in a control system. We only add additional sensors to the controller if we can guarantee some level of quality from the sensor. If we can't guarantee the sensors are valid, then each additional sensor added to the system might not be adding information at all -- "garbage in, garbage out".
The article is making the claim that a rational agent has a reliable function F(information) -> garbage?, but that's ridiculous. "How do I tell whether my sensors are working correctly?" is one of the hardest problems in control theory. The solution used in system design is multiple, independent measurements that can all be assessed together.
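As a toy illustration of that "assess multiple independent measurements together" approach (my own sketch, not anything from the article or the studies): with redundant sensors, a reading that disagrees wildly with the others can at least be flagged as suspect, which is impossible with a single sensor.

```python
# Toy out-of-family check across redundant sensors (illustrative values only).
# With a single sensor there is no way to tell a faulty reading from a real one;
# with several independent readings, an outlier can at least be flagged.
import statistics

readings = [10.1, 9.9, 10.3, 47.2, 10.0]   # one sensor is clearly misbehaving

median = statistics.median(readings)
spread = statistics.median(abs(r - median) for r in readings)  # robust spread estimate

flagged = [r for r in readings if abs(r - median) > 5 * max(spread, 1e-9)]
print(f"median = {median}, suspect readings = {flagged}")  # flags the 47.2 reading
```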
The claim in the article is that doing these tests is just that: each test is an additional, independent measurement that can be assessed against the base rate, the other risk factors, or symptoms. That is technically true, except it then glosses over the problems:
Some of the above reasons to be careful about testing are fine. By all means, account for the costs of the CT scan itself (#1). And I’ll wearily pretend to accept that people are emotional and couldn’t understand Bayesian reasoning or false positives and so we need to worry about stressing them out (#2).
...
If you know that a patient’s prior probability for a condition is low, you still know that after doing a test. In a sane world, wouldn’t you do the CT scan, and then… only do the biopsy only if the CT scan showed something serious enough to justify the risks?
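To put rough numbers on the quoted claim (these are invented placeholders, not clinical figures): with a low base rate, even a positive scan typically leaves the posterior probability modest, and that posterior – not the raw test result – is what a downstream decision like a biopsy or surgery would be conditioned on.

```python
# Hedged back-of-envelope Bayes update; the numbers are invented placeholders,
# not real clinical statistics.
prior = 0.01               # assumed base rate of the condition in this population
sensitivity = 0.95         # assumed P(scan positive | condition present)
false_positive_rate = 0.10 # assumed P(scan positive | condition absent)

p_positive = prior * sensitivity + (1 - prior) * false_positive_rate
posterior = prior * sensitivity / p_positive

print(f"P(condition | positive scan) = {posterior:.3f}")  # ~0.088 with these numbers
```

Whether real physicians actually condition their next action on a posterior like this, rather than on the bare positive result, is exactly the point in dispute below.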
Look at one of the quoted studies.
This led to a PET scan that showed no small nodules but confirmed the lesion. Doctors considered surgery but decided against it because the lesion seemed to be growing too fast to be lung cancer. One month later, the lesion had shrunk, suggesting it was just some kind of inflammation or infection.
Bold is my own emphasis. Let's flip a coin and look at another world state, a world state that did not occur for this patient, but has occurred for others.
This led to a PET scan that showed no small nodules but confirmed the lesion. Doctors responded rapidly with surgery to remove the lesion. A post-surgery biopsy revealed that the lesion was not cancerous. Unfortunately, the patient was one of the 3% of people who die within 90 days of lung surgery.[1]
In the real world, there is no reliable function F(information) -> garbage?. The idea that sometimes a test returns a false positive and it is "obvious" to the doctors that it is a false positive is incorrect. What do you want the doctors to do? Run the test, when it wasn't likely that the patient has cancer (they have no other symptoms), see something that looks cancerous on the test (the false positive), and then do nothing? The conclusion here seems to be "obviously the doctor will realize it was a false positive and simply not operate", which is ignoring the corpus of evidence in the linked studies showing that the doctors couldn't distinguish the false positives from true positives![2]
In other words, if you have a threshold for action that is "patients with cancer have these symptoms and also a mass on a CT scan", but you have an arbitrary shortcut like "only do the biopsy only if the CT scan showed something serious enough to justify the risks" (quote from the article), then you've now tied your dangerous action (the surgery) to the thing that we know has a false positive rate -- the test!
The solution offered in the article is "well don't do the biopsy unless they also have the symptoms". This is the "multiple, independent measurements that are assessed together" approach. Except that if they don't have symptoms, and we've decided that symptoms are a prerequisite for the biopsy, then there's no reason to do the CT scan, which is exactly what the doctors concluded in the studies that are being criticized here.
[1] Actual percent not relevant, so long as the surgery is risky, e.g. above a 0.5% mortality rate. I grabbed this 3% number from various articles like this.
[2] It is weird to me that I need to say this, but when we discuss false positives on sensors, there's for some reason an assumption that within the context of a system, we "know" that we measured a false positive. In general, the system is not aware of a false positive; that's why false positives are a problem. The only way to "know" that a sensor returned a false positive is if you have other, independent measurements that you can use to do some type of out-of-family filtering.
↑ comment by gwern · 2022-06-17T01:03:46.859Z · LW(p) · GW(p)
that taking a measurement is not guaranteed to accurately capture the world state, because a sensor can be faulty, and it is not always possible to distinguish a faulty sensor from a reliable sensor.
No. Dynomight already addressed this: if you have an unreliable sensor (i.e. any sensor that has ever existed in the real world), then that simply reduces how useful it is, because it changes your posterior less than a more reliable one would. The VoI remains positive; I refer you to Ramsey and Savage on this particular point of decision theory.
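A quick sketch of that point, with invented numbers: as a sensor's false-positive rate approaches its true-positive rate, the likelihood ratio approaches 1 and a positive reading barely moves the posterior – the reading becomes worth less and less, but it never makes the Bayesian estimate worse.

```python
# How much a positive reading moves the posterior as sensor reliability degrades.
# Purely illustrative numbers.
prior = 0.05
sensitivity = 0.9

for false_positive_rate in (0.01, 0.1, 0.3, 0.6, 0.9):
    p_pos = prior * sensitivity + (1 - prior) * false_positive_rate
    posterior = prior * sensitivity / p_pos
    print(f"FPR={false_positive_rate:.2f}  posterior after positive = {posterior:.3f}")
# As FPR approaches the sensitivity, the posterior approaches the prior:
# the reading carries no information, but it never harms the estimate.
```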
All of your additional comments are generally wrong, and reflect an extremely rigid, absolutist approach to making decisions. We use unreliable correlated measures all the time; this is in fact 'technically true', and that is the point. And yes, your entire example of doctors is simply due to irrationality and does not refute the decision theory point; it has nothing to do with 'being obvious it is a false positive' except in the trivial sense that for a poor measure the posterior of a true positive remains far smaller than it being a false positive and may not motivate a decision, shrinking the VoI towards zero, which will frequently be so small as to not justify the cost of testing (explicitly pointed out by Dynomight). It is definitely the case that many tests cost too much for too little information and should not be run because the VoI is often zero (for a rational decision maker) and the test is simply a loss as it will not change any decisions. Nevertheless, the value of free information is always greater than or equal to zero, and if free information makes you worse off, that implies somewhere there is an irrationality.
(The really relevant problem with this in the context of medicine is that decision theory is considering single agents in a stochastic environment, an idealized physician ordering tests to try to optimize patient health, because game theory hasn't been invented yet; when you bring in multiple agents with different goals and mechanisms like lawsuits, then free information can be quite harmful, but this too is not lost on most people, including Dynomight at the end.)
Replies from: anonymousaisafety, anonymousaisafety
↑ comment by anonymousaisafety · 2022-06-17T03:15:31.780Z · LW(p) · GW(p)
NOTE: I wrote this as a separate reply because it's addressing your points about decision theory directly, and is not about the specific scenario discussed with the medical system.
if you have an unreliable sensor (ie. any sensor that has ever existed in the real world), then that simply reduces how useful it is, because it changes your posterior less than a more reliable one would.
I think the crux here is that you seem to be saying the usefulness of reading a sensor's value is in some interval [0, 1], where 1 represents that the value provided by the sensor is perfectly trustworthy and 0 is that the value provided by the sensor is totally useless, i.e. it's random noise. Under this belief, you're saying that it is always rational to acquire as many sensors as possible, because there is no downside to acquiring useless sensors. When you run your filter over all of the sensors, anything that has a usefulness of 0 is going to get dropped from the final result. Likewise, low-but-non-zero usefulness sensors are weighted accordingly in the final result.
In my work, this is called sensor fusion. So far, so good.
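For concreteness, a toy version of that weighting idea (a sketch with made-up numbers, not the commenter's actual system): independent Gaussian readings fused by inverse-variance weighting, so a near-useless sensor gets a weight near zero instead of corrupting the estimate.

```python
# Toy sensor fusion by inverse-variance weighting of independent Gaussian readings.
# Illustrative only; a real system would use a proper filter plus cross-checks.
readings = [
    (10.2, 0.5),    # (measured value, variance) - a fairly good sensor
    (9.8, 0.4),     # another good sensor
    (42.0, 1e6),    # a near-useless sensor: huge variance, so near-zero weight
]

weights = [1.0 / var for _, var in readings]
fused = sum(w * value for (value, _), w in zip(readings, weights)) / sum(weights)
fused_variance = 1.0 / sum(weights)

print(f"fused estimate = {fused:.3f}, fused variance = {fused_variance:.3f}")
# The bad sensor barely shifts the estimate - its "usefulness" is ~0, as described above.
```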
I can argue that acquiring each sensor has a cost associated with it, but it seems like the idea of "free information" is intended to deflect that argument. Let us assume that the sensors are provided for free, and it's just a question of "given an arbitrary number of sensors, with different usefulness, how many do you want to fuse when trying to model the correct world state?"
I think what you've said above implies that a rational actor should always want more sensors.
Nevertheless, the value of free information is always greater than or equal to zero, and if free information makes you worse off, that implies somewhere there is an irrationality.
More sensors leads to more sensor values ("information"), and the rational actor will simply use the usefulness of each sensor (which for the sake of argument we'll assume that they know exactly) when weighting each sensor value.
In the real world, I still disagree with this claim. Computational complexity[1] exists. There is a cost to interpreting, and fusing, an arbitrary number of sensor values. Each additional sensor, even if it was provided for free, is going to incur an actual cost in computation before that value can be used to make a decision. A rational actor would not accept an arbitrary number of useless sensors if it is going to take non-zero computational cycles to disregard them.
When you include the cost of computation, now the value of those sensors is in some interval [0 - c, 1 - c], where c is how much it costs in computational effort[2] to include the sensor in your filter. In this world, sensors can have less than zero usefulness, i.e. it is actively detrimental to include the sensor in your filter. Your filter functions worse with that sensor than it does without it.
I believe the only way out of this is to ignore computational complexity and assume that c = 0, but we know that isn't true. Consider the trivial thought experiment of me sitting here and providing you a series of useless facts about a fictional D&D campaign I'm running, like "A miraksaur is a type of dinosaur native to the planet Eurid.", except the facts never stop. How rational would it be for you to keep trying to enter each additional value into your world state? They're totally irrelevant, but if we ignore computational costs, there's no downside to doing so. The reason why you would be wise to tune me out in that scenario is because c is definitely greater than 0.
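A sketch of that [0 - c, 1 - c] argument with invented numbers: once each sensor carries a per-sensor processing cost c, a filter should simply exclude any sensor whose information value doesn't cover that cost.

```python
# Net value of including each sensor once a fixed processing cost c is charged.
# "usefulness" in [0, 1] is a stand-in for information value; numbers are invented.
c = 0.05  # assumed per-sensor computation cost, in the same (arbitrary) units

sensor_usefulness = [0.9, 0.4, 0.03, 0.0, 0.01]
selected = [u for u in sensor_usefulness if u - c > 0]  # drop sensors with negative net value

print("net values:", [round(u - c, 2) for u in sensor_usefulness])
print("sensors worth fusing:", selected)
# With c > 0, "more sensors is always at least as good" stops being true:
# the last three sensors would make the system strictly worse to include.
```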
[1]
[2] Note that c is only fixed per value in the case where the algorithm for fusing information has linear time complexity, O(N). We often use something like an extended Kalman filter (EKF) for sensor fusion. In that scenario, each additional value incurs an increasingly higher cost of computational effort to include it, so sensors with low usefulness are especially penalized. If I recall correctly, it is O(N^2). It'll get to a point where it doesn't matter how useful a sensor is; it would be irrational to try to include it because it'll be prohibitively expensive to run the full computation.
↑ comment by dynomight · 2022-06-17T03:31:28.267Z · LW(p) · GW(p)
If you're worried about computational complexity, that's OK. It's not something that I mentioned because (surprisingly enough...) this isn't something that any of the doctors discussed. If you like, let's call that a "valid cost", just like the medical risks and financial/time costs of doing tests. The central issue is whether it's valid to worry about information causing harmful downstream medical decisions.
Replies from: anonymousaisafety
↑ comment by anonymousaisafety · 2022-06-17T04:36:05.238Z · LW(p) · GW(p)
I'm sorry, but I just feel like we've moved the goal posts then.
I don't see a lot of value in trying to disentangle the concept of information from 1.) costs to acquire that information, and 2.) costs to use that information, just to make some type of argument that a certain class of actor is behaving irrationally.
It starts to feel like "assume a spherical cow", but we're applying that simplification to the definition of what it means to be rational. First, it isn't free to acquire information. But second, even if I assume for the sake of argument that the information is free, it still isn't free to use it, because computation has costs.
If a theory of rational decision making doesn't include that fact, it'll come to conclusions that I think are absurd – like the idea that the most rational thing someone can do is acquire literally all available information before making any decision.
↑ comment by anonymousaisafety · 2022-06-17T01:40:57.663Z · LW(p) · GW(p)
yes, your entire example of doctors is simply due to irrationality
So first you say this.
But then you start to backtrack:
in the trivial sense that for a poor measure the posterior of a true positive remains far smaller than it being a false positive and may not motivate a decision, shrinking the VoI towards zero, which will frequently be so small as to not justify the cost of testing
And further admit:
It is definitely the case that many tests cost too much for too little information and should not be run because the VoI is often zero (for a rational decision maker) and the test is simply a loss as it will not change any decisions.
But then you try to defend the initial claim that the doctors are being irrational:
Nevertheless, the value of free information is always greater than or equal to zero, and if free information makes you worse off, that implies somewhere there is an irrationality.
But we've already established that the tests are not free in the world we live in.
If you're going to prove the doctors are being irrational in the world we live in, then you can't change a core part of the problem statement. The tests do have costs -- in time, in money, in available machines, in false positives that may result in surgeries or other actions with non-zero risk, and in a dozen other ways, some of which were alluded to by Dynomight, like the possibility of lawsuits.
My whole argument, which you said is "generally wrong", is predicated on the fact that this information is not free. I don't accept the notion that people are being irrational because they are making decisions based on the reality of the world where information is not free just because we can hypothesize about worlds where that information is free.
Do you still disagree?
↑ comment by Kenny · 2022-06-18T19:03:05.376Z · LW(p) · GW(p)
I agree with the other replies you've received about this.
You're identifying other real costs of obtaining more information, but the value of any information obtained, i.e. after paying whatever costs are required, is still positive.
You're right that the expected net value of obtaining some particular information could be negative, i.e. obtaining it could be more 'expensive' than the information is worth.
But the information itself is always valuable. Practically, we might 'round down' its value to zero (0). But it's not literally zero (0).
Replies from: anonymousaisafety
↑ comment by anonymousaisafety · 2022-06-18T19:12:53.649Z · LW(p) · GW(p)
Are you ignoring the cost of computation to use that information, as I explained here [LW(p) · GW(p)], then?
Replies from: Kenny
↑ comment by Kenny · 2022-06-18T19:20:23.389Z · LW(p) · GW(p)
Nope!
That's a cost to use or process the information.
There's still a value of the information itself – or at least it seems to me like there is, if only in principle – even after it's been parsed/processed and is ready to 'use', e.g. for reasoning, updating belief networks, etc.
Replies from: anonymousaisafety
↑ comment by anonymousaisafety · 2022-06-18T19:41:23.077Z · LW(p) · GW(p)
Then I'm not sure what our disagreement is.
I gave the example of a Kalman filter in my other post. A Kalman filter is similar to recursive Bayesian estimation. It's computationally intensive to run for an arbitrary number of values due to how it scales in complexity. If you have a faster algorithm for doing this, then you can revolutionize the field of autonomous systems + self-driving vehicles + robotics + etc.
The fact that "in principle" information provides value doesn't matter, because the very example you gave of "updating belief networks" is exactly what a Kalman filter captures, and that's what I'm saying is limiting how much information you can realistically handle. At some point I have to say, look, I can reasonably calculate a new world state based on 20 pieces of data. But I can't do it if you ask me to look at 2000 pieces of data, at least not using the same optimal algorithm that I could run for 20 pieces of data. The time-complexity of the algorithm for updating my world state makes it prohibitively expensive to do that.
This really matters. If we pretend that agents can update their world state without incurring a cost of computation, and that it's the same computational cost to update a world state based on 20 measurements as it would take for 2000 measurements, or if we pretend it's only a linear cost and not something like N^2, then yes, you're right, more information is always good.
But if there are computational costs, and they do not scale linearly (like a Kalman filter), then there can be negative value associated with trying to include low quality information in the update of your world state.
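To give that scaling argument a concrete shape (a rough sketch, not a claim about any particular EKF implementation): in a standard Kalman measurement update, fusing m simultaneous measurements means working with an m x m innovation covariance, so the per-update cost grows superlinearly – roughly cubically for a dense inverse – in the number of measurements.

```python
import numpy as np

# Rough sketch of why fusing more measurements costs superlinearly in a Kalman-style
# update: the innovation covariance S is m x m (m = number of measurements), and the
# update inverts/solves against S, which is roughly O(m^3) for dense matrices.
# Dimensions and matrices here are arbitrary placeholders.

def kalman_measurement_update(x, P, z, H, R):
    """One linear Kalman measurement update (state estimate x, covariance P)."""
    S = H @ P @ H.T + R                  # innovation covariance, m x m
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain; the m x m inverse dominates cost
    x_new = x + K @ (z - H @ x)
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

n = 4                                    # state dimension (e.g. 2D position + velocity)
for m in (20, 200, 2000):                # number of simultaneous measurements to fuse
    x, P = np.zeros(n), np.eye(n)
    H = np.random.randn(m, n)            # measurement model (placeholder)
    R = np.eye(m)                        # measurement noise covariance
    z = H @ x + np.random.randn(m)       # simulated readings
    kalman_measurement_update(x, P, z, H, R)
    print(f"m={m}: this update inverted an {m}x{m} matrix")
```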
It is possible that the doctors are behaving irrationally, but I don't think any of the arguments here prove it. Similar to what mu says on their post here [LW(p) · GW(p)].
Replies from: Kenny
↑ comment by Kenny · 2022-06-18T19:51:52.788Z · LW(p) · GW(p)
You're not wrong but you're like deliberately missing the point!
You even admit the point:
The fact that "in principle" information provides value doesn't matter
Yes, the point was just that 'in principle', any information provides value.
I think maybe what's missing is that the 'in principle' point is deliberately ignoring costs, to make the point 'sharper' – and those costs are, by the time you have used some information, also 'sunk costs'.
The point is not that there are no costs or that the total value of benefits always exceeds the corresponding total anti-value of costs. The 'info profit' is not always positive!
The point is that the benefits are always (strictly) positive – in principle.
↑ comment by Kenny · 2022-06-18T18:59:17.094Z · LW(p) · GW(p)
Sadly, this was new to me – at least in spelling out this particular conclusion so plainly, e.g. for medical diagnostics.
I can still imagine some practical difficulties, e.g. being pressured or bullied by a doctor to have some low-expected-value procedure performed.
How useful is it for us to aim for any particular standard of behaving like a rational agent?
comment by cwillu (carey-underwood) · 2022-06-17T19:44:20.094Z · LW(p) · GW(p)
“The system doesn't know how to stop” --HPMoR!Harry
Replies from: carey-underwood
↑ comment by cwillu (carey-underwood) · 2022-06-17T20:00:17.641Z · LW(p) · GW(p)
By which I mean to imply:
How much of the problem is mistaking the act of providing input to a deterministic system for the act of providing information to an agent with discretion? Or (in less-absolute terms) making an error regarding the amount of discretion available to that agent.
Replies from: Kenny
comment by Shmi (shminux) · 2022-06-17T05:29:46.805Z · LW(p) · GW(p)
Remember, neither you nor your doctor is a rational agent, and sometimes that means avoiding situations where this irrationality rears its head more than usual.
Replies from: Kenny
↑ comment by Kenny · 2022-06-18T19:25:16.962Z · LW(p) · GW(p)
Can we, NOT being rational agents, reliably determine which of these "situations" we should avoid?
I can understand why 'medical care' or 'health' might be one of those areas – I think. Do you have any other particular areas in mind?
I kinda feel like "more than usual" is a really low/weird bar. Do many people spend any considerable amount of time in situations in which it's NOT the case that "irrationality rears its head more than usual"?
I think mostly people, including myself, are typically 'sleepwalking' and not reasoning irrationally or rationally at all.
Should 'we', and our doctors, just avoid these kinds of situations forever? Is there no positive expected value for any possible improvement in this kind of reasoning, in these situations?
Replies from: shminux
↑ comment by Shmi (shminux) · 2022-06-19T00:10:06.986Z · LW(p) · GW(p)
These are all good questions. I guess one indicator is the situations where we would feel more emotional than usual? The proverbial "triggers" are a thing, and they are indicative of being less rational than usual.
Replies from: Kenny
↑ comment by Kenny · 2022-06-20T12:37:31.636Z · LW(p) · GW(p)
I don't disagree, but I think I think (ha) of this kind of thing more along the lines of 'needing to prepare for' being "more emotional than usual", or preparing to handle "triggers" or other circumstances in which we'd expect our reasoning to be less rational than usual/ideal.
The time to steel oneself to handle 'difficult reasoning scenarios' is before they occur, i.e. before we feel "more emotional than usual" or are 'triggered'.
In my case, I'm long past being 'over' trusting doctors blindly. I've been practicing asking them about, e.g. base rates and the strength of research evidence. What I liked about this post is that it made me realize/recognize that I don't need to also avoid medical diagnostics – I can just 'ignore' the doctor's 'default algorithm' output instead!
As you pointed out in your original comment, it still might be sensible/reasonable/best to avoid some situations. I just find knowing of the option to just 'not do something stupid', or let someone make a stupid decision for me, to be a very helpful frame.
comment by mu_(negative) · 2022-06-17T14:30:27.612Z · LW(p) · GW(p)
I don't disagree with you exactly, but I think the focus on rational decision making misses the context the decisions are being made in. Isn't this just an unaligned incentives problem? When a patient complains of an issue, doctors face exposure to liability if they do not recommend tests to clarify the issue. If the tests indicate something, doctors face liability for not recommending corrective procedures. They generally face less liability for positively recommending tests and procedures because the risk is quantifiable beforehand and the patient makes the decision. If they decline a recommended test, the doctor can't be blamed.
The push to do less testing makes sense in that context. It has to emerge at the level of a movement so that the doctors have safety in numbers.
I am not in healthcare, perhaps this is cynical?
Edit: I see that Gwern already mentioned lawsuits briefly in a comment. But I think it deserves a lot more focus and obviates "you're not dealing with fully rational agents." I mean, maybe not, but that's not necessary to get this result.
Replies from: Kenny
↑ comment by Kenny · 2022-06-18T19:47:25.499Z · LW(p) · GW(p)
I don't disagree with you exactly, but I think the focus on rational decision making misses the context the decisions are being made in. Isn't this just an unaligned incentives problem?
I would agree that, in some sense, it is 'just' an "unaligned incentives problem". But those are thorny problems!
The insight I found valuable from the post was 'just' the idea that 'going along with unaligned incentives' wasn't inevitable. That, in fact, if we know or expect that the 'incentive system' is 'unaligned', we could try to find a way to 'just not do that'.
I now think that 'just not making this mistake' is something that's worth trying.
Doctor: "Let's do Diagnostic."
Me: "Okay"
[Diagnostic is done.]
Doctor: "Bad news. Diagnostic returned X. The standard treatment is Y."
Me: "Y given X is stupid because of, e.g. base rates."
And then either:
Doctor: "But Y is the standard treatment!"
Me: "No; goodbye."
or:
Doctor: "Oh yeah; good point. Let's not do Y then."
Me: "Hurray!"
Replies from: mu_(negative)
↑ comment by mu_(negative) · 2022-06-19T00:20:23.039Z · LW(p) · GW(p)
Hmm, yeah, I guess that's a good point. I was thinking myopically at a systems level. The post is useful advice for a patient who is willing to do their own research, confident they can do it thoroughly, and is not afraid to "stare into the abyss", i.e. risk getting freaked out or overwhelmed.
Although, I also wonder if insurance companies might try to exploit a patient's prior decision to decline recommended treatment/tests as a reason to not cover future costs...
Replies from: Kenny