The Cameron Todd Willingham test
post by Kevin · 2010-05-05T00:11:47.162Z · LW · GW · Legacy · 86 comments
In 2004, the State of Texas executed Cameron Todd Willingham by lethal injection for the crime of murdering his young children by setting fire to his house.
In 2009, David Grann wrote an extended examination of the evidence in the Willingham case for The New Yorker, which called Willingham's guilt into question. One of the prosecutors in the Willingham case, John Jackson, wrote a response summarizing the evidence from his current perspective. I am not summarizing the evidence here, so as not to give the impression of choosing it selectively.
A prior probability estimate for Willingham's guilt (certainly not a close-to-optimal prior) is the probability that a fire resulting in child fatalities was intentionally set. The US Fire Administration puts this probability at 13%. The prior could be made more accurate by breaking that 13% down across demographic groups, or by looking at correlations with other variables such as life insurance data.
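To make the arithmetic concrete, here is a minimal Python sketch of how a base rate like that 13% would serve as a prior and be updated by evidence; the likelihood ratios are invented placeholders, not estimates from the case record:

```python
# A minimal sketch: treat the 13% base rate as prior odds and fold in
# likelihood ratios. All likelihood ratios here are invented placeholders,
# not estimates from the actual case.

def update(prior_prob, likelihood_ratios):
    """Convert probability to odds, multiply in each likelihood ratio
    P(evidence | arson) / P(evidence | accident), convert back."""
    odds = prior_prob / (1 - prior_prob)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

prior = 0.13               # US Fire Administration base rate
lrs = [2.0, 0.5, 1.2]      # hypothetical evidence items; >1 favors arson
print(update(prior, lrs))  # posterior probability of arson, ~0.15
```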
My question for Less Wrong: Just how innocent is Cameron Todd Willingham? Intuitively, it seems to me that the evidence for Willingham's innocence is of greater magnitude than the evidence for Amanda Knox's innocence. But the prior probability that Willingham was guilty, given that his children died in a fire in his home, is higher than the prior probability that Amanda Knox committed murder, given that a murder occurred in her house.
Challenge question: What does an idealized form of Bayesian Justice look like? I suspect as a start that it would result in a smaller percentage of defendants being found guilty at trial. This article has some examples of the failures to apply Bayesian statistics in existing justice systems.
86 comments
Comments sorted by top scores.
comment by Psychohistorian · 2010-05-05T03:18:22.906Z · LW(p) · GW(p)
I suspect as a start that it would result in a smaller percentage of defendants being found guilty at trial.
I disagree. The most obvious reason is that were our system that efficient, prosecutorial behaviour would change.
But more significantly, a Bayesian processing system would not need to exclude relevant evidence. The only evidence it would exclude would (presumably) be that obtained in violation of the defendant's rights. By incorporating and accurately weighting certain forms of character and hearsay evidence that are not available to a jury, I believe one could prove many cases beyond a reasonable doubt that are currently impossible to prove (drug lords and mafia bosses, for example, would be rather easier to convict of something).
Guilty people getting off is, in general, not big news, unless the defendant or crime is already very high profile. Moreover, since the criminal cannot be retried under double jeopardy, no one really goes about examining the wrongfully freed. One can be freed after being wrongfully convicted, so there is some incentive to examine the wrongly convicted. (If you count the people who could be proven guilty to a Bayesian intelligence, but not to a jury, this is a much more significant problem).
In other words, you may well be right, but I think you need a lot more evidence to get to that conclusion. And that still requires prosecutorial behaviour to be exogenous, which is an unreasonable assumption.
Replies from: NancyLebovitz, Jack↑ comment by NancyLebovitz · 2010-05-05T10:44:29.064Z · LW(p) · GW(p)
There are problems with an adversarial system-- attorneys are rewarded for using weak but plausible arguments for their clients, and as stated above, there are also inequities in the quality of attorneys that different people can afford. Further, attorneys are so expensive that poorer people get an attorney supplied by the court who may not bother to try, or be so ill-paid that they can't afford to put a good case together.
There are also problems with an inquisitorial system. Iterated prisoner's dilemma implies that the investigator may be on the side of the court more than on the side of the accused. In theory, juries are supposed to prevent that, but US practice has been to give juries less and less information and to tell them that they have less and less freedom to use their whole minds.
I assume that a Bayesian court would permit-- possibly even encourage-- jurors to do additional research during the trial.
Some of the problems can be ameliorated by combining the systems-- three voices: one speaking for the prosecution, one for the defense, and a third that's hopefully a neutral investigator.
Permitting the jurors to ask questions (afaik, a relatively recent addition to US practice and not universal here-- I don't know about other countries) is a significant contribution to getting more minds on the problem of getting all the relevant information and arguments on the table.
Replies from: JRMayne, rhollerith_dot_com↑ comment by JRMayne · 2010-05-05T14:15:25.772Z · LW(p) · GW(p)
I'd just note that in my jurisdiction, if you had the choice between a random draw of the deputy public defenders and a random draw of the private attorneys, you'd want to draw from the first group. Public defense does not have to suck, and often doesn't.
That said, there are certainly differences in defense attorney quality, differences that matter.
Replies from: NancyLebovitz↑ comment by NancyLebovitz · 2010-05-05T14:22:43.393Z · LW(p) · GW(p)
Do you think the difference between the range of quality for deputy public defenders and for private attorneys is typical? If not, any theories about what causes the difference?
Replies from: JRMayne↑ comment by JRMayne · 2010-05-06T04:12:54.141Z · LW(p) · GW(p)
From my discussions in other mid-size and larger counties which have full-time properly paid public defenders, yes.
The reason for it is (IMO) that full-time public defenders often have much more experience, plus specific interest in criminal law. You won't get the hand-holding, but you will get the expertise.
In places where public defenders are paid poorly (or you have some other alternate defense method which pays poorly) your attorney pool will be, predictably, poorer.
Again, there are some... unfortunate attorneys of all stripes out there. But if you're in a jurisdiction where there's money and effort being put in public defense, that effort has a payoff.
↑ comment by RHollerith (rhollerith_dot_com) · 2010-05-05T19:30:58.541Z · LW(p) · GW(p)
Iterated prisoner's dilemma implies that the investigator may be on the side of the court more than on the side of the accused.
Please call them "prosecutors" since in the U.S., "investigators" are people without law degrees hired by the defense or prosecution to interview potential witnesses, etc.
Replies from: NancyLebovitz↑ comment by NancyLebovitz · 2010-05-05T20:11:19.048Z · LW(p) · GW(p)
I was using "investigator" to refer to the professionals who are laying out the arguments about the case without being on either side-- what I was taking an "inquisitorial" system to be. There are obvious reasons for not calling them "inquisitors".
They obviously aren't prosecutors because they aren't specifically attempting to get a conviction. Do you have a recommended term?
↑ comment by Jack · 2010-05-05T03:31:06.293Z · LW(p) · GW(p)
What do people think would be the ideal Bayesian structure for a trial? I think adversarial with a jury of peers is probably not the best choice. Some kind of inquisitorial panel makes sense, but you have to deal with groupthink and career-minded judges. The judges should be trained rationalists, I guess.
Replies from: mattnewport↑ comment by mattnewport · 2010-05-05T04:16:09.418Z · LW(p) · GW(p)
I think something like an adversarial system with a jury of peers would still be a good choice with perfect Bayesian agents. This structure exists largely to avoid conflicts of interest, not merely to compensate for human irrationality. Unless you are assuming that perfect Bayesian agents would not have conflicts of interest (and I don't see any reason to suppose that) then you would still want to maintain those aspects of the legal system that are designed to avoid such conflicts.
There are two common classes of reasons why evidence may be inadmissible or excludable in a trial. One of these classes should probably be admissible with a perfect Bayesian jury, the other not.
Evidence that is inadmissible because it would prejudice or mislead the jury (like information about prior convictions) would probably be fine with a jury of perfect Bayesians but evidence that is thrown out because it was obtained in some way that society deems unacceptable might still be rejected because of broader concerns about creating inappropriate incentives for law enforcement.
This raises the question of just how perfect your Bayesians are, however. If they are very good at correctly weighing relevant evidence but still have computational limits, these concerns would probably apply. If they are some kind of idealized agents with infinite computational capacity then you might draw different conclusions, but as this case is impossible it is not very interesting in my opinion.
Replies from: Jack, JoshuaZ↑ comment by Jack · 2010-05-05T05:26:33.362Z · LW(p) · GW(p)
So obviously if everyone was a perfect Bayesian agent a jury would be fine. I actually think imagining how things would work with perfect Bayesians is boring. I was thinking more realistically, as in, how would I design a new justice system tomorrow (or in 5 years when we're really good at this) to reduce irrationality as much as possible. No perfect Bayesians, just smart people you can teach things to.
The advantage of judges over juries is that we could teach judges to be rationalists as part of their job. Also, the adversarial system strikes me as bias inducing. The wealthy get better representation. I think even if we want to keep adversaries in place we should at least give whoever is making the decision a more investigative and active role. Instead of just listening to arguments they should be asking questions and thinking things through in real time. More like the way the Supreme Court hears cases than the way a regular jury does. Listening to two pieces of propaganda and then deciding doesn't seem like the ideal way of settling questions of fact. It might be a decent starting point, though.
Making that biasing evidence admissible is definitely a good call assuming properly trained judges.
Replies from: mattnewport↑ comment by mattnewport · 2010-05-05T06:55:07.936Z · LW(p) · GW(p)
The advantage of judges over juries is that we could teach judges to be rationalists as part of their job.
I still think you are missing a primary reason for having a jury system. The role of the judge in a jury system is to be an expert on the law and to explain to juries how it applies to the current case, to ensure the trial is conducted under the rules of the system, and to pass sentence.
The decision of guilt or innocence is delegated to the jury in an attempt to avoid conflicts of interest. Because of the requirement for the judge to be an expert on the law it must be a long term position and so the judge would be an obvious target for bribery or intimidation and will be prone to conflicts of interest when deciding verdicts where the state's interests are opposed to a private citizen's. Historically jury nullification was seen as an important check on unreasonable or unpopular laws.
Adversarial systems have similar advantages in terms of reducing the potential for conflicts of interest to unduly influence the outcome of a trial. The alternative of an inquisitorial system again runs the risk of biasing the court in favour of the state or establishment and against the private citizen.
The history of common law systems reveals an ongoing process attempting to balance the rights and interests of the state, private citizens, the defendant and the prosecution in a world where bias and conflict of interest are recognized to exist. The results are not perfect but understanding why such systems developed as they did is an essential prerequisite to any consideration of possible improvements.
Replies from: Jack↑ comment by Jack · 2010-05-05T07:40:39.583Z · LW(p) · GW(p)
I'm quite familiar with these arguments. I'm not sure why you'd suppose I wasn't.
The decision of guilt or innocence is delegated to the jury in an attempt to avoid conflicts of interest. Because of the requirement for the judge to be an expert on the law it must be a long term position and so the judge would be an obvious target for bribery or intimidation and will be prone to conflicts of interest when deciding verdicts where the state's interests are opposed to a private citizen's.
First, my claim is that determining matters of fact also requires a kind of expertise. Second, I'm generally skeptical about the extent to which an adversarial system protects judges from conflicts of interest, bribes and intimidation more than an inquisitorial system does. Even in adversarial systems judges exercise a lot of control and are bribed and intimidated on a regular basis. Lots of liberal democracies have inquisitorial systems and aren't especially notorious for coerced or biased judges. Is there evidence that civil systems lead to more bribery? Third, this fear of judges siding with "the state" against private citizens can be (and is) remedied by checks and balances elsewhere (like the kind all liberal societies already have) and mechanisms to allow the public to fire judges. Fourth, what conflicts of interest aren't avoidable with oversight in an inquisitorial system that are avoidable in an adversarial system?
I don't think a fair, rational system should include anything like jury nullification.
The results are not perfect but understanding why such systems developed as they did is an essential prerequisite to any consideration of possible improvements.
"The results are not perfect" considerably understates the problems, methinks.
Replies from: NancyLebovitz↑ comment by NancyLebovitz · 2010-05-05T11:06:07.396Z · LW(p) · GW(p)
Third, this fear of judges siding with "the state" against private citizens can be (and is) remedied by checks and balances elsewhere (like the kind all liberal societies already have) and mechanisms to allow the public to fire judges.
What specific checks and balances do you have in mind? The current system can let serious forensic fraud and gross judicial misconduct continue for a very long time.
↑ comment by JoshuaZ · 2010-05-05T04:24:41.240Z · LW(p) · GW(p)
Even if you have good Bayesians, you might still want to throw out prejudicial information that isn't likely to be relevant. It is safer that way and doesn't rely on making as narrow an estimate about how good people are at being rational.
Alternatively, it might make sense to do away with juries altogether and simply have judges decide everything. However, there's some evidence that judges are not much better than juries at deciding cases. So I'm not sure that would help much.
comment by ata · 2010-05-05T10:28:28.989Z · LW(p) · GW(p)
I've tried to think about the problem of idealized Bayesian Justice before, but usually I start by replacing the judge with a robot, and then the jury, and then the lawyers, and then Congress and the Supreme Court and the entire Executive Branch, and pretty soon I've reinvented FAI badly or I find myself trying to rewrite the US Code in Python.
I think "idealized" might be too high a standard.
Replies from: kpreid
comment by komponisto · 2010-05-06T18:17:52.364Z · LW(p) · GW(p)
My question for Less Wrong: Just how innocent is Cameron Todd Willingham? Intuitively, it seems to me that the evidence for Willingham's innocence is of higher magnitude than the evidence for Amanda Knox's innocence.
In both instances, the prosecution case amounts to roughly zero bits of evidence. However, demographics give Willingham a higher prior of guilt than Knox, perhaps by something like an order of magnitude (1 to 4 bits). I am therefore about an order of magnitude more confident in Knox's innocence than Willingham's.
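For readers unused to bits: the conversion between bits of evidence and odds is just powers of two, so an order of magnitude in odds is log2(10) ≈ 3.3 bits, and "1 to 4 bits" spans odds factors of 2 to 16. A quick sketch:

```python
import math

# Each bit of evidence doubles the odds; an order of magnitude in odds
# is log2(10) ~= 3.32 bits.
def bits_to_odds_factor(bits):
    return 2 ** bits

def odds_factor_to_bits(factor):
    return math.log2(factor)

print(bits_to_odds_factor(1), bits_to_odds_factor(4))  # 2, 16
print(odds_factor_to_bits(10))                         # ~3.32
```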
Challenge question: What does an idealized form of Bayesian Justice look like?
Bayesian jurors (preferably along with Bayesian prosecutors and judges); that's really all it comes down to.
In particular, discussions about the structure of the judicial system are pretty much beside the point, in my view. (The Knox case is not about the Italian justice system, pace just about everyone.) Such systematic rules exist mostly as an attempt at correcting for predictable Bayesian failures on the part of the people involved. In fact, most legal rules of evidence are nothing but crude analogues of a corresponding Bayesian principle. For example, the "presumption of innocence" is a direct counterpart of the Bayesian prohibition against privileging the hypothesis.
There is this notion that Bayesian and legal reasoning are in some kind of constant conflict or tension, and oh-whatever-are-we-to-do as rationalists when judging a criminal case. (See here for a classic example of this kind of hand-wringing.) I would like to dispel this notion. It's really quite simple: "beyond a reasonable doubt" just means P(guilty|evidence) has to be above some threshold, like 99%, or something. In which case, if it's 85%, you don't convict. That's all there is to it. (In particular, away with this nonsense about how P(guilty|evidence) is not the quantity jurors should be interested in; of course it is!)
From our perspective as rationality-advocates, the best means of improving justice is not some systematic reform of legal systems, but rather is simply to raise the sanity waterline of the population in general.
Replies from: Yvain, SilasBarta, NancyLebovitz↑ comment by Scott Alexander (Yvain) · 2010-05-06T19:45:01.549Z · LW(p) · GW(p)
Now that you mention it directly, it's flabbergasting that no one's ever said what percentage level "beyond a reasonable doubt" corresponds to (legal eagles: correct me if I'm wrong). That's a pretty gaping huge deviation from a properly Bayesian legal system right there.
Replies from: komponisto, JRMayne↑ comment by komponisto · 2010-05-06T20:09:09.410Z · LW(p) · GW(p)
Well, the number could hardly be made explicit, for political reasons ("you mean it's acceptable to have x wrongful convictions per year?? We shouldn't tolerate any at all!").
In any case, let me not be interpreted as arguing that the legal system was designed by people with a deep understanding of Bayesianism. I say only that we, as Bayesians, are not prevented from working rationally within it.
Replies from: JRMayne↑ comment by JRMayne · 2010-05-07T15:15:13.498Z · LW(p) · GW(p)
This is the third time on LW that I've seen the percentage of certainty for convictions conflated with the percentage of wrongful convictions (I suspect it's just quick writing or perhaps my overwillingness to see that implication on this particular post). They're not identical.
Suppose we had a quantation standard of 99% certainty, and juries were entirely rational actors who understood how thin a slice 1% is and were given unskewed evidence. The percentage of wrongful convictions would be well under 1% at trial; juries would convict on cases from 99% certainty to c. 100% certainty. The actual percentage of wrongful convictions would depend on the skew of the cases in that range.
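A toy simulation of this point (the case-strength distribution is invented purely for illustration): with calibrated jurors and a 99% standard, the wrongful-conviction rate among convictions is bounded by 1%, and its actual value depends on the skew of convicted cases within the 99-100% range:

```python
import random

random.seed(0)

# Hypothetical: each tried case arrives with a calibrated P(guilty | evidence);
# the Beta(8, 2) case-strength distribution is invented for illustration.
cases = [random.betavariate(8, 2) for _ in range(100_000)]

THRESHOLD = 0.99
convicted = [p for p in cases if p >= THRESHOLD]

# With calibrated probabilities, the expected wrongful-conviction rate among
# convictions is the mean of (1 - p) over convicted cases: under 1%, and
# exactly how far under depends on the skew of cases above the threshold.
wrongful = sum(1 - p for p in convicted) / len(convicted)
print(f"{len(convicted)} convictions; expected wrongful rate {wrongful:.4%}")
```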
Replies from: komponisto↑ comment by komponisto · 2010-05-07T16:27:55.235Z · LW(p) · GW(p)
Yes, the certainty level provides a bound on the number of wrongful convictions. A 99% certainty requirement means at least 99% certainty, so an error rate of at most 1%.
↑ comment by JRMayne · 2010-05-07T15:09:10.954Z · LW(p) · GW(p)
It is, in fact, illegal to argue a quantation of "reasonable doubt."
I'm a fan of the jury system, but I do think quantation would lead to less, not more, accuracy by juries. Arguing math to lawyers is bad enough; to have lawyers generally arguing math to juries is not going to work. (I like lawyers and juries, but mathy lawyers in criminal law are quite rare.)
Replies from: SilasBarta, clay↑ comment by SilasBarta · 2010-05-07T16:25:11.329Z · LW(p) · GW(p)
Probably because the math isn't explained properly.
That said, I do agree in the sense that I think juries can still come to the same verdict, the same way they do now (by intuition), and then just jigger the likelihood ratios to rationalize their decision. However, it's still a significant improvement in that questionable judgments are made transparent.
For example, "Wait a sec -- you gave 10 bits of evidence to Amanda Knox having a sex toy, but only 2 bits to her DNA being nowhere at the crime scene? What?"
↑ comment by clay · 2010-05-07T15:52:51.035Z · LW(p) · GW(p)
Illegal??
From wikipedia:
One of the earliest attempts to quantify reasonable doubt was a 1971 article... In a later analysis of the question ("Distributions of Interest for Quantifying Reasonable Doubt and Their Applications," 2006[9]), three students at Valparaiso University presented a trial to groups of students... From these samples, they concluded that the standard was between 0.70 and 0.74.
The majority of law theorists believe that reasonable doubt cannot be quantified. It is more a qualitative than a quantitative concept. As Rembar notes, "Proof beyond a reasonable doubt is a quantum without a number."[10]
Replies from: JRMayne
↑ comment by JRMayne · 2010-05-07T18:17:30.765Z · LW(p) · GW(p)
It's illegal for the prosecution or defense to do so in court. Apologies for the lack of context.
The paper citing the .70-.74 numbers causes me to believe that the people who participated were unbelievably bad at quantation, or that the flaws the 2006 paper points out in the 1971 paper are sufficient to destroy the value of that finding, or that this is one of many studies with fatal flaws. I expect there are very few jurors indeed who would convict with a belief that the defendant was 25% likely to be innocent.
I wonder if quantation interferes with analysis for some large group of people? Perhaps just the mention of math interferes with efficient analysis. I don't know; I can say that in math- or physics-intensive cases, both sides try to simplify for the jury.
In fact, we have some types of cases with fact patterns that give us fairly narrow confidence ranges; if there's a case where I'm 75% certain the guy did it, and no likely evidence or investigation will improve that number, that's either not issued, or if that state has been reached post-issuance, the case is dismissed.
↑ comment by SilasBarta · 2010-05-06T19:12:14.173Z · LW(p) · GW(p)
I wouldn't go that far. There are many cases where the legal system explicitly deviates from Bayesianism. Some examples:
Despite the fact that Demographic Group X is more/less likely to have committed crime Y, neither side can introduce this as evidence, e.g. "Since my client is a woman, you should reduce the odds you assign to her having committed a murder by a factor of 4." (Obviously, the jury will notice the race/gender of the defendant, but you can't argue that this is informative about the odds of guilt.)
Prohibition on many types of prejudicial evidence that is informative about the probability of guilt (like whether the defendant is a felon). (This can be justified on grounds of cognitive bias maybe, but not Bayesian grounds.)
In the US, the Constitutional prohibition on using the defendant's silence as evidence, despite its informativeness, e.g., "If he's really innocent, why doesn't he just tell his side of the story? What's the big deal? Why did he wait hours before even saying what happened? Did he need to get his story straight first?" (Again, the jury will notice that the defendant didn't take the stand, but you can't draw their attention to this as the prosecution.)
The exclusionary rule. The fact that physical evidence was illegally collected (i.e. not forced confessions, but e.g. warrantless searches) has a small to non-existent impact on that evidence's strength. The policy of excluding illegally-obtained evidence may be justified on decision-theoretic grounds, but not on Bayesian grounds.
Outside of trials, the fact that you have to wait years before you hear a judge's binding opinion on whether or not a law actually can be enforced (i.e. is Constitutional).
You give the legal system way too much credit.
Replies from: komponisto, RobinZ↑ comment by komponisto · 2010-05-06T19:41:19.128Z · LW(p) · GW(p)
But notice that these are examples of restrictions on evidence of guilt. The assumption (very reasonable, it seems to me) is that human irrationality tends in the direction of false positives, i.e. wrongful convictions. (Possibly along with the assumption that our values require a lower tolerance for false positives than false negatives.)
If juries are capable of convicting on the sort of evidence presented at the Knox/Sollecito trial (and they are, whether in Italy, the U.S., or anywhere else)...well, can you imagine all the false convictions we would have if such rules as you listed were relaxed?
Replies from: Sticky, SilasBarta↑ comment by Sticky · 2010-05-07T15:38:28.742Z · LW(p) · GW(p)
The bias toward false positives is probably especially strong in criminal cases. The archetypal criminal offense is such that it unambiguously happened (not quite like the Willingham case), and in the ancestral human environment there were far fewer people around who could have done it. That makes the priors for everyone higher, which means that for whatever level of probability you're asking for it takes less additional evidence to get there. That a person is acting strangely might well be enough -- especially since you'd have enough familiarity with that person to establish a valid baseline, which doesn't and can't happen in any modern trial system.
Now add in the effects of other cognitive biases: we tend to magnify the importance of evidence against people we don't like and excessively discount evidence against people we do. That's strictly noise when dealing with modern criminal defendants, but ancestral humans actually knew the people in question, and had better reason for liking or disliking them. That might count as weak evidence by itself, and a perfect Bayesian would count it while also giving due consideration to the other evidence. But these weren't just suspects, but your personal allies or rivals. Misweighing evidence could be a convenient way of strengthening your position in the tribe, and having a cognitive bias let you do that in all good conscience. We can't just turn that off when we're dealing with strangers, especially when the media creates a bogus familiarity.
↑ comment by SilasBarta · 2010-05-06T20:40:28.545Z · LW(p) · GW(p)
But notice that these are examples of restrictions on evidence of guilt.
No, they're not. The first one I listed can go either way.
"Since my client is a woman, you should reduce the odds you assign to her having committed a murder by a factor of 4."
The second one can go either way too; it just as much excludes e.g. hearsay evidence that implicates someone else.
The assumption (very reasonable, it seems to me) is that human irrationality tends in the direction of false positives, i.e. wrongful convictions.
Sure, but that needs to be accounted for via the guilt probability threshold, not by reducing the accuracy of the evidence. Favoring acquittal through a high burden and biasing evidence in favor of the defendant is "double-dipping".
If juries are capable of convicting on the sort of evidence presented at the Knox/Sollecito trial (and they are, whether in Italy, the U.S., or anywhere else)...well, can you imagine all the false convictions we would have if such rules as you listed were relaxed?
I only listed a few examples off the top of my head. The appropriate comparison is to the general policy of, per Bayesianism, incorporating all informative evidence. This would probably lead to more accurate assessments of guilt. In particularly egregious cases like K/S, it would have been a tremendous boon to the defendants to have an explicit guilt threshold and count up the (log) likelihood ratio of all the evidence.
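A sketch of what such an explicit tally might look like; the ledger entries and bit values are invented for illustration, and note that the simple sum assumes the evidence items are conditionally independent, which is exactly the objection RobinZ raises below:

```python
import math

def posterior(prior_prob, evidence_bits):
    """Sum log-odds contributions in bits, then convert back to probability.
    Assumes the evidence items are conditionally independent given guilt."""
    prior_bits = math.log2(prior_prob / (1 - prior_prob))
    total_bits = prior_bits + sum(evidence_bits.values())
    odds = 2 ** total_bits
    return odds / (1 + odds)

# Hypothetical ledger: positive bits favor guilt, negative favor innocence.
ledger = {
    "DNA absent from crime scene": -4.0,
    "witness places defendant elsewhere": -1.5,
    "inconsistent statements": +1.0,
}
print(posterior(prior_prob=0.05, evidence_bits=ledger))  # ~0.002
```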
In any case, remember that there's a cost to false negatives as well. Although that's heavily muddled by the fundamental injustice of so many laws for which such a cost is non-existent.
Replies from: komponisto↑ comment by komponisto · 2010-05-06T23:43:16.100Z · LW(p) · GW(p)
Let me take a step back here, because despite the fact that it sounds like we're arguing, I find myself in total agreement with other comments of yours in this thread, in particular your description of how trials should work; I could scarcely have said it better myself.
Here's what I claim: the rules of evidence constitute crude attempts to impose some degree of rationality on jurors and prosecutors who are otherwise not particularly inclined to be rational. These hacks are not always successful, and occasionally even backfire; and they would not be necessary or useful for Bayesian juries who could be counted on to evaluate evidence properly. However, removing such rules without improving the rationality of jurors would be a disaster.
(Let's not forget, after all, that there were people here on LW who reacted with indignation at my dismissal of certain discredited evidence in the Knox case, protesting that legal rules of admissibility don't apply to Bayesian calculations -- as if I had been trying to pass off some kind of legal loophole as a Bayesian argument. Such people were apparently taking it for granted that this evidence was significant, which suggests to me that it is very difficult for people -- even aspiring rationalists -- to discount information they come across. This provides support for the necessity of rules that exclude certain kinds of information from courtrooms, given the population currently doing the judging.)
Replies from: SilasBarta, RobinZ↑ comment by SilasBarta · 2010-05-07T02:19:34.231Z · LW(p) · GW(p)
Okay, then I think we're in agreement. I guess I had interpreted your earlier comment as a much stronger claim about the mapping between pure Bayesianism and existing legal systems, but I definitely agree with what you've said here. I would just note that it would probably be more accurate to say that the rules of evidence are hacks to approximate Bayes and correct for predictable cognitive biases, though perhaps in this context those aren't quite separate categories.
↑ comment by RobinZ · 2010-05-06T19:45:41.947Z · LW(p) · GW(p)
The policy on excluding illegally-obtained evidence may be justified on decision-theoretic grounds, but not on Bayesian grounds.
In that case, why should we design the system on Bayesian grounds?
I think that's really why I concur with komponisto - our system may not be optimal, but an optimal system has to work as a system, including resistance to gaming. Aside from what you suggest about constitutionality, on which I have no comment, your changes are generally unlikely to improve the ability of a legal system to prosecute the guilty and acquit the innocent.
Replies from: JGWeissman, SilasBarta↑ comment by JGWeissman · 2010-05-06T21:16:12.226Z · LW(p) · GW(p)
I think the proper response to illegally obtained evidence, is to allow it to be presented as evidence, but charge those who obtained it with whatever crimes made its obtainment illegal.
The problem with implementing this in the current system is that the government has a monopoly on prosecuting criminal charges, so that agents of the government can get away with criminal acts. If ordinary citizens had the same power as district attorneys to seek indictments and prosecute criminal charges, it would provide a huge disincentive for illegally obtaining evidence, and many other government abuses.
↑ comment by SilasBarta · 2010-05-06T20:44:18.794Z · LW(p) · GW(p)
In that case, why should we design the system on Bayesian grounds?
Maybe we shouldn't; I was just disputing komponisto's insinuation that there's some unappreciated, general mapping between Bayesianism and the existing justice system.
I think that's really why I concur with komponisto - our system may not be optimal, but optimal for a system has to work as a system, including resistance to gaming.
Even when it allows so much relative weight to be given to sociological "evidence" ("she had a wild sex life") compared to physical evidence?
Replies from: RobinZ↑ comment by RobinZ · 2010-05-06T20:50:23.083Z · LW(p) · GW(p)
Maybe we shouldn't; I was just disputing komponisto's insinuation that there's some unappreciated, general mapping between Bayesianism and the existing justice system.
I agree that the necessity of a mapping has not been shown, although that's not what I read into komponisto's comment.
Even when it allows so much relative weight to be given to sociological "evidence" ("she had a wild sex life") compared to physical evidence?
No. But that would be best corrected by sanity and education, not by changing the law. A jury of people interested primarily in the physical evidence would not be distracted by trivia about countercultural tendencies on the parts of relevant persons.
Replies from: SilasBarta↑ comment by SilasBarta · 2010-05-06T20:58:22.163Z · LW(p) · GW(p)
that would be best corrected by sanity and education, not by changing the law. A jury of people interested primarily in the physical evidence would not be distracted by trivia about countercultural tendencies on the parts of relevant persons.
But I think it would make a big (positive) difference if everything had to be phrased in terms of likelihood ratios against a prior and guilt threshold.
Replies from: RobinZ↑ comment by RobinZ · 2010-05-06T21:54:39.474Z · LW(p) · GW(p)
Individual pieces of evidence are not independent. Evidence that Mortimer Q. Snodgrass left his home at 11:50, arrived at the scene of the crime at midnight, and returned home fifteen minutes later is damning if the victim died at midnight and exculpatory if the victim died three hours later. There's a combinatorial explosion in trying to describe the effects of every piece of evidence separately.
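A minimal numerical illustration of the dependence problem (all probabilities invented): two items of evidence that each look incriminating can be nearly redundant, so multiplying their individual likelihood ratios overstates the case:

```python
# All probabilities invented. E1 = "left home at 11:50",
# E2 = "seen near the scene at midnight".
p_e1_guilty, p_e1_innocent = 0.9, 0.1
p_e2_guilty, p_e2_innocent = 0.9, 0.1

# Treating the items as independent multiplies their likelihood ratios:
naive_lr = (p_e1_guilty / p_e1_innocent) * (p_e2_guilty / p_e2_innocent)  # 81

# But if E2 almost always accompanies E1 whether or not the defendant is
# guilty, the joint probabilities stay close to the single-item ones:
p_both_guilty, p_both_innocent = 0.88, 0.09
joint_lr = p_both_guilty / p_both_innocent  # ~9.8, an eightfold overcount

print(naive_lr, joint_lr)
```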
Replies from: SilasBarta↑ comment by SilasBarta · 2010-05-06T22:16:56.909Z · LW(p) · GW(p)
Sure, but at least each side can draw its theorized causal diagram, how the evidence fits in, how the likelihood ratios interplay (per Pearl's method of separating inferential and causal evidence flows), and what probability that justifies. It would still lend a clarity of thought not currently present among the mouthbreathers on juries that haven't been exposed to any of this, even if you had to train them in it first.
(And that would be easy if the trainers really understood [at Level 2 at least] causal diagrams and read my forthcoming article on guidelines for explaining...)
Replies from: thomblake, RobinZ↑ comment by thomblake · 2010-05-06T22:41:46.090Z · LW(p) · GW(p)
read my forthcoming article on guidelines for explaining
Please make this come forth promptly. I plan to explain some pretty complicated stuff to a bunch of people soon, and could use the help!
Replies from: SilasBarta↑ comment by SilasBarta · 2010-05-06T22:42:51.946Z · LW(p) · GW(p)
I'll do my best.
↑ comment by RobinZ · 2010-05-06T22:23:35.612Z · LW(p) · GW(p)
I'll put off judgment until after your article, then.
Replies from: SilasBarta↑ comment by SilasBarta · 2010-05-06T22:27:09.880Z · LW(p) · GW(p)
Thanks, but I don't see how much the points being discussed here hinge on it.
Are you saying that you're skeptical that Pearl's networks and Bayesian inference can be quickly (e.g. over a day or so) explained to random people selected for jury duty, but might be convinced of the ease of such training after seeing my exposition of how to enhance your explanatory abilities?
Replies from: thomblake, RobinZ↑ comment by thomblake · 2010-05-06T22:40:31.142Z · LW(p) · GW(p)
Related: maybe you just suck at explaining
Replies from: SilasBarta↑ comment by SilasBarta · 2010-05-06T22:43:34.260Z · LW(p) · GW(p)
LOL, you have no idea how many times I've thought that about people who claim something's hard to explain ...
↑ comment by RobinZ · 2010-05-06T22:32:16.522Z · LW(p) · GW(p)
Yes. Edit: That's probably a better summary of my thoughts than I could give at the moment, even.
Replies from: SilasBarta, SilasBarta↑ comment by SilasBarta · 2010-05-06T22:35:44.402Z · LW(p) · GW(p)
Can I call 'em or what? ;-)
Replies from: RobinZ↑ comment by RobinZ · 2010-05-06T22:39:42.119Z · LW(p) · GW(p)
I aim to be predictable. (-:
Replies from: SilasBarta↑ comment by SilasBarta · 2010-05-09T14:21:30.170Z · LW(p) · GW(p)
Hm, now that I think about it, that by itself should be evidence I have some abnormally high explanatory mojo -- if I could explain your position to you better than you could explain it to yourself. :-P
Replies from: RobinZ↑ comment by SilasBarta · 2010-05-06T22:35:05.755Z · LW(p) · GW(p)
Damn I'm good B-)
↑ comment by NancyLebovitz · 2010-05-07T00:11:25.075Z · LW(p) · GW(p)
Were defense attorneys left out by accident, or do you think it's not important that they be Bayesian?
Replies from: komponisto↑ comment by komponisto · 2010-05-07T00:51:33.756Z · LW(p) · GW(p)
It's important that everyone be Bayesian, of course.
To address the implied subtext: yes, I'm in general more worried about false convictions than false acquittals.
Arguably, if investigators and jurors were pure Bayesian epistemic rationalists, attorneys (on either side) wouldn't even be necessary. That's an extremely fanciful state of affairs, however.
comment by JRMayne · 2010-05-05T14:32:09.145Z · LW(p) · GW(p)
I would heartily suggest grave caution in having high confidence in conclusions based on media accounts. In this case, the forensic arson investigation was badly flawed. I don't know the other evidence well enough to speak, though I know as a loyal Bayesian, I could make an estimate of guilt. Still, I'd have very low confidence in that estimate, and it could be easily changed (as by reading the entire trial transcript).
It's very difficult to see these assertions without at least a nod to Roger Keith Coleman's case. That case received a great deal of post-execution publicity, significantly more than Willingham and with roughly equal implications of factual innocence.
comment by gregconen · 2010-05-05T02:38:01.928Z · LW(p) · GW(p)
As ever, both stories are studies in irrelevancy and emotional appeal.
Which probably reflects a good bit of what's wrong with the criminal justice system.
Though unscientific scientific testimony is also a serious problem, apparently also seen in this case.
comment by Jack · 2010-05-05T02:32:55.023Z · LW(p) · GW(p)
What prior are you using for Knox? The pre-murder "chances a nice, pretty, upper-middle-class American girl would be involved in a murder in the next year" or the "chances the nice, pretty, upper-middle-class American girl was involved in murder, given that her roommate was stabbed to death"?
The major difference is going to end up being that Meredith Kercher was actually murdered and Willingham's children were not.
Replies from: komponisto↑ comment by komponisto · 2010-05-05T20:43:13.113Z · LW(p) · GW(p)
What prior are you using for Knox? The pre-murder "chances a nice, pretty, upper-middle-class American girl would be involved in a murder in the next year" or the "chances the nice, pretty, upper-middle-class American girl was involved in murder, given that her roommate was stabbed to death"?
I've been kicking myself for months for not having done a better job of making the point that it shouldn't matter: so long as you take into account all of the relevant information, Bayes' theorem says nothing about the order in which you process the information.
What we label "prior probability" in everyday contexts like this is just an arbitrary matter of convenience: "priors" are in fact posteriors based on information not explicitly mentioned -- and that information is still there, whether we mention it or not.
Thus, for example, if you assign a high "prior" to Amanda Knox's guilt because of Meredith Kercher's death, you still have to lower the probability (to something around the "pretty middle-class female honor student" range) upon learning that Kercher's death was caused by Rudy Guede -- the screening off effect doesn't go away just because you decided to call that state of knowledge the "prior"; it just acquires the label of "evidence of innocence."
(Just to be clear, I'm not suggesting you don't realize this; I just think the point needs to be reinforced.)
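The order-independence claim is easy to check numerically; in the sketch below (numbers invented), processing the same likelihood ratios in either order yields the same posterior:

```python
def update(prob, lr):
    """One Bayesian update by a likelihood ratio, on probabilities."""
    odds = prob / (1 - prob) * lr
    return odds / (1 + odds)

prior = 0.01
lrs = [20.0, 0.1, 5.0]  # hypothetical evidence items, in any order

forward = prior
for lr in lrs:
    forward = update(forward, lr)

backward = prior
for lr in reversed(lrs):
    backward = update(backward, lr)

# Same posterior (up to floating-point rounding) regardless of which
# item you choose to call the "prior" and which the "evidence".
print(forward, backward)
```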
comment by RobinZ · 2010-05-05T01:26:38.025Z · LW(p) · GW(p)
What is this supposed to teach us about rationality that we did not learn from the Amanda Knox case?
I don't think it is a good idea to invoke any sort of controversy without some specific novel point to make. I would not object were it just a thought experiment in an open thread, but good cause is necessary for a top-level post.
Replies from: Kevin, Jack↑ comment by Kevin · 2010-05-05T03:20:09.856Z · LW(p) · GW(p)
I asked two questions that I wanted to see answers for. And the Bayesian justice question is a really hard question.
Not every post has to be made with the specific goal of teaching rationality. There is a much larger set of possible purposes for posts.
This thread probably belongs in the meta thread.
Replies from: RobinZ↑ comment by Jack · 2010-05-05T02:36:14.322Z · LW(p) · GW(p)
I don't see what the harm is in more practice.
Replies from: RobinZ↑ comment by RobinZ · 2010-05-05T02:57:17.526Z · LW(p) · GW(p)
I would hold top-level posts to a higher standard than "don't see the harm".
Replies from: Clippy, Kevin↑ comment by Clippy · 2010-05-05T13:28:10.711Z · LW(p) · GW(p)
Right, top level posts need to be held to a strict standard: they need to be good, not just "somewhat nice to have around". Furthermore, they should include things of practical relevance to our everyday existence, like materials science, recycling, non-destructive fasteners, etc.
comment by neq1 · 2010-05-05T12:18:28.666Z · LW(p) · GW(p)
The prior probability of 0.13 is wrong. That would be correct if 13% of fires resulting in fatalities of children were intentionally set by the children's dad.
Replies from: Kevin↑ comment by Kevin · 2010-05-07T07:21:00.517Z · LW(p) · GW(p)
Yes, that is why I did not say that the prior probability was correct. I said it was a prior probability estimate.
If you can find demographic data that gives you a better prior probability than 13%, or would like to suggest an arbitrary constant to multiply by, go for it. I meant just to emphasize that the prior probability for Willingham's guilt is at least an order of magnitude higher than the prior probability of Knox's guilt.
Replies from: neq1↑ comment by neq1 · 2010-05-07T12:31:06.432Z · LW(p) · GW(p)
"I meant just to emphasize that the prior probability for Willingham's guilt is at least an order of magnitude higher than the prior probability of Knox's guilt."
I think you made an interesting observation. Just thought it was worth noting that the prior probability is probably too high.
Another aspect of this is whether we are being fully Bayesian or not. If fully Bayesian, we'd have a prior distribution for the probability. That prior might have a mean or mode at 14%, but still be pretty flat (reflecting uncertainty).
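A sketch of the distinction being drawn, using a Beta distribution as the fully Bayesian "prior over the prior"; the parameters are invented to put the mode at 14% while staying fairly flat:

```python
import random

random.seed(0)

# Point estimate: prior P(arson) = 0.14.
# Fully Bayesian alternative: a distribution over that probability, e.g.
# Beta(1.7, 5.3), whose mode is (1.7 - 1) / (1.7 + 5.3 - 2) = 0.14 but
# which is quite flat, reflecting uncertainty about the base rate itself.
a, b = 1.7, 5.3
draws = [random.betavariate(a, b) for _ in range(100_000)]
print(f"mode {(a - 1) / (a + b - 2):.2f}, "
      f"mean {sum(draws) / len(draws):.2f}")  # mode 0.14, mean ~0.24
```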
comment by PhilGoetz · 2010-05-05T03:13:54.530Z · LW(p) · GW(p)
I think Bayesian justice would result in a larger percentage of defendants being found guilty at trial, because instead of "guilty beyond a reasonable doubt", the prosecution would only have to prove "expected value of conviction > expected value of no conviction".
EDIT: On the other hand, if someone committed an awful crime, but can convince you that they won't do it again; or if they might, but they pay a lot of taxes; let them go.
If the standard used is value to society, then if the defendant is judged to have no value to society, and executions are cheap, then convict and execute if p(defendant will commit more crime) > 0. If the defendant has a net cost to society, execute regardless.
If government functions via redistribution of taxation, then most people have a negative value to society, since most of the government's income comes from the top 10% or so. Therefore, execute the bottom 90%. Tax, and redistribute among the survivors. Again, the bottom 90% has negative value. Execute. Repeat. You eventually converge on a single citizen, whose expected contribution to society (minus his cost to society) is zero by some measures. At that point, flip a coin.
Replies from: Jack, MartinB, JRMayne, Mass_Driver↑ comment by MartinB · 2010-05-05T06:52:43.056Z · LW(p) · GW(p)
There is still a whole bag of philosophical problems to be solved regarding punishment. Do you punish to prevent the same person from doing something again, do you punish as retribution for the victims (then it doesn't make sense to tax them to provide prisons), or do you want to make an example to deter others from crime? At the moment there are many inconsistent terms put on various charges, which merits resolving. I have read a few times that the legal system works because most people are aware of punishment, not because it punishes most punishworthy deeds, or because it's particularly fair. In that case a Bayesian system might make very little practical difference from the current one (except for all those not guilty). A decent reform would include some liability for attorneys who withhold evidence, threaten the accused, and such. Maybe someone will put up a whole thought-out concept.
↑ comment by Mass_Driver · 2010-05-05T03:23:58.997Z · LW(p) · GW(p)
I agree with Jack. You are also conflating tax receipts, the "exchange value" of a citizen, with intrinsic worth, the "use value" of a citizen. In so far as it exists as a real phenomenon, society doesn't value citizens because they pay taxes; society values citizens because society is a construct set up by and for the benefit of citizens.
Also, much government spending is not fairly characterizable as mere redistribution; once you killed off, say, the bottom 40%, you would find that all citizens produced at least some net surplus, some of which would be confiscated to spend on public goods. Some remaining citizens would not contribute as much surplus as others, but an evil government that maximized total tax receipts (as opposed to average tax receipts, which is just a really weird goal) would not feel any urge to execute those citizens.
Replies from: PhilGoetz↑ comment by PhilGoetz · 2010-05-05T03:31:29.250Z · LW(p) · GW(p)
you would find that all citizens produced at least some net surplus, some of which would be confiscated to spend on public goods.
I'm assuming that governments don't have surpluses in the long run. This is, historically, true without exception AFAIK. Too bad - if they did, the country could retire and live off its savings.
comment by Jack · 2010-05-05T02:35:40.404Z · LW(p) · GW(p)
What does an idealized form of Bayesian Justice look like?
To begin with, "beyond reasonable doubt" needs to be replaced with "beyond X% certainty", where 100-X is whatever percent of innocent convictions we're comfortable with.
Replies from: ata, jimmy↑ comment by jimmy · 2010-05-05T19:08:05.371Z · LW(p) · GW(p)
"whatever percent of innocent convictions we're comfortable with." Isn't the right way to do it. You need to weigh the expected utilities and go with that.
For example, say we have someone suspected of murder, and you think it's only 20% sure that he did it, but executing him, given that he's guilty, saves an expected 10 lives; then you do it. If there was a second suspect (same p(guilt)) and you knew only one of them was guilty, then you'd execute them both.
There are all sorts of disclaimers that could be added, but the point is that the threshold isn't arbitrary, and intuitions don't get close to the right answer.
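A sketch of how the threshold falls out of such a calculation rather than being set by intuition (all utilities invented): convict exactly when the expected value of convicting exceeds that of acquitting:

```python
# Hypothetical utilities, in arbitrary units.
U_CONVICT_GUILTY = +10.0     # e.g. expected lives saved by incapacitation
U_CONVICT_INNOCENT = -100.0  # harm of punishing the innocent
U_ACQUIT_GUILTY = -10.0      # expected future harm from release
U_ACQUIT_INNOCENT = 0.0

def should_convict(p):
    ev_convict = p * U_CONVICT_GUILTY + (1 - p) * U_CONVICT_INNOCENT
    ev_acquit = p * U_ACQUIT_GUILTY + (1 - p) * U_ACQUIT_INNOCENT
    return ev_convict > ev_acquit

# Solving ev_convict = ev_acquit gives the implied conviction threshold:
p_star = (U_ACQUIT_INNOCENT - U_CONVICT_INNOCENT) / (
    (U_CONVICT_GUILTY - U_ACQUIT_GUILTY)
    + (U_ACQUIT_INNOCENT - U_CONVICT_INNOCENT))
print(p_star, should_convict(0.80), should_convict(0.90))  # ~0.833 False True
```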
Replies from: Jack
comment by NancyLebovitz · 2010-05-07T00:14:24.805Z · LW(p) · GW(p)
Is there a Bayesian take on plea bargaining?
Replies from: JoshuaZ↑ comment by JoshuaZ · 2010-05-07T00:24:14.567Z · LW(p) · GW(p)
Note that plea bargaining isn't even doable in many systems. Plea bargains are only really common in the US and a few other places. Many other locations require a full trial regardless. This is generally balanced with weaker rules of evidence and fewer restrictions on what prosecutors and police can get away with.
To try to answer the original question: I'm not sure plea bargaining ever makes sense in a purely Bayesian system, because a plea bargain essentially short-circuits an attempt to find out the truth.
comment by jimmy · 2010-05-05T19:17:27.179Z · LW(p) · GW(p)
I think the coolest idea for justice systems that I've heard of (dunno its origin) is to have decisions made through prediction markets.
There are a couple of ways I can see to keep it grounded in reality. The 'jury' (which would include people who think they can make money doing it) would not be given all of the evidence all the time, and their predictions would be checked (and their bets paid off) on the cases where strong evidence had been withheld.
There could also be two markets, and each would have to predict the other jury's verdict (possibly given more evidence).
Hell, even play money prediction markets do pretty well and you don't have to worry about reliably confirming actual guilt for payoffs. Just getting people to think with the "my money!" part of their brains helps.
Of course, there are all sorts of details to figure out, but it shouldn't be hard to beat the consensus of 12 people with no stake in the matter that were selected for being too dumb to get out of jury duty.
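One way to ground such a market, sketched below under invented payout constants, is a proper scoring rule: on the subset of cases where withheld evidence later settles the answer, pay each predictor by a log score, which honest probability reports maximize in expectation:

```python
import math

def log_score_payout(reported_p, outcome_guilty, stake=100.0):
    """Logarithmic scoring rule: strictly proper, so honest probability
    reports maximize a predictor's expected payout."""
    p = reported_p if outcome_guilty else 1 - reported_p
    return stake * (1 + math.log2(p))

print(log_score_payout(0.9, True))   # ~ +84.8: confident and right
print(log_score_payout(0.9, False))  # ~ -232.2: confident and wrong
```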
comment by dudleysharp · 2010-05-05T11:55:41.050Z · LW(p) · GW(p)
The evidence is important, and neither Grann nor Jackson is very helpful.
A more thorough review is below. I have also read the trial transcript and the police witness statement interviews.
"Cameron Todd Willingham: Another Media Meltdown", A Collection of Articles http://homicidesurvivors.com/categories/Cameron%20Todd%20Willingham.aspx
Replies from: RobinZ
comment by [deleted] · 2010-05-05T02:29:31.059Z · LW(p) · GW(p)
The "challenge question" is the most interesting part of the post. The article you linked to in that paragraph has a lot of implications for justice systems, namely that jurors need a better understanding of probability, especially as DNA fingerprinting and similar forms of forensic evidence become more and more precise. Though the article does point out how probability can be misinterpreted or skewed by both the defense and prosecution, jurors would be better prepared to assess such arguments if they had a better intuitive understanding of probability theory.