High-precision claims may be refuted without being replaced with other high-precision claims

post by jessicata (jessica.liu.taylor) · 2020-01-30T23:08:33.792Z · LW · GW · 30 comments

This is a link post for https://unstableontology.com/2020/01/30/high-precision-claims-may-be-refuted-without-being-replaced-with-other-high-precision-claims/

There's a common criticism of theory-criticism which goes along the lines of:

Well, sure, this theory isn't exactly right. But it's the best theory we have right now. Do you have a better theory? If not, you can't really claim to have refuted the theory, can you?

This is wrong. This is falsification-resisting theory-apologism. Karl Popper would be livid.

The relevant reason why it's wrong is that theories make high-precision claims. For example, the standard theory of arithmetic says 561+413=974. Not 975 or 973 or 97.4000001, but exactly 974. If arithmetic didn't have this guarantee, math would look very different from how it currently looks (it would be necessary to account for possible small jumps in arithmetic operations).

A single bit flip in the state of a computer process can crash the whole program. Similarly, high-precision theories rely on precise invariants, and even small violations of these invariants sink the theory's claims.

To a first approximation, a computer either (a) almost always works (>99.99% probability of getting the right answer) or (b) doesn't work (<0.01% probability of getting the right answer). There are edge cases such as randomly crashing computers or computers with small floating point errors. However, even a computer that crashes every few minutes functions very precisely correctly in >99% of seconds that it runs.

If a computer makes random small errors 0.01% of the time in e.g. arithmetic operations, it's not an almost-working computer, it's a completely non-functioning computer, that will crash almost immediately.
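
To make this concrete, here is a minimal back-of-the-envelope sketch in Python, assuming the errors are independent and occur at exactly the stated 0.01% rate:

    # Probability that a machine making independent random errors in 0.01% of
    # operations completes a run of N operations with no error at all.
    error_rate = 1e-4  # 0.01%, as in the example above

    for n_ops in (10_000, 1_000_000, 100_000_000):
        p_clean = (1 - error_rate) ** n_ops
        print(f"{n_ops:>11,} operations: P(error-free) = {p_clean:.3g}")

    # ~10,000 ops:      P ≈ 0.37
    # ~1,000,000 ops:   P ≈ 3.7e-44
    # ~100,000,000 ops: underflows to 0.0

At billions of operations per second, an error-free second of execution is astronomically unlikely, so failure is effectively immediate.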

The claim that a given algorithm or circuit really adds two numbers is very precise. Even a single pair of numbers that it adds incorrectly refutes the claim, and very much risks making this algorithm/circuit useless. (The rest of the program would not be able to rely on the guarantee, and would instead need to know the domain in which the algorithm/circuit functions; this would significantly complicate reasoning about correctness.)

Importantly, such a refutation does not need to come along with an alternative theory of what the algorithm/circuit does. To refute the claim that it adds numbers, it's sufficient to show a single counterexample without suggesting an alternative. Quality assurance processes are primarily about identifying errors, not about specifying the behavior of non-functioning products.
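
As a toy illustration (the buggy adder here is hypothetical), a refutation of "this function adds" needs only one exhibited failure, with no account of what the function does instead:

    def add(a, b):
        # Hypothetical buggy adder: suppose a circuit mishandles one carry case.
        if (a, b) == (7, 9):
            return 17
        return a + b

    # Exhaustively check a small domain. A single failing pair refutes the
    # high-precision claim "add(a, b) == a + b for all a, b".
    for a in range(100):
        for b in range(100):
            if add(a, b) != a + b:
                print(f"counterexample: add({a}, {b}) = {add(a, b)}, expected {a + b}")

The refutation stands on the counterexample alone; no alternative theory of the circuit is produced in the process.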

A Bayesian may argue that the refuter must have an alternative belief about the circuit. While this is true assuming the refuter is Bayesian, such a belief need not be high-precision. It may be a high-entropy distribution. And if the refuter is a human, they are not a Bayesian (that would take too much compute), and will instead have a vague representation of the circuit as "something doing some unspecified thing", with some vague intuitions about what sorts of things are more likely than other things. In any case, the Bayesian criticism certainly doesn't require the refuter to replace the claim about the circuit with an alternative high-precision claim; either a low-precision belief or a lack-of-belief will do.
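
To put a number on "high-entropy", here is a toy calculation; the 8-bit circuit is an assumed example, not anything from the post:

    import math

    # Suppose the refuted claim was "this circuit maps two input bytes to their
    # sum mod 256". A maximally noncommittal alternative belief is uniform over
    # all functions from byte pairs to bytes.
    n_inputs = 256 * 256                      # all (a, b) byte pairs
    entropy_bits = n_inputs * math.log2(256)  # entropy of the uniform belief
    print(f"{entropy_bits:,.0f} bits")        # 524,288 bits of uncertainty

A belief with half a million bits of entropy is a perfectly coherent Bayesian state, and it is nothing like a replacement high-precision claim.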

The case of computer algorithms is particularly clear, but of course this applies elsewhere. For example:

If there's a single exception to conservation of energy, then a high percentage of modern physics theories completely break. The single exception may be sufficient to, for example, create perpetual motion machines. Physics, then, makes a very high-precision claim that energy is conserved, and a refuter of this claim need not supply an alternative physics.

A theory that has been refuted remains contextually "useful" in a sense, but it's the walking dead: it isn't really true everywhere.

The fact that false high-precision claims are generally more damaging than false low-precision claims is important ethically. High-precision claims are often used to ethically justify coercion, violence, and so on, where low-precision claims would have been insufficient. For example, imprisoning someone for a long time may be ethically justified if they definitely committed a serious crime, but is much less likely to be if the belief that they committed a crime is merely a low-precision guess, not validated by any high-precision checking machine. Likewise for psychiatry, which justifies incredibly high levels of coercion on the basis of precise-looking claims about different kinds of cognitive impairment and their remedies.

Therefore, I believe there is an ethical imperative to apply skepticism to high-precision claims, and to allow them to be falsified by evidence, even without knowing what the real truth is other than that it isn't as the high-precision claim says it is.

30 comments

Comments sorted by top scores.

comment by Scott Alexander (Yvain) · 2020-01-31T07:28:29.853Z · LW(p) · GW(p)
Likewise for psychiatry, which justifies incredibly high levels of coercion on the basis of precise-looking claims about different kinds of cognitive impairment and their remedies.


You're presenting a specific rule about manipulating logically necessary truths, then treating it as a vague heuristic and trying to apply it to medicine! Aaaaaah!

Suppose a physicist (not even a doctor! a physicist!) tries to calculate some parameter. Theory says it should be 6, but the experiment returns a value of 6.002. Probably the apparatus is a little off, or there's some other effect interfering (e.g. air resistance), or you're bad at experiment design. You don't throw out all of physics!

Or moving on to biology: suppose you hypothesize that insulin levels go up in response to glucose and go down after the glucose is successfully absorbed, and so insulin must be a glucose-regulating hormone. But you find one guy who just has really high levels of insulin no matter how much glucose he has. Well, that guy has an insulinoma. But if you lived before insulinomas were discovered, then you wouldn't know that. You still probably shouldn't throw out all of endocrinology based on one guy. Instead you should say "The theory seems basically sound, but this guy probably has something weird we'll figure out later".

I'm not claiming these disprove your point - that if you're making a perfectly-specified universally-quantified claim and receive a 100%-confidence 100%-definitely-relevant experimental result disproving it, it's disproven. But nobody outside pure math is in the perfectly-specified universally-quantified claim business, and nobody outside pure math receives 100%-confidence 100%-definitely-relevant tests of their claims. This is probably what you mean by the term "high-precision" - the theory of gravity isn't precise enough to say that no instrument can ever read 6.002 when it should read 6, and the theory of insulin isn't precise enough to say nobody can have weird diseases that cause exceptions. But both of these are part of a general principle that nothing in the physical world is precise enough that you should think this way.

See e.g. Kuhn, who makes the exact opposite point from this post - that no experimental result can ever prove any theory wrong with certainty. That's why we need this whole Bayesian thing.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2020-01-31T07:35:30.067Z · LW(p) · GW(p)

Yes, of course things that aren't definitive falsifications aren't definitive falsifications, but there have been fairly definitive falsifications in physics, e.g. the falsification of aether theory. (Asking for a falsification to be literally 100% certain to be a falsification is, of course, too high of a standard)

Yes, it's also possible to change the description of the theory so it is only said to apply to 99% of cases in response to counterexamples, but this is a different theory than one that says it applies to 99.9% of cases or 100% of cases. This is a matter of calibration.
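
To illustrate the calibration point with assumed numbers: if 3 counterexamples turn up in 1000 checked cases, the 100% version is refuted outright, and the same data also discriminate between the 99.9% and 99% versions:

    from math import comb

    n, k = 1000, 3  # hypothetical: 3 counterexamples found in 1000 cases

    for accuracy in (1.0, 0.999, 0.99):
        p_err = 1 - accuracy
        likelihood = comb(n, k) * p_err**k * (1 - p_err)**(n - k)
        print(f"claimed accuracy {accuracy:.3f}: P(data) = {likelihood:.3g}")

    # accuracy 1.000: P(data) = 0      (refuted outright)
    # accuracy 0.999: P(data) ≈ 0.061
    # accuracy 0.990: P(data) ≈ 0.0074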

comment by paulfchristiano · 2020-01-31T02:24:34.911Z · LW(p) · GW(p)

It seems like there is a real phenomenon in computers and proofs (and some other brittle systems), where they are predicated on long sequences of precise relationships and so quickly break down as the relationships become slightly less true. But this situation seems rare in most domains.

If there's a single exception to conservation of energy, then a high percentage of modern physics theories completely break. The single exception may be sufficient to, for example, create perpetual motion machines. Physics, then, makes a very high-precision claim that energy is conserved, and a refuter of this claim need not supply an alternative physics.

I don't know what "break" means, these theories still give good predictions in everyday cases and it would be a silly reason to throw them out unless weird cases became common enough. You'd end up with something like "well we think these theories work in the places we are using them, and will keep doing so until we get a theory that works better in practice" rather than "this is a candidate for the laws governing nature." But that's just what most people have already done with nearly everything they call a "theory."

Physics is a weird example because it's one of the only domains where we could hope to have a theory in the precise sense you are talking about. But even e.g. the standard model isn't such a theory! Maybe in practice "theories" are restricted to mathematics and computer science? (Not coincidentally, these are domains where the word "theory" isn't traditionally used.)

In particular, theories are also responsible for a negligible fraction of high-precision knowledge. My claim that there's an apple because I'm looking at an apple is fairly high-precision. Most people get there without having anything like an exceptionless "theory" explaining the relationship between the appearance of an apple and the actual presence of an apple. You could try and build up some exceptionless theories that can yield these kinds of judgments, but it will take you quite some time.

I'm personally happy never using the word "theory," not knowing what it means. But my broader concern is that there are a bunch of ways that people (including you) arrive at truth, that in the context of those mechanisms it's very frequently correct to say things like "well it's the best we have" of an explicit model that makes predictions, and that there are relatively few cases of "well it's the best we have" where the kind of reasoning in this post would move you from "incorrectly accept" to "correctly reject." (I don't know if you have an example in mind.)

(ETA: maybe by "theory" you mean something just like "energy is conserved"? But in these cases the alternative is obvious, namely "energy is often conserved," and it doesn't seem like that's a move anyone would question after having exhibited a counterexample. E.g. most people don't question "people often choose the option they prefer" as an improvement over "people always choose the option they prefer." Likewise, I think most people would accept "there isn't an apple on the table" as a reasonable alternative to "there is an apple on the table," though they might reasonably ask for a different explanation for their observations.)

Replies from: Pattern, jessica.liu.taylor
comment by Pattern · 2020-01-31T05:51:37.520Z · LW(p) · GW(p)

TLDR:

*Another way of visualizing if then statements is flowcharts.

Theorists may be interested in being able to make all the flowcharts

Non-theorists may be interested in the flowcharts they use continuing to work, but not too fussed about everything else.


Long:

It seems like there is a real phenomenon in computers and proofs (and some other brittle systems), where they are predicated on long sequences of precise relationships and so quickly break down as the relationships become slightly less true. But this situation seems rare in most domains.

Types of reasoning:

If x = 2, x^2 = 4. (Single/multi-case. Claims/if then statements.*)

For all x >= 0, x^2 is monotonic. (For every case. Universal quantification.)

Either can be wrong: consider the incorrect statements "If x = 2, x^2 = 3" and "For all real x, x^2 is monotonic."
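
A quick sketch of the asymmetry: one case can refute the universally quantified claim while leaving the single-case claim untouched:

    # The single-case claim checks out:
    assert 2**2 == 4

    # The universal claim "for all real x, x^2 is monotonic" fails on one pair:
    x1, x2 = -2, -1
    assert x1 < x2 and x1**2 > x2**2  # 4 > 1, so squaring is not monotonic on all reals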


A claim about a single case probably isn't a theory. Theories tend to be big.

We can think of there being "absolute theories" (correct in every case, or wrong), or "mostly correct theories" (correct more than some threshold, 50%, 75%, 90%, etc.).

Or we can think of "absolute", "mostly correct", "good enough", etc. as properties. One error has been found in a previously spotless theory? It's been moved from "absolute" to "mostly correct". Theorists may want 'the perfect theory' and work on coming up with a new, better theory and 'reject' the old one. Other people may say 'it's good enough.' (Though sometimes a flaw reveals a deep underlying issue that may be a big deal - consider the replication crisis.)


This post is about "absolute theories" or "theories which are absolute". As paulfchristiano points out, a theory may be good enough even if it isn't absolute. The post seems to be about: a) Critics need not replace the theory, and b) once we see that a theory is flawed, more caution may be required when using it, and c) pretending the theory is infallible and ignoring its flaws as they pile up will lead to problems. 'If you see a crack in an aquarium, fix it, don't let it grow into a hole that lets all the water out.'


The post might be about a specific context, where the author thinks people are 'ignoring cracks in the aquarium and letting water out.' Some posts do a bad job of being a continuation of one or more IRL conversations, but I found this to be a fairly good one.


Perhaps different types of theories should be handled differently, particularly based on how important the consequences are, and the difference between theories which are "absolute" and theories which are not may matter a lot. If cars could run on any fuel except water, which would make them explode, but otherwise would be fine with anything, people wouldn't just put anything in their fuel tank - they'd be careful to only put in things that they knew didn't have any water in them.

Or the difference mostly matters to theorists, and non-theorists are more interested in specifics (things pertaining to the specific theories they care about/use), rather than the abstract (theories in general), and this post won't be very useful to them.*

comment by jessicata (jessica.liu.taylor) · 2020-01-31T02:33:14.011Z · LW(p) · GW(p)

I don’t know what “break” means, these theories still give good predictions in everyday cases and it would be a silly reason to throw them out unless weird cases became common enough.

If perpetual motion machines are possible that changes quite a lot. It would mean searching for perpetual motion machines might be a good idea, and the typical ways people try to rule them out ultimately fail. Once perpetual motion machines are invented, they can become common.

But even e.g. the standard model isn’t such a theory!

Not totally exceptionless due to anomalies but it makes lots of claims at very high levels of precision (e.g. results of chemical experiments) and is precise at that level, not at a higher level than that. Similarly with the apple case. (Also, my guess is that there are precise possibly-true claims such as "anomalies to the standard model never cohere into particles that last more than 1 second")

I don't want to create a binary between "totally 100% exceptionless theory" and "not high precision at all", there are intermediate levels even in computing. The point is that the theory needs to have precision corresponding to the brittleness of the inference chains it uses, or else the inference chain probably breaks somewhere.

comment by abramdemski · 2020-01-30T23:40:55.808Z · LW(p) · GW(p)

I like this post because I'm fond of using the "what's the better alternative?" argument in instrumental matters, so it's good to have an explicit flag of where it fails in epistemic matters. Technically the argument still holds, but the "better alternative" can be a high entropy theory, which often doesn't rise to saliency as a theory at all.

It's also a questionable heuristic in instrumental matters, as it is often possible to meaningfully critique a policy without yet having a better alternative. But one must be careful to distinguish between these "speculative" critiques (which can note important downsides but don't strongly imply a policy should be changed, due to a lack of alternatives) vs true evaluations (which claim that changes need to be made, and therefore should be required to evaluate alternatives).

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2020-01-31T01:16:59.467Z · LW(p) · GW(p)

I believe The Real Rules Have No Exceptions [LW · GW] is an instrumental analogue.

comment by Matt Goldenberg (mr-hire) · 2020-01-31T18:24:23.061Z · LW(p) · GW(p)

Is this in response to something? What's the context for this post?

It seems to me both obvious that criticism is allowed without having plausible alternatives, and that in the context of policy debate criticism without alternatives should be treated skeptically.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2020-01-31T20:48:35.083Z · LW(p) · GW(p)

Not a response to anything in particular, but I've had a lot of discussions over my life where someone makes strong claims and doesn't say they were erroneous when counterexamples are provided, such as this recent one [LW(p) · GW(p)].

The policy equivalent is The Real Rules Have No Exceptions [LW · GW]: if "exceptions" are being made, the rule should be modified, discarded, or relabeled (e.g. as a guideline/heuristic). The criticism "you aren't really following your rules" is valid even if you don't have an alternative ruleset.

comment by Pattern · 2020-01-31T02:16:51.000Z · LW(p) · GW(p)

Theorist: I have a circuit.

Critic: Your circuit is broken.

Critic's critic: But what will we do without calculators?

comment by Wei Dai (Wei_Dai) · 2020-01-31T01:42:12.869Z · LW(p) · GW(p)

Likewise for psychiatry, which justifies incredibly high levels of coercion on the basis of precise-looking claims about different kinds of cognitive impairment and their remedies.

I've seen people make the opposite complaint, that we don't commit enough people to mental hospitals nowadays and as a result the mentally ill make up a large fraction of the homeless. (45% of homeless are mentally ill, according to this source.)

Also psychiatry just doesn't seem to fit in with the rest of your examples. At least from a layman's perspective it seems like there is a lot of recognition that pretty much everything is a spectrum and mental illnesses are just extreme ends of the spectra, that many diagnoses are hard to distinguish from each other, that lots of drugs don't work for lots of individuals, etc.

Replies from: Yvain, jessica.liu.taylor, Pattern
comment by Scott Alexander (Yvain) · 2020-02-01T02:07:47.091Z · LW(p) · GW(p)

An alternate response to this point is that if someone comes off their medication, then says they're going to kill their mother because she is poisoning their food, and the food poisoning claim seems definitely not true, then spending a few days assessing what is going on and treating them until it looks like they are not going to kill their mother anymore seems justifiable for reasons other than "we know exactly what biological circuit is involved with 100% confidence".

(source: this basically describes one of the two people I ever committed involuntarily)

I agree that there are a lot of difficult legal issues to be sorted out about who has the burden of proof and how many hoops people should have to jump through to make this happen, but none of them look at all like "you do not know the exact biological circuit involved with 100% confidence using a theory that has had literally zero exceptions ever".

comment by jessicata (jessica.liu.taylor) · 2020-01-31T02:12:36.988Z · LW(p) · GW(p)

There are two very, very different arguments one can make for psychiatric coercion:

  1. There are identifiable, specifiable, diagnosable, and treatable types of cognitive impairment. People who have them would, therefore, benefit from others overriding their (impaired) agency to treat their condition.

  2. There aren't identifiable/specifiable/etc cognitive impairments, or at least psychiatry can't find them reliably (see: homosexuality in the DSM, the history of lobotomies, Foucault, Szasz). However, psychiatry diagnoses a significant fraction of homeless people with "disorders", and psychiatrically imprisoning homeless people without trial is good, so psychiatric coercion is good.

The second argument is morally repugnant to many who hold the first view and is also morally repugnant in my own view. If the DSM isn't actually a much more scientifically valid personality typing procedure than, say, the Enneagram, then locking people up based on it without trial is an ethically horrible form of social control, as locking people up based on the Enneagram would be.

Very few people who accept the first argument would accept the second, indicating that the moral legitimacy of psychiatry is tied to its precise-looking claims about types of cognitive impairment.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2020-01-31T03:06:20.603Z · LW(p) · GW(p)

I don't feel qualified to debate what seems to be a very complex issue, but people who advocate for more coercion (mainly AOT) these days seem to do so on the basis of good patient outcomes rather than "high-precision claims":

Assisted Outpatient Treatment (AOT), formerly known as involuntary outpatient commitment (IOC), allows courts to order certain individuals with brain disorders to comply with treatment while living in the community. [...] Research shows Assisted Outpatient Treatment:

  • Helps the mentally ill by reducing homelessness (74%); suicide attempts (55%); and substance abuse (48%).
  • Keeps the public safer by reducing physical harm to others (47%) and property destruction (46%).
  • Saves money by reducing hospitalization (77%); arrests (83%); and incarceration (87%).

If you see an ethical issue with this argument, can you explain what it is?

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2020-01-31T03:18:25.961Z · LW(p) · GW(p)

If the website weren't making the claim that diagnosable/treatable/etc cognitive impairment exists, they wouldn't be saying people are "mentally ill", talking about "Anosognosia", etc.

Without that language the copy on the page doesn't really seem justified. There are lots of groups of people (e.g. religious groups) that claim to be able to help people's life outcomes, but that doesn't ethically justify coercing people to join them.

I hope you can see the problems for the legal order of civil society that would arise if people's rights could be overridden on the basis of empirical claims that people like them (not even them specifically, but people placed in the same reference class) "benefit" according to metrics such as having a house and being out of jail, which are themselves determined largely by the society.

This also isn't touching the social and informational coercion associated with labeling people as "crazy" and "should be locked up for their own good" (which has psychological effects on them!) based on highly questionable diagnostic criteria. Labeling people who aren't crazy as crazy is gaslighting.

In general on this topic please see Unrecognized Facts by the Council for Evidence-Based Psychiatry.

(I haven't even gotten into issues of study methodology, which may be quite serious; the "Research" link on the page you linked simply links to the same page, which is quite suspicious.)

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2020-01-31T06:00:26.034Z · LW(p) · GW(p)

If the website weren’t making the claim that diagnosable/treatable/etc cognitive impairment exists, they wouldn’t be saying people are “mentally ill”, talking about “Anosognosia”, etc.

Do you think psychiatry is totally useless or harmful even for voluntary patients? If not, how would you prefer that psychiatrists talk? If yes (as seemingly suggested by the reference you linked), that seems to be the real crux between you and people like the ones I linked to, so why not argue about that to begin with?

There are lots of groups of people (e.g. religious groups) that claim to be able to help people’s life outcomes, but that doesn’t ethically justify coercing people to join them.

It depends on how much they actually help. I value autonomy and people having justified beliefs but also people not suffering, so if someone was suffering really badly and there is strong evidence that the only way to help them involves coercive religious indoctrination, that may well be a tradeoff I end up thinking should be made.

I hope you can see the problems for the legal order of civil society that would arise if people’s rights could be overridden on the basis of empirical claims that people like them (not even them specifically, but people placed in the same reference class) “benefit” according to metrics such as having a house and being out of jail, which are themselves determined largely by the society.

This also isn’t touching the social and informational coercion associated with labeling people as “crazy” and “should be locked up for their own good” (which has psychological effects on them!) based on highly questionable diagnostic criteria. Labeling people who aren’t crazy as crazy is gaslighting.

Yes, I grant both of these as real problems.

(I haven’t even gotten into issues of study methodology, which may be quite serious; the “Research” link on the page you linked simply links to the same page, which is quite suspicious.)

Looks like a bad link. The actual research page is this one: https://mentalillnesspolicy.org/wp-content/uploads/aotworks.pdf

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2020-01-31T06:14:24.980Z · LW(p) · GW(p)

Do you think psychiatry is totally useless or harmful even for voluntary patients?

Sometimes yes, sometimes no?

If yes (as seemingly suggested by the reference you linked), that seems to be the real crux between you and people like the ones I linked to, so why not argue about that to begin with?

We're in a subthread about whether the claims of psychiatry are true. You suggested maybe coercion is good even if the claims are not scientifically valid. I said no, that's morally repugnant. You linked people making that argument. I see that the argument relies on the premise that the claims of psychiatry aren't bullshit, so it doesn't show coercion is good even if the claims are not scientifically valid. So the form of interpretive labor you're asking for is not reasonable in context.

I don't have much to say about the research link except (a) this doesn't look like an unbiased metaanalysis and (b) authoritarian control systems will often produce outcomes for subjects that look better on-paper through more domination but this is a pretty bad ethical argument in the context of justifying the system. Like, maybe slaves who run away experience worse health outcomes than slaves who remain slaves, because they have to hide from authorities, could get killed if they're caught later, are more likely to starve, etc. (And they're less likely to be employed!)

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2020-02-01T05:50:11.210Z · LW(p) · GW(p)

(Disengaging because it seems like we're talking past each other, I don't see a low-effort way to get things back on track, and this doesn't seem like an important enough topic (at least from my perspective) to put a lot more effort into.)

comment by Pattern · 2020-01-31T02:15:15.281Z · LW(p) · GW(p)

Does mental illness cause homelessness, or vice versa?

Replies from: jimrandomh
comment by jimrandomh · 2020-01-31T07:05:30.231Z · LW(p) · GW(p)

Both. Uncontroversially, I think, though there is some room to quibble about the exact ratio of causality direction.

comment by Dagon · 2020-01-31T00:40:09.821Z · LW(p) · GW(p)

Note that degree and type of refutation matters a whole lot. Many theories CAN still be applicable to a more restricted set of predictions, or can be valuable in making less precise predictions. "It's the best we have" isn't sufficient, but "it's the best we have AND it's good enough for X" could be.

There are TONS of models that get proven wrong, but still allowed excellent progress, and still have validity in a subset of cases (and are usually easier to use than the more complete/precise models).

I suspect the phrase "high-precision" is doing a lot of work in your post that I haven't fully understood. Almost all of your examples don't require universality or exception-free application (what I take your "high-precision" requirement to mean), only preponderance of utility in many commonly-encountered cases.

For some of them, a very minor caveat ("this may be wrong; here are signs that you might be misapplying it") would redeem the theory while changing hardly any behavior.

comment by George3d6 · 2020-01-31T17:15:17.932Z · LW(p) · GW(p)

I wholeheartedly agree with this article to the point of being jealous of not having written it myself.

comment by cousin_it · 2020-01-31T08:18:17.934Z · LW(p) · GW(p)

If a computer makes random small errors 0.01% of the time in e.g. arithmetic operations, it’s not an almost-working computer, it’s a completely non-functioning computer, that will crash almost immediately.

Floating point arithmetic in computers is usually not precise, and has many failure modes that are hard to understand even for experts. Here's a simple one: when calculating the sum of many numbers, adding them from smallest to biggest or from biggest to smallest will often give different results, and the former one will be more correct. Here's a more complex one: a twenty page paper about computing the average of two numbers. But there are programs that do trillions of floating point operations and don't crash.
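
A minimal demonstration of the ordering effect, assuming IEEE-754 doubles (as in CPython):

    import math

    nums = [1e16] + [1.0] * 100   # one big number and a hundred small ones

    big_first = 0.0
    for x in nums:                # biggest first
        big_first += x            # each +1.0 is lost to rounding at magnitude 1e16

    small_first = 0.0
    for x in reversed(nums):      # smallest first
        small_first += x          # the 1.0s accumulate to 100.0 before meeting 1e16

    print(big_first)              # 1e+16
    print(small_first)            # 1.00000000000001e+16
    print(math.fsum(nums))        # 1.00000000000001e+16 (correctly rounded sum)

The two loops sum the same list yet disagree by 100.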

Replies from: George3d6, jessica.liu.taylor
comment by George3d6 · 2020-01-31T17:10:32.212Z · LW(p) · GW(p)
Floating point arithmetic in computers is usually not precise, and has many failure modes that are hard to understand even for experts.

Floating point arithmetic might not be precise, but it's imprecise in KNOWN ways.

As in, for a given operation done with a certain set of instructions, you can know that cases X/Y/Z have undefined behavior. (e.g. using instruction A to multiply c and d will only give a precise result up to the nth decimal place)

By your same definition basically every single popular programming language is not precise since they can manifest UB, but that doesn't stop your kernel from working since it's written in such a way to (mainly) avoid any sort of UB.

Pragmatically speaking, I can take any FP computation library and get deterministic results even if I run a program millions of times on different machines.

Heck, even with something like machine learning where your two main tools are fp operations and randomness you can do something like (example for torch):

    torch.manual_seed(2)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

and you will get the same results by running the same code millions and millions of times over on many different machines.

Even if the library gets itself into a UB situation (e.g. numbers get too large and go to nan), it will precisely reach that UB at the exact same point each time.

So I think the better way to think of FPA is "defined only in a bounded domain", but the implementations don't bother to enforce those definitions programmatically, since that would take too long. Saying nan is cheaper than checking if a number is nan each time, kinda thing.

Replies from: cousin_it
comment by cousin_it · 2020-02-01T08:29:58.664Z · LW(p) · GW(p)
comment by jessicata (jessica.liu.taylor) · 2020-01-31T08:22:37.971Z · LW(p) · GW(p)

This does apply to floating point but I was thinking of integer operations here.

Replies from: cousin_it
comment by cousin_it · 2020-01-31T08:29:48.666Z · LW(p) · GW(p)

Well, your broader claim was that computer algorithms shouldn't kinda sorta work, they need to work 100%. And floating point arithmetic belies that claim. For that matter, so does integer arithmetic - practically no programs come with a rigorous analysis of when integer overflow or division by zero can or can't happen. For example, binary search in Java was buggy for many years, because the (high+low)/2 operation on integers is funny.
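
A sketch of that bug, simulating Java's 32-bit int wraparound in Python (the index values are made up for illustration):

    def to_int32(x):
        # Wrap into the signed 32-bit range, as Java's int arithmetic does.
        return (x + 2**31) % 2**32 - 2**31

    low, high = 2**30, 2**31 - 2           # plausible bounds in a huge array
    mid_buggy = to_int32(low + high) // 2  # (low+high)/2: the sum wraps negative
    mid_fixed = low + (high - low) // 2    # the standard fix: cannot overflow
    print(mid_buggy)                       # -536870913, a nonsense index
    print(mid_fixed)                       # 1610612735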

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2020-01-31T08:34:31.349Z · LW(p) · GW(p)

The claim was that if the arithmetic circuit that is supposed to add numbers fails 0.01% of the time, the computer crashes, which is true.

Replies from: cousin_it
comment by cousin_it · 2020-01-31T08:41:41.042Z · LW(p) · GW(p)

You did also say that

The claim that a given algorithm or circuit really adds two numbers is very precise. Even a single pair of numbers that it adds incorrectly refutes the claim, and very much risks making this algorithm/circuit useless.

For almost every arithmetic operation in actual computers, on every type of numbers, there are many inputs for which that operation returns the wrong result. (Yeah, arbitrary size integers are an exception, but most programs don't use those, and even they can fail if you try making a number that doesn't fit in memory.) But still, lots of algorithms are useful.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2020-01-31T08:52:10.426Z · LW(p) · GW(p)

Are you referring to overflow? If so, that's the right result; the function to compute is "adding integers mod N", not "adding integers" (I agree I said "adding integers", but anyway addition mod N is a different very, very precise claim). Otherwise that's a hardware bug, and quality assurance is supposed to get rid of those.
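
A quick sketch of the distinction; under the mod-N specification, "overflowing" inputs are not counterexamples at all:

    # The precise claim a 32-bit adder actually satisfies is addition mod 2**32,
    # not addition over the unbounded integers.
    N = 2**32
    a, b = 2**31, 2**31
    print((a + b) % N)  # 0: correct under the mod-N specification
    print(a + b)        # 4294967296: what unbounded addition would give

Both specifications are high-precision; they differ only in which function the circuit is claimed to compute.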

Replies from: cousin_it
comment by cousin_it · 2020-01-31T09:10:35.472Z · LW(p) · GW(p)

I still don't think the programming example supports your point.

For example, in C and C++, integer overflow is undefined behavior. The compiler is allowed to break your program if it happens. Undefined behavior is useful for optimizations - for example, you can optimize x<x+1 to true, which helps eliminate branches - and there have been popular programs that quietly broke when a new compiler release got better at such optimizations. John Regehr's blog is a great source on this.

Almost nothing in programming is 100% reliable, most things just kinda seem to work. Maybe it would be better to use an example from math.