Bayes is Out-Dated, and You’re Doing it Wrong
post by AnthonyRepetto · 2023-02-25T23:18:53.558Z · LW · GW · 44 comments
<sharing it here, too, though I can already imagine the reaction...>
~ a community of Bayes-enthusiasts fumble statistical inference ~
TL;DR — Industry uses Dirichlet Process and SAS, NOT Bayes. Bayes is persistently *wrong* and lacks a great deal of important information. Supposed ‘rationalists’ cling to Bayes as the Ultimate Truth, without knowing enough Mathematics to know they’re wrong.
“Oh, well my Prior was <preferred assumption> but I guess I have to update with that one data-point that wandered into my life.” — multiple ‘Rationalists’ in my year of invading their gatherings
A weird thing is happening in the Bay Area, slowly creeping into the Zeitgeist: a group of non-mathematicians have decided they found the BEST statistical technique ever, and they want to use it to understand the whole world… but their technique is 260 YEARS OLD, and we’ve done a LOT better since then. It’s called Bayes’ Theorem, published in 1763 — literally 260 candles this year.
Let’s get a sense of just how out-dated and bizarre it is, to insist you have the One-True-Method when it’s 260 years old: back in 1763, when Bayes was published, there was another new-fangled invention sweeping Europe — the Dutch Plough. That’s the plough used today by the Amish. Literally, relying on Bayes to draw conclusions is like farming with an Amish plough; it’s hilariously inadequate, and completely dismissed by industry.
That quote at the top is an amalgam of multiple conversations with the Effective Altruists and Astral Codex Ten ‘Rationalists’ (they made that term up to describe themselves); it’s a persistent theme in their conversations. And, it’s not even the *correct* use of Bayes! Let’s see why:
In Bayes’ Theorem, you begin with a Prior. These Rationalists pick the Prior that they *prefer*. Neutral Bayesian Priors, however, are the average of all possible assumptions, NOT your preferred place to start. These folks’ first step is a disastrous error. Then, when they say “I guess I should update my Prior…” Wait! Why in the world would you ever feel confident about a belief, when the ONLY thing you have is a Prior? A Prior is, by definition, the state of “no information” when one should have intellectual humility, not certainty!
Then, they are updating their Bayesian estimate using… a *few* examples? The Rationalists repeatedly rely upon sparse evidence, while claiming certainty, as if “Statistically Significant Sample Size” just isn’t a thing. Bayes doesn’t *need* statistical significance, apparently! Finally, those examples they use are culled from personal experience. I hope I don’t have to explain to anyone why we need to collect a random sample from representative sub-populations? The supposedly rational Bayes-fans fail on each possible count.
So, if they correct those mistakes, can they then rely on Bayes to find their precious truths? Nope. Bayes is consistently wrong, reliably. That’s why industry doesn’t use it. They’d lose money. Dirichlet lets them make money, because it works better. That’s a stronger proof, empirically, than all the rationalizations of their community’s prominent Bayes-trumpeters: a fiction writer and a psych counselor, both of whom lack relevant experience with statistical analysis software and techniques.
In particular, the blog of that psych counselor, “Astral Codex Ten,” has a tag-line: it quotes Bayes’ Theorem, and follows by saying “all else is commentary.” Everyone who reads his blog, and who then DOESN’T check what statistical techniques are used in the real world, stays there as part of the community. They have self-selected for a community of people who call Bayes the be-all-end-all, all of them agreeing they’re right, and they don’t know that they’re horribly wrong… because they don’t check!
Think about this for a moment: if you state Bayes’ Theorem, and then claim “all else is commentary” while recommending readers use Bayes, you are implicitly claiming “NO further improvements in statistical analysis have occurred in the 260 years since Bayes was published; Student’s t Distributions, Lévy Distributions, they don’t even need to exist!” That’s the core tenet of the Bay Area Rationalists’ luminary, addicted to Bayes.
Wait, so why and how is Dirichlet such an improvement?
Let’s imagine you took a survey in some big city, and found (unsurprisingly) a majority Democrats — it was a 60/40 split, on the nose. That sample’s split is also the “maximum likelihood” for the potential Population. Said another way, “The real-world population which is most likely to give you a 60/40 sample is a 60/40 population.” But, does that make 60/40 your best guess for the real population? No.
Imagine each possible population, one at a time. There’s the 100% Democrat population, first — what is the *likelihood* of such a population producing a 60/40 sample? Zero. What about 99% Democrat? Well, then it’ll depend upon how *many* people you surveyed, but there is just a tiny chance the real population is 99% Democrat! Keep doing that, for every population, all the way to 99% Republican, then 100% Republican. Whew! Now, you have a *likelihood* distribution, the “likelihood of population X generating sample Y.”
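The scan over candidate populations described above can be sketched in a few lines. This is only an illustration under stated assumptions: a hypothetical survey of 100 respondents with 60 Democrats, a binomial sampling model, and a grid of populations in 1% steps. None of those specifics come from the post, and the function name is made up for the example.

```python
from math import comb

def binomial_likelihood(p, k, n):
    """Chance that a population with Democrat share p yields k Democrats among n respondents."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical survey (assumed numbers): 60 Democrats out of 100 respondents.
n, k = 100, 60

# Candidate populations from 0% to 100% Democrat, in 1% steps.
candidates = [i / 100 for i in range(101)]
likelihood = {p: binomial_likelihood(p, k, n) for p in candidates}

# The population most likely to produce the sample is the one matching the sample split.
best = max(likelihood, key=likelihood.get)
print("most likely population:", best)
print("likelihood of a 100% Democrat population:", likelihood[1.0])
print("likelihood of a 99% Democrat population:", likelihood[0.99])
```

Changing `n` while keeping the same 60/40 split shows how quickly the likelihood of the extreme populations falls off as the survey grows.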
When we look at this distribution, for data that falls in two buckets (D/R), then we’ll notice something: the *peak* likelihood is at 60/40, but there’s ALSO a bunch of probability-mass on the 50/50 side of the curve, creating a tilt to the over-all probability. While the ‘mode’ of the likelihood distribution is still the 60/40 estimate, the actual ‘mean’ of that distribution is closer to 50/50, every time! You *should* expect that the true population is closer to an *equal division* among buckets. When you collect more samples, you narrow that distribution of likelihoods, so you see less drift toward 50/50. That’s the reason you want a ‘statistically significant sample size’.
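The gap between the peak and the mean can be checked directly. One assumption needs stating: treating the likelihood curve as a probability distribution over populations amounts to starting from a flat prior, in which case the normalised curve for a two-bucket sample is a Beta(k + 1, n − k + 1) distribution. Its mode is k/n and its mean is (k + 1)/(n + 2), the classic "rule of succession". The sample sizes below are hypothetical.

```python
def mode_and_mean(k, n):
    """Mode and mean of the normalised two-bucket likelihood, i.e. Beta(k + 1, n - k + 1).

    Assumes a flat prior over the population share.
    """
    mode = k / n
    mean = (k + 1) / (n + 2)
    return mode, mean

# Same 60/40 split at three hypothetical sample sizes.
for n in (10, 100, 1000):
    k = 6 * n // 10
    mode, mean = mode_and_mean(k, n)
    print(f"n={n}: mode={mode:.4f}, mean={mean:.4f}, pull toward 50/50={mode - mean:.4f}")
```

The mean always sits between the observed split and 50/50, and the pull shrinks as the sample grows, which matches the description above.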
Let’s look at that other aspect Dirichlet possesses, which Bayes wholly lacks: Confidence!
When you look at the likelihood of each population, the chance of it producing your observed sample, you can also ask: “How far AWAY from our best guess would we need to place boundaries, such that we include 95% of the possible populations’ likelihoods within our bounds?” That’s called your Confidence Interval! You may have only learned the trimmed-down simplicities and z-score tables in your Stat 101 class, but there’s a reason for why they can claim confidence: that interval of population-estimates contains 95% of the likelihood-distribution’s probability-mass!
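As a rough sketch of the interval described above, the code below normalises the likelihood over a fine grid of populations and cuts off 2.5% of the mass in each tail. Two caveats: the flat prior, the grid size, and the sample numbers are assumptions made for the example, and in standard terminology an interval built this way is usually called a credible interval, while the textbook confidence interval is defined by a different (frequentist) procedure.

```python
from math import exp, log

def central_interval(k, n, mass=0.95, grid=10_000):
    """Equal-tailed interval holding `mass` of the normalised likelihood (flat prior assumed)."""
    # Midpoints of the grid cells, so p is never exactly 0 or 1.
    ps = [(i + 0.5) / grid for i in range(grid)]
    # Work in logs and subtract the peak to avoid underflow for large n.
    logs = [k * log(p) + (n - k) * log(1 - p) for p in ps]
    peak = max(logs)
    weights = [exp(v - peak) for v in logs]
    total = sum(weights)

    tail = (1 - mass) / 2
    cumulative, lower, upper = 0.0, None, None
    for p, w in zip(ps, weights):
        cumulative += w / total
        if lower is None and cumulative >= tail:
            lower = p
        if upper is None and cumulative >= 1 - tail:
            upper = p
            break
    return lower, upper

# Same 60/40 split at three hypothetical sample sizes.
for n in (10, 100, 1000):
    print(n, central_interval(6 * n // 10, n))
```

The interval narrows as the sample size grows, which is the practical content of asking for a large enough sample.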
Finally, let’s consider “the cost of being wrong”. Bayes doesn’t balance your prediction according to the cost of being wrong; Dirichlet’s distribution over potential populations can simply be *multiplied* by the cost of each error-distance, and then the mode of that distribution will “minimize the COST of being WRONG.” You can even multiply by costs which are discontinuous or ranges, producing high and low bounds and nuanced thresholds of risk. Definitely better than Bayes.
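One common way to fold in the cost of being wrong is to pick the estimate with the lowest expected cost under the distribution over populations, which is the standard decision-theoretic recipe rather than anything specific to this post. The sketch below assumes a flat prior, a hypothetical 60-of-100 sample, and a made-up cost function in which overestimating is five times as costly as underestimating. All names here are illustrative.

```python
from math import exp, log

def distribution_on_grid(k, n, grid=500):
    """Normalised two-bucket likelihood on a grid of population shares (flat prior assumed)."""
    ps = [(i + 0.5) / grid for i in range(grid)]
    logs = [k * log(p) + (n - k) * log(1 - p) for p in ps]
    peak = max(logs)
    weights = [exp(v - peak) for v in logs]
    total = sum(weights)
    return ps, [w / total for w in weights]

def best_estimate(ps, probs, cost):
    """Grid point that minimises expected cost, where cost(estimate, truth) is user-supplied."""
    def expected_cost(estimate):
        return sum(pr * cost(estimate, truth) for truth, pr in zip(ps, probs))
    return min(ps, key=expected_cost)

def symmetric_cost(estimate, truth):
    # Every unit of error costs the same in either direction.
    return abs(estimate - truth)

def asymmetric_cost(estimate, truth):
    # Hypothetical: overestimating costs five times as much as underestimating.
    error = estimate - truth
    return 5 * error if error > 0 else -error

ps, probs = distribution_on_grid(k=60, n=100)
print("symmetric cost: ", best_estimate(ps, probs, symmetric_cost))
print("asymmetric cost:", best_estimate(ps, probs, asymmetric_cost))
```

Because `cost` is an arbitrary function, discontinuous or threshold-based costs can be plugged in the same way.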
Now, Dirichlet isn’t even the be-all-end-all… it was published in 1973, 50 years old THIS year! SAS has held trade secrets since the ’70s, and invests 2.5x more into R&D than the TECH-industry average! If you want to pass muster for pharmaceuticals in front of the FDA, you send all your data to SAS. It’s required, because they’re soooo damn GOOD! So, unless you work at SAS (which has the highest profits per employee hour of all companies on Earth, and has expanded consistently since 1976… consistently rated one of the best employers on the planet…) then you DON’T know the be-all-end-all statistical technique, and neither do Scott Alexander or Eliezer Yudkowsky, as much as they’d like you to believe otherwise. Just for reference, when “you think you’re right BECAUSE you don’t know enough to know you’re wrong,” that’s called the Dunning-Kruger Effect, dear Rationalists.
44 comments
Comments sorted by top scores.
comment by RobertM (T3t) · 2023-02-26T02:59:54.384Z · LW(p) · GW(p)
This post, and many of @AnthonyRepetto's subsequent replies to comments on it, seem to be attacking a position that the named individuals don't hold, while stridently throwing out a bunch of weird accusations and deeply underspecified claims. "Bayes is persistently wrong" - about what, exactly?
Content like this should include specific, uncontroversial examples of all the claimed intellectual bankruptcy, and not include a bunch of random (and wrong) snipes.
I'm rate-limiting your ability to comment to once per day. You may consider this a warning; if the quality of your argumentation doesn't improve then you will no longer be welcome to post on the site.
Replies from: Vladimir_Nesov↑ comment by Vladimir_Nesov · 2023-02-26T03:07:43.027Z · LW(p) · GW(p)
"Content like this should include specific, uncontroversial examples of all the claimed intellectual bankruptcy"
That's not the problem here and this is a bad general rule.
Replies from: T3t↑ comment by RobertM (T3t) · 2023-02-26T03:18:22.961Z · LW(p) · GW(p)
That's definitely one of the problems with this post, and while rudeness is generally undesirable it's slightly more forgiveable when there's some evidence of the thing that "justifies" it.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-27T19:58:48.517Z · LW(p) · GW(p)
"Content like this should include specific, uncontroversial examples of all the claimed intellectual bankruptcy, and not include a bunch of random (and wrong) snipes."
I did in fact include empirical metrics of Dirichlet's superiority and how Bayes' Theorem fails in contrast: industry uses it, after they did their own tests, which is empiricism at work. I also showed how Dirichlet Process allows you to compute Confidence Intervals, while Bayes' Theorem is incapable of computing Confidence Intervals. I also explained how, due to the median of the likelihood function being closer to an equal distribution than Bayes would expect, Bayes is persistently biased toward whichever extrema might be observed in the sample. Thus, Bayes' Theorem will consistently mis-estimate; it's persistently wrong, and Dirichlet was developed as the necessary adjustment. So, I did give explicit reasons why Bayes' Theorem is inadequate compared to the modern, standard approach which has empirical backing in industry.
It seems like you want to rate-limit me for an unspecified duration? What are the empirical metrics for that rate-limit being removed? And, the fact that you claim I "didn't provide specific, uncontroversial examples," when I just showed you those specifics again here, implies that you either weren't reading everything very carefully, or you want to mischaracterize me to silence any opposition of your preferred technique: Bayes'-Theorem-by-itself.
Replies from: T3t↑ comment by RobertM (T3t) · 2023-02-27T23:08:19.865Z · LW(p) · GW(p)
The missing examples are for claims of the form:
"The Rationalists repeatedly rely upon sparse evidence, while claiming certainty"
"They have self-selected for a community of people who call Bayes the be-all-end-all, all of them agreeing they’re right, and they don’t know that they’re horribly wrong… because they don’t check!"
"...then you DON’T know the be-all-end-all statistical technique, and neither do Scott Alexander or Eliezer Yudkowsky, as much as they’d like you to believe otherwise."
I would not be surprised if some random "rationalist" you ran into somewhere was sloppy or imprecise with their usage of Bayes. I would also not be surprised if you misinterpreted some offhand comment as an unjustified claim to statistical rigor. Maybe it was some third, other thing.
As an aside, all the ways in which you claim that Bayes is wrong are... wrong? Applications of the theorem give you wrong results insofar as the inputs are wrong, which in real life is ~always, and yet the same is true of the techniques you mention (which, notably, rely on Bayes). There is always the question of what tool is best for a given job, and here we circle back to the question of where exactly this grievous misuse of Bayes is occurring.
"It seems like you want to rate-limit me for an unspecified duration? What are the empirical metrics for that rate-limit being removed? And, the fact that you claim I 'didn't provide specific, uncontroversial examples,' when I just showed you those specifics again here, implies that you either weren't reading everything very carefully, or you want to mischaracterize me to silence any opposition of your preferred technique: Bayes'-Theorem-by-itself."
Deeply uncharitable interpretations of others' motives is not something we especially tolerate on LessWrong.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-28T23:44:42.817Z · LW(p) · GW(p)
Ah, first: you DID claim that I "didn't provide specific, uncontroversial examples" and I HAD given such for why Bayes' Theorem is inadequate. Notice that you made your statement in this context:
<<"Bayes is persistently wrong" - about what, exactly?
Content like this should include specific, uncontroversial examples>>
In that context, where you precede "this" with my statement about Bayes, I naturally took "content like this" to be referring to my statement that "Bayes is persistently wrong." I hope you can see how easy it would be for me to conclude such a thing, considering "this" refers to... the prior statement?
You now move your goal-posts by insisting that my statement "Rationalists repeatedly rely upon sparse evidence, while claiming certainty" was ACTUALLY the argument I had to support with specifics... while if I were to give such specifics, I would have betrayed individual confidences, which is unethical. So, no, I'll continue to assert without specifics, for the sake of confidences, that "Rationalists repeatedly rely upon sparse evidence, while claiming certainty" because MULTIPLE rationalists over the past YEAR have done so, NOT an isolated incident or an off-hand joke, as you assume.
Your assumption that my "amalgam of rationalists I've met over the last year" was somehow a one-off or cursory remark is your OWN uncharitable interpretation; you are dismissing my repeated interactions with your community; such has been the norm. Similarly, in the EA Forum post "Doing EA Better" - a group of risk analysts had been spending a year trying to tell EA that "you're doing risk-assessment wrong; those techniques are out-dated," and EA members kept insisting their way was fine and right. Eventually, those nearly a dozen folks sat down and wrote an essay to EA... and EA pointedly ignored the fact they mentioned! "EA dismisses experts when experts tell EA they're using out-dated techniques." I'm seeing a similar pattern across the Rationalist community, NOT a one-off event or a casual remark; they were using Bayes' Theorem improperly, as the substance of arguments made in response to me.
"As an aside, all the ways in which you claim that Bayes is wrong are... wrong?"
Bayesian Inference is a good and real thing. And, Bayes' Theorem is an old formula, used in Bayesian Inference. AND Bayes' Theorem cannot produce Confidence Intervals, nor will it allocate to minimize the cost of being wrong, nor does it make adjustments for samples' bias toward the extrema. Those are all specific ways where "I just plug it into Bayes' Theorem" is factually wrong. You keep claiming that my critique is wrong - but you only do so vaguely! You skip right past these failures of Bayes' Theorem, each time I mention them. Check the math books: there is NO "question of what tool is best for a given job," as you say - rather, Bayes' Theorem alone is NEVER the tool. You'll have to adjust in many ways, not just one. And if you don't do so, you are in fact using an obsolete technique during your Bayesian Inference.
comment by Jonas Moss (jonas-moss-1) · 2023-02-25T23:49:16.251Z · LW(p) · GW(p)
Roughly speaking, we can divide Bayesianism into two, maybe three or more, separate but related meanings:
1. Adherence to a form of Bayesian epistemology. You think that knowledge comes in degrees of belief, and the correct way to update your beliefs on seeing new information is to use Bayes theorem. It's usually done informally.
2. Adherence to Bayesian statistics. You believe that frequentist inference is invalid and that frequentist measures of an estimator's quality should not be used. Instead, you prefer to use precisely defined priors and likelihoods, derive their posteriors, and report a quantity based solely on that. Moreover, you would often espouse some form of Bayesian decision theory - i.e., you have a loss function in addition to your prior and likelihood, and report (or act on) the optimal decision according to your framework. All of this is usually done formally.
Your comments about Dirichlet don't make sense. Are you thinking about the Dirichlet distribution? If so, it is more widely used in Bayesian statistics than frequentist statistics, as it is the conjugate prior to the multinomial distribution. Regarding your comments about the SAS institute, I can say this: Most of the members of this forum are deeply interested in deep learning. Is deep learning Bayesian? No. Not even Bayesian deep learning is properly Bayesian. Does that matter to you, as a Bayesian epistemologist? No, as deep learning has little to nothing to do with epistemology. Does it matter to you, as a Bayesian statistician? No, as deep learning is not about inference or decision theory, which is what Bayesian statisticians care about (for the most part).
By the way, Bayes theorem isn't a "statistical technique", it's just a theorem. Used by all statisticians without a second thought. It's when you use it to do inference you become a Bayesian statistician.
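The conjugacy mentioned in the comment above is short enough to write out. With a Dirichlet prior and multinomial counts, the posterior is again a Dirichlet whose parameters are the prior parameters plus the observed counts. The three categories, the flat prior, and the counts below are hypothetical, and the function names are made up for the sketch.

```python
def dirichlet_update(prior_alpha, counts):
    """Posterior Dirichlet parameters after observing multinomial counts."""
    return [a + c for a, c in zip(prior_alpha, counts)]

def dirichlet_mean(alpha):
    """Expected category probabilities under a Dirichlet with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Hypothetical three-category survey with a flat Dirichlet(1, 1, 1) prior.
prior = [1, 1, 1]
counts = [60, 30, 10]

posterior = dirichlet_update(prior, counts)
print("posterior parameters:", posterior)
print("posterior mean:", dirichlet_mean(posterior))
```

With only two categories this reduces to a Beta-binomial update, so the same two functions cover a Democrat/Republican survey as well.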
↑ comment by AnthonyRepetto · 2023-02-25T23:52:08.311Z · LW(p) · GW(p)
I haven't observed any rationalists here using Dirichlet, and no, I wasn't talking about Bayesian vs. Frequentist; Bayesians are correct. Using Bayes Theorem when you didn't consider the probability of each possible population producing your observed sample? That's definitely you doing it wrong. Instrumentation has variability; Dirichlet is how you include that, too.
comment by quanticle · 2023-02-26T07:01:35.534Z · LW(p) · GW(p)
Criticizing the use of Bayes Theorem because it's 260 years old is such a weird take.
The Pythagorean theorem is literally thousands of years old. But it's still useful, even though lots of progress has been made in trigonometry since then. Should we abandon a² + b² = c², as a result?
Replies from: M. Y. Zuo↑ comment by M. Y. Zuo · 2023-02-27T00:14:56.477Z · LW(p) · GW(p)
This does seem like a laughable conclusion. Imagine the implications for the world if this line of reasoning became the accepted paradigm!
Though I'm hesitant to outright dismiss anyone willing to put in effort into writing a post, the author here really needs to rewrite their post to remove all the self-imposed absurdities.
comment by simon · 2023-02-25T23:30:10.029Z · LW(p) · GW(p)
If you have a real argument that the prior is reliably best obtained via a Dirichlet process and no other method of coming up with a prior is ever more useful, then make the argument.
I see:
- argument from authority/prestige
- argument from age (as if math changes over time)
- straw/weakmanning ("These Rationalists pick the Prior that they *prefer*."; "The Rationalists repeatedly rely upon sparse evidence, while claiming certainty")
↑ comment by AnthonyRepetto · 2023-02-25T23:32:06.687Z · LW(p) · GW(p)
Dirichlet is used by industry, NOT Bayes. What is your rebuttal to that, to show that Bayes is in fact superior to Dirichlet?
Replies from: simon, AnthonyRepetto↑ comment by simon · 2023-02-25T23:35:13.219Z · LW(p) · GW(p)
The wiki article on the Dirichlet process includes:
"In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution."
I.e. it isn't an alternative to Bayes, but rather a way of coming up with a prior.
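To make "a distribution over distributions" concrete, here is a minimal sketch of one approximate draw from a Dirichlet process using the stick-breaking construction. The concentration parameter, the standard-normal base distribution, and the truncation at 200 atoms are all assumptions chosen for illustration, and the function name is hypothetical.

```python
import random

def stick_breaking_draw(alpha, base_sampler, truncation=200, rng=random):
    """One approximate draw from a Dirichlet process: a list of (atom, weight) pairs.

    Each step breaks a Beta(1, alpha) fraction off the remaining stick and
    assigns it to a fresh atom drawn from the base distribution.
    """
    remaining = 1.0
    draw = []
    for _ in range(truncation):
        fraction = rng.betavariate(1, alpha)
        draw.append((base_sampler(), remaining * fraction))
        remaining *= 1 - fraction
    return draw

# Seeded generator so the sketch is repeatable; the base distribution is a standard normal.
rng = random.Random(0)
draw = stick_breaking_draw(alpha=2.0, base_sampler=lambda: rng.gauss(0, 1), rng=rng)

total_weight = sum(weight for _, weight in draw)
heaviest = sorted(draw, key=lambda pair: pair[1], reverse=True)[:3]
print("total weight (close to 1):", total_weight)
print("three heaviest atoms:", heaviest)
```

Each call returns a different discrete distribution, and that random distribution is what gets used as a prior in nonparametric Bayesian models.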
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-25T23:39:23.033Z · LW(p) · GW(p)
Erm, I didn't include a link, so you're literally fabricating. And, just because Dirichlet can be used over the possible population of parameters, doesn't mean that's ALL it does; it's NOT "just to plug into Bayes". You need to learn more than the first paragraph of wikipedia, and the fact that you assume you're right, when you ONLY read so much, is more demonstration of your community being Dunning-Krugers. You haven't learned enough to learn that you're wrong.
Replies from: simon↑ comment by simon · 2023-02-25T23:43:57.315Z · LW(p) · GW(p)
My apologies, I must have searched it and forgot that I did so.
That being said, can you provide an argument/link that there is any part of the Dirichlet process that is not Bayesian?
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-25T23:54:50.722Z · LW(p) · GW(p)
You are introducing the umbrella-term "Bayesian" when I too agree in Bayesian vs Frequentist. That is NOT the same as "Uses Bayes' Theorem without compensating for the likelihood of possible populations, nor cost of being wrong." If you do the latter, as many I've met in the Rationalist community do, you're doing statistical inference wrong. Industry uses Dirichlet, while y'all don't - provide a rebuttal to that key point, or else you don't have an argument.
Replies from: simon↑ comment by simon · 2023-02-26T00:00:03.189Z · LW(p) · GW(p)
If some people are doing that (edit: i.e. overconfidently generalizing from a few datapoints, which I think was in the above comment but taken out), they are doing it wrong. One of Jaynes' main points is that you should take into account all the available information.
I've not encountered any claim that anyone can do perfect Bayesian reasoning in their head.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T00:05:16.240Z · LW(p) · GW(p)
If you perform Bayes' Theorem as presented by Scott Alexander and Eliezer Yudkowsky, then you are necessarily NOT including the Dirichlet Process... because they don't! Bayes' Theorem has no capacity to give you a Confidence Interval; you'll need to add modern techniques to get information like that. Scott Alexander and Eliezer have a crew of people who never learned those facts, all pretending they're doing it correctly, when they aren't doing Dirichlet on the possible populations' likelihood distribution. Where is the "possible populations' likelihood distribution" mentioned by Scott Alexander or Eliezer? They give you the wrong info, and you don't check industry, which uses Dirichlet.
None of you address this core point: Industry uses Dirichlet. How do you get around that, and pretend you're doing it right?
Replies from: simon↑ comment by simon · 2023-02-26T01:11:17.579Z · LW(p) · GW(p)
A confidence interval is just an upper and lower bound according to some probability threshold. I.e. it's just probabilities, and does not require some super special technique.
Regarding the Dirichlet process:
Reading the wiki article it seems like it's designed for a particular class of problems, and is not a general solution to all problems. So, it would make sense to use it if your problem falls in that class, but not if it doesn't.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T01:14:29.638Z · LW(p) · GW(p)
Erm, you are demonstrating that same issue I pointed out originally: you thinking that you have the right answer, after only a wiki page, is exactly the Dunning-Kruger Effect. You're evidence of my argument, now.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T01:17:47.726Z · LW(p) · GW(p)
The way you derive a confidence interval is by assessing the likelihood function, which is across the distribution of populations. Bayes' Theorem, as presented by Scott Alexander and Eliezer Yudkowsky, does NOT include those tools; you can't use what they present to derive an actual confidence interval. Your claim of 'confidence' on a prediction market is NOT the same as Dirichlet saying "95% of the possible populations' likelihood MASS lies within these bounds." THAT is a precise and valuable fact which "Bayes as presented to Rationalists" does NOT have the power to derive.
↑ comment by AnthonyRepetto · 2023-02-25T23:33:03.116Z · LW(p) · GW(p)
And, I never claimed that priors are better obtained with Dirichlet than Bayes... I'm not sure what you were reading, could you quote the section where you thought I was making that claim?
comment by xepo · 2023-02-26T02:25:20.705Z · LW(p) · GW(p)
why are you trying to attack instead of educate?
90% of your article is “rationalists do it wrong”. Why? Who cares? Teach us how to do it better instead of focusing on how we’re doing it wrong.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T02:53:01.128Z · LW(p) · GW(p)
No, I write articles in various newsletters. I wrote an article ABOUT your community. I shared that article here, not for your pleasure or personal growth. You're an adult; check reality by seeing if industry uses Bayes' Theorem the way Scott Alexander and Eliezer Yudkowsky do. That's the work you are responsible for as an adult, before you go around claiming that you're doing the best job at finding truth, when Bayes' Theorem can't even give you a Confidence Interval.
comment by Adam Shai (adam-shai) · 2023-02-25T23:39:43.403Z · LW(p) · GW(p)
I don't know if I'm missing something, but it sounds like you are arguing for a particular method of picking a prior within a Bayesian context, but you are not arguing against Bayes itself. If anything, it seems to me this is pro-Bayes, just using Dirichlet Processes as a prior.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-25T23:41:48.353Z · LW(p) · GW(p)
Erm, is SAS using Bayes? That's the actual best in class.
Replies from: adam-shai↑ comment by Adam Shai (adam-shai) · 2023-02-25T23:44:27.525Z · LW(p) · GW(p)
Well I don't know SAS at all but a quick search of the SAS documentation for Dirichlet calls it a "nonparametric Bayes approach"...
https://documentation.sas.com/doc/en/casactml/8.3/casactml_nonparametricbayes_details12.htm
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-25T23:49:38.983Z · LW(p) · GW(p)
SAS has developed their own trade-secret that outperforms all public methods; by definition, that MUST not be what YOU do when you apply Bayes to a few personal examples.
comment by Dagon · 2023-02-26T14:28:28.333Z · LW(p) · GW(p)
The downvotes are predictable - not only is it mis-stating a strawman of the group's position, it uses a lot of exclamation points to emphasize how stupid we all are.
However, it's also got some pretty good points, especially as some of the adjacent social groups are exploding, in part due to untenable extrapolation over unstable premises.
comment by LVSN · 2023-02-25T23:53:48.917Z · LW(p) · GW(p)
I can't tell if you're right, because no one has ever laid out [LW · GW] Bayesianism as a set of definitions and instruction steps, explained what Bayesianism uniquely relevantly achieves, and explored the relevant consequences of making various tempting-from-some-perspective mutations on the instruction set; those are the steps required to elevate a person's grasp of Bayesianism to true understanding.
You have also not followed those steps with Dirichlet and SAS, and compared it to Bayesianism.
Still I have an intuition that your complaint about using personal experience is not virtuous. Everything you learn has to pass through personal experience. If your other ways of becoming informed had not been conceived, learning from personal experience would still be possible. Information is information no matter how seriously you take it, and I think personal experience is worth taking seriously, as a person who concerns themself with misleadingness, in a world full of people who attend only to the truth of what they hear and not to the misleadingness.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-25T23:56:52.560Z · LW(p) · GW(p)
You claim of Bayes and Dirichlet that "no one has ever laid (them) out", and to prove your claim, you link to another post that YOU wrote, where you claim it again? Check math textbooks; I don't have to teach you what's already available in the public sphere.
Replies from: LVSN↑ comment by LVSN · 2023-02-26T00:03:37.778Z · LW(p) · GW(p)
It was not to prove my claim; the post I wrote elaborates more fully on what I believe is the correct teaching process. If you read the post, it would become clear to you that my teaching standards have never been met in textbooks, and can hardly even in principle be met through textbooks. My teaching standards are not arbitrary; if these standards are not met then I will not truly understand the subject.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T00:11:05.077Z · LW(p) · GW(p)
Your difficulty understanding it is NOT equivalent to "no one has ever laid them out". Those are two wildly different statements. A dyslexic person would have similar difficulty reading a novel, yet that is NOT equal to "no one ever wrote a book."
Replies from: LVSN↑ comment by LVSN · 2023-02-26T00:17:16.991Z · LW(p) · GW(p)
Feeling like you understand is not the same as actual understanding. People who read the existing explanations and feel like they understand, when the explanations did not follow the process I described, do not truly understand. My complaint is not that when I read the explanations I don't feel like I understand them; my complaint is that the extents to which Bayesianism have ever been laid out are insufficient for creating true understanding upon first reading.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T00:30:52.069Z · LW(p) · GW(p)
Astounding! Then my argument that "NOT including Dirichlet is wrong" must have been wrong? Or else, why are you mentioning that no one taught you to your own satisfaction?
Replies from: LVSN↑ comment by LVSN · 2023-02-26T00:48:41.268Z · LW(p) · GW(p)
"Then my argument that 'NOT including Dirichlet is wrong' must have been wrong?"
It could be right, actually. The only objection I made was in response to your objection to using personal experience, and I only talked about my intuition rather than what must or must not be the case.
"Or else, why are you mentioning that no one taught you to your own satisfaction?"
You seem to want to proselytize better epistemic methods, and I am telling you what I need from you in order to adopt or reject your advised methods from an engineering angle (which I regard as superior); until then I can only follow clues of lesser quality (such as the correlation between caring about misleadingness and tendency to say things that impress me as insightful); the detective angle.
Replies from: AnthonyRepetto, AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T01:11:56.413Z · LW(p) · GW(p)
Screenshots are up! I'll be glad when more members of the public see the arguments you give for ignoring mine. :P cheers!
↑ comment by AnthonyRepetto · 2023-02-26T00:59:05.801Z · LW(p) · GW(p)
Wait. Let me see if I've got the core points:
"I don't need to engage with Anthony's arguments unless he presents them to my satisfaction."
AND
"No one on Earth has ever presented Bayes to my satisfaction."
If NO ONE has EVER presented that information to your satisfaction, it would be daft to assume I would accomplish such a feat! You have such high standards as a pre-requisite to your engagement, that by your OWN admission, NO ONE in history has ever MET your standard! Why bother telling me all this? I didn't share the post to convince or proselytize - as I said in the intro, I am only sharing this on your site as a courtesy. I wrote it on another newsletter, for the general public to learn about all of you.
And, considering that your community's response is "I don't have to engage with the argument unless you present it to my satisfaction, and NO ONE has ever done so, thus I win," the public should get to have a good, stern look at your behavior and justifications. I get you to betray yourself, and screenshot your responses, to show the PUBLIC what your community is like.
Replies from: LVSN↑ comment by LVSN · 2023-02-26T01:25:27.946Z · LW(p) · GW(p)
You should not argue for that which you do not understand in the first place as though you understood it.
I think I am a very odd member of the rationalist community; it would not make sense to take me as a representative. Many people here would probably be comfortable saying they understand Bayesianism after a typical explanation and I would have to disagree with them about that, little high-standards weirdo that I am.
I'm sorry that you feel like I was making any of my responses at your expense; I don't want you to lose, and by helping each other make considerations not made before I believe we are helping each other win.
comment by duck_master · 2024-07-16T18:49:13.964Z · LW(p) · GW(p)
The single biggest question I have is "what is Dirichlet?"
comment by [deleted] · 2023-02-25T23:32:18.110Z · LW(p) · GW(p)
As I understand it, in the event that you are correct and Dirichlet is better, rational Rationalists must switch to the better algorithm. Because rationality is about systematized winning, and if you are correct, this is a measurably better algorithm to win.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-25T23:37:22.025Z · LW(p) · GW(p)
Yes! And, ever since Dirichlet was published in 1973, it has ONLY ever been run on super-computers, using statistically significant sample sizes! You CANNOT do Dirichlet in your head, unless you are a Savant, and no math class will ask you to Dirichlet on a quiz. I'm not sure how ANYONE can claim Bayes is reliable, when NO ONE in industry touches it... your community has an immense blind-spot to real-world methods, yet you claim certainty and confidence - that's the Dunning-Krugers self-selecting into a pod that all agree they're right to use Bayes.
Replies from: None↑ comment by [deleted] · 2023-02-25T23:58:54.917Z · LW(p) · GW(p)
That's only one piece of rationality, and I think the general conclusion was that "ask an artificial intelligence you can trust" would be the only scalable way for humans to be genuinely rational in their decision-making. It does not matter what algorithm that machine uses internally, merely that it is the best-performing one from the class of "sufficiently trustworthy" choices.
Note that this is a feasible thing to do; for example, the activation function Swish was found this way.
A lot of the rest of it was dismissing obviously wrong individuals and institutions? You saw how you dismissed the idea of "start with a prior from the median of mainstream knowledge" and "update with each anecdote"?
The thing is, that method is arguably better than many institutions and individuals are. At least it uses information to make its decision.
One of the tenets, "what does $authority_figure claim to know and how does he know it", allows you to dismiss obviously wrong/misaligned authorities on subjects.
Such as the FDA, or machine learning scientists setting 2060 as the date for AGI. (The FDA is misaligned: it serves its own interests, not the interests of living Americans wanting to remain that way. The ML scientists did not account for an increase in investment or recursive improvement.)
There are a lot of other ideas and societal practices that are simply based on bullshit; no actual thought or process was even followed to generate them, and they are usually just parroting some past flawed idea. Like what you said regarding Bayes.
Replies from: AnthonyRepetto↑ comment by AnthonyRepetto · 2023-02-26T00:08:29.835Z · LW(p) · GW(p)
Then why does industry use Dirichlet, not Bayes? You keep pretending yours is better, when everyone who has to publish physics uses additional methods, from this century. None of you explain why industry would use Dirichlet, if Bayes is superior. Further, why would Dirichlet even be PUBLISHED unless it's an improvement? You completely disregard these blinding facts. More has happened in the last 260 years than just Bayes' Theorem, and your suspicion of the FDA doesn't change that fact.