The Mountain Troll

post by lsusr · 2022-06-11T09:14:01.479Z · LW · GW · 26 comments

It was a sane world. A Rational world. A world where every developmentally normal teenager was taught Bayesian probability.

Saundra's math class was dressed in their finest robes. Her teacher, Mr Waze, had invited the monk Ryokan to come speak. It was supposed to be a formality. Monks rarely came down from their mountain hermitages. The purpose of inviting monks to speak was to show respect for how much one does not know. And yet, monk Ryokan had come down to teach a regular high school class of students.

Saundra ran to grab monk Ryokan a chair. All the chairs were the same—even Mr Waze's. How could she show respect to the mountain monk? Saundra's eyes darted from chair to chair, looking for the cleanest or least worn chair. While she hesitated, Ryokan sat on the floor in front of the classroom. The students pushed their chairs and desks to the walls of the classroom so they could sit in a circle with Ryokan.

"The students have just completed their course on Bayesian probability," said Mr Waze.

"I see[1]," said Ryokan.

"The students also learned the history of Bayesian probability," said Mr Waze.

"I see," said Ryokan.

There was an awkward pause. The students waited for the monk to speak. The monk did not speak.

"What do you think of Bayesian probability?" said Saundra.

"I am a Frequentist," said Ryokan.

Mr Waze stumbled. The class gasped. A few students screamed.

"It is true that trolling is a core virtue of rationality," said Mr Waze, "but one must be careful not to go too far."

Ryokan shrugged.

Saundra raised her hand.

"You may speak. You need not raise your hand. Rationalism does not privilege one voice above all others," said Ryokan.

Saundra's voice quivered. "Why are you a Frequentist?" she said.

"Why are you a Bayesian?" said Ryokan. Ryokan kept his face still but he failed to conceal the twinkle in his eye.

Saundra glanced at Mr Waze. She forced herself to look away.

"May I ask you a question?" said Ryokan.

Saundra nodded.

"With what probability do you believe in Bayesianism?" said Ryokan.

Saundra thought about the question. Obviously not 1 because no Bayesian believes anything with a confidence of 1. But her confidence was still high.

"Ninety-nine percent," said Saundra, "Zero point nine nine."

"Why?" said Ryokan, "Did you use Bayes' Equation? What was your prior probability before your teacher taught you Bayesianism?"

"I notice I am confused," said Saundra.

"The most important question a Rationalist can ask herself is 'Why do I think I know what I think I know?'" said Ryokan. "You believe Bayesianism with a confidence of where represents the belief 'Bayesianism is true' and represents the observation 'your teacher taught you Bayesianism'. A Bayesian belives with a confidence because . But that just turns one variable into three variables ."

Saundra spotted the trap. "I think I see where this is going," said Saundra. "You're going to ask me where I got values for the three numbers P(O|B), P(B), and P(O)."

Ryokan smiled.

"My prior probability was very small because I didn't know what Bayesian probability was. Therefore must be very large." said Saundra.

Ryokan nodded.

"But if is very large then that means I trust what my teacher says. And a good Rationalist always questions what her teacher says," said Saundra.

"Which is why trolling is a fundamental ingredient to Rationalist pedagogy. If teachers never trolled their students then students would get lazy and believe everything that came out of their teachers' mouths," said Ryokan.

"Are you trolling me right now? Are you really a Frequentist?" said Saundra.

"Is your teacher really a Bayesian?" said Ryokan.


  1. Actually, what Ryokan said was "そうです" which means "[it] is so". ↩︎

26 comments

Comments sorted by top scores.

comment by Viliam · 2022-06-11T21:26:54.081Z · LW(p) · GW(p)

"It is simply a question of which method of calculation is more convenient for you," explained Ryokan. "If you are a high-level monk and remember your previous billion reincarnations, any possible event you consider has already happened to you many times. Calculating n(A)/n(S) once requires much less effort than calculating Bayesian updates over and over again."

"But if the enlightened person remembers the frequencies of everything, how can probability be in the mind?" cried Saundra.

"Because the entire reality only exists in the mind of Lord Vishnu," answered Mr Waze.

comment by Aleksi Liimatainen (aleksi-liimatainen) · 2022-06-11T12:34:00.701Z · LW(p) · GW(p)

I am a regularity detector generated by the regularities of reality. Frequentism and Bayesianism are attempted formalizations of the observed regularities in the regularity detection process but, ultimately, I am neither.

comment by Yoav Ravid · 2023-12-12T09:13:05.654Z · LW(p) · GW(p)

Heh, reading this after my dialogue with Lsusr [LW · GW], it seems most people have mistaken this story to be about Bayesianism and Frequentism, when it's actually about Trolling.

In a way, I guess it worked? :P

Replies from: lsusr
comment by lsusr · 2023-12-12T09:14:34.907Z · LW(p) · GW(p)

The first paragraph was supposed to be sarcastic satire.

comment by Jay · 2022-06-11T12:12:50.298Z · LW(p) · GW(p)

I think a better way to look at it is that frequentist reasoning is appropriate in certain situations and Bayesian reasoning is appropriate in other situations.  Very roughly, frequentist reasoning works well for descriptive statistics and Bayesian reasoning works well for inferential statistics.  I believe that Bayesian reasoning is appropriate to use in certain kinds of cases with a probability of (1-delta), where 1 represents the probability of something that has been rationally proven to my satisfaction and delta represents the (hopefully small) probability that I am deluded.

comment by Donald Hobson (donald-hobson) · 2023-12-30T01:03:46.121Z · LW(p) · GW(p)

As a human mind, I have a built in default system of beliefs. That system is a crude "sounds plausible" intuition. This mostly works pretty well, but it isn't perfect.

This crude system heard about probability theory, and assigned it a "seems true" marker. The background system, as used before learning probability theory, kind of roughly approximates part of probability theory. But it's not a system that produces explicit numbers. 

So I can't assign a probability to Bayesianism being true, because the part of my mind that decided it was true isn't using explicit probabilities, just feelings.

comment by nim · 2022-06-11T18:33:31.220Z · LW(p) · GW(p)

"A good rationalist always questions what her teacher says."

Why does Saundra believe this? I'd hazard the guess that her teacher said it to her.

The axioms that we pick up before we learn to question new axioms are the hardest to see and question. I wonder if that's a factor in the correlation where "smarter" people often seem to learn to question axioms earlier in life -- less time spent getting piled with beliefs that were never tested by the "shall I choose to believe this?" filter because the filter didn't exist yet when the beliefs were taken on.

Replies from: Vladimir_Nesov, Jay
comment by Vladimir_Nesov · 2022-06-11T19:31:30.918Z · LW(p) · GW(p)

The whole concept of "questioning" is questionable, as it's suggesting an improvement over a status quo where claims you overhear are unconditionally accepted as one's own beliefs verbatim, or at least alternatives to them are discouraged from being discussed, which is insane (and the way language models learn). A more reasonable baseline for improvement is where claims are given inappropriate credence or inappropriate attention to the question of their credence (where one alternative is siloing them inside hypotheticals).

comment by Jay · 2022-06-11T21:06:39.400Z · LW(p) · GW(p)

Being honest, for nearly all people nearly all of the time questioning firmly established ideas is a waste of time at best.  If you show a child, say, the periodic table (common versions of which have hundreds of facts), the probability that the child's questioning will lead to a significant new discovery is less than 1 in a billion* and the probability that it will lead to a useless distraction approaches 100%.  There are large bodies of highly reliable knowledge in the world, and it takes intelligent people many years to understand them well enough to ask the questions that might actually drive progress.  And when people who are less intelligent, less knowledgeable, and/or more prone to motivated reasoning are asking the questions, you can get flat earthers, Qanon, etc.

*Based on the guess that we've taught the periodic table to at least a billion kids and it's never happened yet.

Replies from: Vladimir_Nesov, jiao-bu
comment by Vladimir_Nesov · 2022-06-11T21:23:35.714Z · LW(p) · GW(p)

the probability that the child's questioning will lead to a significant new discovery

The relevant purpose is new discoveries for the child, which is quite plausible. Insufficiently well-understood claims are also not really known, even when they get to be correctly (if not validly) accepted on faith. (And siloing such claims inside appropriate faith-correctness/source-truthfulness hypotheticals is still superior to accepting them unconditionally.) There is also danger of discouraging formation of gears level [LW · GW] understanding on the basis of irrefutability of policy level [LW · GW] knowledge, rendering ability to make use of that knowledge brittle. The activity of communicating personal discoveries to the world is mostly unrelated.

Replies from: Jay
comment by Jay · 2022-06-12T00:43:55.102Z · LW(p) · GW(p)

I get your point, and I totally agree that answering a child's questions can help the kid connect the dots while maintaining the kid's curiosity.  As a pedagogical tool, questions are great.  

Having said that, most people's knowledge of most everything outside their specialties is shallow and brittle.  The plastic in my toothbrush is probably the subject of more than 10 Ph.D. dissertations, and the forming processes of another 20.  This computer I'm typing on is probably north of 10,000.  I personally know a fair amount about how the silicon crystals are grown and refined, have a basic understanding of how the chips are fabricated (I've done some fabrication myself), know very little about the packaging, assembly, or software, and know how to use the end product at a decent level.  I suspect that worldwide my overall knowledge of computers might be in the top 1% (of some hypothetical reasonable measure).  I know very little about medicine, agriculture, nuclear physics, meteorology, or any of a thousand other fields.

Realistically, a very smart* person can learn anything but not everything (or even 1% of everything).  They can learn anything given enough time, but literally nobody is given enough time.  In practice, we have to take a lot of things on faith, and any reasonable education system will have to work within this limit.  Ideally, it would also teach kids that experts in other fields are often right even when it would take them several years to learn why.

*There are also average people who can learn anything that isn't too complicated and below-average people who can't learn all that much.  Don't blame me; I didn't do it.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2022-06-12T01:25:20.528Z · LW(p) · GW(p)

My point is not that one should learn more, but about understanding naturally related to any given claim of fact, whose absence makes it brittle and hollow. This sort of curiosity does apply to your examples, not in a remedial way that's only actually useful for other things. The dots being connected are not other claims of fact, but alternative versions of the claim (including false ones) and ingredients of motivation for looking into the fact and its alternatives, including more general ideas whose shadows influence the claim. These gears of the idea do nothing for policies that depend on the fact, if it happens to be used appropriately, but tend to reassemble into related ideas that you never heard about (which gives an opportunity to learn what is already known about them).

It doesn't require learning much more, or about toothbrushes, it's instead emphasis of curiosity on things other than directly visible claims of fact, that shifts attention to those other things when presented with a given claim. This probably results in knowing less, with greater fluency.

Replies from: Jay
comment by Jay · 2022-06-12T11:58:00.758Z · LW(p) · GW(p)

To the extent that I understand what you're saying, you seem to be arguing for curiosity as a means of developing a detailed, mechanistic ("gears-level" in your term) model of reality.  I totally support this, especially for the smart kids.  I'm just trying to balance it out with some realism and humility.  I've known too many people who know that their own area of expertise is incredibly complicated but assume that everything they don't understand is much simpler.  In my experience, a lot of projects fail because a problem that was assumed to be simple turned out not to be.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2022-06-12T15:01:23.855Z · LW(p) · GW(p)

This is useless in practice and detrimental to being a living encyclopedia, distracting from facts deemed salient by civilization. Combinatorial models of more specific and isolated ideas you take an interest in, building blocks for reassembling into related ideas, things that can be played with and not just taken from literature and applied according to a standard methodology. The building blocks are not meant to reconstruct ideas directly useful in practice, it's more about forming common sense and prototyping. The kind of stuff you learn in the second year of college (the gears, mathematical tools, empirical laws), in the role of how you make use of it in the fourth year of college (the ideas reassembled from them, claims independently known that interact with them, things that can't be explained without the background), but on the scale of much smaller topics.

Well, that's the attempt to channel my impression of the gears/policy distinction, which I find personally rewarding, but not necessarily useful in practice, even for research. It's a theorist's aesthetic more than anything else.

comment by Jiao Bu (jiao-bu) · 2023-12-12T16:54:16.544Z · LW(p) · GW(p)

"There are large bodies of highly reliable knowledge in the world,[...]"

The purpose of the questioning is to find out which objects are in that bucket, and which objects are in some other bucket.

If the child accepts what she is told about (A) There are large bodies of highly reliable knowledge in the world, and (B) This is one of them, then you might get many types of crazy.

TH;DT:  The idea of firmly established ideas is unfortunately culturally and sub-culturally bound, at least to an extent.  Which "firmly established truths" are currently being taught in Salafi schools?  I think the "flat-earthers, Qanon, etc." could easily destroy the nonsense of their beliefs if they could employ a bit of the questioning.

Maybe what you and I are saying is a strong case of reversible advice?

Replies from: Jay
comment by Jay · 2023-12-16T00:04:36.446Z · LW(p) · GW(p)

You seem to be steering in the direction of postmodernism, which starts with the realization that there are many internally consistent yet mutually exclusive ways of modeling the world.  Humility won't solve that problem, but neither will a questioning mindset.  

Every intellectual dead-end was once the product of a questioning mind.  Questioning is much more likely to iterate toward a dead end than to generate useful results.  This isn't to say that it's never useful (it obviously can be), but it rarely succeeds and is only the optimal path if you're near the frontiers of current understanding (which schoolchildren obviously aren't).

The best way to get out of a local maximum that I've found is to incorporate elements of a different, but clearly functional, intellectual tradition.

Replies from: jiao-bu
comment by Jiao Bu (jiao-bu) · 2024-01-11T15:59:00.417Z · LW(p) · GW(p)

"The best way to get out of a local maximum that I've found is to incorporate elements of a different, but clearly functional, intellectual tradition."

I agree wholeheartedly with this being a good way (Not sure about "best").  The crux is "clearly functional" and "maxima" -- and as an adult, I can make pretty good judgments about this.  I'm also likely to bake in some biases about this that could be wrong.  And depending on what society you find yourself within, you might do the same.

If I understand you, you are basically asking to jump from one maximum to another, assuming that in doing this search algorithm, you will eventually find a maximum that's better than the one you're in, or get enough information to go back to the previous one.  And we limit our search on "functional."

But what if you have little information or priors available as to what would be functional or not, or even what constitutes a maximum?  There's no information telling a child not to go join a fringe religious group, for example (and I think they often do their recruiting among the very young, for this reason).

Moreover, if someone (1) lacks clear criteria for what constitutes a "maximum" or "functional," or (2) wishes to explore other models of "functional" because they suspect their current model may be self-limiting, then we get to questioning.

And I think in (2) above, I am defining the positive side of post-modernism, which also exists and contributes to our society.  The most salient criticism of post-modernism is usually that it is anti-hierarchical, yet its insistence that it is a better approach than those before it constitutes a performative contradiction.  Also, I think they are sometimes guilty of taking a "noble savage" approach to other cultures or ways of thinking (failure to judge what is functional).

However, if we combine the "questioning" (broad search, willing to approach with depth where it seems useful), with some level of judgement about "functional" (assuming our judgement is sound), then I think it's still a useful approach.

Because what you have presented offers no method I can see for a child without existing priors, or someone educated in a Salafi school or similar (where judgement of "functional" is artificially curtailed), to find better ways to think.

Replies from: Jay
comment by Jay · 2024-01-15T00:58:52.583Z · LW(p) · GW(p)

A child who's educated in a Salafi school has two choices - become a Salafi or become a failed Salafi.  One of those is clearly better than the other. Salafis, like almost every adult, know how to navigate their environment semi-successfully and the first job of education is to pass on that knowledge.  It would be better if the kid could be given a better education, but the kid won't have much control over that (and wouldn't have the understanding to choose well).  Kids are ignorant and powerless; that's not a function of any particular political or philosophical system.

I think in general it's best for children to learn from adults mostly by rote.  Children should certainly ask questions of the adults, but independent inquiry will be at best inefficient and usually a wrong turn.   The lecture-and-test method works, and AFAIK we don't have anything else that teaches nearly as well.

Later, when they have some understanding, they can look around for better examples.  

Replies from: jiao-bu
comment by Jiao Bu (jiao-bu) · 2024-01-22T19:04:26.009Z · LW(p) · GW(p)

We are also overloading the word "Child" here, which we may need to disambiguate at this point.

What you are saying applies broadly to a 7-year-old, and less to a 16-year-old.  For the 16-year-old, there are no longer just two possible outcomes, "succeed as a Salafi" or "fail as a Salafi."  There is often the very real option to "Make your way towards something else."  And the seeds of that could easily start (probably did!) in the 13- or 14-year-old.

It's also neat that humans are kind of wired so that the great questioning/rebellion tends to happen more in the 13-to-16-year-old than the 7-year-old.  Thus the common phenomenon where the person graduates high school and church at the same time, or leaves the cult, emigrates, etc.

Replies from: Jay
comment by Jay · 2024-01-22T23:36:00.916Z · LW(p) · GW(p)

I think you're onto something.  I think, for this purpose, "child" means anyone who doesn't know enough about the topic to have any realistic chance at successful innovation.  A talented 16 year old might successfully innovate in a field like music or cooking, having had enough time to learn the basics.  When I was that age kids occasionally came up with useful new ideas in computer programming, but modern coding seems much more sophisticated.  In a very developed field, one might not be ready to innovate until several years into graduate school.  

A 16-year-old Salafi will be strongly influenced by his Salafi upbringing.  Even if he* rebels, he'll be rebelling against that specific strain of Islam.  It would take a very long and very specific journey to take him toward California-style liberalism; given the opportunity to explore he'd likely end up somewhere very different.

*My understanding of this particular Islamic school is hazy, but I doubt our student is female.

comment by Adam Zerner (adamzerner) · 2022-06-12T05:08:06.554Z · LW(p) · GW(p)

Is there some connection to the mountain troll in HPMoR? I'm not seeing it, but I feel like the title would be too big a coincidence otherwise.

Replies from: Dirichlet-to-Neumann
comment by Dirichlet-to-Neumann · 2022-06-12T15:30:45.979Z · LW(p) · GW(p)

Sometimes a mountain troll is just a mountain troll.

comment by NoriMori1992 · 2024-04-03T05:05:43.259Z · LW(p) · GW(p)

You believes Bayesianism

 

Should be "believe"

Replies from: lsusr
comment by lsusr · 2024-04-03T16:44:55.080Z · LW(p) · GW(p)

Fixed. Thanks.

comment by Darmani · 2022-06-12T04:01:25.176Z · LW(p) · GW(p)

I don't see what this parable has to do with Bayesianism or Frequentism. 

 

I thought this was going to be some kind of trap or joke around how "probability of belief in Bayesianism" is a nonsense question in Frequentism.

Replies from: amaury-lorin
comment by momom2 (amaury-lorin) · 2024-09-18T10:43:26.889Z · LW(p) · GW(p)

It's thought-provoking.
Many people here identify as Bayesians, but are as confused as Saundra by the troll's questions, which indicates that they're missing something important.