Beyond Statistics 101

post by JonahS (JonahSinick) · 2015-06-26T10:24:41.919Z · LW · GW · Legacy · 129 comments

Contents

  Is statistics beyond introductory statistics important for general reasoning?
  Advanced statistics enables one to reach nonobvious conclusions
  IQ research and PCA as a case study
  The philosophy of dimensionality reduction
129 comments

Is statistics beyond introductory statistics important for general reasoning?

Ideas such as regression to the mean, the fact that correlation does not imply causation, and the base rate fallacy are very important for reasoning about the world in general. One gets these from a deep understanding of statistics 101, and the basics of the Bayesian statistical paradigm. Up until one year ago, I was under the impression that more advanced statistics is technical elaboration that doesn't offer major additional insights into thinking about the world in general.

Nothing could be further from the truth: ideas from advanced statistics are essential for reasoning about the world, even on a day-to-day level. In hindsight my prior belief seems very naive – as far as I can tell, my only reason for holding it is that I hadn't heard anyone say otherwise. But I hadn't actually looked at advanced statistics to see whether or not my impression was justified :D.

Since then, I've learned some advanced statistics and machine learning, and the ideas that I've learned have radically altered my worldview. The "official" prerequisites for this material are calculus, differential multivariable calculus, and linear algebra. But one doesn't actually need to have detailed knowledge of these to understand ideas from advanced statistics well enough to benefit from them. The problem is pedagogical: I need to figure out how to communicate them in an accessible way.

Advanced statistics enables one to reach nonobvious conclusions

To give a bird's eye view of the perspective that I've arrived at, in practice, the ideas from "basic" statistics are generally useful primarily for disproving hypotheses. This pushes in the direction of a state of radical agnosticism: the idea that one can't really know anything for sure about lots of important questions. More advanced statistics enables one to become justifiably confident in nonobvious conclusions, often even in the absence of formal evidence coming from the standard scientific practice.

IQ research and PCA as a case study

In the early 20th century, the psychologist and statistician Charles Spearman discovered the g-factor, which is what IQ tests are designed to measure. The g-factor is one of the most powerful constructs that's come out of psychology research. There are many factors that played a role in enabling Bill Gates's ability to save perhaps millions of lives, but one of the most salient factors is his IQ being in the top ~1% of his class at Harvard. IQ research helped the Gates Foundation to recognize iodine supplementation as a nutritional intervention that would improve socioeconomic prospects for children in the developing world.

The work of Spearman and his successors on IQ constitutes one of the pinnacles of achievement in the social sciences. But while Spearman's discovery of IQ was a great discovery, it wasn't his greatest discovery. His greatest discovery was a discovery about how to do social science research. He pioneered the use of factor analysis, a close relative of principal component analysis (PCA).

The philosophy of dimensionality reduction

PCA is a dimensionality reduction method. Real-world data often has the surprising property of "dimensionality reduction": a small number of latent variables explain a large fraction of the variance in data.

This is related to the effectiveness of Occam's razor: it turns out to be possible to describe a surprisingly large amount of what we see around us in terms of a small number of variables. Only, the variables that explain a lot usually aren't the variables that are immediately visible; instead they're hidden from us, and in order to model reality, we need to discover them, which is the function that PCA serves. The small number of variables that drive a large fraction of variance in data can be thought of as a sort of "backbone" of the data. That enables one to understand the data at a "macro / big picture / structural" level.
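
To make the "small number of latent variables" idea concrete, here is a minimal sketch in Python with NumPy. It is an illustration on made-up data, not a reconstruction of Spearman's analysis: the six "test scores", the single hidden factor, the loadings, and the noise level are all assumptions chosen for the example, and `explained_variance_ratios` is a hypothetical helper name.

```python
import numpy as np

def explained_variance_ratios(data):
    """Fraction of total variance carried by each principal component,
    sorted from largest to smallest."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; reverse them.
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]
    return eigenvalues / eigenvalues.sum()

# Synthetic data (assumption): six observed test scores that all depend
# on one hidden factor, plus independent noise for each test.
rng = np.random.default_rng(0)
n_people, n_tests = 2000, 6
latent = rng.normal(size=(n_people, 1))                   # the hidden variable
loadings = np.array([[0.9, 0.8, 0.8, 0.7, 0.7, 0.6]])     # assumed loadings
noise = 0.5 * rng.normal(size=(n_people, n_tests))
scores = latent @ loadings + noise

ratios = explained_variance_ratios(scores)
print(np.round(ratios, 3))
```

Under these assumptions the first component should carry roughly three quarters of the total variance, with the remaining five sharing the rest about equally. Nothing in the observed columns says "there is one hidden factor"; it shows up only once the variance is decomposed.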

This is a very long story that will take a long time to flesh out, and doing so is one of my main goals. 

129 comments

Comments sorted by top scores.

comment by minusdash · 2015-06-26T14:43:42.453Z · LW(p) · GW(p)

"impression that more advanced statistics is technical elaboration that doesn't offer major additional insights"

Why did you have this impression?

Sorry for the off-topic, but I see this a lot in LessWrong (as a casual reader). People seem to focus on textual, deep-sounding, wow-inducing expositions, but often dislike the technicalities, getting hands dirty with actually understanding calculations, equations, formulas, details of algorithms etc (calculations that don't tickle those wow-receptors that we all have). As if these were merely some minor additions over the really important big picture view. As I see it this movement seems to try to build up a new backbone of knowledge from scratch. But doing this they repeat the mistakes of the past philosophers. For example going for the "deep", outlook-transforming texts that often give a delusional feeling of "oh now I understand the whole world". It's easy to have wow-moments without actually having understood something new.

So yes, PCA is useful and most statistics and maths and computer science is useful for understanding stuff. But then you swing to the other extreme and say "ideas from advanced statistics are essential for reasoning about the world, even on a day-to-day level". Tell me how exactly you're planning to use PCA day-to-day? I think you may mean you want to use some "insight" that you gained from it. But I'm not sure what that would be. It seems to be a cartoonish distortion that makes it fit into an ideology.

Anyway, mainstream machine learning is very useful. And it's usually much more intricate and complicated than to be able to produce a deep everyday insight out of it. I think the sooner you lose the need for everything to resonate deeply or have a concise insightful summary, the better.

Replies from: JonahSinick, VoiceOfRa, RomeoStevens, None
comment by JonahS (JonahSinick) · 2015-06-26T15:02:10.388Z · LW(p) · GW(p)

Why did you have this impression?

Groupthink I guess: other people who I knew didn't think that it's so important (despite being people who are very well educated by conventional standards, top ~1% of elite colleges).

Tell me how exactly you're planning to use PCA day-to-day?

Disclaimer: I know that I'm not giving enough evidence to convince you: I've thought about this for thousands of hours (including working through many quantitative examples) and it's taking me a long time to figure out how to organize what I've learned.

I already have been using dimensionality reduction (qualitatively) in my day to day life, and I've found that it's greatly improved my interpersonal relationships because it's made it much easier to guess where people are coming from (before people's social behavior had seemed like a complicated blur because I saw so many variables without having started to correctly identify the latent ones).

I think the sooner you lose the need for everything to resonate deeply or have a concise insightful summary, the better.

You seem to be making overly strong assumptions with insufficient evidence: how would you know whether this was the case, never having met me? ;-)

Replies from: minusdash, InquilineKea
comment by minusdash · 2015-06-26T15:29:36.887Z · LW(p) · GW(p)

Qualitative day-to-day dimensionality reduction sounds like woo to me. Not a bit more convincing than quantum woo (Deepak Chopra et al.). Whatever you're doing, it's surely not like doing SVD on a data matrix or eigen-decomposition on the covariance matrix of your observations.
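
For readers who have not seen the quantitative version, the two routes mentioned above (SVD on the data matrix, eigendecomposition of the covariance matrix) can be sketched as follows in Python with NumPy. The data here is arbitrary random data generated only for illustration, and the variable names are my own; the point is just that both routes give the same component variances and the same directions up to sign.

```python
import numpy as np

# Arbitrary correlated data (assumption): 500 observations of 4 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))
Xc = X - X.mean(axis=0)          # PCA works on mean-centered data
n = Xc.shape[0]

# Route 1: eigendecomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)      # ascending order
order = np.argsort(eigvals)[::-1]           # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vals = s**2 / (n - 1)                   # singular values -> variances

# Component variances agree.
same_variances = np.allclose(eigvals, svd_vals)
# Directions agree up to a sign flip per component.
same_directions = np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(4), atol=1e-6)

print(same_variances, same_directions)
```

The rows of `Vt` are the principal directions, and `s**2 / (n - 1)` are the variances along them, which is why the two computations are interchangeable in practice.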

Of course, you can often identify motivations behind people's actions. A lot of psychology is basically trying to uncover these motivations. Basically an intentional interpretation and a theory of mind are examples of dimensionality reduction in some sense. Instead of explaining behavior by reasoning about receptors and neurons, you imagine a conscious agent with beliefs, desires and intentions. You could also link it to data compression (dimensionality reduction is a sort of lossy data compression). But I wouldn't say I'm using advanced data compression algorithms when playing with my dog. It just sounds pretentious and shows a desperate need to signal smartness.

So, what is the evidence that you are consciously doing something similar to PCA in social life? Do you write down variables and numbers, or how can I imagine qualitative dimensionality reduction? How is it different from somebody just getting an opinion intuitively and then justifying it afterwards?

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2015-06-26T18:26:56.220Z · LW(p) · GW(p)

See Rationality is about pattern recognition, not reasoning.

Your tone is condescending, far outside of politeness norms. In the past I would have uncharitably written this off to you being depraved, but I've realized that I should be making a stronger effort to understand other people's perspectives. So can you help me understand where you're coming from on an emotional level?

Replies from: minusdash, 27chaos, JonahSinick
comment by minusdash · 2015-06-26T19:14:26.644Z · LW(p) · GW(p)

You asked about emotional stuff so here is my perspective. I have extremely weird feelings about this whole forum that may affect my writing style. My view is constantly popping back and forth between different views, like in the rabbit-duck gestalt image. On one hand I often see interesting and very good arguments, but on the other hand I see tons of red flags popping up. I feel that I need to maintain extreme mental efforts to stay "sane" here. Maybe I should refrain from commenting. It's a pity because I'm generally very interested in the topics discussed here, but the tone and the underlying ideology are pushing me away. On the other hand I feel an urge to check out the posts despite this effect. I'm not sure what aspect of certain forums has this psychological effect on my thinking, but I've felt it on various reddit communities as well.

Replies from: None, JonahSinick, ChristianKl, Vaniver
comment by [deleted] · 2015-06-26T23:19:26.653Z · LW(p) · GW(p)

On one hand I often see interesting and very good arguments, but on the other hand I see tons of red flags popping up. I feel that I need to maintain extreme mental efforts to stay "sane" here.

Seconded, actually, and it's particular to LessWrong. I know I often joke that posting here gets treated as submitting academic material and skewered accordingly, but that is very much what it feels like from the inside. It feels like confronting a hostile crowd of, as Jonah put it, radical agnostics, every single time one posts, and they're waiting for you to say something so they can jump down your throat about it.

Oh, and then you run into the issue of having radically different priors and beliefs, so that you find yourself on a "rationality" site where someone is suddenly using the term "global warming believer" as though the IPCC never issued multiple reports full of statistical evidence. I mean, sure, I can put some probability on, "It's all a conspiracy and the official scientists are lying", but for me that's in the "nonsense zone" -- I actually take offense to being asked to justify my belief in mainstream science.

As much as "good Bayesians" are never supposed to agree to disagree, I would very much like if people would be up-front about their priors and beliefs, so that we can both decide whether it's worth the energy spent on long threads of trying to convince people of things.

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-06-27T01:06:32.294Z · LW(p) · GW(p)

Oh, and then you run into the issue of having radically different priors and beliefs, so that you find yourself on a "rationality" site where someone is suddenly using the term "global warming believer" as though the IPCC never issued multiple reports full of statistical evidence.

Rather bad statistical evidence I might add. Seriously, your argument amounts to an appeal to authority. Whatever happened to nullius in verba?

I mean, sure, I can put some probability on, "It's all a conspiracy and the official scientists are lying",

Some of them are, a lot of them were even caught when the climategate emails went public. Most of them, however, are some combination of ideologues and people who couldn't handle the harder sciences and are now memorizing the teacher's password, in other words a prospiracy. Add in what happens to climate journals that dare publish anything insufficiently alarmist and one gets the idea about the current state of climate science.

Replies from: None
comment by [deleted] · 2015-06-28T19:04:20.623Z · LW(p) · GW(p)

Appeal to Authority? Not in the normal sense that the IPCC exercises violent force, and I therefore designate them factually correct. No, it's an Appeal to Expertise Outside My Own Domain. It's me expecting that the same academic and scientific processes and methods that produced my expertise in my fields produced domain-experts in other fields with their own expertise, and that I can therefore trust in their findings about as thoroughly as I trust in my own.

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-06-28T20:02:59.642Z · LW(p) · GW(p)

Appeal to Authority? Not in the normal sense that the IPCC exercises violent force, and I therefore designate them factually correct.

That's not the normal sense of appeal to authority, that would be appeal to force.

No, it's an Appeal to Expertise Outside My Own Domain.

And how do you know that they're actual experts? Because they (metaphorically) wear lab coats? That's what appeal to authority is. While it's not necessarily a fallacy, it's notable that science started making progress as soon as people disavowed using it.

Replies from: Good_Burning_Plastic
comment by Good_Burning_Plastic · 2015-06-28T22:59:05.670Z · LW(p) · GW(p)

Do you believe that the mass of the muon as listed by the Particle Data Group is at least approximately correct? If so, why?

Replies from: soreff, VoiceOfRa
comment by soreff · 2015-06-29T03:32:05.887Z · LW(p) · GW(p)

I haven't tracked down the specific evidence - but muons are comparatively easy: They live long enough to leave tracks in particle detectors with known magnetic fields. That gives you the charge-to-mass ratio. Given that charge looks quantized (Millikan oil drop experiment and umpteen repetitions), and there are other pieces of evidence from the particle tracks of muon decay (and the electrons from that decay again leave tracks, and the angles are visible even if the neutrinos aren't) - I'd be surprised if the muon mass wasn't pretty solid.

Replies from: Good_Burning_Plastic
comment by Good_Burning_Plastic · 2015-06-29T10:09:57.452Z · LW(p) · GW(p)

Assuming that both particle physicists and climatologists are doing things properly, that would only mean that the muon mass has much smaller error bars than the global warming (which it does), not that the former is more likely to be correct within its error bars.

Then again, it's possible that climatologists are less likely to be doing things properly.

comment by VoiceOfRa · 2015-06-30T00:27:16.623Z · LW(p) · GW(p)

If you ask a physicist or an evolutionist why their beliefs are correct they will generally give you an answer (or at least start talking about the general principle). If you ask that question about climate science you'll generally get either a direct appeal to authority or an indirect one: it's all in this official report which I haven't read but it's official so it must be correct.

Heck climate scientists aren't even that sparing about basic facts. They'll mention that CO2 is a greenhouse gas, but avoid any more technical questions. For example, I only recently found out that (in the absence of other factors or any feedback) temperature is a logarithmic function of CO2 concentration.
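
As a rough illustration of what "logarithmic" means here, the following is a minimal Python sketch. The function name and the `per_doubling` parameter are hypothetical, and its default of 1.0 is a placeholder unit rather than an estimate of anything; the 280 ppm baseline is the commonly quoted pre-industrial concentration.

```python
import math

def relative_warming(c_ppm, c0_ppm=280.0, per_doubling=1.0):
    """Warming relative to the baseline concentration c0_ppm, assuming a
    purely logarithmic dependence and no other factors or feedbacks.
    per_doubling is a placeholder scale, not a measured sensitivity."""
    return per_doubling * math.log2(c_ppm / c0_ppm)

# Each doubling of concentration adds the same increment.
first_doubling = relative_warming(560.0) - relative_warming(280.0)
second_doubling = relative_warming(1120.0) - relative_warming(560.0)
print(first_doubling, second_doubling)
```

The only point of the sketch is the shape of the curve: going from 280 to 560 ppm has the same effect as going from 560 to 1120 ppm, so equal absolute increases in concentration matter less and less.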

Replies from: EHeller
comment by EHeller · 2015-06-30T02:34:45.109Z · LW(p) · GW(p)

Heck climate scientists aren't even that sparing about basic facts. They'll mention that CO2 is a greenhouse gas, but avoid any more technical questions. For example, I only recently found out that (in the absence of other factors or any feedback) temperature is a logarithmic function of CO2 concentration.

So this seems like you've never cracked open any climate/atmospheric science textbook? Because that is pretty basic info. It seems like you're determined to be skeptical despite not really spending much time learning about the state of the science. Also it sounds like you are equivocating between "climate scientist" and "person on the internet who believes in global warming."

My background is particle physics, if someone asked me about the mass of a muon, I'd have to make about a hundred appeals to authority to give them any relevant information, and I suspect climate scientists are in the same boat when talking to people who don't understand some of the basics. I've personally engaged with special relativity crackpots who ask you to justify everything, and keep saying this or that basic fact from the field is an appeal to authority. There is no convincing a determined skeptic, so it's best not to engage.

If you are near a university campus, wait until there is a technical talk on climate modelling and go sit and listen (don't ask questions, just listen). You'll probably be surprised at how vociferous the debate is- climate modelers are serious scientists working hard on perfecting their models.

comment by JonahS (JonahSinick) · 2015-06-26T22:04:21.156Z · LW(p) · GW(p)

Thanks so much for sharing. I'm astonished by how much more fruitful my relationships have become since I've started asking.

I think that a lot of what you're seeing is a cultural clash: different communities have different blindspots and norms for communication, and a lot of times the combination of (i) blindspots of the communities that one is familiar with and (ii) respects in which a new community actually is unsound can give one the impression "these people are beyond the pale!" when the actual situation is that they're no less rational than members of one's own communities.

I had a very similar experience to your own coming from academia, and wrote a post titled The Importance of Self-Doubt in which I raised the concern that Less Wrong was functioning as a cult. But since then I've realized that a lot of the apparently weird beliefs of LWers are in fact also believed by very credible people: for example, Bill Gates recently expressed serious concern about AI risk.

If you're new to the community, you're probably unfamiliar with my own credentials which should reassure you somewhat:

  • I did a PhD in pure math under the direction of Nathan Dunfield, who coauthored papers with Bill Thurston, who formulated the geometrization conjecture which Perelman proved, in doing so solving one of the Clay Millennium Problems.

  • I've been deeply involved with math education for highly gifted children for many years. I worked with the person who won the American Math Society prize for best undergraduate research when he was 12.

  • I worked at GiveWell, which partners with Good Ventures, Dustin Moskovitz's foundation.

  • I've done fullstack web development, making an asynchronous clone of StackOverflow (link).

  • I've done machine learning, rediscovering logistic regression, collaborative filtering, hierarchical modeling, the use of principal component analysis to deal with multicollinearity, and cross validation. (I found the expositions so poor that it was faster for me to work things out on my own than to learn from them, though I eventually learned the official versions). You can read some details of things that I found here. I did a project implementing Bayesian adjustment of Yelp restaurant star ratings using their public dataset here.

So I imagine that I'm credible by your standards. There are other people involved in the community who you might find even more credible. For example: (a) Paul Christiano who was an international math olympiad medalist, wrote a 50 page paper on quantum computational complexity with Scott Aaronson as an undergraduate at MIT, and is a theoretical CS grad student at Berkeley. (b) Jacob Steinhardt, a Hertz graduate fellow who does machine learning research under Percy Liang at Stanford.

So you're not actually in some sort of twilight zone. I share some of your concerns with the community, but the groupthink here is no stronger than the groupthink present in academia. I'd be happy to share my impressions of the relative soundness of the various LW community practices and beliefs.

Replies from: None, minusdash
comment by [deleted] · 2015-06-26T23:24:53.876Z · LW(p) · GW(p)

There are other people involved in the community who you might find even more credible. For example: (a) Paul Christiano who was an international math olympiad medalist, wrote a 50 page paper on quantum computational complexity with Scott Aaronson as an undergraduate at MIT, and is a theoretical CS grad student at Berkeley. (b) Jacob Steinhardt, a Hertz graduate fellow who does machine learning research under Percy Liang at Stanford.

Of course, Christiano tends to issue disclaimers with his MIRI-branded AGI safety work, explicitly stating that he does not believe in alarmist UFAI scenarios. Which is fine, in itself, but it does show how people expect someone associated with these communities to sound.

And Jacob Steinhardt hasn't exactly endorsed any "Twilight Zone" community norms or propaganda views. Errr, is there a term for "things everyone in a group thinks everyone else believes, whether or not they actually do"?

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2015-06-27T01:50:52.837Z · LW(p) · GW(p)

I'm not claiming otherwise: I'm merely saying that Paul and Jacob don't dismiss LWers out of hand as obviously crazy, and have in fact found the community to be worthwhile enough to have participated substantially.

Replies from: None
comment by [deleted] · 2015-06-28T19:10:24.303Z · LW(p) · GW(p)

I think in this case we have to taboo the term "LWers" ;-). This community has many pieces in it, and two large parts of the original core are "techno-libertarian Overcoming Bias readers with many very non-mainstream beliefs that they claim are much more rational than anyone else's beliefs" and "the SL4 mailing list wearing suits and trying to act professional enough that they might actually accomplish their Shock Level Four dreams."

On the other hand, in the process of the site's growth, it has eventually come to encompass those two demographics plus, to some limited extent, almost everyone who's willing to assent that science, statistical reasoning, and the neuro/cognitive sciences actually really work and should be taken seriously. With special emphasis on statistical reasoning and cognitive sciences.

So the core demographic consists of Very Unusual People, but the periphery demographics, who now make up most of the community, consist of only Mildly Unusual People.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2015-06-29T07:18:41.346Z · LW(p) · GW(p)

Yes, this seems like a fair assessment of the situation. Thanks for disentangling the issues. I'll be more precise in the future.

comment by minusdash · 2015-06-26T22:20:44.586Z · LW(p) · GW(p)

Those are indeed impressive things you did. I agree very much with your post from 2010. But the fact that many people have this initial impression shows that something is wrong. What makes it look like a "twilight zone"? Why don't I feel the same symptoms for example on Scott Alexander's Slate Star Codex blog?

Another thing I could pinpoint is that I don't want to identify as a "rationalist", I don't want to be any -ist. It seems like a tactic to make people identify with a group and swallow "the whole package". (I also don't think people should identify as atheist either.)

Replies from: JonahSinick, ChristianKl, None
comment by JonahS (JonahSinick) · 2015-06-27T01:33:50.601Z · LW(p) · GW(p)

I'm sympathetic to everything you say.

In my experience there's an issue of Less Wrongers being unusually emotionally damaged (e.g. relative to academics) and this gives rise to a lot of problems in the community. But I don't think that the emotional damage primarily comes from the weird stuff that you see on Less Wrong. What one sees is them having borne the brunt of the phenomenon that I described here disproportionately relative to other smart people, often because they're unusually creative and have been marginalized by conformist norms.

Quite frankly, I find the norms in academia very creepy: I've seen a lot of people develop serious mental health problems in connection with their experiences in academia. It's hard to see it from the inside: I was disturbed by what I saw, but I didn't realize that math academia is actually functioning as a cult, based on retrospective impressions, and in fact by implicit consensus of the best mathematicians of the world (I can give references if you'd like).

Replies from: Pentashagon, eternal_neophyte, None, minusdash
comment by Pentashagon · 2015-06-30T04:03:17.380Z · LW(p) · GW(p)

I was disturbed by what I saw, but I didn't realize that math academia is actually functioning as a cult

I'm sure you're aware that the word "cult" is a strong claim that requires a lot of evidence, but I'd also issue a friendly warning that to me at least it immediately set off my "crank" alarm bells. I've seen too many Usenet posters who are sure they have a P=/!=NP proof, or a proof that set theory is false, or etc. who ultimately claim that because "the mathematical elite" are a cult that no one will listen to them. A cult generally engages in active suppression, often defamation, and not simply exclusion. Do you have evidence of legitimate mathematical results or research being hidden/withdrawn from journals or publicly derided, or is it more of an old boy's club that's hard for outsiders to participate in and that plays petty politics to the damage of the science?

Grothendieck's problems look to be political and interpersonal. Perelman's also. I think it's one thing to claim that mathematical institutions are no more rational than any other politicized body, and quite another to claim that it's a cult. Or maybe most social behavior is too cult-like. If so; perhaps don't single out mathematics.

I've seen a lot of people develop serious mental health problems in connection with their experiences in academia.

I question the direction of causation. Historically many great mathematicians have been mentally and socially atypical and ended up not making much sense with their later writings. Either mathematics has always had an institutional problem or mathematicians have always had an incidence of mental difficulties (or a combination of both; but I would expect one to dominate).

Especially in Thurston's On Proof and Progress in Mathematics I can appreciate the problem of trying to grok specialized areas of mathematics. The terminology and symbology is opaque to the uninitiated. It reminds me of section 1 of the Metamath Book which expresses similar unhappiness with the state of knowledge between specialist fields of mathematics and the general difficulty of learning mathematics. I had hoped that Metamath would become more popular and tie various subfields together through unifying theories and definitions, but as far as I can tell it languishes as a hobbyist project for a few dedicated mathematicians.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2015-06-30T05:22:04.815Z · LW(p) · GW(p)

I'm sure you're aware that the word "cult" is a strong claim that requires a lot of evidence, but I'd also issue a friendly warning that to me at least it immediately set off my "crank" alarm bells.

Thanks, yeah, people have been telling me that I need to be more careful in how I frame things. :-)

Do you have evidence of legitimate mathematical results or research being hidden/withdrawn from journals or publicly derided, or is it more of an old boy's club that's hard for outsiders to participate in and that plays petty politics to the damage of the science?

The latter, but note that that's not necessarily less damaging than active suppression would be.

Or maybe most social behavior is too cult-like. If so; perhaps don't single out mathematics.

Yes, this is what I believe. The math community is just unusually salient to me, but I should phrase things more carefully.

I question the direction of causation. Historically many great mathematicians have been mentally and socially atypical and ended up not making much sense with their later writings. Either mathematics has always had an institutional problem or mathematicians have always had an incidence of mental difficulties (or a combination of both; but I would expect one to dominate).

Most of the people who I have in mind did have preexisting difficulties. I meant something like "relative to a counterfactual where academia was serving its intended function." People of very high intellectual curiosity sometimes approach academia believing that it will be an oasis and find this not to be at all the case, and that the structures in place are in fact hostile to them.

This is not what the government should be supporting with taxpayer dollars.

Especially in Thurston's On Proof and Progress in Mathematics I can appreciate the problem of trying to grok specialized areas of mathematics.

What are your own interests?

Replies from: Pentashagon
comment by Pentashagon · 2015-07-01T06:08:37.263Z · LW(p) · GW(p)

The latter, but note that that's not necessarily less damaging than active suppression would be.

I suppose there's one scant anecdote for estimating this; cryptography research seemed to lag a decade or two behind actively suppressed/hidden government research. Granted, there was also less public interest in cryptography until the 80s or 90s, but it seems that suppression can only delay publication, not prevent it.

The real risk of suppression and exclusion both seem to be in permanently discouraging mathematicians who would otherwise make great breakthroughs, since affecting the timing of publication/discovery doesn't seem as damaging.

This is not what the government should be supporting with taxpayer dollars.

I think I would be surprised if Basic Income was a less effective strategy than targeted government research funding.

What are your own interests?

Everything from logic and axiomatic foundations of mathematics to practical use of advanced theorems for computer science. What attracted me to Metamath was the idea that if I encountered a paper that was totally unintelligible to me (say Perelman's proof of the Poincaré conjecture or Wiles' proof of Fermat's Last Theorem) I could backtrack through sound definitions to concepts I already knew, and then build my understanding up from those definitions. Alas, just having a cross-reference of related definitions between various fields would be helpful. I take it that model theory is the place to look for such a cross-reference, and so that is probably the next thing I plan to study.

Practically, I realize that I don't have enough time or patience or mental ability to slog through formal definitions all day, and so it would be nice to have something even better. A universal mathematical educator, so to speak. Although I worry that without a strong formal understanding I will miss important results/insights. So my other interest is building the kind of agent that can identify which formal insights are useful or important, which sort of naturally leads to an interest in AI and decision theory.

comment by eternal_neophyte · 2015-06-28T19:58:31.992Z · LW(p) · GW(p)

I would like to see some of those references (simply because I have no relation to Academia, and don't like things I read somewhere to gestate into unfounded intuitions about a subject).

comment by [deleted] · 2015-06-28T19:32:50.775Z · LW(p) · GW(p)

Quite frankly, I find the norms in academia very creepy: I've seen a lot of people develop serious mental health problems in connection with their experiences in academia. It's hard to see it from the inside: I was disturbed by what I saw, but I didn't realize that math academia is actually functioning as a cult, based on retrospective impressions, and in fact by implicit consensus of the best mathematicians of the world (I can give references if you'd like).

I've only been in CS academia, and wouldn't call that a cult. I would call it, like most of the rest of academia, a deeply dysfunctional industry in which to work, but that's the fault of the academic career and funding structure. CS is even relatively healthy by comparison to much of the rest.

How much of our impression of mathematics as a creepy, mental-health-harming cult comes from pure stereotyping?

Replies from: ChristianKl, ZoltanBerrigomo, Richard_Kennaway, JonahSinick
comment by ChristianKl · 2015-06-28T19:40:31.124Z · LW(p) · GW(p)

How much of our impression of mathematics as a creepy, mental-health-harming cult comes from pure stereotyping?

Jonah happens to be a math PhD. How can you engage in pure stereotyping of mathematicians while you get your PhD?

Replies from: None
comment by [deleted] · 2015-06-28T23:26:35.054Z · LW(p) · GW(p)

I was more positing that it's a self-reinforcing, self-creating effect: people treat Mathematics in a cultish way because they think they're supposed to.

Replies from: Richard_Kennaway, ChristianKl
comment by Richard_Kennaway · 2015-06-29T12:26:52.041Z · LW(p) · GW(p)

I was more positing that it's a self-reinforcing, self-creating effect

I don't believe there's any such thing, on the general grounds of "no fake without a reality to be a fake of."

comment by ChristianKl · 2015-06-28T23:40:32.943Z · LW(p) · GW(p)

Who do you mean when you say "people"?

comment by ZoltanBerrigomo · 2015-06-30T00:11:28.535Z · LW(p) · GW(p)

For what it's worth, I have observed a certain reverence in the way great mathematicians are treated by their lesser-accomplished colleagues that can often border on the creepy. This is something specific to math, in that it seems to exist in other disciplines with lesser intensity.

But I agree, "dysfunctional" seems to be a more apt label than "cult." May I also add "fashion-prone?"

comment by Richard_Kennaway · 2015-06-29T12:25:43.079Z · LW(p) · GW(p)

How much of our impression of mathematics as a creepy, mental-health-harming cult

Er, what? Who do you mean by "we"?

comes from pure stereotyping?

The link says of Turing:

Finally, Alan Turing, the great Bletchley Park code breaker, father of computer science and homosexual, died trying to prove that some things are fundamentally unprovable.

This is a staggeringly wrong account of how he died.

Replies from: None
comment by [deleted] · 2015-06-29T23:07:17.303Z · LW(p) · GW(p)

This is a staggeringly wrong account of how he died.

Hence my calling it "pure stereotyping"!

comment by JonahS (JonahSinick) · 2015-06-29T07:35:01.260Z · LW(p) · GW(p)

I don't have direct exposure to CS academia, which, as you comment, is known to be healthier :-). I was speaking in broad brushstrokes; I'll qualify my claims and impressions more carefully later.

comment by minusdash · 2015-06-27T01:46:24.724Z · LW(p) · GW(p)

I don't really understand what you mean about math academia. Those references would be appreciated.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2015-06-27T02:02:48.644Z · LW(p) · GW(p)

The top 3 answers to the MathOverflow question "Which mathematicians have influenced you the most?" are Alexander Grothendieck, Mikhail Gromov, and Bill Thurston. Each of them has expressed serious concerns about the community.

  • Grothendieck was actually effectively excommunicated by the mathematical community and then was pathologized as having gone crazy. See pages 37-40 of David Ruelle's book A Mathematician's Brain.

  • Gromov expresses strong sympathy for Grigory Perelman having left the mathematical community starting on page 110 of Perfect Rigor. (You can search for "Gromov" in the pdf to see all of his remarks on the subject.)

  • Thurston made very apt criticisms of the mathematical community in his essay On Proof and Progress In Mathematics. See especially the beginning of Section 3: "How is mathematical understanding communicated?" Terry Tao endorses Thurston's essay in his obituary of Thurston. But the community has essentially ignored Thurston's remarks: one almost never hears people talk about the points that Thurston raises.

Replies from: itaibn0, ZoltanBerrigomo, Will_Sawin
comment by itaibn0 · 2015-06-28T23:38:58.060Z · LW(p) · GW(p)

I don't know about Grothendieck, but the two other sources appear to have softer criticism of the mathematical community than "actually functioning as a cult".

comment by ZoltanBerrigomo · 2015-06-29T06:15:17.935Z · LW(p) · GW(p)

The links you give are extremely interesting, but, unless I am missing something, it seems that they fall short of justifying your earlier statement that math academia functions as a cult. I wonder if you would be willing to elaborate further on that?

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2015-06-29T07:30:37.632Z · LW(p) · GW(p)

I'll be writing more about this later.

The scariest thing to me is that the most mathematically talented students are often turned off by what they see in math classes, even at the undergraduate and graduate levels. Math serves as a backbone for the sciences, so this may be badly undercutting scientific innovation at a societal level.

I honestly think that it would be an improvement on the status quo to stop teaching math classes entirely. Thurston characterized his early math education as follows:

I hated much of what was taught as mathematics in my early schooling, and I often received poor grades. I now view many of these early lessons as anti-math: they actively tried to discourage independent thought. One was supposed to follow an established pattern with mechanical precision, put answers inside boxes, and "show your work," that is, reject mental insights and alternative approaches.

I think that this characterizes math classes even at the graduate level, only at a higher level of abstraction. The classes essentially never offer students exposure to free-form mathematical exploration, which is what it takes to make major scientific discoveries with significant quantitative components.

Replies from: Pentashagon
comment by Pentashagon · 2015-06-30T03:24:42.197Z · LW(p) · GW(p)

I distinctly remember having points taken off of a physics midterm because I didn't show my work. I think I dropped the exam in the waste basket on the way out of the auditorium.

I've always assumed that the problem is three-fold: generating a formal proof is NP-hard, getting the right answer via shortcuts can include cheating, and the faculty's time is limited. Professors/graders do not have the capacity to rigorously demonstrate to themselves that the steps a student has written down actually pinpoint the unique answer. Without access to the student's mind graders are unable to determine if students cheat or not; being able to memorize and/or reproduce the exact steps of a calculation significantly decreases the likelihood of cheating. Even if graders could do one or both of the previous for a single student, they are not 30x or 100x as smart as their students, making it impractical to repeat the process for every student.

That said, I had some very good mathematics teachers in higher level courses who could force students to think, and one in particular who could encourage/demand novelty from students simply by asking them to solve problems that they hadn't yet learned to solve. I didn't realize the power of the latter approach until later (and at the time everyone complained about exams with a median score well under 50%), but his classes were always my favorite.

comment by Will_Sawin · 2015-06-27T23:29:34.207Z · LW(p) · GW(p)

Thank you for all these interesting references. I enjoyed reading all of them, and rereading in Thurston's case.

Do people pathologize Grothendieck as having gone crazy? I mostly think people think of him as being a little bit strange. The story I heard was that because of philosophical disagreements with military funding and personal conflicts with other mathematicians he left the community and was more or less refusing to speak to anyone about mathematics, and people were sad about this and wished he would come back.

Replies from: JonahSinick, VoiceOfRa
comment by JonahS (JonahSinick) · 2015-06-28T00:47:25.440Z · LW(p) · GW(p)

Do people pathologize Grothendieck as having gone crazy?

His contribution to math is too great for people to have explicitly adopted a stance that was too unfavorable to him, and many mathematicians did in fact miss him a lot. But as Perelman said:

"Of course, there are many mathematicians who are more or less honest. But almost all of them are conformists. They are more or less honest, but they tolerate those who are not honest." He has also said that "It is not people who break ethical standards who are regarded as aliens. It is people like me who are isolated."

If pressed, many mathematicians downplay the role of those who behaved unethically toward him and the failure of the community to give him a job in favor of a narrative "poor guy, it's so sad that he developed mental health problems."

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-06-28T02:40:47.238Z · LW(p) · GW(p)

If pressed, many mathematicians downplay the role of those who behaved unethically toward him and the failure of the community to give him a job

What failure? He stepped down from the Steklov Institute and has refused every job offer and prize given to him.

comment by VoiceOfRa · 2015-06-28T02:47:03.103Z · LW(p) · GW(p)

Do people pathologize Grothendieck as having gone crazy?

From the details I'm aware of "gone crazy" is not a bad description of what happened.

comment by ChristianKl · 2015-06-27T12:29:20.383Z · LW(p) · GW(p)

Another thing I could pinpoint is that I don't want to identify as a "rationalist", I don't want to be any -ist.

Nobody forces you to do so. Plenty of people in this community don't self identify that way.

comment by [deleted] · 2015-06-26T23:24:18.434Z · LW(p) · GW(p)

Another thing I could pinpoint is that I don't want to identify as a "rationalist", I don't want to be any -ist.

I've always thought that calling yourself a "rationalist" or "aspiring rationalist" is rather useless. You're either winning or not winning. Calling yourself by some funny term can give you the nice feeling of belonging to a community, but it doesn't actually make you win more, in itself.

comment by ChristianKl · 2015-06-27T12:09:35.934Z · LW(p) · GW(p)

My view is constantly popping back and forth between different views

That sounds like you engage in binary thinking and don't value shades of grey of uncertainty enough. You feel the need to judge arguments for whether they are true or aren't and don't have mental categories for "might be true, or might not be true".

Jonah makes strong claims for which he doesn't provide evidence. He's clear about the fact that he hasn't provided the necessary evidence.

Given that, you pattern match to "crackpot" instead of putting Jonah in the mental category where you don't know whether what Jonah says is right or wrong. If you start to put a lot of claims into the "I don't know"-pile you don't constantly pop between belief and non-belief. Popping back and forth means that the size of your updates when presented with new evidence is too large.

Being able to say "I don't know" is part of genuine skepticism.

Replies from: minusdash
comment by minusdash · 2015-06-27T12:32:25.712Z · LW(p) · GW(p)

I'm not talking about back and forth between true and false, but between two explanations. You can have a multimodal probability distribution and two distant modes are about equally probable, and when you update, sometimes one is larger and sometimes the other. Of course one doesn't need to choose a point estimate (maximum a posteriori), the distribution itself should ideally be believed in its entirety. But just as you can't see the rabbit-duck as simultaneously 50% rabbit and 50% duck, one sometimes switches between different explanations, similarly to an MCMC sampling procedure.
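The mode-switching picture in the paragraph above can be made concrete with a small simulation. This is only a minimal sketch under stated assumptions, not anything taken from the thread: it assumes a hypothetical one-dimensional "belief" that is an equal mixture of two Gaussians centred at -3 and +3, and runs a random-walk Metropolis sampler on it. The names `log_density` and `metropolis` and every numeric setting are illustrative choices.

```python
import math
import random


def log_density(x, modes=(-3.0, 3.0), sigma=1.0):
    """Log of an unnormalised, equally weighted two-Gaussian mixture (assumed target)."""
    terms = [-0.5 * ((x - m) / sigma) ** 2 for m in modes]
    top = max(terms)  # log-sum-exp trick to avoid underflow far from both modes
    return top + math.log(sum(math.exp(t - top) for t in terms))


def metropolis(n_steps, step_size=1.5, start=-3.0, seed=0):
    """Random-walk Metropolis sampler; returns the list of visited states."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # Accept with probability min(1, p(proposal) / p(x)).
        log_ratio = log_density(proposal) - log_density(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)
    return samples


samples = metropolis(20_000)
# Fraction of time spent near the right-hand mode, and how often the chain
# crosses from one side of zero to the other.
frac_right = sum(s > 0 for s in samples) / len(samples)
switches = sum((a > 0) != (b > 0) for a, b in zip(samples, samples[1:]))
print(f"time near right-hand mode: {frac_right:.2f}, switches: {switches}")
```

At any single step the chain sits in one mode, much like seeing either the rabbit or the duck, while the long-run fraction of time in each mode approximates the full distribution.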

I don't want to argue this too much because it's largely a preference of style and culture. I think the discussions are very repetitive and it's an illusion that there is much to be learned by spending so much time thinking meta.

Anyway, I evaporate from the site for now.

comment by Vaniver · 2015-06-26T20:10:19.934Z · LW(p) · GW(p)

I feel that I need to maintain extreme mental efforts to stay "sane" here. Maybe I should refrain from commenting. It's a pity because I'm generally very interested in the topics discussed here, but the tone and the underlying ideology are pushing me away.

I would be very interested in hearing elaboration on this topic, either publicly or privately.

Replies from: minusdash
comment by minusdash · 2015-06-26T21:45:15.817Z · LW(p) · GW(p)

I prefer public discussions. First, I'm a computer science student who took courses in machine learning, AI, wrote theses in these areas (nothing exceptional), I enjoy books like Thinking Fast and Slow, Black Swan, Pinker, Dawkins, Dennett, Ramachandran etc. So the topics discussed here are also interesting to me. But the atmosphere seems quite closed and turning inwards.

I feel similarities to reddit's Red Pill community. Previously "ignorant" people feel the community has opened a new world to them, they lived in darkness before, but now they found the "Way" ("Bayescraft") and all this stuff is becoming an identity for them.

Sorry if it's offensive, but I feel as if many people had no success in the "real world" matters and invented a fiction where they are the heroes by having joined some great organization much higher above the general public, who are just irrational automata still living in the dark.

I dislike the heavy use of insider terminology that makes communication with "outsiders" about these ideas quite hard because you get used to referring to these things by the in-group terms, so you get kind of isolated from your real-life friends as you feel "they won't understand, they'd have to read so much". When actually many of the concepts are not all that new and could be phrased in a way that the "uninitiated" can also get it.

There are too many cross references in posts and it keeps you busy with the site longer than necessary. It seems that people try to prove they know some concept by using the jargon and including links to them. Instead, I'd prefer authors who actively try to minimize the need for links and jargon.

I also find the posts quite redundant. They seem to be reiterations of the same patterns in very long prose with people's stories intertwined with the ideas, instead of striving for clarity and conciseness. Much of it feels a lot like self-help for people with derailed lives who try to engineer their life (back) to success. I may be wrong but I get a depressed vibe from reading the site too long. It may also be because there is no lighthearted humor or in-jokes or "fun" or self-irony at all. Maybe because the members are just like that in general (perhaps due to mental differences, like being on the autism spectrum, I'm not a psychiatrist).

I can see that people here are really smart and the comments are often very reasonable. And it makes me wonder why they'd regard a single person such as Yudkowsky in such high esteem as compared to established book authors or academics or industry people in these areas. I know there has been much discussion about cultishness, and I think it goes a lot deeper than surface issues. LessWrong seems to be quite isolated and distrusting towards the mainstream. Many people seem to have read stuff first from Yudkowsky, who often does not reference earlier works that basically state the same stuff, so people get the impression that all or most of the ideas in "The Sequences" come from him. I was quite disappointed several times when I found the same ideas in mainstream books. The Sequences often depict the whole outside world as dumber than it is (straw man tactics, etc).

Another thing is that discussion is often too meta (or meta-meta). There is discussion on Bayes' theorem and math principles but no actual detailed, worked out stuff. Very little actual programming for example. I'd expect people to create GitHub projects, IPython notebooks to show some examples of what they are talking about. Much of the meta-meta-discussion is very opinion-based because there is no immediate feedback about whether someone is wrong or right. It's hard to test such hypotheses. For example, in this post I would have expected an example dataset and showing how PCA can uncover something surprising. Otherwise it's just floating out there although it matches nicely with the pattern that "some math concept gave me insight that refined my rationality". I'm not sure, maybe these "rationality improvements" are sometimes illusions.
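For a sense of what such a worked example might look like, here is a minimal sketch. It is not the post's data or method: the dataset is synthetic, built on the assumption that four hypothetical test scores share one hidden common factor plus independent noise, and PCA is done by hand with power iteration on the covariance matrix. All function names and parameters are illustrative.

```python
import random


def simulate_scores(n_people=500, n_tests=4, noise=0.6, seed=0):
    """Synthetic data (an assumption): each score = shared factor + independent noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_people):
        g = rng.gauss(0.0, 1.0)  # hidden common factor for this person
        rows.append([g + rng.gauss(0.0, noise) for _ in range(n_tests)])
    return rows


def covariance(rows):
    """Sample covariance matrix of the columns of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, n_iter=200):
    """Leading eigenvector and eigenvalue of `cov`, found by power iteration."""
    d = len(cov)
    v = [1.0 / d ** 0.5] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, eigenvalue


cov = covariance(simulate_scores())
loadings, top_variance = first_component(cov)
share = top_variance / sum(cov[i][i] for i in range(len(cov)))
print("loadings on first component:", [round(x, 2) for x in loadings])
print("share of total variance:", round(share, 2))
```

In this toy setup the first component loads positively on every test and carries most of the variance, which is the kind of structure PCA is meant to surface. Whether real data behaves this way is exactly what an actual dataset would have to show.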

I also don't get why the rationality stuff is intermixed with friendly AI and cryonics and transhumanism. I just don't see why these belong that much together. I find them too speculative and detached from the "real world" to be the central ideas. I realize they are important, but their prevalence could also be explained as "escapism" and it promotes the discussion of untestable meta things that I mentioned above, never having to face reality. There is much talk about what evidence is but not much talk that actually presents evidence.

I needed to develop a sort of immunity against topics like acausal trade that I can't fully specify how they are wrong, but they feel wrong and are hard to translate to practical testable statements, and it just messes with my head in the wrong way.

And of course there is also that secrecy around and hiding of "certain things".

That's it. This place may just not be for me, which is fine. People can have their communities in the way they want. You just asked for elaboration.

Replies from: Vaniver, Risto_Saarelma, None
comment by Vaniver · 2015-06-27T00:35:23.021Z · LW(p) · GW(p)

Thanks for the detailed response! I'll respond to a handful of points:

Previously "ignorant" people feel the community has opened a new world to them, they lived in darkness before, but now they found the "Way" ("Bayescraft") and all this stuff is becoming an identity for them.

I certainly agree that there are people here who match that description, but it's also worth pointing out that there are actual experts too.

the general public, who are just irrational automata still living in the dark.

One of the things I find most charming about LW, compared to places like RationalWiki, is how much emphasis there is on self-improvement and your mistakes, not mistakes made by other people because they're dumb.

It seems that people try to prove they know some concept by using the jargon and including links to them. Instead, I'd prefer authors who actively try to minimize the need for links and jargon.

I'm not sure this is avoidable, and in full irony I'll link to the wiki page that explains why.

In general, there are lots of concepts that seem useful, but the only way we have to refer to concepts is either to refer to a label or to explain the concept. A number of people read through the sequences and say "but the conclusions are just common sense!", to which the response is, "yes, but how easy is it to communicate common sense?" It's one thing to be able to recognize that there's some vague problem, and another thing to be able to say "the problem here is inferential distance; knowledge takes many steps to explain, and attempts to explain it in fewer steps simply won't work, and the justification for this potentially surprising claim is in Appendix A." It is one thing to be able to recognize a concept as worthwhile; it is another thing to be able to recreate that concept when a need arises.

Now, I agree with you that having different labels to refer to the same concept, or conceptual boundaries or definitions that are drawn slightly differently, is a giant pain. When possible, I try to bring the wider community's terminology to LW, but this requires being in both communities, which limits how much any individual person can do.

I also don't get why the rationality stuff is intermixed with friendly AI and cryonics and transhumanism.

Part of that is just seeding effects--if you start a rationality site with a bunch of people interested in transhumanism, the site will remain disproportionately linked to transhumanism because people who aren't transhumanists will be more likely to leave and people who are transhumanists will be more likely to find and join the site.

Part of it is that those are the cluster of ideas that seem weird but 'hold up' under investigation--most of the reasons to believe that the economy of fifty years from now will look like the economy of today are just confused, and if a community has good tools for dissolving confusions you should expect them to converge on the un-confused answer.

A final part seems to be availability; people who are convinced by the case for cryonics tend to be louder than the people who are unconvinced. The annual surveys show the perception of LW one gets from just reading posts (or posts and comments) is skewed from the perception of LW one gets from the survey results.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2015-06-27T01:46:52.026Z · LW(p) · GW(p)

One of the things I find most charming about LW, compared to places like RationalWiki, is how much emphasis there is on self-improvement and your mistakes, not mistakes made by other people because they're dumb.

I agree that LW is much better than RationalWiki, but I still think that the norms for discussion are much too far in the direction of focus on how other commenters are wrong as opposed to how one might oneself be wrong.

I know that there's a selection effect (with respect to the more frustrating interactions standing out). But people not infrequently mistakenly believe that I'm wrong about things that I know much more about than they do, with very high confidence, and in such instances I find the connotations that I'm unsound to be exasperating.

I don't think that this is just a problem for me rather than a problem for the community in general: I know a number of very high quality thinkers in real life who are uninterested in participating on LW explicitly because they don't want to engage with commenters who are highly confident that these thinkers' positions are incorrect. There's another selection effect here: such people aren't salient because they're invisible to the online community.

Replies from: Vaniver
comment by Vaniver · 2015-06-27T02:22:30.253Z · LW(p) · GW(p)

I know that there's a selection effect (with respect to the more frustrating interactions standing out).

I agree that those frustrating interactions both happen and are frustrating, and that it leads to a general acidification of the discussion as people who don't want to deal with it leave. Reversing that process in a sustainable way is probably the most valuable way to improve LW in the medium term.

comment by Risto_Saarelma · 2015-06-27T13:03:57.233Z · LW(p) · GW(p)

There's also the whole Lesswrong-is-dying thing that might be contributing to the vibe you're getting. I've been reading the forum for years and it hasn't felt very healthy for a while now. A lot of the impressive people from earlier have moved on, we don't seem to be getting that many new impressive people coming in, and hanging out a lot on the forum turns out not to make you that much more impressive. What's left is turning increasingly into a weird sort of cargo cult of a forum for impressive people.

Replies from: V_V
comment by V_V · 2015-06-27T14:13:56.004Z · LW(p) · GW(p)

Actually, I think that LessWrong used to be worse when the "impressive people" were posting about cryonics, FAI, many-world interpretation of quantum mechanics, and so on.

Replies from: Risto_Saarelma
comment by Risto_Saarelma · 2015-06-27T18:00:51.052Z · LW(p) · GW(p)

It has seemed to me that a lot of the commenters who come with their own solid competency are also less likely to get unquestioningly swept away following EY's particular hobbyhorses.

comment by [deleted] · 2015-06-26T23:37:12.184Z · LW(p) · GW(p)

I needed to develop a sort of immunity against topics like acausal trade that I can't fully specify how they are wrong, but they feel wrong and are hard to translate to practical testable statements, and it just messes with my head in the wrong way.

The applicable word is metaphysics. Acausal trade is dabbling in metaphysics to "solve" a question in decision theory, which is itself mere philosophizing, and thus one has to wonder: what does Nature care for philosophies?

By the way, for the rest of your post I was going, "OH MY GOD I KNOW YOUR FEELS, MAN!" So it's not as though nobody ever thinks these things. Those of us who do just tend to, in perfect evaporative cooling fashion, go get on with our lives outside this website, being relatively ordinary science nerds.

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-06-27T01:21:05.531Z · LW(p) · GW(p)

The applicable word is metaphysics.

Sorry, avoiding metaphysics doesn't work. You just end up either reinventing them (badly) or using a bad 5th hand version of some old philosopher's metaphysics. Incidentally, Eliezer also tried avoiding metaphysics and wound up doing the former.

Replies from: None, TheAncientGeek
comment by [deleted] · 2015-06-28T19:02:08.064Z · LW(p) · GW(p)

I don't like Eliezer's apparent mathematical/computational Platonism myself, but most working scientists manage to avoid metaphysical buggery by simply dealing with only those things with which they can actually causally interact. I recall an Eliezer post on "Explain/Worship/Ignore", and would add myself that while "Explain" eventually bottoms out in the limits of our current knowledge, the correct response is to hit "Ignore" at that stage, not to drop to one's knees in Worship of a Sacred Mystery that is in fact just a limit to current evidence.

EDIT: This is also one of the reasons I enjoy being in this community: even when I disagree with someone's view (eg: Eliezer's), people here (including him) are often more productive and fun to talk to than someone who hits the limits of their scientific knowledge and just throws their hands up to the tune of "METAPHYSICS, SON!", and then joins the bloody Catholic Church, as if that solved anything.

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-06-28T20:09:57.381Z · LW(p) · GW(p)

I don't like Eliezer's apparent mathematical/computational Platonism myself, but most working scientists manage to avoid metaphysical buggery by simply dealing with only those things with which they can actually causally interact.

That works up until the point where you actually have to think about what it means to "causally interact" with something. Also questions like "does something that falls into a black hole cease to exist since it's no longer possible to interact with it"?

Replies from: None
comment by [deleted] · 2015-06-29T10:38:36.627Z · LW(p) · GW(p)

Also questions like "does something that falls into a black hole cease to exist since it's no longer possible to interact with it"?

But there are trivially easy answers to questions like that. Basically you have to ask "Cease to exist for whom?" i.e. it obviously ceases to exist for you. You just have to taboo words like "really" here, as in "does it really cease to exist", because they are meaningless: they don't lead to predictions. What people often consider "really" reality is the perception of a perfect god-like omniscient observer, but there is no such thing.

Essentially there are just two extremes to avoid, the po-mo "nothing is real, everything is mere perception" and the traditional, classical "but how things really really REALLY are?" and the middle way here is "reality is the sum of what could be perceived in principle". A perception is right or wrong based on how much it meshes with all the other things that can in principle be perceived. Everything that cannot even be perceived in theory is not part of reality. There is no how things "really" are; the closest we have to that is the sum of all potential, possible perceivables about a thing.

I picked up this approach from Eric S. Raymond, I think he worked it out decades before Eliezer did, possibly both working from Peirce.

This is basically anti-metaphysics.

Replies from: CCC
comment by CCC · 2015-06-29T12:05:52.204Z · LW(p) · GW(p)

Everything that cannot even be perceived in theory is not part of reality.

Does this imply that only things that exist in my past light cone are real for me at any given moment?

Replies from: None
comment by [deleted] · 2015-06-29T13:36:02.838Z · LW(p) · GW(p)

I don't know what real-for-me means here. Everything that in principle, in theory, could be observed, is real. Most of those things you haven't observed. This does not make them any less real.

I meant the "for whom?" not in the sense of me, you, or the barkeeper down the street. I meant it in the sense of normal beings who know only things that are in principle knowable, vs. some godlike being who can know how things really "are" regardless of whether they are knowable or not.

Replies from: CCC, VoiceOfRa
comment by CCC · 2015-06-30T14:27:17.187Z · LW(p) · GW(p)

Everything that in principle, in theory, could be observed, is real.

Well, that's where it starts to break down; because what you can, in theory, observe is different from what I can, in theory, observe.

This is because, as far as anyone can tell, observations are limited by the speed of light. I cannot, even in principle, observe the 2015 Alpha Centauri until at least 2019 (if I observe it now, I am seeing light that left it around 2011). If Alpha Centauri had suddenly exploded in 2013, I have no way of observing that until at least 2018 - even in principle.

So if the barkeeper, instead of being down the street, is rather living on a planet orbiting Alpha Centauri, then the set of what he can observe in principle is not the same as the set of what I can observe in principle.

comment by VoiceOfRa · 2015-07-01T03:06:47.417Z · LW(p) · GW(p)

Everything that in principle, in theory, could be observed, is real. Most of those things you haven't observed. This does not make them any less real.

I'd like to congratulate you on developing your own "makes you sound insane to the man in the street" theory of metaphysics.

Replies from: IlyaShpitser, None
comment by IlyaShpitser · 2015-07-01T04:21:00.826Z · LW(p) · GW(p)

Man on the street needs to learn what counterfactual definiteness is.

Replies from: Stephen_Cole, VoiceOfRa
comment by Stephen_Cole · 2015-08-10T16:34:49.354Z · LW(p) · GW(p)

Ilya, can you give me a definition of "counterfactual definiteness" please?

Replies from: IlyaShpitser
comment by IlyaShpitser · 2015-08-10T16:39:50.781Z · LW(p) · GW(p)

Physicists are not very precise about it; may I suggest looking into "potential outcomes" (the language some statisticians use to talk about counterfactuals):

https://en.wikipedia.org/wiki/Rubin_causal_model

https://en.wikipedia.org/wiki/Counterfactual_definiteness

Potential outcomes let you think about a model that contains a random variable for what happens to Fred if we give Fred aspirin, and a random variable for what happens to Fred if we give Fred placebo. Even though in reality we only gave Fred aspirin. This is "counterfactual definiteness" in statistics.

This paper uses potential outcomes to talk about outcomes of physics experiments (so there is an exact isomorphism between counterfactuals in physics and potential outcomes):

http://arxiv.org/pdf/1207.4913.pdf

Replies from: Stephen_Cole
comment by Stephen_Cole · 2015-08-10T16:52:28.293Z · LW(p) · GW(p)

Sounds like this is perhaps related to the counterfactual-consistency statement? In its simple form, that the counterfactual or potential outcome under policy "a" equals the factual observed outcome when you in fact undertake policy "a", or formally, Y^a = Y when A = a.

Pearl has a nice (easy) discussion in the journal Epidemiology (http://www.ncbi.nlm.nih.gov/pubmed/20864888).

Is this what you are getting at, or am I missing the point?

Replies from: IlyaShpitser
comment by IlyaShpitser · 2015-08-10T19:09:49.415Z · LW(p) · GW(p)

No, not quite. Counterfactual consistency is what allows you to link observed and hypothetical data (so it is also extremely important). Counterfactual definiteness is even more basic than that. It basically sets the size of your ontology by allowing you to talk about Y(a) and Y(a') together, even if we only observe Y under one value of A.


edit: Stephen, I think I realized who you are, please accept my apologies if I seemed to be talking down to you, re: potential outcomes, that was not my intention. My prior is people do not know what potential outcomes are.


edit 2: Good talks by Richard Gill and Jamie Robins at JSM on this:

http://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211222

Replies from: Stephen_Cole
comment by Stephen_Cole · 2015-08-22T13:09:36.342Z · LW(p) · GW(p)

No offense taken. I am sorry I did not get to see Gill & Robins at JSM. Jamie also talks about some of these issues online back in 2013 at https://www.youtube.com/watch?v=rjcoJ0gC_po

comment by VoiceOfRa · 2015-07-02T04:58:56.681Z · LW(p) · GW(p)

Well, this whole thread started because minusdash and eli_sennesh objected to the concept of acausal trade for being too metaphysical.

comment by [deleted] · 2015-07-01T07:37:44.153Z · LW(p) · GW(p)

I just need to translate that for him to street lingo.

"There is shit we know, shit we could know, and shit we could not know no matter how good tech we had, we could not even know the effects it has on other stuff. So why should we say this latter stuff exists? Or why should we say this does not exist? We cannot prove either."

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-07-02T05:04:57.062Z · LW(p) · GW(p)

My serious point is that one cannot avoid metaphysics, and that way too many people start out from "all this metaphysics stuff is BS, I'll just use common sense" and end up with their own (bad) counter-intuitive metaphysical theory that they insist is "not metaphysics".

Replies from: Creutzer
comment by Creutzer · 2015-07-02T05:11:09.664Z · LW(p) · GW(p)

You could charitably understand everything that such people (who assert that metaphysics is BS) say with a silent "up to empirical equivalence". Doesn't the problem disappear then?

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-07-02T05:14:08.216Z · LW(p) · GW(p)

No because you need a theory of metaphysics to explain what "empirical equivalence" means.

Replies from: Creutzer
comment by Creutzer · 2015-07-02T08:40:26.254Z · LW(p) · GW(p)

To be honest, I don't see that at all.

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-07-03T01:29:42.681Z · LW(p) · GW(p)

So how would you define "empirical equivalence"?

comment by TheAncientGeek · 2015-06-28T21:28:47.165Z · LW(p) · GW(p)

It's insufficiently appreciated that physicalism is metaphysics too.

comment by 27chaos · 2015-06-29T00:38:13.171Z · LW(p) · GW(p)

How about you just jump right to the details of your method, and then backtrack to help other people understand the necessary context to appreciate the method? Otherwise, you will lose your audience.

comment by JonahS (JonahSinick) · 2015-06-26T18:36:30.484Z · LW(p) · GW(p)

See my edit. Part of where I'm coming from is realizing how socially undeveloped people in our reference class tend to be, such that apparent malice often comes from misunderstandings.

comment by InquilineKea · 2015-06-28T19:42:29.596Z · LW(p) · GW(p)

(before people's social behavior had seemed like a complicated blur because I saw so many variables without having started to correctly identify the latent ones).

Interesting - what are some examples of the latent ones?

comment by VoiceOfRa · 2015-06-27T00:49:58.545Z · LW(p) · GW(p)

Why did you have this impression?

Probably because of the human tendency to overestimate the importance of any knowledge one happens to have and underestimate the importance of any knowledge one doesn't. (Is there a name for this bias?)

Replies from: Good_Burning_Plastic
comment by RomeoStevens · 2015-06-26T18:29:28.927Z · LW(p) · GW(p)

I think having the concept of PCAs prevents some mistakes in reasoning at an intuitive, day-to-day level. It nudges me towards fox thinking instead of hedgehog thinking. Normal folk intuition grasps at the most cognitively available and obvious variable to explain causes, and then our System 1 acts as if that variable explains most if not all the variance. Looking at PCAs many times (and being surprised by them) makes me less likely to jump to conclusions about the causal structure of clusters of related events. So maybe I could characterize it as giving a System 1 intuition for not making the post hoc ergo propter hoc fallacy.

Maybe part of the problem Jonah is running into in explaining it is that having done many, many example problems with System 2 loaded it into his System 1, and the System 1 knowledge is what he really wants to communicate?

Replies from: minusdash, JonahSinick
comment by minusdash · 2015-06-26T19:24:29.674Z · LW(p) · GW(p)

What do you mean by getting surprised by PCAs? Say you have some data, you compute the principal components (eigenvectors of the covariance matrix) and the corresponding eigenvalues. Were you surprised that a few principal components were enough to explain a large percentage of the variance of the data? Or were you surprised about what those vectors were?
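For readers who have not done this computation, here is a rough sketch of it in NumPy. The data set is synthetic and entirely assumed: ten observed variables generated from two hidden factors plus a little noise (the sizes, noise level and seed are arbitrary choices). The principal components are the eigenvectors of the covariance matrix, and each eigenvalue divided by the sum of eigenvalues is the fraction of variance that component explains.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_observed = 2000, 10

# Synthetic data (an assumption for illustration): 2 hidden factors drive
# 10 observed variables, plus a small amount of independent noise.
latent = rng.normal(size=(n_samples, 2))
loadings = rng.normal(size=(2, n_observed))
data = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_observed))

# PCA by hand: eigendecomposition of the covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigh returns ascending order

# Sort so the component with the largest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

explained = eigenvalues / eigenvalues.sum()
print("fraction of variance per component:", explained.round(3))
print("first two components together:", explained[:2].sum().round(3))
```

With data built this way, the first two components carry nearly all the variance, which is the "few components explain most of it" situation; what the corresponding vectors mean is a separate question that the computation does not answer.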

I think this is not really PCA or even dimensionality reduction specific. It's simply the idea of latent variables. You could gain the same intuition from studying probabilistic graphical models, for example generative models.

Replies from: RomeoStevens
comment by RomeoStevens · 2015-06-26T19:32:02.143Z · LW(p) · GW(p)

Surprised by either. Just finding a structure of causality that was very unexpected. I agree the intuition could be built from other sources.

Replies from: minusdash
comment by minusdash · 2015-06-26T19:46:35.827Z · LW(p) · GW(p)

PCA doesn't tell much about causality though. It just gives you a "natural" coordinate system where the variables are not linearly correlated.
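A small sketch of that "natural coordinate system" point, using made-up two-variable data (the coefficients and seed are arbitrary assumptions). Two strongly correlated variables are re-expressed in principal-component coordinates, and the correlation vanishes. Nothing in the procedure looks at which variable was generated from which, so it cannot say anything about causal direction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: y is built from x here, but PCA never sees that fact.
x = rng.normal(size=5000)
y = 0.8 * x + 0.3 * rng.normal(size=5000)
data = np.column_stack([x, y])
centered = data - data.mean(axis=0)

# Columns of `components` are the eigenvectors of the covariance matrix.
_, components = np.linalg.eigh(np.cov(centered, rowvar=False))

# The same points, expressed in principal-component coordinates.
scores = centered @ components

corr_before = np.corrcoef(centered, rowvar=False)[0, 1]
corr_after = np.corrcoef(scores, rowvar=False)[0, 1]
print(f"correlation in original coordinates:  {corr_before:.3f}")
print(f"correlation in principal coordinates: {corr_after:.1e}")
```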

Replies from: VoiceOfRa
comment by VoiceOfRa · 2015-06-27T01:29:52.983Z · LW(p) · GW(p)

Right, one needs to use additional information to determine causality.

comment by JonahS (JonahSinick) · 2015-06-26T19:04:55.464Z · LW(p) · GW(p)

Yes, you seem to have a very clear understanding of where I'm coming from. Thanks.

comment by [deleted] · 2015-06-26T23:11:01.690Z · LW(p) · GW(p)

As I see it this movement seems to try to build up a new backbone of knowledge from scratch. But doing this they repeat the mistakes of the past philosophers.

Don't say the p-word, please ;-).

I do agree that more real-life understanding is gained from just obtaining a broad scientific education than from going wow-hunting. But of course, I would say that, since I'm a fanatical textbook purchaser.

comment by Anders_H · 2015-06-26T15:59:39.959Z · LW(p) · GW(p)

I don't believe you can obtain an understanding of the idea that "correlation does not imply causation" from even a very deep appreciation of the material in Statistics 101. These courses usually make no attempt to define confounding, comparability etc. If they try to define confounding, they tend to use incoherent criteria based on changes in the estimate. Any understanding is almost certainly going to have to originate from outside of Statistics 101; unless you take a course on causal inference based on directed acyclic graphs it will be very challenging to get beyond memorizing the teacher's password.

Replies from: SatvikBeri
comment by SatvikBeri · 2015-06-26T16:10:48.155Z · LW(p) · GW(p)

Agree completely, and I'll also point out that at least for me, a very shallow understanding of the ideas in Causality did much more to help me understand correlation vs. causation, confounding etc. than any amount of work with Statistics 101. And this was enormously practical–I was able to make significantly better financial decisions at Fundation due to understanding concepts like Simpson's Paradox on a system 1 level.

Replies from: gwern
comment by gwern · 2015-06-27T02:53:05.215Z · LW(p) · GW(p)

To chime in as well: my own understanding of 'correlation does not imply causation' does not come from the basic statistics courses and articles and tutorials I read. While I knew the saying and the concepts and a little bit about causal graphs, it took years of failed self-experiments and the intensely frustrating experience of seeing correlate after correlate fail randomized experiments before I truly accepted it.

I don't know how helpful, exactly, this has been on a practical level, but at least it's good for me on an epistemic level in that I have since accepted many fewer new beliefs than I would otherwise have.

Replies from: IlyaShpitser
comment by IlyaShpitser · 2015-06-28T15:05:40.376Z · LW(p) · GW(p)

Me four.


Although you know, there is no reason in principle you couldn't get all that stuff Anders_H is talking about from intro stats, it's just that stats isn't taught as well as it can be.

comment by Viliam · 2015-07-08T19:55:31.608Z · LW(p) · GW(p)

I would probably use different words, but I believe I fit Jonah's description. Before finding LW, I felt strongly isolated. Like, surrounded by human bodies, but intellectually alone. Thinking about topics that people around me considered "weird", so I had no one to debate them with. Having a large range of interests, and while I could find people to debate individual interests with, I had no one to talk with about the interesting combinations I saw there.

I felt "weird", and from people around me I usually got two kinds of feedback. When I didn't try to pretend anything, they more or less confirmed that I am weird (of course, many were gentle, trying not to hurt me). When I tried to play a role of someone "less weird" (that is, I ignored most of the things I considered interesting, and just tried to fit)... well, it took a lot of time and practice to do this correctly, but then people accepted me. So, for a long time it felt like the only way to be accepted would be to supress a large part of what I consider to be "myself"; and I suspect that it would never work perfectly, that there would still be some kind of intellectual hunger.

Then I found LW and I was like: "whoa... there actually are people like me! too bad they are on the other side of the planet though". Then I found some of them living closer, and... going to meetups feels incredibly refreshing. First time in my life, I don't have to suppress anything, to play any role. I just am... in an environment that feels natural. I finally started understanding how people can enjoy having social contacts.

Now let's imagine that in a parallel universe, those LessWrongers who live in a city near to mine, would instead be my neighbors since my childhood, or that we would be classmates at high school. I believe my life would be very different. (I believe there are people like this in my city, but the problem is finding those few dozen individuals among the hundreds of thousands, especially when there is no word in a public vocabulary to describe "us".)

I can't find the article now, but I believe it was written by Lewis Terman, where he observed how successful highly intelligent people are. He found a difference between those who were "intelligent people in an intelligent environment" and those who were "isolated intelligent people". The former were usually very successful in life: they could talk with their parents and friends as equals, share their algorithms for life success, fit into their environment. The latter felt isolated, and often burned out at some moment of their lives. The conclusion was that for a highly intelligent person, having similarly highly intelligent family and friends makes a huge difference in their lives. -- When you observe the difference between "academia" and "LessWrong", it may be related to this.

It is easier to be academically successful when your parents are. You can pick good habits and strategies from them; you can debate your work and problems with them. If you are the only academically inclined person in the family, you lead a double life: the "real life" outside of school, and the "academic life" inside. The more you focus on your work, the more it feels like you are withdrawing from everything else. On the other hand, if you come from the same culture, focusing on the work makes you fit into the culture.

I am going to break a taboo here, but I don't know how to tell it otherwise. I have IQ about four or five sigma above the average. The difference between me and the average Mensa member is larger than the difference between Mensa and the general population. Many people in Mensa seem kind of dense to me, and average people, those are sometimes like five-year-old children. (I believe for many people on LessWrong it feels the same.) Sure, intelligence is not everything: other people have skills and traits that I lack, sometimes have more success than me, and I admire that. It's just... so difficult to talk with them like with adult people. But when I go to LW meetup, it's like "whoa... finally a group of adult people, how amazing!".

But I'm already an old man, relatively speaking. Now I am 39; I found LW when I was 35. Finally I have a company of my peers (still not in my own city), but it can't fix the three decades of my life that already passed in isolation. It can make my life better, but I will always have the emotional scars of chronic loneliness. Oh, how much I envy those lucky kids who can go to LW meetups as teenagers. Makes me wonder how much my own life could be different; I probably wouldn't recognize myself.

Of course, this is just one data point; I don't know how typical or atypical I am within the LW community.

comment by btrettel · 2015-06-27T01:01:56.043Z · LW(p) · GW(p)

PCA and other dimensionality reduction techniques are great, but there's another very useful technique that most people (even statisticians) are unaware of: dimensional analysis, and in particular, the Buckingham pi theorem. For some reason, this technique is used primarily by engineers in fluid dynamics and heat transfer despite its broad applicability. This is the technique that allows scale models like wind tunnels to work, but it's more useful than just allowing for scaling. I find it very useful to reduce the number of variables when developing models and conducting experiments.

Dimensional analysis recognizes a few basic axioms about models with dimensions and sees what they imply. You can use these to construct new variables from the old variables. The model is usually complete in a smaller number of these new variables. The technique does not tell you which variables are "correct", just how many independent ones are needed. Identifying "correct" variables requires data, domain knowledge, or both. (And sometimes, there's no clear "best" variable; multiple work equivalently well.)

Dimensional analysis does not help with categorical variables, or numbers which are already dimensionless (though by luck, sometimes combinations of dimensionless variables are actually what's "correct"). This is the main restriction that applies. And you can expect at best a reduction in the number of variables of about 3. Dimensional analysis is most useful for physical problems with maybe 3 to 10 variables.

The basic idea is this: Dimensions are some sort of metadata which can tell you something about the structure of the problem. You can always rewrite a dimensional equation, for example, to be dimensionless on both sides. You should notice that some terms become constants when this is done, and that simplifies the equation.

Here's a physical example: Let's say you want to measure the drag on a sphere (units: N). You know this depends on the air speed (units: m/s), viscosity (units: m^2/s), air density (units: kg/m^3), and the diameter of the sphere (units: m). So, you have 5 variables in total. Let's say you want to do a factorial design with 4 levels in each variable, with no replications. You'd have to do 4^4 = 256 experiments. This is clearly too complicated.

What fluid dynamicists have recognized is that you can rewrite the relationship in terms of different variables, and nothing is missing. The Buckingham pi theorem mentioned previously says that we only need 2 dimensionless variables given our 5 dimensional variables. So, instead of the drag force, you use the drag coefficient, and instead of the speed, viscosity, etc., you use the Reynolds number. Now, you only need to do 4 experiments to get the same level of representation.
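The counting in the sphere example can be checked mechanically. The sketch below is only an illustration of the pi theorem's bookkeeping, with the variable ordering and names chosen by me: it writes down the exponents of kg, m and s for each of the five variables, takes the rank of that matrix, and subtracts it from the number of variables. It also verifies that two candidate groups, F/(ρ U² D²) (which is proportional to the usual drag coefficient, not identical to it) and U D/ν (the Reynolds number), are in fact dimensionless.

```python
import numpy as np

variables = ["drag force F", "speed U", "kinematic viscosity nu",
             "density rho", "diameter D"]

# Rows are base dimensions, columns follow the order of `variables`.
dimension_matrix = np.array([
    [ 1,  0,  0,  1, 0],   # kg
    [ 1,  1,  2, -3, 1],   # m
    [-2, -1, -1,  0, 0],   # s
])

rank = np.linalg.matrix_rank(dimension_matrix)
n_groups = len(variables) - rank
print("independent dimensionless groups:", n_groups)

# Exponents of (F, U, nu, rho, D) in each candidate group.
drag_group = np.array([1, -2, 0, -1, -2])   # F / (rho * U^2 * D^2)
reynolds = np.array([0, 1, -1, 0, 1])       # U * D / nu

for name, exponents in [("drag group", drag_group), ("Reynolds number", reynolds)]:
    net_dimensions = dimension_matrix @ exponents
    print(name, "net exponents of kg, m, s:", net_dimensions)

# Experiment counts for a 4-level factorial design, as in the comment.
levels = 4
print("dimensional design:", levels ** 4, "runs")
print("dimensionless design:", levels ** (n_groups - 1), "runs")
```

A net exponent vector of all zeros means the group has no units, so it is a legitimate axis for the reduced experiment.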

As it turns out, you can use techniques like PCA on top of dimensional analysis to determine that certain dimensionless parameters are unimportant (there are other ways too). This further simplifies models.

There's a lot more on this topic than what I have covered and mentioned here. I would recommend reading the book Dimensional analysis and the theory of models for more details and the proof of the pi theorem.

(Another advantage of dimensional analysis: If you discover a useful dimensionless variable, you can get it named after yourself.)

Replies from: Epictetus, Daniel_Burfoot, VoiceOfRa
comment by Epictetus · 2015-06-27T04:41:32.850Z · LW(p) · GW(p)

In general, if your problem displays any kind of symmetry* you can exploit that to simplify things. I think most people are capable of doing this intuitively when the symmetry is obvious. The Buckingham pi theorem is a great example of a systematic way to find and exploit a symmetry that isn't so obvious.

* By "symmetry" I really mean "invariance under a group of transformations".

Replies from: btrettel
comment by btrettel · 2015-06-27T12:41:29.655Z · LW(p) · GW(p)

This is a great point. Other than fairly easy geometric and time symmetries, do you have any advice or know of any resources which might be helpful towards finding these symmetries?

Here's what I do know: Sometimes you can recognize these symmetries by analyzing a model differential equation. Here's a book on the subject that I haven't read, but might read in the future. My PhD advisor tells me I already know one reliable way to find these symmetries (e.g., like how to find the change of variables used here), so reading this would be a poor use of time in his view. This approach also requires knowing a fair bit more about a phenomenon than just which variables it depends on.

Replies from: Epictetus, Vaniver
comment by Epictetus · 2015-06-27T16:40:35.427Z · LW(p) · GW(p)

The book you linked is the sort of thing I had in mind. The historical motivation for Lie groups was to develop a systematic way to use symmetry to attack differential equations.

comment by Vaniver · 2015-06-27T14:15:53.515Z · LW(p) · GW(p)

This is a great point. Other than fairly easy geometric and time symmetries, do you have any advice or know of any resources which might be helpful towards finding these symmetries?

Are you familiar with Noether's Theorem? It comes up in some explanations of Buckingham pi, but the point is mostly "if you already know that something is symmetric, then something is conserved."

The most similar thing I can think of, in terms of "resources for finding symmetries," might be related to finding Lyapunov stability functions. It seems there's not too much in the way of automated function-finding for arbitrary systems; I've seen at least one automated approach for systems with polynomial dynamics, though.

Replies from: btrettel, Douglas_Knight
comment by btrettel · 2015-06-27T15:40:42.584Z · LW(p) · GW(p)

Not familiar with Noether's theorem. Seems useful for constructing models, and perhaps determining if something else beyond mass, momentum, and energy is conserved. Is the converse true as well, i.e., does conservation imply that symmetries exist?

I'm also afraid I know nearly nothing about non-linear stability, so I'm not sure what you're referring to, but it sounds interesting. I'll have to read the Wikipedia page. I'd be interested if you know any other good resources for learning this.

Replies from: Vaniver, Will_Sawin
comment by Vaniver · 2015-06-27T17:10:33.688Z · LW(p) · GW(p)

Is the converse true as well, i.e., does conservation imply that symmetries exist?

I think this is what Lie groups are all about, but that's a bit deeper in group theory than I'm comfortable speaking on.

I'd be interested if you know any other good resources for learning this.

I learned it the long way by taking classes, and don't recall being particularly impressed by any textbooks. (I can lend you the ones I used.) I remember thinking that reading through Akella's lecture notes was about as good as taking the course, and so if you have the time to devote to it you might be able to get those from him by asking nicely.

comment by Will_Sawin · 2015-06-27T21:58:14.006Z · LW(p) · GW(p)

Conservation gives a local symmetry but there may not be a global symmetry.

For instance, you can imagine a physical system with no forces at all, so everything is conserved. But there are still some parameters that define the location of the particles. Then the physical system is locally very symmetric, but it may still have some symmetric global structure where the particles are constrained to lie on a surface of nontrivial topology.

comment by Douglas_Knight · 2015-06-27T20:09:33.743Z · LW(p) · GW(p)

Noether's theorem has nothing to do with Buckingham's theorem. Buckingham's theorem is quite general (and vacuous), while Noether's theorem is only about hamiltonian/lagrangian mechanics.

Added: Actually, Buckingham and Noether do have something in common: they both taught at Bryn Mawr.

Replies from: Vaniver, EHeller, btrettel
comment by Vaniver · 2015-06-27T21:57:52.740Z · LW(p) · GW(p)

Noether's theorem has nothing to do with Buckingham's theorem.

Both of them are relevant to the project of exploiting symmetry, and deal with solidifying a mostly understood situation. (You can't apply Buckingham's theorem unless you know all the relevant pieces.) The more practical piece that I had in mind is that someone eager to apply Noether's theorem will need to look for symmetries; they may have found techniques for hunting for symmetries that will be useful in general. It might be worth looking into material that teaches it, not because it itself is directly useful, but because the community that knows it may know other useful things.

comment by EHeller · 2015-06-27T21:44:18.722Z · LW(p) · GW(p)

It's quite a bit more general than Lagrangian mechanics. You can extend it to any functional that takes functions between two manifolds to complex numbers.

comment by btrettel · 2015-06-27T20:35:28.687Z · LW(p) · GW(p)

In what sense do you mean Buckingham's theorem is vacuous?

comment by Daniel_Burfoot · 2015-06-28T18:01:47.990Z · LW(p) · GW(p)

I've always been amazed at the power of dimensional analysis. To me the best example is the problem of calculating the period of an oscillating mass on a spring. The relevant values are the spring constant K (kg/s^2) and the mass M (kg), and the period T is in (s). The only way to combine K and M to obtain a value with dimensions of (s) is sqrt(M/K), and that's the correct form of the actual answer - no calculus required!
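The "only way to combine K and M" step is a small linear system, and it can be solved rather than guessed. This sketch assumes, as the comment does, that K and M are the only quantities involved, and looks for exponents a and b such that K^a · M^b has units of seconds.

```python
import numpy as np

# Columns: K (kg^1 s^-2), M (kg^1 s^0). Rows: exponent of kg, exponent of s.
unit_exponents = np.array([
    [ 1.0, 1.0],   # kg
    [-2.0, 0.0],   # s
])

# The period T should come out as kg^0 s^1.
target = np.array([0.0, 1.0])

a, b = np.linalg.solve(unit_exponents, target)
print(f"T is proportional to K^{a} * M^{b}")
```

The solution a = -1/2, b = 1/2 is the sqrt(M/K) form. Dimensional analysis fixes only the form; any dimensionless prefactor has to come from somewhere else.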

Replies from: Douglas_Knight
comment by Douglas_Knight · 2015-07-01T06:25:23.446Z · LW(p) · GW(p)

Actually, there's another parameter, the displacement. It turns out that the spring period does not depend on the displacement, but that's a miracle that is special to springs. Instead, look at the pendulum. The same dimensional analysis gives the square root of the length divided by gravitational acceleration. That's off by a dimensionless constant, 2π. Moreover, even that is only approximately correct. The real answer depends on the displacement in a complicated way.
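To put numbers on the pendulum case, here is a sketch using the standard closed form for an ideal pendulum released from rest at angle θ₀: T = 2π·sqrt(L/g) / AGM(1, cos(θ₀/2)), where AGM is the arithmetic-geometric mean. The function names and the particular length, g and amplitudes are my own choices for illustration, and the formula assumes no friction and an amplitude below 180 degrees.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > 1e-14 * a:
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a

def pendulum_period(length, g, amplitude_rad):
    """Exact period of an ideal pendulum released from rest at amplitude_rad."""
    small_angle_period = 2 * math.pi * math.sqrt(length / g)
    return small_angle_period / agm(1.0, math.cos(amplitude_rad / 2))

length, g = 1.0, 9.81
dimensional_guess = math.sqrt(length / g)   # what dimensional analysis gives

for degrees in (1, 30, 90, 170):
    period = pendulum_period(length, g, math.radians(degrees))
    ratio = period / dimensional_guess
    print(f"amplitude {degrees:3d} deg: period / sqrt(L/g) = {ratio:.4f}")

print(f"2*pi = {2 * math.pi:.4f}")
```

At small amplitudes the ratio sits near 2π, and it grows as the amplitude increases, which is the dependence on displacement that dimensional analysis alone cannot reveal.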

Replies from: btrettel
comment by btrettel · 2015-07-01T13:55:39.042Z · LW(p) · GW(p)

This is a good point. At best you can figure out that period is proportional to (not equal to) sqrt(M/K) multiplied by some function of other parameters, say, one involving displacement and another characterizing the non-linearity (if K is just the initial slope, as I've seen done before). It's a fortunate coincidence if the other parameters are unimportant. You can not determine based solely on dimensional analysis whether certain parameters are unimportant.

comment by VoiceOfRa · 2015-06-27T01:36:21.579Z · LW(p) · GW(p)

That's because outside of physics (and possibly chemistry) there are enough constants running around that all quantities are effectively dimensionless. I'm having a hard time seeing a situation in say biology where I could propose dimensional analysis with a straight face, to say nothing of softer sciences.

Replies from: btrettel, Will_Sawin
comment by btrettel · 2015-06-27T02:07:54.291Z · LW(p) · GW(p)

As I said, dimensional analysis does not help with categorical variables. And when the number of dimensions is low and/or the number of variables is large, dimensional analysis can be useless. I think it's a necessary component of any model builder's toolbox, but not a tool you will use for every problem. Still, I would argue that it's underutilized. When dimensional analysis is useful, it definitely should be used. (For example, despite its obvious applications in physics, I don't think most physics undergrads learn the Buckingham pi theorem. It's usually only taught to engineers learning fluid dynamics and heat transfer.)

Two very common dimensionless parameters are the ratio and fraction. Both certainly appear in biology. Also, the subject of allometry in biology is basically simple dimensional analysis.

I've seen dimensional analysis applied in other soft sciences as well, e.g., political science, psychology, and sociology are a few examples I am aware of. I can't comment much on the utility of its application in these cases, but it's such a simple technique that I think it's worth trying whenever you have data with units.

Speaking more generally, the idea of simplification coming from applying transformations to data has broad applicability. Dimensional analysis is just one example of this.

comment by Will_Sawin · 2015-06-27T22:01:32.700Z · LW(p) · GW(p)

One thing that most scientists in these soft sciences already have a good grasp on, but a lot of laypeople do not, is the idea of appropriately normalizing parameters. For instance dividing something by the mass of the body, or the population of a nation, to do comparisons between individuals/nations of different sizes.

People will often make bad comparisons where they don't normalize properly. But hopefully most people reading this article are not at risk for that.

comment by interstice · 2015-06-26T16:41:12.366Z · LW(p) · GW(p)

What resources would you recommend for learning advanced statistics?

Replies from: None
comment by [deleted] · 2015-06-28T19:23:34.055Z · LW(p) · GW(p)

What would you call "advanced" statistics? But let's start listing classes:

1) Intro to Discrete and Continuous Probability -- you'll need this for every possible path

Now we need to start branching out. Choose your adventure: applied or theoretical? Frequentist, Bayesian, Likelihoodist, or "Machine" Learning?

Your normal university statistics sequence will probably give you Intro to Frequentist Statistics 1 at this point. That's a fine way to go, but it's not the only way. In fact, many departments in the empirical sciences will teach Data Analysis classes, or the like, which introduce applied statistics before teaching you the theory, which would mean you've actually dealt with real data before you learn the theory. I think that might be a Very Good Idea.

Now let's hope you've taken one of the following paths:

  • Data Analysis and Intro to Frequentist Stats 1
  • Intro to Bayesian Statistics 1
  • Intro to Machine Learning (with laboratory exercises to get experience)

From there I would recommend knowing linear algebra decently well before moving on. Then you can start taking courses/reading textbooks in more advanced/theoretical machine learning, computational Bayesian methods, multidimensional frequentist statistics, causal analysis, or just more and more applied data analysis. You should probably check what sort of statistical methods are favored "in the field" that you actually care about.

comment by Vladimir_Nesov · 2015-06-30T01:45:09.648Z · LW(p) · GW(p)

This doesn't address the issue of the claimed difference in Jonah's perception of LWers from his perception of other groups.

comment by Viliam · 2015-07-09T10:25:27.742Z · LW(p) · GW(p)

I am not giving up, and I hope I will still achieve some big success.

In the shortest term... I have a baby now, which turned my life upside down a bit, so I need to solve some logistic problems first (e.g. to buy a new flat) and get used to the new situation. It might take a year. -- Not complaining here; I always wanted to have children, but it's taking time and energy and money, so my options are now more limited than usual. I believe it will be okay in a few months, but today, I am rather busy and tired. Also, having a family limits my options; for example if I would decide that moving to another city would make my life better, it is no longer only my own decision. My hands are a bit more tied than they would be if I were 25 again.

I still haven't given up completely on starting a rationalist community in my own city, and I have two specific plans. (1) These days I am finishing the translation of the LW Sequences book; when it is ready, I will distribute it freely and try to make it popular, and hope that people who enjoy it will contact me. (2) In September, I plan to do some rationality "lectures" (advertising for LW and for the translated book) at at least one high school and one university.

I will probably not do anything scientific, ever; that train has already left. I cannot compete with 20-year-olds with fresh brains and fresh memories of their university lectures, who don't have a family to feed. It would be wiser to focus fully on my personal life and making money, because that's what I have to do anyway. -- The current plan is writing computer games, because the entry costs are almost zero, and I can do it at home in the evenings when the baby sleeps. (I have to keep the day job to pay bills.) Later, when the baby grows up and starts attending school, I may try something more ambitious.

But still, even if my plans succeed and I live till 80, I will not be able to do as much as in the hypothetical parallel universe where I would find a LW community as a teenager (and also live till 80). But it will still be better than yet another parallel universe where LW doesn't exist at all or where I am somehow unable to find it.

Replies from: Gram_Stone
comment by Gram_Stone · 2015-07-09T13:24:41.732Z · LW(p) · GW(p)

It is so painful to have an easily available possible world in which you find LessWrong earlier than in the real world. I ran into LW/OB five times since I was 16 and didn't stick around until I was 21. I can't imagine what I would be like with five years of exposure to the important things that I've been exposed to in the past six months, as well as having grown alongside the community, seeing as how I came around near the time that LW began.

Replies from: Viliam
comment by Viliam · 2015-07-09T21:36:01.748Z · LW(p) · GW(p)

I also didn't stick with LW the first time. I found an article linked from somewhere, I believe it was "Well-Kept Gardens Die By Pacifism", I was impressed, but then I left. A year or two later, I again randomly found an article, then I saw it was the same website as the previous one, so I was like "Oh, this website contains multiple interesting articles" and started clicking on random links in text. Then I cautiously posted a few comments in the Open Thread -- some got downvotes, some got upvotes -- and kept reading...

So, somewhere in the parallel Everett branch there is a version of me that didn't return to LW anymore, or just returned, read one article, and left again. Poor guy; he probably spends a lot of time having stupid debates on other websites.

What do you believe you would have done differently if you had stuck around here at 16?

comment by JonahS (JonahSinick) · 2015-06-29T07:16:56.545Z · LW(p) · GW(p)

I'm speaking based on many interactions with many members of the community. I don't think this is true of everybody, but I have seen a difference at the group level.

comment by [deleted] · 2015-06-26T23:07:15.292Z · LW(p) · GW(p)

Real world data often has the surprising property of "dimensionality reduction": a small number of latent variables explain a large fraction of the variance in data.

Why is that surprising? The causal structure of the world is very sparse, by the nature of causality. One cause has several effects, so once you scale up to lots of causative variables, you expect to find that large portions of the variance in your data are explained by only a few causal factors.

Causality is indeed the skeleton of data. And oh boy, wait until you hit hierarchical Bayes models!

Only, the variables that explain a lot usually aren't the variables that are immediately visible – instead they're hidden from us, and in order to model reality, we need to discover them, which is the function that PCA serves.

Not quite. PCA helps you reduce dimensionality by discovering the directions of variation in your feature-space that explain most of the variation (in fact, a total ordering of the directions of variation in the data by how much variation they explain). Then there's Independent Component Analysis, which separates your feature data into directions of variation that are as statistically independent as possible (a stronger condition than the orthogonal, uncorrelated directions PCA gives you).
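
To make the "few causes, many observables" picture concrete, here is a minimal sketch using synthetic data. Everything in it is an assumption chosen for illustration (two hidden causes, ten observed variables, a small noise level, random loadings); it is not data from the post. PCA is done by hand through an SVD of the centered data, which yields the directions of variation already ordered by how much variance they explain.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 2 hidden causes drive 10 observed variables, plus small noise.
n_samples, n_latent, n_observed = 1000, 2, 10
latent = rng.normal(size=(n_samples, n_latent))
loadings = rng.normal(size=(n_latent, n_observed))
observed = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_observed))

# PCA via SVD of the centered data matrix.
centered = observed - observed.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Squared singular values are proportional to the variance along each direction;
# the SVD returns them sorted from largest to smallest.
explained_ratio = singular_values**2 / np.sum(singular_values**2)
top_two_share = explained_ratio[:2].sum()

print("share of variance per component:", np.round(explained_ratio, 4))
print("share explained by the first two:", round(float(top_two_share), 4))
```

In this toy setup the first two components should carry nearly all of the variance, because only two causes were put in. Note that the recovered directions span the same subspace as the hidden causes but need not coincide with them individually, which is part of the distinction between PCA and methods like ICA.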

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2015-06-27T08:11:41.862Z · LW(p) · GW(p)

The causal structure of the world is very sparse, by the nature of causality.

Can you expand your reasoning? We do see around us sparse — that is, understandable — causal systems. And even chaotic ones often give rise to simple properties (e.g. motion of huge numbers of molecules → gas laws). But why (ignoring anthropocentric arguments) would one expect to see this?

Replies from: None
comment by [deleted] · 2015-06-28T18:52:25.989Z · LW(p) · GW(p)

There are really just three ways the causal structure of reality could go:

  • Many causes -> one effect
  • One cause -> one effect, strictly
  • One cause -> many effects

Since the last of these will generate more (apparent) random variables, most observables will end up deriving from a relatively sparse causal structure, even if we assume that the causal structures themselves are sampled uniformly from this selection of three.

So, for instance, parameter-space compression (which is its own topic to explain, but oh well), aka: the hierarchical structure of reality, actually does follow that first item: many micro-level causes give rise to a single macro-level observable. But you'll still find that most observables come from non-compressive causal structures.

This is why we actually have to work really hard to find out about micro-scale phenomena (things lower on the hierarchy than us): they have fewer observables whose variance is uniquely explicable by reference to a micro-scale causal structure.
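
The counting argument above can be sketched as a toy simulation. This is only an illustration under stated assumptions: structures are drawn uniformly from the three kinds listed, and the "many" in both fan-in and fan-out is fixed at a hypothetical size of 5. It counts which kind of structure each observable (effect variable) comes from.

```python
import random

random.seed(0)

FAN = 5  # hypothetical fan-in / fan-out size for the "many" cases

# Number of effect variables (observables) each kind of structure produces.
EFFECTS_PER_STRUCTURE = {
    "many causes -> one effect": 1,
    "one cause -> one effect": 1,
    "one cause -> many effects": FAN,
}


def share_of_observables(n_structures=30_000):
    """Sample structures uniformly and return each kind's share of all observables."""
    counts = dict.fromkeys(EFFECTS_PER_STRUCTURE, 0)
    kinds = list(EFFECTS_PER_STRUCTURE)
    for _ in range(n_structures):
        kind = random.choice(kinds)
        counts[kind] += EFFECTS_PER_STRUCTURE[kind]
    total = sum(counts.values())
    return {kind: count / total for kind, count in counts.items()}


shares = share_of_observables()
for kind, share in shares.items():
    print(f"{kind}: {share:.3f}")
```

With a fan-out of k, the expected share of observables coming from one-cause-many-effects structures is k / (k + 2), so about 0.71 for k = 5, even though each kind of structure is equally likely. The sketch only restates the counting step; it says nothing about whether real causal structures are sampled this way.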

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2015-06-28T21:43:19.734Z · LW(p) · GW(p)

I need that expanded a lot more. Why not many causes -> many effects, for example?

Replies from: None
comment by [deleted] · 2015-06-28T23:11:19.989Z · LW(p) · GW(p)

Ah, you mean a densely interconnected "almost all to almost all" causal structure. Well, I'd have to guess: because that would look far more like random behavior than causal order, so we wouldn't even notice it as something to causally analyze!

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2015-06-29T12:54:05.875Z · LW(p) · GW(p)

We do notice turbulence as something that doesn't look random, and is hard-to-impossible to causally analyze.

Here's an anecdote. I can't copy and paste it, but it's in the middle column.

Replies from: btrettel
comment by btrettel · 2015-06-30T13:07:00.901Z · LW(p) · GW(p)

This is a very interesting point. PCA (or as its time and/or space series version is called, the Karhunen-Loève expansion and/or POD) has not been found to be useful for turbulence modeling, as I recall. There's a brief section in Pope's book on turbulence about modeling with this. From what I understand, POD is mostly used for visualization purposes, not to help build models. (It's worth noting that while my background in fluid dynamics is strong, I know little to nothing about PCA and the like aside from what they apparently do.)

Maybe I don't actually understand causality, but I think in terms of modeling, we do have a good model (the Navier-Stokes, or N-S, equations) and so in some sense, it's clear what causes what. In principle, if you run a computer simulation with these equations and the correct boundary conditions, the result will be reasonably accurate. This has been demonstrated through direct simulations of some relatively simple cases like flow through a channel. So that's not the issue. The actual issue is that you need a lot of computing power to simulate even basic flows, and attempts to develop lower order models have been fairly unsuccessful. So as a model, N-S is of limited utility as-is.

In my view, the "turbulence problem" comes down to two facts: 1. the N-S equations are chaotic (sensitive to initial conditions, so small changes can cause big effects) and 2. they exhibit large scale separation (so the smallest details you need to resolve, the Kolmogorov scales in most cases, are much smaller than the physical dimensions of a problem, say the length of a wing). To understand these points better, imagine that rigid body dynamics was inaccurate (say, for modeling the trajectory of a baseball), and you had to model all the individual atoms to get it right. And if one atom was off, that might have a big effect. Obviously that's a lot harder, and it's probably computationally intractable outside of a few simple cases. (The chaos part is "avoided" because you probably would simulate an ensemble of initial conditions via Monte Carlo or something else, and get an "ensemble mean" which you would compare against an experiment. This works well from what I understand even if the details are unclear.)
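
Both points in the parenthetical (sensitivity to initial conditions, and averaging over an ensemble to get something stable) can be illustrated with a much simpler chaotic system. The sketch below is not a Navier-Stokes simulation; it swaps in the logistic map with r = 4 as a stand-in, and the starting values, perturbation size, step count and ensemble size are all assumptions picked for illustration.

```python
import random


def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), returning all visited values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# 1. Sensitivity: two runs whose initial conditions differ by 1e-10.
a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)
first_step_gap = abs(a[1] - b[1])
max_gap = max(abs(x - y) for x, y in zip(a, b))
print("gap after 1 step:", first_step_gap)
print("largest gap within 60 steps:", max_gap)

# 2. Ensemble mean: individual runs are unpredictable, but the average over
#    many random initial conditions is stable from one ensemble to the next.
random.seed(0)
finals = [logistic_trajectory(random.random(), 60)[-1] for _ in range(2000)]
ensemble_mean = sum(finals) / len(finals)
print("ensemble mean after 60 steps:", ensemble_mean)
```

The two runs start out indistinguishable and end up completely different, while the ensemble mean settles near 0.5, the mean of this map's long-run distribution. That is the sense in which the chaos can be "avoided": you compare statistics of many runs against experiment, not any single trajectory.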

So in some sense, yes, this looks like an "almost all to almost all" causal structure. Though, I looked up a bit about causal diagrams and it's not even clear to me how you might draw one for turbulence, and not because of turbulence itself. It's not clear what an "event" might be to me. There isn't even a precise definition of "turbulence" to begin with, so maybe this should be expected. I suppose on some level such things are arbitrary and you could define an event to be fluid movement in some direction, for each direction, each point in space, and each time. I'm not sure if anyone has done this sort of analysis.

(For the incompressible N-S equations, you can easily say that everything causes everything because the equations are elliptic, so the speed of sound is infinite (which means changes in some place are felt everywhere instantaneously). In other words, the "domain of dependence" is everywhere. But I don't know if that means these effects are substantial. Obviously in reality, far away from something quiet that's happening, you don't notice it, even if the sound waves had time to reach you. In practice, this means that doing incompressible fluid dynamics requires the solution of an elliptic PDE, which can be a pain for reasons unrelated to turbulence.)

comment by Anders_H · 2015-06-26T15:49:42.476Z · LW(p) · GW(p)

I disagree that you can get an understanding of the idea that "correlation does not imply causation" from Stats 101. I don