Rationality is about pattern recognition, not reasoning

post by JonahSinick · 2015-05-26T19:23:05.879Z · score: 28 (37 votes) · LW · GW · Legacy · 82 comments

Contents

  Long version
82 comments

Short version (courtesy of Nanashi)

Our brains' pattern recognition capabilities are far stronger than our ability to reason explicitly. Most people can recognize cats across contexts with little mental exertion. By way of contrast, explicitly constructing a formal algorithm that can consistently recognize cats across contexts requires great scientific ability and cognitive exertion.

Very high level epistemic rationality is about retraining one's brain to be able to see patterns in the evidence in the same way that we can see patterns when we observe the world with our eyes. Reasoning plays a role, but a relatively small one. Sufficiently high quality mathematicians don't make their discoveries through reasoning. The mathematical proof is the very last step: you do it to check that your eyes weren't deceiving you, but you know ahead of time that your eyes probably weren't deceiving you.

I have a lot of evidence that this way of thinking is how the most effective people think about the world. I would like to share what I learned. I think that what I've learned is something that lots of people are capable of learning, and that learning it would greatly improve people's effectiveness. But communicating the information is very difficult.

It took me 10,000+ hours to learn how to "see" patterns in evidence in the way that I can now. Right now, I don't know how to communicate how to do it succinctly. In order to succeed, I need collaborators who are open to spending a lot of time thinking carefully about the material, to get to the point of being able to teach others. I'd welcome any suggestions for how to find collaborators.

Long version

For most of my life, I believed that epistemic rationality was largely about reasoning carefully about the world. I frequently observed people's intuitions leading them astray. I thought that what differentiated people with high epistemic rationality was Cartesian skepticism: the practice of carefully scrutinizing all of one's beliefs using deductive-style reasoning.

When I met Holden Karnofsky, co-founder of GiveWell, I came to recognize that Holden's general epistemic rationality was much higher than my own. Over the course of years of interaction, I discovered that Holden was not using my style of reasoning. Instead, his beliefs were backed by lots of independent small pieces of evidence, which in aggregate sufficed to instill confidence, even if no individual piece of evidence was compelling by itself. I finally understood this in 2013, and it was a major epiphany for me. I wrote about it in two posts [1], [2].
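To make the "many weak arguments" picture concrete, here is a minimal sketch of how independent weak pieces of evidence can add up. It assumes the pieces are conditionally independent, so that their likelihood ratios multiply; the function name `combine_independent_evidence` and all of the numbers are hypothetical, chosen only for illustration, and this is not a description of how Holden actually forms his beliefs.

```python
import math


def combine_independent_evidence(prior_prob, likelihood_ratios):
    """Posterior probability after updating on independent pieces of evidence.

    Assumes the pieces are conditionally independent given the hypothesis,
    so their log likelihood ratios simply add (naive-Bayes style).
    """
    log_odds = math.log(prior_prob / (1.0 - prior_prob))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical numbers: a 50% prior and weak pieces of evidence, each only
# 1.5 times likelier if the claim is true than if it is false.
one_piece = combine_independent_evidence(0.5, [1.5])
ten_pieces = combine_independent_evidence(0.5, [1.5] * 10)

print(f"one piece:  {one_piece:.2f}")   # 0.60
print(f"ten pieces: {ten_pieces:.2f}")  # 0.98
```

Under these assumptions no single piece moves the estimate much, but ten of them together do. The independence assumption carries most of the weight: correlated pieces of evidence would justify far less confidence.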

After learning data science, I realized that my "many weak arguments" paradigm was also flawed: I had greatly overestimated the role that reasoning of any sort plays in arriving at true beliefs about the world. 

In hindsight, it makes sense. Our brains' pattern recognition capabilities are far stronger than our ability to reason explicitly. Most people can recognize cats across contexts with little mental exertion. By way of contrast, explicitly constructing a formal algorithm that can consistently recognize cats across contexts requires great scientific ability and cognitive exertion. And the best algorithms that people have constructed (within the paradigm of deep learning) are highly nontransparent: nobody's been able to interpret their behavior in intelligible terms.

Very high level epistemic rationality is about retraining one's brain to be able to see patterns in the evidence in the same way that we can see patterns when we observe the world with our eyes. Reasoning plays a role, but a relatively small one. If one has developed the capacity to see in this way, one can construct post hoc explicit arguments for why one believes something, but these arguments aren't how one arrived at the belief.  

Over 100 years ago, the great mathematician Henri Poincaré hinted at what I finally learned. He described his experience discovering a concrete model of hyperbolic geometry as follows:

“I left Caen, where I was living, to go on a geological excursion under the auspices of the School of Mines. The incidents of the travel made me forget my mathematical work. Having reached Coutances, we entered an omnibus to go to some place or other. At the moment when I put my foot on the step, the idea came to me, without anything in my former thoughts seeming to have paved the way for it, that the transformations I had used to define the Fuchsian functions were identical with those of non-Euclidean geometry. I did not verify the idea; I should not have had time, as upon taking my seat in the omnibus, I went on with a conversation already commenced, but I felt a perfect certainty. On my return to Caen, for conscience' sake, I verified the result at my leisure.”

Sufficiently high quality mathematicians don't make their discoveries through reasoning. The mathematical proof is the very last step: you do it to check that your eyes weren't deceiving you, but you know ahead of time that your eyes probably weren't deceiving you. Given that this is true even in math, which is thought of as the most logically rigorous subject, it shouldn't be surprising that the same is true of epistemic rationality across the board.

Learning data science gave me a deep understanding of how to implicitly model the world in statistical terms. I've crossed over into a zone of no longer knowing why I hold my beliefs, in the same way that I don't know how I perceive that a cat is a cat. But I know that it works. It's radically changed my life over a span of mere months. Amongst other things, I finally identified a major blind spot that had underpinned my near total failure to achieve my goals between ages 18 and 28.

I have a lot of evidence that this way of thinking is how the most effective people think about the world. Here I'll give two examples. Holden worked under Greg Jensen, the co-CEO of Bridgewater Associates, which is the largest hedge fund in the world. Carl Shulman is one of the most epistemically rational members of the LW and EA communities. I've had a number of very illuminating conversations with him, and in hindsight, I see that he probably thinks about the world in this way. See Luke Muehlhauser's post Just the facts, ma'am! for hints of this. If I understand correctly, Carl correctly estimated Mark Zuckerberg's future net worth as being $100+ million upon meeting him as a freshman at Harvard, before Facebook. 

I would like to share what I learned. I think that what I've learned is something that lots of people are capable of learning, and that learning it would greatly improve people's effectiveness. But communicating the information is very difficult. Abel Prize winner Mikhail Gromov wrote:

We are all fascinated with structural patterns: periodicity of a musical tune, a symmetry of an ornament, self-similarity of computer images of fractals. And the structures already prepared within ourselves are the most fascinating of all. Alas, most of them are hidden from ourselves. When we can put these structures-within-structures into words, they become mathematics. They are abominably difficult to express and to make others understand. 

It took me 10,000+ hours to learn how to "see" patterns in evidence in the way that I can now. Right now, I don't know how to communicate how to do it succinctly. It's too much for me to do as an individual: as far as I know, nobody has ever been able to convey the relevant information to a sizable audience!

In order to succeed, I need collaborators who are open to spending a lot of time thinking carefully about the material, to get to the point of being able to teach others. I'd welcome any suggestions for how to find collaborators.

82 comments

Comments sorted by top scores.

comment by Nanashi · 2015-05-26T20:07:53.785Z · score: 7 (7 votes) · LW · GW

I'd be glad to offer what help I can. Based on other posts of yours, I would definitely practice brevity. This post is over 1000 words long and easily could be condensed to 250 or less.

comment by Nanashi · 2015-05-26T20:26:28.512Z · score: 8 (8 votes) · LW · GW

Per our email exchange, here is the condensed version that uses only your original writing:

"Our brains' pattern recognition capabilities are far stronger than our ability to reason explicitly. Most people can recognize cats across contexts with little mental exertion. By way of contrast, explicitly constructing a formal algorithm that can consistently recognize cats across contexts requires great scientific ability and cognitive exertion.

Very high level epistemic rationality is about retraining one's brain to be able to see patterns in the evidence in the same way that we can see patterns when we observe the world with our eyes. Reasoning plays a role, but a relatively small one. Sufficiently high quality mathematicians don't make their discoveries through reasoning. The mathematical proof is the very last step: you do it to check that your eyes weren't deceiving you, but you know ahead of time that your eyes probably weren't deceiving you.

I have a lot of evidence that this way of thinking is how the most effective people think about the world. I would like to share what I learned. I think that what I've learned is something that lots of people are capable of learning, and that learning it would greatly improve people's effectiveness. But communicating the information is very difficult.

It took me 10,000+ hours to learn how to "see" patterns in evidence in the way that I can now. Right now, I don't know how to communicate how to do it succinctly. In order to succeed, I need collaborators who are open to spending a lot of time thinking carefully about the material, to get to the point of being able to teach others. I'd welcome any suggestions for how to find collaborators."

Notes:

  • I removed all the quotations. Although I'm guessing they were probably key to your own understanding of the issue, I don't think they are an efficient way to improve other people's understanding.
  • Much of the post was dedicated (unnecessarily) to why your viewpoint is right rather than just stating your viewpoint. People who agree with you don't need to be convinced. People who disagree with you aren't going to be swayed by your arguments.
  • I removed a few paragraphs that repeated themselves.

comment by Vaniver · 2015-05-26T21:16:56.137Z · score: 16 (16 votes) · LW · GW

While I agree that there's value to being able to state the summary of the viewpoint, I can't help but feel that brevity is the wrong approach to take to this subject in particular. If the point is that effective people reason by examples and seeing patterns rather than by manipulating logical objects and functions, then removing the examples and patterns to just leave logical objects and functions is betraying the point!

Somewhat more generally, yes, there is value in telling people things, but they need to be explained if you want to communicate with people that don't already understand them.

comment by Nanashi · 2015-05-26T21:44:10.337Z · score: 2 (2 votes) · LW · GW

I definitely agree that you shouldn't be so brief as to not get your point across, but I think the level of brevity depends on what your goal is. In this case, he's asking for help. It isn't until 1,500 words in that the two most important questions, "What does he want?" and "Why should I help him?", are answered.

(Besides, he specifically wanted help in communicating things succinctly.)

comment by [deleted] · 2015-05-28T16:13:45.743Z · score: 1 (1 votes) · LW · GW

The post reminded me of The Creative Mind by Margaret Boden; her examples, in particular Kekulé seeing the benzene ring, seem relevant here. (Although the book definitely could be shorter:)

comment by Nanashi · 2015-05-26T20:50:48.121Z · score: 6 (6 votes) · LW · GW

Here is the even-further edited version, condensed to 150 words.

I have a lot of evidence that the most effective people in the world have a very specific way of thinking. They use their brain's pattern-matching abilities to process the world, rather than using explicit reasoning.

Our brain can pattern match much more efficiently than it can reason. Most people can recognize a cat very easily. But creating an algorithm to recognize cats is far more difficult. And breakthroughs of any kind are very rarely made via explicit reasoning, but rather through a complex and rapid-fire combination of ideas.

Doing this is something that many people are capable of learning. But it took me 10,000+ hours to learn how to "see" the world the way that I can now, and I do not know how to communicate this process succinctly. In order to help people, I need collaborators who are willing to help clarify my thoughts. I'd welcome any suggestions.

You'll note it very quickly gets to the three main points:

  • What are you talking about?
  • Why should we listen to you?
  • What do you want?

Let me know if I summarized any part of your thoughts incorrectly.

comment by JonahSinick · 2015-05-26T20:50:37.271Z · score: 5 (5 votes) · LW · GW

Thanks very much, both for the shortened version and for the notes. I added the shortened version at the top of my post.

comment by Nanashi · 2015-05-26T20:52:30.319Z · score: 3 (3 votes) · LW · GW

Not a problem at all. What you're talking about is something I believe in, so I'm glad to help.

comment by Grothor · 2015-05-28T07:07:22.066Z · score: 2 (2 votes) · LW · GW

I do not think the entire post was too long, but I do think reading the short version first was helpful. It's sort of like reading an abstract before diving into a journal article. If nothing else, it helps people who are uninterested save some time.

People who agree with you don't need to be convinced. People who disagree with you aren't going to be swayed by your arguments.

I'm not convinced this is true, but regardless, what about people who neither agree nor disagree? To a large extent, explaining why your viewpoint is right is exactly the same thing as explaining in detail what your viewpoint is.

comment by Lumifer · 2015-05-27T17:26:01.949Z · score: 6 (6 votes) · LW · GW

An interesting post. You started with the assumption that formal reasoning is the right way to go and found out that it's not necessarily so. Let me start from the opposite end: the observation that the great majority of people reason all the time by pattern-matching; this is the normal, default, bog-standard way of figuring things out.

You do not need to "retrain" people to think in patterns -- they do so naturally.

Looking at myself, I certainly do think in terms of patterns -- internal maps and structures. Typically I carry a more-or-less coherent map of the subject in my head (with certain areas being fuzzy or incomplete, that's fine) and the map is kinda-spatial. When a new piece of data comes in, I try to fit it into the existing (in my head) structure and see if it's a good fit. If it's not a good fit, it's like a pebble in a shoe -- an irritant and an obvious problem. The problem is fixed either by reinterpreting the data and its implications, or by bending and adjusting the structure so there is a proper place for the new data nugget. Sometimes both happen.

Formal reasoning is atypical for me, that's why I'm not that good at math. I find situations where you have enough hard data to formally reason about it to be unusual and rare (that would probably be different if I were an engineer or an accountant :-D). Most often you have stochastic reasoning with probability distributions and conditional outcomes and that is amenable to analysis only at low levels. At high enough levels you're basically back to pattern recognition, ideally with some support from formal reasoning.

In any case, I'm not sure why you think that teaching people to think in patterns will be hard or will lead to major jumps in productivity. People already do this, all the time. Essentially you are talking about unlearning the reliance on formalism, which is applicable to very few.

comment by JonahSinick · 2015-05-27T18:18:18.884Z · score: 1 (1 votes) · LW · GW

The reference class that I've implicitly had in mind in writing my post is mathematicians / LWers / EAs, who do seem to think in the way that I had been. See my post Many weak arguments and the typical mind.

People outside of this reference class generally use implicit statistical models that are not so great. For such people, the potential gains come from learning how to build much better implicit statistical models (as I did as a result of my exposure to data science). I don't know whether learning more advanced statistics would work for you personally, but for me, it was what I needed. Historically, most people who have very good implicit statistical models seem to have learned by observing others who do. But it can be hard to get access to them (e.g. I would not have been able to connect with Greg Jensen, Holden's former boss, during my early 20s, as Holden did).

comment by Lumifer · 2015-05-27T18:55:48.506Z · score: 0 (0 votes) · LW · GW

mathematicians / LWers / EAs

Mathematicians, yes, but that's kinda natural because people become good mathematicians precisely by the virtue of being very good at formal reasoning. But I don't know about LW/EA in general -- I doubt most of them have "mathematical minds".

People outside of this reference generally use implicit statistical models that are not so great.

Really? Math geeks/LW/EA are the creme de la creme, the ultimate intellectual elite? I haven't noticed. "Normal" people certainly don't have great thinking skills. But there is a very large number of smart and highly successful people who are outside of your reference class. They greatly outnumber the math/LW/EA crowd.

comment by JonahSinick · 2015-05-27T21:07:43.356Z · score: 3 (3 votes) · LW · GW

But I don't know about LW/EA in general -- I doubt most of them have "mathematical minds".

Within the LW cluster I've seen a lot of focus on precision. It's not uncommon for people in the community to miss the main points that I'm trying to make in favor of focusing on a single sentence that I wrote that seems wrong. I have seldom had this experience in conversation with people outside of the LW cluster: my conversation partners outside of the LW cluster generally hold my view that it's inevitably the case that one will say some things that are wrong, and that it's best to focus on the main points that someone is trying to make.

Really? Math geeks/LW/EA are the creme de la creme, the ultimate intellectual elite? I haven't noticed. "Normal" people certainly don't have great thinking skills. But there is a very large number of smart and highly successful people who are outside of your reference class. They greatly outnumber the math/LW/EA crowd.

By "generally" I meant "most people," not "for a fixed person" – i.e. I don't necessarily disagree with you.

Separately, I believe that a large fraction of transferable human capital is in fact in elite math and physics, but that's a long conversation. My impression is that good physicists do use the style of thinking that I just learned. In the case of elite mathematicians, I think it would take like 5 years of getting up to speed with real world stuff before their strength as thinkers started to come out vividly.

comment by Good_Burning_Plastic · 2015-05-28T14:06:18.568Z · score: -1 (1 votes) · LW · GW

great majority of people

Of people worldwide, or of people reading this post? Considering the former leads to this failure mode.

comment by Lumifer · 2015-05-28T15:02:27.199Z · score: 1 (1 votes) · LW · GW

Of people worldwide, or of people reading this post?

Both.

Mathematicians are weird people, they think differently :-) I don't think most of LW is mathematicians.

comment by VoiceOfRa · 2015-06-02T03:07:46.085Z · score: 0 (0 votes) · LW · GW

As a mathematician I can testify that even most mathematicians think in maps.

comment by paulfchristiano · 2015-05-27T06:13:49.836Z · score: 6 (6 votes) · LW · GW

I agree that a picture of many weak arguments supporting or undermining explicit claims does not capture what humans do: the inferences themselves are much more complex than logical deductions, such that we don't yet know any way of representing the actual objects that are being manipulated. I think this is the mainstream view, certainly in AI now.

I don't know what it means to say that our pattern recognition capabilities are stronger than our logical reasoning; they are two different kinds of cognitive tasks. It seems like saying that we are much better at running fast than lifting heavy objects. Sometimes you can do a task in one way or the other, and we might say that one or the other is a better way to get something done. And we can compare to other animals, or to machines, and talk about comparative advantage. And so on.

Perhaps the most relevant claim would be asking what fraction of variance in outcomes is described by one characteristic or another, or in which domain is practice most helpful. I think that explicit principles about how to reason do not distinguish good mathematicians from each other, though they may distinguish mathematicians at different times. The situation seems similar in most endeavors. I think this is because it is so much easier to transfer explicit information between people, so the residual is what differentiates. Learning the explicit info is still the right first step to mastery, though it's not most of the work.

Improving explicit info and norms seems to be the main way that we progress as a civilization, because that's what you can build on, share, and so on. But of course, those can be explicit norms about how to train any part of reasoning, and I am pretty agnostic about what kind of reasoning is best to try to improve.

Overall I feel like there is a broad version of your thesis that I think is clearly true. You are no doubt making a more specific claim. I'm interested to see it fleshed out. I can easily see disagreeing with it. If I do, it's probably because I have a broader sense of / distribution over ways we might use our brain. For example:

"Very high level epistemic rationality is about retraining one's brain to be able to see patterns in the evidence in the same way that we can see patterns when we observe the world with our eyes."

is a specific way we can use our built-in pattern recognition; one could list perhaps half a dozen similar tactics at a similar level of abstraction. I will be interested to see if you have evidence that distinguishes between these principles and singles out one as most important. (You could also interpret your quote in a broad way, in which case I think it is very clear. So I guess I should just wait to discuss!)

Note that the accuracy of this kind of literal analogy, or how far you can take that analogy, is a question that researchers in computer science and AI explicitly discuss. Most everyone agrees that there is some of this, but I think there is legitimate uncertainty about how much.

Also, when you say "don't make their discoveries through reasoning," it's not exactly clear what this means. This might also be something that I disagree with strongly, or it may be something that I think is very clear (or maybe somewhere in between). Logic plays a key role in the mathematician's reasoning. None of the steps the mathematician makes are valid inferences in any formal proof system (but neither are any steps of most published proofs!), though often she will write or think pairs of statements that are in fact related to each other in a logically precise way, and the existence of such relationships is a key part of why the cognitive process yields correct results.

comment by [deleted] · 2015-05-27T07:37:54.278Z · score: 5 (5 votes) · LW · GW

This is interesting. I have found that when you are like 16, you often want everything to be super logical and everything that is not feels stupid. And growing up largely means accepting "common sense", which at the end of the day means relying more on pattern recognition. (This is also politically relevant - young radicalism is often about matching everything with a logical sounding ideology, while people when they grow and become more moderate simply care about what typical patterns tend to result in human flourishing more than about ideology.)

There is something in pattern recognition that feels pretty "conservative" in the not-too-political sense. Logical reasoning is an individualistic thing, you can make your own philosophy of things especially if you are young and feel you are so much smarter than everybody else. But if you treat your brain as a pattern sensor, it is different.

First of all, experience matters more than sheer brightness. You start to think less and less that old people are dinosauric fools and respect them more and more. (Of course the right kind of experience matters more than just clocking in a lot of birthdays. There are 19-year-old guys who are so fanatic about cars that they spend day and night working on them and probably know better than your dad does why your car does not run well.)

Second, throwing a lot of brains at the patterns, i.e. actually listening to other people's opinions, starts to look good. It scales differently. You can feel you can out-reason and out-logic a thousand people because logic is not additive. But being a sensor is. When hunting for a detail, you cannot out-see two thousand eyeballs. So you start to respect other people's opinions more.

Third, there are depositories of recognized patterns. They are usually called best practices, accepted practices or even traditions. They start to matter.

It is a very sobering experience, and for me it was kind of painful (because humiliating, deflating), it happened between 21 and 26.

It turns people more conservative in the not-so-political sense and it is probably a good thing; at least I think it made me better off, although it was painful. For example, in architecture, do you value bravely tradition-bucking original design, or do you value traditional pattern-book architecture? Scruton argues the latter is more likely to create an environment in which people feel good.

comment by JonahSinick · 2015-05-27T08:06:36.099Z · score: 1 (1 votes) · LW · GW

Much of what you say resonates with me. I think that a major problem that very smart young people often have is not meeting older counterparts of themselves. The great mathematician Don Zagier was an extreme prodigy, progressing so rapidly that he earned his PhD at age 20. But despite the fact that he possessed immense innate ability, he needed to learn from a great mathematician in order to become one. He wrote:

My first real teacher was in my third year of graduate school, Friedrich Hirzebruch. It was through him that I began to think like a real mathematician. This is something you can't teach yourself but have to learn from a master.

There's some overlap between what you write and Nick Beckstead's post Common sense as a prior, which I recommend if you haven't read it before.

I went through something similar to what you went through, but for me it has a happy ending – it's not that my ideas were wrong all along, it's that I hadn't yet learned how to integrate them with the wisdom of people who were older than me. I suspect that something similar is true of you to some degree as well.

comment by Gondolinian · 2015-06-01T20:29:29.442Z · score: 0 (0 votes) · LW · GW

I have found that when you are like 16, you often want everything to be super logical and everything that is not feels stupid. And growing up largely means accepting "common sense", which at the end of the day means relying more on pattern recognition.

For a counterexample, I am 16 and almost all my decisions/perceptions are based on implicit pattern recognition more than explicit reasoning.

ETA: I think I missed your point.

comment by [deleted] · 2015-06-02T07:46:52.006Z · score: 1 (1 votes) · LW · GW

My point is that I was like this guy; you probably aren't.

comment by Wei_Dai · 2015-05-27T06:33:00.700Z · score: 4 (4 votes) · LW · GW

Very high level epistemic rationality is about retraining one's brain to be able to see patterns in the evidence in the same way that we can see patterns when we observe the world with our eyes.

Can you explain a bit more why you think that the way people with very high level epistemic rationality process evidence is analogous to how we recognize visual patterns? Do you think these two mental processes are fundamentally using the same algorithms, or just that both are subconscious computations that we don't understand very well?

comment by JonahSinick · 2015-05-27T08:16:34.477Z · score: 3 (3 votes) · LW · GW

I'm going on the one learning algorithm hypothesis that's become popular in parts of the neuroscience community, together with my subjective impressions. Intuitively, it seems parsimonious to suppose that evolution hijacked the algorithms used for sensory processing for general intelligence. I don't think that the algorithms are the same, but I would guess that they're relatives.

comment by Wei_Dai · 2015-05-28T23:01:48.301Z · score: 6 (6 votes) · LW · GW

Interesting. I found one paper that explains the one learning algorithm hypothesis and gives evidence for it. Quoting from it:

There seems to be some evidence for a single algorithm explaining the computations performed by the primary auditory, visual, motor, and somatosensory cortices. Given how little is known about higher-level processing immediately downstream of these primary areas, it is premature to generalize to other areas located in the occipital lobe. Less is known about the details of the computations performed in the areas located in anterior cortex and, in particular, the prefrontal cortex, which is disproportionately enlarged in humans when compared to non-human primates.

Is there anything more up to date or comprehensive than this paper?

This tangent aside, I agree that it would be really valuable to improve the way we process evidence subconsciously. I'm a bit skeptical that you've actually found such a method, but I hope that you succeed in writing it down and that it really works.

comment by Kaj_Sotala · 2015-06-08T15:47:51.545Z · score: 2 (2 votes) · LW · GW

Our Coalescing Minds paper had the one learning algorithm hypothesis as one of its assumptions; I wasn't the neuroscience expert, but my co-author was, and here's what he wrote about that premise (note that the paper was intended for a relatively popular audience, so the neuroscience detail was kept light):

An adult human neocortex consists of several areas which are to varying degrees specialized to process different types of information. The functional specialization is correlated with the anatomical differences of different cortical areas. Although there are obvious differences between areas, most cortical areas share many functional and anatomical traits. There has been considerable debate on whether cortical microcircuits are diverse or canonical [Buxhoeveden & Casanova, 2002; Nelson, 2002] but we argue that the differences are variations of the same underlying cortical algorithm, rather than entirely different algorithms. This is because most cortical areas seem to have the capability of processing any type of information. The differences seem to be a matter of optimization to a specific type of information, rather than a different underlying principle.

The cortical areas do lose much of their plasticity during maturation. For instance, it is possible to lose one’s ability to see colors if a specific visual cortical area responsible for color vision is damaged. The adult brain is not plastic enough to compensate for this damage, as the relevant regions have already specialized to their tasks. If the same brain regions were to be damaged during early childhood, color blindness would most likely not result.

However, this lack of plasticity reflects learning and specialization during the lifespan of the brain rather than innate algorithmic differences between different cortical areas. Plenty of evidence supports the idea that the different cortical areas can process any spatiotemporal patterns. For instance, the cortical area which normally receives auditory information and develops into the auditory cortex will develop visual representations if the axons carrying auditory information are surgically replaced by axons carrying visual information from the eyes [Newton & Sur, 2004]. The experiments were carried out with young kittens, but a somewhat similar sensory substitution is seen even in adult humans: relaying visual information through a tactile display mounted on the tongue will result in visual perception [Vuillerme & Cuisiner, 2009]. What first feels like tickling in the tongue will start feeling like seeing. In other words, the experience of seeing is not in the visual cortex but in the structure of the incoming information.

Another example of the mammalian brain’s ability to process any type of information is the development of trichromatic vision in mice that, like mammalian ancestors, normally have a dichromatic vision [Jacobs et al., 2007]. All it takes for a mouse to develop primate-like color vision is the addition of a gene encoding the photopigment which evolved in primates. When mice are born with this extra gene, their cortex is able to adapt to the new source information from the retina and to make sense of it. Even the adult cortical areas of humans can be surprisingly adaptive as long as the changes happen slowly enough [Feuillet et al., 2007]. Finally, Marzullo et al. [2010] demonstrated that rats implanted with electrodes both in their motor and visual cortices can learn to modulate the output from their motor cortex based on feedback given to visual cortex.

comment by jacob_cannell · 2015-06-13T21:26:40.912Z · score: 1 (1 votes) · LW · GW

The paper you linked to about the one learning algorithm hypothesis is from 2012. Since that time the theory has gained significant strength from the advances in DL, and in particular the work on deep reinforcement learning. Proving that an ANN with a relatively simple initial/prior architecture and about 1 million neurons can reach human-level performance on a set of 100 games when trained end to end with RL is pretty strong (albeit indirect) evidence for the one learning hypothesis.

One key remaining question is then: how does the brain actually implement approximate optimization/learning that is at least as good as back-prop? We know that back-prop is not biologically realistic. On that front, Bengio's group has made significant recent progress with a new technique/theory called target propagation [1], which originated in part as an explanation for how the brain could implement credit assignment, but it also shows promise as a potential replacement for backprop [2], which further increases the biological plausibility.

In terms of more direct evidence, the hippocampus in particular appears to have a simple explanation in terms of reinforcement learning [3].

In terms of the prefrontal cortex in particular, there are working theories that explain much of the PFC as a set of modules specialized for working memory buffers that are controlled by gating units in the basal ganglia. That whole system in particular is also driven/learned through dopamine based RL.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2015-06-07T07:16:03.734Z · score: 3 (3 votes) · LW · GW

You have not understood correctly regarding Carl. He claimed, in hindsight, that Zuckerberg's potential could've been distinguished in foresight, but he did not do so.

comment by JonahSinick · 2015-06-07T08:00:45.745Z · score: 2 (2 votes) · LW · GW

I'm puzzled, is there a way to read his comment

People described him to me as resembling a young Bill Gates. His estimated expected future wealth based on that data if pursuing entrepreneurship, and informed by the data about the relationship of all of the characteristics I could track with it, was in the 9-figure range. Then add in that facebook was a very promising startup (I did some market sizing estimates for it, and people who looked at it and its early results were reliably impressed).

other than as him doing it at the time?

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2015-06-07T19:50:33.700Z · score: 1 (1 votes) · LW · GW

Yes, as his post facto argument.

comment by [deleted] · 2015-05-29T12:15:46.728Z · score: 3 (3 votes) · LW · GW

If I understand correctly, Carl correctly estimated Mark Zuckerberg's future net worth as being $100+ million upon meeting him as a freshman at Harvard, before Facebook.

Well, if I understand the post correctly, even as a freshman, Mark apparently had previous experience with owning/running a business, and was deliberately trying to become a tech entrepreneur. Now, given that someone is from a privileged family, is attending school at (almost) the maximally privileged and well-connected institution (at least on the East Coast) for wannabe rich guys, has previous experience with business by the time he reaches age 18, possesses enough intelligence to be going to school at Harvard (which, despite being partly a privilege club, still requires genuine intelligence and work-ethic to get into), and is clearly driven to become a rich tech guy... it's not that unrealistic to say that he would, eventually, achieve his goal.

It just requires conditioning on a bunch of facts that most people don't know about Mark Zuckerberg. But once you do condition on the facts that were available at the time... it all adds up to normality.

But anyway, to address the heart of the post... "pattern-matching" and inductive reasoning are similar but not identical. AFAIK, the human mind performs probabilistic causal induction on whatever data "makes it through" the sensory cortices, which are quite possibly but not surely doing something like independent component analysis via unsupervised neural learning... so yeah.

Most people have some solid intuitions about how probabilistic causal induction works, not in quantitative terms but in terms of "What happens if I do this?". That's the whole reason TVTropes exists. The problem is that we tell people only special little causal models called "logic" or "debate rules" or, God help the poor victims, "philosophy" are invested with the Special Epistemic Normative Power of telling you true things, rather than specifically training the inductive faculties we really rely on, the faculty of spotting what reality is doing by looking.

comment by Morendil · 2015-05-28T08:13:21.348Z · score: 2 (2 votes) · LW · GW

I'd welcome any suggestions for how to find collaborators.

Keep posting the material here. Post to Main. Don't worry about it not being polished enough: you'll get plenty of feedback. Ignore feedback that isn't useful to you.

comment by RyanCarey · 2015-05-27T23:01:25.375Z · score: 2 (2 votes) · LW · GW

Thanks for the post Jonah.

In medical school, I was taught that when you're a novice doctor, you'll make diagnoses and plans using deliberative reasoning, but that experts eventually pattern-match everything.

If that's true, then pattern-matching might arise naturally with experience, or it might be something that's difficult to achieve in many domains at once.

When I read your article, the reasons that I might doubt that you deserve collaborators are:

1) that enthusiastic self-reports of special perceptual-cognitive abilities have a low prior probability
2) that you lack mechanistic explanations of things you did that led to you perceiving some types of patterns more easily

Also, it seems almost like a parody that you spent 10,000+ hours learning to see patterns in evidence. Like, I thought you might in the next line say 'gotcha, that's describing how I learnt English as a child'! Do you mean you learnt patterns, or you learnt how to learn patterns? And if you want others to help you to teach the material, you'll almost certainly need to start by teaching pieces of it yourself!

Putting all of this to the side, I enjoyed the post - thanks for writing it - and am interested to see what you come up with.

comment by Dr_Manhattan · 2015-05-27T16:47:45.953Z · score: 2 (2 votes) · LW · GW

Some contrary evidence about usefulness of explicit models: http://www.businessinsider.com/elon-musk-first-principles-2015-1

My take is that you need both, some things are understood better "from first principles" (engineering) others are more suitable for pattern matching (politics).

comment by JonahSinick · 2015-05-27T18:41:39.497Z · score: 1 (1 votes) · LW · GW

Yes, as I say in another comment, my sense had been that what works best is 50% intuition and 50% explicit reasoning, and now I think it's more like 95% vs 5%. If you're spending all of your time thinking, that still leaves roughly an hour a day for explicit reasoning, which is substantially more than usual.

comment by btrettel · 2015-05-29T01:37:46.124Z · score: 0 (0 votes) · LW · GW

I think there might be some confusion over terms here. I don't think "pattern matching" is the best way to phrase this.

Musk seems to be arguing for "rule learning" (figuring out the underlying rule) as opposed to "example learning" (interpolating to the nearest example in your collection). In the book Make it Stick, the authors mention that rule learners tend to be better learners. (These terms come from the psychological literature.)
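To make the distinction concrete, here is a minimal sketch of the two styles of learner. Everything in it is hypothetical and chosen for illustration: the toy data, the hidden rule y = 2x + 1, and the function names are assumptions, not anything taken from Make it Stick. The "example learner" answers with the nearest stored example, while the "rule learner" fits a line and extrapolates from it.

```python
# Hypothetical toy data generated by the hidden rule y = 2*x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def example_learner(x, examples):
    """Answer with the y value of the nearest stored example (no rule extracted)."""
    _, nearest_y = min(examples, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def fit_rule(examples):
    """Least-squares fit of y = slope * x + intercept to the examples."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rule_learner(x, rule):
    """Apply the fitted rule to a new input."""
    slope, intercept = rule
    return slope * x + intercept


rule = fit_rule(examples)
far_x = 10.0  # well outside the range covered by the stored examples

print(example_learner(far_x, examples))  # reuses the answer for x = 3
print(rule_learner(far_x, rule))         # extrapolates along the fitted line
```

Inside the range of the stored examples the two learners give similar answers; the difference shows up only on inputs far from anything already seen, which is roughly the situation where first-principles reasoning is supposed to help.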

I don't think this observation is incompatible with the importance of recognizing patterns. You need to "pattern match" which rule to invoke. You also need to recognize the pattern that is the rule in the first place. Recognizing which examples to use also could be pattern matching, too, so this is why I don't think the term is right.

In the same book mentioned previously, the authors write about Kahneman's systems 1 and 2, and I got the impression that mastery often is moving things from system 2 (more careful reasoning) to system 1 (automatic pattern matching, which might simply be precomputed). Here's an example: Vaniver suggested to me before that (if I recall correctly) when playing chess, someone might not explicitly consider a certain number of moves; their brain just has a map that goes from the current state of the board and other information to their next move. Developing this ability requires recognizing the right patterns in the game, which could come from simply having a large library of examples to interpolate from, or whatnot. This is precisely what I thought of when I read that it took (the famous) 10,000 hours for JonahSinick to see the patterns.

(To be fair, you do need both, but it seems that if you can develop good rules, you should use them. Also, developing accurate intuition is useful, whether it uses explicit rules or not.)

comment by ChristianKl · 2015-05-27T16:59:34.509Z · score: 0 (0 votes) · LW · GW

Musk is very interesting in this regard. He didn't start SpaceX and Tesla because he reasoned himself into those projects having a high chance of commercial success.

He chose them because he believed in those goals. He's driven by passion towards those goals.

comment by Dr_Manhattan · 2015-05-27T19:20:59.673Z · score: 0 (0 votes) · LW · GW

Even if I agree with you on the goals (I can claim he used meta-rationality here, in the sense that someone should try to make humans an interplanetary species, even if he thought his chance of success was less than 50%), a lot of the thinking that made him arrive at SpaceX seemed to be "one can actually do this way cheaper than the currently accepted standards, based on cost of materials etc."

comment by ChristianKl · 2015-05-27T23:46:06.308Z · score: 0 (0 votes) · LW · GW

I don't think Jonah or I argue that you should never make calculations. Musk did make many decisions on that path and from the outside it's hard to get an overview of what drives which decision.

comment by Burgundy · 2015-05-27T04:38:39.891Z · score: 2 (4 votes) · LW · GW

What you are describing is my native way of thinking. My mind fits large amounts of information together into an aesthetic whole. It took me a while to figure out that other people don't think this way, and they can't easily just absorb patterns from evidence.

This mode of thinking has been described as Introverted Thinking in Ben Kovitz's obscure psychology wiki about Lenore Thomson's obscure take on Jungian psychology. Some of you are familiar with Jungian functions through MBTI, the Myers-Briggs Type Indicator. Introverted Thinking (abbreviated Ti) is the dominant function of the INTP type.

It will only take a few quotes to illustrate why you are talking about the same thing:

Introverted Thinking (Ti) is the attitude that beneath the complexity of what is manifest (apparent, observed, experienced) there is an underlying unity: a source or essence that emerges and takes form in different ways depending on circumstances. What is manifest is seen as a manifestation of something. From a Ti standpoint, the way to respond to things is in a way that is faithful to that underlying cause or source and helps it emerge fully and complete, without interference from any notion of self. The way to understand that underlying essence is to learn to simultaneously see many relationships within what is manifest, to see every element in relation to every other element, the relationships being the "signature" of the underlying unity. This can only be experienced directly, not second-hand.

Introverted thinking is a form of mental representation in which every input, every variable, every aspect of things is considered simultaneously and holistically to perceive causal, mathematical, and aesthetic order. What you know by Ti, you know with your hands, your eyes, your muscles, even a tingling sensation "downstairs" because you sense that everything fits. Every variable is fair game to vary, every combination of variables worthy of consideration; the only ultimate arbiter is how well the parts form a unified whole rather than a jumble.

Introverted Thinking (Ti) is contrasted with Extraverted Thinking (Te):

From the Te perspective, anything for which you can't give an operational definition in terms of measurement (an "objective test") doesn't exist. The decision criteria are defined not exactly in terms of the things: they're defined in terms of observations of a sort that anyone can do and get the same result. You put the totality of the real-world situation onto your scales, so that all causal factors come into play--both known and unknown. What's accessible to you is the reading on the scale: that and only that is the basis for your decision.

As a dominant function, Te typically leads one to pursue and collect reliable ways of making decisions to get predictable results. The repeatability of a process becomes one of the main criteria for finding it valuable. Repeatable processes are valuable from a Te perspective because they enable you to make agreements with other people, where there is no doubt as to whether each party has fulfilled its part of the agreement. Making and delivering on promises is often how a Te attitude leads one to understand ethics.

Introverted Thinking about language:

From the Ti standpoint, communication is possible only between people who share some common experience of the things that they're talking about. To say something that you can understand, I need to relate it logically to things in your own experience. To show you how far a piece of wood bends, instead of giving a numerical measure (Te), I'd either encourage you to bend a piece of wood yourself, or find some mathematically similar thing that you know about and relate wood-bending to that. Words cannot be defined prior to the reality that they're about; words and criteria defined independently of the reality would be meaningless. The world itself provides a natural set of reference points, arising from the real, causal structure of things. Ultimately, to talk is to say, "I mean *that*."

Introverted Thinking uses language and concepts merely as pointers to patterns in reality that are incredibly more complex than anything that can be described in words. In contrast, Extraverted Thinking is about step-by-step justification according to shared language and criteria. A common failure mode of Extraverted Thinking is King on The Mountain, which I think everyone will instantly recognize.

Introverted Thinking and Extraverted Thinking, along with Extraverted Intuition and Introverted Intuition, are combined to create rationality. Extraverted Intuition provides the idea generation, Introverted Thinking provides pattern recognition, Extraverted Thinking handles justification, and Introverted Intuition avoids bias. According to the Jung-Thomson-Kovitz theory, all of these modes of thinking provide benefits and failure modes. For example, a failure mode of Introverted Thinking is that since it is aesthetic and subjective, it can be very hard for Introverted Thinkers with different inputs to reconcile worldviews if they differ, whereas Extraverted Thinkers could slowly hammer out agreement step-by-step.

LessWrong seems mostly dominated by INTJs, who have Introverted Intuition and Extraverted Thinking. They are mostly focused on justification and bias. These are important skills, but Introverted Thinking is important for marshaling the priors of the totality of your experience.

comment by Burgundy · 2015-05-27T08:07:09.785Z · score: 2 (2 votes) · LW · GW

Continuing a bit…

It’s truly strange seeing you say something like “Very high level epistemic rationality is about retraining one's brain to be able to see patterns in the evidence in the same way that we can see patterns when we observe the world with our eyes.” I already compulsively do the thing you are talking about training yourself to do! I can’t stop seeing patterns. I don’t claim that the patterns I see are always true, just that it’s really easy for me to see them.

For me, thinking is like a gale wind carrying puzzle pieces that dance in the air and assemble themselves in front of me in gigantic structures, without any intervention by me. I do not experience this as an “ability” that I could “train”, because it doesn’t feel like there is any sort of “me” that is doing it: I am merely the passive observer. “Training” pattern recognition sounds as strange to me as training vision itself: all I have to do is open my eyes, and it happens. Apparently it isn’t that way for everyone?

The only ways I’ve discovered to train my pattern recognition are to feed myself more information of higher quality (because garbage in, garbage out), and to train my attention. Once I can learn to notice something, I will start to compulsively see patterns in it. For someone who isn’t compulsively maxing out their pattern recognition already, maybe it’s trainable.

Another example: my brain is often lining people in rows of 3 or 4 according to some collection of traits. There might be “something” where Alice has more of it than Bob, and Bob has more of it than Carol. I see them standing next to each other, kind of like pieces on a chessboard. Basically, I think what my brain is doing is some kind of factor analysis where it is identifying unnamed dimensions behind people’s personalities and using them to make predictions. I’m pretty sure that not everyone is constantly doing this, but I could be wrong.
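As a rough illustration of what "identifying unnamed dimensions" could look like mechanically, here is a small sketch. It uses principal component analysis (found by power iteration), which is a simpler cousin of factor analysis and not the same thing. The names, the three observed traits, and the scores are all made up for the example; nothing here is a claim about what a brain actually computes.

```python
import math

# Hypothetical people and made-up scores on three observed traits.
people = ["Alice", "Bob", "Carol", "Dave"]
traits = [
    [9.0, 8.0, 7.0],  # Alice
    [6.0, 5.0, 5.0],  # Bob
    [2.0, 3.0, 1.0],  # Carol
    [4.0, 4.0, 3.0],  # Dave
]


def center(rows):
    """Subtract each column's mean so every trait is measured from its average."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def first_component(centered, iterations=200):
    """Dominant direction of variation, found by power iteration on X^T X."""
    dims = len(centered[0])
    scatter = [
        [sum(row[i] * row[j] for row in centered) for j in range(dims)]
        for i in range(dims)
    ]
    w = [1.0] * dims
    for _ in range(iterations):
        w = [sum(scatter[i][j] * w[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w


centered = center(traits)
direction = first_component(centered)

# Each person's position along the single unnamed dimension.
scores = {
    name: sum(v * w for v, w in zip(row, direction))
    for name, row in zip(people, centered)
}

# Line people up in a row, from most to least of the hidden "something".
ranking = sorted(people, key=lambda name: scores[name], reverse=True)
print(ranking)
```

The point of the sketch is only that a single hidden dimension can be recovered from several correlated observations and then used to line people up, without that dimension ever being given a name.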

Perhaps someone smarter than me might be able to visualize a larger number of people in multiple dimensions in people-space. That would be pretty cool.

On a trivial level, everyone can do pattern-recognition to some degree, merely by virtue of being a human with general intelligence. Yet some people can synthesize larger amounts of information collected over a longer period of time, update their synthesis faster and more frequently, and can draw qualitatively different sorts of connections.

I think that’s what you are getting at when you talk about pattern recognition being important for epistemic rationality. Pattern recognition is like a mental muscle: some people have it stronger, some people have different types of muscles, and it’s probably trainable. There is only one sort of deduction, but perhaps there are many approaches to induction.

Luke’s description of Carl Shulman reminds me of Ben Kovitz’s description of Introverted Thinking as constantly writing and rewriting a book. When you ask Carl Shulman a question on AI, and he starts giving you facts instead of a straight answer, he is revealing part of his book.

“Many weak arguments” is not how this feels from the inside. From the inside, it all feels like one argument. Except the thing you are hearing from Carl Shulman is really only the tip of the iceberg because he cannot talk fast enough. His real answer to your question involves the totality of his knowledge of AI, or perhaps the totality of the contents of his brain.

For another example of taking arguments in totality vs. in isolation, see King On The Mountain, describing an immature form of Extraverted Thinking:

In the King-on-the-Mountain style of conversation, one person (the King) makes a provocative statement, and requires that others refute it or admit to being wrong. The King is the judge of whether any attempted refutation is successful.

A refutation, for the King to rule it valid, must be completely self-contained: the King will ignore anything outside the way he has conceptualized the topic (see below for an extended illustration). If the refutation involves two or more propositions that must be heard together, the King will rule that each proposition individually fails to refute his statement. He won't address multiple propositions taken together. Once a proposition is rejected, he gives it no further consideration. A refutation must meet the King's pre-existing criteria and make sense in terms of the King's pre-existing way of understanding the subject. The King will rule any suggestion that his criteria are not producing insight as an attempt to cheat. […] The amount of information that the King considers at one time is very small: one statement. He makes one decision at a time. He then moves on to the next attempted refutation, putting all previous decisions behind him. The broad panorama--of mathematical, spatial, and temporal relationships between many facts--that makes up the pro-evolution argument, which need to be viewed all at once to be persuasive, cannot get in, unless someone finds a way to package it as a one-step-at-a-time argument (and the King has patience to hear it). Where his opponent was attempting to communicate just one idea, the King heard many separate ideas to be judged one by one.

Some of the failure modes of Introverted Thinking involve seeing imaginary patterns, dealing with corrupted input, or having aesthetic biases (aesthetic bias is when you are biased towards an explanation that looks neat or harmonious). Communication is also hard, but your true arguments would take a book to describe, if they could even be put into words at all.

comment by 27chaos · 2015-05-27T03:37:42.903Z · score: 2 (2 votes) · LW · GW

How many bad ideas or ambiguously true ideas do mathematicians have for every good idea they produce? How many people feel "deep certainties" about hypotheses that never pan out? Even when sometimes correct, do their hunches generally do better than chance alone would suggest? I agree with the idea that pattern recognition is important, but think your claims are going too far. My opinion is that successful pattern recognition, even in the hands of the best human experts, relies heavily on explicit reasoning that takes control over the recognition mechanisms and keeps them accurately targeted. Without cumbersome restraints that resist mental manipulations, humans are more likely to invent numerology than Calculus. Filtering out bad ideas or chains of thought that pattern recognition brings into one's head is important.

A significant reason I've had problems with advanced Calculus is that my brain starts inventing too many justifications for things, and then I become unable to distinguish between remembered rules which are valid and ones which my mind invented without sufficient justification. The difference between a superstition, a heuristic, and a rule is extremely important, but I don't think pattern recognition is well equipped to monitor thoughts to maintain these distinctions. I see pattern recognition as being about what things have in common. That has a lot to recommend it, but differences are important too. I wouldn't say either pattern recognition or reasoning are of primary importance. They're two halves of a whole, either alone is almost useless while both together can be very very strong. In my own case, it's the restrictions I find difficult, being imaginative is almost too easy for me.

comment by dxu · 2015-05-27T04:10:20.007Z · score: 3 (3 votes) · LW · GW

A significant reason I've had problems with advanced Calculus is that my brain starts inventing too many justifications for things, and then I become unable to distinguish between remembered rules which are valid and ones which my mind invented without sufficient justification.

That may reflect more of a lack of sufficient practice on your part than anything else. It takes a long time to become familiar enough with a topic that your brain can start intuitively and spontaneously generating good ideas on that topic. As an example, despite having spent several years playing chess, I still have to consider every position carefully and with deliberation; although there have been cases in which the move which immediately springs to mind is correct, I've found that in general the opposite is true. However, there is evidence that top grandmasters do not view chess positions this way; their play is based a lot more on "feeling" than "thinking". (I don't have the source for it, but I definitely remember reading something about it in both GEB and Thinking, Fast and Slow.) Clearly, this means that despite having played chess for so long, I have still not yet reached the level at which intuition can play a significant role in my calculations. Based on what you've written here, I would judge it likely that you are in a similar situation with respect to calculus.

(Also see this. I think that the "post-rigorous" stage described in this post matches nicely with what Jonah said above.)

comment by JonahSinick · 2015-05-27T04:27:50.882Z · score: 2 (2 votes) · LW · GW

Thanks :-). I was going to respond along these lines before seeing that you had spoken my mind.

comment by satt · 2015-06-01T23:58:18.543Z · score: 2 (2 votes) · LW · GW

I feel a link to an old comment of mine belongs somewhere under this top-level post, and this subthread might be the best place for it, so I'm putting it here...

comment by 27chaos · 2015-05-27T04:42:05.858Z · score: 1 (1 votes) · LW · GW

If you're right, in chess it requires years and years of domain specific practice to get pattern recognition skills adequately prepared so that scrupulous thought is not required when evaluating moves. That doesn't seem like an argument against the importance of scrupulous thought to me, it seems like the opposite. Scrupulous thought is very hard to avoid relying on.

I think you're wrong however. I think once you reach a certain level of familiarity with a subject, the distinction between pattern recognition and scrupulous reasoning itself breaks down. I don't think chess experts only use the raw processing power of their subconscious minds when evaluating the board, I think they alternate between making bottom-up assessments and top-down judgements. The accounts given in the neurology books are reactions to the popular perception that reasoning abilities are all that matters in chess, but if they've given you the impression that reasoning isn't important in chess then I feel like they may have gone too far in emphasizing their point. Expert chess players certainly feel like they're doing something important with their conscious minds. They give narrative descriptions of their rounds regularly. I acknowledge that explicit thought is not all there is to playing chess, but I'm not prepared to say experts' accounts of their thoughts are just egoist delusions, or anything like that.

I suppose one point I'm trying to make here is that biased stupid thought and genius insightful thought feel the same from the inside. And I think even geniuses have biased stupid thoughts often, even within their fields of expertise, and so the importance of rigor should not be downplayed even for them. Genius isn't a quality for avoiding bad thoughts, it's a quality that makes someone capable of having a few good thoughts in addition to all their other bad ones. When genius is paired with good filters, then it produces excellence regularly. Without good filters, it's much less reliable.

Finally, when you're dealing with theories about the universe the situation is different than when dealing with strategy games. You can't make a dumb subargument and then a smart subargument and have the two statements combine to produce a moderately valuable hypothesis. If you start driving down the wrong street, correctly following the rest of a list of directions will not be helpful to you. Rigor is important throughout all steps of the entire process. No mistakes can lead to success without first being undone (or at least almost none will - there are always exceptions).

comment by dxu · 2015-05-27T05:03:33.402Z · score: 1 (1 votes) · LW · GW

I think even geniuses have biased stupid thoughts often, even within their fields of expertise, and so the importance of rigor should not be downplayed even for them.

To use the chess analogy once more: this seems to conflict with the fact that in chess, top grandmasters' intuitions are almost always correct (and the rare exceptions almost always involve some absurd-looking move that only gets found after the fact through post-game computer analysis). Quite often, you'll see a chess author touting the importance of "quiet judgment" instead of "brute calculation"; that suggests extremely strongly to me that most grandmasters don't calculate out every move--and for good reason: it would be exhausting!

Likewise, I'm given to understand many mathematicians also have this sort of intuitive judgment; of course, it takes a long time to build up the necessary background knowledge and brain connections for such judgment, but then, Jonah never claimed otherwise. From the post itself:

It took me 10,000+ hours to learn how to "see" patterns in evidence in the way that I can now. Right now, I don't know how to communicate how to do it succinctly. It's too much for me to do as an individual: as far as I know, nobody has ever been able to convey the relevant information to a sizable audience!

If we could find a way to quickly build up the type of judgment described above, it could very well change the way people do things forever, but alas, we're not quite there. That's the whole point of Jonah's request for collaboration. (In an ideal world, I'd participate, but as a 17-year-old I doubt I'd have much to contribute and a lot of my time is used up preparing for college at this stage anyway, so... yeah. Unfortunate.)

comment by 27chaos · 2015-05-27T05:18:23.334Z · score: 1 (1 votes) · LW · GW

I was not aware most grandmasters' first instincts ended up being correct usually, interesting.

Likewise, I'm given to understand many mathematicians also have this sort of intuitive judgment; of course, it takes a long time to build up the necessary background knowledge and brain connections for such judgment, but then, Jonah never claimed otherwise. From the post itself:

I've been changing my position somewhat throughout this conversation, just so it's clear. At this point, I guess what I think is that a hard distinction between "reasoning" and "pattern recognition" doesn't make much sense. It seems like successful pattern recognition is to a significant extent comprised of scrupulously reasoned ideas that have been internalized. If someone hypothetically refused to use explicit reasoning while being taught to recognize certain patterns, I'd expect that person to have a more difficult time learning. Reasoning about ideas in the way that is slow and deliberative eventually makes patterns easier to recognize in the way that is fast and intuitive. If someone doesn't incorporate restrictions that originate in slow thought into their fast pattern matching capabilities, then they will start believing in faces that appear in the clouds, assuming that they ever learn to pattern match at all.

comment by Lumifer · 2015-05-27T17:37:11.582Z · score: 0 (0 votes) · LW · GW

Without cumbersome restraints that resist mental manipulations, humans are more likely to invent numerology than Calculus.

That is true, which is why most people are not great thinkers. However, high skill might not come from explicit reasoning, but from refining the pattern matching to prune away false branches. Mastery of a skill comes not from the ability to do a lot of Bayesian updates correctly and really fast; it comes from practicing till your intuition (=pattern-recognition engine) starts to reliably lead you towards good solutions and away from bad ones.

comment by shminux · 2015-05-27T01:50:59.337Z · score: 2 (2 votes) · LW · GW

I believe that what you are describing is known as internalizing.

comment by [deleted] · 2015-05-28T19:17:55.103Z · score: 1 (1 votes) · LW · GW

Do you really think that this is something that can be taught through writing?

Most intuitive pattern recognition comes through repeated practice, and I think that it might make more sense to create some sort of training regimen/coaching that allows others to have that practice, instead of writing a post about it.

If you did create this training, I'd be incredibly interested in taking it (probably up to about $300 or so, which is admittedly small for this type of thing).

comment by Dr_Manhattan · 2015-05-27T16:33:42.695Z · score: 1 (1 votes) · LW · GW

algorithms that people have constructed (within the paradigm of deep learning) are highly nontransparent: nobody's been able to interpret their behavior in intelligible terms.

Not quite true Jonah: http://arxiv.org/pdf/1311.2901.pdf

comment by JonahSinick · 2015-05-27T18:36:41.207Z · score: 0 (0 votes) · LW · GW

Even if what I said isn't literally true, it's still true that the cutting edge research in pattern recognition is in deep learning, where the algorithms are in some sense highly nontransparent.

comment by jacob_cannell · 2015-06-13T20:56:16.750Z · score: 1 (1 votes) · LW · GW

Upon reading your comment about non-transparency in DL I thought of the exact same paper on visualizing ANN features that Dr_Manhattan posted. There was a recent post on the machine learning subreddit about using similar techniques to investigate the neural representations automatically learned in language model RNNs.

There is no intrinsic connection between transparency and automatic feature learning techniques. Consider the case of a large research team where the work in creating a visual system is divided amongst dozens of researchers, who each create specific features for certain layers/modules. The resulting features are not intrinsically opaque just because the lead researcher doesn't necessarily understand the details of each feature each engineer came up with. The lead researcher simply needs to invest the time in understanding those features (if so desired).

Deep learning simply automates the tedious feature engineering process. You can always investigate the features or specific circuits the machine came up with - if so desired. It is true that ML and DL optimization tools in particular are often used as black boxes where the researcher doesn't know or care about the details of the solution - but that does not imply that the algorithms themselves are intrinsically opaque.

comment by RomeoStevens · 2015-05-27T07:30:43.300Z · score: 1 (1 votes) · LW · GW

Does this capture any of what you're talking about? This is my intuitive takeaway from the post, so I want to check if it's not what is intended. An analogy: we know that the lens has flaws and we can learn specific moves to shift the lens a bit so that we can see the flaws more easily. For those with high levels of epistemic rationality, bumping the lens around in just the right ways is, or has become, an automatic process such that they seem to have a magic ability to always catch the flaws right away. We ask them for an algorithm to do that and they point to a mishmash of different ways of nudging the lens. Both sides feel mildly silly. "So I move it 2cm to the left followed by a 2 degree rotation clockwise?", "Uh, just nudge it around a bit dude, you should see what I mean after trying for a while."

comment by JonahSinick · 2015-05-27T08:10:17.554Z · score: 0 (0 votes) · LW · GW

Yes :-). I wouldn't say that it perfectly encapsulates what I was trying to say, but I myself don't yet know how to give a perfect encapsulation either. Some of the comments that other commenters have made are very much on point as well.

comment by Sniffnoy · 2015-05-26T23:43:27.614Z · score: 1 (1 votes) · LW · GW

Is this what you were referring to in "Is Scott Alexander bad at math?" when you said that being good at math is largely about "aesthetic discernment" rather than "intelligence"? Because if so that seems like an unusual notion of "intelligence", to use it to mean explicit reasoning only and exclude pattern recognition. Like it would seem very odd to say "MIT Mystery Hunt doesn't require much intelligence," even if frequently domain knowledge is more important to spotting its patterns.

Or did you mean something else? I realize this is not the same post, but I'm just not clear on how you're separating "aesthetic discernment" from "intelligence" here; the sort of aesthetic discernment needed for mathematics seems like a kind of intelligence.

comment by JonahSinick · 2015-05-27T00:06:16.953Z · score: 4 (4 votes) · LW · GW

The distinction that I'm drawing is that intelligence is about the capacity to recognize patterns whereas aesthetic discernment is about selectively being drawn toward patterns that are important. I believe that intelligence explains a large fraction of the variance in mathematicians' productivity. See my post Innate Mathematical Ability. But I think that the percent of variance that intelligence explains is less than 50%.

comment by Sniffnoy · 2015-05-27T00:24:28.513Z · score: 0 (0 votes) · LW · GW

Ah, I see. I forgot about that, thanks!

comment by g_pepper · 2015-05-26T23:15:47.333Z · score: 1 (1 votes) · LW · GW

Your observation that

the most effective people in the world have a very specific way of thinking. They use their brain's pattern-matching abilities to process the world, rather than using explicit reasoning

is the subject of Malcolm Gladwell's book Blink.

I don't remember Gladwell giving any tips for actually developing one's skills for this type of thinking, but he does have a number of interesting stories and analysis about this type of thinking. He also makes the observation that this type of non-explicit reasoning can lead us astray.

I suspect that pattern-matching is vastly more efficient than explicit reasoning as you suggest, but that it is subject to bias and can in some cases lead one astray. Therefore, I think that rationality is a combination of both types of thinking - to use your example, a mathematician uses his/her pattern matching thought processes to figure out how to approach proving a theorem, but then reasons explicitly when formalizing the proof (and explicit reasoning would be used in verifying the proof as well).

At least some of the biases discussed in the sequences and elsewhere can be attributed to non-explicit reasoning and the antidote to these biases is to reason explicitly.

comment by JonahSinick · 2015-05-27T00:22:57.381Z · score: 0 (0 votes) · LW · GW

Thanks for your interest :-)

Your observation that [...] is the subject of Malcolm Gladwell's book Blink.

There's certainly overlap, but I'm making a more precise claim: that one can develop powerful intuition not only in particular domains but that one can also develop powerful general predictive models to get very high epistemic rationality across the board.

I suspect that pattern-matching is vastly more efficient than explicit reasoning as you suggest, but that it is subject to bias and can in some cases lead one astray.

Yes, my realization is about relative effect sizes: I used to think that the right balance is 50% intuition and 50% explicit reasoning or something, whereas now I think that it's more like 95% intuition and 5% explicit reasoning. (I'm speaking very vaguely here.)

At least some of the biases discussed in the sequences and elsewhere can be attributed to non-explicit reasoning and the antidote to these biases is to reason explicitly.

Ah, but explicit reasoning isn't the only antidote: you can also use intuition to correct for emotional and cognitive biases :-). I know that it's highly nonobvious how one would go about doing this.

Somewhat tangentially, you might be interested in my post Reason is not the only means of overcoming bias. (The post is 4.5 years old... I've been thinking about these things for a long time :P.)

comment by Lumifer · 2015-05-27T17:43:10.250Z · score: 0 (0 votes) · LW · GW

but I'm making a more precise claim: that one can develop powerful intuition not only in particular domains but that one can also develop powerful general predictive models to get very high epistemic rationality across the board.

Why do you think so? Basically, what evidence do you have that you can build strong "intuitions" which will work across diverse domains? My off-the-top-of-my-head reaction is that in dissimilar domains your intuition will mislead you.

comment by JonahSinick · 2015-05-27T18:32:51.541Z · score: 1 (1 votes) · LW · GW

It's really hard, that's why almost nobody knows how to do it :P.

Roughly speaking, the solution for me was to develop deep intuition in a lot of different domains, observe the features common to the intuitions in different domains, and abstract the common features out.

Finding the common features was very difficult, as there are a huge number of confounding factors that mask over the underlying commonalities. But it makes sense in hindsight - we wouldn't be able to develop deep intuitions in so many different domains if not for there being subtle underlying commonalities - there weren't evolutionary selective pressures specifically for the ability to develop general relativity and quantum field theory - the fact that it's possible for us means that the relevant pattern recognition abilities are closely related to the ones used in social contexts, etc.

comment by pwno · 2015-06-01T13:23:41.736Z · score: 1 (1 votes) · LW · GW

observe the features common to the intuitions in different domains, and abstract the common features out.

Have you explicitly factored these out? If so, what are some examples?

comment by Lumifer · 2015-05-27T19:00:33.590Z · score: 0 (0 votes) · LW · GW

It's really hard, that's why almost nobody knows how to do it :P.

The question is why do you think it is even possible?

the solution for me was to develop deep intuition in a lot of different domains, observe the features common to the intuitions in different domains, and abstract the common features out.

So, do you feel that your intuition will work successfully in the fields of, say, post-modernist literary critique, agriculture, and human biochemistry?

comment by JonahSinick · 2015-05-27T21:14:38.962Z · score: 1 (1 votes) · LW · GW

The question is why do you think it is even possible?

Because I've seen other people do it, I've observed a strong correlation between the ability to do it and overall functionality, and I've recently discovered how to do it myself and have seen huge gains to both my epistemic and instrumental rationality.

I know that I'm not providing enough information for you to find what I'm saying very compelling. Again, it took me 10,000+ hours before I myself started to get it. I might well have been skeptical before doing so.

So, do you feel that your intuition will work successfully in the fields of, say, post-modernist literary critique, agriculture, and human biochemistry?

I don't know – it depends on the relative roles of skill and luck in these fields. If you're talking about those major discoveries from the past that required integrating a diverse collection of sources of information, I believe that the people who made the discoveries were using this style of thinking. For example, I believe that this was probably true of Norman Borlaug.

comment by g_pepper · 2015-05-27T00:48:04.942Z · score: 0 (0 votes) · LW · GW

Ah, but explicit reasoning isn't the only antidote

Yes, I was just about to edit my post to say "an antidote" rather than "the antidote". As a practical matter, no one is going to explicitly reason through every situation. A more practical antidote is to recognize biases and learn rules of thumb for avoiding them. A classic example is the conjunction fallacy. Explicitly calculating conditional probabilities will obviously correct this fallacy, but most of us are not going to do that most of the time. However, if one is aware of the fallacy, one can develop a rule of thumb that states that less specific hypotheticals are usually more probable than more specific hypotheticals; this rule is sufficient for avoiding the conjunction fallacy most of the time. However, even here, explicit reasoning played a role in avoiding the bias; explicit reasoning was used to learn about and understand the bias, and to develop the rule of thumb.

Is using this sort of rule of thumb what you mean by using intuition to correct for emotional and cognitive biases?
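To make the rule of thumb concrete, here is a minimal sketch of the arithmetic behind it. The probabilities are made-up illustrative numbers and `conjunction_probability` is a hypothetical helper, not anything from the thread. The point is only that P(A and B) = P(A) · P(B | A), and since P(B | A) is at most 1, the more specific hypothesis can never be more probable than the less specific one.

```python
# Illustrative numbers only (assumptions for the sketch, not data).
P_A = 0.05          # P(A): the less specific hypothesis
P_B_GIVEN_A = 0.6   # P(B | A): probability of the extra detail, given A


def conjunction_probability(p_a: float, p_b_given_a: float) -> float:
    """Return P(A and B) = P(A) * P(B | A)."""
    if not (0.0 <= p_a <= 1.0 and 0.0 <= p_b_given_a <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    return p_a * p_b_given_a


p_a_and_b = conjunction_probability(P_A, P_B_GIVEN_A)

# Because P(B | A) <= 1, adding detail can only keep the probability
# the same or lower it.
assert p_a_and_b <= P_A
print(f"P(A) = {P_A:.3f}, P(A and B) = {p_a_and_b:.3f}")
```

The rule of thumb is the intuitive shortcut for this inequality: one does not need to run the multiplication each time, only to remember that every added detail multiplies in another factor no larger than 1.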

comment by Ronak · 2015-06-07T20:10:33.349Z · score: 0 (0 votes) · LW · GW

Oh this is nice. I've also come to realise this over time, in different words, and my mind is extremely tickled by how your formulation puts it on an equal footing with other non-explicit-rationality avenues of thought.

I would love to help you. I am very interested in a passion project right now. And we seem to be classifying similar things as hard-won realisations, though we have very different timelines for different things; talking to you might be all-round interesting for me.

comment by oge · 2015-05-29T20:26:32.752Z · score: 0 (0 votes) · LW · GW

Hi Jonah, this article is very intriguing since I might be going through a similar phase as you. Please add me to any list of collaborators you're drawing up.

comment by JonahSinick · 2015-05-29T20:48:18.781Z · score: 0 (0 votes) · LW · GW

Great :-). Send me an email at jsinick@gmail.com.

comment by MalcolmOcean (malcolmocean) · 2015-05-28T23:39:19.070Z · score: 0 (0 votes) · LW · GW

This seems valuable—I'm interested in helping (will email).

I want to highlight that "communicating how to do it" might not make sense as a frame. Pattern-matching is closely related to chunking. Ctrl+F yields other people who've mentioned chess, so I'll just point at that and then note that we actually know exactly how to communicate the skill of chunking chessboards: you get the person to practice chess in a certain way. There are of course better and worse ways to do this, but it seems like rather than looking for an insight to communicate you want to look for a learning process and how to make it more efficient by (e.g.) tightening feedback loops.

comment by Dr_Manhattan · 2015-05-27T16:44:07.331Z · score: 0 (0 votes) · LW · GW

I have a lot of evidence that this way of thinking is how the most effective people think about the world. Here I'll give two examples. Holden worked under Greg Jensen, the co-CEO of Bridgewater Associates, which is the largest hedge fund in the world.

BW also uses a lot of explicit models, https://www.youtube.com/watch?v=PHe0bXAIuk0

Holden working under Greg is also generally weak evidence about how Greg thinks.

comment by JonahSinick · 2015-05-27T18:38:40.600Z · score: 0 (0 votes) · LW · GW

I wasn't making an argument, I was stating my position. I have far more evidence than I can convey in a single blog post.

comment by ChristianKl · 2015-05-27T14:25:45.128Z · score: 0 (0 votes) · LW · GW

I personally agree with your core thesis that pattern matching is central. I invested a lot of effort into Quantified Self community building and gave press interviews praising the promise of QS. I think at the time I overrated straight data over pattern matching. Today I consider pattern matching much more important. I'm happy to collaborate on developing this line of thought.

I'm wary of whether using the word 'rationality' in this context is useful. Webster defines the word as: 'the quality or state of being agreeable to reason'. Wikipedia says: 'Rationality is the quality or state of being reasonable, based on facts or reason. Rationality implies the conformity of one's beliefs with one's reasons to believe, or of one's actions with one's reasons for action.'

comment by TheAncientGeek · 2015-05-27T06:46:12.343Z · score: 0 (0 votes) · LW · GW

Have you read http://lesswrong.com/lw/1wu/reasoning_isnt_about_logic_its_about_arguing/ ?

comment by JonahSinick · 2015-05-27T08:11:17.883Z · score: 0 (0 votes) · LW · GW

No, I haven't. Thanks for pointing it out. :-)

comment by John_Maxwell (John_Maxwell_IV) · 2015-05-27T01:38:57.849Z · score: 0 (0 votes) · LW · GW

The coolest possible output of a collaboration like this would be some kind of browser-based game you could play that would level up your rationality.

Also, what characteristics/skills does your ideal collaborator have? Maybe what you want to do is find an effective altruist whose work could benefit very strongly from the skills you describe, tutor them in the skills, and having taught 1 person, see if you can replicate the most effective bits of the teaching and scale them to a larger audience.

comment by Epictetus · 2015-05-26T23:14:25.084Z · score: 0 (0 votes) · LW · GW

This sounds like an explanation for the old adage: "Go with your gut". If your brain is a lot better at recognizing patterns than it is at drawing conclusions through a chain of reasoning, it seems advisable to trust that which your brain excels at. Something similar is brought up in The Gift of Fear, where the author cites examples where the pattern-recognition signaled danger, but people ignored them because they could not come up with a chain of reasoning to support that conclusion.

Sufficiently high quality mathematicians don't make their discoveries through reasoning. The mathematical proof is the very last step: you do it to check that your eyes weren't deceiving you, but you know ahead of time that your eyes probably weren't deceiving you. Given that this is true even in math, which is thought of as the most logically rigorous subject, it shouldn't be surprising that the same is true of epistemic rationality across the board.

Interesting that you bring up this and Poincare's experience. Jacques Hadamard wrote a book examining this phenomenon based on information he gathered from other mathematicians as well as his (layman's) knowledge of the psychology of the day. His conclusions bore several similarities to what you're trying to explain in this post. He did, however, note that experiences like Poincare's generally only took place if the researcher in question spent a lot of time working on the problem in the old "chain of reasoning" way, with the pattern often becoming clear some weeks or months later, after the researcher had moved on to a different problem. Perhaps this is what constitutes training one's brain to see patterns.

comment by JonahSinick · 2015-05-27T02:33:04.020Z · score: 1 (1 votes) · LW · GW

I read Hadamard's book 8 years ago and liked it a lot.

What I missed is that I mistakenly thought that Poincare's style of thinking was reserved for supergeniuses, and that all that someone like me could do was to clumsily use explicit reasoning.

I found out otherwise when I worked on my speed dating project. Something very primal in me came out, and I worked on it almost involuntarily for ~90 hours a week for 12 weeks. I finally had the experience of becoming sufficiently deeply involved so that the problems that I was trying to solve percolated into my subconscious and my intuition took over. I rediscovered a large fraction of standard machine learning algorithms (it was faster than learning from books for me personally because of my learning disability). Before this, I had no idea how capable I was. It made me realize that being a great scientist might be within the reach of a much larger fraction of the population than I had thought.

comment by alicey · 2015-05-26T19:46:17.104Z · score: 0 (0 votes) · LW · GW

-

comment by JonahSinick · 2015-05-26T19:51:45.504Z · score: 0 (0 votes) · LW · GW

I don't have a concrete plan yet. I have draft posts that I've written that are insufficiently polished for publication that I can share with you. You can get in touch with me at jsinick@gmail.com.

comment by James_Miller · 2015-05-26T19:44:31.086Z · score: 0 (0 votes) · LW · GW

What would be the goal of any such collaboration: LessWrong posts, a book, a podcast series? Knowing what you will produce will help you sell yourself to potential collaborators.

comment by JonahSinick · 2015-05-26T19:54:29.669Z · score: 0 (0 votes) · LW · GW

I don't know what the best route would be. Again, as far as I know, nobody has succeeded in doing what I want to do, so there's a prior against conventional approaches working. If I can interest Scott Alexander, I think that he might be able to do it, though if I recall correctly, he's also blogged about how the more important a topic is to his mind, the fewer views his posts on it get.