Under what circumstances is "don't look at existing research" good advice?

post by Kaj_Sotala · 2019-12-13T13:59:52.889Z · LW · GW · 4 comments

This is a question post.

Contents

  Answers
    33 shminux
    25 rohinmshah
    19 Vanessa Kosoy
    15 Mark_Friedenbach
    9 G Gordon Worley III
    6 Jan Kulveit
    1 kithpendragon
    1 FactorialCode
4 comments

In How I do research [LW · GW], TurnTrout writes:

[I] Stare at the problem on my own, ignoring any existing thinking as much as possible. Just think about what the problem is, what's confusing about it, what a solution would look like. In retrospect, this has helped me avoid anchoring myself. Also, my prior for existing work is that it's confused and unhelpful, and I can do better by just thinking hard.

The MIRI alignment research field guide [LW · GW] has a similar sentiment:

It’s easy to fall into a trap of (either implicitly or explicitly) conceptualizing “research” as “first studying and learning what’s already been figured out, and then attempting to push the boundaries and contribute new content.”

The problem with this frame (according to us) is that it leads people to optimize for absorbing information, rather than seeking it instrumentally, as a precursor to understanding. (Be mindful of what you’re optimizing in your research!) [...]

... we recommend throwing out the whole question of authority. Just follow the threads that feel alive and interesting. Don’t think of research as “study, then contribute.” Focus on your own understanding, and let the questions themselves determine how often you need to go back and read papers or study proofs.

Approaching research with that attitude makes the question “How can meaningful research be done in an afternoon?” dissolve. Meaningful progress seems very difficult if you try to measure yourself by objective external metrics. It is much easier when your own taste drives you forward.

And I'm pretty sure that I have also seen this notion endorsed elsewhere on LW: do your own thinking, don't anchor on the existing thinking too much, don't worry too much about justifying yourself to established authority. It seems like a pretty big theme among rationalists in general.

At the same time, it feels like there are fields where nobody would advise this, or where trying to do this is a well-known failure mode. TurnTrout's post continues:

I think this is pretty reasonable for a field as young as AI alignment, but I wouldn't expect this to be true at all for e.g. physics or abstract algebra. I also think this is likely to be true in any field where philosophy is required, where you need to find the right formalisms instead of working from axioms.

It is not particularly recommended that people try to invent their own math instead of studying existing math. Trying to invent your own physics without studying real physics just makes you into a physics crank, and most fields seem to have some version of "this is an intuitive assumption that amateurs tend to believe, but is in fact wrong, though the reasons are sufficiently counterintuitive that you probably won't figure it out on your own".

But "do this in young fields, not established ones" doesn't seem quite right either. For one, philosophy is an old field, yet it seems reasonable that we should indeed sometimes do it there. And it seems that even within established fields where you normally should just shut up and study, there will be particular open questions or subfields where "forget about all the existing work and think about it on your own" ought to be good advice.

But how does one know when that is the case?

Answers

answer by Shmi (shminux) · 2019-12-14T03:49:52.426Z · LW(p) · GW(p)

My field is theoretical physics, so this is where my views come from. (Disclaimer: I have not had a research position since finishing my PhD in General Relativity some 10 years ago.) Assuming you want to do original research, and you are not a genius like Feynman (in which case you would not be interested in my views anyway; what do you care what other people think?):

  • Map the landscape first. What is known, which areas of research are active, which are inactive. No need to go super deep, just get the feel for what is where.
  • Gain the basic understanding of why the landscape is the way it is. Why are certain areas being worked on? Is it fashion, ease of progress, tradition, something else? Why are certain areas being ignored or stagnating? Are they too hard, too boring, unlikely to get you a research position, just overlooked, or something else?
  • Find a promising area which is not well researched, does not appear super hard, yet you find interesting. Interdisciplinary outlook could be useful.
  • Figure out what you are missing to do a meaningful original contribution there. Evaluate what it would take to learn the prerequisites. Alternate between learning and trying to push the original research.
  • Most likely you will gain unexpected insights, not into the problem you are trying to solve, but into the reason why it's not being actively worked on. Go back and reevaluate whether the area is still promising and interesting. Odds are, your new perspective will lead you to get excited about something related but different.
  • Repeat until you are sure that you have learned something no one else has. Whether a question no one asked, or a model no one constructed or applied in this case, or maybe a map from a completely unrelated area.
  • Do a thorough literature search on the topic. Odds are, you will find that someone else tried it already. Reevaluate. Iterate.
  • Eventually you might find something where you can make a useful original contribution, no matter how small. Or you might not. Still, you will likely end up knowing more and having a valuable perspective and a skill set.

Physics examples: don't go into QFT, String theory or Loop quantum gravity. No way you can do better than, say, Witten and Maldacena and thousands of theorists with IQ 150+ and the energy and determination of a raging rhino. Quantum foundations might still have some low-hanging fruit, but the odds are against it. No idea about the condensed matter research. A positive example: Numerical relativity hit a sweet spot about 15 years ago, because the compute and the algorithms converged, and there were only a few groups doing it. Odds are something similar is possible again, just need to find where.

Also, Kaj, your research into multi-agent models of the mind, for example, might yield something really exciting and new, if looked at in the right way, whatever it is.

answer by Rohin Shah (rohinmshah) · 2019-12-14T01:30:31.128Z · LW(p) · GW(p)

I basically disagree with the recommendation almost always, including for AI alignment. I do think that

The problem [...] is that it leads people to optimize for absorbing information, rather than seeking it instrumentally, as a precursor to understanding.

I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety". (Possible reasons for this: the 80K AI safety syllabus, CHAI's bibliography, a general sense that you have to be an expert before you can do research.) This sentiment seems wrong to me; you definitely can and should think about important questions before learning everything that could potentially be considered "background".

The advice

let the questions themselves determine how often you need to go back and read papers or study proofs.

sounds to me like "when you feel like existing research would be useful, then go ahead and look at it, but don't feel like it's necessary", whereas I would say "as soon as you have questions, which should be almost immediately, one of the first things you should do is find the existing research and read it". The justification for this is the standard one -- people have already done a bunch of work that you can take advantage of.

The main disadvantage of this approach is that you lose the opportunity to figure things out from first principles. When you figure things out from first principles, you often find many branches that don't work out, which helps build intuitions about why things are the way they are, which you don't get nearly as well by reading about research, and you can't go back and figure things out from first principles because you already know the answer. But this first-principles-reasoning is extremely expensive (in time), and is almost never worthwhile.

Another potential disadvantage is that you might be incorrectly convinced that a technique is good, because you don't spot the flaws in it when reading existing research, even though you could have figured it out from first principles. My preferred solution is to become good at noticing flaws (e.g. by learning how to identify and question all of the assumptions in an argument), rather than to ignore research entirely.

Side note: In the case of philosophy, if you're trying to get a paper published, then I'm told you often want to make some novel argument (since that's what gets published), which makes existing research less useful (or only useful to figure out what not to think about). If you want to figure out the truth, I expect you would do well to read existing research.

TL;DR: Looking at existing research is great because you don't have to reinvent the wheel, but make sure you need the wheel in the first place before you read about it (i.e. make sure you have a question you are reading existing research to answer).

ETA: If your goal is "maximize understanding of X", then you should never look at existing research about X, and figure everything out from first principles. I'm assuming that you have some reason for caring about X that means you are willing to trade off some understanding for getting it done way faster.

comment by Matthew Barnett (matthew-barnett) · 2019-12-14T01:50:06.364Z · LW(p) · GW(p)
I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety". (Possible reasons for this: the 80K AI safety syllabus, CHAI's bibliography, a general sense that you have to be an expert before you can do research.) This sentiment seems wrong to me

See also, my shortform post about this [LW(p) · GW(p)].

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2019-12-14T18:32:38.832Z · LW(p) · GW(p)

+1, I agree with the "be lazy in the CS sense" prescription; that's basically what I'm recommending here.

comment by Ben Pace (Benito) · 2019-12-14T02:09:40.567Z · LW(p) · GW(p)

I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety".

Yeah, that feels like a natural extension of "I'm not allowed to have thoughts on this yet, so let me get enough social markers to be allowed to think for myself." Or "...to be allowed a thinking license [LW · GW]."

answer by Vanessa Kosoy · 2019-12-14T09:53:56.937Z · LW(p) · GW(p)

IMO the correct rule is almost always: first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind. Some nuances:

  • Sometimes thinking about the problem yourself is not useful because you don't have all the information to start. For example: you don't understand even the formulation of the problem, or you don't understand why it is a sensible question to ask, or the solution has to rely on empirical data which you do not have.

  • Sometimes you can so definitively solve the problem during the first step (unprimed thinking) that the rest is redundant. Usually this is only applicable if there are very clear criteria to judge the solution, for example: mathematical proof (but, beware of believing you easily proved something which is widely considered a difficult open problem) or something easily testable (for instance, by writing some code).

  • As John S. Wentworth observed [LW(p) · GW(p)], even if the problem was already definitively solved by others, thinking about it yourself first will often help you learn the state of the art later, and is a good exercise for your mind regardless.

  • The time you should invest into doing the first step depends on (i) how fast progress you realistically expect to make and (ii) how much progress you expect other people to have made by now. If this is an open problem on which many talented people worked for a long time, then expecting to make fast progress yourself is unrealistic unless you have some knowledge to which most of those people had no access, or your talent in this domain is truly singular. In this case you should think about the problem enough to understand why it is so hard, but usually not much longer. If this is a problem on which only few people have worked, or only for a short time, or it is obscure so you doubt it got the attention of talented researchers, then making comparatively fast progress can be realistic. Still, I recommend proceeding to the second step (learning what other people did) once you reach the point when you feel stuck (on the "metacognitive" level when you don't believe you will get unstuck soon: beware of giving up too easily).

After the third step (synthesis), I also recommend doing some retrospective: what have those other researchers understood that I didn't, how did they understand it, and how can I replicate it myself in the future.

answer by [deleted] · 2019-12-24T04:36:52.522Z · LW(p) · GW(p)

Trying to invent your own physics without studying real physics just makes you into a physics crank

This is demonstrably not (always) the case. Famously, Richard Feynman recommends that students always derive physics and math from scratch when learning. In fact his Nobel prize was for a technique (Feynman diagrams) which he developed on the fly in a lecture he was attending. What the speaker was saying didn’t make sense to him so he developed what he thought was the same theory using his own notation. Turns out what he made was more powerful for certain problems, but he only realized that much later when his colleagues questioned what he was doing on the whiteboard. (Pulled from memory from one of Feynman’s memoirs.)

One of the other comments here recommends against this unless you are a Feynman-level genius, but I think the causality is backwards on this. Feynman’s gift was traditional rationality, something which comes through very clearly in his writing. He tells these anecdotes in order to teach people how to think, and IMHO his thoughts on thinking are worth paying attention to.

Personally I always try to make sure I can derive again what I learn from first principles or the evidence. Only when I’m having particular trouble, or I have the extra time, do I try to work it out from scratch in order to learn it. But when I do, I come away with a far deeper understanding.

comment by Shmi (shminux) · 2019-12-24T09:03:41.869Z · LW(p) · GW(p)
This is demonstrably not (always) the case. Famously, Richard Feynman recommends that students always derive physics and math from scratch when learning.

What Feynman recommended was to learn a topic, then put the book aside and see if you can rederive what you have supposedly learned on your own. This has little to do with the thesis you had quoted. I can take a bet 1000:1 that anything a person who has not studied "real physics" can propose as their own physics will be at best a duplication of long-ago models and most likely just straight-up nonsense. I suspect that there hasn't been anyone since Faraday who made original useful contributions to physics without being up to date on the state of the art in the field. Not even Tesla made any foundational contributions, despite having a decent physics education.


Replies from: None
comment by [deleted] · 2019-12-24T10:34:16.223Z · LW(p) · GW(p)

I think we may be talking past each other. You say

What Feynman recommended was to learn a topic, then put the book aside and see if you can rederive what you have supposedly learned on your own.

That's what I meant when I said "Feynman recommends that students always derive physics and math from scratch when learning." You know the context. You know the evidence. You know, in the form of propositional statements, what the answer is. So make an attempt at deriving it yourself from the evidence, not the other way around.

In doing so, I often find that I didn't really understand the original theory. What is built up in the from-scratch derivation is an intuitional understanding that is far more useful than the "book knowledge" you get from traditional learning. So, I would say, you never really learned it in the first place. But now we're debating word definitions.

The other thing that you develop is experience deriving things "from scratch," with just a couple of hints as necessary along the way, which sets you up for doing innovative research once you hit the frontier of knowledge. Otherwise you fall victim to hindsight bias in thinking that all those theorems you read in books seemed so obvious, but discovering something new seems so hard. In reality, there is a skill to research that you only pick up by doing, and not practicing that skill now when the answers could be looked up when you get stuck, is a lost opportunity.

comment by [deleted] · 2019-12-24T16:07:57.573Z · LW(p) · GW(p)

I like this answer, but do question the point about Feynman's gift being mainly traditional rationality.

One of the other comments here recommends against this unless you are a Feynman-level genius, but I think the causality is backwards on this. Feynman’s gift was traditional rationality, something which comes through very clearly in his writing. He tells these anecdotes in order to teach people how to think, and IMHO his thoughts on thinking are worth paying attention to.

I agree that Feynman portrays it that way in his memoirs, but accounts from other physicists and mathematicians paint a different picture. Here are a few example quotes as evidence that Feynman's gifts also involved quite a bit of "magic" (i.e. skills he developed that one would struggle to learn from observation or even imitation).

First, we have Mark Kac, who worked with Feynman and was no schlub himself, describing two kinds of geniuses of which Feynman was his canonical example of the "magician" type (source):

In science, as well as in other fields of human endeavor, there are two kinds of geniuses: the “ordinary” and the “magicians.” An ordinary genius is a [person] that you and I would be just as good as, if we were only many times better. There is no mystery as to how his mind works. Once we understand what he has done, we feel certain that we, too, could have done it. It is different with the magicians. They are, to use mathematical jargon, in the orthogonal complement of where we are and the working of their minds is for all intents and purposes incomprehensible. Even after we understand what they have done, the process by which they have done it is completely dark. They seldom, if ever, have students because they cannot be emulated and it must be terribly frustrating for a brilliant young mind to cope with the mysterious ways in which the magician’s mind works... Feynman is a magician of the highest caliber.

(Note that, according to this source, Feynman did actually have 35 students, many of whom were quite accomplished themselves, so the point about seldom having students doesn't totally hold for him.)

Sidney Coleman, also no slouch, shared similar sentiments (source):

The generation coming up behind him, with the advantage of hindsight, still found nothing predictable in the paths of his thinking. If anything he seemed perversely and dangerously bent on disregarding standard methods. "I think if he had not been so quick people would have treated him as a brilliant quasi crank, because he did spend a substantial amount of time going down what later turned out to be dead ends," said Sidney Coleman, a theorist who first knew Feynman at Caltech in the 50's.

"There are lots of people who are too original for their own good, and had Feynman not been as smart as he was, I think he would have been too original for his own good," Coleman continued. "There was always an element of showboating in his character. He was like the guy that climbs Mont Blanc barefoot just to show that it can be done."

Feynman continued to refuse to read the current literature, and he chided graduate students who would begin their work on a problem in the normal way, by checking what had already been done. That way, he told them, they would give up chances to find something original.

"I suspect that Einstein had some of the same character," Coleman said. "I'm sure Dick thought of that as a virtue, as noble. I don't think it's so. I think it's kidding yourself. Those other guys are not all a collection of yo-yos. Sometimes it would be better to take the recent machinery they have built and not try to rebuild it, like reinventing the wheel. Dick could get away with a lot because he was so goddamn smart. He really could climb Mont Blanc barefoot."

Coleman chose not to study with Feynman directly. Watching Feynman work, he said, was like going to the Chinese opera. "When he was doing work he was doing it in a way that was just -- absolutely out of the grasp of understanding. You didn't know where it was going, where it had gone so far, where to push it, what was the next step. With Dick the next step would somehow come out of -- divine revelation."

In particular, note the last point about "divine revelation".

Admittedly, these are frustratingly mystical. Stephen Wolfram describes it less mystically and also sheds light on why Feynman's descriptions of his discoveries always made them seem so obvious after the fact even though they weren't (source):

I always found it incredible. He would start with some problem, and fill up pages with calculations. And at the end of it, he would actually get the right answer! But he usually wasn't satisfied with that. Once he'd gotten the answer, he'd go back and try to figure out why it was obvious. And often he'd come up with one of those classic Feynman straightforward-sounding explanations. And he'd never tell people about all the calculations behind it. Sometimes it was kind of a game for him: having people be flabbergasted by his seemingly instant physical intuition, not knowing that really it was based on some long, hard calculation he'd done.

He always had a fantastic formal intuition about the innards of his calculations. Knowing what kind of result some integral should have, whether some special case should matter, and so on. And he was always trying to sharpen his intuition.

Now, it's possible that all this was really typical rationality, but I'm skeptical given that even other all-star physicists and mathematicians found it so hard to understand or replicate.

All that said, I think Feynman's great, as is deriving stuff from scratch. I just think people often overestimate how much of Feynman's Feynman-ness came from good old-fashioned rationality.

answer by Gordon Seidoh Worley (G Gordon Worley III) · 2019-12-13T18:16:51.410Z · LW(p) · GW(p)

I suspect it's mostly proportional to the answer to the question "how much progress can you expect to make building on the previous work of others?" in a particular field. This is why (for example) philosophy is weird (you can make a lot of progress without paying attention to what previous folks have said), physics and math benefit from study (you can do a lot more cool stuff if you know what others know), and AI safety may benefit from original thinking (there's not much worth building off of (yet)).

answer by Jan Kulveit · 2019-12-14T15:45:56.375Z · LW(p) · GW(p)

I basically agree with Vanessa:

the correct rule is almost always: first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind.

Thinking about the problem myself first often helps me understand existing work, as it is easier to see the motivations, and solving solved problems is good training.

I would argue this is the case even in physics and math. (My background is in theoretical physics and during my high-school years I took some pride in not remembering physics and re-deriving everything when needed. It stopped being a good approach for physics from ca. 1940 onward, and somewhat backfired.)

The mistake members of "this community" (LW/rationality/AI safety) are sometimes making is skipping the second step / bouncing off the second step if it is actually hard.

The second mistake is not doing the third step in a proper way, which leads to a somewhat strange and insular culture which may be repulsive for external experts. (E.g. people partially crediting themselves for discoveries which are known to outsiders.)

answer by kithpendragon · 2019-12-15T15:49:58.165Z · LW(p) · GW(p)

I think one important context for not reading the existing literature first is calibration. Examining the difference between how you are thinking about a question and how others have thought about the same question can be instructive in a couple of ways. You might have found a novel approach that is worth exploring, or you might be way off in your thinking. Perhaps you've stumbled upon an obsolete way of thinking about something. Figuring out how your own thinking process lines up with the field can be extremely instructional, and super useful if you want your eventual original work to be meaningful. At the very least, you can identify your own common failure modes and work to avoid them.

The fastest and easiest way to accomplish all this is by using a sort of research loop where you collect your own thoughts and questions, then compare them with the literature and try to reconcile the two, then repeat. If you just read all the literature first, you have no way to calibrate your explorations when you finally get there.

answer by FactorialCode · 2019-12-13T22:53:51.329Z · LW(p) · GW(p)

I think this is mainly a function of how established the field is and how much time you're willing to spend on the subject. The point of thinking about a field before looking at the literature is to avoid getting stuck in the same local optima as everyone else. However, making progress by yourself is far slower than just reading what everyone has already figured out.

Thus, if you don't plan to spend a large amount of time in a field, it's far quicker and more effective to just read the literature. However, if you're going to spend a large amount of time on the problems in the field, then you want to be able to "see with fresh eyes" before looking at what everyone else is doing. This prevents everyone's approaches from clustering together.

Likewise, in a very well established field like math or physics, we can expect everyone to already have clustered around the "correct answer". It doesn't make as much sense to try and look at the problem from a new perspective, because we already have a very good understanding of the field. This reasoning breaks down once you get to the unsolved problems in the field. In that case, you want to do your own thinking to make sure you don't immediately bias your thinking towards solutions that others are already working on.

4 comments

Comments sorted by top scores.

comment by johnswentworth · 2019-12-13T19:20:41.845Z · LW(p) · GW(p)

Possibly tangential, but I have found that the "try it yourself before studying" method is a very effective way to learn about a problem/field. It also lends a gut-level insight which can be useful for original research later on, even if the original attempt doesn't yield anything useful.

One example: my freshman year of college, I basically spent the whole month of winter break banging my head against 3-SAT, trying to find an efficient algorithm to solve it and also just generally playing with the problem. I knew it was NP-complete, but hadn't studied related topics in any significant depth. Obviously I did not find any efficient algorithm, but that month was probably the most valuable-per-unit-time I've spent in terms of understanding complexity theory. Afterwards, when I properly studied the original NP-completeness proof for 3-SAT, reduction proofs, the polynomial hierarchy, etc., it was filled with moments of "oh yeah, I played with something like this, that's a clever way to apply it".

Better example: I've spent a huge amount of time building models of financial markets, over the years. At one point I noticed some structures had shown up in one model which looked an awful lot like utility functions, so I finally got around to properly studying Arrow & Debreu-style equilibrium models. Sure enough, I had derived most of it already. I even had some pieces which weren't in the textbooks (pieces especially useful for financial markets). That also naturally led to reading up on more advanced economic theory (e.g. recursive macro), which I doubt I would have understood nearly as well if I hadn't been running into the same ideas in the wild already.


comment by gwern · 2019-12-13T18:42:45.443Z · LW(p) · GW(p)

Since you mention physics, it's worth noting Feynman was a big proponent of this for physics, and seemed to have multiple reasons for it.

Replies from: Pattern
comment by Pattern · 2019-12-13T20:25:37.222Z · LW(p) · GW(p)

A big proponent of people studying the existing material, or doing their own experiments?

comment by Pattern · 2019-12-13T20:24:01.750Z · LW(p) · GW(p)
It is not particularly recommended that people try to invent their own math instead of studying existing math.

It might be useful for people to start by figuring out a) what math they want to study or b) what problems they want to solve/what tools they could use. "Learn all of math" is a daunting task. (Though perhaps more useful than "Learn every programming language".)

Trying to invent your own physics without studying real physics just makes you into a physics crank,

(I'm curious about the probabilities here.)

It'd be slow going, though I wouldn't say failure would be guaranteed. (If I went to the Leaning Tower of Pisa, dropped a piece of paper and a rock, and had someone else on the ground time* when they hit the ground (where), I might conclude that a rock hits the ground sooner when dropped from the Leaning Tower of Pisa than a piece of paper does.)

*Making videos would be ideal, actually. People put a lot of stock in writing things down, but if the whole process is also filmed (and made available live) that could be better than pre-registration** - and allay concerns regarding data tampering.

**absent concerns regarding censoring via the platform