When engaging with a large amount of resources during a literature review, how do you prevent yourself from becoming overwhelmed?

post by corruptedCatapillar · 2024-11-01T07:29:49.262Z · LW · GW · No comments

This is a question post.

I was originally going to email Gwern directly, but figured being in a public space would benefit others who have the same questions and also put more eyes on it.

BLUF: I'm writing to you with a question and to ask for advice on doing research better. When you're engaging with an overwhelming amount of resources, how do you 1. prevent information overwhelm while 2. keeping a high-fidelity understanding of your resources so you can use them in a larger body of work?

After reading your Spaced Repetition post, many of your LW comments, subreddit posts, etc., I'm always excited by how you synthesize the number of links and breadth of resources you reference, which makes me think you've got some sorcery going on in working with large bodies of work. I've gone through your site and subreddit to see if you've posted on this previously, but nothing stood out as directly answering this question (though I may have missed it), which is why I'm reaching out.

During my last lit review I stumbled upon a technique of using a spreadsheet to 1. log the paper's details, link, etc. and 2. (most importantly) sort papers by which of the related works sections the resource ended up fitting into. (Note: the "updated" methodology approach starts at line 18 and features multiple "headers" which are the Related Works sections to be filed under. I apologize in advance for the quality.) This dramatically helped, since I could file the info away and go back to processing rather than trying to hold it all in at any given moment. And to improve this approach for next time, I would add notes recording how specifically the paper fits into that section for future me. (Link to paper with resulting RW section if curious: https://arxiv.org/abs/2410.02472.)

This spreadsheet implementation isn't pretty, but I include it because it seems that even with an ugly implementation this helped out quite a bit and thus must hold some promise.
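For anyone who would rather keep this log in a plain file, here is a minimal sketch of the same idea in Python. It is not the actual spreadsheet: the `PaperEntry` fields, the section names, and the example.org links are all hypothetical placeholders. It logs each paper's details, files it under a Related Works section, and includes the "how does it fit" note mentioned above.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class PaperEntry:
    title: str
    link: str
    section: str   # Related Works section the paper is filed under
    fit_note: str  # how specifically the paper fits that section


def write_log(entries, out):
    """Write entries as CSV, ordered by section and then by title."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(PaperEntry)])
    writer.writeheader()
    for entry in sorted(entries, key=lambda e: (e.section, e.title)):
        writer.writerow(asdict(entry))


# Hypothetical entries; titles, links, and notes are placeholders.
entries = [
    PaperEntry("Paper X", "https://example.org/x", "Section A", "baseline we compare against"),
    PaperEntry("Paper Y", "https://example.org/y", "Section B", "same problem, different tool"),
    PaperEntry("Paper W", "https://example.org/w", "Section A", "earlier version of the same idea"),
]

buffer = io.StringIO()  # swap for open("papers.csv", "w", newline="") to write a file
write_log(entries, buffer)
log_csv = buffer.getvalue()
print(log_csv)
```

Sorting by section keeps everything destined for the same Related Works heading together, so the file can be read top to bottom when drafting.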

I'm excited to get better at research in general, but I'm interested in this aspect right now because I've just received a grant to do a survey (read: many more references to keep track of) for a cryptography paper (a field I'm new to). I've seen that my previous method works better than trying to hold everything in my head, but I'm not confident in its ability to scale.

My current best guess to handle this (inevitable?) information overload is to essentially:

be exposed to lots of resources
start to get overwhelmed (realize this feeling)
get everything out of my head with a brain dump
basically perform a "Principal Component Analysis" on said brain dump
- (as in, "Ok. I'm working on the Related Works section. I just read paper X and that fits into the A bucket, I'll make a header and put it there. Paper Y fits into buckets B and C. I'm kind of confused about what exactly paper Z is doing, I need to note my best guesses of what my confusions are and come back to it." etc.)
and afterwards I'm left with an organized list of all my current resources and concrete items listing where I'm confused.
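As a rough sketch of the steps above (an illustration only, and plain grouping rather than literal Principal Component Analysis, which is used here as a metaphor), the brain dump can be modeled as a list of items that get sorted into buckets, with confusions pulled out into their own list. The `triage` function and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict


def triage(brain_dump):
    """Split a brain dump into section buckets and a list of open confusions.

    Each item is (paper, buckets, confusion). A paper may sit in several
    buckets, and confusion is None when there is nothing to revisit.
    """
    buckets = defaultdict(list)
    confusions = []
    for paper, paper_buckets, confusion in brain_dump:
        for bucket in paper_buckets:
            buckets[bucket].append(paper)
        if confusion is not None:
            confusions.append((paper, confusion))
    return dict(buckets), confusions


# Mirrors the example in the text: X goes to A, Y to B and C, Z is unclear.
brain_dump = [
    ("Paper X", ["A"], None),
    ("Paper Y", ["B", "C"], None),
    ("Paper Z", [], "unclear what exactly it is doing; best guess noted, revisit"),
]

buckets, confusions = triage(brain_dump)
for name in sorted(buckets):
    print(f"{name}: {', '.join(buckets[name])}")
for paper, question in confusions:
    print(f"TODO {paper}: {question}")
```

The two outputs correspond to the end state described: an organized list of resources per bucket, plus concrete items recording what is still confusing.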

But when I read this I think, "can that really be it? That seems WAY too simple to be effective at scale."

On the other hand, I could see this simple trick being effective. Taking a look at your note on decluttering makes me think that this brain dump > PCA > organized lists may actually be the right direction. (There's a further rabbit hole exploring the Latent Inhibition concept: how it's "thought to prevent information overload", yet how "those of above average intelligence are thought to be capable of processing this stream effectively"; there must be transferable techniques they employ to handle this info.) I'm also reminded of David Allen's "Getting Things Done", whose aim is to reduce information overwhelm and whose process boils down to 1. Capture 2. Clarify 3. Organize 4. Review 5. Engage. (See also a paper investigating the cognitive science behind the methodology: http://pespmc1.vub.ac.be/Papers/GTD-cognition.pdf.)

This also matches your outline mention: "Instead, I occasionally compile outlines of articles from comments on LW/Reddit/IRC, keep editing them with stuff as I remember them, search for relevant parts, allow little thoughts to bubble up while meditating, and pay attention to when I am irritated at people being wrong or annoyed that a particular topic hasn’t been written down yet." -source

I'd love to hear your thoughts on this approach and any alternative systems you'd recommend. Specifically: How do you remain open and receptive to a wide stream of information without becoming overwhelmed, while also being able to retain pertinent pieces of information for use in your works?

Also, I'm electing to share a higher volume of info in the spirit of asking questions the smart way; apologies for the wall of text.

Thanks for all you do! Well wishes to you,

Answers

answer by Thor K · 2024-11-01T18:35:08.951Z · LW(p) · GW(p)

Try to categorize papers into broad buckets (as you're doing with your spreadsheet), then create more detailed notes only for the most relevant stuff. Most paper notes in my knowledge graph are shallow: about 15 minutes spent summarizing the claimed results, plus a personal evaluation of why I would consider going deeper on the paper in the future.

Look for connections across categories when writing. Figure out whether your bottleneck is filtering papers, understanding them, or how you're storing that information.

For my field (zero knowledge proofs), there's only so much filtering I really need to do; the harder work is sequencing my reading and choosing what to go deep on. Sometimes I need to spend 10 or more hours on a single paper and its codebase, and that is actually optimal.

So my bottlenecks are largely around evaluating which research directions are most relevant to me, i.e. whether I think the researchers had an actual insight (or are just citation farming).

As for reading the paper, I typically work with a language model to write the more onerous LaTeX for my notes, passing screenshots to the LLM to copy out, and making my own comprehension notes.

Example of a shallow paper read

Example of a deep paper read

good luck

answer by lpiqbgk356 · 2024-11-01T12:24:27.617Z · LW(p) · GW(p)

This is hard for everyone, as nobody can consume the sum total of human knowledge. People consume knowledge, which helps them develop filters. These filters then determine which further knowledge they will acquire and how they will integrate it with their existing knowledge.

Everyone has blinders on; it's just that everyone has different blinders on. You will have blinders on as well, so you just want to become smart about it. A lot of the points below help you filter information more intelligently.

While reading, keep in mind a hypothesis you want to actively investigate. Don't just read passively. Holden Karnofsky has a good article emphasising this.
It helps if the hypothesis is one you're genuinely curious about. If you're curious you'll endure more bullshit.
Sometimes the answer is just "suck it up and deal with it". Often you might optimistically hope to understand a research field in 3 months, only to later realise you've spent 5 years on it and you're still not as much of an expert as you'd like. Deep understanding takes time to build. This is a thing that happens and is normal.
Being overwhelmed as a beginner is normal; most resources are not written with you as their target audience. Many people are not interested in going out of their way to help a beginner; they're more interested in doing a mediocre job and going home. (You should still try to get help from others, but you need to be smart about it because most people don't care.)
Figure out what the primary methods of investigation of the field are. Is it surveys, is it experiments, is it theoretical reasoning? Human beings do input, processing and output, researchers included. What is the input?
Figure out the main devices and methods used in experiments in that field. For example, in biotech some common methods are: observing cell count and physiology under a microscope, gel electrophoresis, PCR, plasmid insertion, incubation under a UV hood. If you can, try visiting a lab in your research field in person. If not, watch YouTube videos for each of the main pieces of equipment used.
Study how RCTs work. Study basics of probability and statistics. This comes up a lot, no matter your field. Many researchers suck at statistics so if you don't suck you'll see holes in other people's work.
Deeply study at least one STEM field. For example, mathematics. This will help you understand both other STEM fields and non-STEM fields.
Deeply study at least one non-STEM field. For example, psychology. This will help you understand both other STEM fields and non-STEM fields.
Ask a few people what their info filters are. Which sources of information do they pay attention to and which ones do they ignore?
Think about what research fields as a whole are doing, not just what individual researchers are trying to write about in individual papers. Where does the funding come from for the field? What is considered high status to work on? Which methods are popular to use? Which assumptions are taken for granted when working inside the field (but might not be taken for granted outside of it)?
Read about the Hamming question. Identify the Hamming questions in your field of research.
