How I Think About My Research Process: Explore, Understand, Distill

post by Neel Nanda (neel-nanda-1) · 2025-04-26T10:31:34.305Z · LW · GW · 2 comments

Contents

  Introduction
  The key stages
    Ideation (Stage 1): Choose a problem
    Exploration (Stage 2): Gain surface area
    Understanding (Stage 3): Test Hypotheses
    Distillation (Stage 4): Compress, Refine, Communicate

This is the first post in a sequence about how I think about and break down my research process. Post 2 is coming soon.

Thanks to Oli Clive-Griffin, Paul Bogdan, Shivam Raval and especially to Jemima Jones for feedback, and to my co-author Gemini 2.5 Pro - putting 200K tokens of past blog posts and a long voice memo in the context window is OP.

Introduction

Research, especially in a young and rapidly evolving field like mechanistic interpretability (mech interp), can often feel messy, confusing, and intimidating. Where do you even start? How do you know if you're making progress? When do you double down, and when do you pivot?

These are far from settled questions, but I’ve supervised 20+ papers by now, and have developed my own mental model of the research process that I find helpful. This isn't the definitive way to do research (and I’d love to hear other people’s perspectives!) but it's a way that has worked for me and others.

My goal here is to demystify the process by breaking it down into stages and offering some practical advice on common pitfalls and productive mindsets for each stage. I’ve also tried to be concrete about what the various facets of ‘being a good researcher’ actually mean, like ‘research taste’. I’ve written this post for a mech interp audience, but hopefully it is useful for any empirical science with short feedback loops, and possibly even beyond that.

This guide focuses more on the strategic (high-level direction, when to give up or pivot, etc.) and tactical (what to do next, how to prioritise, etc.) aspects of research – the "how to think about it" rather than just the "how to do it." Some skills (coding, reading papers, understanding ML/mech interp concepts) are vital for how to do it, but are not in scope here (I recommend the ARENA curriculum and my paper reading list if you need to skill up).

How do you get started? Strategic and tactical thinking are hard skills, and it is rare to be any good at them when starting out in research (or ever, tbh). The best way to learn them is by trying things, making predictions, seeing what you get right or wrong (i.e., getting feedback from reality), and iterating. Mentorship can substantially speed up this process by providing "supervised data" to learn from, but either way you ultimately learn by doing.

I’ve erred towards making this post comprehensive, which may make it somewhat overwhelming. You do not need to try to remember everything in here! Instead, think of it as a guide to the high-level things to keep in mind, and a source of advice for what to do at each stage. And, obviously, this is massively flavoured by my own subjective experience and may not generalise to you - I’d love to hear what other researchers think.

A cautionary note: Research is hard. Expect frustration, dead ends, and failed hypotheses. Imposter syndrome is common. Focus on the process and what you're learning. Take breaks; the net effect on your productive time is typically positive. Find sustainable ways to work. Your standards are likely too high.

The key stages

I see research as breaking down into a few stages:

  1. Ideation - Choose a problem/domain to focus on
  2. Exploration - Gain surface area
    1. North star: Gain information
  3. Understanding - Test Hypotheses
    1. North star: Convince yourself of a key hypothesis
  4. Distillation - Compress, Refine, Communicate
    1. North star: Compress your research findings into concise, rigorous truth that you can communicate to the world

Ideation (Stage 1): Choose a problem

Exploration (Stage 2): Gain surface area

Understanding (Stage 3): Test Hypotheses

Distillation (Stage 4): Compress, Refine, Communicate

Post 2 of the sequence, on key skills, is coming out soon - if you’re impatient you can read a draft of the whole sequence here.

2 comments

comment by Adrian Chan (adrian-chan) · 2025-04-26T17:47:10.673Z · LW(p) · GW(p)

I read this with interest and can't help but contrast it with the research approach I am more accustomed to, and which is perhaps more common in soft sciences/humanities. Because many of us use AI for non-scientific, non-empirical research, and are each discovering that it is both an art and a science. 

My honors thesis adviser (US-Soviet relations) had a post-it on his monitor that said "What is the argument?" I research with GPT over multiple turns and days in an attempt to push it to explore. I find I can do so only insofar as I comprehend its responses in whatever discursive context or topic/domain we're in. It's a kind of co-thinking.

I'm aware that GPT has no perspective, no argument to make, no subjectivity, and no point of view. I on the other hand have interests and am interested. GPT can seem interested, but in a post-subjective or quasi-objective way. That is, it can write stylistically as if it is interested, but it cannot pursue interests unless they are taken up by me, and then prompted.

This takes the shape of an interesting conversation. One can "feel" the AI has an active interest and has agency in pursuing research, but we know it is only plumbing texts and conjuring responses. 

This says something about the discursive competence of AI and also of the cognitive psychology of us users. Discursively, the AI seems able to reflect and reason through domain spaces and to return what seems to be commonly-accepted knowledge. That is, it's a good researcher of stored content. It finds propositions, statements, claims, valid arguments insofar as they are reflected in the literature it is trained on. To us, psychologically, however, this can read as subjective opinion, confident reasoning, comprehensive recapitulation. 

In this there is a trust issue with AI, insofar as the apparent communication and the AI's seeming linguistic competence elicit trust from us users. And this surely factors into the degree to which we regard its responses as "factual," "comprehensive," etc.

But I am still confounded by whether there might be some trick to conversational architectures with multi-turn engagements with AI. Might there be some insight into the prompt structure, or stylistic expression (requests, instructions, commands, formal, informal, empathic...) such that a "false interest" or "post-subjective interestedness" might be constructed that can "push" the AI to explore in depth, breadth, novelty, contradiction, by analogy, etc.?

 

For example, four philosophical concepts common to western thinking are: identity, similarity, negation, analogy. Might prompt expressions be possible that serve as navigational coordinates almost, or directions, for use by the LLM in "perusing" discursive spaces (researching within a domain)? 

A different kind of argument, a post-subjective kind of reasoning, a different way of taking up the user's interest but nonetheless mirroring it successfully enough that users experience the effect of being engaged in mutually-interested interactions?

comment by eamag (dmitrii-magas) · 2025-04-26T17:29:06.250Z · LW(p) · GW(p)

From all of these, which ones are:

  • The most time-consuming?
  • The most difficult for beginners?

I'm asking to figure out what the bottleneck is. It seems like many people are interested in AI safety but not all of them are working on research directly - what needs to happen to make it easier for them to start?