What's important in "AI for epistemics"?

This post gives my personal take on “AI for epistemics” and how important it might be to work on.

Some background context:

So: How can we influence AI so that it contributes to better epistemic processes? When looking at concrete projects here, I find it helpful to distinguish between two different categories of work:

  1. Working to increase AIs’ epistemic capabilities, and in particular, differentially advancing them compared to other AI capabilities. Here, I also include technical work to measure AIs’ epistemic capabilities.[2]

  2. Efforts to enable the diffusion and appropriate trust of AI-discovered information. This is focused on social dynamics that could cause AI-produced information to be insufficiently or excessively trusted. It’s also focused on AIs’ role in communicating information (as opposed to just producing it). Examples of interventions, here, include “create an independent organization that evaluates popular AIs’ truthfulness”, or “work for countries to adopt good (and avoid bad) legislation of AI communication”.

I’d be very excited about thoughtful and competent efforts in this second category. However, in this post I talk significantly more about efforts in the first category. This is just an artifact of how this post came to be, historically; it’s not because I think work on the second category of projects is less important.[3]

For the first category of projects: Technical projects to differentially advance epistemic capabilities seem somewhat more “shovel-ready”. Here, I’m especially excited about projects that differentially boost AI epistemic capabilities in a manner that’s durable and/or especially good at demonstrating those capabilities to key actors.

Durable means that projects should (i) take the bitter lesson into account by working on problems that won’t be solved-by-default when more compute is available, and (ii) work on problems that industry isn’t already incentivized to put huge efforts into (such as “making AIs into generally better agents”). (More on these criteria here.)

Two example projects that I think fulfill these criteria (I discuss a lot more projects here):

Separately, I think there’s value in demonstrating the potential of AI epistemic advice to key actors, especially frontier AI companies and governments. When transformative AI (TAI)[4] is first developed, it seems likely that these actors will have a big advantage in their ability to accelerate AI-for-epistemics via their access to frontier models and algorithms. They are also the actors whose decisions I especially care about being well-informed. Thus, I’d like these actors to be impressed by the potential of AI-for-epistemics as soon as possible, so that they start investing and preparing appropriately.

If you wondered above why I group “measuring epistemic capabilities” into the same category of project as “differentially advancing epistemic capabilities”, this is now easier to explain. I think good benchmarks could be both a relatively durable intervention for increasing capabilities (by inspiring work to beat the benchmark for a long time) and a good way of demonstrating capabilities.

Structure of the post

In the rest of this post, I:

Previous work

Here is an incomplete list of previous work on this topic:

Why work on AI for epistemics?


I think there are very solid grounds to believe that AI’s influence on epistemics is important. Having good epistemics is super valuable, and human-level AI would clearly have a huge impact on our epistemic landscape. (See here for more on importance.)

I also think there are decent plausibility arguments for why AI’s role in societal epistemics may be importantly path-dependent: Today, we are substantially less epistemically capable than our technology allows for, due to various political and social dynamics which don’t all seem inevitable. And I think there are plausible ways in which poor epistemics can be self-reinforcing (because it makes it harder to clearly see which direction leads towards better epistemics). And vice versa: good epistemics can be self-reinforcing. (See here for more on path-dependence.)

That’s not very concrete though. To be more concrete, I will go through some specific goals that I think are both important and plausibly path-dependent:

Let’s go through all of this in more detail.


I think there are very solid grounds to believe that AI’s influence on epistemics is important.


While less solid than the arguments for importance, I think there are decent plausibility arguments for why AI’s role in societal epistemics may be importantly path-dependent.

To be more concrete

Now, let’s be more specific about what goals could be important to achieve in this area. I think these are the 3 most important instrumental goals to be working towards:

Let’s go through these in order.

Good norms & practices for AI-as-knowledge-producers

Let’s talk about norms and practices for AIs as knowledge-producers. By this, I mean AIs doing original research, rather than just reporting claims discovered elsewhere. (I.e., AIs doing the sort of work that you wouldn’t get to publish on Wikipedia.)

Here are some norms/institutions/practices that I think would contribute to good usage of AI-as-knowledge-producers:

Good norms & practices for AI-as-communicators

Now let’s talk about norms for AIs as communicators. This is the other side of the coin from “AI as knowledge producers”. I’m centrally thinking about AIs talking with people and answering their questions.

Here are some norms/institutions/practices that I think would enable good usage of AI-as-communicators:

Differentially high epistemic capabilities

Finally: I want AIs to have high epistemic capabilities compared to their other capabilities. (Especially dangerous ones.) Here are three metrics of “epistemic capabilities” that I care about (and what “other capabilities” to contrast them with):

In order for these distinctions to be decision-relevant, there need to be ways of differentially accelerating one side of the comparison relative to the other. Here are two broad categories of interventions that I think have a good shot at doing so:

Heuristics for good interventions

Having spelled out what we want in the way of epistemic capabilities, practices for AI-as-knowledge-producers, and practices for AI-as-communicators, let’s talk about how we can achieve these goals. This section will talk about broad guidelines and heuristics, while the next section will talk about concrete interventions. I discuss:

Direct vs. indirect strategies

One useful distinction is between direct and indirect strategies. While direct strategies aim to directly push for the above goals, indirect strategies instead focus on producing demos, evals, and/or arguments indicating that epistemically powerful AI will soon be possible, in order to motivate further investment & preparation pushing toward the above goals.

My current take is that:

Indirect value generation

One possible path-to-impact from building and iteratively improving capabilities on “understanding”-loaded tasks is that this gives everyone an earlier glimpse of a future where AIs are very epistemically capable. This could then motivate:

The core advantage of the indirect approach is that it seems way easier to pursue than the direct approach.

Core questions about the indirect approach: Are there really any domain-specific demos/evals that would be convincing to people here, on the margin? Or will people’s impressions be dominated by “gut impression of how smart the model is” or “benchmark performance on other tasks” or “impression of how fast the model is affecting the world-at-large”? I feel unsure about this, because I don’t have a great sense of what drives people’s expectations, here.

A more specific concern: Judgmental forecasting hasn’t “taken off” among humans. Maybe that indicates that people won’t be interested in AI forecasting? This one I feel more skeptical of. My best guess is that AI forecasting will have an easier time becoming widely adopted. Here’s my argument.

I don’t know much about why forecasting hasn’t been more widely adopted. But my guess would be that the story is something like:

For AIs, these problems seem smaller:

Overall, I’m somewhat into “indirect” approaches as a path-to-impact, but only somewhat. It at least seems worth pursuing the most leveraged efforts here, such as making sure that we always have great forecasting benchmarks and getting AI forecasting services to work with important actors as soon as (or even before) they start working well.

On long-lasting differential capability improvements

It seems straightforward and scalable to boost epistemic capabilities in the short run. But I expect a lot of work that leads to short-run improvements won’t matter after a couple of years. (This completely ruins your path-to-impact if you’re trying to directly improve long-term capabilities. But even if you’re pursuing an indirect strategy, it’s worse for improvements to last for months than for them to last for years.)

So ideally, we want to avoid pouring effort into projects that aren’t relevant in the long run. I think there are two primary reasons why projects may become irrelevant in the long run: the bitter lesson, or other people doing them better with more resources.

That said, even if we mess this one up, there’s still some value in projects that temporarily boost epistemic capabilities, even if the technological discoveries don’t last long: The people who work on the project may have developed skills that let them improve future models faster, and we may get some of the indirect sources of value mentioned above.

Ultimately, key guidelines that I think are useful for this work are:

Painting a picture of the future

To better understand which of today’s innovations will be more/less helpful for boosting future epistemics, it’s helpful to try to envision what the systems of the future will look like. In particular: It’s useful to think about the systems that we especially care about being well-designed. For me, these are the systems that can first provide a very significant boost on top of what humans can do alone, and that get used during the most high-stakes period around TAI-development.

Let’s talk about forecasting in particular. Here’s what I imagine such future forecasting systems will look like:

Concrete projects for differentially advancing epistemic capabilities

Now, let’s talk about concrete projects for differentially advancing epistemic capabilities, and how well they do according to the above criteria and vision.

Here’s a summary/table-of-contents of projects that I feel excited about (no particular order). More discussion below.

Efforts to provide AI forecasting assistance (or other ambitious epistemic assistance) to governments are another category of work that I’d really like to happen eventually. But I’m worried that there will be more friction in working with governments, so that it’s better to iterate outside them first and then try to provide services to them once those services are better. This is only a weakly held guess, though. If someone who was more familiar with governments thought they had a good chance of usefully working with them, I would be excited for them to try it.

In the above paragraph, and the above project titles, I refer to AI forecasting or “other ambitious epistemic assistance”. What do I mean by this?

Now for more detail on the projects I’m most excited about.

Evals/benchmarks for forecasting (or other ambitious epistemic assistance)

Automate forecasting question-generation and -resolution

Logistics of past-casting

Start efforts inside of AI companies for AI forecasting or other ambitious epistemic assistance

Scalable oversight / weak-to-strong-generalization / ELK

Experiments on what type of arguments and AI interactions tend to lead humans toward truth vs. mislead them

Concluding thoughts

The development of AI systems with powerful epistemic capabilities presents both opportunities and significant challenges for our society. Transformative AI will have a big impact on our society’s epistemic processes, and how good or bad this impact is may depend on what we do today.

I started out this post by distinguishing between efforts to differentially increase AIs’ epistemic capabilities and efforts to enable the diffusion and appropriate trust of AI-discovered information. While I wrote a bit about this second category (characterizing it as good norms & practices for AI as knowledge producers and communicators), I will again note that the relative lack of content on it doesn’t mean that I think it’s any less important than the first category.

On the topic of differentially increasing epistemic AI capabilities, I’ve argued that work on this today should (i) focus on methods that will complement rather than substitute for greater compute budgets, (ii) prioritize problems that industry isn’t already trying hard to solve, and (iii) take a special interest in showing people what the future has in store by demonstrating what’s currently possible and prototyping what’s yet to come. I think that all the project ideas I listed do well according to these criteria, and I’d be excited to see more work on them.

