Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes

post by owencb, AI Impacts (AI Imacts) · 2024-04-16T10:10:13.338Z · LW · GW · 12 comments

This is a link post for https://blog.aiimpacts.org/p/essay-competition-on-the-automation

Contents

  Background
  Scope
      Automation of wisdom
      Automation of philosophy
      Thinking ahead
      Ecosystems
  Judging
  Details
  FAQ on the automation of wisdom and philosophy
    What’s the basic idea here?
    What do you want to know about such automation?
    What do you mean by “wisdom” and “philosophy”?
    What threats are you concerned about?
      Unwise human actions
      Human philosophical errors
      Unwise AI actions
      AI philosophical errors
    Is there a particular threat model you’re focused on?
    Automating wisdom, philosophy — isn’t this all just AI capabilities work?
12 comments

With AI Impacts, we’re pleased to announce an essay competition on the automation of wisdom and philosophy. Submissions are due by July 14th. The first prize is $10,000, and there is a total of $25,000 in prizes available. 

Submit an entry via this form.

The full announcement text is reproduced here:

Background

AI is likely to automate more and more categories of thinking with time.

By default, the direction the world goes in will be a result of the choices people make, and these choices will be informed by the best thinking available to them. People systematically make better, wiser choices when they understand more about issues, and when they are advised by deep and wise thinking.

Advanced AI will reshape the world, and create many new situations with potentially high-stakes decisions for people to make. To what degree people will understand these situations well enough to make wise choices remains to be seen. To some extent this will depend on how much good human thinking is devoted to these questions; but at some point it will probably depend crucially on how advanced, reliable, and widespread the automation of high-quality thinking about novel situations is.

We believe[1] that this area could be a crucial target for differential technological development, but is at present poorly understood and receives little attention. This competition aims to encourage and to highlight good thinking on the topics of what would be needed for such automation, and how it might (or might not) arise in the world.

For more information about what we have in mind, see some of the suggested essay prompts or the FAQ below.

Scope

To enter, please submit a link to a piece of writing, not published before 2024. The piece may be published or unpublished, although if it is selected for a prize we will require publication (at least in pre-print form; optionally on the AI Impacts website) in order to pay out the prize.

There are no constraints on the format — we will accept essays, blog posts, papers[2], websites, or other written artefacts[3] of any length. However, we primarily have in mind essays of 500–5,000 words. AI assistance is welcome but its nature and extent should be disclosed. As part of your submission you will be asked to provide a summary of 100–200 words.

Your writing should aim to make progress on a question related to the automation of wisdom and philosophy. A non-exhaustive set of questions of interest, in four broad categories:

Automation of wisdom

Automation of philosophy

Thinking ahead

Ecosystems

If you’re not sure whether a topic would be within scope, feel free to check with us.

Judging

The judging process will be coordinated by Owen Cotton-Barratt. After shortlisting, entries will be assessed by a panel of judges: Andreas Stuhlmüller, Brad Saad, David Manley, Linh Chi Nguyen [EA · GW], and Wei Dai.

Judging criteria will be:

The prize pool is $25,000, and the prize schedule will be:

We may contact entrants whose work impresses us about possible further opportunities (e.g. conferences or research positions) on these topics.

Details

Entries should be submitted via this form, which asks for:

You are of course welcome to seek feedback on drafts before submission. Coauthored articles are also very welcome.

The deadline for submissions is midnight anywhere in the world on Sunday 14th July. We hope to complete shortlisting within two weeks of the submission deadline, and contact winners within four weeks of the submission deadline. Winners whose entries are not yet public will have two weeks after we contact them to provide a public version, or agree to us publishing it on the AI Impacts website. Payment will be made by ACH (for US-based winners) or wire transfer (for international winners).

We reserve the right to extend the submission deadline or increase the prize pool without notice. Judges have the right to split prizes in cases of ties, or to not award prizes in the unlikely event that no submissions are found to merit them.

If you want to ask questions about the competition, feel free to comment, or to email essaycompetition@aiimpacts.org

FAQ on the automation of wisdom and philosophy

What’s the basic idea here?

We're interested in the automation of thinking that can help actors to take wise actions (whatever that means) and avoid unwise actions. As an important subcategory, we're interested in the automation of philosophical thinking, and how to avoid practical errors grounded in philosophical mistakes.

What do you want to know about such automation?

We're not certain! We think it's a potentially important area which hasn't received that much attention. We'd like people to explore more of the ideas around this. If we understood more of the contours of when such automation might be helpful (or unhelpful!), that would seem good. If we understood more about what would be necessary for automation, that would seem good. If people developed a sense of things it would be good for someone to do in the world, that's potentially great.

We give a bunch of example questions we'd be interested in people addressing in the essay prompts part of the announcement, but because it seems like a broad area we've preferred to leave the competition fairly open, and wait to see which parts people can make meaningful contributions to.

What do you mean by “wisdom” and “philosophy”?

By “wisdom”, we mean something like “thinking/planning which is good at avoiding large-scale errors”. An archetype of something which is smart-but-not-wise might be a plan full of clever steps which are each individually well-chosen to chain to the previous step in the plan, but which collectively forget why they were doing this, and end up taking actions which are in conflict with the original goal. Wisdom is also what’s needed for noticing that an old ontology was baking in some problematic assumptions about what was going on.

By “philosophy”, we mean something like “the activity of trying to find answers by thinking things through, without the ability to observe answers”. This is close to the sense understood in the academic discipline of philosophy.

We’re not sure if automating these things is most naturally thought of as one topic, two topics, or more …

What threats are you concerned about?

Progress in these areas seems like it could potentially help avoid a number of different issues:

Unwise human actions

Humans sometimes take actions which are predictably unwise (from some perspectives), and which they later regret. Such actions could be really bad if they interact with high-stakes situations. If people had access to trusted, high-wisdom automated advice, this could help them to reduce the rate of these errors.

This might be particularly important around issues coming with the development of AI, as people will be facing very novel situations and be less able to rely on experience.

Human philosophical errors

People sometimes make decisions that are influenced by their philosophical understanding of an issue. This could happen in the future, e.g. around understanding of AI consciousness/rights. Automation of good work, if achievable, could help people to have deeper understanding by the time they need to make key decisions.

Unwise AI actions

If people empower AI agents, ensuring that they are in some sense wise and not just smart could help to reduce rare damaging actions. In the extreme this could reduce risk of human extinction (imagine an AI system which wipes out humans in order to secure its own power, and later on reflection wishes it hadn't; a wiser system might have avoided taking that action in the first place).

AI philosophical errors

If AI systems become superintelligent and are meaningfully running the world, their stances on philosophical questions could matter. For example, deciding to engage in acausal trade (if it doesn’t actually make sense), or deciding not to (if it does), could be a large and consequential error. Better understanding of the automation of philosophy could help either to lead to more philosophically-competent AI systems, or alternatively could help people to coordinate about which parts of thinking should not be delegated to AI systems.

Is there a particular threat model you’re focused on?

No. We could make some guesses (both about which of the above categories are most concerning, and more concretely what the most concerning threats within them are), but we feel like the whole area is under-explored, and wouldn’t be confident in our guesses. We’d love to see high-quality analysis of this.

The fact that the automation of wisdom/philosophy seems important to better understand for multiple different threats — and also seems like a plausibly useful intervention for improving our ability to handle unknown unknowns — feeds into our desire to see it prioritized more than at present.

Automating wisdom, philosophy — isn’t this all just AI capabilities work?

Maybe! Certainly this is a type of capability (and high performance probably requires significantly advanced general capabilities, relative to today).

However, it seems to us that for a given level of general smarts in a system, the capacity for wisdom or philosophy could keep up with that, or could fail to. We are concerned about worlds where the ability to automate wise actions is outstripped by the ability to automate smart ones. So it seems like it may (at least in part) be a problem of differential technological development. We would be interested in further analysis of this question.

  1. ^

     The precise opinions expressed in this post should not be taken as institutional views of AI Impacts, but as approximate views of the competition organizers. We offer them not because we're sure they're exactly right, but because we think they're pointing in a promising direction and it's more likely to provoke high quality interesting entries if we provide some concrete starting points.

  2. ^

     We recognise that the timeline may be on the tight side for thoroughly researched papers. We are very happy to consider papers (and note that most journals accept papers that have been available as pre-prints, e.g. see https://philarchive.org/journals.html for philosophy journals), but for entrants who are targeting academic publication we also welcome people putting the heart of their argument into an essay for the competition and later expanding it into a paper.

  3. ^

     Feel free to use unusual formats if you consider them best for exploring the ideas. For example, we would be happy to receive a fictional business plan or technical roadmap for a hypothetical firm working on a challenge in these areas.

12 comments

Comments sorted by top scores.

comment by Alex Mallen (alex-mallen) · 2024-04-16T15:37:54.440Z · LW(p) · GW(p)

I think this is a very important and neglected area! That its tractability is low is one of its central features, but progress on it now greatly aids progress later by making these hard-to-measure aims easier to measure.

comment by sweenesm · 2024-09-13T14:31:07.327Z · LW(p) · GW(p)

Any update on when/if prizes are expected to be awarded? Thank you.

Replies from: owencb, owencb
comment by owencb · 2024-09-13T14:54:03.515Z · LW(p) · GW(p)

The judging process should be complete in the next few days. I expect we'll write to winners at the end of next week, although it's possible that will be delayed. A public announcement of the winners is likely to be a few more weeks.

comment by owencb · 2024-09-22T11:19:13.316Z · LW(p) · GW(p)

I've now sent emails contacting all of the prize-winners.

comment by Yitz (yitz) · 2024-04-16T19:11:38.224Z · LW(p) · GW(p)

imagine an AI system which wipes out humans in order to secure its own power, and later on reflection wishes it hadn't; a wiser system might have avoided taking that action in the first place

I’m not confident this couldn’t swing just as easily (if not more so) in the opposite direction—a wiser system with unaligned goals would be more dangerous, not less. I feel moderately confident that wisdom and human-centered ethics are orthogonal categories, and being wiser therefore does not necessitate greater alignment.

On the topic of the competition itself, are contestants allowed to submit multiple entries?

Replies from: owencb, owencb
comment by owencb · 2024-04-16T20:56:26.014Z · LW(p) · GW(p)

Multiple entries are very welcome!

[With some kind of anti-munchkin caveat. Submitting your analyses of several different disjoint questions seems great; submitting two versions of largely the same basic content in different styles not so great. I'm not sure exactly how we'd handle it if someone did the latter, but we'd aim for something sensible that didn't incentivise people to have been silly about it.]

comment by owencb · 2024-04-16T21:06:01.432Z · LW(p) · GW(p)

It's a fair point that wisdom might not be straightforwardly safety-increasing. If someone wanted to explore e.g. assumptions/circumstances under which it is vs isn't, that would certainly be within scope for the competition.

comment by Mitchell_Porter · 2024-07-15T06:17:22.959Z · LW(p) · GW(p)

I got nowhere near to writing an entry for this competition, but I will link to an essay from 12 years ago [LW · GW], which contains some of the concerns I might have tried to develop. 

comment by Chris_Leong · 2024-07-09T13:54:30.047Z · LW(p) · GW(p)

Do you only take single-author submissions?

Replies from: owencb
comment by owencb · 2024-07-10T09:23:01.620Z · LW(p) · GW(p)

No, multi-author submissions are welcome! (There's space to disclose this on the entry form.)

comment by Lao Mein (derpherpize) · 2024-04-19T15:03:18.092Z · LW(p) · GW(p)

Can you give examples of what you're looking for? Can I email you entries and expect a response?

Replies from: owencb
comment by owencb · 2024-04-19T22:32:52.751Z · LW(p) · GW(p)

I feel awkward about trying to offer examples because (1) I'm often bad at that when on the spot, and (2) I don't want people to over-index on particular ones I give. I'd be happy to offer thoughts on putative examples, if you wanted (while being clear that the judges will all ultimately assess things as seem best to them). 

Will probably respond to emails on entries (which might be to decline to comment on aspects of it).