Zvi’s Thoughts on the Survival and Flourishing Fund (SFF)

post by Zvi · 2021-12-14

Contents

  How the S-Process Works
  The Recommenders
  Incentives of the S-Process for Applicants
    1. Apply at all.
    2. Ask for lots of money.
    3. Have a credible case that they could usefully spend the money you ask for.
    4. Be legible and have a legible impact story. Fit into existing cause areas and stories of impact.
    5. Establish legible threshold levels of competency and execution.
    6. Establish threshold levels of anticipated and current cause alignment.
    7. Associate with insiders without generating clear conflicts of interest.
    8. Avoid salient potential harms.
    9. Avoid negative associations that make people uncomfortable.
    10. Don’t be caught lying.
  Incentives of the S-Process for Recommenders
  The Unilateralist’s Curse
  The Time and Resource Gap
  Too Much Money
  And the Nominees Are
  Unserious Applications
  Orthogonal Applications
  Innovation Station
  Nuclear War
  Postscript: ALLFED Corrects Its Estimates
  AI Safety Paper Production
  Access to Power and Money
  Lightcone Infrastructure
  Conclusion

Epistemic Status: The views expressed are mine alone. They are various degrees of strong and weak, and are various degrees of weakly or strongly held. Due to time constraints, I’m not confident I was careful to mark every claim here with the proper ‘I think that’ style caveats. If something seems to you like it has a huge ‘citation needed’ sign on it, I’m probably not claiming to have proven anything.

I was one of the recommenders for the most recent round of grants for the Survival and Flourishing Fund. In accordance with our recommendations, $9.6 million was allocated for distribution to various charities. The process involved four hours-long meetings where we discussed various questions, several additional discussions with other recommenders individually, many hours spent reading applications, doing research and thinking about what recommendations to make, and a number of meetings with various applicants.

This felt simultaneously like quite a lot of time, and like far less time and far fewer resources than the scope justified. $9.6 million is a lot of money, and the share of that directed by my recommendations was also quite a lot of money.

Getting both the process right and the answers right are rather big deals.

The Survival and Flourishing Fund is related to the Effective Altruism (EA) movement, in that some of the people involved in SFF have also been involved in various EA activities, like EA Global, and SFF is trying to answer a question (how to do good with money) that is historically core to EA thinking and discourse. 

It is also related to the EA movement in that, despite no official relationship between SFF and EA, despite the person who runs SFF not considering himself an Effective Altruist (although he definitely believes, as I do, in being effective when being an altruist, and also in being effective when not being an altruist), despite SFF not being an EA organization, despite the words ‘altruist’ or ‘effective’ not appearing on the webpage, at least this round of the SFF process and its funds were largely captured by the EA ecosystem. EA reputations, relationships and framings had a large influence on the decisions made. A majority of the money given away was given to organizations with explicit EA branding in their application titles (I am including Lightcone@CFAR in this category).

Before going further, it is very important to make two things clear.

First: I am not an Effective Altruist. 

And second, to reiterate: SFF is not in any way formally related to or part of EA. 

If you are not familiar with EA, you’ll want a basic familiarity before continuing (or deciding not to continue, if this post is not relevant to your interests). Without at least some sort of an introduction to and basic familiarity with EA, a lot of this isn’t going to make a lot of sense. The Wikipedia article on it gives a reasonable first explanation. If you want the EA perspective on themselves, and want to read their pitch knowing it is a pitch, this is their essential pitch.

I know many EAs and consider many of them friends, but I do not centrally view the world in EA terms, or share the EA moral or ethical frameworks. I don’t use what seem, for all practical purposes, to be their decision theories. I have very large, very deep, very central disagreements with EA and its core components and central organizations and modes of operation. I have deep worries that important things are deeply, deeply wrong, especially epistemically, and that this results in an increasingly Goodharted and inherently political and insider-biased system. I worry that this does intense psychological, epistemic and life experiential damage to many EAs.

Some of that I’ll gesture at or somewhat discuss here, and some of it I won’t. I’m not trying to justify all of my concerns here, I’m trying to share thoughts. If and when I have time in the future, I hope to write something shorter that is better justified.

I also want to make something else clear, for all my disagreements with and worries about it: These criticisms of Effective Altruism are comparing it to what it can and should be, and what it needs to be to accomplish its nigh-impossible tasks, rather than comparing it to popular alternatives.

If you read my Moral Mazes sequence, you’ll see how perversely I view most of what many people do most days. I critique here in such detail because, despite all our disagreements and my worries, I love and I care.

This post is an attempt to do a few different things.

  1. Share my experience of the S-process we used to allocate funds.
  2. Think about how the S-process works, what places it creates weird incentives or results, and how the S-process could be improved.
  3. Encourage worthy applicants for future rounds, since it was clear that organizations generally underapplied.
  4. Share some of my thoughts on various organizations that applied, in the hopes this information can be useful to others.
  5. Share some of my heuristics and models of the Effective Altruist space in general and how it distributes funds, and how that became a large part of what happened at SFF despite SFF not having any formal or intended relationship to EA.
  6. Anything else that seems salient, keeping in mind that a lot of this stuff goes infinitely deep and requires going over actual everything.

How the S-Process Works

The S-process works like this, skipping over some minor stuff:

  1. Jaan (with some input from other funders, or a decision to delegate to Jaan) chooses candidates he thinks would be good recommenders, and asks them if they’d like to participate. He then uses several heuristics plus a source of randomness to select the final group. Heuristics include having a variety of perspectives, and having a mix of repeat and new participants.
  2. The candidates look over the list of applications to get a sense of the landscape, and to check for and declare any conflicts of interest.
  3. Meeting one. We meet, we vote on all the declared conflicts of interest (and do this again if something new comes up), go over the schedule and how the process works, and who is interested in investigating various organizations. If you have a conflict of interest you can answer questions but otherwise stay out of discussions on the group in question, and you can’t fund them (which matters, but less than you’d think it would, because funding is based on who wants to fund you most, not on a vote, so others can and often do pick up the slack).
  4. We also go over the goal, which is survival and flourishing over the long term, as the funders (mostly Jaan, the largest funder) would understand such things. This is of course open to interpretation. Jaan made himself available in case we had any questions.
  5. Everyone goes over the list of organizations, does an initial pass including concrete evaluations of how much good it would do to give them various amounts of money.
  6. Some applications come in late. Recommenders are free to ignore late applications, or to treat them normally, or anything in between.
  7. For each organization, we have three knobs to turn: the value of their first dollar, how many dollars would be useful at all, and the concavity of the curve between those points. This is elegant and forces concreteness, but has issues I’ll discuss later (an update to the app is planned to allow inputting an arbitrary monotonically decreasing function). The list of organizations included a virtual organization called “hold,” which represented not giving out the money in this round, and instead advising that the funds be held for future rounds.
  8. Various recommenders will do various forms of further investigation, including meetings with applicants, as seems worthwhile.
  9. From this point on, we all continuously revise our funding decisions until they are locked late in meeting three. After a while, this focuses on decisions that would plausibly change allocations – I think we all did what we could to make our ‘background’ evaluations robust and fix clear mistakes, but didn’t think carefully about things unlikely to matter.
  10. Meeting two we vote on what seem like the highest value discussions, and use our time as best we can. The software helps highlight disagreements to suggest discussions.
  11. Meeting 2.5 is an extra optional similar meeting, more discussions.
  12. Meeting three is another set of similar discussions, except that at the end our decisions are locked, so it was focused on things that would change decisions within that time frame.
  13. Funding worked by a formula that was effectively: Each of our virtual representations took automated turns allocating $1K to whatever we thought had the most marginal value if funded, until we were finished allocating funds. You are funded based on who most is excited to fund you, not based on a consensus on what to fund. I’ll discuss the implications more later.
  14. The funders decide which recommenders to give how much money to, so you do need to ensure your allocations are robust to changes in who gets how much money. This round, there was a factor of ~2x difference in funding between the recommender who ended up directing the most funding and the recommender who ended up directing the least, if you exclude money returned to the funders for use in future rounds (3x if you include money returned to hold).

Or, from an outside view that excludes internal steps:

  1. Jaan chooses a group of recommenders.
  2. Organizations apply for funding.
  3. Recommenders evaluate applicants.
  4. Recommenders generate a payoff function for funding each of them based on (value of first dollar, number of net useful dollars, concavity of the resulting curve connecting those points).
  5. Funders adjust how much and when funding should flow through each recommender.
  6. The system allocates funds by (virtually) giving funds to each recommender; we then take turns allocating $1K to our top choice until all the money has been allocated (see the sketch after this list).
  7. Money is donated.
  8. Hopefully good things.
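
To make the mechanics concrete, here is a minimal sketch of that allocation loop in Python. This is my reconstruction for illustration, not the actual S-process software; in particular, the exact way the concavity knob shapes the marginal value curve (here, a power applied to a linear decline that hits zero at the last useful dollar) is an assumption on my part, and all the names are mine.

```python
# Illustrative sketch only -- not the real S-process code.
from dataclasses import dataclass

STEP = 1_000  # each turn allocates $1K


@dataclass
class Evaluation:
    first_dollar_value: float  # knob 1: value of the org's first dollar
    last_useful_dollar: float  # knob 2: total dollars the org could usefully absorb
    concavity: float = 1.0     # knob 3: shape of the decline between those points (assumed form)


def marginal_value(ev: Evaluation, already_given: float) -> float:
    """Value of the next dollar to this org, per one recommender's curve."""
    if already_given >= ev.last_useful_dollar:
        return 0.0
    remaining = 1.0 - already_given / ev.last_useful_dollar
    return ev.first_dollar_value * remaining ** ev.concavity


def allocate(budgets: dict, evaluations: dict) -> dict:
    """budgets: recommender -> dollars they direct.
    evaluations: recommender -> {org name: Evaluation}.
    Returns org name -> total dollars allocated."""
    allocated = {org: 0.0 for evals in evaluations.values() for org in evals}
    budgets = dict(budgets)
    while any(b >= STEP for b in budgets.values()):
        for rec in evaluations:
            if budgets[rec] < STEP:
                continue
            # Each recommender funds whichever org *they* value most at the margin;
            # "hold" can be modeled as just another org with a flat curve.
            best = max(evaluations[rec],
                       key=lambda org: marginal_value(evaluations[rec][org], allocated[org]))
            if marginal_value(evaluations[rec][best], allocated[best]) <= 0.0:
                budgets[rec] = 0.0  # nothing left worth funding; the rest goes back to the funders
                continue
            allocated[best] += STEP
            budgets[rec] -= STEP
    return allocated
```

Note how the loop only ever consults each recommender’s own curve: the most enthusiastic recommender determines an organization’s funding, which is exactly the dynamic discussed under The Unilateralist’s Curse below.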

This was intense. There were millions of dollars being allocated, and both my decisions on funding and the arguments I made in discussions made a big difference to how the money got allocated.

The Recommenders

Jaan chose a very strong group of recommenders given the task at hand. Everyone took the job seriously, everyone was doing their best to be cooperative rather than strategic, and everyone helped everyone else think through the issues, gather information, share models and considerations, and reach our own opinions based on our individual epistemic perspectives, world models and values, as reflected in the desire to improve long term survival and flourishing.

Whether or not they would consider themselves EAs as such, the other recommenders effectively thought largely in Effective Altruist frameworks, and seemed broadly supportive of EA organizations and the EA ecosystem as a way to do good. One other member shared many of my broad (and often specific) concerns to a large extent; mostly the others did not. While the others were curious and willing to listen, there was some combination of insufficient bandwidth and insufficient communicative skill on our part, which meant that while we did get some messages of this type across on the margin, and this did change people’s decisions in impactful ways, I think we mostly failed to get our central points across more broadly.

To the extent one thinks any or all of that is wrongheaded or broken, one would take issue with the process and its decisions, especially the resulting grants which ended up giving the majority of the funds distributed to explicitly EA-branded organizations.

From many of the valid alternative perspectives that do think such things about EA as it exists in practice, being unusually virtuous in executing the framework here doesn’t make the goings on much less horrifying. I get that.

But given the assumptions of the core EA framework to the extent it was relied upon, I have nothing but praise for all the recommenders and their efforts. That’s important.

Three of us decided to make our names public: Myself, Oliver Habryka and Beth Barnes.

The others have chosen to remain anonymous. I’m sad that the others made that choice, but of course it must be respected.

Incentives of the S-Process for Applicants

The S-process has several interesting impacts on someone considering applying for funds. There’s some advantage to such a process having ‘security through obscurity’ where no one knows what the strategic moves are and thus plays more honestly, but in general this is an error and it is better to be transparent. Most importantly, outsiders not knowing how it works will cause insiders to be favored over outsiders, which is already a failure mode I am worried about.

Here are the things an organization should do if they want funding. These are not the ten things that I would prefer organizations do, or that anyone else would want; rather, they are the ten things that will cause an organization to be funded more, and more often, under conditions like these:

  1. Apply at all.
  2. Ask for lots of money.
  3. Have a credible case that they could usefully spend the money you ask for.
  4. Be legible and have a legible impact story. Fit into existing cause areas and stories of impact.
  5. Establish legible threshold levels of competency and execution.
  6. Establish threshold levels of anticipated and current cause alignment.
  7. Associate with insiders without generating clear conflicts of interest.
  8. Avoid salient potential harms.
  9. Avoid negative associations that make people uncomfortable.
  10. Don’t be caught lying.

This is due to a mix of the implementation details of the S-process and how people interact in practice with those details, especially anchoring effects, and the ways in which EA insiders in this type of situation currently decide how to allocate funding to projects.

A simple model of the default S-process behavior, with respect to an application whose paperwork is in order, is to ask a series of questions that goes something like this.

  1. What cause area does this fit into? Do I care?
  2. What exactly are you claiming to do? Do I care? Would that do anything?
  3. Are you credibly going to attempt to do this thing? Can we trust you?
  4. Can you execute? Do you have a track record? Do we know you?
  5. Could doing this backfire? Should I worry about that?
  6. Does anything about this feel uncomfortable? What vibe do you give off? OK to associate and fund?
  7. How much money are you asking for? How much capacity do you have?
  8. Are there game theory or political considerations in how much to give you?

The most important driver of this is that there is more demand to allocate funds than there is supply of high-quality legible known places to put those funds. Too much money is chasing too few known good places to put it. This round of the SFF had enough money, and arguably it had TMM, or Too Much Money.

If you had given any one recommender the full allocation of money, all of us would have given at least a substantial chunk of that money back to the funders rather than spend it, because there weren’t enough worthwhile places to put the money. By combining the places any one person thought were worthwhile, we almost managed to spend it all.

That is why the questions above look like negative selection, a series of filters to pass through, rather than positive selection. To get funded, including to get funded for quite a large amount, it helped to convince at least one person to be super excited by your project, but it wasn’t necessary. All you had to do to get funded was convince someone that your option was better than doing nothing, and that was mostly good enough. It wouldn’t get you quite as much as someone who was super excited, but depending on how much capacity you could credibly claim, that could easily matter far more.

It is an interesting question, among the organizations worth funding, which organizations are more impactful and deserving of funds. For a small individual donor, that is the most important question: you want to find the best thing to do and do that.

But for the purposes of something like the S-process, under current conditions, it mostly does not matter much. If you are good enough, there’s no reason other than game theory not to fund you up to your level of capacity to spend money. If you’re not good enough, there’s no reason to fund you regardless of who else is applying. Yes, the more excited we are, the more money we’ll find a way to give you, but the test is mostly pass or fail.

In order to get to that threshold, the correct strategy seems to me to be to follow the above principles. You want to show that you can be trusted to do the thing you say you will do, that you can execute on it, that you have the capacity to spend this kind of money, and that the thing you are doing can be counted as clearly good and also clearly not bad. The something you’re up to can be pretty vague, depending on trust in the group, ability, capacity and cause area in question.

Going over the individual points:

     1. Apply at all.

This one’s a slam dunk. The amount of money per organization, even per organization excluding those who got grants in the past, is very high. The cost to apply is very low. At a minimum, you should make an honest low-effort application, because people want to find things to fund and will find ways to overlook such things. I know a group that applied despite knowing they almost certainly wouldn’t get funded, and they didn’t, and it was still a very good expected value play to apply.

     2. Ask for lots of money.

You can overdo this, but it’s hard to do so unless you fail at number three. Asking for less does make it more likely you’ll get your full ask and more likely you’ll get anything at all, of course it does, but when people are deciding how much to give you, they have to set a maximum.

In addition to the standard anchoring effects, by default the utility of money is assumed by the S-process to decline linearly from the first dollar to the last dollar. Thus, if you can get me to say you could spend $1 million, then the model says your 500,000th dollar is still half as valuable as your first. Whereas if you ask for $100k, but have no ability to spend more, then it’s hard to even give you the $100k, because the system assumes the utility of money is declining that much faster. The ability to spend the extra money makes it much easier for the process to naturally give you a lesser amount. There are ways to try and adjust for this, but they clearly didn’t come close to fully adjusting.

Let’s say you as a recommender in the S-process think that some organization, GoodCause, can use $200k, but at that point it has no ambitions beyond that. You have three knobs to turn: First dollar value, last useful dollar, and concavity. If you enter the last useful dollar as $200k, you can use concavity to almost give them $200k, but you’re going to give them less. If they make up some story about what they’d do with more money, even if it’s ‘we will not need to raise money next year’ or something, then you can enter $400k as the last dollar, and have a good shot of getting them at least $200k, but unless you’re hacking the numbers to get to exactly $200k you’ll likely give them a bunch extra.

I do get why the process works that way, to force it to make logical sense and avoid very weird curves that don’t reflect reality all that well, but it has some weird effects and predictable distortions.
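
To see the distortion numerically, here is a toy comparison reusing the (assumed) marginal_value curve from the sketch earlier in the post, with a linear decline (concavity 1). The dollar figures echo the GoodCause example above and are made up.

```python
# Toy numbers, not real evaluations.
modest = Evaluation(first_dollar_value=1.0, last_useful_dollar=200_000)
ambitious = Evaluation(first_dollar_value=1.0, last_useful_dollar=400_000)

for spent in (0, 100_000, 200_000):
    print(spent, marginal_value(modest, spent), marginal_value(ambitious, spent))

# modest:    1.0 at $0, 0.50 at $100k, 0.00 at $200k
# ambitious: 1.0 at $0, 0.75 at $100k, 0.50 at $200k
#
# Under the $200k ask, the marginal value fades toward zero well before $200k, so the
# turn-taking loop drifts off to other orgs; under the $400k ask it is still 0.5 at
# $200k, so the org plausibly ends up with its 'real' $200k, or more.
```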

     3. Have a credible case that they could usefully spend the money you ask for.

You need that credible case in order to get people to be willing to put in high last-dollar values with straight faces. The ‘real’ curve often has an additional inflection point (or two) in it. There’s the first $X, which you actually have good uses for, and then there’s the next $Y, which you could spend if you wanted to but is expendable, so there really is a phase shift at some point, although given the uncertainty of other funding sources a strict discontinuity is suspicious. But the lack of one is also suspicious, especially given the many instincts running around.

Note that if you have a case that you can spend money, and you get the money, that puts a lot of pressure on you to actually spend it rather than bank it for the future. If you don’t, then people will stop giving you money and fall out of the habit of doing so, and you’ll have less money, whereas if you scaled up you could instead raise more money.

This is commonly a problem within corporations and government agencies as well, where it is very clearly destructive. Departments are careful to always spend exactly their budget.

We tried to emphasize to a few places ‘here is some money but please don’t feel obligated to spend it any time soon,’ which hopefully will have some effect, but this is an ongoing problem.

The history of getting funding is used to get more funding, and the history of raising and then spending money generates more money. Thus there are strong incentives for organizations to go bigger and expand beyond their needs or ability to keep quality high, taking on additional tasks as needed, and few incentives to stay small or to disband once the initial mission is complete.

     4. Be legible and have a legible impact story. Fit into existing cause areas and stories of impact.

When an organization had a legible standard-form way to claim they had impact – e.g. ‘we will write a bunch of AI policy papers in which we point out how safe actions are good, unsafe actions are bad, but unsafe actions are cheaper, perhaps saying the words Prisoner’s Dilemma a lot’ – there was a lot of ‘oh all right that’s something that does something, I guess’ or similar. Generic claims to gain ‘influence’ or to be a place to regrant, or to raise additional funds, had similar effects.

The influence case is interesting because it illustrates that the division is unnatural and you can get strange classification decisions. It’s actually not clear at all that this cashes out in anything, yet it has now seemingly been classified as legible. There’s a kind of conventional wisdom among EAs as to what set of actions ‘counts’ as legible positive action, and which ones don’t, and it’s your call how coincidental it is that it largely correlates to increasing the influence and power of EA, and/or to giving money to things EA does but outsiders don’t do.

     5. Establish legible threshold levels of competency and execution.

If people are worried that you’re not competent in general, or in particular that you can’t execute on your plans, then they won’t want to fund you. 

Since this means a lot of people who don’t have infinite time and largely know each other are trying to figure out who is competent and can execute, it becomes important to create general impressions that you can be competent and execute. A vague sentence claiming concerns about this can sink you quite a bit. Thus, this type of thinking causes a lot of risk aversion, and a desire to find ‘normal’ concrete wins, whereas (as Eliezer Yudkowsky says in his recent dialogues) the things most worth doing are mostly things that are doomed to probably fail, but have enough payoff that they’re worth trying anyway.

      6. Establish threshold levels of anticipated and current cause alignment.

If people think you won’t know to stick to officially approved good causes and avoid related but alas bad versions of such causes, or worry you’ll pivot into unrelated things, that’s also a reason not to fund you.

These two problems are also central in academia, getting and keeping a position and getting funding for your work. This isn’t surprising since both areas are working on a system where people give you money and you’re expected to do a thing, then you ask for money to do another thing, and mostly it’s a Boolean where you either get it or you don’t. If you do a bunch of small but concrete and legible things, people go ‘oh that person or group can execute, and has evidence of impact’ or other similar things, and you get funded. If you do a few moonshot projects, and can’t point to ‘evidence of impact’ or anything concrete that came out of it, then people start to doubt you and think there’s something wrong with you. Or there’s some sort of bad vibe around you, and that sinks you. And people see all this coming after a while, and adjust accordingly.

      7. Associate with insiders without generating clear conflicts of interest.

There are three key dynamics promoting association with insiders.

The first is informational for the applicant. If you associate with insiders, you’ll know to apply for things like SFF, and you’ll know more about what it takes to improve your chances. To some extent this is unavoidable, but I also think it is on us to reach out to those who don’t know about such things, and let them know. We can say things like ‘that should be their responsibility’ all we like, but that doesn’t accomplish anything beyond turning us into a kind of Venture Altruist that gives out money to those who demonstrate skill at seeking money, a pattern that I worry generalizes quite far.

The second reason is informational for the recommenders. If some of the recommenders know you personally, or know people they trust who know you personally, that goes a long way, whereas if we didn’t have that, evaluation would take a bunch of time. Time was one thing none of us had in abundance; there was no way to do a remotely complete look at all the reasonable organizations given our schedules. It would have required being full time on the project. Maybe that’s what should have happened, but it’s a much more expensive proposition in several senses.

In the meantime, ‘cached thoughts’ and impressions about various organizations and people were largely used because they were quick and we didn’t have anything better.

The third reason is making people comfortable and allowing them to trust you. Things that are explicitly EA-labeled in ways that are credible seemed to get benefit of the doubt on many levels. There was definitely a vibe that such people would know to play by the rules of the game, avoid things that made people uncomfortable or raise alarms about potential harms, and generally keep up the game of playing at being good, and encourage more game playing, in addition to any real good that may or may not get done.

The flip side is that if you got too close to the recommenders, we’d have to recuse. I was recused from Lightcone, which I wasn’t sure was necessary but makes sense, and that cost them a decent amount of money that I believe I would have given them had there not been a conflict of interest. I was also recused from Median Foundation, which I did agree was completely necessary. It’s an unfortunate side effect, but I don’t see a way around it without making things worse.

      8. Avoid salient potential harms.

When you’re doing something that’s clearly importantly good and impactful, some amount of harm, or risk of harms, is both acceptable and usually inevitable. Big things tend not to be universally beneficial, and if they are then it’s even weirder that we have the opportunity to fund or do them.

When you’re doing something that isn’t as impressive, there’s much more worry about harms. A common mode of thinking was something like ‘this might not do all that much good, but it seems highly unlikely to do harm, so sure, why not’ especially when the operation was cheap to fund.

Whereas when there were salient potential harms, this caused a lot of reluctance to fund. And while we were free to fund things others thought would be harmful, there was definitely a spirit of cooperation and taking such opinions into account. There were several organizations people were initially looking to fund that others thought were harmful, and many of them ended up not being funded. There was clearly also risk aversion involved here.

One debate we inevitably had was the ‘is science/progress/growth/knowledge bad, actually?’ question that gets asked periodically. Given the threat of AGI, there’s the hypothesis that good things are actually bad, whereas bad things are actually good, because good things make AGI timelines shorter and nothing else matters. There are also those who are concerned (in ways I consider at best wrong, but more centrally I consider confused and not even wrong, although I won’t defend that here) about S-risks and negative utilitarianism (or how coming down from the trees wasn’t such a good idea and agriculture was even worse, or what not, which didn’t come up during the S-process but definitely does happen).

That’s potentially fully distinct from the motive ambiguity phenomenon or the reversed morality of Moral Mazes, where good things are bad and bad things are good because if you support bad things it proves your loyalty and dedication and focus, making you a good ally who won’t be distracted or stopped by moral considerations.

It’s also all potentially not distinct from the motive ambiguity phenomenon, depending on your model of what people are thinking and how charitable you intend to be.

I do think that some people come to the ‘science and technology are bad’ conclusion for the ‘right’ reasons, but I also think that some people come to it for the wrong reasons, and often both are in play.

In this case, we did manage to effectively agree to treat science, technology, progress, economic growth and other neat stuff like that as neat stuff. The exception was when things differentially impacted AGI development, which everyone agreed was quite bad and very much not neat. Otherwise, still neat, but with a ‘discount’ based on such worries. 

An implicit theme of many discussions was which potential harms should be considered salient and taken seriously into account, versus which ones weren’t understood or justified sufficiently to enter into evidence, beyond worries about progress. How worried should we be about things such as…

  1. Very very broadly and also specifically, as a kind of catch-all: Various incentives?
  2. Rewarding/punishing or failing to punish/reward bad/good past behavior or results?
  3. Impacts on EA or cause area culture?
  4. Perceptions of projects, or of cause areas, or of EA in general?
  5. Information cascades?
  6. Information flow? Polluting the information stream, or cutting off info, or (in places like AGI or biological risks) failure to keep things secret?
  7. Game theoretic impacts?
  8. Concerns about S-risks that are often misinterpreted in harmful ways, or that might be harmful even if not misinterpreted?
  9. Various ethical or moral considerations?
  10. EA becoming about promoting EA rather than doing actual work?
  11. EA cycling money around in ways that disguised what was happening or allowed people to pass the buck?
  12. Wasting the time of people who could do something important elsewhere?
  13. Noting that this list is incomplete, and that #1 encompasses a LOT of stuff.

One of my greatest frustrations was the difficulty in conveying many of the downsides/harms I saw from a combination of the organizations and their projects, and of the S-process and EA money distribution structures more generally. I definitely got through in some places, and I definitely failed to get through in others. Where I failed, it was some mix of ‘this person genuinely doesn’t care about the thing I care about here,’ ‘I failed to convey my intuitions and model, and that’s a hard thing but it’s also on me’ and probably some amount of me having confused or invalid concerns for various reasons.

One recommender noted that I always seemed to have unique concerns they never would have anticipated.

      9. Avoid negative associations that make people uncomfortable.

Several organizations clearly set off various alarm bells and made people uncomfortable. Things the people there had done, or had been associated with, were considered red flags. Without getting into details, there was a large incentive to avoid this happening to you, which is some mix of ‘don’t do stuff like that’ and ‘don’t let this become the perception’ which in practice meant ‘don’t cause anyone to loudly complain about you’ and ‘be careful who and what you associate with.’ It’s not entirely bad, it definitely is a filter for bad actors and you need that, but it wasn’t exactly a robust justice system either.

      10. Don’t be caught lying.

Being caught lying was clearly quite bad. There was one organization that was going to get a lot of money, that did get a lot of money, but that got (at least in expectation) substantially less money because of a concern that their claims of impact were based on invalid calculations, and they didn’t correct them when alerted to the problem. I was willing to mostly overlook that in the end because I saw the system of grantmaking as putting a ton of pressure on that organization to do something similar, because the mistake wasn’t obviously a mistake even though I agree that it was one (even once it’s pointed out, there’s a counterargument), and because when I did the calculation on my own it was clear there was plenty of impact (and others did the same and got the same answer). It still did sour me on the whole enterprise, although it’s hard to know exactly how much.

If they hadn’t made the claim/calculation in question it wouldn’t have been an issue at all. We spent a bunch of time on that, and what else it might indicate and how we’d need to react to that, and I’m not sure if we reacted to it too little, too much or about the right amount. I don’t know where the lines should be drawn.

I do know that if the claim had been more brazen, or if a material lie had clearly occurred, it would have been a severe black mark. Which seems good and right to me.

Incentives of the S-Process for Recommenders

The S-process is an opportunity to direct a lot of money to any organization that applies, both by directing your funds and by persuading others. Your decisions and arguments will of course be listened to carefully, and if you went sufficiently rogue your allocation could be dramatically reduced, and you might not be invited back in the future.

The time pressure was real. I devoted what time I could.

The most important core tension was ‘spirit of the process’ and ‘get invited to keep playing the game’ versus ‘get the allocation you care about for this round.’

A secondary core tension was ‘figure out the right answers’ versus ‘only so many hours I can spend on this.’

One can consider the process in three stages. First, there’s the individual evaluation, then the discussion, then the adjustments in light of the broader picture.

For the first stage, there are questions and tensions like these:

  1. If you think purely in terms of how many dollars are useful at all to an organization, you reinforce the biases discussed above towards claiming to have capacity. If you don’t, you’re not reporting your model accurately.
  2. If you actually attempt to measure relative impact of dollars, your estimates of impact should be orders of magnitude different for different organizations, leading to a clear hierarchy of preferences. To not do this mostly represents one trying to seem reasonable or hedge one’s bets, but is not a reasonable EV perspective. No one put in different orders of magnitude, and we all did more of a rank ordering thing.
  3. There are lots of weird inflection points in our instinctive desire to give money, but the curves don’t allow inflection points. What to do?
  4. If you want to allocate a particular amount of money to a particular organization, the obvious thing to do is to select an extreme concavity so that you aggressively allocate that many dollars but no more. But that’s a hack and rather dishonest if you take the process inputs seriously.
  5. Once you know you’re not going to fund something, how much attention do you pay to getting their curves right or to bothering to enter non-zero numbers?
  6. Where to prioritize one’s time, in general?
  7. How do we think about the value of holding onto money, especially factoring in that we won’t be the one allocating it in the future?

Essentially, there’s a lot of tension between ‘trust the process and input individually accurate things’ versus ‘think about what this does to the allocation process and what distribution this results in.’

Then there’s also the question of ‘where I think the money should go and how I evaluate things’ versus ‘what I think Jaan and the other funders think about such things’ since it’s their money we’re giving away. To some extent they want us to substitute our own work and judgment and world model and values, but also to optimize for their world model and values.

I decided I was going to ‘trust the process’ for as long and as much as possible, and go with the spirit, whenever possible. I think everyone else did too. Decision theory agreed. That didn’t make it easy.

For the second stage, the questions are things like:

  1. How much do I strategically steer conversations to try and make the allocations closer to what I think would be better?
  2. How much do I filter information flow and emphasis based on what outcomes I want?
  3. Are the things I’m curious about a good use of group time or should I let others raise their questions instead?

Again, I did my best to be as non-strategic as possible while still making the case for my views, and I strongly feel that others did the same. I did my best to direct curiosity where I felt curiosity was appropriate, while also making sure my big concerns got discussed since I did feel they were important, and attempting to convey my world models. These types of tensions are probably impossible to fix, and need to be navigated using a positive-sum cooperative culture.

For the third stage, the tensions get more explicit, because we can see a likely approximation of the final outcome that is about to happen, and we can figure out how changes we would make would change that allocation. So we’re trying to update our evaluations as we change our minds without letting strategic questions distort us, but also it’s very salient how much money is changing hands based on where parameters are set.

All of that requires a commitment to the process being more important than the opportunity to redirect the allocation of the money, even when there are very large swings because of the dynamics involved.

The central dynamic is the question of ‘who is going to fund this?’ Often there will be several people who agree that a given organization is worth funding, and would give them similar amounts. Who funds them as their top priority then often determines whose other priorities also get funded, and whose get left behind, given people’s rank orderings. Strategic options were abundant. This was especially tempting when someone else was considering funding something I thought was harmful.

The Unilateralist’s Curse

There is a concern known as The Unilateralist’s Curse, where it only takes one person to fund ConflictCause or WeirdCause, or to do pretty much anything else in the world.

There’s a time and a place for this. At one point a few months ago, in one of my Covid-19 posts, I mentioned something non-Covid-related that some other people thought I shouldn’t be drawing attention towards. I was asked to take down the note. My inside view disagreed, but the argument was reasonable, so I gave deference to the request and took down the note.

In general, I think we have a general bias against action rather than in favor of it, a bias towards regulations and prohibitions and trying to centrally decide and plan things, and otherwise telling people what they must or can’t do far more than is wise, both culturally and as a matter of law. I’d rather make fewer things political and social, and let people do more things more easily.

The S-process, however, takes this to an extreme position, where six people are deciding how to distribute money, and it’s very easy for projects that the group collectively thinks are very harmful to get funded. All that matters is the person most enthusiastic about giving funds, and the process mostly does not care about the difference between the opinions ‘this isn’t that great’ and ‘this is actively terrible,’ except insofar as people could use social persuasion.

There was one tool, the ultimatum, that could have been used in a sufficiently extreme situation, and I believe it would have been used if one particular organization had been about to be funded, and possibly one or two others. I am curious if it would have had teeth in practice, but the bar for using it was clearly very high.

The other tool was to prevent endorsement, but we decided to remove that. Initially, there was a rule that if any one person disagreed with a grant, the grant would still happen, but it would not appear on the public list. I would have felt obligated to invoke the veto for one organization. If I had the option to veto and hadn’t used it, that would have been interpreted as an endorsement. There were two others that almost got funded by others, that I also would have felt the need to veto if they were about to be funded.

I observed this, others made similar observations, and we collectively agreed unanimously to take away our right to veto. That way, we could maintain transparency, which we all agreed was good, without feeling like we had individually given our approval where we didn’t want to give it.

There’s no great solution to this, either in the S-process or in general. If we weren’t in a process together, we mostly wouldn’t have even noticed. Having people who care about each other’s concerns and update on them somewhat even when they don’t agree or fully understand helps somewhat, but we’d ideally like more than that, and we’d like to take into account how many people want to fund something. You could argue that three people each thinking GoodCause could make use of a million dollars shouldn’t be that different from one person thinking that, if none of them think it would be useful to give more, but that seems like someone made a mistake somewhere.

I’ve tried to brainstorm mathematical solutions to this, but all of them have horrible strategic incentive issues, and none of them deal with the weirdness of what happens when you don’t happen to be in the S-process at the same time.

The core issue is that there’s no clear relation between the values that I put in for impact, and the values that a different recommender puts in. All we have are the relative values I put in, and even those are mostly only meaningful as a rank order, because our real beliefs should involve different orders of magnitude but our written numbers don’t do that. And that’s when there are no incentive issues. Add in the incentive issues of letting those numbers cross over, and things get even stranger.

On another level, the core issue is that our true preferences over distributions and outcomes, when asked, are not all that well-represented by smooth curves. Yet we want to capture many of the benefits of the smooth curves. That includes the correction of large biases in our instinctive preferences, and the combination of the two gets hacky no matter what you do.

I’d also note that our discussions of when donations would be harmful mostly were about whether the organizations would do harmful things with the money. There’s a lot of other ways for a donation to be harmful, because it impacts the ecosystem more generally, and the incentive gradients involved. Giving people money for doing mostly worthless things that look right gives you more of what you reward, and in the bigger picture, that too is harmful. No one (or at least no one wise) said playing for keeps to change or save the universe was going to be easy.

Concretely, my suggestion would be something like:

  1. Allow and encourage people to enter negative numbers into the S-process, to represent harm, and have that be highly visible to others.
  2. Allow others to choose how to represent the impact of such concerns, and draw a distinction between model updates that changed their inside view, versus adjusting for knowing someone else’s conclusion even if they disagree.
  3. Before allocating money to individual recommenders, if there are organizations that most people’s unilateral allocation (e.g. what they’d do if they gave out all the money by themselves) would fund, fund those first, so no one has to worry they won’t be funded or that by funding there they’ll be giving away their leverage to do other things (a rough sketch of this follows after the list).
  4. Given the stakes, think about it more.
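
As a rough illustration of what suggestion 3 might look like mechanically, here is a sketch building on the allocate() function from earlier in the post. The ‘everyone would fund it’ test and the minimum-amount rule are placeholders of mine, not a worked-out mechanism; the suggestion above says ‘most people,’ so a majority threshold could be substituted.

```python
def consensus_prepass(total_budget: float, evaluations: dict) -> dict:
    """Fund first whatever every recommender's unilateral allocation would fund."""
    # What would each recommender do if they directed the entire pot alone?
    unilateral = {
        rec: allocate({rec: total_budget}, {rec: evals})
        for rec, evals in evaluations.items()
    }
    orgs = {org for alloc in unilateral.values() for org in alloc}
    # Orgs everyone would fund get funded up front, at the smallest of those amounts,
    # so no recommender has to spend their own leverage on them; the remaining budget
    # then goes through the normal turn-taking process.
    return {
        org: min(alloc.get(org, 0.0) for alloc in unilateral.values())
        for org in orgs
        if all(alloc.get(org, 0.0) > 0 for alloc in unilateral.values())
    }
```

Suggestion 1 would then amount to allowing the curves (for example first_dollar_value) to go negative, with those negative curves visible to the other recommenders; how exactly that should feed back into the allocation is the part that still needs thought.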

The Time and Resource Gap

Very little time was wasted during the process. The case was made that our first meeting could have been shorter, and there were a few places where we could have dealt with various things faster with a better process, but the gains to be had there are relatively minor.

The big issue is simpler than that. There’s a ton of information to find and to process, about a ton of different organizations, and one starts with very little to go on. This forces us to evaluate quickly. Cached impressions have big impacts. Previous results get copied in information cascades. Investigations only happen for some of the organizations, each usually by one person and not lasting very many hours. A number of applicants got six figure or seven figure grants with remarkably little backing those decisions up.

One could say ‘well then you all should have spent more time on this,’ but our time is stretched thin as it is. It would not have been practical for me to double my time investment, and I am guessing the same is true for most of the other recommenders as well. You could of course pay me enough to free up that kind of time, at the expense of other things; you could make me quit my job at some price, but none of it would be cheap.

A suggestion has been made to have other people do at least preliminary investigations in advance. I think that’s pointing in exactly the right direction. 

One of the great frustrations in my life is that, as far as I can tell, concierge services, assistants and secretaries are useless. With notably rare exceptions, even when they are provided free of charge, I have never been able to get more out of them than the time I put into them. I am confident that this would change for a sufficiently skilled and high-level person, and I am confident that I am lacking key social technology to hire well and to direct such people well.

In this context, in particular, it seems like delegation is clearly The Way. Thus, if I were to do this again, I would hire assistance to do at least the following:

  1. Do a preliminary investigation of every organization, before the recommenders even start looking. Do later deeper dives on the ones that are potentially getting large funding.
  2. Assemble key information about the organizations into good and consistent form.
  3. Let us ask questions, and attempt to answer them, with or without contacting the organizations for answers as appropriate. Investigate particular questions recommenders are curious about. Summarize papers. Compile histories. Fact checks.
  4. Do a sanity check on whatever we write in our notes, and on our evaluations, to look for things that are mistaken, or don’t seem to make sense. Think about what questions we would want to be asking, based on what we’re thinking.
  5. Help schedule meetings for us to talk to people at the orgs, as needed.

The main worry I’d have is this might pull evaluations even more towards the things such people could evaluate, but the hope would be that the extra time and resources allowing deeper exploration is the dominating factor.

It might be a good instrumental cause of its own, of course, to simply have an organization that finds and trains good people capable of providing concierge assistance to people, and then providing it to select people or for help with select tasks when that seems worthwhile, free of charge. This would be highly related to a lot of what Lightcone is up to, only kind of the ‘next level up.’

Too Much Money

Professional poker player Antonio Esfandiari would often say he had TMM, or Too Much Money. Having enjoyed great success at the poker table, Antonio found himself with the ability to buy everything he ever wanted. Money outside of the poker table lost meaning to him. This resulted in some dumb decisions, which even became the topic of a TV show he created with friend and fellow poker pro Phil Laak.

Another form of TMM is when you feel obligated to find ways to spend it, especially when it is your budget or the money you’ve been given. A charitable organization can end up feeling like they need to be reenacting Brewster’s Millions.

TMM is not a fixed amount. TMM is a state of mind and a set of dynamics, and depends on a particular context. It happens when, in terms of spending, your grasp exceeds your reach. The reason Antonio had TMM, despite not being anything close to a billionaire, is the lack of a bigger goal.

Whereas Elon Musk is a billionaire but he is also trying to get civilization to Mars, so in important senses he very much does not have TMM, but his willingness to move large amounts of stock in highly inefficient ways in order to be a better Twitter troll is some combination of bespoke prioritization and having personal TMM.

It is often better to have what I refer to as EM, or Enough Money. That’s a sweet spot where money holds its meaning, and you care about value at all, but lack of money doesn’t hold you back from your goals. Of course, if you had orders of magnitude more money, perhaps you’d have different goals, or at least different methods to seek your goals. I know I would. But in a given local context, you can still have EM.

The other thing you can have, of course, is NEM, or Not Enough Money, or you can even be what Kanye West refers to as Broke Fi Broke.

(And interestingly, in my model, NEM and TMM are states of mind and it’s possible to have both at once, which is what happens when people are buying gold plated toilets.)

Anyway, I mention all this because there are several senses in which the process and the system around it could be considered to have TMM, or Too Much Money.

  1. EA has TMM.
  2. SFF had TMM.
  3. A lot of people in crypto have TMM.
  4. SFF Grants that were too large might cause organizations to have TMM.

If one thinks about the broad range of things one can do with money when playing for keeps and playing to win, then it is crazy to say that EA has TMM.

As a proof by example, rather than the best possible use of such funds: There are a lot of companies out there that one could purchase, and then run as public goods and in the service of important cause areas. Not everything is for sale, but many things are. Twitter’s market cap alone is around $33 billion. Pfizer is over $300 billion. Again, this is purely an existence proof.

However, if one is limited, for whatever reason, to giving away money to charitable organizations that already exist and which legibly fit the mold of EA causes and frameworks, then in the context of funds earmarked for such things, EA does seem from where I sit to have TMM.

In that context, SFF also had TMM. If you looked at each individual recommender’s allocation, everyone gave away substantially less than all the money. When I went looking for additional organizations to encourage to apply, I did find (or think of) one and with time likely could have found more, but my guess is that the organizations that didn’t apply despite being legible EA causes I’d have been excited to fund, did so because they didn’t need the funds.

The core reason for this is crypto. If you invested early in crypto, there’s a very good chance you have TMM. You see the signs of this all over the space, and it’s not a coincidence people are paying premium prices for bored apes. I didn’t think of it in time to get in early, but in hindsight, the moniker fits.

Regardless of how you would score the community’s performance on crypto, enough people who are EA/rationalist adjacent enough did buy enough that they’re in position to give such causes quite a lot, and also that’s where many of our giant funders got their bankrolls. If anything, as a group we are now overinvested, even if you are bullish on the space.

These idiosyncratic large grants are, in my view, a very good thing, but when looking to allocate SFF funds it means there’s a lot more to distribute and fewer places available to distribute to.

Meanwhile, what happens when an organization gets TMM? There are several potential problems, stuff like this:

  1. The organization may no longer have to prove itself in order to get more funding, which can have many effects both good and bad.
  2. The organization is no longer something others can usefully fund, which… makes those other potential funders sad? Makes them less engaged? People seem to care about this.
  3. The organization is under pressure to spend the money, and to expand its scope and mission, potentially under pressure to do a lot of this quickly, and in ways it is not capable of doing well. This can end up destructive to its output, or it can lead to wasting the time of people who could otherwise do something valuable.
  4. The organization could become the target of those who care mostly about money.
  5. The organization is in more danger of fights over money and power, to be brought ‘in line’ in various ways, or of the dynamics involved in Moral Mazes.
  6. Other organizations might lose sight of their missions in order to chase these kinds of funds.

The hope was that if an organization was explicitly told ‘we are giving you more money than you can usefully spend right now, please do not be in any hurry or feel under any obligation to spend it,’ that this would help, but I have no idea to what extent this will actually help. They have to take it to heart, and have to believe that we believe it, and they need to not think that spending the money will unlock similarly large grants from others now that they’ve seen SFF’s grant.

That brings us back to the question of whether there’s TMM floating around in general in the space, and what ways the money in the space drives organizations to act, in general. I’m sure I’ve gestured broadly at much but not all of it, but I want to stay within scope so I will decide to cut it off here.

And the Nominees Are

Such a post would be incomplete if I did not share at least some of my thoughts on the organizations that applied for money. Not sharing such thoughts makes it that much harder for others to make good decisions, and in a real sense wastes the work that was done. Despite that, there are dangers of doing this, the most salient of which are:

  1. Information cascades. To some extent you want to cause an information cascade when sharing such information, but in an even more important sense information cascades are harmful. It’s important that you, yes you, when making decisions about what to work on or give money to, think for yourself, model the world as best you can and come to your own decisions.
  2. Discouraging others from investigation. Similar to an information cascade, if someone else has done the work, you might be tempted to skip the work, whereas the work is the most valuable thing most people can do here – using their decisions as costly signals to communicate local or unique information.
  3. Mistakes. I’m gonna mess up slash did mess up a lot of this and it’s not a great look and it’s going to be embarrassing and people are going to complain about it and it would be an additional mistake to deny the giant ugh field this generates.
  4. Politics. Saying in public who should and shouldn’t get large piles of money gets political, and it gets political fast, and oh my do I not want to go there if it can be avoided. Another giant ugh field. But it also can’t be used as a threat to suppress important information, especially asymmetrically by bad actors.
  5. Distraction. Even if the resulting discussions manage to avoid being political, the term ‘demon thread’ still applies, and I could lose unlimited amounts of time and attention, along with a lot of stress, while the other points that were the prime motivation for writing this could get mostly forgotten.
  6. Hackability. Revealing too many details about your ‘choose whether to give people money’ algorithm can provide good incentives and motivation to people, but it can also encourage them to fake it and hack you, and make you worry that your interactions with people are fake. I hate this, and have only some idea how much it sucks for people like Jaan.

All right, that’s all noted, time to just go ahead and do it. A powerful mantra.

There were a bunch of applicants. We can sort them into a few broad categories.

Unserious Applications

Firing an application into the void was made intentionally easy, subject to the need to include sufficient information for the recommenders about new organizations. One of the costs of this is that there will inevitably be some people who apply because there’s no downside and maybe you get a check, even though doing so wastes everyone’s time.

This is a situation in which, once you have no business getting a check, worse is better, because it lets us dismiss your application faster. Thus, I appreciate that all these applications were very very obviously unserious and thus didn’t waste much time, and some of them put a smile on my face in a ‘nice try, kid’ kind of way.

Orthogonal Applications

As an example of this category, there was an application to take kids in Flint, Michigan fly fishing. I am not against taking kids in Flint, Michigan fly fishing. Quite the opposite, I’m all for it and I’d rather fund that than light the same amount of money on fire, but this was in no way relevant to our interests or a plausibly efficient use of funds.

This category also included plausibly good uses of funds, but for cause areas that weren’t relevant to the mission of improving the long term future. You could (or could not) conclude that it was a good idea to buy a bunch of malaria nets because you believe they save lives directly, or give money to poor people to make them less poor. Again, I have no problem with any of that, but it wasn’t relevant to our interests.

If we’d had any animal welfare charities, which we didn’t, they too would have gone into this category.

The interesting border case is economic growth. I certainly do not buy the Tyler Cowen model from Stubborn Attachments that all that matters long term is economic growth. Then again, there’s always that good old chestnut of ‘what if more economic activity means faster doom because AGI or other tech.’ Undifferentiated economic growth, especially catch-up growth that wouldn’t involve meaningful innovation or change in the culture or ability to do worthwhile things, didn’t seem relevant to me.

Thus the question of the Charter Cities Institute. Creating charter cities that operate under First World rules seems like a good thing. People should totally do that, and totally be willing to support it. I talked to them hoping it would be aligned with the mission.

The call was great, because they were honest with me and told me they weren’t doing the thing I wanted them to do. This is The Way. I didn’t do a good job hiding what answers I wanted to hear, and they said ‘nope, sorry, that’s not what we do here.’ Bravo. We also talked about a bunch of other stuff.

What I was looking for on that call was the ability to do things you can’t do in First World countries. In particular, challenge trials seemed like a strong litmus test. If your charter city allows the world to do challenge trials, then it’s super valuable. If it doesn’t, then you might be helping the particular people, but you’re not mostly doing the thing I care about. One of the things I learned on the call was that Prospera messed this up due to the way it intertwines with existing legal systems, and messed up a lot of other similar things too, which are the places that matter most to me.

That’s because there was a broader, different category that I do think matters a lot, which is hard to describe in exact words, but involves changing the cultural landscape to favor the ability to think, communicate, innovate, produce and act in meaningful ways, and to show that it is possible to do real things. To make there be more people in the world, in the sense in which there are not very many people in the world.

To fight the blight.

Innovation Station

A common theme of several applicants was finding better ways to Do Science, and improve levels of innovation. As noted several times there’s the risk that Actually Innovation Is Bad, Yo, the same way there’s the question of whether economic growth is actually bad. So there was a lot of asking about the extent to which such things would differentially advantage the ‘good’ innovation and science in its race against the ‘bad’ innovation and science.

I’m going to broadly gesture at A Thing in a bunch of ways and hope it’s sufficiently good training data that you can get an idea of what I’m pointing at, in the kind of mode where one attempts to transfer intuitions and models rather than prove anything.

My thinking about this heavily rhymes with what Eliezer Yudkowsky in his recent series of discussions [? · GW] (that came out after SFF was done) calls ‘shallow’ versus ‘deep’ patterns.

In this model, GPT-3 works by memorizing a ton of different shallow patterns, then uses this to predict text. Most existing ML systems do broadly similar things. They don’t do this other ‘deep’ pattern recognition, which one could also call ‘thinking’ or ‘actually thinking.’

Not doing deep thinking isn’t an attribute limited to artificial intelligence. One can model most humans most of the time as using entirely shallow patterns, and as doing something much closer to running a more-coherent and more-state-retaining version of GPT-3 (GPT-4?) than one might otherwise think. This is remarkably good at getting one through the day, provided the training data has pointed you to sufficiently resonant and appropriate shallow patterns.

One can also model organizations like academic institutions or corporations as mostly also running shallow patterns. And broader society, and what I call the Implicit Conspiracy, as existing entirely in shallow patterns. And all of them as evaluating and rewarding people with shallow patterns, and thus as pushing them towards the exclusive use of shallow patterns. The shallow patterns get ahead, and when we look around, it looks like many things are crumbling.

Thus, for a central example, you have this mockery that claims the name ‘science’ that is executing (relatively, in context) shallow patterns designed to produce ‘scientific output’ in the form of papers and grants. It occasionally finds something useful, but not that often, and decreasingly often, and it is displacing the actual act of doing science.

This is both directly relevant to ML in the sense of the question ‘if you put together enough memorization of shallow patterns do you get AGI?’ and for the question ‘if you execute a bunch of shallow patterns as AI researchers do you end up with AGI?’ My tentative gut answer to both questions is no (although you might still end up with a bunch of dumb machine learning systems that do a lot of damage or even get us all killed), but with a lot of very scary uncertainty.

Favoring deep patterns over shallow patterns, and enabling people to execute deep patterns at all and show others that they too can execute deep patterns at all, is one way to think about one of the things I believe is currently super important. It is highly related to the danger of Moral Mazes, the fight against the Blight and the Implicit Coalition (also known as Moloch’s Army), the decline in the discourse, the increased fakeness, scamminess and falseness of everything around us, the general inability of almost anyone to do almost anything real, and even more than in other places it is a vital weapon in our ability to successfully work on AI Safety. We need to culturally establish the act of Doing Actual Thing.

Anyway, when I look at something that is promising to create innovation or otherwise enable the Doing of a Thing, and ask whether it’s net good to encourage that, the question in my mind is whether it’s favoring shallow or deep patterns of thought and action.

If one could reinvigorate science for real, that seems clearly on the good side, so to the extent that I saw promising such attempts I was excited.

There were several proposals in this category looking to directly reinvigorate or enable science of a sort: NewScience, PrivateARPA, SocialMinds@CMU and Ought.

NewScience, SocialMinds and PrivateARPA seemed like they were good ideas if we were optimistic about execution. I was able to get there on NewScience, but not on PrivateARPA or SocialMinds. Somehow my notes on PrivateARPA were not saved, and I’m worried that I relied too much on others’ vibes here and made a mistake by not funding them. For SocialMinds, I wasn’t sufficiently convinced on execution, but would have been on board if I had been so convinced.

Ought was a weird case, where I had the strong initial instinct that Ought, as I understood it, was doing a net harmful thing. My thinking was something like this. They are using GPT-3 to assist in research, to do things like generate questions to ask, or classify data, or do whatever else GPT-3 can do. The goal is to make research easier. However, because it’s good at the things GPT-3 is good at, this is going to be a much bigger deal for those looking to do performative science or publish papers or keep dumping more compute into the same systems over and over again, than it will help those trying to do something genuinely new and valuable. The hard part where one actually thinks isn’t being sped up, while the rest of the process is. Oh no. 

On top of that, it would be evaluating papers, along with extracting information from them, and thus encouraging papers to align themselves with such shallow patterns, and would also be tying a black-box research assistant into the scientific process in ways that couldn’t possibly go wrong in the sense that when they did go wrong they would be nearly impossible to get at or repair. I was also confused why this was a non-profit, their defense of which was ‘to avoid bad incentives’ in various ways, which on reflection is at least reasonable.

A lot of others’ positivity seemed to reflect knowing the people involved, whereas I don’t know them at all. A lot of support seemed to come down to People Doing Thing being present, and faith that those people would look for net positive things and avoid net bad things generally, and that they had an active eye towards AI Safety. With time, I’ve forgotten a number of details here, and also notice that I’m not confident I understand what they’re actually doing (I wasn’t the only one confused) and should probably talk to them more about this at some point.

When I read a comment on LessWrong by Jessica Taylor [LW(p) · GW(p)] questioning why one of MIRI’s latest plans wasn’t strictly worse than Ought, I realized her question didn’t make sense if I’d been understanding Ought correctly, so I presumed that I was confused and asked her, which helped me understand better:

They’re trying to convert Paul Christiano’s alignment research (e.g. humans consulting HCH) into near-term testable models. E.g. Paul hypothesizes that if you’re trying to solve a big task, it’s possible to break it into lots of small tasks each of which can be solved by someone thinking for a bounded amount of time (e.g. 1 hour).

They’re considering the feasibility of training machine learning models to solve these specific tasks.  They’re trying to build AI tools that help people with breaking a problem into tasks and using AI to help solve sub-tasks, e.g. by predicting what someone is likely to approve of.  These AI tools might cognitively enhance people by providing them with advice they would consider good upon reflection.

This frames the whole thing on a meta-level as a way to test a theory of how to build an aligned AI. As per Paul’s theory as I understand it, if you can (1) break up a given task into subcomponents and then (2) solve each subcomponent while (3) ensuring each subcomponent is aligned then that could solve the alignment problem with regard to the larger task, so testing to see what types of things can usefully be split into machine tasks, and whether those tasks can be solved, would be some sort of exploration in that direction under some theories. I notice I have both the ‘yeah sure I guess maybe’ instinct here and the mostly-integrated inner-Eliezer-style reaction that very strongly thinks that this represents fundamental confusion and is wrong. In any case, it’s another perspective, and Paul specifically is excited by this path.
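To make the shape of that theory a bit more concrete, here is a minimal toy sketch of the decomposition idea as I understand it. Everything in it (the function names, the ‘bounded worker’ stand-in, the fixed two-way split) is hypothetical illustration on my part, not anything Ought actually runs.

```python
# Toy sketch of the task-decomposition idea (hypothetical illustration).

def bounded_worker(prompt: str) -> str:
    """Stand-in for a human or model thinking about one small prompt for a bounded time."""
    return f"answer({prompt})"  # in a real test this would be a person or an ML model

def decompose(task: str) -> list:
    """Stand-in for splitting a big task into smaller sub-tasks (here, a fixed two-way split)."""
    return [f"{task} / part {i}" for i in (1, 2)]

def solve(task: str, depth: int = 2) -> str:
    """Recursively split the task, solve the pieces, then combine the answers."""
    if depth == 0:
        return bounded_worker(task)  # small enough to answer directly
    sub_answers = [solve(sub, depth - 1) for sub in decompose(task)]
    # Combining the sub-answers is itself just another bounded task.
    return bounded_worker(f"combine {sub_answers} to address: {task}")

print(solve("evaluate whether this grant is a good idea"))
```

The alignment-relevant question, as I understand it, is whether tasks that matter actually factor this way, and whether each bounded piece can be both solved and kept aligned.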

I wouldn’t be surprised to learn this was net harmful, but there was enough disagreement and upside in various ways that I concluded that my expectation was positive, so I no longer felt the need to actively try to stop others from funding. Since I was confident I also wasn’t going to get excited enough to become the one funding them, I mostly stopped there to save time. I notice I’m still confused in various ways.

EuroBiostasis also fell into this category for me, because they were clearly actually doing the thing of figuring out which physical techniques would work and which ones wouldn’t, in order to accomplish a clear goal (do cryopreservation properly so people can actually be revived and it’s therefore a real thing). I also did buy the claim that by showing cryonics could work, we could motivate people to care more about the future – I’m not going to get into it here but ‘get people to think there is a future and they should care about it’ is pretty important as part of this whole thing, and if that means we have to (checks notes) solve climate change so people can stop thinking of the future as non-existent and doomed (in ways other than the way it actually might be doomed) then maybe that’s actually a good idea purely for that reason aside from the direct benefits. And I did notice that signing up for Alcor had a ‘care about the future more’ impact on myself, although I am unsure of magnitude. But largely this was ‘f*** around and find out’ in all the right ways, with the potential to show people that you could Do Thing and in particular that the kinds of things people who are worried about other important causes say are important to do can be done in particular, and maybe listen to them about other stuff, and so on.

The applicant pool for this round of the S-process was a sufficiently ‘weak class’ that this was enough for me. I definitely had a vibe of ‘I should be able to do better’ but I couldn’t, so to some extent I went with it.

Finally, Emergent Ventures India was the one applicant I managed to bring in when I realized I was allowed to go do that, and that the current pool didn’t have enough things I was excited to fund. I’ve been super impressed by Tyler Cowen’s ability to select people who have the potential to have big impacts, and to make them more ambitious and more likely to have those big impacts. Giving money differentially to innovative people likely to do innovative things seems likely to favor the deep over the shallow, and the restriction in our ability to do this type of granting is lack of targets rather than lack of funds, whereas Tyler is very good at identifying targets. They’re not EA insiders and they’re not speaking EA buzzwords, but that really really shouldn’t be the thing that matters here. I hope that having raised this one to attention, it can help find other EA funding sources as well. The idea that this is funding constrained at current operational margins seems nuts to me.

I also contacted Robin Hanson to see if there were any prediction market projects we could explore, including the fire-the-CEO markets he said would be what he’d do with a million dollars, but they weren’t shovel-ready. He needs a founder type to actually execute, and there’s a shortage of those so I couldn’t give him one.

There were a few others that were in various ways Doing Thing that I was in theory ready to fund, but there were red flags raised in various ways or upon examination I didn’t think the Thing in question would work in the relevant sense. I’m making a judgment call and not naming them.

Nuclear War

There are other ways for things to go terribly wrong, but none of them make the possibility of nuclear war go away. Nuclear war could be extremely bad. There was the inevitable ‘but would it be that likely to keep us down permanently?’ along with the also inevitable ‘it might help stop AGI’ and all that, but it felt obligatory rather than true objection territory.

That isn’t always the case, and there are reports that those funding long term causes often say things like ‘nope, that’s not an existential risk, that only kills most people and we don’t care about that very much’ and that’s where the issue would otherwise fit, making it difficult to secure funding. This certainly should matter at least somewhat in terms of ability to get funding, but I sense that this is one of those ‘you don’t fit into any of our slots’ issues that ends up being far more annoying than it should be. Others would know better than I, however.

We were given a bunch of applications that involved preventing nuclear war, and one, Alliance to Feed the Earth in Disasters (ALLFED), which was about mitigation in the aftermath of a nuclear war, centering on but not limited to finding practical ways to keep everyone fed during nuclear winter.

The plans that involved preventing nuclear war were certainly aiming at a goal I considered highly relevant, but none of them seemed at all promising in terms of having any effect. There’s a long history of people who don’t like nuclear war, and of some of those people saying ‘but look nuclear war would be really bad, everyone!’ over and over. They try to ‘raise awareness’ and all that. I don’t see how this leads to a lower probability of nuclear war. It might create feelings of hope, but like hope it is not a strategy. 

None of the candidates here seemed like they were even implementing well, so none of them got any consideration for funding.

As an example, Strategic Risks wanted to create a show called ‘Radioactive Road Tripping.’

That left ALLFED, which was a very different case, and where I ended up working with them a bit after this post was originally put up, in ways that addressed the main concerns I had at the time. 

ALLFED noticed something few others had noticed or done much about, that being ready could make a huge difference if the nukes did fly in terms of people not starving to death and civilization holding together, and that almost no effort was being made to get ready. While amateurs talked strategy, they studied the logistics, and got others to notice them, with the hope that solutions could be found. Some academics are working on solutions, but ALLFED is especially interested in very cheap, practical solutions that aren’t going to be fun for anyone, but would promise to get the calories into people, and be able to be implemented at scale when the time comes.

The parallel to ‘actually try in advance to deal with something similar in some senses to but far worse than the Covid-19 pandemic’ was not lost on me.

I bought the case that the cause was super neglected and in danger of not getting funding, and could have a huge impact even if that impact came from small probabilities multiplied together. When I did my own Fermi calculations, this came out as a very good investment.
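For the shape of that Fermi calculation, here is a sketch with purely made-up placeholder numbers (they are not the figures I actually used, and the point is the structure rather than the inputs):

```python
# Illustrative Fermi sketch with placeholder numbers (hypothetical, for structure only).

p_war_per_year      = 0.01   # assumed annual probability of a large nuclear exchange
years_of_coverage   = 10     # assumed horizon over which the preparedness work stays relevant
p_prep_helps        = 0.03   # assumed chance the work materially changes outcomes, given a war
lives_at_stake      = 1e9    # assumed lives affected by post-war famine
grant_size_dollars  = 1e6    # assumed size of the grant

expected_lives_helped = p_war_per_year * years_of_coverage * p_prep_helps * lives_at_stake
print(f"{expected_lives_helped:,.0f} expected lives helped")
print(f"~${grant_size_dollars / expected_lives_helped:.2f} per expected life helped")
```

Even with each probability individually small, the stakes are large enough that the bottom line can come out looking very strong.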

My worry, before talking to others, was whether their technological proposals were feasible, and made sense to work on. I tried a bit to get those who would be in a better position to look into this for me, but not as hard as I should have, and I got a bunch of ‘I don’t know either’ back. I fell back mostly on my priors, which were that they were doing the types of things that had any chance of working in practice at all, and as the people who noticed the problem it seemed only reasonable to let them try and solve it, so while I had uncertainties, I was excited to fund them.

When we discussed ALLFED as a group, there were several concerns. I’m going to document the whole thing with the aim of giving the senses I got rather than an aim of maximum charity, please don’t take this as a criticism of any particular person or their actions or decisions, or anything like that.

  1. Capacity. Could ALLFED scale? Could it remain effective, hire and manage well, and so on? Was it mostly the one person who produced value?
  2. Amateurism. Basically a ‘yes, thank you, you founded the space, but now we should leave this to the professionals no?’ kind of vibe thing.
  3. Feasibility. Are their ideas good? I had this too, as noted above.
  4. Honesty. There were concerns, especially around impact calculations.

On the flip side, there was the consideration of a potential ‘hindsight grant,’ the idea that we should give ALLFED money because of what they had already done, to align incentives for things like starting hugely valuable new fields, even if we didn’t have high expectations for what they’d accomplish with the money. I don’t know to what extent ‘the founder of the thing should be by default trusted to figure out how to keep doing it, or at least given the tools to be one of the people trying’ factored in for others but it definitely did for me.

The capacity argument wasn’t invalid. There’s a capacity concern with every organization, otherwise it would be easy to choose the best one and write one check. The unilateralist issue comes up here, as we each had an idea of how much funding room there was, and the biggest number was the one ALLFED got.

The amateurism thing seemed wrong to me. Academics are working on some solutions, sure, but they’re working on much less efficient, much more expensive solutions that would be more difficult to implement in a crisis, and they’re doubtless doing it in a very academic way, and by assumption focused on paper writing and grant getting. I’m not saying to can the academics, but there is no sense in which ‘don’t worry, the adults in the room are on it’ is ever going to give me comfort anymore, or cause me to think that now someone else will and therefore I don’t have to.

The feasibility thing wasn’t explored enough, I’m sad we didn’t get a better handle on the physical-world landscape. More research is needed and all that.

The weird one was the honesty concern. There were reports that were essentially of bad vibes around this issue, a general sense of a lack of epistemic rigor and honesty. There was a potentially big grant here, so a lot of questions were asked, and the concrete thing that got identified was their impact statement and how it was calculated. In particular, the calculation surrounding the likelihood of nuclear war, and on top of that the general sense that their estimates of how much of the impact of such wars they were preventing and their overall impact calculations seemed unreasonable.

The first thing I did in response was make sure I’d done my own impact calculation and wasn’t using theirs, and that came back ‘yeah, this is overdetermined to be a good idea, that’s not an issue.’ I think a lot of the concern actually boiled down to something like ‘they’re claiming all this impact and that’s a really big status claim’ and ‘they’re claiming all this impact which would force me to draw conclusions so I’m looking for a way to avoid thinking that’ and ‘they’re claiming all this impact without being sufficiently insider or going through all the proper channels and laying the foundations.’ That’s paraphrasing and somewhat uncharitable, but also my best attempt to be accurate.

The substantive complaint was that they used an invalid method when calculating the annual probability of nuclear war. They did a survey to establish a range of probabilities, then they averaged them. One could argue about what kinds of ‘average them’ moves work for the first year, but over time the lack of a nuclear war is Bayesian evidence in favor of lower probabilities and against higher probabilities. It’s incorrect to not adjust for this, and the complaint was not merely the error, but that the error was pointed out and not corrected.
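To illustrate the shape of that correction, here is a minimal sketch with made-up survey numbers (purely hypothetical, not ALLFED’s actual figures). It treats each expert’s answer as a hypothesis about the true annual probability, starts from a uniform prior over those hypotheses, and reweights them after each war-free year:

```python
# Hypothetical expert answers for the annual probability of nuclear war.
expert_estimates = [0.001, 0.005, 0.01, 0.02, 0.07]

def naive_average(estimates):
    """What you get if you just average the survey and never update."""
    return sum(estimates) / len(estimates)

def updated_average(estimates, war_free_years):
    """Weight each estimate by the probability it assigns to the observed war-free streak."""
    weights = [(1 - p) ** war_free_years for p in estimates]
    return sum(p * w for p, w in zip(estimates, weights)) / sum(weights)

print(naive_average(expert_estimates))        # ~0.021, and it stays there forever
print(updated_average(expert_estimates, 0))   # ~0.021, same as the naive average in year zero
print(updated_average(expert_estimates, 75))  # ~0.006, decades of no war favor the lower estimates
```

The naive average never moves, while the updated average drifts toward the lower estimates as war-free years accumulate, which is the adjustment the complaint was about.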

I reflected on this. It certainly wasn’t good, but I noticed I wasn’t overly bothered by it and was only imposing a moderate-sized penalty, so here are what feel like my intuitive reasons for that.

  1. The EA space’s focus on ‘impact’ and in particular on putting together numbers to quantify impact is essentially telling everyone to find the way to write down the highest possible number, and not to worry about whether that number corresponds all that well to reality. When you essentially tell people to lie to you or hide information or be misleading or p-hack or whatnot, and then you notice them doing it, it’s a mark against them but it’s kind of on you.
  2. Expanding on that: In particular, the EA space pattern seems to often or by default be that things that can’t be quantified don’t get counted in many funding decisions, and the numbers get taken overly seriously, or at least the way people talk to outsiders about such decisions makes it sound like that. Meanwhile there are a lot of benefits that are hard to quantify, and it seems like you’re competing to put up a higher number against others who are ‘playing the game’ in these ways.
  3. The calculation is wrong and I instantly saw it was wrong and why when the calculation was pointed to, before anyone explained what the error was, but that kind of thing is my comparative advantage. I live for this stuff. My guess is that most scientists wouldn’t see this; most scientists don’t even understand Bayes’ Rule. That doesn’t mean they can’t do good work, and the original mistake doesn’t reflect all that badly on them when I think about the context. Also, given a nuclear war would kill a lot of people, depending on how you view Anthropic Bias, it can all get a lot murkier. Also, noticing on your own that you’re making a (somewhat motivated-to-not-be-noticed) mistake with the default way to estimate something (in general, ‘ask ten experts and average their guesses’ is a good heuristic) is very different from realizing what the mistake is once it’s pointed at. So I don’t dock them many points for making the mistake, at all.
  4. When the mistake was pointed out, they made a decision not to fix it. That’s the key issue - but see the note below: they did fix it after I discussed it with them, and I believe their initial failure to fix it came from not understanding their error. When I put myself in their shoes, I see the request that I alter this as a kind of isolated demand for rigor from an insider targeting an outsider, telling them to lower their key numbers for technical reasons in a way that’s not being applied to others and isn’t obviously even correct. And it’s a demand that I go back and fix what isn’t broken to make myself look worse and hurt my cause, and which would require me kind of admitting I did a wrong thing that then everyone would point at forever, and all of that might mean tons of people die in a nuclear war and I’m not excusing it or anything but what the hell did you expect to happen.
  5. So that’s one angle. When I ask whether this means I expect them to be unusually untrustworthy in other ways, I answer no. I also answer that if you look at OpenPhil or GiveWell analyses with a similar standard, I expect to find lots of things like this or worse, and I don’t expect them to be eager to retroactively correct everything when you point the problems out and the corrections would lead to conclusions the people involved didn’t like. There’s a kind of person who values ‘getting it right’ enough, and such people are sadly very out of fashion and rare in 2021. They don’t seem more common in EA in general.
  6. The other question is whether this represents a strategic mistake, or some sort of bad cultural fit, as in: You should have known to fix your error because we’d have liked it if you’d fixed your errors in this way, thus we should penalize you for being the type of person who makes mistakes. Or who doesn’t understand the cultural codes for interaction in this space. This is like when VCs see a founder screw something up, and they dock them points way way out of proportion to the thing itself because it’s a marker for future such things and the perception of future such things that results from future such things, and so on. Except that now it’s even more of a raw norm enforcement thing, and it’s also a punishment for not putting what I say you should do above what you think would actually work or even what actually does work. Thus, the people reacting in this way are doing what is often done to outsiders-in-context. If they did fix the numbers the response would be ‘now the numbers are too low and you’ve admitted you inflated your impact numbers, so we can’t fund you.’ If they didn’t fix the numbers, the response is ‘You didn’t fix it.’

This seems like a good place to have gone into that detail because here there seemed to be broad agreement to fund them anyway, based on the potential impact, neglectedness, track record and uniqueness considerations, among other things, but it likely did matter and counterfactually could have mattered a lot more.

Postscript: ALLFED Corrects Its Estimates

After I wrote this post, Dave Denkenberger of ALLFED reached out to me regarding their estimates of the probability of inadvertent nuclear war. He wanted to clear up both what had happened and also the estimate itself. We spent several hours on the phone talking about how the calculation had to work, and I also spent some time in email correspondence with the person who originally pointed out the error to gain additional context.

Those discussions made it clear that the things required to get this calculation correct were indeed difficult to understand for someone who lacked the background doing them, but Dave showed an interest in actually understanding them, which I hope will transfer into general improvements in thinking, and in the end I do think he got it. One can quibble with any answer to a question like this, but I do not have any worries about lack of good faith. 

I am satisfied with the correction. That correction has now been made, and can be seen here. One could disagree with the answer by arguing about what the prior should be, but it is reasonable, and also this estimate is standing in as a proxy for all nuclear war and thus missing important things, such as a different pair of countries (USA/China, Russia/China, perhaps India/Pakistan if it in-context counts, etc.) and also the possibility of intentional nuclear war. That possibility is much more salient now in March 2022, given Russia's invasion of Ukraine, its explicit threats to escalate to nuclear weapons, and its framing of a lack of control over Ukraine as a 'threat to the existence of the state'.

Like many others, I am far more interested in causes that minimize the impact or probability of nuclear war than I was a year ago, and ALLFED remains the best known-to-me long-term way to do that. But it will need to be supplemented by other approaches and taken more seriously in many other ways. There's a real chance for example that Russia will soon have a civil war or breakaway regions, and securing all the nuclear weapons will not be easy in such cases. There are six thousand and it's scary to lose even one.

AI Safety Paper Production

I consider AI Safety and related existential risks to be by far the most important ‘cause area,’ that’s even more true given the focus of SFF, and I am confident Jaan feels the same way. The problem is that saying the words ‘AI Safety’ doesn’t mean you’re making the AI situation safer, and to the extent there are obviously good things to do it would be weird to not find someone already doing them. So when something like SFF gets applications, there’s negative selection effects.

There’s also the problem that as AI Safety becomes more of ‘a field’ there’s more of the traditional pressures that turn previously real things into largely fake things. And the problem that the actual AI Safety problem is an impossible-level problem (as in Shut Up and Do the Impossible [LW · GW]) without clear intermediate signs of progress or publishable papers, and all the stuff like that. And one in which a solution might not exist or might come far too late. It’s easy to see how most efforts end up dancing around the edges and working on shallow easy problems that don’t much matter, or in many cases not even doing that, rather than working on things that might possibly work.

It’s hard to find things that might possibly work in the AI Safety space, as opposed to plans to look around for something that might possibly work.

Thus, I was excited to fund late applicant Topos Institute. As far as I could tell, they’re people with strong mathematical chops working on difficult math problems that they think are most important to solve, along the lines they think might actually work. I wouldn’t have chosen many of the details of their focus and approach, and they don’t even buy the concerns over AGI the same way I or Jaan do, but I want them to do what they think is the right thing to do here, and I’m thrilled for any and all efforts of this type, by as many people as possible, so long as they both have the chops and are aligned with us in the sense that they have their eyes on the prize. All sources I asked confirmed that they count. On reflection I regret not giving them more than I did, and I believe this was due to the S-process default curves and them only asking for a reasonable amount of money.

CHAI@BERI also seemed clearly worthwhile, and they got a large grant as well.

I should also mention CLR, the Center for Long Term Risk, which at the time was applying with the suffix @EAF. I was excited by the detailed contents of what they are working on, relative to the baseline the applications set for excitement, but their focus on s-risks was concerning to me. I don’t want to have the debate on this, but I consider concerns about s-risks a bigger thing to be concerned about right now than actual s-risks. They do have a reasonable plan to mitigate the risk of concern about s-risk, and are saying many of the right things when asked, so I came around to it being worth proceeding.

In contrast with the ones above, there were a number of organizations that all looked alike to me. This will be fair to some of them, and less fair to others, but they seemed to centrally be writing papers that model the AI Safety space in this way.

  1. Building an AI does stuff.
  2. But is not ‘safe.’
  3. Unless, of course, you push the button marked ‘safety.’
  4. Alas, pushing that button is costly relative to not pushing it.
  5. We can model the problem as two players in an iterated prisoner’s dilemma who can defect (not press safe) or cooperate (press safe) each round.
  6. Sometimes IPDs go quite badly.
  7. Do something! Regulation?

I read enough of these papers for my eyes to glaze over quite a bit.
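For concreteness, the central model in that pattern usually amounts to something like this toy sketch, with made-up payoff numbers of my own rather than anything from a specific paper:

```python
# Toy iterated prisoner's dilemma between two labs (hypothetical payoffs).
# Each round a lab either pays a safety cost ("safe") or skips it ("skip").

PAYOFF = {  # (my move, their move) -> my payoff per round
    ("safe", "safe"): 3,  # both pay the cost, both do fine
    ("safe", "skip"): 0,  # I pay the cost, they win the race
    ("skip", "safe"): 5,  # I win the race
    ("skip", "skip"): 1,  # mutual racing, everyone worse off
}

def total(my_moves, their_moves):
    """Sum my payoff over an iterated game."""
    return sum(PAYOFF[(mine, theirs)] for mine, theirs in zip(my_moves, their_moves))

rounds = 10
print(total(["safe"] * rounds, ["safe"] * rounds))  # 30: mutual cooperation
print(total(["skip"] * rounds, ["skip"] * rounds))  # 10: mutual defection
print(total(["skip"] * rounds, ["safe"] * rounds))  # 50: why each player is tempted to skip
```

The ‘do something’ step then corresponds to arguing for some mechanism that changes the payoffs.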

There’s a steelman of what they’re doing in these cases, which is that in order to get people to listen to you, you have to write exactly the right official paper in the right place with the right emphasis and tone twenty times for twenty different subgroups, after which perhaps they’ll pay attention to you at all, or something like that. I’m not fully discounting this, but I notice I don’t have any expectations that meaningful things will result.

Then there’s ‘field building’ in the sense of things that make more people put ‘AI Safety’ into their job descriptions, but without anything I found plausible that would cause this to result in AI becoming safer. Diluting the field with a bunch of shallow work doesn’t seem like it should count as helping.

I don’t feel any need to point out who fits into these categories, but if you’re considering funding an organization based on it helping with AI Safety I’d urge you to check to see if they’re actually doing things that are useful. Part of that is that given our time constraints, I’m reluctant to start a potential negative information cascade with this high a risk of being mistaken about the particular claims or organization.

Then there’s the people who think the ‘AI Safety’ risk is that things will be insufficiently ‘democratic,’ too ‘capitalist’ or ‘biased’ or otherwise not advance their particular agendas. They care about, in Eliezer’s terminology from Twitter, which monkey gets the poisoned banana first. To the extent that they redirect attention, that’s harmful. The avalanche has already begun. It is too late for the pebbles to vote. To the extent that they do get control over existing bananas that aren’t fully poisoned, or future ones that are, the goals are in my model even worse than that.

I do feel the need to mention one organization here, AIObjectives@Foresight, because they’re the only organization that got funding that I view as an active negative. I strongly objected to the decision to fund them, and would have used my veto on an endorsement if I’d retained the right to veto. I do see that they are doing some amount of worthwhile research into ‘how to make AIs do what humans actually want’ but given what else is on their agenda, I view their efforts as strongly net-harmful, and I’m quite sad that they got money. Some others seemed to view this concern more as a potential ‘poisoning the well’ concern that the cause area would become associated with such political focus, whereas I was object-level concerned about the agenda, and about giving leverage over important things to people who are that wrong about very important things and focused on making the world match their wrong views.

Getting deeper into that would be an even longer thing, and maybe it’s worthwhile but I’m going to stop there. In general, the group of applicants here made me despair about the work being done and its prospects for being useful.

Access to Power and Money

In my model, one should be deeply skeptical whenever the answer to ‘what would do the most good?’ is ‘get people like me more money and/or access to power.’ One should be only somewhat less skeptical when the answer is ‘make there be more people like me’ or ‘build and fund a community of people like me.’ The more explicitly and centrally this is what one is doing, the more skeptical one should be. The default reasons people advocate for such things are obvious, regardless of how conscious or intentional such paths might or might not be.

The art must have an end other than itself. By its fruits ye shall know it, the shining city on a hill. Power corrupts, if you gaze into the abyss it gazes into you, we are who we pretend to be and our virtues are that which we practice. If we are functionally about seeking power and money then we’ll turn into the same thing as everyone else who is about seeking power and money. Be wary of anyone saying “only I can fix it.” And all that. The more EA funds are giving to other EA funds and those funds are about expanding EA, the more one should worry it’s a giant circle of nothing.

These could be split into a few categories.

  1. Some organizations focus on access and influence. If you can get people with power to listen to you and adopt your ideas, that’s valuable. The best example of this was what is now known as the Center for Long Term Resilience, and at the time was called Alpenglow@CEA. They had a solid case that they were successfully getting meaningful access for people who would use that access in ways that matter.

This was kind of the best case scenario for this sort of thing, where there was relatively less danger of corruption or wasted money compared to the potential for tangible benefit. The bar for such efforts should be quite high. I still think we overfunded because there are others out there and I think SFF overpaid versus its ‘fair share’ here, but that’s not the biggest mistake. I wish we knew how to do such things ‘safely’ in terms of keeping ourselves intact in the process. Until then, I’ll continue to be deeply uncomfortable in such waters.

2. A second category is regrants. I have a strong aversion to giving out grants in order to give out grants (in order to give out grants?) without a damn good reason for this, especially to organizations that seem like they shouldn’t be running into funding issues, or that seemed like they would make lower-quality decisions than we would. The argument in favor boiled mostly down to ‘the other places can use lower-cost labor and thus make smaller-size grants and find things we don’t have time to evaluate.’ Which isn’t nothing, especially if one can be confident the money only flows in one direction. I do worry about a lot of self-dealing and double-counting here, and the various things it might be messing up.

I don’t think we should have been anything like this eager to give money to the EA Infrastructure Fund. Ironically, it was a question I asked at one of the meetings that led someone to fund them at this level, as I clarified the situation in their mind. That’s how the process is supposed to work, even if I feel a little sick about it – the whole goal is to be strategically unstrategic (or is that unstrategically strategic? Both?), as it were. That’s not to say I think the fund shouldn’t exist or have money, and it’s especially not to say that, if we believe Buck in particular is very good at finding good small targets and small things to do, Buck shouldn’t have the ability to go do that, but this felt very much like overkill and a kind of giving up, especially given the goal of ‘infrastructure.’

Then again, I do think ‘fund individuals’ is a great thing to be doing if executed well, far better than funding organizations, so maybe I was wrong about this. I’d need to look into it in more detail to know. I do know that if I was given money for ‘infrastructure’ in this sense I would expect it to be well-spent relative to current margins, but also that it would look and feel weird.

Along similar lines there was the LTFF@CEA, or Long Term Future Fund. They have some clear wins on their book (e.g. John Wentworth) and my notes indicate I thought the bulk of their targets seemed reasonable, although on reflection that makes me worry about the extent to which ‘seem reasonable’ was an optimization target. It’s another case of ‘find individuals and other places to put small amounts in ways that seem plausibly good and do it’ and it seems like something like SFF should be able to do better but if the applicant pool is this shallow maybe we can’t.

As an isolated thing, almost all small grants of these types that are issued without forcing people to apply first seem like they’re net good, but they also end up warping the space and culture around the seeking of such grants, whether or not formal applications have to be involved. There’s a lot of ways this can turn toxic and ruin things, and the technology to avoid this doesn’t seem to exist anywhere – it’s not a unique failure of EA, or an unusual lack of skill, although there are doubtless places that know how to do somewhat better.

The limiting factor on such efforts, in any case, should be the ability to find good small targets without the process of finding them overly corrupting the process or the more general ecosystem. To the extent funding didn’t reach that point, and the process is sufficiently non-corrupted, it seems reasonable to give it more funds. I’m in no position to evaluate where we are on such scales, but I am at a minimum skeptical that we have untapped pools of ability-to-find-good-targets that aren’t being used but that we could tap at reasonable effort and cost.

The comparisons to ‘field building’ in AI Safety seem relevant.

3. Finally, there were the two explicit pyramid schemes slash plans to use money to extract more money. One wanted to target founders of companies and convince them to pledge to give money, the other wanted to go after heirs of fortunes.

These seemed deeply terrible. If you think the best use of funds, in a world in which we already have billions available, is to go trying to convince others to give away their money in the future, and then hoping it can be steered to the right places, I almost don’t know where to start. My expectation is that these people are seeking money and power, largely for themselves, via attempting to hijack that of others, especially for the one targeting heirs, with the goal here (and in a few other places that were more about power than money, but mostly similar) being to become the court advisor/wizard that is the power behind the throne, and then we hope that this is used in the right ways, but the people who seek that position tend to be power seekers, or become that over time. It’s weird and surprising when one of them cares about The Realm. Even if they do, their ability to steer their targets will be limited in the best of circumstances. To the extent that such projects even have positive returns on capital, which isn’t clear, the vast majority is likely to go to causes that don’t matter, and the vast majority of money that goes to causes that matter as measured by what it says on the tin will go to fake versions of those. And with such people directing their funds in these places, incentives will shift towards being fake and appealing to fakeness, and away from doing real things, so the money likely does net harm on top of everything.

I’d like to think the virtue/ethics/moral considerations mattered in the end, but my read is that a practical ROI calculation ended up carrying the day – it looked like there was willingness to be what I would view as the villain in the play, but that the calculations said that for our purposes this type of strategy didn’t pay even if you discount such concerns, and so the strategy was not funded, whatever anyone would have chosen to call it.

I am happy about this particular outcome, but sad about the process. Looking over everyone’s comments again it seems clear that my concerns mostly were not shared, and the whole project didn’t give people the willies, and that in turn does give me the willies. When we talk about how various moves evaluate in terms of connections and money and power and all that rather than trying to Do the Thing, we have lost The Way. I wish I had a better way to communicate what I find so deeply wrong here, to make people really Look, and my inability to effectively do that and the inability of many others to see it is the thing I find most deeply troubling and wrong, if that makes sense.

Lightcone Infrastructure

As a final note I should likely mention Lightcone Infrastructure.

We decided I had a conflict of interest here, so I didn’t have the option to fund them, but if I’d had that option I would have happily done that. I do think there’s good reason, especially from Outside View, for me to not have that option, as I write on LessWrong quite a bit and know the people well, and Raymond Arnold has been a close friend for a while, and also on reflection yes, recusing here was the right decision for other reasons too, as I’ll note in a bit.

I do think I would have funded Lightcone as a full outsider, but it’s impossible to be confident of such things.

While this is the team behind the current LessWrong, which I believe to have high value, that’s not what grants would fund at the margin. I do think that we’re likely underinvesting in LessWrong itself but that doesn’t mean we have an obvious way to turn money into a better LessWrong, as it requires the right people to work on it.

There’s also the fact that the Lightcone plan is more than a little bit something like ‘make life for a class of people Zvi at least kind of belongs to much better in the hopes they get to do more useful things faster’ and on reflection, yeah, I probably shouldn’t be making the funding decision on that, what do you know.

I mostly want to mention Lightcone because I do find the new thing that Lightcone is trying to do compelling, and I would love to see it expanded.

We can model the world as consisting of a limited number of people who are Doing a Thing that we consider relevant to our interests, another group that at least shows signs they could and might do things that are relevant to our interests at some point in the future, and then the vast majority of people who are not doing that and show no signs of ever doing so.

For current Lightcone, ‘relevant to our interests’ mostly means ‘AI Safety work’ but the argument doesn’t strongly depend on that. I certainly think all non-fake AI Safety work should count here, regardless of what else and who else counts as well.

For the people who are Doing Thing, a lot of their time is spent on Stupid Stuff, and they must Beware Trivial Inconveniences and the expenses involved in various actions. This eats up a large portion of their time and cognitive efforts.

Even remarkably small stupid stuff can eat up a remarkably large amount of time that could otherwise have been useful. Having a good, well-equipped and stocked place to work, where you get meals taken care of and can meet with people and interact spontaneously and other neat stuff like that, is a big game. So is being able to ‘stop worrying about money’ either in general or in a particular context, or to ‘not have to run these stupid errands.’ Life is so much better when there are other people who can take care of stuff for you.

A lot of work and potential work, and a lot of community things and events, and many other things besides, are bottlenecked by activation energy and free time and relatively small expenses and stuff like that. Even I end up worrying a bunch about relatively small amounts of money, and getting time wasted and rising stress levels about things where the stakes are quite low, and fussing with stupid little stuff all the time. You really could substantially increase the productivity (although not at zero risk of weirdness or distortions of various kinds) of most of the people I know and almost anyone not high up in a corporation or otherwise super loaded, if you gave that person any combination of:

  1. A credit card and told me ‘seriously, use this for whatever you want as long as it’s putting you in a better position to do the stuff worth doing, it doesn’t matter, stop caring about money unless it’s huge amounts.’ More limited versions help less, but still help, and for someone more money constrained than I am, all this doubtless helps far more.
  2. A person I could call that would be an actually useful secretary and/or concierge, especially someone who could run around and do physical tasks. We have a nanny for the kids, and that helps a ton, but that alone doesn’t come anywhere near what would be useful here. The only concierge service I know about, which I somehow got access to, is completely useless to me because it assumes I’m super rich, and I’m not, and also the people who work there can’t follow basic directions or handle anything at all non-standard.
  3. A person I could have do research for me and figure things out, assemble spreadsheets, that sort of thing.
  4. A nice office space and hangout space I could use and take people to, especially where other interesting people also often went, and where everything was made easy, ideally including free food.

And I think that applies to basically everyone who hasn’t already gotten such things handled. And it’s a shame that we can’t find ways to get these sorts of things usefully into the hands of the People Doing Thing that we think are special and the limited resource that actually matters.

That doesn’t mean this is an easy problem. The first item is especially dangerous, you can’t go around handing that out to people without huge amounts of moral hazard and risk of corruption of the entire space, so that’s out, or at least out outside of a few special cases.

The other items, however, hold more promise, and are where Lightcone’s strategy comes into play. If you’re Doing Relevant Thing you get access to nice office space, free food there, and people to take care of at least some of your logistical needs, and a place to gather and meet people. That’s a big game, and to me seems like an Obviously Correct Move as long as you can make reasonable decisions about who gets it, and deliver in a reasonable way, and otherwise avoid things getting too corrupted, which is why you presumably don’t hand out any company credit cards at least outside of special circumstances.

Over and over again, as the years have gone by, I’ve seen communities fail, connections not happen, projects not get done, people get overwhelmed, and other similar things happen in ways that can be solved with relatively small investments of resources, if you can apply them well, and identify the people worth helping in these ways.

It’s also plausible that this is actually a hidden Universally Correct Strategy for society as a whole and we should be giving everyone who is 22, passes some basic checks and asks nicely either these types of resources or a job providing them to those who get the resources, or something, encouraging them to start businesses and unique projects and the world ends up a much better place, or something like that, although I haven’t gamed that out fully, but it isn’t obviously stupid.

There’s a lot of space available in the ‘spend money to enable Doing of Thing’ in various ways, and I’m excited to explore them at some point, but as you can imagine with the Covid posts and the Omicron variant on top of having a job and kids, I’m currently super busy.

But seriously, that whole section makes it very clear I have lots of conflicts of interest, so please don’t take me as an objective source here and draw your own conclusions.

Conclusion

This got even longer than I expected, despite a large number of places I was tempted to say far more than I did, and a lot of places I gestured at stuff rather than finding ways to properly explain it. It was definitely a case of writing a longer letter because I didn’t have time to write a shorter one, and/or it was about ten posts combined into one.

Hopefully it was illustrative of my perspective on things, and on the things themselves, in ways that were helpful. I know this wasn’t an ideal way to present all this information despite it being important, in a similar way to Eliezer’s recent writings not being the ideal way to present their information despite being important, with similar reasons likely being behind both decisions.

I can easily see this generating some mix of a ton of useful discussions and ideas that are great, and a lot of nasty demon threads, and also a bunch of stuff that should be distinct response articles or have its own discussion sections. I encourage use of ‘header comments’ to organize thoughts and topics, and spawning off distinct posts if that seems strategically like the right thing.

Also, there will inevitably be four different copies of this post, if not more – My Substack and WordPress copies, the LessWrong copy, and then EA Forum. I apologize in advance for the inevitable lack of engagement in some or all of those places, depending on how it goes. 

65 comments

Comments sorted by top scores.

comment by Richard_Ngo (ricraz) · 2021-12-15T01:11:59.192Z · LW(p) · GW(p)

> I know many EAs and consider many of them friends, but I do not centrally view the world in EA terms, or share the EA moral or ethical frameworks. I don’t use what seem to for all practical purposes be their decision theories. I have very large, very deep, very central disagreements with EA and its core components and central organizations and modes of operation. I have deep worries that important things are deeply, deeply wrong, especially epistemically, and results in an increasingly Goodharted [LW · GW] and inherently political and insider-biased system. I worry that this does intense psychological, epistemic and life experiential damage to many EAs.
>
> Some of that I’ll gesture at or somewhat discuss here, and some of it I won’t. I’m not trying to justify all of my concerns here, I’m trying to share thoughts. If and when I have time in the future, I hope to write something shorter that is better justified.
>
> I also want to make something else clear, for all my disagreements with and worries about it: These criticisms of Effective Altruism are comparing it to what it can and should be, and what it needs to be to accomplish its nigh-impossible tasks, rather than comparing it to popular alternatives.
>
> If you read my Moral Mazes sequence, you’ll see how perversely I view most of what many people do most days. I critique here in such detail because, despite all our disagreements and my worries, I love and I care.

I appreciate that you flagged the criticism of EA as being relative to the standard of being able to achieve very difficult tasks. I still think that, when applying very high standards (and concomitantly strong language), it's worth being more careful about the ways in which this predictably biases people's judgements and makes discussion worse. E.g. I have a hard time mentally conceptualising something as being "deeply, deeply wrong" and "horrifying", but also unusually good compared to alternatives; the former crowds out the latter for me, and I suspect many other readers.

More arguments/elaboration in this comment [LW(p) · GW(p)]. 

Replies from: Zvi, Zvi
comment by Zvi · 2021-12-15T07:41:53.513Z · LW(p) · GW(p)

I want, as much as possible, to get away from the question of whether 'EA is good' or 'EA is bad' to various extents. I made an effort to focus on sharing information, rather than telling people what conclusions or affects to take away from it. 

What I am saying in the quoted text is that I believe there are specific things within EA that are deeply wrong. This is not at all a conflict with EA being unusually good. 

I'm also saying wrong as in mistaken, and I'm definitely (this is me responding to the linked comment's complaint) not intending on throwing around words like 'evil' or at least didn't do it on purpose, and was trying to avoid making moral claims at all let alone non-consequentialist ones, although I am noting that I have strong moral framework disagreements. 

For a concrete, clean non-EA example, one could say: The NFL is exceptional, but there is something deeply, deeply wrong with the way it deals with the problem of concussions. And I could want badly for them to fix their concussion protocols or safety equipment, and still think the NFL was pretty great.

And I do agree that there will be people who then say "So why do you hate the NFL?" (or "How can you not hate the NFL?") but we need to be better than that, ideally everywhere, but at least here. 

(Similarly, the political problem when someone says "I love my country, but X" or someone else says "How can you love your country when it does X")

I do agree that these issues can be difficult, but if this kind of extraordinary effort (flagging the standard in bold text in a clearly sympathetic way, being careful to avoid moral claims and rather sharing intuitions, models and facts, letting the reader draw their own implications on all levels from the information rather than telling them what to conclude) isn't good enough, then I'm confused what the alternative is that still communicates the information at all.

comment by Zvi · 2021-12-15T08:37:12.926Z · LW(p) · GW(p)

It seems worthwhile to break down exactly what the detailed references here are, so I'll also tackle the other example you referred to. Of course this was a giant post written fast, so this is unpacking largely unconscious/background thinking, but that's a lot of what one thinks and is still highly useful. 

You refer to "horrifying" so I want to quote that passage:

To the extent one thinks any or all of that is wrongheaded or broken, one would take issue with the process and its decisions, especially the resulting grants which ended up giving the majority of the funds distributed to explicitly EA-branded organizations.

From many of the valid alternative perspectives that do think such things about EA as it exists in practice, being unusually virtuous in executing the framework here doesn’t make the goings on much less horrifying. I get that.

Here I was attempting to speak to and acknowledge those who do actually find this horrifying, and to point out that there are frameworks of thinking where EA really is doing something that would qualify as horrifying, and that this wasn't in conflict with the idea that what we found in SFF was in many ways an unusually virtuous execution of the thing, and that they should update positively when they notice this, despite it not being better in consequentialist terms given the rest of their framework.  I know multiple people who are indeed horrified here. 

I wanted people to be able to take in the information no matter their prior perspectives, and also for everyone to notice that the assumption that "EA = good" is a hidden assumption and that if it goes away a lot of other things fall away too.

What I didn't say is that anything actually is horrifying here in any objective sense, or even that I was horrified. On reflection I am horrified about the 'seek power and money' dynamics and people's failure to notice the skulls there, but that's not a bid to get everyone else to be horrified. 

I think your other comment has merit in that deontological/moral language has great ability to distract, so it should be used carefully, it's better where possible to say explicitly exactly what is happening, but also there are times when it's the only reasonable way to convey information, and trade-offs are a thing, and there's of course a time and a place to advocate for one's moral framework, or where it's important to share one's moral framework as context - e.g. when Ben says "I realized Facebook was evil" he is sharing information about his frameworks, thinking and mental state, that it would be very difficult to convey without the word evil. Ben could instead try to say "I realized Facebook was having a net harmful impact on its users much larger than the surplus it was able to extract, and that it would be decision theoretically correct to avoid giving it power or interacting with it" or something but that both is way longer and uglier and also really, really, really doesn't convey the same core information Ben wants to convey. 

There's also times when the literally accurate word simply has negative connotations because negative things have negative connotations. Thus, if someone or some system systematically "says that which is not" in order to extract resources from others, whereas if "saying that which is not" would not allow the extraction of resources it would have instead said that which is, it seems reasonable to say that this person or system is lying, and that this pattern of lying may be a problem. If you say this is technically correct but you don't like the impression, it has a bad connotation, I mean... what are you suggesting exactly?

Similarly, you object to the use of the word 'attack' and I assumed you were referring to the SSC/NYT thing, and I was prepared to defend that usage, but then I looked and the link is to my post on Slack? And I notice I am confused? 

The word 'attack' there, in a post that's clearly using artistic flourish and also talking about abstractions, is used in two places.

“You Can Afford It”

People like to tell you, “You can afford it.”

No, you can’t. This is the most famous attack on Slack.

Yes, this is literally an attack. It is an attempt to extract resources from a target through use of rhetoric, to convince them to not value something which is valuable. And I don't even see what the bad connotations are here. Are you simply saying that the use of the word 'attack' is inherently bad? 

Here's the other:

Out to Get You and the Attack on Slack

Many things in this world are Out to Get You. Often they are Out to Get You for a lot, usually but not always your time, attention and money.

Again, I am confused what your objection is here, unless it's something like 'rhetorical flourish is never allowed' (or something more asymmetric and less charitable than that, or some sort of superweapon against any effective rhetoric).   

Similarly, you object to "war" in "Blackmailers are privateers in the war against hypocrisy." This is a post of Benquo's I happen to actively and strongly disagree with in its central point, but seriously, what exactly is the issue with the word 'war' here? That metaphor is considered harmful? I don't see this as in any way distracting or distorting, as far as I can tell it's the best way to convey the viewpoint the post is advocating for, and I'm curious how you would propose to do better. Now I happen to think the viewpoint here is very wrong, but that doesn't mean it shouldn't get to use such techniques to convey its ideas.

To give you an idea of where I am sympathetic, I do think the use of the word 'scam' brings more heat than light in many cases, even when it is technically correct, there are other ways to convey the information that work better, and so I make sure to only pull the word 'scam' (or 'fraud') out when I really mean it. 

comment by ozziegooen · 2021-12-15T02:45:12.273Z · LW(p) · GW(p)

I liked this post a lot, though of course, I didn't agree with absolutely everything. 

These seemed deeply terrible. If you think the best use of funds, in a world in which we already have billions available, is to go trying to convince others to give away their money in the future, and then hoping it can be steered to the right places, I almost don’t know where to start. My expectation is that these people are seeking money and power,

I'm hesitant about this for a few reasons.

  1. Sure, we have a few billion available, and we're having trouble donating that right now. But we're also not exactly doing a ton of work to donate our money yet. (This process gave out $10 million, with volunteers). In the scheme of important problems, a few (~40-200) billion really doesn't seem like that much to me. Marginal money, especially lots of money, still seems pretty good.
  2. My expectation is that these people are seeking money and power -> I don't know which specific groups applied or their specific details. I can say that my impression is that lots of EAs really just don't know what else to do. It's tough to enter research, and we just don't have that much in terms of "these interventions would be amazing, please someone do them" for longtermism. I've seen a lot of orgs get created with something like, "This seems like a pretty safe strategy, it will likely come into use later on, and we already have the right connections to make it happen." This, combined with a general impression that marginal money is still useful in the long term, I think could present a more sympathetic take than what you describe.

The default strategy for lots of non-EA entrepreneurs I know has been something like, "Make a ton of money/influence, then try to figure out how to use it for good. Because people won't listen to me or fund my projects on my own". I wish more of these people would do direct work (especially in the last few years, when there's been more money), but I can sympathize with that strategy. Arguably, Elon Musk is much better off having started with "less ambitious" ventures like Zip2 and PayPal; it's not clear if he would have been funded to start with SpaceX/Tesla when he was younger.

All that said, the fact that EAs have so little idea of what exactly is useful seems like a pretty burning problem to me. (This isn't unique to EAs, to be clear). On the margin, it seems safe to heavily emphasize "figuring stuff out" instead of "making more money, in hopes that we'll eventually figure stuff out." However, "figuring stuff out" is pretty hard and not nearly as tractable as we'd like it to be. 
 

"I would hire assistance to do at least the following"

I've been hoping that the volunteer funders (EA Funds, SFF) would do this for a while now. Seems valuable to at least try out for a while. In general, "funding work" seems really bottlenecked to me, and I'd like to see anything that could help unblock it.
 

definitely a case of writing a longer letter

I'm impressed by just how much you write on things like this. Do you have any posts outlining your techniques? Is there anything special, like speech-to-text, or do you spend a lot of time on it, or are you just really fast?

Replies from: Zvi, Zvi
comment by Zvi · 2021-12-15T07:15:22.648Z · LW(p) · GW(p)

I did explicitly note that there are things one can do with higher OOM funding that EA can't do now, even if they wouldn't be as efficient per dollar spent, and that this is the counterargument to there being TMM. 

In general, I notice I'm more optimistic that someone capable of being a founder (or even an early employee) will do more good directly by going out and creating new companies that provide value, like PayPal or Tesla, rather than by entering a charitable ecosystem (EA or otherwise), whether or not their main motivation is the money. There are of course places where this doesn't apply because they're playing zero-sum games, so the 'cause area' question still matters, but that's part of the whole Doing Thing thing. And I worry that a lot of the 'don't know what else to do' reflects two things. First, feeling an obligation to seek money and power as an implication of a moral framework, which will tend to change people over time to primarily care about money and power, and also damages them in other ways. Second, that the frameworks and world models don't see most of what produces value/good in the world as worthwhile except as a means to gather resources, and also require the narratization of all action lest it not be considered to have value, and that's why they don't see 'what else to do.' 

I'd also note that SBF has exactly the Doing Thing energy and mindset, always has, and also managed to earn the money quickly and by being right rather than by navigating social dynamics, which are reasons I'm optimistic there. But it also seems to me like there's something key missing if going around getting people to donate their money looks similar to creating a crypto exchange (and a subset of that, but an important one, if it doesn't look like that but does look like joining a trading firm). If it looks like founding Tesla or Amazon, I want to say 'halt and catch fire.' 

I'd also note that I stand by my claim that the best charity in the world is Amazon.com, and that if Jeff Bezos cared exclusively about the world it's not that he shouldn't be going into space, it's that he should suck it up and keep running Amazon.

Then again, a lot of people in the world don't know what to do, so it's not at all a unique problem. 

The Moral Mazes issue is very real here even if one is founding a positive-sum enterprise, in that even when they originally started with good intentions (which often isn't the case), the person who comes out at the other end with money and power is by default incapable of using it for the ends they originally had in mind. They've changed, and believe that the only way to succeed is to be motivated by other things instead (this is me wimping out from writing 'bad intentions' here, and it's important to know that what my brain actually believes goes here is partly an active sign error, but it's not needed for this context and I worry it will distract). If one is building an organization whose explicit purpose is power and extracting money without otherwise creating value, then these problems are going to be much, much worse.

Also important is my expectation that convincing such folks to later give money won't result in that money going to the places that you (or whoever the best careful thinkers are) decide are best according to non-distorted principles, it's going to be a bunch of MOPs that introduce a bunch of naive/dumb money into an ecosystem full of extractive agents and bad incentives, and that's going to make all those problems worse. Or it's going to be controlled by people whose habits of being are about increasing money and power, which will also make a bunch of problems worse rather than better. 

When I say 'I don't know where to start' that's not me being condescending, it's a reflection of a ton of inferential distance and what seems like massively overdetermined intuitions that are hard to share - e.g. I wrote a book-long sequence trying to share some of them, and have others I don't know how to put into writing yet. So I'll wrap up here with noting that this is a subset of the things I could say, and that what I could say is probably another book. 

comment by Zvi · 2021-12-15T06:38:14.553Z · LW(p) · GW(p)

I don't have any posts outlining techniques. 

I don't use any unusual software; I currently compose posts in a combination of the Substack editor and Google Docs. I don't believe speech-to-text (or someone taking dictation) would do anything but slow me down, even if it was 100% accurate; typing speed isn't a limiting factor. Mostly I've had a ton of (at least somewhat deliberate) practice writing a ton of stuff, including Magic articles twice a week for many years. To me, the key bottleneck is the 'figure out what to say' step; then the writing mostly flows quickly; then, if something is worth the effort, editing is a huge time sink with or without help. 

But every piece of writing advice has the same thing at the top: Write. A lot. 

comment by Zach Stein-Perlman · 2021-12-14T14:52:47.921Z · LW(p) · GW(p)

I am not an Effective Altruist.

...

I know many EAs and consider many of them friends, but I do not centrally view the world in EA terms, or share the EA moral or ethical frameworks. I don’t use what seem to for all practical purposes be their decision theories. I have very large, very deep, very central disagreements with EA and its core components and central organizations and modes of operation. I have deep worries that important things are deeply, deeply wrong, especially epistemically, and results in an increasingly Goodharted and inherently political and insider-biased system. I worry that this does intense psychological, epistemic and life experiential damage to many EAs.

(1) I wish we distinguished between endorsing doing good better and endorsing the EA movement/community/etc. The current definition of EA is something like:

Effective altruism is the project of:

  • Using evidence and reason to find the most promising causes to work on.
  • Taking action, by using our time and money to do the most good we can.

I assume that you roughly endorse this? At the least, one could endorse narrow principles of EA while being quite concerned about the movement/community/etc. So (2) I'm curious what "the EA moral or ethical frameworks" that you disagree with are. Indeed, the standard EA position is that there is no 'EA moral framework,' or perhaps the more honest consensus is 'tentative rough welfarism.' And most important:

(3) I'm curious what your "very large, very deep, very central disagreements with EA and its core components and central organizations and modes of operation" are; knowing this could be quite valuable to me and others. I consider myself an EA, but I think you know more about its "central organizations and modes of operation" than I do, and I would update against the movement/community/etc* if given reason to do so. If being involved in organized EA is a mistake, please help me see why.

(Responses from non-Zvi readers would also be valuable, as would be directing me to existing writing on these topics.)

*Edit: I meant (epistemically) update against my ability to do a lot of good within organized EA, compared to outside of it.

Replies from: Zvi, ChristianKl
comment by Zvi · 2021-12-14T16:32:15.348Z · LW(p) · GW(p)

I intentionally dodged giving more details in these spots, because I want people to reason from the information and figure out what's going on and what that means, and I don't think updating 'against' (or for) things is the way one should be going about updating. 

Also because Long Post Is Long and getting into those other things would be difficult to write well, make things much longer, and be a huge distraction from actually processing the information. 

I think there's a much better chance of people actually figuring things out this way. 

That doesn't mean you're not asking good questions. 

I'd give the following notes.

  1. "Doing good better" implies a lot of framework already in ways worth thinking about. 
  2. The EA definition above has even more implicit framework, and as to whether I'd endorse it, my instinctive answer to whether I roughly endorse it would be Mu. My full answer is at least one post.
  3. EA definitely has both shared moral frameworks that are like water to a fish, and also implied moral frameworks that fall out of actions and revealed preferences, many of which wouldn't be endorsed consciously if made explicit. I disagree with much of both, but I want readers to be curious and ask what those are and figure that out, rather than taking my word for it. And leave whether I disagree with them for another time if and when I have the time and method to explain properly.
  4. Disagreements about EA modes of operation I believe I do my best to answer largely through the full content of the post.

Apologies that I can't more fully answer, at least for now. 

Replies from: Zach Stein-Perlman
comment by Zach Stein-Perlman · 2021-12-14T17:02:24.624Z · LW(p) · GW(p)

OK, thanks; this sounds reasonable.

That said, I fear that people in my position—viz., students who don't really know non-student EAs*—don't have the information to "figure out what's going on and what that means." So I want to note here that it would be valuable for people like me if you or someone else someday wrote a post explaining more what's going on in organized EA (and I'll finish reading this post carefully, since it seems relevant).

*I run my college's EA group; even relative to other student groups I/we are relatively detached from organized EA.


Sidenote: my Zvi-model is consistent with Zvi being worried about organized EA both for reasons that would also worry me (e.g., "I have deep worries that important things are deeply, deeply wrong, especially epistemically, and results in an increasingly Goodharted and inherently political and insider-biased system") and for reasons that would not worry me much (e.g., EA is quite demanding or quite utilitarian, or something related to "doing good better" or the definition of EA being bad). So I'm not well-positioned to infer much from the mere fact that Zvi (or someone else) has concerns. Of course, it's much healthier to form beliefs on the basis of understanding rather than deference anyway, so it doesn't really matter. I just wanted to note that I can't infer much from your and others' affects for this reason.

comment by ChristianKl · 2021-12-14T16:03:54.703Z · LW(p) · GW(p)

Nearly two years into the pandemic, the core EA organizations still seem to show no sign of caring that they didn't prevent it despite their mission including fighting biorisks. Doing so would require asking uncomfortable questions and accepting uncomfortable truths and there seems to be no willingness to do so. 

The epistemic habits that would be required to engage such an issue seem to be absent.

When it comes to Goodharting, Ben Hoffman's criticism of GiveWell measuring its success by the cost GiveWell imposes on other people would be one example. Instead of writing reports that are as informative as possible, this pushes the report writing in a direction that motivates people to donate rather than being demotivated by potential issues (Ben worked at GiveWell).

Replies from: aarongertler
comment by aarongertler · 2021-12-16T09:52:02.871Z · LW(p) · GW(p)

Nearly two years into the pandemic, the core EA organizations still seem to show no sign of caring that they didn't prevent it despite their mission including fighting biorisks.

Which core organizations are you referring to, and which signs are you looking for?

This has been discussed to some extent on the Forum, particularly in this thread [EA · GW], where multiple orgs were explicitly criticized. (I want to see a lot more discussions like these than actually exist, but I would say the same thing about many other topics — EA just isn't very big and most people there, as anywhere, don't like writing things in public. I expect that many similar discussions happened within expert circles and didn't appear on the Forum.)

I worked at CEA until recently, and while our mission isn't especially biorisk-centric (we affect EA bio work in indirect ways on multi-year timescales), our executive director insisted that we should include a mention in the opening talk of the EA Picnic that EA clearly fell short of where it should have been on COVID. It's not much, but I think it reflects a broader consensus that we could have done better and didn't.

That said, the implication that EA not preventing the pandemic is a problem for EA seems reasonable only in a very loose sense (better things were possible, as they always are). Open Phil invested less than $100 million into all of its biosecurity grants put together prior to February 2020, and that's over a five-year period. That this funding (and direct work from a few dozen people, if that) failed to prevent COVID seems very unsurprising, and hard to learn from.

Is there a path you have in mind whereby Open Phil (or anyone else in EA) could have spent that kind of money in a way that would likely have prevented the pandemic, given the information that was available to the relevant parties in the years 2015-2019?

Doing so would require asking uncomfortable questions and accepting uncomfortable truths and there seems to be no willingness to do so.

I find this kind of comment really unhelpful, especially in the context of LessWrong being a site about explaining your reasoning and models. 

What are the uncomfortable questions and truths you are talking about? If you don't even explain what you mean, it seems impossible to verify your claim that no one was asking/accepting these "truths", or even whether they were truths at all.

Replies from: Zvi, ChristianKl
comment by Zvi · 2021-12-16T12:42:46.002Z · LW(p) · GW(p)

I have argued to some EA leaders that the pandemic called for rapid and intense response as an opportunity to Do a Thing and thereby do a lot of good, and they had two general responses. One was the very reasonable 'there's a ton of uncertainty and the logistics of actually doing useful things is hard, yo,' but what I still don't understand were the arguments against a hypothetical use of funds that by assumption would work. 

In particular (this was pre-Omicron), I presented this hypothetical, based on a claim from David Manheim; it doesn't matter for this purpose whether the model of action would have worked or not, because we're assuming it does: 

In May 2020, let's say you know for a fact that the vaccines are highly safe and effective, and on what schedule they will otherwise be available. You can write a $4 billion check to build vaccine manufacturing plants for mRNA vaccines. As a result, in December 2020, there will be enough vaccine for whoever wants one, throughout the world. 

Do you write the check?

The answer I got back was an emphatic not only no, but that it was such a naive thing to think it would be a good idea to do, and I needed to learn more about EA.

Replies from: Davidmanheim, liam-donovan-1, aarongertler
comment by Davidmanheim · 2021-12-16T14:49:45.936Z · LW(p) · GW(p)

I will point out that my work proposing funding mechanisms to work on that, and the idea, was being funded by exactly those EA orgs which OpenPhil and others were funding. (But I'm not sure why the people you spoke with claim that they wouldn't fund this, and following your lead, I'll ignore the various issues with the practicalities - we didn't know mRNA was the right thing to bet on in May 2020, the total cost for enough manufacturing for the world to be vaccinated in <6 months is probably a (single digit) multiple of $4bn, etc.)

comment by Liam Donovan (liam-donovan-1) · 2021-12-16T19:34:40.733Z · LW(p) · GW(p)

I haven't done much research on this, but from a naive perspective, spending 4 billion dollars to move up vaccine access by a few months sounds incredibly unlikely to be a good idea? Is the idea that it is more effective than standard global health interventions in terms of QALYs or a similar metric, or that there's some other benefit that is incommensurable with other global health interventions? (This feels like asking the wrong question but maybe it will at least help me understand your perspective)

Replies from: Davidmanheim
comment by Davidmanheim · 2021-12-24T09:33:16.227Z · LW(p) · GW(p)

The idea is that the extra production capacity funded with that $4b doesn't just move up access a few months for rich countries, it also means poor countries get enough doses in months not years, and that there is capacity for making boosters, etc. (It's a one-time purchase to increase the speed of vaccines for the medium term future. In other words, it changes the derivative, not the level  or the delivery date.)

Replies from: liam-donovan-1
comment by Liam Donovan (liam-donovan-1) · 2021-12-24T20:24:52.054Z · LW(p) · GW(p)

Is there currently a supply shortage of vaccines? 

Replies from: Davidmanheim
comment by Davidmanheim · 2021-12-26T07:21:22.816Z · LW(p) · GW(p)

Yes, a huge one.

"COVAX, the global program for purchasing and distributing COVID-19 vaccines, has struggled to secure enough vaccine doses since its inception..

Nearly 100 low-income nations are relying on the program for vaccines. COVAX was initially aiming to deliver 2 billion doses by the end of 2021, enough to vaccinate only the most high-risk groups in developing countries. However, its delivery forecast was wound back in September to only 1.425 billion doses by the end of the year.

And by the end of November, less than 576 million doses had actually been delivered."

comment by aarongertler · 2021-12-21T21:17:36.920Z · LW(p) · GW(p)

Thanks for sharing your experience.

I've been writing the EA Newsletter and running the EA Forum for three years, and I'm currently a facilitator for the In-Depth EA Program [? · GW], so I think I've learned enough about EA not to be too naïve. 

I'm also an employee of Open Philanthropy starting January 3rd, though I don't speak for them here.

Given your hypothetical and a few minutes of thought, I'd want Open Phil to write the check. It seems like an incredible buy given their stated funding standards for health interventions and reasonable assumptions about the "fewer manufacturing plants" counterfactual. (This makes me wonder whether Alexander Berger is among the leaders you mentioned, though I assume you can't say.)

Are any of the arguments that you heard against doing so available for others to read? And were the people you heard back from unanimous?

I ask not in the spirit of doubt, but in the spirit of "I'm surprised and trying to figure things out".

(Also, David Manheim is a major researcher in the EA community, which makes the whole situation/debate feel especially strange. I'd guess that he has more influence on actually EA-funded COVID decisions than most of the people I'd classify as "EA leaders".)

comment by ChristianKl · 2021-12-16T12:14:17.654Z · LW(p) · GW(p)

What are the uncomfortable questions and truths you are talking about?

COVID-19 is airborne. Biosafety level 2 is not sufficient to protect against airborne infections. The Chinese did gain-of-function research on coronaviruses under biosafety level 2 in Wuhan and publicly said so in their published papers. This is the most likely reason we have the pandemic. There are strong efforts to cover up the lab leak, from the Chinese, the US, and other parties.

Is there a path you have in mind whereby Open Phil (or anyone else in EA) could have spent that kind of money in a way that would likely have prevented the pandemic, given the information that was available to the relevant parties in the years 2015-2019?

Fund a project that lists who does what gain-of-function research with what safety precautions to understand the threat better. After discovering that the Chinese did their gain-of-function research at biosafety level 2, put public pressure on them to not do that.

After putting pressure on shutting down all the biosafety level 2 gain-of-function research, attempt to do the same with biosafety level 3 gain-of-function research. Without the power to push through a global ban on the research, pushing for only doing it in biosafety level 4 might be a fight worth having.

It's probably still worth funding such a project.

Replies from: Davidmanheim, aarongertler
comment by Davidmanheim · 2021-12-22T13:44:41.460Z · LW(p) · GW(p)

If you had done even a bit of homework, you'd see that there was money going into all of this. iGem and the Blue ribbon panel have been getting funded for over half a decade, and CHS for not much less. The problem was that there were too few people working on the problem, and there was no public will to ban scientific research which was risky. And starting from 2017, when I was doing work on exactly these issues - lab safety and precautions, and trying to make the case for why lack of monitoring was a problem - the limitation wasn't a lack of funding from EA orgs. Quite the contrary - almost no-one important in biosecurity wasn't getting funded well to do everything that seemed potentially valuable. 

So it's pretty damn frustrating to hear someone say that someone should have been working on this, or funding this. Because we were, and they were.

Replies from: ChristianKl
comment by ChristianKl · 2021-12-22T14:42:34.668Z · LW(p) · GW(p)

If you had done your research, you would know that I opened previous threads and have done plenty of research.

I haven't claimed that there wasn't any money being invested into "working on biosecurity," but that most of it wasn't effectively invested to stop the pandemic. The people funding the gain-of-function research also see themselves as working in biosafety. 

The problem was that there were too few people working on the problem, and there was no public will to ban scientific research which was risky. 

The position at the time shouldn't have been to target banning gain-of-function research in general, given that's politically not achievable, but to say that it should only happen under biosafety level 4.

It would have been possible to have a press campaign about how the Trump administration wants to allow dangerous gain-of-function research that was previously banned to happen under conditions that aren't even the highest available biosafety level.

It's probably still true today that "no gain-of-function outside of biosafety level 4" is the correct political demand.

And starting from 2017, when I was doing work on exactly these issues - lab safety and precautions, and trying to make the case for why lack of monitoring was a problem 

The Chinese wrote openly in their papers that they were doing the work under biosafety level 2. The problem was not about a lack of monitoring of their labs. It was just that nobody cared about them openly doing research in a dangerous setting. 

iGem and the Blue ribbon panel have been getting funded for over half a decade, and CHS for not much less. 

iGem seems to be a project about getting people to do more dangerous research and not a project about reducing the amount of dangerous research that happens. Such an organization has bad incentives to take on the virology community to stop them from doing harm.

CHS seems to be doing net beneficial work. I'm still a bit confused about why they ran the coronavirus pandemic exercise after the chaos started in the WIV. That's sort of between "someone was very clever" and "someone should have reacted much better". 

Replies from: Davidmanheim, Davidmanheim
comment by Davidmanheim · 2021-12-23T09:53:17.341Z · LW(p) · GW(p)

I can go through details, and you're wrong about what the mentioned orgs have done which matters, but even ignoring that, I strongly disagree about how we can and should push for better policy. I don't think that even given unlimited funding (which we effectively had), there could have been enough people working on this to have done what you suggest (and we still don't have enough people for high priority projects, despite, again, an effectively blank check!). And I think you're suggesting that we should have prioritized a single task, stopping Chinese BSL-2 work, based purely on post-hoc information, instead of pursuing the highest EV work as it was, IMO correctly, assessed at the time. 

But even granting prophecy, I think that there is no world in which even an extra billion dollars per year 2015-2020 would have been able to pay for enough people and resources to get your suggested change done. And if we had tried to push on the idea, it would have destroyed EA Bio's ability to do things now. And more critically, given any limited level of public attention and policy influence, focusing on mitigating existential risks instead of relatively minor events like COVID would probably have been the right move even knowing that COVID was coming! (Though it would certainly have changed the strategy so we could have responded better.)

comment by Davidmanheim · 2021-12-26T07:59:17.555Z · LW(p) · GW(p)

iGem seems to be a project about getting people to do more dangerous research and not a project about reducing the amount of dangerous research that happens. Such an organization has bad incentives to take on the virology community to stop them from doing harm.

 

Did you look at what Open Philanthropy is actually funding? https://igem.org/Safety 

Or would you prefer that safety people not try to influence education and safety standards of people actually doing the work? Because if you ignore everyone with bad incentives, you can't actually change the behaviors of the worst actors.

Replies from: ChristianKl
comment by ChristianKl · 2021-12-26T10:05:40.479Z · LW(p) · GW(p)

I don't think that funding this work is net negative. On the other hand, I don't think it could do what was necessary to prevent the coronavirus lab leak in 2019, or either of the two potential coronavirus lab leaks in 2021.

It took the White House Office of Science and Technology Policy to create the first moratorium because the NIH wasn't capable, and it would also take outside pressure to achieve anything else strong enough to be sufficient to deal with the problem.

Replies from: Davidmanheim
comment by Davidmanheim · 2021-12-26T12:57:31.475Z · LW(p) · GW(p)

You didn't respond to my comment that addressed this, but; "even granting prophecy, I think that there is no world in which even an extra billion dollars per year 2015-2020 would have been able to pay for enough people and resources to get your suggested change done. And if we had tried to push on the idea, it would have destroyed EA Bio's ability to do things now. And more critically, given any limited level of public attention and policy influence, focusing on mitigating existential risks instead of relatively minor events like COVID would probably have been the right move even knowing that COVID was coming!"

comment by aarongertler · 2021-12-21T21:48:32.196Z · LW(p) · GW(p)

Thanks for sharing a specific answer! I appreciate the detail and willingness to engage.

I don't have the requisite biopolitical knowledge to weigh in on whether the approach you mentioned seems promising, but it does qualify as something someone could have been doing pre-COVID, and a plausible intervention at that.

My default assumptions for cases of "no one in EA has funded X", in order from most to least likely:

  1. No one ever asked funders in EA to fund X.
  2. Funders in EA considered funding X, but it seemed like a poor choice from a (hits-based or cost-effectiveness) perspective. 
  3. Funders in EA considered funding X, but couldn't find anyone who seemed like a good fit for it.
  4. Various other factors, including "X seemed like a great thing to fund, but would have required acknowledging something the funders thought was both true and uncomfortable".

In the case of this specific plausible thing, I'd guess it was (2) or (3) rather than (1). While anything involving China can be sensitive, Open Phil and other funders have spent plenty of money on work that involves Chinese policy. (CSET got $100 million from Open Phil, and runs a system tracking PRC "talent initiatives" that specifically refers to China's "military goals" — their newsletter talks about Chinese AI progress all the time, with the clear implication that it's a potential global threat.)

That's not to say that I think (4) is impossible — it just doesn't get much weight from me compared to those other options.

FWIW, as far as I've seen, the EA community has been unanimous in support of the argument "it's totally fine to debate whether this was a lab leak". (This is different from the argument "this was definitely a lab leak".) Maybe I'm forgetting something from the early days when that point was more controversial, or I just didn't see some big discussion somewhere. But when I think about "big names in EA pontificating on leaks", things like this and this come to mind.

*****

Do you know of anyone who was trying to build out the gain-of-function project you mentioned during the time before the pandemic? And whether they ever approached anyone in EA about funding? Or whether any organizations actually considered this internally?

Replies from: Davidmanheim
comment by Davidmanheim · 2021-12-22T13:49:03.108Z · LW(p) · GW(p)

See my reply above, but this was actually none of your 4 options - it was "funders in EA were pouring money into this as quickly as they could find people willing to work on it." 

And the reasons no-one was pushing the specific proposal of "publicly shame China into stopping [so-called] GoF work" include the fact that US labs have done and still do similar work in only slightly safer conditions, as do microbiologists everywhere else, and that building public consensus about something no-one but a few specific groups of experts cares about isn't an effective use of funds.

Replies from: aarongertler
comment by aarongertler · 2021-12-24T07:14:09.233Z · LW(p) · GW(p)

Thanks for the further detail. It sounds like this wasn't actually a case of "no one in EA has funded X", which makes my list irrelevant. 

(Maybe the first item on the list should be "actually, people in EA are definitely funding X", since that's something I often find when I look into claims like Christian's, though it wasn't obvious to me in this case.)

comment by denkenberger · 2021-12-16T18:40:15.188Z · LW(p) · GW(p)

The substantive complaint was that they [ALLFED] did an invalid calculation when calculating the annual probability of nuclear war. They did a survey to establish a range of probabilities, then they averaged them. One could argue about what kinds of ‘average them’ moves work for the first year, but over time the lack of a nuclear war is Bayesian evidence in favor of lower probabilities and against higher probabilities. It’s incorrect to not adjust for this, and the complaint was not merely the error, but that the error was pointed out and not corrected.

TL;DR: ALLFED appreciates the feedback. We disagree that it was a mistake - there were smart people on both sides of this issue. Good epistemics are very important to ALLFED.

Full version:

Zvi is investigating the issue. I won’t name names, but suffice it to say, there were smart people disagreeing on this issue. We have been citing the fault tree analysis of the probability of nuclear war, which we think is the most rigorous study because it uses actual data. Someone did suggest that we should update the probability estimate based on the fact that nuclear war has not yet occurred (excluding World War II). Taking a look at the paper itself (see the top of page 9 and equation (5) on that page), for conditional probabilities of occurrence for which effectively zero historical occurrences have been observed out of n total cases when it could have occurred, the probability in the model was updated according to a Bayesian posterior distribution with a uniform prior and binomial likelihood function. Historical occurrences updated in this way were A) the conditional probability that Threat Assessment Conference (TAC)-level attack indicators will be promoted to a Missile Attack Conference (MAC), and (B) the conditional probability of leaders’ decision to launch in response to mistaken MAC-level indicators of being under attack. Based on this methodology, it would be double-counting to update their final distribution further based on the historical absence of accidental nuclear launches over the last 76 years.
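
(To illustrate the mechanics of that kind of update, here is a minimal sketch; this is purely illustrative and not the paper's actual code, and the 75 years of event-free data is a made-up example number:)

```python
# Posterior for an event probability p, starting from a uniform prior
# Beta(1, 1) and a binomial likelihood with k occurrences in n trials.
def posterior_mean(k: int, n: int) -> float:
    # The posterior is Beta(1 + k, 1 + n - k), whose mean is (k + 1) / (n + 2).
    return (k + 1) / (n + 2)

# Zero observed occurrences in, say, 75 years of data (illustrative number):
print(posterior_mean(0, 75))  # ~0.013, already reflecting the absence of events
```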

But what we do agree on is that if one starts with a high prior, one should update. And that's what was done by one of our coauthors for his model of the probability of nuclear war, and he got similar results to the fault tree analysis. Furthermore, the fault tree analysis was only for inadvertent nuclear war (one side thinking they are being attacked, and then "retaliating"). However, there are other mechanisms for nuclear war, including intentional attack, and accidental detonation of a nuclear weapon and escalation from there. Furthermore, though many people consider nuclear winter only possible for a US-Russia nuclear war, now that China has a greater purchasing power parity than the US, we think there is comparable combustible material there. So the possibility of US-China nuclear war or Russia-China nuclear war further increases probabilities. So even if there should be some updating downward on the inadvertent US-Russia nuclear war, I think the fault tree analysis still provides a reasonable estimate. I also explained this on my first 80k podcast.
 

Also, we say in the paper, "Considering uncertainty represented within our models, our result is robust: reverting the conclusion required simultaneously changing the 3-5 most important parameters to the pessimistic ends." So as Zvi has recognized, even if one thinks the probability of nuclear war should be significantly lower, the overall conclusion doesn't change. We have encouraged people to put their own estimates in.
 

Again, we really appreciate the feedback. Good epistemics are very important to us. We are trying to reach the truth. We want to have maximum positive impact on the world, so that's why we spend a significant amount of time on prioritization.

Replies from: Zvi, Lanrian
comment by Zvi · 2021-12-16T21:26:53.794Z · LW(p) · GW(p)

For clarity: Investigating this further is on my stack, but due to Omicron my stack doth overflow, so I don't know how long it will take me to get to it.

comment by Lukas Finnveden (Lanrian) · 2021-12-16T20:48:38.252Z · LW(p) · GW(p)

My interpretation of Zvi's point wasn't that your model should account for past lack of nuclear war, but that it should be sensitive to future lack of nuclear war. I.e., if you try to figure out the probability that nuclear war happens at least once over (e.g.) the next century, then if it doesn't happen in the next 50 years, you should assign lower probability to it happening in the 50 years after that. I wrote someone a slack message about this exact issue a couple of months ago; I'll copy it here in case that's helpful:

So here’s a tricky thing with your probability extrapolation: On a randomly chosen year, actors should give lower probabilities to p(nuclear war in Nyears) than the naive 1-[1-p(nuclear war next year)]^Nyears.

The reason for this is that the absence of nuclear war in any given year is positively correlated with the absence of nuclear war in any other given year. This positive correlation yields an increased probability that nuclear war will never happen in the given time period.

One way to recognise this: Say that someone assigns a 50% chance to the annual risk being exactly 0.2, and 50% chance to the annual risk being exactly 0.01. Then their best-guess for the next year is going to be 0.105. If this was the actual annual risk, then the probability of nuclear war over a decade would be 1-(1-0.105)^10 ~= 0.67. But their actual best guess for nuclear war next decade is going to be 0.5*(1-[1-0.2]^10)+0.5*(1-[1-0.01]^10) ~= 0.49

I think one useful framing of this is that, each year that a person sees that nuclear war didn’t happen, they’ll update towards a lower annual risk. So towards the end of the period, this person will have mostly updated away from the chance that the annual risk was 0.2, and they’ll think that the 0.01 estimate is more likely.

This whole phenomenon matters a lot more if the risks you’re dealing with are large than if they’re small. Take the perspective in the most recent paragraph: If the risk is small each year, then each year without nuclear apocalypse won’t update you very much. Without updates, using constant annual probabilities is more reasonable.

To be concrete, if we lived in the year 1950, then I think it’d be reasonable to assign really high probability to nuclear war in the next few decades, but then assume that — if we survive the next few decades — that must be because the risk is low. So the risk over the 200 years isn’t that much higher than the risk over the next few decades.

In the year 2021, we’ve already seen a lot of years without nukes, so we already have good reason to believe that nukes are rare. So we won’t update a lot on seeing a few extra decades without nukes. So extrapolating annual risks over the next few decades seems fine. Extrapolating it all the way to 2100 is a little bit shakier, though. Maybe I’d guess there’d be like 2-10 percentage points difference, depending on how you did it.
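
(A minimal sketch of that arithmetic, purely illustrative, using the two hypothesized annual risks and the ten-year horizon from the example above:)

```python
# Illustrative only: naive extrapolation of a single best-guess annual rate
# versus averaging over the uncertainty about which annual rate is true.
risks = [0.20, 0.01]   # hypothesized annual probabilities of nuclear war
weights = [0.5, 0.5]   # credence assigned to each hypothesis
years = 10

# Best guess for next year: 0.5 * 0.2 + 0.5 * 0.01 = 0.105
p_next_year = sum(w * r for w, r in zip(weights, risks))

# Naive: treat that best guess as the true constant annual rate.
p_naive = 1 - (1 - p_next_year) ** years  # ~0.67

# Mixture: average the decade-level probability under each hypothesis; this
# implicitly updates toward the lower rate as war-free years accumulate.
p_mixture = sum(w * (1 - (1 - r) ** years) for w, r in zip(weights, risks))  # ~0.49

print(round(p_next_year, 3), round(p_naive, 2), round(p_mixture, 2))
```
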
Replies from: denkenberger
comment by denkenberger · 2022-03-27T22:31:40.499Z · LW(p) · GW(p)

Zvi has now put a postscript in the ALLFED section above. We have updated the inadvertent nuclear war fault tree model result based on no nuclear war since the data stopped coming in, and also reduced the annual probability of nuclear war further going forward. And then, so as to not overclaim on cost effectiveness, we did not include a correction for non-inadvertent US/Russia nuclear war or conflict with China. Resilient foods are still highly competitive with AGI safety according to the revised model.

comment by Wei Dai (Wei_Dai) · 2021-12-14T18:54:21.269Z · LW(p) · GW(p)

AIObjectives@Foresight

Suggest linking this to https://ai.objectives.institute/blog/ai-and-the-transformation-of-capitalism, as I had to spend a few minutes searching for it. Also, would be interested in why you think it's strongly net negative. (It doesn't seem like a great use of money to me, but not obviously net negative either, so I'm interested in the part of your model that's generating this conclusion.)

Replies from: Zvi
comment by Zvi · 2021-12-14T19:11:32.098Z · LW(p) · GW(p)

Was following the principle of not linking to things I consider negative. Considered not even talking about it for the same reason.

Their principle is to bring AI 'under democratic control' and then use it as a tool to force AI to enforce their political agenda, which I find both negative as a thing to be doing with an arbitrary payload (e.g. no one should be doing this, it's both poisoning the banana further and ensuring the fight is over which monkey gets it), and I am also strongly opposed to the payload in question (so if we are going to fight, which they are determined to do, they're also the wrong monkeys). 

Replies from: pde, Wei_Dai
comment by pde · 2021-12-21T06:16:01.758Z · LW(p) · GW(p)

Hi!

I'm a co-founder of the AI Objectives Institute. We're pretty interested in the critical view you've formed about what we're working on! We think it's most likely that we just haven't done a very good job of explaining our thinking yet -- you say we have a political agenda, but as a group we're trying quite hard to avoid having an inherent object-level political agenda, and we're actively looking for input from people with different political perspectives than ours. It's also quite possible that you have deep and reasonable criticisms of our plan, that we should take on board. Either way, we'd be interested in having a conversation, trying to synchronize models and looking for cruxes for any disagreements, if you're open to it!

Replies from: Zvi, quinn-dougherty
comment by Zvi · 2022-01-02T21:31:19.174Z · LW(p) · GW(p)

Sorry I didn't reply earlier, been busy. I would be happy to have a call at some point, you can PM me contact info that is best.

I do think we have disagreements beyond a political agenda, but it is always possible communication fell short somehow.

If you don't have a political agenda I would say your communications seem highly misleading, in the sense that they seem to clearly indicate one.

comment by Quinn (quinn-dougherty) · 2022-01-02T21:11:15.528Z · LW(p) · GW(p)

I think there exists a generic risk of laundering problem. If you say "capitalism is suboptimal" or "we can do better" people are worried about trojan horses, people worry that you're just toning it down to garner mainstream support when behind closed doors you'd look more like "my specific flavor of communism is definitely the solution". I'm not at all saying I got those vibes from the "transformation of capitalism" post, but that I think it's plausible someone could get those vibes from it. Notably, the book "Inadequate Equilibria" was explicitly about how capitalism is suboptimal and rigorously asks us if we can improve upon it, and managed not to raise anybody's alarms about it being a secret communist plot. I guess because it signaled against such a reading by taking the aesthetic/vocabulary of academic econ.

Replies from: pde
comment by pde · 2022-01-15T05:33:00.962Z · LW(p) · GW(p)

Our first public communications probably over-emphasized one aspect of our thinking, which is that some types of bad (or bad on some people's preferences) outcomes from markets can be thought of as missing components of the objective function that those markets are systematically optimizing for. The corollary of that absolutely isn't that we should dismantle markets or capitalism, but that we should take an algorithmic approach to whether and how to add those missing incentives.

A point that we probably under-emphasized at first is that intervening in market systems (whether through governmental mechanisms like taxes, subsidies or regulation, or through private sector mechanisms like ESG objectives or product labeling schemes) has a significant chance of creating bad and unintended consequences via Goodhart's law and other processes, and that these failures can be viewed as deeply analogous to AI safety failures.

We think that people with left and right-leaning perspectives on economic policy disagree in part because they hold different Bayesian priors about the relative likelihood of something going wrong in the world because markets fail to optimize for the right outcome, or because some bureaucracy tried to intervene in people's lives or in market processes in an unintentionally (or deliberately) harmful way.

To us it seems very likely that both kinds of bad outcomes occur at some rate, and the goal of the AI Objectives Institute is to reduce rates of both market and regulatory failures. Of course there are also political disagreements about what goals should be pursued (which I'd call object level politics, and which we're trying not to take strong organizational views on) and on how economic goals should be chosen (where we may be taking particular positions, but we'll try to do that carefully).

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2022-01-16T00:39:15.867Z · LW(p) · GW(p)

some types of bad (or bad on some people’s preferences) outcomes from markets can be thought of as missing components of the objective function that those markets are systematically optimizing for.

This framing doesn't make a lot of sense to me. From my perspective, markets are unlike AI in that there isn't a place in a market's "source code" where you can set or change an objective function. A market is just a group of people, each pursuing their own interests, conducting individual voluntary trades. Bad outcomes of markets come not from wrong objective functions given by some designers, but are instead caused by game theoretic dynamics that make it difficult or impossible for a group of people pursuing their own interests to achieve Pareto efficiency. (See The Second Best [LW · GW] for some pointers in this direction.)

Can you try to explain your perspective to someone like me, or point me to any existing writings on this?

To us it seems very likely that both kinds of bad outcomes occur at some rate, and the goal of the AI Objectives Institute is to reduce rates of both market and regulatory failures.

There is a big literature in economics on both market and government/regulatory failures. How familiar are you with it, and how does your approach compare with the academic mainstream on these topics?

comment by Wei Dai (Wei_Dai) · 2021-12-14T19:51:04.082Z · LW(p) · GW(p)

Was following the principle of not linking to things I consider negative.

What's the thinking behind this? (Would putting the link in parentheses or a footnote work for your purposes? I'm just thinking of the amount of time being wasted by your readers trying to find out what the institute is about.)

Their principle is to bring AI ‘under democratic control’ and then use it as a tool to force AI to enforce their political agenda

Ok, thanks. I guess one of their webpages does mention "through democratic consultation" but that didn't jump out as very salient to me until now.

Replies from: Zvi
comment by Zvi · 2021-12-15T00:04:23.097Z · LW(p) · GW(p)

In the internet age, attention is like oxygen or life. That's especially true for a charity, but everyone lives on clicks and views, and a common strategy is to trick people into going 'hey check out this awful thing.' 

If they hadn't been funded, or their name had been obscured by the veto, I wouldn't have included their name at all (as I didn't for several others that I mention briefly but that weren't funded). 

Replies from: Wei_Dai, Zach Stein-Perlman
comment by Wei Dai (Wei_Dai) · 2021-12-15T23:22:48.767Z · LW(p) · GW(p)

In that case, perhaps copy/paste a longer description of the organization in a footnote, so the reader can figure out what the organization is trying to do, without having to look them up?

comment by Zach Stein-Perlman · 2021-12-15T00:53:06.570Z · LW(p) · GW(p)

This makes some sense. On the other hand, not naming such organizations means you can't share your skepticism about specific organizations with the rest of us, who might benefit from hearing it.

comment by Mau (Mauricio) · 2021-12-15T07:20:22.183Z · LW(p) · GW(p)

In my model, one should be deeply skeptical whenever the answer to ‘what would do the most good?’ is ‘get people like me more money and/or access to power.’ One should be only somewhat less skeptical when the answer is ‘make there be more people like me’ or ‘build and fund a community of people like me.’ [...] I wish I had a better way to communicate what I find so deeply wrong here

I'd be very curious to hear more fleshed-out arguments here, if you or others think of them. My best guess about what you have in mind is that it's a combination of the following (lumping all the interventions mentioned in the quoted excerpt into "power-seeking"):

  1. People have personal incentives and tribalistic motivations to pursue power for their in-group, so we're heavily biased toward overestimating its altruistic value.
  2. Seeking power occupies resources and attention that could be spent figuring out how to solve problems, and figuring out how to solve problems is very valuable.
  3. Figuring out how to solve problems isn't just very valuable. It's necessary for things to go well, so mainly doing power-seeking makes it way too easy for us to get the mistaken impression that we're making progress and things are going well, while a crucial input into things going well (knowing what to do with power) remains absent.
  4. Power-seeking attracts leeches (which wastes resources and dilutes relevant fields).
  5. Power-seeking pushes people's attention away from object-level discussion and learning. (This is different from (3) in that (3) is about how power-seeking impacts a specific belief, while this point is about attention.)
  6. Power-seeking makes a culture increasingly value power for its own sake, which is bad for the usual reasons that value drift is bad.

If that's it (is it?), then I'm more sympathetic than I was before writing out the above, but I'm still skeptical:

  • Re: 1: Speaking of object-level arguments, object-level arguments for the usefulness of power and field growth seem very compelling (and simple enough to significantly reduce room for bias).
  • 4 mainly seems like a problem with poorly executed power-seeking (although maybe that's hard to avoid?).
  • 2-5 and 6 seem to be horrific problems mostly just if power-seeking is the main activity of a community, rather than one of several activities.

(One view from which power-seeking seems much less valuable is if we assume that, on the margin, this kind of power isn't all that useful for solving key problems. But if that were the crux, I'd have expected the original criticism to emphasize the (limited) benefits of power-seeking, rather than its costs.)

Replies from: Zvi
comment by Zvi · 2021-12-19T13:12:00.726Z · LW(p) · GW(p)

This is a good first attempt and it is directionally correct as to what my concerns are.

The big difference is something like your apparent instinct that these problems are practical and avoidable, limited in scope and only serious if you go 'all-in' on power or are 'doing it wrong' in some sense. 

Whereas my model says that these problems are unavoidable even under the best of circumstances, and at best you can mitigate them. The scope of the issue is sufficient to reverse the core values of those involved and the core values being advanced by the groups involved. The problems scale with the degree to which you pay attention to seeking power and money, but can be fatal well before you go 'all-in' (and if you do go 'all-in' you have almost certainly already lost The Way, and if you haven't you're probably about to, quickly, even if you make an extraordinary effort not to), and it is a 'shut-up-and-do-the-impossible' level task to not 'do it wrong' at any scale. And yet money and power are highly valuable, so these problems are really hard and balance must be found, which is why I say deeply skeptical rather than 'kill it with fire without looking first.'

You're also mostly not noticing the incentive shifts that happen very generally under such circumstances, focusing on a few particular failure modes or examples but missing most of it. 

I do think that power tends to be not as useful as one thinks it will be, and that's largely because the act of seeking power constrains your ability to use it to accomplish the things you wanted to do in the first place, both by changing you (your habits, your virtues and values, your associates, your skills, your cultural context and what you think of as normal, what you associate with blame...) and the situation around you more generally, and because you'll notice the tension between executing your thing and continuing to protect and grow your power. 

Also because when we say 'grow your power' there's always the question of 'what do you mean we, kemosabe?' Whose power? It will tend to go to the subset of you that desires to seek power, and it will tend to accrue to the moral mazes that you help create, and it will not be well-directed. Growing a field is a noble goal, but the field you get is not a larger version of the thing you started with. And if you convince someone that 'EA good' and get them to give away some money, you're not going to get EA-average choices made, you're going to do worse, and the same goes for the subclasses x-risk or AI safety. 

Anyway, yes, I would hope at some point in the future to be able to give several topics here a more careful treatment. 

comment by Error · 2021-12-15T16:54:26.490Z · LW(p) · GW(p)

The only concierge service I know about, which I somehow got access to, is completely useless to me because it assumes I’m super rich, and I’m not, and also the people who work there can’t follow basic directions or handle anything at all non-standard.

This is my experience, too, with almost any form of assistance. Actual thinking about the task is absent.

It's annoying, because an obnoxiously large proportion of life goes towards 1. doing all the fiddly stupid bits, 2. procrastinating about doing all the fiddly stupid bits, and 3. worrying about procrastinating too much about doing all the fiddly stupid bits. I would love to not have to deal with that. I've automated or outsourced everything I can, but it's never enough.

I suspect the degree of life-competence needed to be good at “personal assistant tasks” is scarce enough that anyone capable of doing it well is also capable of getting a job that pays better. General personal assistance requires non-cached thought, and most people can't do that on demand, if ever. Task-specific assistance can often be had at a reasonable price, because trainable habits can make up for thought. Sadly, in most cases that just replaces the original task with a more-difficult acquire-task-appropriate-services task, so it's only worth it for long-term maintenance, like cleaning.

...having written that, I wonder if there's a task-specific assistant service for "finding good task-specific services and arranging their help." Probably not. Knowing who's good at X often requires being good at X to begin with.

(unimportant, but related and maybe interesting: I get my groceries curbside or delivered. I'd rather pay for delivery, most of the time, but the curbside service is significantly more accurate and requires less interaction. I think it's because curbside groceries are collected by store employees who can proxy-shop the store by habit, while deliveries are third-party and less familiar with the specific store)

Replies from: Dagon
comment by Dagon · 2021-12-15T17:19:47.302Z · LW(p) · GW(p)

Much of the difficulty in outsourcing is a simple result of principal-agent problems. Almost nobody can pay enough to get someone as competent as they are to think about their problems as deeply as they do. Only tasks with a pretty significant repeatability and efficiency premium (that is, tasks that actually take less time when done by someone else, without loss in quality) can trivially be offloaded. Everything else takes a fair bit of analysis and meta-planning to get someone else to do tolerably.

This changes for the VERY rich - if being your PA/Butler pays well enough to be done by a smart, motivated person, you can shed a lot of things.  From what I can tell, it's not a smooth transition, though - normal people have to do most of their crap chores themselves, medium-rich people can outsource the trivial ones (gardening, grocery shopping, some parts of cleaning) and not the difficult ones (travel planning, making the grocery list to shop from, organizing things), and only the super-rich can really just forget the things they don't care about.

comment by romeostevensit · 2021-12-14T17:25:09.917Z · LW(p) · GW(p)

If we're currently in a losing scenario then we want to increase the variance in our betting strategy. But most (all?) collective group decision procedures decrease the variance.
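
A toy simulation of the first claim, with arbitrary numbers: when your expected trajectory falls short of the threshold you need to clear, a lower-EV but higher-variance bet can win more often.

```python
import random

NEED, START, TRIALS = 100, 50, 100_000   # arbitrary numbers: we're behind by 50

def win_rate(payoff):
    """Fraction of trials in which one draw from `payoff` gets us over the line."""
    return sum(START + payoff() >= NEED for _ in range(TRIALS)) / TRIALS

low_var  = lambda: 20                                    # sure gain of 20 (higher EV)
high_var = lambda: 60 if random.random() < 0.5 else -60  # EV of 0, but a big spread

print("low-variance win rate: ", win_rate(low_var))    # ~0.0
print("high-variance win rate:", win_rate(high_var))   # ~0.5
```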

comment by Rohin Shah (rohinmshah) · 2021-12-16T06:53:57.896Z · LW(p) · GW(p)

In my model, one should be deeply skeptical whenever the answer to ‘what would do the most good?’ is ‘get people like me more money and/or access to power.’ One should be only somewhat less skeptical when the answer is ‘make there be more people like me’ or ‘build and fund a community of people like me.’

[...]

Lightcone Infrastructure

[...]

if you gave that person [...] a credit card and told me ‘seriously, use this for whatever you want as long as it’s putting you in a better position to do the stuff worth doing, it doesn’t matter, stop caring about money unless it’s huge amounts.’ 

Seems like these are pretty centrally "build and fund a community of people like me". Do you agree with this, and think that you have enough evidence to overcome the default skepticism? Or do you think Lightcone doesn't actually fall into that reference class?

My guess is that it's the first one, and that you'd say that unlike the vast majority of other orgs, Lightcone is clearly composed of people who are Doing Thing and will lead to other people Doing Thing better, and this isn't true of most other things where you have default skepticism?

comment by Lukas Finnveden (Lanrian) · 2021-12-16T15:01:09.648Z · LW(p) · GW(p)

I should also mention CLTR@EAF.

I want to note that the preferred acronym is CLR. This wouldn't matter except that there's now another EA-adjacent organisation (the Center on Long-Term Resilience) that does use CLTR as its acronym.

Replies from: Zvi
comment by Zvi · 2021-12-19T12:53:50.240Z · LW(p) · GW(p)

Yikes. I was copying the name that appeared on their application title. 

comment by bfinn · 2021-12-16T00:12:46.353Z · LW(p) · GW(p)

The Lightcone thing sounds a lot like how university tenure used to be in the UK, at least at Oxford & Cambridge. When I was a student there were academics with lifelong tenure who had had no formal responsibilities for decades, lived in comfortable college rooms in beautiful surroundings, with free food and unlimited intellectual stimulation & company. Some used this privilege to continue their research unhindered, wrote books, lectured etc.; others did little or nothing for the rest of their lives, and were rarely even seen. (Which is not to criticize the system; some wastage is inevitable.)

comment by Quinn (quinn-dougherty) · 2021-12-15T09:26:12.455Z · LW(p) · GW(p)

I deeply appreciate how you feel about EA becoming a self-perpetuating but misaligned engine. It's much stronger writing than what I've told people (usually on discord) when they bring up EA movement building as a cause area.

I think more can be said about TMM. One angle is patience: the idea that we can think of EA institutions as being an order of magnitude or several more wealthy in the future, instead of thinking of them as we currently do. Combine this insight with some moderate credence in 'we are not in the hinge of history' [? · GW], and you could turn the hold option from the S-process into a full-throated choice to invest. This way, the quality bar (the difficult search for fund-worthy people/projects) wouldn't be lowered by any of the TMM pressures.
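
As a rough sense of the timescale "patience" implies (purely back-of-the-envelope; the real return rates below are assumptions, not predictions about EA fundraising):

```python
# How long does "an order of magnitude more wealthy" take if the money just compounds?
import math

for real_return in (0.05, 0.07, 0.10):
    years = math.log(10) / math.log(1 + real_return)
    print(f"at {real_return:.0%} real return, 10x takes ~{years:.0f} years")
```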

However, there remain the considerations about institutions becoming self-perpetuating and misaligned, as a special case of social scenes or movements becoming self-perpetuating and misaligned. It would be really difficult to reconcile patience with your pointing out that we don't know how to Do Actual Thing, that as a society we might be getting worse at it, and that as a movement EAs aren't exceptional. As an intuition pump, imagine a researcher trying to figure out when, why, and how Harvard arrived at Not Doing Actual Thing: clearly the size of the endowment isn't the key variable in Harvard's expected impact relative to a counterfactual universe in which Harvard was Doing Actual Thing.

My vision is for somebody to found an org for one 6-18 month project and promise to shutter the org immediately after. Combine this with stellar impact metrics and I think you have a recipe for EA busting out of its Not Doing Actual Thing shackles (the impact metrics part is the hard part).

Replies from: ChristianKl
comment by ChristianKl · 2021-12-22T20:35:44.096Z · LW(p) · GW(p)

My vision is for somebody to found an org for one 6-18 month project and promise to shutter the org immediately after. Combine this with stellar impact metrics and I think you have a recipe for EA busting out of its Not Doing Actual Thing shackles (the impact metrics part is the hard part).

If you shut down the org after 18 months, how will you evaluate them based on impact metrics in any meaningful way?

Replies from: quinn-dougherty
comment by Quinn (quinn-dougherty) · 2021-12-26T10:42:23.790Z · LW(p) · GW(p)

I was probably assuming third party evaluator. I think the individuals should be free to do another project while they wait for the metrics to kick in / the numbers to come back. I think if the metrics come back and it turns out they had done a great job, then they should gain social capital to spend on their future projects, and maybe return to a project similar to the one they shuttered in the future.

You're right that this is a problem if the metrics are expected to be done in house!

Replies from: Zvi
comment by Zvi · 2021-12-28T14:38:12.482Z · LW(p) · GW(p)

Metrics are everywhere and always a problem. If the project doesn't continue and the metrics are used to judge the person's performance, it's even more of a Goodhart issue, so I'd be very cautious about judging via known metrics, unless a given situation provides a very good fit. 

comment by philh · 2021-12-19T12:50:31.425Z · LW(p) · GW(p)

so the money likely does not harm on top of everything.

(Sounds like this should be "does net harm"? I don't normally point out typos but this one reverses the meaning, so.)

Replies from: Zvi
comment by Zvi · 2021-12-19T12:52:54.263Z · LW(p) · GW(p)

Thank you. You are correct, and this is an important one to fix (with posts like this it's impossible to fix all typos without an editor).

comment by habryka (habryka4) · 2023-01-16T05:51:03.938Z · LW(p) · GW(p)

I have been doing various grantmaking work for a few years now, and I genuinely think this is one of the best and most important posts to read for someone who is themselves interested in grantmaking in the EA space. It doesn't remotely cover everything, but almost everything that it does say isn't said anywhere else.

comment by lsusr · 2021-12-14T15:49:17.967Z · LW(p) · GW(p)

There's a lot I like in this post. Here are some highlights.

Enough Money…is a sweet spot where money holds its meaning, and you care about value at all, but lack of money doesn’t hold you back from your goals. Of course, if you had orders of magnitude more money, perhaps you’d have different goals, or at least different methods to seek your goals. I know I would. But in a given local context, you can still have EM.

I like this way of thinking about things.

It’s also plausible that this is actually a hidden Universally Correct Strategy for society as a whole and we should be giving everyone who is 22, passes some basic checks and asks nicely either these types of resources or a job providing them to those who get the resources, or something, encouraging them to start businesses and unique projects and the world ends up a much better place, or something like that, although I haven’t gamed that out fully, but it isn’t obviously stupid.

If this works it would unlock massive potential. If it fails it would waste only a little bit of money on the societal scale.

comment by Charlie Steiner · 2021-12-14T21:02:29.745Z · LW(p) · GW(p)

Thanks for sharing in such exhaustive detail.

This makes me want to say "damn the trust dynamics" and just start applying to get funding for an entire new research institute. Though I suspect the trust dynamics would catch up to me (or people like me) pretty quickly.

comment by Joachim Bartosik (joachim-bartosik) · 2021-12-22T13:57:33.410Z · LW(p) · GW(p)

you presumably don’t hand out any company credit cards at least outside of special circumstances.

This reminded me of an anecdote from "Surely You're Joking, Mr. Feynman!" where Feynman says that he

had been used to giving lectures for some company or university or for ordinary people, not for the government. I was used to, "What were your expenses?" -- "So-and-so much." -- "Here you are, Mr. Feynman."

I remember reading that and thinking that it's different from what I have to do (at a private company) when I want to expense something. I wonder if things were really done differently back then.  And how people made it work.

comment by Zach Stein-Perlman · 2021-12-14T14:44:04.294Z · LW(p) · GW(p)

There will inevitably be four different copies of this post, if not more – My Substack and WordPress copies, the LessWrong copy, and then EA Forum. I apologize in advance for the inevitable lack of engagement in some or all of those places, depending on how it goes.

Do you prefer comments on a particular copy? I'm not sure what the Schelling point is.

Replies from: Zvi
comment by Zvi · 2021-12-14T16:05:09.371Z · LW(p) · GW(p)

They're very different places, so I don't think there's a 'right' place for this. The EA Forum copy I expect to be relatively disconnected from for overdetermined reasons, but they were going to end up with a copy regardless, so I figured I'd provide it myself (all my writing is creative commons provided authorship and links back are provided, and I'm sure they will Have Thoughts). My guess is that if you encountered it here first you should be discussing it here.