It's been a long time since I read those books, but if I'm remembering roughly right: Asimov seems to describe a world where choice is in a finely balanced equilibrium with other forces (I'm inclined to think: implausibly so -- if it could manage this level of control at great distances in time, one would think that it could manage to exert more effective control over things at somewhat less distance).
I've now sent emails contacting all of the prize-winners.
Actually, on 1) I think that these consequentialist reasons are properly just covered by the later sections. The section in question is about reasons it's maybe bad to make the One Ring, ~regardless of the later consequences. So it makes sense to emphasise the non-consequentialist reasons.
I think there could still be some consequentialist analogue of those reasons, but they would be more esoteric, maybe something like decision-theoretic, or appealing to how we might want to be treated by future AI systems that gain ascendancy.
- Yeah. As well as another consequentialist argument, which is just that it will be bad for other people to be dominated. Somehow the arguments feel less natively consequentialist, and so it seems somehow easier to hold them in these other frames, and then translate them into consequentialist ontology if that's relevant; but also it would be very reasonable to mention them in the footnote.
- My first reaction was that I do mention the downsides. But I realise that that was a bit buried in the text, and I can see that that could be misleading about my overall view. I've now edited the second paragraph of the post to be more explicit about this. I appreciate the pushback.
Ha, thanks!
(It was part of the reason. Normally I'd have made the effort to import, but here I felt a bit like maybe it was just slightly funny to post the one-sided thing, which nudged against linking rather than posting; and also I thought I'd take the opportunity to see experimentally whether it seemed to lead to less engagement. But those reasons were not overwhelming, and now that you've put the full text here I don't find myself very tempted to remove it. :) )
The judging process should be complete in the next few days. I expect we'll write to winners at the end of next week, although it's possible that will be delayed. A public announcement of the winners is likely to be a few more weeks away.
I don't see why (1) says you should be very early. Isn't the decrease in measure for each individual observer precisely outweighed by their increasing multitudes?
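(To spell out the cancellation I have in mind -- a rough sketch in my own notation, assuming total measure is conserved under branching:)

```latex
% Rough sketch (my notation): n(t) = number of observers at time t,
% \mu(t) = measure per individual observer at time t.
% If branching conserves total measure, n(t)\,\mu(t) is constant, so the
% anthropic weight on finding yourself at time t is
\[
  W(t) \;\propto\; n(t)\,\mu(t) \;=\; \text{const},
\]
% i.e. the per-observer decrease in measure is exactly cancelled by the growing
% number of observers, giving no update towards being very early.
```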
This kind of checks out to me. At least, I agree that it's evidence against treating quantum computers as primitive that humans, despite living in a quantum world, find classical computers more natural.
I guess I feel more like I'm in a position of ignorance, though, and wouldn't be shocked to find some argument that quantum has, in some other a priori sense, a deep naturalness which other niche physics theories lack.
You say that quantum computers are more complex to specify, but is this a function of using a classical computer in the speed prior? I'm wondering if it could somehow be quantum all the way down.
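(For concreteness, here is one simplified speed-prior form -- a sketch in my own notation, not necessarily the exact formulation at stake -- just to show where the reference machine enters:)

```latex
% One simplified speed-prior form (a sketch; U is the reference machine,
% |p| the length of program p, t_U(p) its runtime on U):
\[
  S_U(x) \;\propto\; \sum_{p \,:\, U(p) = x} 2^{-|p|} \, \frac{1}{t_U(p)}.
\]
% Swapping a classical U for a quantum one changes |p| by at most an additive
% constant (each can simulate the other with constant description overhead),
% but can change t_U(p) super-polynomially -- so the choice of reference
% machine seems to matter mainly via the runtime term rather than the
% description-length term.
```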
It's not obvious that open source leads to faster progress. Having high quality open source products reduces the incentives for private investment. I'm not sure in which regimes that will play out to be overall accelerationist, but I sort of guess that it will be decelerationist during an intense AI race (where the investments needed to push the frontier out are enormous and significantly profit-motivated).
I like the framework.
Conceptual nit: why do you include inhibitions as a type of incentive? It seems to me more natural to group them with internal motivations than external incentives. (I understand that they sit in the same position in the argument as external incentives, but I guess I'm worried that lumping them together may somehow obscure things.)
I actually agree with quite a bit of this. (I nearly included a line about pursuing excellence in terms of time allocation, but it seemed possibly redundant with some of the other stuff on not making the perfect the enemy of the good, and I couldn't quickly see how to fit it cleanly into the flow of the post, so I left it and moved on ...)
I think it's important to draw the distinction between perfection and excellence. Broadly speaking, I think people often put too much emphasis on perfection, and often not enough on excellence.
Maybe I shouldn't have led with the “Anything worth doing, is worth doing right” quote. I do see that it's closer to perfectionist than excellence-seeking, and I don't literally agree with it. Though one thing I like about the quote is the corollary: "anything not worth doing right isn't worth doing" — again something I don't literally agree with, but something I think captures an important vibe.
I do think people in academia can fail to find the corners they should be cutting. But I also think that they write a lot of papers that (to a first approximation) just don't matter. I think that academia would be a healthier place if people invested more in asking "what's the important thing here?" and focusing on that, and not trying to write a paper at all until they thought they could write one with the potential to be excellent.
No, multi-author submissions are welcome! (There's space to disclose this on the entry form.)
Can you say more about why you believe this? At first glance, it seems to me like "fundamental instability" is much more tied to how AI development goes, so I would've expected it to be more tractable [among LW users].
Maybe "simpler" was the wrong choice of word. I didn't really mean "more tractable". I just meant "it's kind of obvious what needs to happen (even if it's very hard to get it to happen)". Whereas with fundamental instability it's more like it's unclear if it's actually a very overdetermined fundamental instability, or what exactly could nudge it to a part of scenario space with stable possibilities.
In a post-catastrophe world, it seems quite plausible to me that the rebounding civilizations would fear existential catastrophes and dangerous technologies and try hard to avoid technology-induced catastrophes.
I agree that it's hard to reason about this stuff so I'm not super confident in anything. However, my inside view is that this story seems plausible if the catastrophe seems like it was basically an accident, but less plausible for nuclear war. Somewhat more plausible is that rebounding civilizations would create a meaningful world government to avoid repeating history.
Just a prompt to say that if you've been kicking around an idea of possible relevance to the essay competition on the automation of wisdom and philosophy, now might be the moment to consider writing it up -- entries are due in three weeks.
My take is that in most cases it's probably good to discuss publicly (but I wouldn't be shocked to become convinced otherwise).
The main plausible reason I see for it potentially being bad is if it were drawing attention to a destabilizing technology that otherwise might not be discovered. But I imagine most thoughts are kind of going to be chasing through the implications of obvious ideas. And I think that in general having the basic strategic situation be closer to common knowledge is likely to reduce the risk of war.
(You might think the discussion could also have impacts on the amount of energy going into racing, but that seems pretty unlikely to me?)
The way I understand it, this could work by democratic leaders with "democracy-aligned AI" getting more effective influence over nondemocratic figures (by fine-tuned persuasion, some kind of AI-designed political zugzwang, etc.), thus reducing totalitarian risks. Is my understanding correct?
Not what I'd meant -- rather, that democracies could demand better oversight of their leaders, and so reduce the risk of democracies slipping into various traps (corruption, authoritarianism).
My mainline guess is that information about bad behaviour by Sam was disclosed to them by various individuals, and they owe a duty of confidence to those individuals (where revealing the information might identify the individuals, who might thereby become subject to some form of retaliation).
("Legal reasons" also gets some of my probability mass.)
OK hmm I think I understand what you mean.
I would have thought about it like this:
- "our reference class" includes roughly the observations we make before observing that we're very early in the universe
- This includes stuff like being a pre-singularity civilization
- The anthropics here suggest there won't be lots of civs later arising and being in our reference class and then finding that they're much later in universe histories
- It doesn't speak to the existence or otherwise of future human-observer moments in a post-singularity civilization
... but as you say anthropics is confusing, so I might be getting this wrong.
I largely disagree (even now I think having tried to play the inside game at labs looks pretty good, although I have sometimes disagreed with particular decisions in that direction because of opportunity costs). I'd be happy to debate if you'd find it productive (although I'm not sure whether I'm disagreeable enough to be a good choice).
I think point 2 is plausible but doesn't super support the idea that it would eliminate the biosphere; if it cared a little, it would be fairly cheap for it to take some actions to preserve at least a version of the biosphere (including humans), even while starlifting the sun.
Point 1 is the argument which I most see as supporting the thesis that misaligned AI would eliminate humanity and the biosphere. And then I'm not sure how robust it is (it seems premised partly on translating our evolved intuitions about discount rates over to imagining the scenario from the perspective of the AI system).
Wait, how does the grabby aliens argument support this? I understand that it points to "the universe will be carved up between expansive spacefaring civilizations" (without reference to whether those are biological or not), and also to "the universe will cease to be a place where new biological civilizations can emerge" (without reference to what will happen to existing civilizations). But am I missing an inferential step?
I think that you're right that people's jobs are a significant thing driving the difference here (thanks), but I'd guess that the bigger impact of jobs is via jobs --> culture than via jobs --> individual decisions. This impression is based on a sense of "when visiting Constellation, I feel less pull to engage in open-ended idea exploration than I did at FHI", as well as "at FHI, I think people whose main job was something else would still not-infrequently spend some time engaging with the big open questions of the day".
I might be wrong about that ¯\_(ツ)_/¯
I feel awkward about trying to offer examples because (1) I'm often bad at that when on the spot, and (2) I don't want people to over-index on particular ones I give. I'd be happy to offer thoughts on putative examples, if you wanted (while being clear that the judges will all ultimately assess things as seem best to them).
Will probably respond to emails on entries (though the response might be to decline to comment on aspects of an entry).
I don't really disagree with anything you're saying here, and am left with confusion about what your confusion is about (it seemed like you were offering these as examples of disagreement?).
(Caveat: it's been a while since I've visited Constellation, so if things have changed recently I may be out of touch.)
I'm not sure that Constellation should be doing anything differently. I think there's a spectrum from blue-skies thinking to being highly prioritized on the most important things. I think that FHI was more towards the first end of this spectrum, and Constellation is more towards the latter. I think that there are a lot of good things that come with being further in that direction, but I do think it means you're less likely to produce very novel ideas.
To illustrate via caricatures in a made-up example: say someone turned up in one of the offices and said "OK here's a model I've been developing of how aliens might build AGI". I think the vibe in Constellation would trend towards people being interested to chat about it for fifteen minutes at lunch (questions a mix of the treating-it-as-a-game and the pointed but-how-will-this-help-us), and then saying they've got work they've got to get back to. I think the vibe at FHI would have trended more towards people treating it as a serious question (assuming there's something interesting to the model), generating an impromptu 3-hour conversation at a whiteboard with four people fleshing out details and variations, which ends with someone volunteering to send round a first draft of a paper. I also think Constellation is further in the direction of being bought into some common assumptions than FHI was; e.g. it would seem to me more culturally legit to start a conversation questioning whether AI risk was real at FHI than at Constellation.
I kind of think there's something valuable about the Constellation culture on this one, and I don't want to just replace it with the FHI one. But I think there's something important and valuable about the FHI thing which I'd love to see existing in some more places.
(In the process of writing this comment it occurred to me that Constellation could perhaps decide to have some common spaces which try to be more FHI-like, while trying not to change the rest. Honestly I think this is a little hard without giving that subspace a strong distinct identity. It's possible they should do that; my immediate take now that I've thought to pose the question is that I'm confused about it.)
I completely agree that Oliver is a great fit for leading on research infrastructure (and the default thing I was imagining was that he would run the institute; it's possible it would be even better if he could arrange to be number two with a strong professional lead, giving him more freedom to focus attention on new initiatives within the institute, but that isn't where I'd start). But I was specifically talking about the "research lead" role. By default I'd guess people in this role would report to the head of the institute, but also have a lot of intellectual freedom. (It might not even be a formal role; I think sometimes "star researchers" might do a lot of this work without it being formalized, but it still seems super important for someone to be doing.) I don't feel like Oliver's track record blows me away on any of the three subdimensions I named there, and your examples of successes at research infrastructure don't speak to it. This is compatible with him being stronger than I guess, because he hasn't tried in earnest at the things I'm pointing to. (I'm including some adjustment for this, but perhaps I'm undershooting. On the other hand I'd also expect him to level up at it faster if he's working on it in conjunction with people with strong track records.)
I think it's obvious that you want some beacon function (to make it an attractive option for people with strong outside options). That won't come entirely from having excellent people whose presence makes internal research conversations really good, but it seems to me like that was a significant part of what made FHI work (NB this wasn't just Nick, but people like Toby or Anders or Eric). I think it could be make-or-break for any new endeavour, in a way that might be somewhat path-dependent in how it turns out, so it seems right and proper to give it attention at this stage.
Makes sense! I inferred that because the discussion at this stage is a high-level one about ways to set things up, but it does seem good to have space to discuss object-level projects that people might get into.
I agree in the abstract with the idea of looking for niches, and I think that several of these ideas have something to them. Nevertheless when I read the list of suggestions my overall feeling is that it's going in a slightly wrong direction, or missing the point, or something. I thought I'd have a go at articulating why, although I don't think I've got this to the point where I'd firmly stand behind it:
It seems to me like some of the central FHI virtues were:
- Offering a space to top thinkers where the offer was pretty much "please come here and think about things that seem important in a collaborative truth-seeking environment"
  - I think that the freedom of direction, rather than focusing on an agenda or path to impact, was important for:
    - attracting talent
    - finding good underexplored ideas (b/c of course at the start of the thinking people don't know what's important)
  - Caveats:
    - This relies on your researchers having some good taste in what's important (so this needs to be part of what you select people on)
    - FHI also had some success launching research groups where people were hired to more focused things
      - I think this was not the heart of the FHI magic, though, but more like a particular type of entrepreneurship picking up and running with things from the core
- Willingness to hang around at whiteboards for hours talking and exploring things that seemed interesting
  - With an attitude of "OK but can we just model this?" and diving straight into it
    - Someone once described FHI as "professional amateurs", which I think is apt
    - The approach is a bit like the attitude ascribed to physicists in this xkcd, but applied more to problems-that-nobody-has-good-answers-for than things-with-lots-of-existing-study (and with more willingness to dive into understanding existing fields when they're importantly relevant for the problem at hand)
  - Importantly mostly without directly asking "ok but where is this going? what can we do about it?"
    - Prioritization at a local level is somewhat ruthless, but is focused on "how do we better understand important dynamics?" and not "what has external impact in the world?"
- Sometimes orienting to "which of our ideas does the world need to know about? what are the best ways to disseminate these?" and writing about those in high-quality ways
  - I'd draw some contrast with MIRI here, who I think were also good at getting people to think of interesting things, but less good at finding articulations that translated to broadly-accessible ideas
Reading your list, a bunch of it seems to be about decisions about what to work on or what locally to pursue. My feeling is that those are the types of questions which are largely best left open to future researchers to figure out, and that the appropriate focus right now is more like trying to work out how to create the environment which can lead to some of this stuff.
Overall, the take in the previous paragraph is slightly too strong. I think it is in fact good to think through these things to get a feeling for possible future directions. And I also think that some of the good paths towards building a group like this start out by picking a topic or two to convene people on and get them thinking about. But if places want to pick up the torch, I think it's really important to attend to the ways in which it was special that aren't necessarily well-represented in the current x-risk ecosystem.
I think FHI was an extremely special place and I was privileged to get to spend time there.
I applaud attempts to continue its legacy. However, I'd feel gut-level more optimistic about plans that feel more grounded in thinking about how circumstances are different now, and then attempting to create the thing that is live and good given that, relative to attempting to copy FHI as closely as possible.
Differences in circumstance
You mention not getting to lean on Bostrom's research taste as one driver of differences, and I think this is correct but may be worth tracing out the implications of even at the stage of early planning. Other things that seem salient and important to me:
- For years, FHI was one of the only places in the world where you could seriously discuss many of these topics
- There are now much bigger communities and other institutions where these topics are at least culturally permissible (and some of them, e.g. AI safety, are the subject of very active work)
- This means that:
  - One of FHI's purposes was serving a crucial niche which is now less undersupplied
  - FHI benefited from being the obvious Schelling location to go to think about these topics
    - Whereas even in Berkeley you want to think a bit about how you sit in the ecosystem relative to Constellation (which I think has some important FHI-like virtues, although makes different tradeoffs and misses on others)
- FHI benefited from the respectability of being part of the University
  - In terms of getting outsiders to take it seriously, getting meetings with interesting people, etc.
  - I'm not saying this was crucial for its success, and in any case the world looks different now; but I think it had some real impact and is worth bearing in mind
- As you mention -- you have a campus!
  - I think it would be strange if this didn't have some impact on the shape of plans that would be optimal for you
Pathways to greatness
If I had to guess about the shape of plans that I think you might engage in that would lead to something deserving of the name "FHI of the West", they're less like "poll LessWrong for interest to discover if there's critical mass" (because I think that whether there's critical mass depends a lot on people's perceptions of what's there already, and because many of the people you might most want probably don't regularly read LessWrong), and more like thinking about pathways to scale gracefully while building momentum and support.
When I think about this, two ideas that seem to me like they'd make the plan more promising (that you could adopt separately or in conjunction) are (1) starting by finding research leads, and/or (2) starting small as-a-proportion-of-time. I'll elaborate on these:
Finding research leads
I think that Bostrom's taste was extremely important for FHI. There are a couple of levels this was true on:
- Cutting through unimportant stuff in seminars
  - I think it's very easy for people, in research, to get fixated on things that don't really matter. Sometimes this is just about not asking enough what the really important questions are (or not being good enough at answering that); sometimes it's kind of performative, about people trying to show off how cool their work is
  - Nick had low tolerance for this, as well as excellent taste. He wasn't afraid to be a bit disagreeable in trying to get to the heart of things
  - This had a number of benefits:
    - Helping discussions in seminars to be well-focused
    - Teaching people (by example) how to do the cut-through-the-crap move
    - Shaping incentives for researchers in the institute, towards tackling the important questions head on
- Gatekeeping access to the space
  - Bostrom was good at selecting people who would really contribute in this environment
    - This wasn't always the people who were keenest to be there; and saying "no" to people who would add a little bit but not enough (and dilute things) was probably quite important
    - In some cases this meant finding outsiders (e.g. professors elsewhere) to visit, and keeping things intellectually vibrant by having discussions with people with a wide range of current interests and expertise, rather than have FHI just become an echo chamber
- Being a beacon
  - Nick had a lot of good ideas, which meant that people were interested to come and talk to him, or give seminars, etc.
If you want something to really thrive, at some point you're going to have to wrestle with who is providing these functions. I think that one thing you could do is to start with this piece. Rather than think about "who are all the people who might be part of this? does that sound like critical mass?", start by asking "who are the people who could be providing these core functions?". I'd guess if you brainstorm names you'll end up with like 10-30 that might be viable (if they were interested). Then I'd think about trying to approach them to see if you can persuade one or more to play this role. (For one thing, I think this could easily end up with people saying "yes" who wouldn't express interest on the current post, and that could help you in forming a strong nucleus.)
I say "leads" rather than "lead" because it seems to me decently likely that you're best aiming to have these responsibilities be shared over a small fellowship. (I'm not confident in this.)
Your answer might also be "I, Oliver, will play this role". My gut take would be excitement about you being one of (say) three people in this role (with strong co-leads, who are maybe complementary in the sense that they're strong at some styles of thinking you don't know exactly how to replicate), and a kind of weak pessimism about you doing it alone. (It certainly might be that that pessimism is misplaced.)
Starting small as-a-proportion-of-time
Generally, things start a bit small, and then scale up. People can be reluctant to make a large change in life circumstance (like moving job or even city) for something where it's unknown what the thing they're joining even is. By starting small you get to iron out kinks and then move on from there.
Given that you have the campus, I'd seriously consider starting small not as-a-number-of-people but as-a-proportion-of-time. You might not have the convening power to get a bunch of great people to make this their full time job right now (especially if they don't have a good sense who their colleagues will be etc.). But you probably do have the convening power to get a bunch of great people to show up for a week or two, talk through big issues, and spark collaborations.
I think that you could run some events like this. Maybe to start they're just kind of like conferences / workshops, with a certain focus. (I'd still start by trying to find something like "research leads" for the specific events, as I think it would help convening power as well as helping the things to go well.) In some sense that might be enough for carrying forward the spirit of FHI -- it's important that there are spaces for it, not that these spaces are open 365. But if it goes well and they seem productive, it could be expanded. Rather than just "research weeks", offer "visiting fellowships" where people take a (well-paid) 1-3 month sabbatical from their regular job to come and be in person all at the same time. And then if that's going well consider expanding to a permanent research group. (Or not! Perhaps the ephemeral nature of short-term things, and constantly having new people, would prove even more productive.)
It's a fair point that wisdom might not be straightforwardly safety-increasing. If someone wanted to explore e.g. assumptions/circumstances under which it is vs isn't, that would certainly be within scope for the competition.
Multiple entries are very welcome!
[With some kind of anti-munchkin caveat. Submitting your analyses of several different disjoint questions seems great; submitting two versions of largely the same basic content in different styles not so great. I'm not sure exactly how we'd handle it if someone did the latter, but we'd aim for something sensible that didn't incentivise people to have been silly about it.]
Thanks, yes, I think that you're looking at things essentially the same way that I am. I particularly like your exploration of what the inner motions feel like; I think "unfixation" is a really good word.
I think that for most of what I'm saying, the meaning wouldn't change too much if you replaced the word "wholesome" with "virtuous" (though the section contrasting it with virtue ethics would become more confusing to read).
As practical guidance, however, I'm deliberately piggybacking off what people already know about the words. I think the advice to make sure that you pay attention to ways in which things feel unwholesome is importantly different from (and, I hypothesize, more useful than) advice to make sure you pay attention to ways in which things feel unvirtuous. And the advice to make sure you pay attention to things which feel frobby would obviously not be very helpful, since readers will not have much of a sense of what feels frobby.
If you personally believe it to be wrong, it's unwholesome. But generically no. See the section on revolutionary action in the third essay.
I think this is essentially correct. The essays (especially the later ones) do contain some claims about ways in which it might or might not be useful; of course I'm very interested to hear counter-arguments or further considerations.
The most straightforward criterion would probably be "things they themselves feel to be mistakes a year or two later". That risks people just failing to own their mistakes so would only work with people I felt enough trust in to be honest with themselves. Alternatively you could have an impartial judge. (I'd rather defer to "someone reasonable making judgements" than try to define exactly what a mistake is, because the latter would cover a lot of ground and I don't think I'd do a good job of it; also my claims don't feel super sensitive to how mistakes are defined.)
I would certainly update in the direction of "this is wrong" if I heard a bunch of people had tried to apply this style of thinking over an extended period, I got to audit it a bit by chatting to them and it seemed like they were doing a fair job, and the outcome was they made just as many/serious mistakes as before (or worse!).
(That's not super practically testable, but it's something. In fact I'll probably end up updating some from smaller anecdata than that.)
I definitely agree that this fails as a complete formula for assessing what's good or bad. My feeling is that it offers an orientation that can be helpful for people aggregating stuff they think into all-things-considered judgements (and e.g. I would in retrospect have preferred to have had more of this orientation in the past).
If someone were using this framework to stop thinking about things that I thought they ought to consider, I couldn't be confident that they weren't making a good faith effort to act wholesomely, but I at least would think that their actions weren't wholesome by my lights.
Good question, my answer on this is nuanced (and I'm kind of thinking it through in response to your question).
I think that what feels to you to be wholesome will depend on your values. And I'm generally in favour of people acting according to their own feeling of what is wholesome.
On the other hand I also think there would be some choices of values that I would describe as "not wholesome". These are the ones which ignore something of what's important about some dimension (perhaps justifying ignoring it by saying "I just don't value this"), at least as felt-to-be-important by a good number of other people in society.
But although "avoiding unwholesomeness" provides some constraints on values, it's not specifying exactly what values or tradeoffs are good to have. And then for any among the range of possible wholesome values, when you come to make decisions acting wholesomely will depend on your personal values. (Or, depending on the situation, perhaps not; in the case of the business plan, if it's supposed to be for the sake of the local community then what is wholesome could depend a lot more on the community's values than on your own.)
So there is an element of "paying at least some attention to traditional values" (at least while fair numbers of people care about them), but it's definitely not trying to say "optimize for them".
I doubt this is very helpful for our carefully-considered ethical notions of what's good.
I think it may be helpful as a heuristic for helping people to more consistently track what's good, and avoid making what they'd later regard as mistakes.
I agree that "paying attention to the whole system" isn't literally a thing that can be done, and I should have been clearer about what I actually meant. It's more like "making an earnest attempt to pay attention to the whole system (while truncating attention at a reasonable point)". It's not that you literally get to attend to everything, it's that you haven't excluded some important domain from things you care about. I think habryka (quoting and expanding on Ben Pace's thoughts) has a reasonable description of this in a comment.
I definitely don't think this is just making an arbitrary choice of what things to value, or that it's especially anchored in traditional values (though I do think it's correlated with traditional values).
I discuss a bit about making the tradeoffs of when to stop giving things attention in the section "wholesomeness vs expedience" in the second essay.
I think that there is some important unwholesomeness in these things, but that isn't supposed to mean that they're never permitted. (Sorry, I see how it could give that impression; but in the cases you're discussing there would often be greater unwholesomeness in not doing something.)
I discuss how I think my notion of wholesomeness intersects with these kind of examples in the section on visionary thought and revolutionary action in the third essay.
I think that there's something interesting here. One of the people I talked about this with asked me why children seem exceptionally wholesome (it's certainly not because they're unusually good at tracking the whole of things), and I thought the answer was about them being a part of the world where it may be especially important to avoid doing accidental harm, so our feelings of harms-to-children have an increased sense of unwholesomeness. But I'm now thinking that something like "robustly not evil" may be an important part of it.
Now we can trace out some of the links between wholesomeness1 and wholesomeness2. If evil is something like "consciously disregarding the impacts of your actions on (certain) others", then wholesomeness1 should robustly avoid it. And failures of wholesomeness1 which aren't evil might still be failures of wholesomeness2 -- because they involve a failure to attend to some impacts of actions, while observers may not be able to tell whether that failure to attend was accidental or deliberate.
A couple more notes:
- I don't think that wholesomeness2 is a crisp thing -- it's dependent on the audience, and how much they get to observe. Someone could have wholesomeness2 in a strong way with respect to one audience, and really not with respect to another audience.
- I think in expectation / in the long run / as your audiences get smarter (or something), pursuing wholesomeness1 may be a good proxy for wholesomeness2. Basically for the kind of reasons discussed in Integrity for consequentialists
FWIW I quite like your way of pointing at things here, though maybe I'm more inclined towards letting things hang out for a while in the (conflationary?) alliance space to see which seem to be the deepest angles of what's going on in this vicinity, and doing more of the conceptual analysis a little later.
That said, if someone wanted to suggest a rewrite I'd seriously consider adopting it (or using it as a jumping-off point); I just don't think that I'm yet at the place where a rewrite will flow naturally for me.
I largely think that the section of the second essay on "wholesomeness vs expedience" is also applicable here.
Basically I agree that you sometimes have to not look at things, and I like your framing of the hard question of wholesomeness. I think that the full art of deciding when it's appropriate not to think about something would be better discussed via a bunch of examples, rather than trying to describe it in generalities. But the individual decisions are ones that you can make wholesomely or not, and I think that's my current best guess approach for how to handle this. Setting something aside, when it feels right to do so, with some sadness that you don't get to get to the bottom of it, feels wholesome. Blithely dismissing something as not worth attention typically feels unwholesome, because of something like a missing mood (and relatedly, it not being clear that you're attending enough to notice if it were worth more attention).
There's also a question about how this relates to social reality. I think that if you're choosing not to look at something because it doesn't feel like it's worth the attention, then if someone else raises it (because it seems important to them) it's natural to engage with some curiosity that you now -- for the space of the conversation -- get to look at the thing a bit. You may explain why you don't normally think about it, but you're not actively trying to suppress it. I think the more unwholesome versions of not looking at something are much more likely to try to actively avoid or shut the conversation down.
DALL·E. I often told it in abstract terms the themes I wanted to include, used prompts including "stylized and slightly abstract", and regenerated a few times till I got something I was happy with.
(There are also a few that I drew, but that's probably obvious.)
I'd be tempted to make it a question, and ask something like "what do you think the impacts of this on [me/person] are?".
It might be that question would already do work by getting them to think about the thing they haven't been thinking about. But it could also elicit a defence like "it doesn't matter because the mission is more important" in which case I'd follow up with an argument that it's likely worth at least understanding the impacts because it might help to find actions which are better on those grounds while being comparably good -- or even better -- for the mission. Or it might elicit a mistaken model of the impacts, in which case I'd follow up by saying that I thought it was mistaken and explaining how.
Maybe consider asking the authors if they'd want to volunteer a ?50? word summary for this purpose, and include summaries for those who do?
Examples of EA errors as failures of wholesomeness
In this comment (cross-posted from the EA forum) I’ll share a few examples of things I mean as failures of wholesomeness. I don’t really mean to over-index on these examples. I actually feel like a decent majority of what I wish that EA had been doing differently relates to this wholesomeness stuff. However, I’m choosing examples that are particularly easy to talk about — around FTX and around mistakes I've made — because I have good visibility of them, and in order not to put other people on the spot. Although I’m using these examples to illustrate my points, my beliefs don’t hinge too much on the particulars of these cases. (But the fact that the “failures of wholesomeness” frame can be used to provide insight on a variety of different types of error does increase the degree to which I think there’s a deep and helpful insight here.)
Fraud at FTX
To the extent that the key people at FTX were motivated by EA reasons, it looks like a catastrophic failure of wholesomeness — most likely supported by a strong desire for expedience and a distorted picture where people's gut sense of what was good was dominated by the terms where they had explicit models of impact on EA-relevant areas. It is uncomfortable to think that people could have caused this harm while believing they were doing good, but I find that it has some plausibility. It is hard to imagine that they would have made the same mistakes if they had explicitly held “be wholesome” as a major desideratum in their decision-making.
EA relationship to FTX
Assume that we don't get to intervene to change SBF’s behaviour. I still think that EA would have had a healthier relationship with FTX if it had held wholesomeness as a core virtue. I think many people had some feeling of unwholesomeness associated with FTX, even if they couldn't point to all of the issues. I think focusing on this might have helped EA to keep FTX at more distance, not to extol SBF so much just for doing a great job at making a core metric ($ to be donated) go up, etc. It could have gone a long way to reducing inappropriate trust, if people felt that their degree of trust in other individuals or organizations should vary not just with who espouses EA principles, but with how much people act wholesomely in general.
My relationship to attraction
I had an unhealthy relationship to attraction, and took actions which caused harm. (I might now say that I related to my attraction as unwholesome — arguably a mistake in itself, but compounded because I treated that unwholesomeness as toxic and refused to think about it. This blinded me to a lot of what was going on for other people, which led to unwholesome actions.)
Though I now think my actions were wrong, at some level I felt at the time like I was acting rightly. But (though I never explicitly thought in these terms), I do not think I would have felt like I was acting wholesomely. So if wholesomeness had been closer to a core part of my identity I might have avoided the harms — even without getting to magically intervene to fix my mistaken beliefs.
(Of course this isn’t precisely an EA error, as I wasn’t regarding these actions as in pursuit of EA — but it’s still very much an error where I’m interested in how I could have avoided it via a different high-level orientation.)
Wytham Abbey
Although I still think that the Wytham Abbey project was wholesome in its essence, in retrospect I think that I was prioritizing expedience over wholesomeness in choosing to move forward quickly and within the EV umbrella. I think that the more wholesome thing to do would have been, up front, to establish a new charity with appropriate governance structures. This would have been more inconvenient, and slowed things down — but everything would have been more solid, more auditable in its correctness. Given the scale of the project and its potential to attract public scrutiny, having a distinct brand that was completely separate from “the Centre for Effective Altruism” would have been a real benefit.
I knew at the time that that wasn’t entirely the wholesome way to proceed. I can remember feeling “you know, it would be good to sort out governance properly — but this isn’t urgent, so maybe let’s move on and revisit this later”. Of course there were real tradeoffs there, and I’m less certain than for the other points that there was a real error here; but I think I was a bit too far in the direction of wanting expedience, and expecting that we’d be able to iron out small unwholesomenesses later. Leaning further towards caring about wholesomeness might have led to more-correct actions.