Vote on worthwhile OpenAI topics to discuss 2023-11-21T00:03:03.898Z
Vote on Interesting Disagreements 2023-11-07T21:35:00.270Z
Online Dialogues Party — Sunday 5th November 2023-10-27T02:41:00.506Z
More or Fewer Fights over Principles and Values? 2023-10-15T21:35:31.834Z
Dishonorable Gossip and Going Crazy 2023-10-14T04:00:35.591Z
Announcing Dialogues 2023-10-07T02:57:39.005Z
Closing Notes on Nonlinear Investigation 2023-09-15T22:44:58.488Z
Sharing Information About Nonlinear 2023-09-07T06:51:11.846Z
A report about LessWrong karma volatility from a different universe 2023-04-01T21:48:32.503Z
Shutting Down the Lightcone Offices 2023-03-14T22:47:51.539Z
Open & Welcome Thread — February 2023 2023-02-15T19:58:00.435Z
Rationalist Town Hall: FTX Fallout Edition (RSVP Required) 2022-11-23T01:38:25.516Z
LessWrong Has Agree/Disagree Voting On All New Comment Threads 2022-06-24T00:43:17.136Z
Announcing the LessWrong Curated Podcast 2022-06-22T22:16:58.170Z
Good Heart Week Is Over! 2022-04-08T06:43:46.754Z
Good Heart Week: Extending the Experiment 2022-04-02T07:13:48.353Z
April 2022 Welcome & Open Thread 2022-04-02T03:46:13.743Z
Replacing Karma with Good Heart Tokens (Worth $1!) 2022-04-01T09:31:34.332Z
12 interesting things I learned studying the discovery of nature's laws 2022-02-19T23:39:47.841Z
Ben Pace's Controversial Picks for the 2020 Review 2021-12-27T18:25:30.417Z
Book Launch: The Engines of Cognition 2021-12-21T07:24:45.170Z
An Idea for a More Communal Petrov Day in 2022 2021-10-21T21:51:15.270Z
Facebook is Simulacra Level 3, Andreessen is Level 4 2021-04-28T17:38:03.981Z
Against "Context-Free Integrity" 2021-04-14T08:20:44.368Z
"Taking your environment as object" vs "Being subject to your environment" 2021-04-11T22:47:04.978Z
I'm from a parallel Earth with much higher coordination: AMA 2021-04-05T22:09:24.033Z
Why We Launched LessWrong.SubStack 2021-04-01T06:34:00.907Z
"Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party 2021-03-22T23:44:19.795Z
"You and Your Research" – Hamming Watch/Discuss Party 2021-03-19T00:16:13.605Z
Review Voting Thread 2020-12-30T03:23:06.075Z
Final Day to Order LW Books by Christmas for US 2020-12-09T23:30:36.877Z
The LessWrong 2018 Book is Available for Pre-order 2020-12-01T08:00:00.000Z
AGI Predictions 2020-11-21T03:46:28.357Z
Rationalist Town Hall: Pandemic Edition 2020-10-21T23:54:03.528Z
Sunday October 25, 12:00PM (PT) — Scott Garrabrant on "Cartesian Frames" 2020-10-21T03:27:12.739Z
Sunday October 18, 12:00PM (PT) — Garden Party 2020-10-17T19:36:52.829Z
Have the lockdowns been worth it? 2020-10-12T23:35:14.835Z
Fermi Challenge: Trains and Air Cargo 2020-10-05T21:51:45.281Z
Postmortem to Petrov Day, 2020 2020-10-03T21:30:56.491Z
Open & Welcome Thread – October 2020 2020-10-01T19:06:45.928Z
What are good rationality exercises? 2020-09-27T21:25:24.574Z
Honoring Petrov Day on LessWrong, in 2020 2020-09-26T08:01:36.838Z
Sunday August 23rd, 12pm (PDT) – Double Crux with Buck Shlegeris and Oliver Habryka on Slow vs. Fast AI Takeoff 2020-08-22T06:37:07.173Z
Forecasting Thread: AI Timelines 2020-08-22T02:33:09.431Z
[Oops, there is actually an event] Notice: No LW event this weekend 2020-08-22T01:26:31.820Z
Highlights from the Blackmail Debate (Robin Hanson vs Zvi Mowshowitz) 2020-08-20T00:49:49.639Z
Survey Results: 10 Fun Questions for LWers 2020-08-19T06:10:55.386Z
10 Fun Questions for LessWrongers 2020-08-18T03:28:05.276Z
Sunday August 16, 12pm (PDT) — talks by Ozzie Gooen, habryka, Ben Pace 2020-08-14T18:32:35.378Z
Is Wirecutter still good? 2020-08-07T21:54:06.141Z


Comment by Ben Pace (Benito) on Lying Alignment Chart · 2023-11-29T20:02:56.303Z · LW · GW

This feels like it wants to be a poll to me.

My first idea is to just have a poll like the other two we've had recently, where there are 9 entries and you agree/disagree with whether each statement is a lie.

I'm interested in any other suggestions for how to set up a poll.

Comment by Ben Pace (Benito) on Preserving our heritage: Building a movement and a knowledge ark for current and future generations · 2023-11-29T19:20:47.897Z · LW · GW

Mod feedback: This post would majorly benefit from a tl;dr; it took me a long time to find out that this post's content is a bill for how tech companies should deal with the accounts of deceased users. Something about the writing style also seems a bit overly elaborate; perhaps that's what the language model thinks essayists sound like.

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-29T18:52:18.800Z · LW · GW

I am a bit confused how to relate to covertly breaking social norms.

In general I think you can't always tell whether a norm is dumb just by looking at the moral character of the people breaking it. Sometimes silly norms are only violated by reckless and impulsive people with little ability to self-regulate and little care for ethics, and in some cases breaking them isn't worth the cost to general norm-following behavior.

But as I say, still confused about the issue.

Comment by Ben Pace (Benito) on My techno-optimism [By Vitalik Buterin] · 2023-11-29T18:47:30.611Z · LW · GW

(Clarification: I didn't mean to say that this banner succeeded. I meant to say it was a worthwhile thing to attempt.)

Comment by Ben Pace (Benito) on My techno-optimism [By Vitalik Buterin] · 2023-11-29T05:39:28.204Z · LW · GW

I think it's relevant that Vitalik is 29, Bostrom is 50, and Yudkowsky is 44 (plus he has major chronic health issues).

I'd also say that the broader society has been much more supportive of Vitalik than it has been of Bostrom and Yudkowsky (billionaire, TIME cover, 5M Twitter followers, etc.), putting him in a better place personally to try to do the ~political work of uniting people. He is also far more respected by the folks in the accelerationist camp, making it more worthwhile for him to invest in an intellectual account that includes their dreams of the future (which he largely shares).

Comment by Ben Pace (Benito) on My techno-optimism [By Vitalik Buterin] · 2023-11-29T03:58:31.140Z · LW · GW

It is a good thing to actually try to find a banner to unite all the peoples...

Comment by Ben Pace (Benito) on A day in the life of a mechanistic interpretability researcher · 2023-11-28T18:40:33.915Z · LW · GW

That was fun to watch. But I would appreciate someone spelling out the implied connection to mechanistic interpretability.

Comment by Ben Pace (Benito) on AISC project: How promising is automating alignment research? (literature review) · 2023-11-28T18:37:33.983Z · LW · GW

I would encourage you to include ~any of your content in the body of the LW post, as I expect ~most people do not click through to read links, especially with very little idea of what's in the link.

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-27T19:02:25.627Z · LW · GW

Curated. This is an interesting, thoughtful, and very engagingly written discussion of the things people are incentivized to hide, and how to reason about them.

I would have considered this for curation regardless, but I was slightly more inclined to curate it given that it's Duncan's last post on LessWrong (an outcome he assigns 75% probability to). I hope he continues to write many excellent essays somewhere on the public internet, and personally I have signed up as a paying subscriber to his Substack to support him doing so there.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-27T00:34:07.871Z · LW · GW

I think this is accurately described as "an EA organization got a board seat at OpenAI", and the actions of those board members reflect directly on EA (whether internally or externally).

Why did OpenAI come to trust Holden with this position of power? My guess is that Holden's and Dustin's personal reputations were substantial factors here, along with Open Philanthropy being a major funding source, but also that many involved people's excitement about and respect for the EA movement were a relevant factor in OpenAI wanting to partner with Open Philanthropy, and that Helen's and Tasha's actions have directly and negatively reflected on how the EA ecosystem is viewed by OpenAI leadership.

There's a separate question about why Holden picked Helen Toner and Tasha MacAulay, and to what extent they were given power in the world by the EA ecosystem. It seems clear that they have gotten power through their participation in the EA ecosystem (as OpenPhil is an EA institution); and to the extent that the EA ecosystem advertises itself as more moral than other places, if they executed the standard level of deceptive strategies that others in the tech industry would in their shoes, then that advertising was false messaging.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-27T00:15:31.440Z · LW · GW

Some historical context

Holden in 2013 on the GiveWell blog:

We’re proud to be part of the nascent “effective altruist” movement. Effective altruism has been discussed elsewhere (see Peter Singer’s TED talk and Wikipedia); this post gives our take on what it is and isn’t.

Holden in 2015 on the EA Forum (talking about GiveWell Labs, which grew into OpenPhil):

We're excited about effective altruism, and we think of GiveWell as an effective altruist organization (while knowing that this term is subject to multiple interpretations, not all of which apply to us).

Holden in April 2016 about plans for working on AI:

Potential risks from advanced artificial intelligence will be a major priority for 2016. Not only will Daniel Dewey be working on this cause full-time, but Nick Beckstead and I will both be putting significant time into it as well. Some other staff will be contributing smaller amounts of time as appropriate.

(Dewey who IIRC had worked at FHI and CEA ahead of this, and Beckstead from FHI.)

Holden in 2016 about why they're making potential risks from advanced AI a priority:

I believe the Open Philanthropy Project is unusually well-positioned from this perspective:

  • We are well-connected in the effective altruism community, which includes many of the people and organizations that have been most active in analyzing and raising awareness of potential risks from advanced artificial intelligence. For example, Daniel Dewey has previously worked at the Future of Humanity Institute and the Future of Life Institute, and has been a research associate with the Machine Intelligence Research Institute.

Holden about the OpenAI grant in 2017:

This grant initiates a partnership between the Open Philanthropy Project and OpenAI, in which Holden Karnofsky (Open Philanthropy’s Executive Director, “Holden” throughout this page) will join OpenAI’s Board of Directors and, jointly with one other Board member, oversee OpenAI’s safety and governance work.

OpenAI initially approached Open Philanthropy about potential funding for safety research, and we responded with the proposal for this grant. Subsequent discussions included visits to OpenAI’s office, conversations with OpenAI’s leadership, and discussions with a number of other organizations (including safety-focused organizations and AI labs), as well as with our technical advisors.

As a negative datapoint: I looked through a bunch of the media articles linked at the bottom of this GiveWell page, and most of them do not mention Effective Altruism, only effective giving / cost-effectiveness. So their Effective Altruist identity has had less visibility amongst folks who primarily know of Open Philanthropy through its media appearances.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-26T21:57:27.815Z · LW · GW

In this mess, Altman and Helen should not be held to the same ethical standards, because I believe one of them has been given a powerful career in substantial part based on her commitment to the higher ethical standards of a movement that prided itself on openness and transparency and trying to do the most good.

If Altman played deceptive strategies, and insofar as Helen played back the same deceptive strategies as Altman, then she did not honor the EA name.

(The name has a lot of dirt on it these days already, but still. It is a name that used to mean something back when it gave her power.)

Insofar as you got a position specifically because you were affiliated with a movement claiming to be good and open and honest and to have unusually high moral standards, and then when you arrive you become a standard political player, that's disingenuous.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-25T02:29:11.005Z · LW · GW

The most important thing right now: I still don't know why they chose to fire Altman, and especially why they chose to do it so quickly

That's an exceedingly costly choice to make (i.e. with the speed of it), and so when I start to speculate on why, I only come up with commensurately worrying states of affairs, e.g. that he did something egregious enough to warrant it, or that he didn't and the board acted with great hostility.

Them going back on their decision is Bayesian evidence for the latter — if he'd done something egregious, they'd just be able to tell relevant folks, and Altman wouldn't get his job back.

So many people are asking this (e.g. everyone at the company). I'll be very worried if the reason doesn't come out.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-25T02:09:57.264Z · LW · GW

Also, I don't know that I've said this, but from reading enough of his public tweets, I had blocked Sam Altman long ago. He seemed very political in how he used speech, and so I didn't want to include him in my direct memetic sphere.

As a small pointer to why: he would commonly choose not to share object-level information about something, but instead share how he thought social reality should change. I think I recall him saying that the social consensus was wrong about fusion energy and pushing for it to move in a specific direction; he did this rather than just plainly saying what his object-level beliefs about fusion were, or offering a particular counter-argument to an argument that was going around.

It's been a year or two since I blocked him, so I don't recall more specifics, but it seemed worth mentioning, as a datapoint for folks to include in their character assessments.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-24T07:20:37.671Z · LW · GW

My current guess is that most of the variance in what happened is explained by a board where 3 out of 4 people don't know the dynamics of upper management in a multi-billion dollar company, where the board members don't know each other well, and (for some reason) the decision was made very suddenly. Pretty low expectations given that situation. Seems like Shear was a pretty great replacement to get, given the hand dealt. Assuming that they had a legit reason to fire the CEO, it's probably primarily through lack of skill and competence that they failed, more so than as a result of Altman's superior deal-making skill and leadership abilities (though that was what finished it off).

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-24T07:06:55.318Z · LW · GW

In brief: I'm saying that once you condition on:

  1. The board decided the firing was urgent.
  2. The board does not know each other very well and defaults to making decisions by consensus.
  3. The board is immediately in a high-stakes high-stress situation.

Then you naturally get

  4. The board fails to come to consensus on public comms about the decision.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-24T06:57:25.780Z · LW · GW

I'm not quite sure in the above comment how to balance between "this seems to me like it could explain a lot" and also "might just be factually false". So I guess I'm leaving this comment, lampshading it.

Comment by Ben Pace (Benito) on Benito's Shortform Feed · 2023-11-24T06:49:45.761Z · LW · GW

I don't normally just write-up takes, especially about current events, but here's something that I think is potentially crucially relevant to the dynamics involved in the recent actions of the OpenAI board, that I haven't seen anyone talk about:

The four members of the board who did the firing do not know each other very well.

Most boards meet a few times per year, for a couple of hours. Only Sutskever works at OpenAI. D'Angelo has worked in senior roles at tech companies like Facebook and Quora, Toner is in EA/policy, and MacAulay has worked at other tech companies (I'm not aware of any overlap with D'Angelo).

It's plausible to me that MacAulay and Toner have spent more than 50 hours in each others' company, but overall I'd probably be willing to bet at even odds that no other pair of them had spent more than 10 hours together before this crisis.

This is probably a key factor in why they haven't written more publicly about their decision. Decision-by-committee is famously terrible, and it's pretty likely to me that everyone pushes back hard on anything unilateral by the others in this high-tension scenario. So any writing representing them has to get consensus, and they're too focused on firefighting and getting a new CEO to spend time iterating on an explanation of their reasoning that they can all get behind. That's why Sutskever's public writing only speaks for himself (he just says that he regrets the decision; he's said nothing about why, nor anything that in principle speaks for the others).

I think this also predicts that Shear getting involved, and being the only direct counterparty that they must collectively and repeatedly work something out with, improved things. (Which accounts I've read suggest was a turning point in the negotiations.) He's the first person that they are all engaged with and need to make things work out with, so he is in a position where they are forced to get consensus in a timely fashion, and he can actually demand specific things of them. This was a forcing function on them making decisions and continuing to communicate with an individual.

It's standard to expect them to prepare a proper explanation in advance, but from the information in this comment, I believe this firing decision was made within just a couple of days of the event. A fast decision may have been the wrong call, but once it happened, a team who doesn't really know each other was thrust into an extremely high-stakes position and had to make decisions by consensus. My guess is that this was really truly quite difficult and it was very hard to get anything done at all.

This lens on the situation makes me update in the direction that they will eventually talk about why, once they've had time to iterate on the text explaining the reasoning, now that the basic function of the company isn't under fire.

My current guess is that in many ways, a lot of the board's decision-making since the firing has been worse than any individual's on the board would've been had they been working alone.

Comment by Ben Pace (Benito) on AI Timelines · 2023-11-24T06:05:32.919Z · LW · GW

You'd be more likely to get this change if you suggested a workable alternative.

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-24T05:12:23.871Z · LW · GW

Yep, I can see how that could be confusing in context.

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-24T05:11:37.235Z · LW · GW

If you've entered an agreement with someone, and later learned that they intend (and perhaps have always intended) to exploit your acting in accordance with it to screw you over, it seems both common-sensically and game-theoretically sound to consider the contract null and void, since it was agreed-to based on false premises.

If you make a trade agreement, and the other side does not actually pay up, then I do not think you are bound to provide the good anyway. It was a trade.

If you make a commitment, and then later come to realize that in requesting that commitment the other party was actually taking advantage of you, I think there are a host of different strategies one could pick. My current ideal solution is "nonetheless follow through on your commitment, but make them pay for it in some other way", but I acknowledge that there are times when it's correct to pick other strategies, like "just don't do it, and when anyone asks you why, give them a straight answer", and more.

Your strategy in a given domain will also depend on all sorts of factors like how costly the commitment is, how much they're taking advantage of you for, what recourse you have outside of the commitment (e.g. if they've broken the law they can be prosecuted, but in other cases it is harder to punish them).

The thing I currently believe and want to say here is that it is not good to renege on commitments even if you have reason to, and it is better to not renege on them while setting the incentives right. It can be the right choice to do so in order to set the incentives right, but even when it's the right call I want to acknowledge that this is a cost to our ability to trust in people's commitments.

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-24T04:47:09.454Z · LW · GW

Eh, I prefer to understand why the rules exist rather than blindly commit to them. Similarly, the Naskapi hunters used divination as a method of ensuring they'd randomize their hunting spots, and I think it's better to understand why you're doing it, rather than doing it because you falsely believe divination actually works.

Comment by Ben Pace (Benito) on Possible OpenAI's Q* breakthrough and DeepMind's AlphaGo-type systems plus LLMs · 2023-11-24T02:54:07.581Z · LW · GW

This post has a bunch of comments and was linked from elsewhere, so I've gone through and cleaned up the formatting a bunch.

In future, please 

  • Name the source in the post, rather than just providing a link that the reader has to click through to find who's speaking
  • Use the quotes formatting, so readers can easily distinguish between quotes and your text
  • Format the quotes as they were originally; do not include your own sentences like "Aligns with DeepMind Chief AGI scientist Shane Legg saying:" in a way that reads as though they were part of the original quote, nor cut tweets together and skip over replies as though they were a single quote.

Comment by Ben Pace (Benito) on TurnTrout's shortform feed · 2023-11-24T00:18:58.223Z · LW · GW

They are not being treated worse than foot soldiers, because they do not have an enemy army attempting to murder them during the job. (Unless 'foot soldiers' is itself more commonly used as a metaphor for 'grunt work' and I'm not aware of that.)

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-23T19:44:12.607Z · LW · GW

This question seems like a difficult interaction between utilitarianism and virtue ethics...

I think whether to be honorable is in large part a question of strategy. If you don't honor implicit agreements on the few occasions when you really need to win, that's a pretty different strategy from honoring implicit agreements all of the time. So it's not a question local to a single decision, it's a broader strategic question.

I am sympathetic to consequentialist evaluations of strategies. I am broadly like "If you honor implicit agreements then people will be much more willing to trade with you and give you major responsibilities, and so going back on them on one occasion generally strikes down a lot of ways you might be able to affect the world." It's not just about this decision, but about an overall comparison of the costs and benefits of different kinds of strategies. There are many strategies one can play.

I could potentially make up some fake numbers to give a sense of how different decisions change which strategies to run (e.g. people who play more to the letter than the spirit of agreements, people who will always act selfishly if the payoff is at least going to 2x their wealth, people who care about their counter-parties ending up okay, people who don't give a damn about their counter-parties ending up okay, etc.). I roughly think much more honest, open, straightforward, pro-social, and simple strategies are widely trusted, are better for keeping you and your allies sane, and are more effective on the particular issues you care about, but are less effective at getting generic un-scoped power. I don't much care about the latter relative to the first three, so these strategies seem to me way better at achieving my goals.

I think it's extremely costly for trust to entirely change strategies during a single high-stakes decision, so I don't think it makes sense to re-evaluate it during the middle of the decision based on a simple threshold. (There could be observations that would make me realize during a high-stakes thing that I had been extremely confused about what game we were even playing, and then I'd change, but that doesn't fit as an answer to your question, which is about a simple probability/utility tradeoff.) It's possible that your actions on a board like this are overwhelmingly the most important choices you'll make and should determine your overall strategy, and you should really think through that ahead of time and let your actions show what strategy you're playing — well before agreeing to be on such a board.

Hopefully that explained how I think about the tradeoff you asked, while not giving specific numbers. I'm willing to answer more on this.

(Also, a minor correction: I said I was considering broadly dis-endorsing the sacred, for that reason. It seems attractive to me as an orientation to the world but I'm pretty sure I didn't say this was my resolute position.)

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-23T19:17:07.503Z · LW · GW

Oh, but I don't mean to say that Lukas was excluding me. I mean he was excluding all other people who exist who would also care about honoring partnerships after losing faith in the counter-party, of which there are more than just me, and more than just EAs.

Comment by Ben Pace (Benito) on Open Thread – Autumn 2023 · 2023-11-23T19:12:44.329Z · LW · GW

That's a lot of sailing! What did you get up to while doing it? Reading books? Surfing the web?

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-23T01:09:14.215Z · LW · GW


I find the situation a little hard to talk about concretely because whatever concrete description I give will not be correct (because nobody involved is telling us what happened).

Nonetheless, let us consider the most uncharitable narrative regarding Altman here, where members of the board come to believe he is a lizard, a person who is purely selfish and who has no honor. (To be clear I do not think this is accurate, I am using it for communication purposes.) Here are some rules.

  • Just because someone is a lizard, does not make it okay to lie to them
  • Just because someone is a lizard, does not make it okay to go back on agreements with them
  • Agreements and commitments the lizard made while he had the mandate to act on behalf of your company are not now okay to disregard

The situation must not be "I'll treat you honorably if I think you're a good person, but the moment I decide you're a lizard then I'll act with no honor myself." The situation must be "I will treat you honorably because it is right to be honorable." Otherwise the honor will seep out of the system as the probabilities we assign to others' honor waver.

I think it is damaging to the trust people place in board members, to see them act with so little respect or honor. It reduces everyone's faith in one another to see people in powerful positions behave badly.


I respect that in response to my disapproval of your statement, you took the time to explain in detail the reasoning behind your comment and communicate some more of your perspective on the relevant game theory. I think it generally helps, when folks are having conflicts, to openly examine the reasons why decisions were made and investigate those. And it also gives us more surface area for locating key parts of the disagreement.

I still disagree with you. I think it was an easy-and-wrong thing to suggest that only people in the EA tribe would care about this important ethical principle I care about. But I am glad we're exploring this rather than just papering over it, or just being antagonistic, or just leaving.



CEO:

"Suppose you come to the conclusion that I'm a lizard. Will you give me no chance for a rebuttal, and fire me immediately, without giving our business partners notice, and never give me a set of reasons, and never tell our staff a set of reasons?"

Prospective Board Member:

"No, you can be confident that I would not do that. We would conduct an investigation, and at that time bar your ability to affect the board. We would be candid with the staff about our concerns, and we would not wantonly harm the agreements you made with your business partners."


CEO:

"But what if you came to believe that I was maneuvering to remove power from you within days?"

Prospective Board Member:

"I think there are worlds where I would take sudden action. I could see myself voting to remove you from the board while the investigation is under way, and letting the staff and business partners know that we're investigating you and a possible outcome is you being fired."

Contracts are filled with many explicit terms and agreements, but I also believe they ~all come with an implicit one: in making this deal we are agreeing not to screw each other over. If, when accepting their board seats, they would have judged that this sudden firing without cause and without explaining anything to the staff would be screwing Altman over, and if they did not bring up this sort of action as a thing they might do before it was time to do so, then they should not have done it.

IV.

I agree that there are versions of "agreeing to work closely together on the crucial project" where I see this as "speak up now or otherwise allow this person into your circle of trust." Once someone is in that circle, you cannot kick them out without notice just because you think you observed stuff that made you change your mind – if you could do that, it wouldn't work as a circle of trust.

I don't think this is a "circle of trust". I think accepting a board seat is an agreement. It is an agreement to be given responsibility, and to use it well and in accordance with good principles. I think there is a principle to give someone a chance to respond and be open about why you are destroying their lives and company before you do so, regardless of context, and you don't forgo that just because they are acting antagonistically toward you. Barring illegal acts or acts of direct violence, you should give someone a chance to respond and be open about why you destroy everything they've built.

Batman shouldn't tell the joker that he's coming for him.

The Joker had killed many people when the Batman came for him. From many perspectives this is currently primarily a disagreement over managing a lot of money and a great company. These two are not similar.

Perhaps you wish to say that Altman is in an equivalent moral position, as his work is directly responsible for risking an AI takeover (as I believe), similar in impact to an extinction event. I think if Toner/MacAulay/etc. believed this, then they should have said so openly far, far earlier, so that their counter-parties in this conflict (and everyone else in the world) were aware of the rules at play.

I don't believe that any of them said this before they were given board seats.


In the most uncharitable case (toward Altman) where they believed he was a lizard, they should probably have opened up an investigation before firing him, and taken some action to prevent him from outvoting them (e.g. just removed him from the board, or added an extra board member).

They claim to have done a thorough investigation. Yet it has produced no written results and they could not provide any written evidence to Emmett Shear. So I do not believe they have done a proper investigation or produced any evidence to others. If they can produce ~no evidence to others, then they should cast a vote of no confidence, fire Altman, implement a new CEO, implement a new board, and quit. I would have respected them more if they had also stated that they did not act honorably in ousting Altman and would be looking for a new board to replace them.

You can choose to fire someone for misbehavior even when you have no legible evidence of misbehavior. But then you have to think about how you can gain the trust of the next person who comes along, who understands you fired the last person with no clear grounds.


Lukas: I think it's a thing that only EAs would think up that it's valuable to be cooperative towards people who you're convinced are deceptive/lack integrity.

Ben: Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you've lost trust in them. Other people know what's decent too.

Lukas: I think there's something off about the way you express whatever you meant to express here – something about how you're importing your frame of things over mine and claim that I said something in the language of your frame, which makes it seem more obviously bad/"shameful" than if you expressed it under my frame. 

In any case, I'd understand it if you said something like "shame on you for disclosing to the world that you think of trust in a way that makes you less trustworthy (according to my, Ben's, interpretation)." If that's what you had said, I'm now replying that I hope you no longer think this after reading what I elaborated above.

I keep reading this and not understanding your last reply. I'll rephrase my understanding of our positions.

I think you view the board firing situation as follows: some people who didn't strongly trust Altman were given the power to oust him, came to think he's a lizard (with zero concrete evidence), and then simply got rid of him.

I'm saying that even if that's true, they should have acted more respectfully toward him and honored their agreement to wield the power with care, and so should have given him notice and done a proper investigation (again, given that they have zero concrete evidence).

I think that you view me as trying to extend the principle of charity arbitrarily far (to the point of self-harm), and so you're calling me naive and over-cooperative: a lizard's a lizard, just destroy it.

I'm saying that you should honor the agreement you've made to wield your power well and not cruelly or destructively. It seems to me that it has likely been wielded very aggressively and in a way where I cannot tell that it was done justly. A man was told on Friday that he had been severed from the ~$100B company he had built. He was given no cause, the company was given no cause, it appears as if there was barely any clear cause, and there was no way to make the decision right (were it a mistake). This method currently seems to me both a little cruel and a little power-hungry/unjust, even when I assume the overall call is the correct one.

For you to say that I'm just another EA playing cooperate-bot lands with me as (a) inaccurately calling me naive and rounding my position off to a stupider one, (b) disrespecting all the other people in the world who care about people wielding power well, and (c) kind of saying your tribe is the only one with good people in it. Which I think is a pretty inappropriate reply.

I have provided some rebuttals on a bunch of specific points above. Sorry for the too-long comment.

Comment by Ben Pace (Benito) on Sam Altman's ouster at OpenAI was precipitated by letter to board about AI breakthrough - Reuters · 2023-11-22T23:29:00.368Z · LW · GW

My first take is to bet against this being true, as Emmett Shear said the board's reasoning had nothing to do with a specific safety issue, and the reporters could not get confirmation from any of the people directly involved.

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-22T18:12:49.588Z · LW · GW

Absolutely not. When I make an agreement to work closely with you on a crucial project, if I think you're deceiving me, I will let you know. I will not surprise backstab you and get on with my day. I will tell you outright and I will say it loudly. I may move quickly to disable you if it's an especially extreme circumstance but I will acknowledge that this is a cost to our general cooperative norms where people are given space to respond even if I assign a decent chance to them behaving poorly. Furthermore I will provide evidence and argument in response to criticism of my decision by other stakeholders who are shocked and concerned by it.

Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you've lost trust in them. Other people know what's decent too.

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-22T08:00:22.378Z · LW · GW

It entirely depends on the reasoning.

Quick possible examples:

  • "Altman, we think you've been deceiving us and tricking us about what you're doing. Here are 5 documented instances where we were left with a clear impression about what you'd do that is counter to what eventually occurred. I am pretty actively concerned that in telling you this, you will cover up your tracks and just deceive us better. So we've made the decision to fire you 3 months from today. In that time, you can help us choose your successor, and we will let you announce your departure. Also if anyone else in the company should ask, we will also show them this list of 5 events."
  • "Altman, we think you've chosen to speed ahead with selling products to users at the expense of having control over these home-grown alien intelligences you're building. I am telling you that there needs to be fewer than 2 New York Times pieces about us in the next 12 months, and that we must overall slow the growth rate of the company and not 2x in the next year. If either of these are not met, we will fire you, is that clear?"

Generally, not telling the staff why was extremely disrespectful, and not highlighting it to him ahead of time was also uncooperative.

Comment by Ben Pace (Benito) on OpenAI: Facts from a Weekend · 2023-11-22T05:03:07.290Z · LW · GW

I was confused about the counts, but I guess this makes sense if Helen cannot vote on her own removal. Then it's Altman/Brockman/Sutskever v Tasha/D'Angelo.

Pretty interesting that Sutskever/Tasha/D'Angelo would be willing to fire Altman just to prevent Helen from going. They instead could have negotiated someone to replace her. Wouldn't you just remove Altman from the Board, or maybe remove Brockman? Why would they be willing to decapitate the company in order to retain Helen?

Comment by Ben Pace (Benito) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-21T21:21:03.957Z · LW · GW

FTR I am not spending much time calculating the positive or negative direct effect of this firing. I am currently pretty concerned about whether it was done honorably and ethically or not. It looks not to me, and so I oppose it regardless of the sign of the effect.

Comment by Ben Pace (Benito) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T01:44:15.458Z · LW · GW

I assign more than 20% probability to this claim: the firing of Sam Altman was part of a plan to merge OpenAI with Anthropic.

Comment by Ben Pace (Benito) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T00:14:26.009Z · LW · GW

I am interested in info-sharing and discussion and hope this poll will help. However, I feel unclear about whether this poll encourages people to "pick their positions" too quickly, while the proverbial fog of war is still high (I feel that way when considering agree/disagree-voting on some of the poll options). I am interested in hearing whether others have that reaction (via react, comment, or DM). My guess is that I am unlikely to take this down, but it will inform whether we do this sort of thing in the future.

Comment by Ben Pace (Benito) on Vote on worthwhile OpenAI topics to discuss · 2023-11-20T23:38:20.195Z · LW · GW

The way this firing has played out so far (through Monday, Nov 20th) is evidence that the non-profit board was effectively unable to fire the CEO.

Comment by Ben Pace (Benito) on Vote on worthwhile OpenAI topics to discuss · 2023-11-20T21:49:33.618Z · LW · GW

Insofar as lawyers are recommending against speaking, the board should probably ignore them.

Comment by Ben Pace (Benito) on Vote on worthwhile OpenAI topics to discuss · 2023-11-20T21:44:02.416Z · LW · GW

I assign >80% probability to this claim: the board should be straightforward with its employees about why they fired the CEO.

Comment by Ben Pace (Benito) on Vote on worthwhile OpenAI topics to discuss · 2023-11-20T21:43:37.460Z · LW · GW

I assign >50% to this claim: The board should be straightforward with its employees about why they fired the CEO.

Comment by Ben Pace (Benito) on Vote on worthwhile OpenAI topics to discuss · 2023-11-20T21:40:36.065Z · LW · GW

Poll For Topics of Discussion and Disagreement

Use this thread to (a) upvote topics you're interested in reading about, (b) agree/disagree with positions, and (c) add new positions for people to vote on.

Comment by Ben Pace (Benito) on OpenAI: Facts from a Weekend · 2023-11-20T19:47:29.007Z · LW · GW

Fun story.

I met Emmett Shear once at a conference, and have read a bunch of his tweeting.

On Friday I turned to a colleague and asked for Shear's email, so that I could email him suggesting he try to be CEO, as he's built a multi-billion-dollar company before and has his head screwed on right about x-risk.

My colleague declined, I think they thought it was a waste of time (or didn't think it was worth their social capital).

Man, I wish I had done it, that would have been so cool to have been the one to suggest it to him.

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-20T03:25:36.836Z · LW · GW

Mm, perhaps rather than saying that most such people are untrustworthy, I just want to instead make an argument about risk and the availability of evidence.

  1. Some people are very manipulative and untrustworthy and covertly break widespread social norms.
  2. Some people covertly break widespread social norms for good reasons.
  3. Even if you find out about one instance of someone covertly breaking a norm, you do not know how many other norms they are covertly breaking, and it's hard to understand the reasoning behind the one instance you have learned about.

Suppose the amount of covert social norm breaking is heavy-tailed, where 90% of people break none, 8% of people break 1, 1% of people break 2-3, and 1% of people break 4+ (and are doing it all the time).

If you find out that someone breaks one, then you learn that they're not in the first bucket, and this is a 10x multiplier toward them being the sort of person who breaks 4+ (from a 1% prior to a 10% posterior). So this is pretty scary.
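The multiplier falls straight out of a Bayes update over these buckets. A minimal sketch (assuming the bucket probabilities given above; the labels are just illustrative):

```python
# Assumed prior over how many widespread norms a person covertly breaks,
# taken from the distribution described above.
priors = {"0": 0.90, "1": 0.08, "2-3": 0.01, "4+": 0.01}

# Observation: the person covertly breaks at least one norm.
# Every bucket except "0" is consistent with this observation.
p_observation = sum(p for bucket, p in priors.items() if bucket != "0")

# Posterior probability of the heavy-tailed "4+" bucket, given the observation.
posterior_heavy = priors["4+"] / p_observation

print(round(p_observation, 2))    # 0.1
print(round(posterior_heavy, 2))  # 0.1, i.e. 10x the 1% prior
```

The same update applies to the "2-3" bucket, so conditional on one observed violation, a full fifth of the probability mass sits on repeat offenders.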

And what's worse is regardless of which bucket they're in, they're not going to tell you which bucket they're in. Because they're not going to volunteer to you info about other norms they're breaking.

So (if this model/distribution is accurate) when you find out that someone has covertly broken a widespread social norm, you need to suddenly have your guard up, and to be safe you should probably apply a high standard before feeling confident that the person is not also violating other norms that you care about and keeping that from you.

(I just want to acknowledge in my comments I'm doing a lot of essentialism about people's long-standing personality traits, I'm not sure I'd endorse that if I reflected longer.)

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-20T02:43:43.756Z · LW · GW

Here are two sentences that I think are both probably true.

  1. In order to do what is right, at some point in a person's life they will have to covertly break certain widespread social norms.
  2. Most people who covertly break widespread social norms are untrustworthy people.

(As a note on my epistemic state: I assign a higher probability to the first claim being true than the second.)

One of the things I read the OP as saying is "lots of widespread social norms are very poorly justified by using extreme cases and silencing all the fine cases (and you should fix this faulty reasoning in your own mind)". I can get behind this. I think it's also saying "Most people are actually covertly violating widespread social norms in some way". I am genuinely much more confused about this. Many of the examples in the OP are more about persistent facts about people's builds (e.g. whether they have violent impulses or whether they are homosexual) than about their active choices (e.g. whether they carry out violence or whether they had homosexual sex).

For instance I find myself sympathetic to arguments where people say that many people would prefer to receive corporal punishment than be imprisoned for a decade, but if I were to find out that one particular prison was secretly beating the prisoners and then releasing them, I would be extremely freaked out by this. (This example doesn't quite make sense because that just isn't a state of affairs that you could keep quiet, but hopefully it conveys the gist of what I mean.)

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-20T02:16:59.223Z · LW · GW

(Remember: if, after thirty seconds of conscious awareness and deliberate thought, you come to the conclusion "no, this actually is bad, I should be on the warpath," you can always ramp right back up again!  Any panic that can be destroyed by a mere thirty seconds of slow, deep breathing is probably panic you didn't want in the first place, and it's pretty rare that literally immediate action is genuinely called-for, such that you can't afford to take the thirty seconds.)

This is interesting. I hope it's true. I'm not certain that, in general, if I successfully tamp down my flared-up anger or rage, I will be able to straightforwardly bring it up again. If my emotions behave rationally and in response to my situation, then it's true, but people have recently been arguing to me that the coming and going of emotions is a much more random process, influenced by chemicals and immediate environment and so on.

I think I'll accept it as probably true, but look out for evidence of this failing.

Comment by Ben Pace (Benito) on Sam Altman fired from OpenAI · 2023-11-18T17:26:19.821Z · LW · GW

It read like propaganda to me, whether the person works at the company or not.

Comment by Ben Pace (Benito) on Sam Altman fired from OpenAI · 2023-11-17T21:51:19.674Z · LW · GW

Also D'Angelo is on the board of Asana, Moskovitz's company (Moskovitz who funds Open Phil).

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-17T18:15:32.157Z · LW · GW

Appreciate the link. Based on some of the people and their stories, I'm updating toward the view that strategically breaking certain strongly enforced norms does not generally correlate with a broader disregard for decency. I think I'm also substantially updating about how much recognition/acceptance of homosexuality there was in the early 1900s: there was a very successful theater production called The Captive, about a lesbian, that had famous actors and ~160 showings (until it was cancelled due to its subject being scandalous).

A curious quote about speakeasies, 8 minutes into the documentary. I'm not sure what to make of it directionally about rule-breakers at the time, or how to update about their motives.

The main thought behind the thing was to break the law, and live as wildly as you could. And everybody did. Because the Speakeasies were all over the town. Even the old residences, some of them had Speakeasies in the basement. Now a lot of people write about prohibition, but they don't bring out the fact that everybody was breaking the law because it was the thing to do.

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-17T17:37:59.716Z · LW · GW

I agree it is also Bayesian evidence for that! My current guess is that it points more in the other direction, as in general I think more people break rules for bad reasons than for good reasons; but I'm not that confident, and I'd be interested in hearing from someone who disagrees about this (in this specific case or in general) and why.

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-17T05:57:43.022Z · LW · GW

The mere fact of being gay (whilst being otherwise well-behaved and in compliance with all social norms and standards, such that most people never even noticed) is not a major risk factor for child molestation, and is not evidence that Mr. So-and-So was actually a ticking time bomb all along and we simply never knew, thank God we got rid of him before he fiddled with somebody all of a sudden after never doing anything of the sort for thirty years.

I think... finding out (in the 1950s) that someone maintained many secret homosexual relationships for many years is actually a signal that the person is fairly devious, and is both willing and able to behave in ways that society has strong norms against.

It obviously isn't true of homosexuals once the norm was lifted, but my guess is that at the time it was accurate to make a directional Bayesian update that the person had behaved in actually bad and devious ways.

Edit: From looking through some of a YouTube documentary linked below, I updated that many of these people seemed pretty harmless and kindly. So I think there's a good chance I'm wrong in this case.

Comment by Ben Pace (Benito) on Social Dark Matter · 2023-11-17T05:47:57.181Z · LW · GW

Liking the score from the movie "Titanic" as a seventh-grade boy in N.C. in 1998.

(I don't know why, but I enjoy reading Duncan quietly slipping in references to situations that bothered him 25 years ago, all the more-so when there's vanishingly little chance that anyone involved will ever read him mention it.)