Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI"

post by johnswentworth, Ruby · 2023-11-21T17:39:17.828Z · LW · GW · 84 comments

johnswentworth

I've seen/heard a bunch of people in the LW-o-sphere saying that the OpenAI corporate drama this past weekend was clearly bad. And I'm not really sure why people think that? To me, it seems like a pretty clearly positive outcome overall.

I'm curious why in the world people are unhappy about it (people in the LW-sphere, that is, obviously I can see why e.g. AI accelerationists would be unhappy about it). And I also want to lay out my models.

johnswentworth

Here's the high-gloss version of my take. The main outcomes are:

  • The leadership who were relatively most focused on racing to AGI and least focused on safety are moving from OpenAI to Microsoft. Lots of employees who are relatively more interested in racing to AGI than in safety will probably follow.
  • Microsoft is the sort of corporate bureaucracy where dynamic orgs/founders/researchers go to die. My median expectation is that whatever former OpenAI group ends up there will be far less productive than they were at OpenAI.
  • It's an open question whether OpenAI will stick around at all.
    • Insofar as they do, they're much less likely to push state-of-the-art in capabilities, and much more likely to focus on safety research.
    • Insofar as they shut down, the main net result will be a bunch of people who were relatively more interested in racing to AGI and less focused on safety moving to Microsoft, which is great.
johnswentworth

My current (probably wrong) best guesses at why other people in the LW-o-sphere are saying this is terrible:

  • There's apparently been a lot of EA-hate on twitter as a result. I personally expect this to matter very little, if at all, in the long run, but I'd expect it to be extremely disproportionately salient to rationalists/EAs/alignment folk.
  • OpenAI was an organization with a lot of AGI-accelerationists, and maybe people thought OpenAI was steering those accelerationist impulses in more safety-friendly directions, whereas Microsoft won't?
  • Obviously the board executed things relatively poorly. They should have shared their reasons/excuses for the firing. (For some reason, in politics/corporate politics, people try to be secretive all the time and this seems-to-me to be very stupid in like 80+% of cases, including this one.) I don't think that mistake will actually matter that much in the long term, but I can see why people focused on it would end up with a sort of general negative valence around the board's actions.
Ruby

(Quick caveat that I think this question will be easier to judge once more info comes out. That said, I think it's useful to think about it even now, and to share relevant observations and considerations.)

Ruby

I think what happens to Sam and others who end up at Microsoft is a pretty big crux here. If I thought that those going to Microsoft would indeed get caught in bureaucracy and not accomplish as much, and also that those staying behind wouldn't push capabilities as hard, that might make the whole thing good for x-risk.

I'm not overwhelmingly confident here, but my impression is that Sama might be competent enough to cut through the bureaucracy and get a lot done notwithstanding, and more than that, by being competent and understanding AI, he might end up running much of Microsoft. Being there just gives him a lot more resources with less effort than the whole invest-in-OpenAI cycle, and with fewer restrictions than he had at OpenAI.

One question is how independently he could operate. Nadella mentioned LinkedIn and GitHub (?) operating quite independently within Microsoft. Also, I think Microsoft will feel they have to "be nice" to Sama, as he is likely their key to AI dominance. He clearly commands a following and could go elsewhere, and I doubt he'd put up with bureaucracy that slowed him down.


An unanswered question so far is whether the board has acted with integrity (/cooperativeness). If the board is judged both to represent the most AI-concerned cluster (of which we are part) and to have acted in a pretty bad way, then that itself could be really terrible for our (the cluster's) ability to cooperate or move things in the broader world in good directions. Like, possibly a lot worse than any association with SBF.

johnswentworth

An unanswered question so far is whether the board has acted with integrity (/cooperativeness). If the board is judged both to represent the most AI-concerned cluster (of which we are part) and to have acted in a pretty bad way, then that itself could be really terrible for our (the cluster's) ability to cooperate or move things in the broader world in good directions. Like, possibly a lot worse than any association with SBF.

I remember shortly after the Sam-Bankman-Fried-pocalypse lots of rats/EAs/etc were all very preoccupied with how this would tarnish the reputation of EA etc. And a year later, I think it... just wasn't that bad? Like, we remain a relatively-high-integrity group of people overall, and in the long run PR equilibrated to reflect that underlying reality. And yes, there are some bad apples, and PR equilibrated to reflect that reality as well, and that's good; insofar as I worry about PR at all (which is not much), I mostly just want it to be accurate.

I think this is very much the sort of thing which will seem massively more salient to EAs/alignment researchers/etc than to other people, far out of proportion to how much it actually matters, and we need to adjust for that.

johnswentworth

(Side note: I also want to say here something like "integrity matters a lot more than PR", and the events of the past weekend seem like a PR problem much more than an integrity problem.)

Ruby

And a year later, I think it... just wasn't that bad?

Disagree, or at least, I don't think we know yet.

I agree that we're not continuing to hear hate online, and the group continues and gets new members and life seems to go on. But also, in a world where this weekend's events hadn't happened (I think they might dwarf the SBF fallout, or more likely compound it), I think it's likely the SBF association would still influence key events and our ability to operate in the world.

There are what get called "stand by the Levers of Power" strategies (I don't know if they're good): things like getting into positions within companies and governments that let you push for better AI outcomes. I do think SBF might have made that a lot harder.

If Jason Matheny had been seeking positions (White House, RAND, etc.) following SBF, I think being a known EA might have been a real liability. And that's not blatantly visible a year later, but I would bet it's an association we have not entirely lost. And I think the same could be said of this weekend's events, to the extent that they came from EA-typical motivations: they make it a lot harder for our cluster to be trusted as cooperative/good-faith/competent partners in things.

johnswentworth

I'm not overwhelmingly confident here, but my impression is that Sama might be competent enough to cut through the bureaucracy and get a lot done notwithstanding, and more than that, by being competent and understanding AI, he might end up running much of Microsoft. Being there just gives him a lot more resources with less effort than the whole invest-in-OpenAI cycle, and with fewer restrictions than he had at OpenAI.

One question is how independently he could operate. Nadella mentioned LinkedIn and GitHub (?) operating quite independently within Microsoft. Also, I think Microsoft will feel they have to "be nice" to Sama, as he is likely their key to AI dominance. He clearly commands a following and could go elsewhere, and I doubt he'd put up with bureaucracy that slowed him down.

To my knowledge, Sama never spent much time in a big bureaucratic company before? He was at a VC firm and startups.

And on top of that, my not-very-informed-impression-from-a-distance is that he's more a smile-and-rub-elbows guy than an actual technical manager; I don't think he was running much day-to-day at OpenAI? Low confidence on that part, though, I have not heard it from a source who would know well.

The other side of this equation has less uncertainty, though. I found it hilarious that Nadella mentioned LinkedIn as a success of Microsoft; that product peaked right before it was acquired. Microsoft pumped money out of it, but they sure didn't improve it.

Ruby

I agree about LinkedIn going downhill, which really does support your stance more than mine. Still, it seems there are precedents of acquisitions getting to operate independently. Instagram, I think. Pixar. Definitely others where a lot of freedom is maintained.

johnswentworth

And that's not blatantly visible a year later, but I would bet it's an association we have not entirely lost. And I think the same could be said of this weekend's events, to the extent that they came from EA-typical motivations: they make it a lot harder for our cluster to be trusted as cooperative/good-faith/competent partners in things.

I basically agree with this as a mechanism, to be clear. I totally think the board made some unforced PR errors and burned some of the EA commons as a result. I just think it's nowhere near as important as the other effects, and EAs are paying disproportionate attention to it.

Ruby

This seems to be a crux: did the board merely make PR errors, or other errors as well?

I can dive in with my take, but also happy to first get your accounting of the errors.

johnswentworth

Still, it seems there are precedents of acquisitions getting to operate independently. Instagram, I think. Pixar. Definitely others where a lot of freedom is maintained.

But not so much Microsoft acquisitions. (Also, Instagram is pretty debatable, but I'll definitely give you Pixar.) Like, the people who want to accelerate AGI are going to end up somewhere, and if I had to pick a place Microsoft would definitely make the short list. The only potentially-better option which springs to mind right now would be a defense contractor: same-or-worse corporate bloat, but all their work would be Officially Secret.

Ruby

I think for most people I'd agree with that. Perhaps I'm buying too much into charisma, but Sam does give me the vibe of Actual Competence, which makes this more scary.

johnswentworth

This seems to be a crux: did the board merely make PR errors, or other errors as well?

I'll note that this is not a crux for me, and I don't really know why it's a crux for you? Like, is this really the dominant effect, such that it would change the sign of the net effect on X-risk from AGI? What's the mechanism by which it ends up mattering that much?

Ruby

I was going to write stuff about integrity, and there's stuff to that, but the thing that is striking me most right now is that the whole effort seemed very incompetent and naive. And that's upsetting. I have the fear that my cluster has been revealed (I mean, if it's true, good for it to come out) to not be very astute, and that you shouldn't deal with us because we clearly don't know what we're doing.

johnswentworth

I have the fear that my cluster has been revealed (I mean, if it's true, good for it to come out) to not be very astute, and that you shouldn't deal with us because we clearly don't know what we're doing.

First, that sure does sound like the sort of thing which the human brain presents to us as a far larger, more important fact than it actually is. Ingroup losing status? Few things are more prone to distorted perception than that.

But anyway, I repeat my question: what's the mechanism by which it ends up mattering that much? Like, tell me an overly-specific story with some made-up details in it. (My expectation is that the process of actually telling the story will make it clear that this probably doesn't matter as much as it feels like it matters.)

johnswentworth

This seems to be a crux: did the board merely make PR errors, or other errors as well?

I can dive in with my take, but also happy to first get your accounting of the errors.

(Now realizing I didn't actually answer this yet.) I think the board should have publicly shared their reasons for sacking Sama. That's basically it; that's the only part which looks-to-me like an obvious unforced mistake.

Ruby

Suppose that johnswentworth and a related cluster of researchers figure out some actually key useful stuff that makes the difference for Alignment. It has large implications for how to go about training your models and evaluating your models (in the evals sense), and relatedly for which kinds of policy are useful. Something something in natural abstractions that isn't immediately useful for building more profitable AI, but does shape what your agenda and approaches should look like.

Through the social graph, people reach out to former-OpenAI employees to convince them of the importance of the results. Jan Leike is convinced but annoyed because he feels burned; Ilya is also sympathetic, but feels constrained in pushing LW/EA-cluster research ideas among a large population of people who have antagonistic, suspicious feelings toward the doomers who tried to fire Sam Altman and didn't explain why.

People go to Washington and/or the UK taskforce (I can pick names if it matters); some people are sympathetic, others fear the association. It doesn't help that Helen Toner of CSET seemed to be involved in this fiasco. It's not that no one will talk to us or entertain the merits of the research, but that most people don't judge the research on its merits, and there isn't the credibility to adopt it or its implications. A lot of friction arises from interacting with the "doomer cluster". There's bleed-over from the judgment that our political judgment is poor to the judgment that our general judgment (including about whether AI is risky at all) is poor.

If the things you want people to do differently are costly, e.g. your safer AI is more expensive, and you are seen as untrustworthy, low-integrity, low-transparency, and politically incompetent, then I think you'll have a hard time getting buy-in for them.

I can get more specific if that would help. 

johnswentworth

Ok, and having written all that out... quantitatively, how much do you think the events of this past weekend increased the chance of something-vaguely-like-that-but-more-general happening, and that making the difference between doom and not-doom?

Ruby

That's a good question. I don't think I'm maintaining enough calibration/fine-grained estimates to answer in a non-bullshit way, so I want to decline answering that for now.

My P(doom) feels like it's up by a factor somewhere between 1x and 2x, but it would take some work to separate out "I learned info about how the world is" from "actions by the board made things worse".

johnswentworth

I don't think I'm maintaining enough calibration/fine-grained estimates to answer in a non-bullshit way, so I want to decline answering that for now.

Yeah, totally fair, that was a pretty large ask on my part.

johnswentworth

Getting back to my own models here: as I see it, (in my median world) the AI lab which was previously arguably-leading the capabilities race is now de-facto out of the capabilities race. Timelines just got that much longer, race dynamics just got much weaker. And I have trouble seeing how the PR blowback from the board's unforced errors could plausibly be bad enough to outweigh a benefit that huge, in terms of the effect on AI X-risk.

Like, that is one of the best possible things which could realistically happen from corporate governance, in terms of X-risk.

Ruby

I'm not ruling the above out, but it seems very non-obvious to me, and I'd at least weakly bet against it.

As above, the crux is how much less/more AGI progress happens at Microsoft. I viewed OpenAI as racing pretty hard but still containing voices of restraint and some earnest efforts by at least some (Jan, Ilya?) to make AGI go well. I imagine Microsoft having even less caution than OpenAI, more profit focus, and less ability for those with more concern about (and better models of) AI risk to influence things for the better.

There's the model of Microsoft as a slow bureaucracy, but it's also a ~3-trillion-dollar company with a lot of cash, correctly judging that AI is what matters most for dominance in the future. If they decide they want to start manufacturing their own chips or whatever, it's easy to do so. It's also natural for Google to feel more threatened by them if Microsoft contains its own frontier AI division. We end up with both Microsoft and Google containing large, skilled AI research teams, each with billions of capital to put behind a race.

Really, thinking this reduces race dynamics seems crazy to me, actually.

johnswentworth

I mean, I don't really care how much e.g. Facebook AI thinks they're racing right now. They're not in the game at this point. That's where I expect Microsoft to be a year from now (median world). Sure, the snails will think they're racing, but what does it matter if they're not going to be the ones in the lead?

(If the other two currently-leading labs slow down enough that Facebook/Microsoft are in the running, then obviously Facebook/Microsoft would become relevant, but that would be a GREAT problem to have relative to current position.)

Google feeling threatened... maaayyyybe, but that feels like it would require a pretty conjunctive path of events for it to actually accelerate things.

johnswentworth

... anyway, we've been going a couple hours now, and I think we've identified the main cruxes and are hitting diminishing returns. Wanna do wrap-up thoughts? (Or not, we can keep pushing a thread if you want to.)

Ruby

Sounds good, yeah, I think we've ferreted out some cruxes. Seems Microsoft productivity is a big one, and how much they remain a leading player. I think they do. I think they keep most of the talent, and OpenAI effectively gets to carry on with more resources and less restraint.

johnswentworth

I think this part summarized my main take well:

(in my median world) the AI lab which was previously arguably-leading the capabilities race is now de-facto out of the capabilities race. Timelines just got that much longer, race dynamics just got much weaker. And I have trouble seeing how the PR blowback from the board's unforced errors could plausibly be bad enough to outweigh a benefit that huge, in terms of the effect on AI X-risk.

johnswentworth

I think we both agree that "how productive are the former OpenAI folks at Microsoft?" is a major crux?

Ruby

The other piece is that I think it's likely a lot of credibility and trust in those expressing AI doom/risk got burned and will be very hard to replace, in ways that have reduced our ability to nudge things for the better (whether we had the judgment to do so, or were going to do so successfully otherwise, I'm actually really not sure).

Ruby

I think we both agree that "how productive are the former OpenAI folks at Microsoft?" is a major crux?

Yup, agree on that one. I'd predict something like: we continue to see the same or greater rate of GPT releases, capability gains, etc., out of Microsoft as OpenAI was producing until now.

Ruby

Cheers, I enjoyed this


johnswentworth

[Added 36 hours later:] Well that model sure got downweighted quickly.

84 comments


comment by Eli Tyre (elityre) · 2023-11-21T20:00:03.453Z · LW(p) · GW(p)

There's apparently been a lot of EA-hate on twitter as a result. I personally expect this to matter very little, if at all, in the long run, but I'd expect it to be extremely disproportionately salient to rationalists/EAs/alignment folk.

I think this matters insofar as thousands of tech people just got radicalized into the "accelerationist" tribe, and have a narrative of "EAs destroy value in their arrogance and delusion." Whole swaths of Silicon Valley are now primed to roll their eyes at and reject any technical, governance, or policy proposals justified by "AI safety."

See this Balaji thread for instance, which notably is close to literally correct (like, yeah, I would endorse "anything that reduces P(doom) is good"), but slips in the presumption that "doomers" are mistaken about which things reduce P(doom) / delusional about there being doom at all. Plus a good dose of attack-via-memetic-association (comparing "doomers" to the Unabomber). This is basically just tribalism.

I don't know if this was inevitable in the long run. It seems like if the firing of Sam Altman had been handled better, if they had navigated the politics so that Sam "left to spend more time with his family" or whatever is typical when you oust a CEO, we could have avoided this impact.

Replies from: ChristianKl, JenniferRM
comment by ChristianKl · 2023-11-23T03:39:08.970Z · LW(p) · GW(p)

I think this matters insofar as thousands of tech people just got radicalized into the "accelerationist" tribe, and have a narrative of "EAs destroy value in their arrogance and delusion." Whole swaths of Silicon Valley are now primed to roll their eyes at and reject any technical, governance, or policy proposals justified by "AI safety."

I'm not sure. This episode might give some people the idea that Sam Altman's position on regulation is a middle ground between two extremes, when they previously opposed the regulations that Sam sought as being too much.

comment by JenniferRM · 2023-11-22T01:50:28.592Z · LW(p) · GW(p)

That's part of the real situation though. Sam would never quit to "spend more time with his family".

When we predict good outcomes for startups, the qualities that come up in the supporting arguments are toughness, adaptability, determination. Which means to the extent we're correct, those are the qualities you need to win.

Investors know this, at least unconsciously. The reason they like it when you don't need them is not simply that they like what they can't have, but because that quality is what makes founders succeed.

Sam Altman has it. You could parachute him into an island full of cannibals and come back in 5 years and he'd be the king. If you're Sam Altman, you don't have to be profitable to convey to investors that you'll succeed with or without them. (He wasn't, and he did.)

Link in sauce.

Replies from: elityre
comment by Eli Tyre (elityre) · 2023-11-22T07:41:29.923Z · LW(p) · GW(p)

He can leave "to avoid a conflict of interest with AI development efforts at Tesla", then. It doesn't have to be "relaxation"-coded. Let him leave with dignity.

Insofar as Sam would never cooperate with the board firing him at all, even a more graceful firing, then yeah, I guess the board never really had any governance power at all, and it's good that the fig leaf has been removed.

And basically, we'll never know if they bungled a coup that they could have pulled off if they were more savvy, or if this was a foregone conclusion.

Replies from: JenniferRM
comment by JenniferRM · 2023-11-27T20:37:13.239Z · LW(p) · GW(p)

With apologies for the long response... I suspect the board DID have governance power, but simply not decisive power.

Also it was probably declining, and this might have been a net positive way to spend what remained of it... or not?

It is hard to say, and I don't personally have the data I'd need to be very confident. "Being able to maintain a standard of morality for yourself even when you don't have all the data and can't properly even access all the data" is basically the core REASON for deontic morality, after all <3

Naive consequentialism has a huge GIGO data problem that Kant's followers do not have.

(The other side of it (the "cost of tolerated ignorance" so to speak) is that Kantians usually are leaving "expected value" (even altruistic expected value FOR OTHERS) on the table by refraining from actions that SEEM positive EV but which have large error bars based on missing data, where some facts could exist that they don't know about that would later make them appear to have lied or stolen or used a slave or run for high office in a venal empire or whatever.)

I personally estimate that it would have been reasonable and prudent for Sam to cultivate other bases of power, preparing for a breach of amity in advance, and I suspect he did. (This is consistent with suspecting the board's real power was declining.)

Conflict in general is sad, and often bad, and it usually arises at the boundaries where two proactive agentic processes show up with each of them "feeling like Atlas" and feeling that that role morally authorizes them to regulate others in a top-down way... to grant rewards, or to judge conflicts, or to sanction wrong-doers...

...if two such entities recognize each other as peers, then it can reduce the sadness of their "lonely Atlas feelings"!  But also they might have true utility functions, and not just be running on reflexes! Or their real-agency-echoing reflexive tropisms might be incompatible. Or mixtures thereof?

Something I think I've seen many times is a "moral reflex" on one side (that runs more on tropisms?) be treated as a "sign of stupidity" by someone who habitually runs a shorter tighter OODA loop and makes a lot of decisions, whose flexibility is taken as a "sign of evil". Then both parties "go mad" :-(

Before any breach, you might get something with a vibe like "a meeting of sovereigns", with perhaps explicit peace or honorable war... like with two mafia families, or like two blockchains pondering whether or how to fund dual smart contracts that maintain token-value pegs at a stable ratio, or like the way Putin and Xi are cautious around each other (but probably also "get" each other (and "learn from a distance" from each other's seeming errors)).

In a democracy, hypothetically, all the voters bring their own honor to a big shared table in this way, and then in Fukuyama's formula such "Democrats" can look down on both "Peasants" (for shrinking from the table even when invited to speak and vote in safety) and also "Nobles" (for simple power-seeking amorality that only cares about the respect and personhood of other Nobles who have fought for and earned their nobility via conquest or at least via self defense).

I could easily imagine that Sam does NOT think of himself "as primarily a citizen of any country or the world" but rather thinks of himself as something like "a real player", and maybe only respects "other real players"?

(Almost certainly Sam doesn't think of himself AS a nominal "noble" or "oligarch" or whatever term. Not nominally. I just suspect, as a constellation of predictions and mechanisms, that he would be happy if offered praise shaped according to a model of him as, spiritually, a Timocracy-aspiring Oligarch (who wants money and power, because those are naturally good/familiar/oikion, and flirts in his own soul (or maybe has a shadow relationship?) with explicitly wanting honor and love), rather than thinking of himself as a Philosopher King (who mostly just wants to know things, and feels the duty of logically coherent civic service as a burden, and does NOT care for being honored or respected by fools, because fools don't even know what things are properly worthy of honor). In this framework, I'd probably count as a sloth, I think? I have mostly refused the call to adventure, the call of duty, the call to civic service.)

I would totally get it if Sam might think that OpenAI was already "bathed in the blood of a coup" from back when nearly everyone with any internal power somehow "maybe did a coup" on Elon?

The Sam in my head would be proud of having done that, and maybe would have wished to affiliate with others who are proud of it in the same way?

From a distance, I would have said that Elon starting them up with such a huge warchest means Elon probably thereby was owed some debt of "governing gratitude" for his beneficence?

If he had a huge say in the words of the non-profit's bylaws, then an originalist might respect his intent when trying to apply them far away in time and space. (But not having been in any of those rooms, it is hard to say for sure.)

Elon's ejection back then, if I try to scry it from public data, seems to have happened with the normal sort of "oligarchic dignity" where people make up some bullshit about how a breakup was amicable.

((It can be true that it was "amicable" in some actual pareto positive breakups, whose outer forms can then be copied by people experiencing non-pareto-optimal breakups. Sometimes even the "loser" of a breakup values their (false?) reputation for amicable breakups more than they think they can benefit from kicking up a fuss about having been "done dirty" such that the fuss would cause others to notice and help them less than the lingering reputation for conflict would hurt.

However there are very many wrinkles to the localized decision theory here!

Like one big and real concern is that a community would LIKE to "not have to take sides" over every single little venal squabble, such as to maintain itself AS A COMMUNITY (with all the benefits of large scale coordination and so on) rather than globally forking every single time any bilateral interaction goes very sour, with people dividing based on loyalty rather than uniting via truth and justice.

This broader social good is part of why a healthy and wise and cheaply available court system is, itself, an enormous public good for a community full of human people who have valid selfish desires to maintain a public reputation as "a just person" and yet also as "a loyal person".))

So the REAL "psychological" details about "OpenAI's possible first coup" are very obscure at this point, and imputed values for that event are hard to use (at least hard for me who is truly ignorant of them) in inferences whose conclusions could be safely treated as "firm enough to be worth relying on in plans"?

But if that was a coup, and if OpenAI already had people inside of it who already thought that OpenAI ran on nearly pure power politics (with only a pretense of cooperative non-profit goals), then it seems like it would be easy (and psychologically understandable) for Sam to read all pretense of morality or cooperation (in a second coup) as bullshit.

And if the board predicted this mental state in him, then they might "lock down first"?

Taking the first legibly non-negotiated non-cooperative step generally means that afterwards things will be very complex and time dependent [LW · GW] and once inter-agent conflict gets to the "purposeful information hiding stage" everyone is probably in for a bad time :-(

For a human person to live like either a naive saint (with no privacy or possessions at all?!) or a naive monster (always being a closer?) would be tragic and inhuman.

Probably digital "AI people" will have some equivalent experience of similar tradeoffs, relative to whatever Malthusian limits they hit (if they ever hit Malthusian limits, and somehow retain any semblance or shape of "personhood" as they adapt to their future niche). My hope is that they "stay person shaped" somehow. Because I'm a huge fan of personhood.

The intrinsic tensions between sainthood and monsterhood means that any halo of imaginary Elons or imaginary Sams, who I could sketch in my head for lack of real data, might have to be dropped in an instant based on new evidence.

In reality, they are almost certainly just dudes, just people, and neither saints, nor monsters.

Most humans are neither, and the lack of coherent monsters is good for human groups (who would otherwise be preyed upon), and the lack of coherent saints is good for each one of us (as a creature in a world, who has to eat, and who has parents and who hopefully also has children, and for whom sainthood would be locally painful).

Both sainthood and monsterhood are ways of being that have a certain call on us, given the world we live in. Pretending to be a saint is a good path to private power over others, and private power is subjectively nice to have... at least until the peasants with knifes show up (which they sometimes do).

I think that tension is part of why these real world dramatic events FEEL like educational drama, and pull such huge audiences (of children?), who come to see how the highest and strongest and richest and most prestigious people in their society balance such competing concerns within their own souls.

comment by faul_sname · 2023-11-21T20:18:22.494Z · LW(p) · GW(p)

There are what get called "stand by the Levers of Power" strategies (I don't know if they're good): things like getting into positions within companies and governments that let you push for better AI outcomes. I do think SBF might have made that a lot harder.

I think this is an important point: one idea that is very easy to take away from the FTX and OpenAI situations is something like

People associated with EA are likely to decide at some point that the normal rules for the organization do not apply to them, if they expect that they can generate a large enough positive impact in the world by disregarding those rules.  Any agreement you make with an EA-associated person should be assumed to have an "unless I think the world would be better if I broke this agreement" rider (in addition to the usual "unless I stand to personally gain a lot by breaking this agreement" rider that people already expect and have developed mitigations for).

Basically, I expect that the strategy of "attempt to get near the levers of power in order to be able to execute weird plans where, if the people in charge of the decision about whether to let you near the levers of power knew about your plans, they never would have let you near the levers of power in the first place" will work less well for EAs in the future. To the extent that EAs actually have a tendency to attempt those sorts of plans, it's probably good that people are aware of that tendency.

But if you start from the premise of "EAs having more ability to influence the world is good, and the reason they have that ability is not relevant", then this weekend was probably quite bad.

Replies from: Ruby
comment by Ruby · 2023-11-21T20:28:58.607Z · LW(p) · GW(p)

People associated with EA are likely to decide at some point that the normal rules for the organization do not apply to them, if they expect that they can generate a large enough positive impact in the world by disregarding those rules.

I am myself a consequentialist at my core, but invoking consequentialism to justify breaking commitments, non-cooperation, theft, or whatever else is just a stupid, bad policy (the notion of people doing this generates some strong emotions for me) that, as a policy/algorithm, won't result in accomplishing one's consequentialist goals.

I fear what you say is not wholly inaccurate and is true of at least some in EA; I hope, though, that it's not true of many.

Where it does get tricky is with potential unilateral pivotal acts, which I think go in this direction but also feel different from what you describe.

comment by mtaran · 2023-11-21T23:08:14.841Z · LW(p) · GW(p)

A few others have commented about how MSFT doesn't necessarily stifle innovation, and a relevant point here is that MSFT is generally pretty good at letting its subsidiaries do their own thing and have their own culture. In particular GitHub (where I work), still uses Google Workspace for docs/email, slack+zoom for communication, etc. GH is very much remote-first whereas that's more of an exception at MSFT, and GH has a lot less suffocating bureaucracy, and so on. Over the years since the acquisition this has shifted to some extent, and my team (Copilot) is more exposed to MSFT than most, but we still get to do our own thing and at worst have to jump through some hoops for compute resources. I suspect if OAI folks come under the MSFT umbrella it'll be as this sort of subsidiary with almost complete ability to retain whatever aspects of its previous culture that it wants.

Standard disclaimer: my opinions are my own, not my employer's, etc.

comment by Vladimir_Nesov · 2023-11-21T18:42:33.975Z · LW(p) · GW(p)

For me the crux is influence of these events on Sutskever ending up sufficiently in charge of a leading AGI project. It appeared borderline true before; it would've become even more true than that if Altman's firing stuck without disrupting OpenAI overall; and right now with the strike/ultimatum letter it seems less likely than ever (whether he stays in an Altman org or goes elsewhere).

(It's ambiguous if Anthropic is at all behind, and then there's DeepMind that's already in the belly of Big Tech, so I don't see how timelines noticeably change.)

Replies from: mishka
comment by mishka · 2023-11-21T20:13:23.886Z · LW(p) · GW(p)

Exactly. And then one's estimate of the actual impact depends on whether one believes Sutskever is one of the best people to lead an AI existential safety effort.

If one believes that, and if the outcome is that he ends up less likely to do so in the context of the leading AGI/ASI project, then the impact on safety might be very negative.

If one does not believe that he is one of the best people to lead this kind of effort, then one might think that the impact is not negative.


(I personally believe Ilya's approach is one of the better ones, and it seems to me that he has been in the process of fixing the defects in the original OpenAI superalignment plan, and basically trying to gradually create a better plan, but people's views on that might differ.)

comment by Casey B. (Zahima) · 2023-11-21T19:30:27.290Z · LW(p) · GW(p)

The main reason I think a split OpenAI means shortened timelines is that the main bottleneck to capabilities right now is insight/technical knowledge. Quibbles aside, basically any company with enough cash can get sufficient compute. Even with other big players and thousands/millions of open-source devs trying to do better, to my knowledge GPT-4 is still the best, implying some moderate to significant insight lead. I worry that by fracturing OpenAI, more people will have access to those insights, which 1) significantly increases the surface area of people working on the frontiers of insight/capabilities, 2) burns the lead time OpenAI had, which might otherwise have been used to pay off some alignment tax, and 3) means the insights might end up at a less scrupulous (wrt alignment) company.

A potential counter to (1): OpenAI's success could be dependent on having all (or some key subset) of their people centralized and collaborating. 

Counter-counter: OpenAI staff (especially the core engineering talent, but it seems the entire company at this point) clearly want to mostly stick together, whether at the official OpenAI, at Microsoft, or with any other independent solution. So them moving to any other host, such as Microsoft, means you get some of the worst of both worlds: OAI staff are centralized for peak collaboration, and Microsoft probably unavoidably gets their insights. I don't buy the story that anything under the Microsoft umbrella gets swallowed and slowed down by the bureaucracy; Satya knows what he is dealing with and what they need, and won't get in the way.

Replies from: stephen-mcaleese
comment by Stephen McAleese (stephen-mcaleese) · 2023-11-21T22:58:56.311Z · LW(p) · GW(p)

GPT-4 is the model that has been trained with the most training compute, which suggests that compute is the most important factor for capabilities. If that weren't true, we would see some other company training models with more compute but worse performance, which doesn't seem to be happening.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2023-11-21T23:20:16.507Z · LW(p) · GW(p)

Falcon-180b illustrates how throwing compute at an LLM can result in unusually poor capabilities. Epoch's estimate puts it close to Claude 2 in compute, yet it's nowhere near as good. Then there's the even more expensive PaLM 2, though since weights are not published, it's possible that unlike with Falcon the issue is that only smaller, overly quantized, or incompetently tuned models are being served.

comment by Ben Pace (Benito) · 2023-11-21T21:21:03.957Z · LW(p) · GW(p)

FTR I am not spending much time calculating the positive or negative direct effect of this firing. I am currently pretty concerned about whether it was done honorably and ethically or not. It looks not to me, and so I oppose it regardless of the sign of the effect.

Replies from: elityre, elityre
comment by Eli Tyre (elityre) · 2023-11-22T07:46:43.892Z · LW(p) · GW(p)

I am currently pretty concerned about whether it was done honorably and ethically or not.

What does the ethical / honorable version of the firing look like?

Replies from: Benito
comment by Ben Pace (Benito) · 2023-11-22T08:00:22.378Z · LW(p) · GW(p)

It entirely depends on the reasoning.

Quick possible examples:

  • "Altman, we think you've been deceiving us and tricking us about what you're doing. Here are 5 documented instances where we were left with a clear impression about what you'd do that is counter to what eventually occurred. I am pretty actively concerned that in telling you this, you will cover up your tracks and just deceive us better. So we've made the decision to fire you 3 months from today. In that time, you can help us choose your successor, and we will let you announce your departure. Also if anyone else in the company should ask, we will also show them this list of 5 events."
  • "Altman, we think you've chosen to speed ahead with selling products to users at the expense of having control over these home-grown alien intelligences you're building. I am telling you that there needs to be fewer than 2 New York Times pieces about us in the next 12 months, and that we must overall slow the growth rate of the company and not 2x in the next year. If either of these are not met, we will fire you, is that clear?"

Generally, not telling the staff why was extremely disrespectful, and not highlighting it to him ahead of time is also uncooperative.

Replies from: Lukas_Gloor
comment by Lukas_Gloor · 2023-11-22T11:00:28.547Z · LW(p) · GW(p)

Maybe, yeah. Definitely strongly agree that not telling the staff a more complete story seems bad, for both intrinsic and instrumental reasons.

I'm a bit unsure how wise it would be to tip Altman off in advance given what we've seen he can mobilize in support of himself. 

And I think it's a thing that only EAs would think up that it's valuable to be cooperative towards people who you're convinced are deceptive/lack integrity. [Edit: You totally misunderstood what I meant here; I was criticizing them for doing this too naively. I was not praising the norms of my in-group. Your reply actually confused me so much that I thought you were being snarky in some really strange way.] Of course, they have to consider all the instrumental reasons for it, such as how it'll reflect on them if others don't share their assessment of the CEO lacking integrity. 

Replies from: Benito
comment by Ben Pace (Benito) · 2023-11-22T18:12:49.588Z · LW(p) · GW(p)

Absolutely not. When I make an agreement to work closely with you on a crucial project, if I think you're deceiving me, I will let you know. I will not surprise backstab you and get on with my day. I will tell you outright and I will say it loudly. I may move quickly to disable you if it's an especially extreme circumstance but I will acknowledge that this is a cost to our general cooperative norms where people are given space to respond even if I assign a decent chance to them behaving poorly. Furthermore I will provide evidence and argument in response to criticism of my decision by other stakeholders who are shocked and concerned by it.

Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you've lost trust in them. Other people know what's decent too.

Replies from: Thane Ruthenis, Lukas_Gloor, kave, Linch
comment by Thane Ruthenis · 2023-11-22T23:54:47.572Z · LW(p) · GW(p)

That does not seem sensible to me. We're not demons for the semantics of our contracts to be binding. If you've entered an agreement with someone, and later learned that they intend (and perhaps have always intended) to exploit your acting in accordance with it to screw you over, it seems both common-sensically and game-theoretically sound to consider the contract null and void, since it was agreed-to based on false premises.

If the other person merely turned out to be generically a bad/unpleasant person, with no indication they're planning to specifically exploit their contract with you? Then yes, backstabbing them is dishonorable.

But if you've realized, with high confidence, that they're not actually cooperating with you the way you understood them to be, and they know they're not cooperating with you the way you expect them to be, and they deliberately maintain your misunderstanding in order to exploit you? Then, I feel, giving them advance warning is just pointless self-harm in the name of role-playing being cooperative; not a move to altruistically preserve social norms.

Replies from: Benito
comment by Ben Pace (Benito) · 2023-11-24T05:11:37.235Z · LW(p) · GW(p)

If you've entered an agreement with someone, and later learned that they intend (and perhaps have always intended) to exploit your acting in accordance with it to screw you over, it seems both common-sensically and game-theoretically sound to consider the contract null and void, since it was agreed-to based on false premises.

If you make a trade agreement, and the other side does not actually pay up, then I do not think you are bound to provide the good anyway. It was a trade.

If you make a commitment, and then later come to realize that in requesting that commitment the other party was actually taking advantage of you, I think there are a host of different strategies one could pick. I think my current ideal solution is "nonetheless follow through on your commitment, but make them pay for it in some other way", but I acknowledge that there are times when it's correct to pick other strategies like "just don't do it, and when anyone asks you why, give them a straight answer" and more.

Your strategy in a given domain will also depend on all sorts of factors like how costly the commitment is, how much they're taking advantage of you for, what recourse you have outside of the commitment (e.g. if they've broken the law they can be prosecuted, but in other cases it is harder to punish them).

The thing I currently believe and want to say here is that it is not good to renege on commitments even if you have reason to, and it is better to not renege on them while setting the incentives right. It can be the right choice to do so in order to set the incentives right, but even when it's the right call I want to acknowledge that this is a cost [LW · GW] to our ability to trust in people's commitments.

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2023-11-24T07:49:09.530Z · LW(p) · GW(p)

Sure, I agree with all of the local points you've stated here ("local" as in "not taking into account the particulars of the recent OpenAI drama"). For clarity, my previous disagreement was with the following local claim:

When I make an agreement to work closely with you on a crucial project, if I think you're deceiving me, I will let you know

In my view, "knowingly executing your part of the agreement in a way misaligned with my understanding of how that part is to be executed" counts as "not paying up in a trade agreement", and is therefore grounds for ceasing to act in accordance with the agreement on my end too. From this latest comment, it sounds like you'd agree with that?

Reading the other branch of this thread, you seem to disagree that that was the situation the OpenAI board was in. Sure, I'm hardly certain of this myself. However, if it was, and if they were highly certain of being in that position, I think their actions are fairly justified.

My understanding is that OpenAI's foundational conceit was prioritizing safety over profit/power-pursuit, and that their non-standard governance structure was explicitly designed to allow the board to take draconian measures if they concluded the company went astray. Indeed, going off these sort of disclaimers or even the recent actions, it seems they were hardly subtle or apologetic about such matters.

"Don't make us think that you'd diverge from our foundational conceit if given the power to, or else" was part of the deal Sam Altman effectively signed by taking the CEO role. And if the board had concluded that this term was violated, then taking drastic and discourteous measures to remove him from power seems entirely fine to me.

Paraphrasing: while in the general case of deal-making, a mere "I lost trust in you" is not reasonable grounds for terminating the deal, my understanding is that "we have continued trust in you" was part of this specific deal, meaning losing trust was reasonable grounds for terminating this specific deal.

I acknowledge, though, that it's possible I'm extrapolating from the "high-risk investment" disclaimer plus their current actions incorrectly, that the board had actually failed to communicate this to Sam when hiring him. Do we have cause to believe that, though?

comment by Lukas_Gloor · 2023-11-22T19:02:12.322Z · LW(p) · GW(p)

When I make an agreement to work closely with you on a crucial project,

I agree that there are versions of "agreeing to work closely together on the crucial project" where I see this as "speak up now or otherwise allow this person into your circle of trust." Once someone is in that circle, you cannot kick them out without notice just because you think you observed stuff that made you change your mind – if you could do that, it wouldn't work as a circle of trust.

So, there are circumstances where I'd agree with you. Whether the relationship between a board member and a CEO should be like that could be our crux here. I'd say yes in the ideal, but was it like that for the members of the board and Altman? I'd say it depends on the specific history. And my guess is that, no, there was no point where the board could have said "actually we're not yet sure we want to let Altman into our circle of trust, let's hold off on that." And there's no yes without the possibility of no [LW · GW]. 

if I think you're deceiving me, I will let you know.

I agree that one needs to do this if one lost faith in people who once made it into one's circle of trust. However, let's assume they were never there to begin with. Then, it's highly unwise if you're dealing with someone without morals who feels zero obligations towards you in response. Don't give them an advance warning out of respect or a sense of moral obligation. If your mental model of the person is "this person will internally laugh at you for being stupid enough to give them advance warning and will gladly use the info you gave against you," then it would be foolish to tell them. Batman shouldn't tell the joker that he's coming for him.

I may move quickly to disable you if it's an especially extreme circumstance but I will acknowledge that this is a cost to our general cooperative norms where people are given space to respond even if I assign a decent chance to them behaving poorly.

What I meant to say in my initial comment is the same thing as you're saying here.

"Acknowledging the cost" is also an important thing in how I think about it, (edit) but I see that cost as not being towards the Joker (respect towards him), but towards the broader cooperative fabric. [Edit: deleted a passage here because it was long-winded.]

"If I assign a decent chance to them behaving poorly" – note that in my description, I spoke of practical certainty, not just "a decent chance that." Even in contexts where I think mutual expectations of trustworthiness and cooperativeness are lower than in what I call "circles of trust," I'm all in favor of preserving respect up until way past the point where you're just a bit suspicious of someone. It's just that, if the stakes are high and if you're not in a high-trust relationship with the person (i.e., you don't have a high prior that they're for sure cooperating back with you), there has to come a point where you'll stop giving them free information that could harm you. 

I admit this is a step in the direction of act utilitarianism, and act utilitarianism is a terrible, wrong ideology. However, I think it's only a step and not all the way, and there's IMO a way to codify rules/virtues where it's okay to take these steps and you don't get into a slippery slope. We can have a moral injunction where we'd only make such moves against other people if our confidence is significantly higher than it needs to be on mere act utilitarian grounds. Basically, you either need smoking-gun evidence of something sufficiently extreme, or need to get counsel from other people and see if they agree to filter out unilateralism in your judgment, or have other solutions/safety-checks like that before allowing yourself to act.

I think what further complicates the issue is that there are "malefactor [LW · GW] types" who are genuinely concerned about doing the right thing and where it looks like they're capable of cooperating with people in their inner circle, but then they are too ready to make huge rationalization-induced updates (almost like "splitting" in BPD) that the other party was bad all along and is now out of the circle. Their inner circle is way too fluid and their true circle of trust is only themselves. The existence of this phenotype means that if someone like that tries to follow the norms I just advocated, they will do harm. How do I incorporate this into my suggested policy? I feel like this is analogous to discussions about modest epistemology vs non-modest epistemology. [LW · GW] What if you're someone who's deluded into thinking he's Napoleon/some genius scientist? If someone is deluded like that, non-modest epistemology doesn't work. To this, I say "epistemology is only helpful if you're not already hopelessly deluded." Likewise, what if your psychology is hopelessly self-deceiving and you'll do on-net-harmful self-serving things even when you try your best not to do them? Well, sucks to be you (or rather, sucks for other people that you exist), but that doesn't mean that the people with a more trust-compatible psychology have to change the way they go about building a fabric of trust that importantly also has to be protected against invasion from malefactors.

I actually think it's a defensible position to say that the temptation to decide who is or isn't "trustworthy" is too big and humans need moral injunctions and that Batman should give the joker an advance warning, so I'm not saying you're obviously wrong here, but I think my view is defensible as well, and I like it better than yours and I'll keep acting in accordance with it. (One reason I like it better is because if I trust you and you play "cooperate" with someone who only knows deception and who moves against you and your cooperation partners and destroys a ton of value, then I shouldn't have trusted you either. Being too indiscriminately cooperative makes you less trustworthy in a different sort of way.)

Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you've lost trust in them. Other people know what's decent too.

I think there's something off about the way you express whatever you meant to express here – something about how you're importing your frame of things over mine and claim that I said something in the language of your frame, which makes it seem more obviously bad/"shameful" than if you expressed it under my frame. 

[Edit: November 22nd, 20:46 UK time. Oh, I get it now. You totally misunderstood what I meant here! I was criticizing EAs for doing this too naively. I was not praising the norms of my in-group (EA). Your reply actually confused me so much that I thought you were being snarky at me in some really strange way. Like, I thought you knew I was criticizing EAs. I guess you might identify as more of a rationalist than an EA, so I should have said "only EAs and rationalists" to avoid confusion. And like I say below, this was somewhat hyperbolic.]

In any case, I'd understand it if you said something like "shame on your for disclosing to the world that you think of trust in a way that makes you less trustworthy (according to my, Ben's, interpretation)." If that's what you had said, I'm now replying that I hope that you no longer think this after reading what I elaborated above.

Edit: And to address the part about "your tribe" – okay, I was being hyperbolic about only EAs having a tendency to be (what-I-consider-to-be) naive when it comes to applying norms of cooperation. It's probably also common in other high-trust ideological communities. I think it actually isn't very common in Silicon Valley, which very much supports my point here. When people get fired or backstabbed over startup drama (I'm thinking of the movie The Social Network), they are not given three months adjustment period where nothing really changes except that they now know what's coming. Instead, they have their privileges revoked and passwords changed and have to leave the building. I think focusing over how much notice someone has given is more a part of the power struggle and war over who has enough leverage to get others on their side, than it is genuinely about "this particular violation of niceness norms is so important that it deserves to be such a strong focus of this debate." Correspondingly, I think people would complain a lot less about how much notice was given if the board had done a better job convincing others that their concerns were fully justified. (Also, Altman himself certainly wasn't going to give Helen a lot of time still staying on the board and adjusting to the upcoming change, still talking to others about her views and participating in board stuff, etc., when he initially thought he could get rid of her.)

Replies from: Benito
comment by Ben Pace (Benito) · 2023-11-23T01:09:14.215Z · LW(p) · GW(p)

1.

I find the situation a little hard to talk about concretely because whatever concrete description I give will not be correct (because nobody involved is telling us what happened).

Nonetheless, let us consider the most uncharitable narrative regarding Altman here, where members of the board come to believe he is a lizard, a person who is purely selfish and who has no honor. (To be clear I do not think this is accurate, I am using it for communication purposes.) Here are some rules.

  • Just because someone is a lizard, does not make it okay to lie to them
  • Just because someone is a lizard, does not make it okay to go back on agreements with them
  • If the lizard made agreements and commitments on behalf of your company while they had the mandate to do so, it is not now okay to disregard those agreements and commitments

The situation must not be "I'll treat you honorably if I think you're a good person, but the moment I decide you're a lizard then I'll act with no honor myself." The situation must be "I will treat you honorably because it is right to be honorable." Otherwise honor will seep out of the system as the probabilities we assign to others' honor waver.

I think it is damaging to the trust people place in board members, to see them act with so little respect or honor. It reduces everyone's faith in one another to see people in powerful positions behave badly.

II.

I respect that in response to my disapproval of your statement, you took the time to explain in detail the reasoning behind your comment and communicate some more of your perspective on the relevant game theory. I think it generally helps, when folks are having conflicts, to openly examine the reasons why decisions were made and investigate those. And it also gives us more surface area to locate key parts of the disagreement.

I still disagree with you. I think it was an easy-and-wrong thing to suggest that only people in the EA tribe would care about this important ethical principle I care about. But I am glad we're exploring this rather than just papering over it, or just being antagonistic, or just leaving.

III.

CEO:

"Suppose you come to the conclusion that I'm a lizard. Will you give me no chance for a rebuttal, and fire me immediately, without giving our business partners notice, and never give me a set of reasons, and never tell our staff a set of reasons?"

Prospective Board Member:

"No you can be confident that I would not do that.We would conduct an investigation, and at that time bar your ability to affect the board. We would be candid with the staff about our concerns, and we would not wantonly harm the agreements you made with your business partners."

CEO:

"But what if you came to believe that I was maneuvering to remove power from you within days?"

Prospective Board Member:

"I think there are worlds where I would take sudden action. I could see myself voting to remove you from the board while the investigation is under way, and letting the staff and business partners know that we're investigating you and a possible outcome is you being fired."

Contracts are filled with many explicit terms and agreements, but I also believe they ~all come with an implicit one: in making this deal we are agreeing not to screw each other over. I think that if, when accepting the board seat, they would have regarded this kind of sudden firing-without-cause and not explaining anything to the staff as screwing Altman over, and if they did not bring up this sort of action as a thing they might do before it was time to do so, then they should not have done it.

IV.

I agree that there are versions of "agreeing to work closely together on the crucial project" where I see this as "speak up now or otherwise allow this person into your circle of trust." Once someone is in that circle, you cannot kick them out without notice just because you think you observed stuff that made you change your mind – if you could do that, it wouldn't work as a circle of trust.

I don't think this is a "circle of trust". I think accepting a board seat is an agreement. It is an agreement to be given responsibility, and to use it well and in accordance with good principles. I think there is a principle of giving someone a chance to respond, and being open about why you are destroying their life and company before you do so, regardless of context, and you don't forgo that just because they are acting antagonistically toward you. Barring illegal acts or acts of direct violence, you should give someone a chance to respond and be open about why you are destroying everything they've built.

Batman shouldn't tell the Joker that he's coming for him.

The Joker had killed many people by the time Batman came for him. From many perspectives, this is currently primarily a disagreement over managing a lot of money and a great company. The two situations are not similar.

Perhaps you wish to say that Altman is in an equivalent moral position, in that his work stands to be directly responsible for an AI takeover (as I believe), similar in impact to an extinction event. I think if Toner/McCauley/etc. believed this, then they should have said so openly far, far earlier, so that their counter-parties in this conflict (and everyone else in the world) were aware of the rules at play.

I don't believe that any of them said this before they were given board seats.

V.

In the most uncharitable case (toward Altman) where they believed he was a lizard, they should probably have opened up an investigation before firing him, and taken some action to prevent him from outvoting them (e.g. just removed him from the board, or added an extra board member).

They claim to have done a thorough investigation. Yet it has produced no written results, and they could not provide any written evidence to Emmett Shear. So I do not believe they have done a proper investigation or produced any evidence to others. If they can produce ~no evidence to others, then they should cast a vote of no confidence, fire Altman, install a new CEO, install a new board, and quit. I would have respected them more if they had also stated that they did not act honorably in ousting Altman and would be looking for a new board to replace them.

You can choose to fire someone for misbehavior even when you have no legible evidence of misbehavior. But then you have to think about how you can gain the trust of the next person who comes along, who understands you fired the last person with no clear grounds.

VI.

Lukas: I think it's a thing that only EAs would think up that it's valuable to be cooperative towards people who you're convinced are deceptive/lack integrity.

Ben: Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you've lost trust in them. Other people know what's decent too.

Lukas: I think there's something off about the way you express whatever you meant to express here – something about how you're imposing your frame of things over mine and claiming that I said something in the language of your frame, which makes it seem more obviously bad/"shameful" than if you expressed it under my frame.

In any case, I'd understand it if you said something like "shame on you for disclosing to the world that you think of trust in a way that makes you less trustworthy (according to my, Ben's, interpretation)." If that's what you had said, I'm now replying that I hope you no longer think this after reading what I elaborated above.

I keep reading this and not understanding your last reply. I'll rephrase my understanding of our positions.

I think you view the board-firing situation as follows: some people, who didn't strongly trust Altman, were given the power to oust him, came to think he's a lizard (with zero concrete evidence), and then just got rid of him.

I'm saying that even if that's true, they should have acted more respectfully to him and honored their agreement to wield the power with care, so should have given him notice and done a proper investigation (again given they have zero concrete evidence).

I think that you view me as trying to extend the principle of charity arbitrarily far (to the point of self-harm), and so you're calling me naive and too cooperative – a lizard's a lizard, just destroy it.

I'm saying that you should honor the agreement you've made to wield your power well and not cruelly or destructively. It seems to me that it has likely been wielded very aggressively and in a way where I cannot tell that it was done justly. A man was told on Friday that he had been severed from the ~$100B company he had built. He was given no cause, the company was given no cause, it appears as if there was barely any clear cause, and there was no way to make the decision right (were it a mistake). This method currently seems to me both a little cruel and a little power-hungry/unjust, even when I assume the overall call is the correct one.

For you to say that I'm just another EA who is playing cooperate-bot lands with me as (a) inaccurately calling me naive and rounding off my position to a stupider one, (b) disrespecting all the other people in the world who care about people wielding power well, and (c) kind of saying your tribe is the only one with good people in it. Which I think is a pretty inappropriate reply.

I have provided some rebuttals on a bunch of specific points above. Sorry for the too-long comment.

Replies from: Ruby, Lukas_Gloor
comment by Ruby · 2023-11-23T02:24:27.071Z · LW(p) · GW(p)

I see interesting points on both sides here. Something about how these comments are expressed makes me feel uncomfortable, like this isn't the right tone for exploring disagreements about correct moral/cooperative behavior, or at least it makes it a lot harder for me to participate. I think it's something like it feels like performing moral outrage/indignation in a way that feels more persuadey than explainy, and more in the direction of social pressure, norms-enforcery. The phrase "shame on you" is a particularly clear thing I'll point at that makes me perceive this.

comment by Lukas_Gloor · 2023-11-23T03:04:22.122Z · LW(p) · GW(p)

a) A lot of your points are specifically about Altman and the board, whereas many of my points started that way but then went into the abstract/hypothetical/philosophical. At least, that's how I meant it – I should have made this more clear. I was assuming, for the sake of the argument, that we're speaking of a situation where the person in the board's position found out that someone else is deceptive to their very core, with no redeeming principles they adhere to. So, basically what you're describing in your point "I" with the lizardpeople. I focused on that type of discussion because I felt like you were attacking my principles, and I care about defending my specific framework of integrity. (I've commented elsewhere on things that I think the board should or shouldn't have done, so I also care about that, but I probably already spent too many comments on speculations about the board's actions.) 
Specifically about the actual situation with Altman, you say: 
"I'm saying that you should honor the agreement you've made to wield your power well and not cruelly or destructively. It seems to me that it has likely been wielded very aggressively and in a way where I cannot tell that it was done justly."
I very much agree with that, fwiw. I think it's very possible that the board did not act with integrity here. I'm just saying that I can totally see circumstances where they did act with integrity. The crux for me is "what did they believe about Altman and how confident were they in their take, and did they make an effort to factor in moral injunctions against using their power in a self-serving way, etc?" 

b) You make it seem like I'm saying that it's okay to move against people (and e.g. oust them) without justifying yourself later or giving them the chance to reply at some point later when they're in a less threatening position. I think we're on the same page about this: I don't believe that it would be okay to do these things. I wasn't saying that you don't have to answer for what you did. I was just saying that it can, under some circumstances, be okay to act first and then explain yourself to others later and establish yourself as still being trustworthy.

c) About your first point (point "I"), I disagree. I think you're too deontological here. Numbers do count. Being unfair to someone who you think is a bad actor but who turns out not to be has a victim count of one. Letting a bad actor take over the startup/community/world you care about has a victim count of way more than one. I also think it can be absolutely shocking how high this can go (in terms of the various types of harms caused by the bad tail of bad actors) depending on the situation. E.g., think of Epstein or dictators. On top of that, there are indirect bad effects that don't quite fit the name "victim count" but that still weigh heavily, such as distorted epistemics or the destruction of a high-trust environment when it gets invaded by bad actors. Concretely, I feel like when you talk about the importance of the variable "respect towards Altman in the context of how much notice to give him," I'm mostly thinking, sure, it would be nice to be friendly and respectful, but that's a small issue compared to considerations like "if the board is correct, how much could he mobilize opposition against them if he had a longer notice period?" So, I thought three months' notice would be inappropriate given what's asymmetrically at stake on both sides of the equation. (It might change once we factor in optics and how it'll be easier for Altman to mobilize opposition if he can say he was treated unfairly – for some reason, this always works wonders. DARVO is like dark magic. Sure, it sucks for Altman to lose the ~$100 billion company that he built. But an out-of-control CEO recklessly building the most dangerous tech in the world sucks more for way more people in expectation.) In the abstract, I think it would reflect an unfair and inappropriate sense of what matters if respect for a single person who is accused of being a bad actor counted for more than what their many victims would suffer in expectation. And I'm annoyed that it feels like you took the moral high ground here by making it seem like my positions are immoral. But maybe you meant the "shame on yourself" for just one isolated sentence, and not my stance as a whole. I'd find that more reasonable. In any case, I understand now that you probably feel bothered for an analogous reason, namely that I made a remark about how it's naive to be highly charitable or cooperative under circumstances where I think it's no longer appropriate. I want to flag that nothing you wrote in your newest reply seems naive to me, even though I do find it misguided. (The only thing that I thought was maybe naive was the point about three months' notice – though I get why you made it, and I generally really appreciate examples like that about concrete things the board could have done. I just think it would backfire when someone would use those months to make moves against you.)

d) The "shame on yourself" referred to something where you perceived me to be tribal, but I don't really get what that was about. You write "and (c) kind of saying your tribe is the only one with good people in it." This is not at all what I was kind of saying. I was saying my tribe is the only one with people who are "naive in such-and-such specific way" in it, and yeah, that was unfair towards EAs, but then it's not tribal (I self-identify as EA), and I feel like it's okay to use hyperbole this way sometimes to point at something that I perceive to be a bit of a problem in my tribe. In any case, it's weirdly distorting things when you then accuse me of something that only makes sense if you import your frame on what I said. I didn't think of this as being a virtue, so I wasn't claiming that other communities don't also have good people. 

e) Your point "III" reminds me of this essay [LW · GW] by Eliezer titled "Meta-Honesty: Firming Up Honesty Around Its Edge-Cases." Just like Eliezer in that essay explains that there are circumstances where he thinks you can hide info or even deceive, there are circumstances where I think you can move against someone and oust them without advance notice. If a prospective CEO interviews me as a board member, I'm happy to tell them exactly under which circumstances I would give them advance notice (or things like second and third chances) and under which ones I wouldn't. (This is what reminded me of the essay and the dialogues with the Gestapo officer.) (That said, I'd decline the role because I'd probably have overdosed on anxiety medication if I had been in the OpenAI board's position.) 
The circumstances would have to be fairly extreme for me not to give advance warnings or second chances, so if a CEO thinks I'm the sort of person who doesn't have a habit of interpreting lots of things in a black-and-white and uncharitable manner, then they wouldn't have anything to fear if they're planning on behaving well and are at least minimally skilled at trust-building/making themselves/their motives/their reasons for actions transparent.

f) You say: 
"I think it is damaging to the trust people place in board members, to see them act with so little respect or honor. It reduces everyone's faith in one another to see people in powerful positions behave badly." 
I agree that it's damaging, but the way I see it, the problem here is the existence of psychopaths and other types of "bad actors" (or "malefactors"). They are why issues around trust and trustworthiness are sometimes so vexed and complicated. It would be wonderful if such phenotypes didn't exist, but we have to face reality. It doesn't actually help "the social fabric/fabric of trust" if one lends too much trust to people who abuse it to harm others and add more deception. On the contrary, it makes things worse. 

g) I appreciate what you say in the first paragraph of your point IV! I feel the same way about this. (I should probably have said this earlier in my reply, but I'm about to go to sleep and so don't want to re-alphabetize all of the points.) 

comment by kave · 2023-11-22T21:52:27.768Z · LW(p) · GW(p)

I'm uncomfortable reading this comment. I believe you identified as an EA for much of your adult life, and the handle "EA" gives me lots of the bits that distinguish you from the general population. But you take for granted that Lukas' "EA" is meant to exclude you.

By contrast, I would not have felt uncomfortable if you had been claiming adherence to the same standards by your childhood friends, startup culture, or some other clearly third party group.

Replies from: Benito, Linch
comment by Ben Pace (Benito) · 2023-11-23T19:17:07.503Z · LW(p) · GW(p)

Oh, but I don't mean to say that Lukas was excluding me. I mean he was excluding all other people who exist who would also care about honoring partnerships after losing faith in the counter-party, of which there are more than just me, and more than just EAs.

Replies from: kave
comment by kave · 2023-11-23T22:23:37.903Z · LW(p) · GW(p)

OK, I guess the confusion was that it seemed like your counter-evidence to his claim was about how you would behave.

Replies from: Benito
comment by Ben Pace (Benito) · 2023-11-24T05:12:23.871Z · LW(p) · GW(p)

Yep, I can see how that could be confusing in context.

comment by Linch · 2023-11-23T02:09:03.795Z · LW(p) · GW(p)

I think it was clear from context that Lukas' "EAs" was intentionally meant to include Ben, and is also meant as a gentle rebuke re: naivete, not a serious claim re: honesty.

comment by Linch · 2023-11-23T02:07:45.985Z · LW(p) · GW(p)

I feel like you misread Lukas, and his words weren't particularly unclear.

comment by Eli Tyre (elityre) · 2023-11-23T03:54:11.973Z · LW(p) · GW(p)

This is a serious question, playfully asked.

Ben, in private conversation recently, you said that you were against sacredness, because it amounted to deciding that there's something about which you'll ignore tradeoffs (feel free to correct my compression of you here, if you don't endorse it).

So here's a tradeoff: what percentage probability of successfully removing Sam Altman would you trade for doing so honorably? How much does behaving honorably have to cause the success probability to fall before it's not worth it?

Or is this a place where you would say, "no, I defy the tradeoff. I'll take my honor over any differences in success probability." (In which case, I would put forth, you regard honor as sacred.)
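To put the question in rough expected-value terms (my own toy framing; the symbols are just placeholders): acting dishonorably is only worth it if

$\Delta p \cdot V_{\text{removal}} > V_{\text{honor}}$,

where $\Delta p$ is how much behaving honorably lowers the probability of successfully removing him, $V_{\text{removal}}$ is the value of the removal succeeding, and $V_{\text{honor}}$ is everything that behaving honorably protects (reputation, future trust, and so on). Treating honor as sacred then amounts to setting $V_{\text{honor}}$ effectively infinite.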

: P 

Replies from: Benito, johnswentworth
comment by Ben Pace (Benito) · 2023-11-23T19:44:12.607Z · LW(p) · GW(p)

This question seems like a difficult interaction between utilitarianism and virtue ethics...

I think whether to be honorable is in large part a question of strategy. If you don't honor implicit agreements on the few occasions when you really need to win, that's a pretty different strategy from honoring implicit agreements all of the time. So it's not just a question local to a single decision, it's a broader strategic question.

I am sympathetic to consequentialist evaluations of strategies. I am broadly like "If you honor implicit agreements then people will be much more willing to trade with you and give you major responsibilities, and so going back on them on one occasion generally strikes down a lot of ways you might be able to affect the world." It's not just about this decision, but about an overall comparison of the costs and benefits of different kinds of strategies. There are many strategies one can play.

I could potentially make up some fake numbers to give a sense of how different decisions change which strategies to run (e.g. people who play more to the letter than the spirit of agreements, people who will always act selfishly if the payoff is at least going to 2x their wealth, people who care about their counter-parties ending up okay, people who don't give a damn about their counter-parties ending up okay, etc.). I roughly think much more honest, open, straightforward, pro-social, and simple strategies are widely trusted, are better for keeping you and your allies sane, and are more effective on the particular issues you care about, but are less effective at getting generic un-scoped power. I don't much care about the latter relative to the first three, so such strategies seem to me way better at achieving my goals.

I think it's extremely costly for trust to entirely change strategies during a single high-stakes decision, so I don't think it makes sense to re-evaluate it during the middle of the decision based on a simple threshold. (There could be observations that would make me realize during a high-stakes thing that I had been extremely confused about what game we were even playing, and then I'd change, but that doesn't fit as an answer to your question, which is about a simple probability/utility tradeoff.) It's possible that your actions on a board like this are overwhelmingly the most important choices you'll make and should determine your overall strategy, and you should really think through that ahead of time and let your actions show what strategy you're playing — well before agreeing to be on such a board.

Hopefully that explained how I think about the tradeoff you asked, while not giving specific numbers. I'm willing to answer more on this.

(Also, a minor correction: I said I was considering broadly dis-endorsing the sacred, for that reason. It seems attractive to me as an orientation to the world but I'm pretty sure I didn't say this was my resolute position.)

Replies from: elityre
comment by Eli Tyre (elityre) · 2023-11-24T03:36:07.044Z · LW(p) · GW(p)

If you don't honor implicit agreements on the few occasions when you really need to win, it's a pretty different strategy from whether you honor implicit agreements all of the time.

Right. I strongly agree. 

I think this is at least part of the core bit of law that underlies sacredness. Having some things that are sacred to you is a way to implement this kind of reliable-across-all-worlds policy, even when the local incentives might tempt you to violate that policy for local benefit.

Replies from: Benito
comment by Ben Pace (Benito) · 2023-11-24T04:47:09.454Z · LW(p) · GW(p)

Eh, I prefer to understand why the rules exist rather than blindly commit to them. Similarly, the Naskapi hunters used divination as a method of ensuring they'd randomize their hunting spots, and I think it's better to understand why you're doing it, rather than doing it because you falsely believe divination actually works.

Replies from: gwern, elityre
comment by gwern · 2023-11-27T03:00:18.416Z · LW(p) · GW(p)

Minor point: the Naskapi hunters didn't actually do that. That was speculation which was never verified, runs counter to a lot of facts, and in fact, may not have been about aboriginal hunters at all but actually inspired by the author's then-highly-classified experiences in submarine warfare in WWII in the Battle of the Atlantic. (If you ever thought to yourself, 'wow, that Eskimo story sounds like an amazingly clear example of mixed-strategies from game theory'...) See some anthropologist criticism & my commentary on the WWII part at https://gwern.net/doc/sociology/index#vollweiler-sanchez-1983-section

comment by Eli Tyre (elityre) · 2023-11-24T05:52:05.384Z · LW(p) · GW(p)

I certainly don't disagree with understanding the structure of good strategies!

comment by johnswentworth · 2023-11-23T04:56:32.839Z · LW(p) · GW(p)

(One could reasonably say that the upside of removing Sam Altman from OpenAI is not high enough to be worth dishonor over the matter, in which case percent chance of success doesn't matter that much.

Indeed, success probabilities in practice only range over 1-2 orders of magnitude before they stop being worth tracking at all, so probably the value one assigns to removing Sam Altman at all dominates the whole question, and success probabilities just aren't that relevant.)

Replies from: elityre
comment by Eli Tyre (elityre) · 2023-11-23T18:17:01.449Z · LW(p) · GW(p)

Ok. Then you can least-convenient-possible-world [LW · GW] the question. What's the version of the situation where removing the guy is important enough that the success probabilities start mattering in the calculus?

Also, to be clear, I think my answer to this is "It's just fine for some things to be sacred. Especially features like honor or honesty, whose strength comes from them being reliable under all circumstances (or at least which have strength in proportion to the circumstances in which they hold)."

Replies from: Ruby
comment by Ruby · 2023-11-23T21:09:41.341Z · LW(p) · GW(p)

My intuition (not rigorous) is that there are multiple levels in the consequentialist/deontological/consequentialist dealio.

I believe that unconditional friendship is approximately something one can enter into, but one enters into it for contingent reasons (perhaps in a Newcomb-like way – I'll unconditionally be your friend because I'm betting that you'll unconditionally be my friend). Your ability to credibly enter such relationships (at least in my conception of them) depends on you not starting to be more "conditional" because you doubt that the other person is still holding up their end. This, I think, is related to not being a "fair-weather" friend. I continue to be your friend even when it's not fun (you're sick, need taking care of, whatever), even if I wouldn't have become your friend to do that. And vice versa. Kind of a mutual insurance policy.

The same could go for contracts, agreements, and other collaborations. In a Newcomb-like way, I commit to being honest, being cooperative, etc., to a very high degree even in the face of doubts about you. (Maybe you stop by the time someone is threatening your family; not sure what Ben et al. think about that.) But the fact that I entered into this commitment was based on the probabilities I assigned to your behavior at the start.
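A minimal sketch of that entry-time calculation (all numbers invented purely for illustration; this isn't a claim about any real situation):

```python
# Toy model of entering an "unconditional" commitment for contingent reasons.
# You evaluate only once, at entry time, based on your credence that the other
# party is also playing the unconditional strategy; once in, you don't re-evaluate.

p_other_unconditional = 0.8   # invented credence at entry time
value_if_mutual = 10.0        # long-run value if both sides stay unconditional
cost_if_exploited = -4.0      # cost if the other party turns out to be fair-weather
value_of_staying_out = 0.0

ev_enter = (p_other_unconditional * value_if_mutual
            + (1 - p_other_unconditional) * cost_if_exploited)

print(ev_enter, ev_enter > value_of_staying_out)
# -> 7.2 True: commit, and then the commitment itself no longer gets re-derived
#    from later doubts (that's the Newcomb-like part).
```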

comment by peterbarnett · 2023-11-21T22:27:02.245Z · LW(p) · GW(p)

I think this misses one of the main outcomes I'm worried about, which is if Sam comes back as CEO and the board is replaced by less safety-motivated people. This currently seems likely (Manifold at 75% Sam returning, at time of posting). 

You could see this as evidence that the board never had much power, and so them leaving doesn't actually change anything. But it seems like they (probably) made a bunch of errors, and if they hadn't then they would have retained influence to use to steer the org in a good direction. 

(It is also still super unclear wtf is going on, maybe the board acted in a reasonable way, and can't say for legal (??) reasons.)

Replies from: johnswentworth, tremmor19
comment by johnswentworth · 2023-11-21T22:38:38.106Z · LW(p) · GW(p)

You could see this as evidence that the board never had much power, and so them leaving doesn't actually change anything.

In the world where Sam Altman comes back as CEO and the board is replaced by less safety-motivated people (which I do not currently expect on an inside view), that would indeed be my interpretation of events.

Replies from: zrkrlc
comment by tremmor19 · 2023-11-22T11:40:25.969Z · LW(p) · GW(p)

Yeah, I'm with you on it being a false dichotomy: some people are saying that because the board lost when they tried to use their legal power, it means they never had any power to begin with, so it doesn't matter. It seems plausible they had some level of power, but it was fragile and subject to failure if used carelessly in the wrong circumstances. Like, well, most real-world political power.

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2023-11-22T11:51:47.665Z · LW(p) · GW(p)

I'd model it as them having had an amount of power equal to their on-paper power times Sam's estimate of the probability that they could successfully wield it. Being perceived as having social power is what having social power means, after all. I doubt he'd been certain they'd lose in a conflict like this, so he would've been at least a bit wary of starting it, i.e. would've shied away from actions that the board would dislike.
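In symbols (my gloss of the model above; the two factors are just the quantities described in the previous sentence):

$\text{power}_{\text{effective}} = \text{power}_{\text{on-paper}} \times p(\text{the board can successfully wield it, as estimated by Sam})$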

Now it's established knowledge that they have no real power, and so they truly don't have it anymore, and so Sam is free to do whatever he wants and he at last knows it.

comment by jacobjacob · 2023-11-21T19:44:32.066Z · LW(p) · GW(p)

In the poll [LW · GW] most people (31) disagreed with the claim John is defending here, but I'm tagging the additional few (3) who agreed with it: @Charlie Steiner [LW · GW] @Oliver Sourbut [LW · GW] @Thane Ruthenis [LW · GW]

Interested to hear your guys' reasons, in addition to John's above! 

Replies from: Oliver Sourbut, Charlie Steiner, Oliver Sourbut
comment by Oliver Sourbut · 2023-11-22T08:48:47.382Z · LW(p) · GW(p)

Quick dump.

Impressions

  • Having met Sam (only once) it's clear he's a slick operator and is willing to (at least) distort facts to serve a narrative
  • I tentatively think Sam is one of the 'immortality or die trying' crowd (which is maybe acceptable for yourself but not when gambling with everything else too)
  • Story from OpenAI leadership re racing has always struck me as suspicious rationalisation (esp re China)
    • 'You aren't stuck in traffic/race; you are the traffic/race'
  • A few interactions with OpenAI folks weakly suggests even the safety-conscious ones aren't thinking that clearly about safety
  • I've been through org shakeups and it tends to take time to get things ticking over properly again, maybe months (big spread)

Assumptions

  • I admit I wasn't expecting Sam to come back. If it sticks, this basically reverses my assessment!
  • I've been assuming that the apparent slavish loyalty of the mass of employees is mostly a fog of war illusion/artefact
    • the most suckered might follow, but I'm tentatively modelling those as the least competent
    • crux: there aren't large capabilities insights that most employees know
    • looks like we won't get a chance to find out (and nor will they!)

Other

  • Note that if the board gets fired, this is bad evidentially for the whole 'corrigibly aim a profit corp' attempt
    • it turns out the profit corp was in charge after all
    • it's also bad precedent, which can make a difference for future such things
    • but it presumably doesn't change much in terms of actual OpenAI actions
  • I buy the 'bad for EA PR' thing, and like John I'm unsure how impactful that actually is
    • I think I'm less dismissive of this than John
    • in particular it probably shortened the fuse on tribalism/politicisation catching up (irrespective of EA in particular)
    • but I've some faith that tribalism won't entirely win the day and ideas can be discussed on their merit

New news

Anyway, I certainly wasn't (and ain't) sure what's happening, but I tentatively expected that if Sam were replaced, that'd a) remove a particular source of racingness b) slow things through generic shakeup faff c) set a precedent for reorienting a profit corp for safety reasons. These were enough to make it look net good.

It looks like Sam is coming back, which isn't a massive surprise, though not what I was expecting. So, OpenAI's direction maybe not changed much. In this branch, the EA PR thing maybe ends up dominating after all. Hard to say what the effect is on Sam's personal brand; lots still to cash out, I expect. It could enhance his charisma, or he might have spent something which is hard to get back.

Based on new news, I softly reverse my position [EDIT on the all-considered goodness-of-outcome, mainly for the PR and 'shortened the fuse' reasons].

Incidentally, I think the way things played out is more evidence for the underlying views [LW · GW] that made it a good idea to (try to) oust Sam (both his direct actions, and the behaviour of the people around him, are evidence that he's effective at manipulation and not especially safety-oriented). The weird groupthink (outward) from OpenAI employees is also a sign of quite damaged collective epistemics, which is sad (but informative/useful) evidence. But hey ho.

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2023-11-22T12:00:03.175Z · LW(p) · GW(p)

Hard to say what the effect is on Sam's personal brand; lots to still cash out I expect. It could enhance his charisma, or he might have spent something which is hard to get back.

I think my model of cycles of power expansion and consolidation [LW(p) · GW(p)] is applicable here:

When you try to get something unusual done, you "stake" some amount of your political capital on this. If you win, you "expand" the horizon of the socially acceptable actions available to you. You start being viewed as someone who can get away with doing things like that, you get an in with more powerful people, people are more tolerant of you engaging in more disruptive action.

But if you try to immediately go for the next, even bigger move, you'll probably fail. You need buy-in from other powerful actors, some of whom have probably only now become willing to listen to you and entertain your more extreme ideas. You engage in politicking with them, arguing with them, feeding them ideas, establishing your increased influence and stacking the deck in your favor. You consolidate your power.

I'd model what happened as Sam successfully expanding his power. He's staked some amount of his political capital on the counter-revolution, and he (currently appears to have) won, which means the political-capital investment was successful and will be repaid to him with interest. He'll need to spend some time consolidating it, but what happened will only benefit him in the long run.

Replies from: Oliver Sourbut
comment by Oliver Sourbut · 2023-11-22T12:30:45.815Z · LW(p) · GW(p)

Yep, this is a nicely explained and more detailed version of 'it could enhance his charisma' and is my model there too. But there's also a lot of latent (to us) stuff that might have been traded away, and furthermore a bunch of public 'potentially reputation-tarnishing' fallout which might swing it the other way.

comment by Charlie Steiner · 2023-11-22T09:04:02.111Z · LW(p) · GW(p)

I guess the big questions for me were "relative to what?", "did the board have good-faith reasons?", and "will the board win a pyrrhic victory or get totally routed?"

At the time of answering I thought the last two answers were: the board probably had plausible reasons, and they would probably win a pyrrhic victory.

Both of these are getting less likely, the second faster than the first. So I think it's shaping up that I was wrong and this will end up net-negative.

Relative to what? I think I mean "relative to the board completely checking out at their jobs," not "relative to the board doing their job masterfully," which I think would be nice to hope for but is bad to compare to.

comment by Oliver Sourbut · 2023-11-22T10:27:13.976Z · LW(p) · GW(p)

Thane has a good take in this comment [LW(p) · GW(p)]. I'm pointing at the difference between 'evidential' and 'causal' (?) updates in my answer but Thane does it more clearly.

comment by Thane Ruthenis · 2023-11-21T20:59:35.625Z · LW(p) · GW(p)

And on top of that, my not-very-informed-impression-from-a-distance is that [Sam]'s more a smile-and-rub-elbows guy than an actual technical manager

I agree, but I'm not sure that disqualifies him from carving out a productive niche at Microsoft. He appears to be a good negotiator, so if he goes all-in spending his political capital to ensure his subsidiary isn't crippled by bureaucracy, he has a good chance of achieving it.

The questions are (1) whether he'd realize he needs to do that, and (2) whether he'd care to do that, versus just negotiating for more personal power and trying to climb to Microsoft CEO or whatever.

  • (1) depends on whether he's actually generally competent (as in, "he's able to quickly generalize his competence to domains he's never navigated before"), as opposed to competent-at-making-himself-appear-competent.
    • I've never paid much attention to him before, so no idea on him specifically. On priors, though, people with his profile are usually the latter type, not the former.
  • (2) depends on how much he's actually an AGI believer vs. standard power-maximizer who'd, up to now, just been in a position where appearing to be an AGI believer was aligned with maximizing power.
    • The current events seem to down-weight "he's actually an AGI believer", so that's good at least.

... Alright, having written this out, I've now somewhat updated towards "Microsoft will strangle OpenAI". Cool.

I've seen/heard a bunch of people in the LW-o-sphere saying that the OpenAI corporate drama this past weekend was clearly bad. And I'm not really sure why people think that?

In addition to what's been discussed, I think there's some amount of people conflating the updates they made based on what happened with their updates based on what the events revealed.

E.g., prior to the current thing, there was some uncertainty regarding "does Sam Altman actually take AI risk seriously, even if he has a galaxy-brained take on it, as opposed to being motivated by profit and being pretty good at paying lip service to safety?" and "would OpenAI's governance structure perfectly work to rein in profit motives?" and such. I didn't have much probability allocated to optimism here, and I expect a lot of people didn't, but there likely was a fair amount of hopefulness about such things.

Now it's all been dashed. Turns out, why, profit motives and realpolitik don't give up in the face of a cleverly-designed local governance structure, and Sam appears to be a competent power-maximizer, not a competent power-maximizer who's also secretly a good guy.

All of that was true a week ago, we only learned about it now, but the two types of updates are pretty easy to conflate if you're not being careful.

comment by orthonormal · 2023-11-22T00:33:02.373Z · LW(p) · GW(p)

I mean, I don't really care how much e.g. Facebook AI thinks they're racing right now. They're not in the game at this point.

The race dynamics are not just about who's leading. FB is 1-2 years behind (looking at LLM metrics), and it doesn't seem like they're getting further behind OpenAI/Anthropic with each generation, so I expect that the lag at the end will be at most a few years.

That means that if Facebook is unconstrained, the leading labs have only that much time to slow down for safety (or prepare a pivotal act) as they approach AGI before Facebook gets there with total recklessness.

If Microsoft!OpenAI lags the new leaders by less than FB (and I think that's likely to be the case), that shortens the safety window further.

I suspect my actual crux with you is your belief (correct me if I'm misinterpreting you) that your research program will solve alignment and that it will not take much of a safety window for the leading lab to incorporate the solution, and therefore the only thing that matters is finishing the solution and getting the leading lab on board. It would be very nice if you were right, but I put a low probability on it.

comment by Ruby · 2023-11-23T00:47:47.083Z · LW(p) · GW(p)

I was going to write stuff about integrity, and there's stuff to that, but the thing that is striking me most right now is that the whole effort seemed very incompetent and naive. And that's upsetting.

I am now feeling uncertain about the incompetence and naivety of it. Whether this was the best move possible that failed to work out, or the best move possible that actually did get a good outcome, or a total blunder, is determined by info I don't have.

I have some feeling that they were playing against a higher-level political player, which both makes it hard and also means they needed to account for that? Their own level might be 80th+ percentile in the reference class of executive/board-type people, but still lower than Sam's.

The piece that does seem most like they really made a mistake was trying to appoint an interim CEO (Mira) who didn't want the role. It seems like before doing that, you should be confident the person wants it.

I've seen it raised that the board might find the outcome to be positive (the board stays independent even if current members leave?). If that's true, that does change the evaluation of their competence. It feels hard for me to confidently judge, though my gut sense is Sam got more of what he wanted/common knowledge of his sway than the others did.

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2023-11-23T01:24:07.848Z · LW(p) · GW(p)

The initial naive blunder was putting Sam Altman in the CEO position to begin with. It seems like it was predictable-in-advance (from e.g. Paul Graham's comments [LW(p) · GW(p)] from years and years ago) that he's not the sort of person to accept being fired, rather than mounting a realpolitik-based counteroffensive, and that he would be really good at the counteroffensive. Deciding to hire him essentially predestined everything that just happened; it was inviting the fox into the henhouse. OpenAI's governance controls might have worked if the person subjected to them were not specifically the sort of person Sam is.

How was the decision to hire him made, and under what circumstances?

What needs to happen for this sort of mistake not to be repeated?

comment by interstice · 2023-11-21T21:13:31.074Z · LW(p) · GW(p)

Missing from this discussion is the possibility that Sam might be reinstated as CEO, which seems like a live option at this point. If that happens, I think it's likely that the decision to fire him was a mistake.

comment by [deleted] · 2023-11-21T19:43:18.559Z · LW(p) · GW(p)

I think this discussion is too narrow and focused on just Sama and Microsoft.

The global market "wants" AGI, ASI, human obsolescence*.

The consequences of this event accelerate that:

  1. Case 1: Microsoft bureaucracy drags Sama's team's productivity down to zero. In this case, OpenAI doesn't develop a GPT-5, and Microsoft doesn't release a better model either. This opens up the market niche for the next competitor – a productive startup – to develop the model, obviously assisted by former OpenAI employees who bring all the IP with them, and all the money and business flow to the startup. Brief delay in AGI; the new frontrunner could be a firm with no EA influence and no safety culture beyond the legal minimum.

  2. Case 2: Microsoft pours money into Sama's group, Microsoft releases increasingly powerful models, and Bing and Windows gain the market share they lost. Models become powerful enough that some real-world incidents start to happen.

Conclusion: both outcomes benefit acceleration.

I think OpenAI's board understands the absolute requirement to remain with the best model in order to stay relevant, hence the panic merger proposal with Anthropic.

*As an emergent property of the rules, aka Moloch. Note that chaos makes optimizing emergent outcomes more probable. See RNA experiments where thermal noise will cause RNA to self-organize into replicators.

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2023-11-21T20:38:13.837Z · LW(p) · GW(p)

While you have a point, I think this model might have too much faith in the Efficient Market Hypothesis.

It's true that the market "wants" human obsolescence, in the sense that companies that sell it would earn a ton of money. But if DeepMind and Anthropic went bankrupt tomorrow, it's not obvious that anyone would actually step up to fill the niche left by them.

Market failures are common, and the major AI Labs arguably sprang up because of ideological motives, not profit motives – because of people buying into EA-style AI risk and deciding to go about solving it in a galaxy-brained manner.

They're something like high-risk high-reward startups. And if a speculative startup fails, it's not at all obvious that a replacement startup will spring up to fill its niche the very next day, even if the idea were solid. (Or, another analogy, they're something like institutions concerned with fundamental theoretical research, and those famously need to be funded by governments, not the market.)

On the contrary, the reference point for how AGI-research-based-on-profit-motives goes is FAIR, and FAIR is lagging behind. The market, left to its own devices, will push for finding LLM applications or scaling them in the dumbest manner possible or such; not for racing for AGI.

I don't have so little faith in the market as to completely discount your model, but I wouldn't assign it 100% either.

Replies from: None
comment by [deleted] · 2023-11-21T21:49:39.551Z · LW(p) · GW(p)

Were the AI startups high-risk and dependent on investor largesse to exist, then absolutely I would agree with your model.

But : https://sacra.com/c/openai/#:~:text=OpenAI has surpassed %241.3B,at the end of 2022.

Assume OpenAI was collecting a 10 percent profit margin (the other 90 percent paid for compute). Allegedly they burned about $540 million in 2022, when GPT-4 was developed. Call it $1 billion total cost for a GPT-4 (compute + staff compensation).

Then $130 million in annual net profit on a $1 billion investment is a 13 percent ROI. In terms of "monetary free energy" that's net output. Large businesses exist that run on less margin.
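A quick back-of-the-envelope check of those numbers (the revenue, margin, and cost figures are the rough assumptions above, not reported financials):

```python
# Sanity check of the ROI estimate, using the assumed round numbers from above.
annual_revenue = 1.3e9    # ~$1.3B annualized revenue (per the Sacra link)
profit_margin = 0.10      # assume a 10 percent margin; the rest pays for compute
model_cost = 1.0e9        # assumed all-in cost of developing GPT-4

annual_profit = annual_revenue * profit_margin   # $130M
roi = annual_profit / model_cost                 # 0.13

print(f"annual profit: ${annual_profit/1e6:.0f}M, ROI: {roi:.0%}")
# -> annual profit: $130M, ROI: 13%
```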

I am not specced in economics or finance, but that looks like a sustainable business, and it's obviously self-amplifying. Assuming the nascent general AI* industry has easy short-term growth potential (as in, companies can rent access to a good model and collect many more billions), then it's possibly self-sustaining. Even without outside investment, some of those profits would get invested into the next model, and so on.

You can also think of a second form of "revenue" as investor hype. Fresh hype is created with each major improvement in the model that investors can perceive, and they bring more money each round.

While yes, the EMH is imperfect,* investors clearly see enormous profit in the short term from general AI. This is the main thing that will drive the general AI industry to whatever is technically achievable: tens of billions in outside money. And yes, blowing up OpenAI slows this down... but the investors who were willing to give OpenAI tens of billions in a few weeks still have their money. Where's it going to go next?

  • By general AI I mean a model that is general-purpose, with many customers, without needing expensive modifications. Different from AGI, which also means human-level capabilities. General AI appears, from the data above, to be financially self-sustaining even if it is not human-level.

*As other LessWrong posts point out, downside risks like nuclear wars or markets ceasing to exist because a rampant ASI ate everything would not be "priced in".

Replies from: Thane Ruthenis, johnswentworth
comment by Thane Ruthenis · 2023-11-21T22:40:17.618Z · LW(p) · GW(p)

I am not specced in economics or finance but that looks like a sustainable business, and it's obviously self amplifying

"Make a giant LLM and deploy it" is a self-sustaining business, yes, and if all major AI Labs died tomorrow a plethora of companies filling the niche of "make a giant LLM and deploy it" would spring up, yes.

"Re-invest revenue into making an even larger LLM" is a sensible company policy, as well.

But is "have a roadmap to AGI and invest into research that brings your models closer to it, even if that doesn't immediately translate into revenue" a self-sustaining business model? I'm much less confident on that. It's possible that the ideas of "AGI = money" have already propagated enough that profit-oriented non-imaginative business people would decide to spend their revenue on that. But that's not obvious to me.

I expect the non-ideologically-motivated replacements for the current major AI labs to just have no idea what "racing to AGI" even means, in terms of "what research directions to pursue" as opposed to "what utterances to emit" [LW · GW]. The current AI industry as a whole is pretty bad at it as-is [? · GW], but the major AI labs explicitly have some vision of what it physically means. I don't expect the replacements for them that the market would generate on its own to have even that much of a sense of direction.

Again, it's possible that it's no longer true, that the ideas have propagated enough that some of the "native" replacement companies would competently race as well. But it's not an open-and-shut case, I think.

Replies from: None, Vladimir_Nesov
comment by [deleted] · 2023-11-21T23:37:31.029Z · LW(p) · GW(p)

So your model is that people can make big LLMs, and the innovation from OpenAI and from open source will eventually all be in one large model, aka "GPT-4.1". But each LLM shop, while free of encumbrances and free to seek maximum profit, would not have the necessary concentration of money and talent in one place to develop AGI.

Instead, they would simply keep making smaller deltas to their product, something a less talented and GPU-poorer crew could do, and capabilities would be stuck in a local minimum.

So you believe that this would either push back AGI several years (eventually the staff at these smaller shops would skill up from experience, and as compute gets cheaper they would eventually have what $100B of compute will buy in 2024) or possibly longer, if there is no smooth path of small incremental steps from GPT-4.1 to AGI.

I will add one comment to this: it's not actually a threshold of "GPT-4.1 to AGI". Assuming you believe RSI will work, you need "a good enough seed model plus sufficient compute to train and benchmark thousands of automatically generated AGI candidates".

GPT-4.1 plus a reinforcement learning element might be enough for the "seed AI".

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2023-11-21T23:42:46.499Z · LW(p) · GW(p)

That summary sounds right, yep!

GPT-4.1 plus a reinforcement learning element might be enough for the "seed AI".

Except that. It might, but I don't think that's particularly likely.

comment by Vladimir_Nesov · 2023-11-21T22:51:09.344Z · LW(p) · GW(p)

Giant LLMs are as useful as they are agentic (with ability to remain aware of a specific large body of data and keep usefully chipping away at a task), which doesn't seem particularly different from AGI as a direction (at least while it hasn't yet been walked far enough to tell the difference). The distinction is in AGI being a particular crucial threshold of capability [LW(p) · GW(p)] that local pursuit of better agentic LLMs will ignore until it's crossed.

comment by johnswentworth · 2023-11-21T21:58:53.741Z · LW(p) · GW(p)

Assume OpenAI was collecting a 10 percent profit margin (the other 90 percent paid for compute).

Wait, does OpenAI make a net profit on marginal users? I had assumed probably not, although it's not particularly central to any of my models right at the moment.

Replies from: None
comment by [deleted] · 2023-11-21T22:21:28.420Z · LW(p) · GW(p)

As Zvi mentioned in one of the roundups, the conventional wisdom for entering a new monopolistic tech niche is to grow as fast as possible.

So it's likely that OpenAI loses money per user. GitHub Copilot allegedly costs $40 in compute per $20-a-month subscriber.

So yes, you are right, but no, it doesn't matter. This is because there are other variables. The cost of compute is driven up by outside investment. If somehow dynamiting OpenAI causes all the outside investors to go invest somewhere else – sort of like the hype cycles for NFTs or crypto – the cost of compute would drop.

For example, Nvidia is estimated to pay $3,000 to build each H100. If Nvidia charged $5,000 a card and stopped charging a 20 percent software license fee, that would essentially cut the compute cost by more than half*, making current AI models at current prices more than profitable.

Nvidia would do this in the hypothetical world of "investors get bored and another AI winter begins." This neglects Nvidia reducing their costs and developing a cheaper-to-build card per unit of LLM performance, which they obviously are doing.

*Quick and dirty sanity check: assuming 50 percent utilization (the GPU is bounded by memory I/O), it would use about $33,000 in electricity over 5 years and currently costs about $50,000 at current prices: $25k is list price, $25k is license fee. Were Nvidia to simply charge a more modest margin, the all-in cost would drop from roughly $83k to roughly $38k. (Data center electricity is probably cheaper than 13 cents/kWh, but there are costs for backup power and other systems.)
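Putting that sanity check into numbers (all figures are the rough assumptions from the footnote above, not actual Nvidia or data-center pricing):

```python
# Rough per-H100 cost over 5 years, under the footnote's assumed figures.
electricity_5yr = 33_000   # assumed electricity at ~50% utilization over 5 years
list_price = 25_000        # assumed current list price per card
license_fee = 25_000       # assumed software license fee per card
modest_price = 5_000       # hypothetical price at a modest margin over ~$3k build cost

current_all_in = list_price + license_fee + electricity_5yr   # 83,000
modest_all_in = modest_price + electricity_5yr                # 38,000

print(current_all_in, modest_all_in, round(modest_all_in / current_all_in, 2))
# -> 83000 38000 0.46  (i.e. the all-in cost falls by more than half)
```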

Conclusion: what's different now is that general AI is bringing in enough revenue to be a self-sustaining business. It's not an industry that can fold and go dormant like failed tech startups, where the product or service they developed ceased to be available anywhere.

The time to blow up OpenAI was prior to the release of ChatGPT.

comment by Simon Fischer (SimonF) · 2023-11-21T22:06:20.971Z · LW(p) · GW(p)

Microsoft is the sort of corporate bureaucracy where dynamic orgs/founders/researchers go to die. My median expectation is that whatever former OpenAI group ends up there will be far less productive than they were at OpenAI.


I'm a bit sceptical of that. You gave some reasonable arguments, but all of this should be known to Sam Altman, and he still chose to accept Microsoft's offer instead of founding his own org (I'm assuming he would easily be able to raise a lot of money). So, given that "how productive are the former OpenAI folks at Microsoft?" is the crux of the argument, it seems that recent events are good news iff Sam Altman made a big mistake with that decision.

Replies from: Thane Ruthenis
comment by Thane Ruthenis · 2023-11-21T23:02:35.768Z · LW(p) · GW(p)

recent events are good news iff Sam Altman made a big mistake with that decision

Or if Sam Altman isn't actually primarily motivated by the desire to build an AGI, as opposed to standard power-/profit-maximization motives. Accelerationists are now touting him as their messiah, and he'd obviously always been happy to generate hype about OpenAI's business vision. But it's not necessarily the case that it translates into him actually believing, at the gut level, that the best way to maximize prosperity/power is to build an AGI.

He may realize that an exodus into Microsoft would cripple OpenAI talent's ability to be productive, and do it anyway, because it offers him personally better political opportunities for growth.

It doesn't even have to be a dichotomy of "total AGI believer" vs "total simulacrum-level-4 [LW · GW] power-maximizer". As long as myopic political motives have a significant-enough stake in his thinking, they may lead one astray.

"Doomers vs. Accelerationists" is one frame on this conflict, but it may not be the dominant one.

"Short-sighted self-advancement vs. Long-term vision" is another, and a more fundamental one. Moloch favours capabilities over alignment, so it usually hands the victory to the accelerationists. But that only goes inasmuch as accelerationists' motives coincide with short-sighted power-maximization. The moment there's an even shorter-sighted way for things to go, an even lower energy-state to fall into, Moloch would cast capability-pursuit aside.

The current events may (or may not!) be an instance of that.

Replies from: o-o
comment by O O (o-o) · 2023-11-24T00:54:52.068Z · LW(p) · GW(p)

He was involved in the rationalist circle for a long time, iirc. He said social status would still matter in a post-AGI world, so I suspect his true goal is either being known forever as the person who brought about AGI (status) or something immortality-related.

comment by the gears to ascension (lahwran) · 2023-11-21T19:53:01.564Z · LW(p) · GW(p)

microsoft has put out some pretty impressive papers lately. not sure how that bodes for their overall productivity, of course.

comment by Feel_Love · 2023-11-21T20:42:51.308Z · LW(p) · GW(p)

Thanks for the good discussion.

I could equally see these events leading to AI capability development speeding or slowing. Too little is known about the operational status quo that has been interrupted for me to imagine counterfactuals at the company level.

But that very lack of information gives me hope that the overall PR impact of this may (counterintuitively) incline the Overton window toward more caution.

"The board should have given the press more dirt to justify this action!" makes sense as an initial response. When this all sinks in, what will people think of Effective Altruism then?! ...They won't. People don't think much about EA or care what that is. But the common person does think more and more about AI these days. And due to the lack of detail around why Altman was removed, the takeaway from this story cannot be "Sam is alleged to have XYZ'd. Am I pro- or anti-XYZ?" Instead, the media is forced to frame the news in broad terms of profit incentives versus AI safety measures. That's a topic that many people outside of this niche community may now be considering for the first time.

Ideally, this could be like a Sydney Bing moment that gets people paying attention without causing too much direct damage.

(The worst case: Things are playing out exactly as the AI told Sam they would before his ouster. Speculating about agents with access to cutting-edge AI may soon be futile.)

comment by jacobjacob · 2023-11-21T19:03:28.406Z · LW(p) · GW(p)

One of my takeaways from how the negotiations went is that sama seems extremely concerned with securing access to lots of compute, and that the person who ultimately got their way was the person who sat on the compute.

The "sama running Microsoft" idea seems a bit magical to me. Surely the realpolitik update here should be: power lies in the hands of those with legal voting power, and those controlling the compute. Sama has neither of those things at Microsoft. If he can be fired by a board most people have never heard of, then for sure he can get fired by the CEO of Microsoft. 

People seem to think he is somehow a linchpin of building AGI. Remind me... how many of OpenAI's key papers did he coauthor? Paul Graham says if you dropped him onto an island of cannibals he would be king in 5 years. Seems plausible. Paul Graham did not say he would've figured out how to engineer a raft good enough to get him out of there. If there were any Manifold markets on "Sama is the linchpin to building AGI", I would short them for sure.

We already have strong suspicion from the open letter vote counts that there's a personality cult around Sama at OpenAI (no democratic election ever ends with a vote of 97% in favor). It also makes sense that people in the LessWrong sphere would view AGI as the central thing to the future of the world and on everyone's minds, and thus fall into the trap of also viewing Sama as the most important thing at Microsoft. (Question to ask yourself about such a belief: who does it benefit? And is that beneficiary also a powerful agent deliberately attempting to shape narratives to their own benefit?)

Satya Nadella might have a very different perspective than that, on what's important for Microsoft and who's running it.

Replies from: Vladimir_Nesov, elityre, ErickBall
comment by Vladimir_Nesov · 2023-11-21T19:21:17.896Z · LW(p) · GW(p)

People seem to think he is somehow a linchpin of building AGI. Remind me... how many of OpenAI's key papers did he coauthor?

Altman's relevant superpowers are expertise at scaling orgs, and AI-related personal fame and connections that make him a Schelling point for AI talent. So wherever he ends up, he can get a world-class team and then competently scale its operations. The personality cult is not specious; it's self-fulfilling in practical application.

comment by Eli Tyre (elityre) · 2023-11-21T20:15:07.526Z · LW(p) · GW(p)

If he can be fired by a board most people have never heard of, then for sure he can get fired by the CEO of Microsoft. 

This seems right in principle, but I think he's way less likely to be fired by anyone at Microsoft, because they can play a positive-sum political game together, which was (apparently) less true of Sam and the OpenAI board.

Replies from: Ruby
comment by Ruby · 2023-11-21T20:29:45.007Z · LW(p) · GW(p)

If he can lead an exodus from OpenAI to Microsoft, he can lead one from Microsoft to somewhere else.

comment by ErickBall · 2023-11-21T21:38:05.904Z · LW(p) · GW(p)

Here's a market; I'm not sure how to define "linchpin", but we can at least predict whether he'll be part of it.

https://manifold.markets/ErickBall/will-the-first-agi-be-built-by-sam?r=RXJpY2tCYWxs

comment by Noosphere89 (sharmake-farah) · 2023-11-23T18:37:55.626Z · LW(p) · GW(p)

I tend to view the events of OpenAI's firing of Sam Altman much more ambiguously than others, and IMO, it probably balances out to nothing in the end, so I don't care as much as some other people here.

To respond more substantially:

From johnswentworth:

Here's the high-gloss version of my take. The main outcomes are:

  • The leadership who were relatively most focused on racing to AGI and least focused on safety are moving from OpenAI to Microsoft. Lots of employees who are relatively more interested in racing to AGI than in safety will probably follow.
  • Microsoft is the sort of corporate bureaucracy where dynamic orgs/founders/researchers go to die. My median expectation is that whatever former OpenAI group ends up there will be far less productive than they were at OpenAI.
  • It's an open question whether OpenAI will stick around at all.
    • Insofar as they do, they're much less likely to push state-of-the-art in capabilities, and much more likely to focus on safety research.
    • Insofar as they shut down, the main net result will be a bunch of people who were relatively more interested in racing to AGI and less focused on safety moving to Microsoft, which is great.

I agree with a rough version of the claim that they might be absorbed into Microsoft, thus making them less likely to advance capabilities, and this is plausibly at least somewhat important.

My main disagreement here is that I don't think capabilities advances matter as much for AI doom as LWers think, and that slowing them down may even be counterproductive, depending on the circumstances. This probably comes down to very different views on things like how strong the priors need to be, etc.

From johnswentworth:

There's apparently been a lot of EA-hate on twitter as a result. I personally expect this to matter very little, if at all, in the long run, but I'd expect it to be extremely disproportionately salient to rationalists/EAs/alignment folk.

I actually think this partially matters. The tricky part is that, on the one hand, Twitter can be important, but on the other, I agree that people here overrate it a lot.

My main disagreement tends to be that I don't think OpenAI actually matters too much in the capabilities race, and I think that social stuff matters more than John Wentworth thinks. Also, given my optimistic world-model on alignment, corporate drama like this mostly doesn't matter.

One final thought: I feel like the AGI clauses in OpenAI's charter were extremely bad, because AGI is very ill-defined, and in a corporate/court setting that is a very poor basis to build upon. They need to use objective, verifiable metrics if they want to deal with dangerous AI. More generally, I kind of hate the AGI concept, for lots of reasons.

comment by FireStormOOO · 2023-11-21T23:57:21.052Z · LW(p) · GW(p)

I wouldn't count on Microsoft being ineffective, but there's good reason to think they'll push applications of the current state of the art over further blue-sky capabilities work. The commitment to push Copilot into every Microsoft product is already happening: the Copilot tab is live in dozens of places in their software, and in most it works as expected. Without any further big capabilities gains, it's already good enough to replace 80%+ of the armies of temps and offshore warm bodies that push spreadsheets and forms around today, and that's a plenty huge market to sate public investors. Sure, more capabilities gets you more markets, but what they have now probably gets the entire AI division self-supporting on cash flow, or at least able to help with the skyrocketing costs of compute, plus funding the coming legal and lobbying battles over training data.

comment by Jonas V (Jonas Vollmer) · 2023-11-21T23:05:11.672Z · LW(p) · GW(p)

Market on the primary claim discussed here: 

comment by Hide · 2023-11-21T22:17:42.556Z · LW(p) · GW(p)

It seems intuitively bad:

  • Capabilities and accelerationist-focused researchers have gone from diluted and restrained to concentrated and encouraged
  • Microsoft now has unbounded control, rather than a 49% stake
  • Microsoft cannot be expected to have any meaningful focus on alignment/safety
  • They are not starting from scratch: a huge chunk of their most capable staff and leadership will be involved
  • The "superalignment" project will be at best dramatically slowed, and possibly abandoned if OpenAI implodes
  • Other major labs smell blood in the water, possibly exacerbating race dynamics, not to mention a superficial increase (by 1) in the number of serious players. 
comment by michael_mjd · 2023-11-21T19:52:09.089Z · LW(p) · GW(p)

One fear I have is that the open source community will come out ahead, and push for greater weight sharing of very powerful models.

Edit: To be more specific, I mean that the open source community will become more attractive, because they will say: you cannot rely on individual companies whose models may or may not remain available; you must build on top of open source. Related tweet:

https://twitter.com/ylecun/status/1726578588449669218

Whether their plan works or not, dunno.

comment by trevor (TrevorWiesinger) · 2023-11-21T20:45:44.596Z · LW(p) · GW(p)

I read the whole thing, glad I did. It really makes me think that many of AI safety's best minds are doing technical work like alignment research 8 hours a day, when it would be better for them to do 2 hours a day to keep their skills honed, and spend 6 hours a day acting as generalists to think through the most important problems of the moment.

They should have shared their reasons/excuses for the firing. (For some reason, in politics/corporate politics, people try to be secretive all the time and this seems-to-me to be very stupid in like 80+% of cases, including this one.)

Hard disagree in the OpenAI case. I'm putting >50% on them having been correctly worried that people would correctly deduce all kinds of things from honest statements, because AI safety is unusually smart and Bayesian. There are literally prediction markets here.

I'm putting >50% on that alone; also, if the true reason was anything super weird, e.g. Altman accepting bribes or cutting deals with NSA operatives, then it would also be reasonable not to share it, even if AI safety didn't have tons of high-agency people who make it like herding cats.

That this makes it a lot harder for our cluster to be trusted to be cooperative/good faith/competent partners in things...

If the things you want people to do differently are costly, e.g. your safer AI is more expensive, but you are seen as untrustworthy, low-integrity, low-tranparency, low political competence, then I think you'll have a hard time getting buy in for it.

I think this gets into the complicated issue of security dilemmas. AI safety faces a tradeoff between sovereignty and trustworthiness: groups that are more powerful and sovereign carry a risk of betraying their allies and/or going on the offensive (effectively a discount rate, since the risk accumulates over time), but not enough sovereignty means the group can't defend itself against infiltration and absorption.

The situation with slow takeoff means that historically unprecedented things will happen and it's not clear what the correct course of action is for EA and AI safety. I've argued that targeted influence is already a significant risk due to the social media paradigm already being really good at human manipulation by default [LW · GW] and due to major governments and militaries already being interested in the use of AI for information warfare [LW · GW]. But that's only one potential facet of the sovereignty-tradeoff problem, and it's only going to get more multifaceted from here; that's why we need more Rubys and Wentworths spending more hours on the problem.

Replies from: Ruby
comment by Ruby · 2023-11-21T20:58:52.483Z · LW(p) · GW(p)

These recent events have me thinking the opposite: policy and cooperation approaches to making AI go well are doomed. While many people are starting to take AI risk seriously, not enough are, and those who are worried will fail to restrain those who aren't (where not being worried is a consequence of humans often being quite insane when incentives are at play). The hope lies in somehow developing enough useful AI theory that leading labs adopt it and, as a result, build an aligned AI even though they never believed they were going to cause AGI ruin.

And so maybe let's just get everyone to focus on the technical stuff. Actually more doable than wrangling other people to not build unsafe stuff.

Replies from: TrevorWiesinger
comment by trevor (TrevorWiesinger) · 2023-11-22T00:02:23.522Z · LW(p) · GW(p)

That largely depends on where AI safety's talent has been going, and could go. 

I'm thinking that most of the smarter quant thinkers have been doing AI alignment 8 hours a day and probably won't succeed, especially without access to AI architectures that haven't been invented yet, and that most of the people researching policy and cooperation weren't our best.

If our best quant thinkers are doing alignment research for 8 hours a day with systems that probably aren't good enough to extrapolate to the crunch-time systems, and our best thinkers haven't been researching policy and coordination (e.g. historically unprecedented coordination takeoffs), then the expected hope from policy and coordination is much higher, and our best quant thinkers should be doing policy and coordination during this time period. Even if we're 4 years away, they can mostly do human research for freshman and sophomore year and go back to alignment research for junior and senior year; same if we're two years away.

comment by bluefalcon · 2023-11-22T01:54:42.923Z · LW(p) · GW(p)

It's clearly bad because Altman is launching a coup attempt against the board, making clear that we cannot control dangerous AI even with a nonprofit structure as long as someone has career/profit incentives to build it. Altman should be put down like a rabid dog.