LessWrong 2.0 Reader
See "Zero Sum" is a misnomer [LW · GW]: shifting and rescaling of utility functions breaks formulations that simply ask for a sum of payoffs, but we can rescue the concept to mean that all outcomes/strategies of the game are Pareto efficient.
"Positive sum" seems to be about Kaldor-Hicks improvement, an outcome that in principle admits a redistribution of resources that would turn the outcome into a Pareto improvement (over some original situation of "not playing"), but there is no commitment or possibly even practical feasibility to actually perform the redistribution. This hypothetical redistribution step takes care of comparing utilities of different players. A whole game/interaction/project would then be "positive-sum" if each outcome/strategy is equivalent to some hypothetical "outcome" via a redistribution that is a Pareto improvement over the status quo of not engaging in the game/interaction/project. In actuality, without the hypothetical redistribution step, some players can end up worse off.
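The distinction above can be made concrete with a small sketch. This is a hypothetical two-player example (the payoff numbers are invented for illustration), assuming utilities are in transferable units so that "redistribution" is meaningful: a Kaldor-Hicks improvement only requires that total surplus rises, while a Pareto improvement requires that no individual player ends up worse off.

```python
def is_pareto_improvement(baseline, outcome):
    """True if no player is worse off and at least one is strictly better off."""
    return all(o >= b for b, o in zip(baseline, outcome)) and any(
        o > b for b, o in zip(baseline, outcome)
    )

def is_kaldor_hicks_improvement(baseline, outcome):
    """True if some redistribution of the outcome's total payoff could
    turn it into a Pareto improvement, i.e. total surplus strictly rises."""
    return sum(outcome) > sum(baseline)

baseline = (5, 5)    # status quo: not playing the game
outcome = (12, 2)    # player 1 gains 7, player 2 loses 3

print(is_pareto_improvement(baseline, outcome))        # False: player 2 is worse off
print(is_kaldor_hicks_improvement(baseline, outcome))  # True: total rises from 10 to 14
```

This illustrates the comment's last point: without the hypothetical redistribution step (e.g. player 1 transferring 4 units to player 2), a "positive-sum" outcome in the Kaldor-Hicks sense can still leave some players worse off than not playing at all.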
mesaoptimizer on Tamsin Leake's Shortform
You continue to model OpenAI as this black box monolith instead of trying to unravel the dynamics inside it and understand the incentive structures that lead these things to occur. It's a common pattern I notice in the way you interface with certain parts of reality.
I don't consider OpenAI as responsible for this as much as Paul Christiano and Jan Leike and his team (back in 2016 or 2017). It was extremely predictable that Sam Altman and OpenAI would leverage this unexpected success to gain more investment and translate that into more researchers and compute. But Sam Altman and Greg Brockman aren't researchers, and they didn't figure out a path that minimized capabilities overhang -- Paul Christiano did. And more importantly -- this is not mutually exclusive with OpenAI using the additional resources for both capabilities research and (what they call) alignment research. While you might consider everything they do as effectively capabilities research, the point I am making is that this is still consistent with the hypothesis that while they are misguided, they are still roughly doing the best they can given their incentives.
What really changed my perspective here was the fact that Sam Altman seems to have been systematically destroying extremely valuable information about how we could evaluate OpenAI. Specifically, the non-disparagement clause that ex-employees cannot even mention without falling afoul of the contract, which I didn't expect (I did expect non-disclosure clauses, but not something this extreme). This meant that my model of OpenAI was systematically too optimistic about how cooperative and trustworthy they are and will be in the future. In addition, if I was systematically deceived about OpenAI due to non-disparagement clauses that cannot even be mentioned, I would expect something similar to also be possible when it comes to other frontier labs (especially Anthropic, but also DeepMind). In essence, I no longer believe that Sam Altman (for OpenAI is nothing but his tool now) is doing the best he can to benefit humanity given his incentives. I expect that Sam Altman is entirely doing whatever makes sense to him that will give him more influence and power, and this includes the use of AGI, if and when his teams finally achieve that level of capabilities.
This is the update I expect people are making. It is about being systematically deceived at multiple levels. It is not about "OpenAI being irresponsible".
lorxus on Lorxus's Shortform
Wait, some of y'all were still holding your breath for OpenAI to be net-positive in solving alignment?
After the whole "initially having to be reminded alignment is A Thing"? And going back on its word to go for-profit? And spinning up a weird and opaque corporate structure? And people being worried about Altman being power-seeking? And everything to do with the OAI board debacle? And OAI Very Seriously proposing what (still) looks to me like a souped-up version of Baby Alignment Researcher's Master Plan B (where A involves solving physics and C involves RLHF and cope)? That OpenAI? I just want to be very sure. Because if it took the safety-ish crew of founders resigning to get people to finally pick up on the issue... it shouldn't have. Not here. Not where people pride themselves on their lightness.
amalthea on Tamsin Leake's Shortform
Half a year ago, I'd have guessed that OpenAI leadership, while likely misguided, was essentially well-meaning and driven by a genuine desire to confront a difficult situation. The recent series of events has made me update significantly against the general trustworthiness and general epistemic reliability of Altman and his circle. While my overall view of OpenAI's strategy hasn't really changed, my likelihood of them possibly "knowing better" has dramatically gone down now.
gilch on "If we go extinct due to misaligned AI, at least nature will continue, right? ... right?"
Yep. It would take a peculiar near-miss for an unfriendly AI to preserve Nature, but not humanity. Seemed obvious enough to me. Plants and animals are made of atoms it can use for something else.
By the way, I expect the rapidly expanding sphere of Darkness engulfing the Galaxy to happen even if things go well. The stars are enormous repositories of natural resources that happen to be on fire. We should put them out so they don't go to waste.
gilch on "If we go extinct due to misaligned AI, at least nature will continue, right? ... right?"
Humans instinctively like things like flowers and birdsong, because it meant a fertile area with food to our ancestors. We literally depended on Nature for our survival, and despite intensive agriculture, we aren't independent from it yet.
tamsin-leake on Tamsin Leake's Shortform
I believe that ChatGPT was not released with the expectation that it would become as popular as it did.
Well, even if that's true, causing such an outcome by accident should still count as evidence of vast irresponsibility imo.
akash-wasil on Akash's Shortform
Agreed -- my main point here is that the marketplace of ideas undervalues criticism.
I think one perspective could be “we should all just aim to do objective truth-seeking”, and as stated I agree with it.
The main issue with that frame, imo, is that it’s very easy to forget that the epistemic environment can be tilted in favor of certain perspectives.
E.g., I think it can be useful for "objective truth-seeking efforts" to be aware of some of the culture/status games that underincentivize criticism of labs & amplify lab-friendly perspectives.
benito on Stephen Fowler's Shortform
Not OP, but I take the claim to be "endorsing getting into bed with companies on-track to make billions of dollars profiting from risking the extinction of humanity in order to nudge them a bit, is in retrospect an obviously doomed strategy, and yet many self-identified effective altruists trusted their leadership to have secret good reasons for doing so and followed them in supporting the companies (e.g. working there for years including in capabilities roles and also helping advertise the company jobs). now that a new consensus is forming that it indeed was obviously a bad strategy, it is also time to have evaluated the leadership's decision as bad at the time of making the decision and impose costs on them accordingly, including loss of respect and power".
So no, not disincentivizing making positive EV bets, but updating about the quality of decision-making that has happened in the past.
habryka4 on simeon_c's Shortform
Sure, I'll try to post here if I know of a clear opportunity to donate to either.