LessWrong 2.0 Reader
Humans instinctively like things like flowers and birdsong because, to our ancestors, they signaled a fertile area with food. We literally depended on Nature for our survival, and despite intensive agriculture, we aren't independent of it yet.
tamsin-leake on Tamsin Leake's Shortform
I believe that ChatGPT was not released with the expectation that it would become as popular as it did.
Well, even if that's true, causing such an outcome by accident should still count as evidence of vast irresponsibility imo.
akash-wasil on Akash's Shortform
Agreed — my main point here is that the marketplace of ideas undervalues criticism.
I think one perspective could be “we should all just aim to do objective truth-seeking”, and as stated I agree with it.
The main issue with that frame, imo, is that it’s very easy to forget that the epistemic environment can be tilted in favor of certain perspectives.
E.g. I think it can be useful for "objective truth-seeking efforts" to be aware of some of the culture/status games that under-incentivize criticism of labs & amplify lab-friendly perspectives.
benito on Stephen Fowler's Shortform
Not OP, but I take the claim to be "endorsing getting into bed with companies on-track to make billions of dollars profiting from risking the extinction of humanity in order to nudge them a bit, is in retrospect an obviously doomed strategy, and yet many self-identified effective altruists trusted their leadership to have secret good reasons for doing so and followed them in supporting the companies (e.g. working there for years including in capabilities roles and also helping advertise the company jobs). now that a new consensus is forming that it indeed was obviously a bad strategy, it is also time to have evaluated the leadership's decision as bad at the time of making the decision and impose costs on them accordingly, including loss of respect and power".
So no, not disincentivizing making positive EV bets, but updating about the quality of decision-making that has happened in the past.
habryka4 on simeon_c's Shortform
Sure, I'll try to post here if a clear opportunity to donate to either comes up.
zach-stein-perlman on DeepMind's "Frontier Safety Framework" is weak and unambitious
Sorry for brevity.
We just disagree. E.g. you "walked away with a much better understanding of how OpenAI plans to evaluate & handle risks than how Anthropic plans to handle & evaluate risks"; I felt like Anthropic was thinking about everything well.
I think Anthropic's ASL-3 is reasonable and OpenAI's thresholds and corresponding commitments are unreasonable. If the ASL-4 threshold were set too high, or the commitments were so poor that ASL-4 was meaningless, I agree Anthropic's RSP would be at least as bad as OpenAI's.
One thing I think is a big deal: Anthropic's RSP treats internal deployment like external deployment; OpenAI's has almost no protections for internal deployment.
I agree "an initial RSP that mostly spells out high-level reasoning, makes few hard commitments, and focuses on misuse while missing the all-important evals and safety practices for ASL-4" is also a fine characterization of Anthropic's current RSP.
edit: or, like, pf thresholds too high, so pf seems doomed / not on track, but rsp:v1 is consistent with rsp:v1.1 being great. At least Anthropic knows and says there's a big hole. That's not relevant to evaluating labs' current commitments but is very relevant to predicting.
vladimir_nesov on What Are Non-Zero-Sum Games?—A Primer
See "Zero Sum" is a misnomer [LW · GW]: shifting and rescaling of utility functions break formulations that simply ask to take a sum of payoffs, but we can rescue the concept to mean that all outcomes/strategies of the game are Pareto efficient.
"Positive sum" seems to be about Kaldor-Hicks efficiency, strategies where in principle there is a post-game redistribution of resources that would turn the strategies Pareto efficient, but there is no commitment or possibly even practical feasibility to actually perform the redistribution. This hypothetical redistribution step takes care of comparing utilities of different players. A whole game/interaction/project would then be "positive-sum" if each outcome/strategy is equivalent to some Pareto efficient hypothetical "outcome" via a redistribution.
yonatan-cale-1 on simeon_c's Shortform
@habryka [LW · GW], would you reply to this comment if there's an opportunity to donate to either? Another person and I are interested, and others could follow this comment too if they wanted to.
(only if it's easy for you, I don't want to add an annoying task to your plate)
zach-stein-perlman on Akash's Shortform
Sorry for brevity; I'm busy right now.
My current perspective is that criticism of AGI labs is an under-incentivized public good. I suspect people could add a disproportionate amount of value by evaluating lab plans, publicly criticizing labs when they break commitments or make poor arguments, talking to journalists/policymakers about their concerns, etc.
Some quick thoughts:
With all this in mind, I find myself more deeply appreciating folks who have publicly and openly critiqued labs, even in situations where the cultural and economic incentives to do so were quite weak (relative to staying silent or saying generic positive things about labs).
Examples: Habryka, Rob Bensinger, CAIS, MIRI, Conjecture, and FLI. More recently, @Zach Stein-Perlman [LW · GW], and of course Jan Leike and Daniel K.