The WWII generation is negligible in 2024. The actual effect is partly the inverted demographic pyramid (older population means more women than men even under normal circumstances), and partly that even young Russian men die horrifically often:
At 2005 mortality rates, for example, only 7% of UK men but 37% of Russian men would die before the age of 55 years
And for that, a major culprit is alcohol (leading to accidents and violence, but also literally drinking oneself to death).
Among the men who don't self-destruct, I imagine a large fraction have already been taken, meaning that the gender ratio among singles has to be off the charts.
That first statistic, that it swiped right 353 times and got to talk to 160 women, is completely insane. I mean, that’s almost a 50% match rate, whereas estimates in general are 4% to 14%.
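The arithmetic behind that claim, as a quick sketch (the 4% to 14% range is the general-population estimate quoted above, not a computed value):

```python
# Rough match-rate arithmetic from the figures quoted above.
swipes = 353    # right-swipes reported
matches = 160   # women he got to talk to

match_rate = matches / swipes
print(f"match rate: {match_rate:.1%}")  # ~45%, vs. typical estimates of 4-14%
```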
Given Russia's fucked-up gender ratio (2.5 single women for every single man), I don't think it's that unreasonable!
Generally, the achievement of "guy finds a woman willing to accept a proposal" impresses me far less in Russia than it would in the USA. Let's see if this replicates in a competitive dating pool.
In high-leverage situations, you should arguably either be playing tic-tac-toe (simple, legible, predictable responses) or playing 4-D chess to win. If you're making really nonstandard and surprising moves (especially in PR), you have no excuse for winding up with a worse outcome than you would have if you'd acted in bog-standard normal ways.
(This doesn't mean suspending your ethics! Those are part of winning! But if you can't figure out how to win 4-D chess ethically, then you need to play an ethical tic-tac-toe strategy instead.)
Ah, I'm talking about introspection in a therapy context and not about exhorting others.
For example:
Internal coherence: "I forgive myself for doing that stupid thing".
Load-bearing but opaque: "It makes sense to forgive myself, and I want to, but for some reason I just can't".
Load-bearing and clear resistance: "I want other people to forgive themselves for things like that, but when I think about forgiving myself, I get a big NOPE NOPE NOPE".
P.S. Maybe forgiving oneself isn't actually the right thing to do at the moment! But it will also be easier to learn that in the third case than in the second.
"I endorse endorsing X" is a sign of a really promising topic for therapy (or your preferred modality of psychological growth).
If I can simply say "X", then I'm internally coherent enough on that point.
If I can only say "I endorse X", then not-X is psychologically load-bearing for me, but often in a way that is opaque to my conscious reasoning, so working on that conflict can be slippery.
But if I can only say "I endorse endorsing X", then not only is not-X load-bearing for me, but there's a clear feeling of resistance to X that I can consciously home in on, connect with, and learn about.
Re: Canadian vs American health care, the reasonable policy would be:
"Sorry, publicly funded health care won't cover this, because the expected DALYs are too expensive. We do allow private clinics to sell you the procedure, though unless you're super wealthy I think the odds of success aren't worth the cost to your family."
(I also approve of euthanasia being offered as long as it's not a hard sell.)
I think MIRI is correct to call it as they see it, both on general principles and because if they turn out to be wrong about genuine alignment progress being very hard, people (at large, but also including us) should update against MIRI's viewpoints on other topics, and in favor of the viewpoints of whichever AI safety orgs called it more correctly.
Prior to hiring Shear, the board offered a merger to Dario Amodei, with Dario to lead the merged entity. Dario rejected the offer.
I mean, I don't really care how much e.g. Facebook AI thinks they're racing right now. They're not in the game at this point.
The race dynamics are not just about who's leading. FB is 1-2 years behind (looking at LLM metrics), and it doesn't seem like they're getting further behind OpenAI/Anthropic with each generation, so I expect that the lag at the end will be at most a few years.
That means that if Facebook is unconstrained, the leading labs have only that much time to slow down for safety (or prepare a pivotal act) as they approach AGI before Facebook gets there with total recklessness.
If Microsoft!OpenAI lags the new leaders by less than FB (and I think that's likely to be the case), that shortens the safety window further.
I suspect my actual crux with you is your belief (correct me if I'm misinterpreting you) that your research program will solve alignment and that it will not take much of a safety window for the leading lab to incorporate the solution, and therefore the only thing that matters is finishing the solution and getting the leading lab on board. It would be very nice if you were right, but I put a low probability on it.
I'm surprised that nobody has yet brought up the development that the board offered Dario Amodei the position as a merger with Anthropic (and Dario said no!).
(There's no additional important content in the original article by The Information, so I linked the Reuters paywall-free version.)
Crucially, this doesn't tell us in what order the board made this offer to Dario and the other known figures (GitHub CEO Nat Friedman and Scale AI CEO Alex Wang) before getting Emmett Shear, but it's plausible that merging with Anthropic was Plan A all along. Moreover, I strongly suspect that the bad blood between Sam and the Anthropic team was strong enough that Sam had to be ousted in order for a merger to be possible.
So under this hypothesis, the board decided it was important to merge with Anthropic (probably to slow the arms race), booted Sam (using the additional fig leaf of whatever lies he's been caught in), immediately asked Dario and were surprised when he rejected them, did not have an adequate backup plan, and have been scrambling ever since.
P.S. Shear is known to be very much on record worrying that alignment is necessary and not likely to be easy; I'm curious what Friedman and Wang are on record as saying about AI x-risk.
No, I don't think the board's motives were power politics; I'm saying that they failed to account for the kind of political power moves that Sam would make in response.
In addition to this, Microsoft will exert greater pressure to extract mundane commercial utility from models, compared to pushing forward the frontier. Not sure how much that compensates for the second round of evaporative cooling of the safety-minded.
If they thought this would be the outcome of firing Sam, they would not have done so.
The risk they took was calculated, but man, are they bad at politics.
- The quote is from Emmett Shear, not a board member.
- The board is also following the "don't say anything literally false" policy by saying practically nothing publicly.
- Just as I infer from Shear's qualifier that the firing did have something to do with safety, I infer from the board's public silence that their reason for the firing isn't one that would win back the departing OpenAI members (or would only do so at a cost that's not worth paying).
- This is consistent with it being a safety concern shared by the superalignment team (who by and large didn't sign the statement at first) but not by the rest of OpenAI (who view pushing capabilities forward as a good thing, because like Sam they believe the EV of OpenAI building AGI is better than the EV of unilaterally stopping). That's my current main hypothesis.
It's too late for a conditional surrender now that Microsoft is a credible threat to get 100% of OpenAI's capabilities team; Ilya and Jan are communicating unconditional surrender because the alternative is even worse.
I agree, it's critical to have a very close reading of "The board did *not* remove Sam over any specific disagreement on safety".
This is the kind of situation where every qualifier in a statement needs to be understood as essential—if the statement were true without the word "specific", then I can't imagine why that word would have been inserted.
The most likely explanation I can think of, for what look like about-faces by Ilya and Jan this morning, is realizing that the worst plausible outcome is exactly what we're seeing: Sam running a new OpenAI at Microsoft, free of that pesky charter. Any amount of backpedaling, and even resigning in favor of a less safety-conscious board, is preferable to that.
They came at the king and missed.
Did anyone at OpenAI explicitly say that a factor in their release cadence was getting the public to wake up about the pace of AI research and start demanding regulation? Because this seems more like a post hoc rationalization for the release policy than like an actual intended outcome.
I expect AGI to emerge as part of the frontier model training run (and thus get a godshatter of human values), rather than only emerging after fine-tuning by a troll (and get a godshatter of reversed values), so I think "humans modified to be happy with something much cheaper than our CEV" is a more likely endstate than "humans suffering" (though, again, both much less likely than "humans dead").
Steelmanning a position I don't quite hold: non-extinction AI x-risk scenarios aren't limited to inescapable dystopias as we imagine them.
"Kill all humans" is certainly an instrumental subgoal of "take control of the future lightcone" and it certainly gains an extra epsilon of resources compared to any form of not literally killing all humans, but it's not literally required, and there are all sorts of weird things the AGI could prefer to do with humanity instead depending on what kind of godshatter it winds up with, most of which are so far outside the realm of human reckoning that I'm not sure it's reasonable to call them dystopian. (Far outside Weirdtopia, for that matter.)
It still seems very likely to me that a non-aligned superhuman AGI would kill humanity in the process of taking control of the future lightcone, but I'm not as sure of that as I'm sure that it would take control.
[See corrections in replies; "think for five minutes" was in EY posts as far back as 2007, the HPMOR chapter was in 2010, and the first CFAR retreat (though not under that name) was 2011 IIRC. Still curious to know where he got it from.]
Before HPMOR, "think for five minutes by the clock" was a CFAR exercise; I don't recall where they picked it up from.
I think a substantial fraction of LWers have the (usually implicit—they may not have even read about simulacra) belief that higher levels are inherently morally problematic, and that engaging on those levels about an important topic is at best excusable under the kind of adversarial circumstances where direct lies are excusable. (There's the obvious selection effect where people who feel gross about higher levels feel more comfortable on LW than almost anywhere else.)
I think there need to be better public arguments against that viewpoint, not least because I'm not fully convinced it's wrong.
Elizabeth has put at least dozens of hours into seeking good RCTs on vegan nutrition, and has come up nearly empty. At this point, if you want to say there is an expert consensus that disagrees with her, you need to find a particular study that you are willing to stand behind, so that we can discuss it. This is why Elizabeth wrote a post on the Adventist study—because that was the best that people were throwing at her.
This is a pretty transparent isolated demand for rigor. Can you tell me you've never uncritically cited surveys of self-reported data that make veg*n diets look good?
Simply type the at-symbol to tag people. I don't know when LW added this, but I'm glad we have it.
Your framing makes it sound like individual raising of livestock, which is silly—specialization of expertise and labor is a very good thing, and "EA reducetarians find or start up a reasonably sized farm whose animal welfare standards seem to them to be net positive" seems to dominate "each EA reducetarian tries to personally raise chickens in a net positive way" (even for those who think both are bad, the second one seems simply worse at a fixed level of consumption).
Seems fair to tag @Liron here.
I agree with "When you say 'there's a good chance AGI is near', the general public will hear 'AGI is near'".
However, the general public isn't everyone, and the people who can distinguish between the two claims are the most important to reach (per capita, and possibly in sum).
So we'll do better by saying what we actually believe, while taking into account that some audiences will round probabilities off (and seeking ways to be rounded closer to the truth while still communicating accurately to anyone who does understand probabilistic claims). The marginal gain by rounding ourselves off at the start isn't worth the marginal loss by looking transparently overconfident to those who can tell the difference.
I reached this via Joachim pointing it out as an example of someone urging epistemic defection around AI alignment, and I have to agree with him there. I think the higher difficulty posed by communicating "we think there's a substantial probability that AGI happens in the next 10 years" vs "AGI is near" is worth it even from a PR perspective, because pretending you know the day and the hour smells like bullshit to the most important people who need convincing that AI alignment is nontrivial.
Fair enough, I've changed my wording.
The thread is closer to this post's Counter-Examples than its examples.
Richard calls out the protest for making arguments that diverge from the protesters' actual beliefs about what's worth protesting, and is highly upvoted for doing so. In the ensuing discussion, Steven changes Holly and Ben's minds on whether it's right to use the "not really open-source" accusation against FB (because we think true open-source would be even worse).
Tyler's comment that [for public persuasion, messages get rounded to "yay X" or "boo X" anyway, so nuance is less important] deserves a rebuttal, but I note that it's already got 8 disagrees vs 4 agrees, so I don't think that viewpoint is dominant.
I think the downvotes are coming because people don't realize you're doing the exercise at the start of the post, and rather think that you're making these claims after having read the rest of the post. I don't think you should lose karma for that, so I'm upvoting; but you may want to state at the top that's what you're doing.
It's a very unusual disclaimer that speaks well of the post.
The default journalistic practice at many outlets is to do an asymmetric search once the journalist or editor decides which way the wind is blowing, but of course nobody says this in the finished piece.
Ben is explicitly telling the reader that he did not spend another hundred hours looking for positive information about Nonlinear, so that we understand that absence of exculpatory evidence in the post should not be treated as strong evidence of absence.
Which is why I said that the probabilities are similar, rather than claiming the left side exceeds the right side.
I'm surprised (unless I've missed it) that nobody has explicitly pointed out the most obvious reason to take the responses of the form "Kat/Emerson/Drew have been really good to me personally" as very weak evidence at best.
The allegations imply that in the present situation, Kat/Emerson/Drew would immediately tell anyone in their orbit to come and post positive testimonials of them under promises of reward or threat of retaliation (precisely as the quoted Glassdoor review says).
P(generic positive testimonials | accusation true) ≈ P(generic positive testimonials | accusation false).
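To make that concrete with a toy Bayesian update (all numbers below are illustrative assumptions, not figures from the post): when the likelihood ratio is close to 1, the posterior barely moves, no matter how many testimonials arrive.

```python
# Toy Bayesian update: evidence with a likelihood ratio near 1 barely moves
# the posterior. All numbers are illustrative assumptions.

def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """P(H | E) via Bayes' rule, given P(H), P(E | H), and P(E | not-H)."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

prior = 0.7  # illustrative prior that the accusations are true

# Positive testimonials are expected under BOTH hypotheses, so the
# likelihoods are nearly equal and the update is tiny:
p_after = posterior(prior, p_e_given_h=0.9, p_e_given_not_h=0.95)
print(f"{prior:.2f} -> {p_after:.3f}")  # prints "0.70 -> 0.689"
```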
The only thing that would be strong evidence against the claims here would be direct counterevidence to the claims in the post. Everything else so far is a smokescreen.
Plenty of "weird and atypical" things aren't red flags; this one, however, is a well-known predictor of abusive environments.
I believe that a commitment to transparently reward whistleblowers, in cases where you conclude they are running a risk of retaliation, is a very good policy when it comes to incentivizing true whistleblowing.
Ben, I want to say thank you for putting in a tremendous amount of work, and also for being willing to risk attempts at retaliation when that's a pretty clear threat.
You're in a reasonable position to take this on, having earned the social standing to make character smears unlikely to stick, and having the institutional support to fight a spurious libel claim. And you're also someone I trust to do a thorough and fair job.
I wish there were someone whose opportunity cost were lower who could handle retaliation-threat reporting, but it's pretty likely that anyone with those attributes will have other important opportunities.
Re: QB sneaks, the obvious tradeoff that's not factored into the naive model is the risk of an injury to the QB. Running the QB can be worth it if the expected points added are high enough (goal line or pivotal 4th and 1, or if your QB can often pick up 10+ yards on a draw), but I doubt you want to roll those dice on a midfield 2nd and 4 sneak in the first quarter of a regular-season game.
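One way to see the tradeoff is to fold injury risk into the expected-value comparison. A minimal sketch, where every number is a made-up placeholder rather than real NFL data:

```python
# Toy expected-value model for running the QB, charging for injury risk.
# Every number below is a made-up placeholder, not real NFL data.

def qb_run_value(points_added: float, p_injury: float, injury_cost: float) -> float:
    """Net expected points from a QB run after subtracting injury risk."""
    return points_added - p_injury * injury_cost

# Goal line / pivotal 4th-and-1: the large upside can outweigh the risk.
high_leverage = qb_run_value(points_added=2.0, p_injury=0.01, injury_cost=20.0)

# Midfield 2nd-and-4 in the first quarter: tiny edge, same injury risk.
low_leverage = qb_run_value(points_added=0.1, p_injury=0.01, injury_cost=20.0)

print(high_leverage, low_leverage)  # clearly positive vs. slightly negative
```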
(God, American football is a beautiful game. I miss the days when I could enjoy it because I didn't know about the CTE.)
I worked in the AI/ML org at Apple for a few recent years. They are not a live player to even the extent that Google Brain was a live player before it was cannibalized.
When Apple says "AI", they really mean "a bunch of specialized ML algorithms from warring fiefdoms, huddling together in a trenchcoat", and I don't see Tim Cook's proclamation as anything but cheap talk.
Re: mileage, if the car is a Tesla, it's not an accident: they set their numbers to the maximally optimistic ones at every stage (without regard for e.g. temperature) and have a team dedicated to diverting people who call in about the reliable inaccuracy.
I'm surprised I didn't see here my biggest objection:
MIRI talks about "pivotal acts", building an AI that's superhuman in some engineering disciplines (but not generally) and having it do a human-specified thing to halt the development of AGIs (e.g. seek and safely melt down sufficiently large compute clusters) in order to buy time for alignment work. Their main reason for this approach is that it seems less doomed to have an AI specialize in consequentialist reasoning about limited domains of physical engineering than to have it think directly about how its developers' minds work.
If you are building an alignment researcher, you are building a powerful AI that is directly thinking about misalignment—exploring concepts like humans' mental blind spots, deception, reward hacking, hiding thoughts from interpretability, sharp left turns, etc. It does not seem wise to build a consequentialist AI and explicitly train it to think about these things, even when the goal is for it to treat them as wrong. (Consider the Waluigi Effect: you may have at least latently constructed the maximally malicious agent!)
I agree, of course, that the biggest news here is the costly commitment—my prior model of them was that their alignment team wasn't actually respected or empowered, and the current investment is very much not what I would expect them to do if that were the case going forward.
I agree that, given the dynamics, it's rare to get a great journalist on a technical subject (we're lucky to have Zeynep Tufekci on public health), but my opinion is that Metz has a negative Value Over Replacement Tech Journalist, that coverage of AI in the NYT would be significantly more accurate if he quit and was replaced by whomever the Times would poach.
Cade Metz already has multiple strikes against him when it comes to journalistic carelessness around the rationalist community and around AI risk. In addition to outing Scott, he blithely mischaracterized the situation between Geoff Hinton and Google.
It's harmful that the NYT still has him on this beat (though I'm sure his editors don't know/care that he's treating the topic as an anthropological curiosity rather than something worth taking seriously).
Elizabeth has invested a lot of work already, and has explicitly requested that people put in some amount of work when trying to argue against her cruxes (including actually reading her cruxes, and supporting one's points with studies whose methodology one has critically checked).
The citation on that sentence is the same as the first paragraph in this post about the animal-ethics.org website; Elizabeth is aware of that study and did not find it convincing.
In this and your comments below, you recapitulate points Elizabeth made pretty exactly, so it looks like you didn't need to read it after all!
My answer is "work on applications of existing AI, not the frontier". Advancing the frontier is the dangerous part, not using the state-of-the-art to make products.
But also, don't do frontend or infra for a company that's advancing capabilities.
I also had a bunch of thoughts of ‘oh, well, that’s easy, obviously you would just [OH MY LORD IS THIS CENSORED]’
I applaud your Virtue of Silence, and I'm also uncomfortable with the simplicity of some of the ones I'm sitting on.
Thanks for the info about sockpuppeting, will edit my first comment accordingly.
Re: Glassdoor, the most devastating reviews were indeed after 2017, but it's still the case that nobody rated the CEO above average among the ~30 people who worked in the Spartz era.