I'm definitely fine with not having Superman, but I'm willing to settle on him not intervening.
On a different note, I'd disagree that Superman, just by existing and being powerful, is a de facto ruler in any sense - he of course could be, but that would entail a tradeoff he may not like (giving up an unburdened life).
In what way do prediction markets provide significant evidence on this type of question?
He's clearly not completely discounting that there's progress, but overall it doesn't feel like he's "updating all the way":
This is a recent post about the DeepMind Math Olympiad results: https://mathstodon.xyz/@tao/112850716240504978
"1. This is great work, shifting once again our expectations of which benchmark challenges are within reach of either #AI-assisted or fully autonomous methods"
Thanks for explaining your point - that the viability of inference scaling makes development differentially safer (all else equal) seems right.
Tall buildings are very predictable, and you can easily iterate on your experience before anything can really go wrong. Nuclear bombs are similar (you can in principle test in a remote enough location).
Biological weapons seem inherently more dangerous (though still more predictable overall than AI), and I'd naively imagine it to be simply very risky to develop extremely potent biological weapons.
Interestingly, Terence Tao has recently started thinking about AI, and his (publicly stated) opinions on it are ... very conservative? I find he mostly focuses on the capabilities that are already here and doesn't really extrapolate from them in any significant way.
I think you're getting this exactly wrong (and this invalidates most of the OP). If you find a model that has a constant-factor advantage of 100 in the asymptotics, that's a huge deal if everything else has log scaling. That would already represent discontinuous progress and potentially put you at ASI right away.
Basically, the current scaling laws, if they keep holding, are a lower bound on expected progress and can't really give you any information to upper-bound it.
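To make the lower-bound point concrete, here is a rough sketch (my own illustration with made-up symbols, not something from the OP or the scaling-law literature): if capability grows only logarithmically in compute, then a method with a 100x constant-factor advantage is equivalent to raising the baseline's compute budget to the hundredth power.

```latex
% Baseline capability under log scaling, with compute budget C and constant k:
%   f(C) = k \log C
% A hypothetical method with a 100x constant-factor advantage:
%   g(C) = 100\,k \log C
% Compute C' the baseline would need to match g at budget C:
%   k \log C' = 100\,k \log C \quad\Longrightarrow\quad C' = C^{100}
f(C) = k \log C, \qquad g(C) = 100\,k \log C, \qquad C' = C^{100}
```

So even a "mere" constant-factor improvement corresponds to an astronomically larger effective compute budget, which is why the observed scaling laws can only lower-bound achievable progress.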
It's certainly covered by the NYT, although their angle is "OpenAI is growing up".
Some reporting on this: https://www.vox.com/future-perfect/374275/openai-just-sold-you-out
I understand that this struck a bad tone, but I do kind of stand behind the original comment, for the precise reason that Altman has been particularly good at weaponizing things like "fun" and "high status", which the OP plays right into.
Bernard Arnault?
I don't have any experience with actual situations where this could be relevant, but it does feel like you're overly focused on the failure case where everyone is borderline incompetent and doing arbitrary things (which of course happens on LessWrong sometimes, since the variation here is quite large!). There's clearly a huge upside to being able to spot when you're trying to do something that's impossible for theoretical reasons, and to being extra sceptical in these situations (e.g. someone trying to construct a perpetual motion machine). I'm open to the argument that the way people apply these things in practice leaves a lot to be desired.
(Epistemic status: I know basic probability theory but am otherwise just applying common sense here)
This seems to mostly be a philosophical question. I believe the answer is that you're then hitting the limits of your model and Bayesianism doesn't necessarily apply. In practical terms, I'd say it's most likely that you were mistaken about the probability of the event in fact being 0. (Probability 1 events occurring should be fine.)
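To spell out where the model breaks (a standard textbook observation, restated here only as a sketch): Bayes' rule divides by the prior probability of the evidence, so an event you assigned probability 0 leaves the update undefined, while a probability-1 event leaves the posterior equal to the prior.

```latex
% Bayes' rule for a hypothesis H given evidence E:
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}
% If P(E) = 0 but E is observed, the right-hand side is undefined
% (division by zero): the model is falsified rather than updated.
% If P(E) = 1, then P(E \mid H) = 1 for every H with P(H) > 0,
% so P(H \mid E) = P(H) and observing E changes nothing.
```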
In 2021, I predicted math to be basically solved by 2023 (using the kind of reinforcement learning on formally checkable proofs that DeepMind is using). It's been slower than expected, and I wouldn't have guessed that a less formal setting like o1 would go relatively well - but since then I just nod along to these kinds of results.
(Not sure what to think of that claimed 95% number though - wouldn't that kind of imply they'd blown past the IMO Grand Challenge? EDIT: There were significant time limits on the human participants; see Qumeric's comment.)
I think the main problem is that society-at-large doesn't significantly value AI safety research, and hence that the funding is severely constrained. I'd be surprised if the consideration you describe in the last paragraph plays a significant role.
Ideas come from unsupervised training, answers from supervised training and proofs from RL on a specified reward function.
Mask off, mask on.
I think your concrete suggestions such as these are very good. I still don't think you have illustrated the power-seeking aspect you are claiming very well (it seems to be there for EA, but less so for AI safety in general).
In short, I think you are conveying certain important, substantive points, but are choosing a poor framing.
Thanks for clarifying. I do agree with the broader point that one should have a sort of radical uncertainty about (e.g.) a post AGI world. I'm not sure I agree it's a big issue to leave that out of any given discussion though, since it shifts probability mass from any particular describable outcome to the big "anything can happen" area. (This might be what people mean by "Knightian uncertainty"?)
I don't think it's unreasonable to distrust doom arguments for exactly this reason?
I agree that dunking on OS communities has apparently not been helpful in these regards. It seems kind of orthogonal to being power-seeking, though. Overall, I think part of the issue with AI safety is that the established actors (e.g. wide parts of CS academia) have opted out of taking a responsible stance, compared to, e.g., recent developments in the biosciences and RNA editing. Partially, one could blame this on their not wanting to identify too closely with, or grant legitimacy to, the existing AI safety community at the time. However, a priori, it seems more likely that it is simply due to the different culture in CS vs. the life sciences, with the former lacking a deep culture of responsibility for its research (in particular insofar as it's connected to, e.g., Silicon Valley startup culture).
The casual boosting of Sam Altman here makes me quite uncomfortable, and there are probably better examples: one could argue that his job isn't "paying" him as much as he's "taking" things by unilateral action and being a less than trustworthy actor. Other than that, this was an interesting read!
I found Ezra Vogel's biography of Deng Xiaoping to be on a comparable level.
On a brief reading, I found this to strike a refreshingly neutral and factual tone. I think it could be quite useful as a reference point.
You mean specifically that an LLM solved it? Otherwise, DeepMind's work will give you many examples. (Although there have been surprisingly few breakthroughs in math yet.)
Note that LLMs, while general, are still very weak in many important senses.
Also, it's not necessary to assume that LLMs are lying in wait to turn treacherous. Another possibility is that trained LLMs lack the mental slack to even seriously entertain the possibility of bad behavior, but that this may well change with more capable AIs.
I agree with the first sentence. I agree with the second sentence with the caveat that it's not strong absolute evidence, but mostly applies to the given setting (which is exactly what I'm saying).
People aren't fixed entities and the quality of their contributions can vary over time and depend on context.
That said, it also appears to me that Eliezer is probably not the most careful reasoner, and indeed often seems (perhaps egregiously) overconfident. That doesn't mean one should begrudge people finding value in the Sequences, although it is certainly not ideal if people take them as mantras rather than as useful pointers and explainers for basic things (I didn't read them, so I might have an incorrect view here). There does appear to be some tendency to just link to some point made in the Sequences as if it were airtight, although I haven't found it too pervasive recently.
You're describing a situational character flaw which doesn't really have any bearing on being able to reason carefully overall.
I'm echoing other commenters somewhat, but - personally - I do not see people being down-voted simply for having different viewpoints. I'm very sympathetic to people trying to genuinely argue against "prevailing" attitudes or simply trying to foster a better general understanding. (E.g. I appreciate Matthew Barnett's presence, even though I very much disagree with his conclusions and find him overconfident). Now, of course, the fact that I don't notice the kind of posts you say are being down-voted may be because they are sufficiently filtered out, which indeed would be undesirable from my perspective and good to know.
When you have a role in policy or safety, it may usually be a good idea not to voice strong opinions on any given company. If you nevertheless feel compelled to do so by circumstances, it's a big deal if you have personal incentives against that - especially if they're not disclosed.
Might be good to estimate the date of the recommendation - as the interview where Carmack mentioned this was in 2023, a rough guess might be 2021/22?
It might not be legal reasons specifically, but some hard-to-specify mix of legal reasons/intimidation/bullying. While it's useful to discuss specific ideas, it should be kept in mind that Altman doesn't need to restrict his actions to any specific avenue that could be neatly classified.
I'd like to listen to something like this in principle, but it has really unfortunate timing with the further information that's been revealed, making it somewhat less exciting. It would be interesting to hear how/whether the participants' beliefs have changed.
Have you ever written anything about why you hate the AI safety movement? I'd be quite curious to hear your perspective.
I think the best bet is to vote for a generally reasonable party. Despite their many flaws, it seems like the Green Party or the SPD are the best choices right now. (The CDU seems too influenced by business interests; the current FDP is even worse.)
The alternative would be to vote for a small party with a good agenda to help signal-boost them, but I don't know who's around these days.
It's not an entirely unfair characterization.
Half a year ago, I'd have guessed that OpenAI leadership, while likely misguided, was essentially well-meaning and driven by a genuine desire to confront a difficult situation. The recent series of events has made me update significantly against the trustworthiness and general epistemic reliability of Altman and his circle. While my overall view of OpenAI's strategy hasn't really changed, my credence that they might "know better" has gone down dramatically.
Fundamentally, the OP is making the case that biorisk is an extremely helpful (but not exact) analogy for AI risk, in the sense that we can gain understanding by looking at the ways it's analogous, and then get an even more nuanced understanding by analyzing the differences.
The point made seems to be more about its place in the discourse than about the value of the analogy itself? (E.g. "The biorisk analogy is overused" would be less misleading, then.)
What do you think about building legal/technological infrastructure to enable a prompt pause, should it seem necessary?
- I agree that the benefits can potentially go to everyone. The point is that, as the person pursuing AGI, you are making the choice for everyone else.
- The asymmetry is that if you do something that creates risk for everyone else, I believe that does single you out as an aggressor, while conversely enforcing norms that prevent such risky behavior seems justified. The fact that by default people are mortal is tragic, but doesn't have much bearing here. (You'd still be free to pursue life-extension technology in other ways, perhaps including limited AI tools.)
- Ideally, of course, there'd be some sort of democratic process here that lets people in aggregate make informed (!) choices. In the real world, it's unclear what a good solution would be. What we have right now is the big labs creating facts on the ground that society has trouble catching up with, which I think many people are reasonably uncomfortable with.
I think the perspective that you're missing regarding 2. is that by building AGI one is taking the chance of non-consensually killing vast numbers of people and their children for some chance of improving one's own longevity.
Even if one thinks it's a better deal for them, a key point is that you are making the decision for them by unilaterally building AGI. So in that sense it is quite reasonable to see it as an "evil" action to work towards that outcome.
Somewhat of a nitpick, but the relevant number would be p(doom | strong AGI being built) (maybe contrasted with p(utopia | strong AGI)), not the overall p(doom).
I down-voted this particular post because I perceived it as mostly ideological and making few arguments, only stating strongly that government action will be bad. I found the author's replies in the comments much more nuanced and would not have down-voted if I'd perceived the original post to be of the same quality.
Basically, I think whether one believes alignment is hard is much more of a crux than whether one is a utilitarian.
Personally, I don't find Pope & Belrose very convincing, although I do commend them for the reasonable effort - but if I did believe that AI is likely to go well, I'd probably also be all for it. I just don't see how this is related to utilitarianism (maybe for all but a very small subset of people in EA).
"One reason that goes overlooked is that most human beings are not utilitarians" I think this point is just straightforwardly wrong. Even from a purely selfish perspective, it's reasonable to want to stop AI.
The main reason humanity is not going to stop seems to be coordination problems, or something close to learned helplessness in the face of these kinds of competitive dynamics.
"Unambiguously evil" seems unnecessarily strong. Something like "almost certainly misguided" might be more appropriate? (Still strong, but arguably defensible.)
Do you have an example for where better conversations are happening?
You're conflating "have important consequences" with "can be used as weapons in discourse".
What do you mean by example, here? That this is demonstrating a broader property, or that in this situation, there was a tribal dynamic?