Comments
Thanks for sharing this. Based on the About page, my 'vote' as an EU citizen working in an ML/AI position could conceivably count for a little more, so it seems worth doing. I'll put it in my backlog and aim to get to it on time (it does seem like a lengthy task).
If you don't know who to believe then falling back on prediction markets or at least expert consensus is not the worst strategy.
Do you truly not believe that, for your own life - to use the examples there - solving aging, curing all disease, and solving energy would be even more valuable? To you? Perhaps you don't believe those are possible, but then that's where the whole disagreement lies.
And if you are talking about Superintelligent AGI and automation, why even talk about output per person? I thought you at least believed people get automated out and thus decoupled?
Does he not believe in AGI and Superintelligence at all? Why not just say that?
>AI could cure all diseases and “solve energy”. He mentions “radical abundance” as a possibility as well, but beyond the R&D channel
This is clearly about Superintelligence, and the mechanism through which it would happen in that scenario is straightforward and often talked about. And if he disagrees, he either doesn't believe in AGI (or at least advanced AGI), or believes that solving energy and curing disease are not that valuable? Or he is purposefully talking about a pre-AGI scenario while arguing against post-AGI views?
>to lead to an increase in productivity and output *per person*
This quote certainly suggests this. It's just hard to tell if this is due to bad reasoning or on purpose to promote his start-up.
AI 2027 is more useful for its arguments than for the specific year, but even if you aren't that aggressive, prediction markets (or at least Manifold) predict a 61% chance before 2030, 65% before 2031, and 73% by 2033.
I, similarly, can see it happening slightly later than 2027-2028 because some specific issues take longer to solve than others, but I see no reason to think a timeline beyond 2035, like yours, let alone 30 years, is grounded in reality.
It also doesn't help that when I take your arguments and apply them to what would then have seemed like a very optimistic 2020 forecast about progress in 2025 (or even Kokotajlo's last forecast), those same arguments would have similarly rejected what has actually happened.
I believe he means rationality-associated discourse, and it's not like there are so many contenders.
There's indeed been no one with that level of reach who has spread as much misinformation and started as many negative rumors in the space as David Gerard and RW. Whoever the second-closest contender is, they are likely not even close.
A LOT of the negative press online that LW, EY, and a ton of other places and people have gotten can be traced back to him. If it weren't for RW, LW would be much, much more respected.
It's hard for me to respect a Safety-ish org that is so obviously wrong about the most important factors of its chosen topic.
I won't judge a random celebrity for expecting, e.g., very long timelines, but an AI research center? I'm sure they are very cool people, but come on.
As in, ultimately more people are likely to like their condition and agree (comparatively more) with the AI's decisions, while having roughly equal rights.
Democratic in the 'favouring or characterized by social equality; egalitarian' sense (one of the definitions from Google), rather than about elections or whatever.
For example, I recently wrote a Short Story of my Day in 2035 in the scenario where things continue mostly like this and we get positive AGI that's similar enough to current trends. There, people influenced the initial values - mainly via The Spec - and can in theory vote to make some changes to The Spec that governs the general AI values, but in practice by that point AGI controls everything and it's more or less set in stone. Still, it overall mostly tries to fulfil people's desires (overly optimistic that we go this route, I know).
I'd call that more democratic than one that upholds CCP values specifically.
A Western AI is much more likely to be democratic and place humanity's values a bit higher up. A Chinese one is much more likely to put CCP values and control higher up.
But yes, if it's the current US administration specifically, neither option is that optimistic.
While showing the other point of view and all that is reasonable practice, it's disappointing of Dwarkesh to use his platform specifically to promote this anti-safety start-up.
Most people's main mistake boils down to the assumption that his 2nd term would be more in line with his first - which, while full of quirks, was overall closer to business as usual. His 2nd term, in comparison, steers much further away, and the negative effects are much larger.
>If so, by default the existence of AGI will be a closely guarded secret for some months. Only a few teams within an internal silo, plus leadership & security, will know about the capabilities of the latest systems.
Are they really going to be that secret? At this point progress is, if not linear, almost predictable, and we are well aware of the specific issues to be solved next for AGI: longer task horizons, memory, fewer hallucinations, etc. If you tell me someone is 3-9 months ahead and nearing AGI, I'd simply guess those are the things they are ahead on.
>Even worse, a similarly tiny group of people — specifically, corporate leadership + some select people, from the executive branch of the US government — will be the only people reading the reports and making high-stakes judgment calls
That does sound pretty bad, yes. My last hope in this scenario is that at the last step (even only in the last week or two), when it's clear they'll win, they at least withhold it from the US executive branch and make some of the final decisions on their own - not ideal, but a few % more chance the final decisions aren't godawful.
For example, imagine Ilya's lab ends up ahead - I can at least imagine him doing some last-minute fine-tuning to make the AGI work for humanity first, ignoring what the US executive branch has ordered, and I can imagine some chance that once that's done, it is mostly too late to change it.
Sure, but with increased capability, in order to predict and simulate the impact of pain or fear better and better, one might end up producing a mechanism that simulates it too well. After all, if you are trying to predict really well how a human will react when they are in pain, a natural approach is to check how a different human reacts when you cause them pain.
I don't know about the usefulness of the map (looks cool, at minimum), but I liked the simple 'Donation Guide' section on the site, which led me to make a small donation to AI Safety.
One recommendation - I'd make the 'Donate to a fund, such as the AI Risk Mitigation Fund.' portion a bit more prominent. At first I didn't even realize there was a donation link on the page, just text. It would help if it were, for example, an obvious button or just a more prominent call to action that leads to the fund's donation page or to whoever else you recommend.
>Would you be a human supremacist? From 2025, I'd like to think I would be.
Not really. I'd mostly be happy enough that this is where we ended up, rather than in the much worse scenarios that are as of now still possible.
>Did we “find” this?
It's still everyone involved's best guess as far as I can tell; it's just not a field that moves that fast or is that actionable (especially compared to AI).
Huh, Aella is more committed to the anti-shower stance than even Twitter would think.
Honestly, if replacing OpenAI with OpenEye is even supposed to be microfunny, it makes me take it even less seriously, but if it's working for others, fair enough.
I don't understand the point of using fake names that close to the real names. Is there a reason I'm missing?
For me, either the real names, something like Company 1 and Company 2, or completely different names would have made it pattern-match less to a telling of a made-up scenario rather than a possible future.
Survival is obviously much better because 1. you can lose jobs but eventually still have a good life (think UBI at minimum), and 2. if you don't like it you can always kill yourself and end up in the same spot as the non-survival case anyway.
If we get to the point where prediction markets actually direct policy, then yes, you need them to be very deep - which in at least some cases is expected to happen naturally or can be subsidized - but you also want to make the decision based on a deeper analysis than just the resulting percentages: depth of the market, analysis of unusually large trades, blocking bad actors, etc.
Weed smells orders of magnitude stronger than many powders, and I imagine it releases way more particles into the air, but assuming this is doable for well-packed fentanyl, is there a limit? Can you safely expose dogs to enough of, say, carfentanil to start training them? Lofentanil?
And if it's detectable by dogs, surely it can't be that far beyond our capabilities to create a sensor that detects particles in the air at the same fidelity as a dog's nose, by copying the general functionality of olfactory receptors and neurons, if cost isn't a big issue.
I think you are overrating it. The biggest concern comes from whoever trains a model that passes some threshold in the first place, not from a model that one actor has been using for a while getting leaked to another actor. The bad actor who got access to the leak is always going to be behind in multiple ways in this scenario.
>Once again, open weights models actively get a free exception, in a way that actually undermines the safety purpose.
Open weights getting a free exception doesn't seem that bad to me: yes, on one hand it increases the chance of a bad actor getting a cutting-edge model, but on the other hand the financial incentives are weaker, and it brings more capability to good actors outside of the top 5 companies earlier. And those weights can be used for testing, safety work, etc.
I think what's released openly will always be a bit behind anyway (and thus likely fairly safe), so at least everyone else can benefit.
They are not full explanations, but they are as far as I, at least, can get.
>tells you more about what exists
It's still more satisfying, because a state of ~everything existing is more 'stable' than a state of a specific something existing, in exactly the same way that I think nothing makes more sense than something as the default state to be asking the question about. Nothing existing and everything existing just require less explanation than a specific something existing. That doesn't mean it necessarily requires zero explanation.
And, if everything mathematically describable and consistent/computable exists, I can wrap my head around it not requiring an origin more easily, in a similar way to how I don't require an origin for actual mathematical objects, but without it necessarily seeming like a Type error (though that's the counterargument I consider most here), as with most explanations.
>because how can you have a "fluctuation" without something already existing, which does the fluctuating
That's at least somewhat more satisfying to me because we already know about virtual particles and fluctuations from Quantum Mechanics, so it's at least a recognized low-level mechanism that does cause something to exist even while the state is zero energy (nothing).
It still leaves us with nothing existing over something overall in at least one way (zero energy), it's already demonstrable with fields, which are at the lowest level of what we currently know of how the universe works, and it can be examined and thought about further.
The only appealing answers to me, currently, for why there is something instead of nothing are:
1. MUH is true, and all universes that can be defined mathematically exist. It's not a specific something that exists but all internally consistent somethings.
or
2. The default state is nothing, but there are small positive and negative fluctuations (either literally quantum fluctuations or something similar at a lower level), and over infinite time those fluctuations eventually result in a huge something like our universe and others.
Also, even if 2 happens only at the level of regular quantum fluctuations, there's a non-zero chance of a new universe emerging due to fluctuations after heat death, which over infinite time means it is bound to happen, and a new universe (or a rebirth of ours from scratch) will eventually emerge.
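To make the 'bound to happen over infinite time' step concrete, here's a minimal numerical sketch (my own illustration, with a made-up per-interval probability): for any fixed p > 0 per interval, the chance of at least one occurrence, 1 - (1 - p)^n, goes to 1 as the number of intervals n grows.

```python
import math

# Hypothetical per-interval probability of a universe-seeding fluctuation,
# chosen only for illustration; the argument works for any p > 0.
p = 1e-30

for n in (1e30, 1e32, 1e34):
    # P(at least one occurrence in n independent intervals) = 1 - (1 - p)^n,
    # computed via log1p/exp to avoid floating-point underflow.
    prob_at_least_one = 1 - math.exp(n * math.log1p(-p))
    print(f"n = {n:.0e} intervals: P(at least one) ≈ {prob_at_least_one:.4f}")
```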
Also 1 can happen due to 2 if the fluctuations are at such a low level that any possible mathematical structure eventually emerges over infinite time.
I am dominated by it, and okay, I see what you are saying. Whichever scenario results in a higher chance of human control of the light cone is the one I prefer, and these considerations are relevant only where we don't control it.
Sure, but 1. I only put 80% or so on MWI/MUH etc., and 2. I'm talking about optimizing for more positive-human-lived-seconds, not just for a binary 'I want some humans to keep living'.
I have a preference for minds as close to mine as possible continuing to exist, assuming their lives are worth living. If it's misaligned enough that the remaining humans don't have good lives, then yes, it doesn't matter, but I'd just lead with that rather than with the deaths alone.
And if they do have lives worth living and don't end up being the last humans, then that leaves us with a lot more positive-human-lived-seconds in the 2B-deaths case.
Okay, then what are your actual probabilities? I'm guessing it's not sub-20%, otherwise you wouldn't just say "<50%", because for me preventing, say, a 10% chance of extinction is much more important than even a 99% chance of 2B people dying. And your comment was specifically dismissing focus on full extinction due to the <50% chance.
>unlikely (<50% likely).
That's a bizarre bar to me! 50%!? I'd be worried if it was 5%.
It's a potentially useful data point but probably only slightly comparable. Big, older, well-established companies face stronger and different pressures than small ones and do have more to lose. For humans that's much less the case after a point.
>"The problem is when people get old, they don't change their minds, they just die. So, if you want to have progress in society, you got to make sure that, you know, people need to die, because they get old, they don't change their mind."
That's valid today, but I am willing to bet that a big reason why old people change their minds less is biological - less neuroplasticity, accumulated damage, mental fatigue, etc. If we are fixing aging and we fix those as well, it should be less of an issue.
Additionally, if we are in some post-death utopia, I have to assume we have useful, benevolent AI solving our problems, and that ideally it doesn't matter all that much who held a lot of wealth or power before.
>He does not have a good plan for alignment, but he is far less confused about this fact than most others in similar positions.
Yes, he seems like a great guy, but he doesn't just come across as not having a good plan - he comes across as them being completely disconnected about having a plan or doing much of anything.
>JS: If AGI came way sooner than expected we would definitely want to be careful about it.
>DP: What would being careful mean? Presumably you're already careful, right?
And yes, aren't they being careful? Well, it sounds like no:
>JS: Maybe it means not training the even smarter version or being really careful when you do train it. You can make sure it’s properly sandboxed and everything. Maybe it means not deploying it at scale or being careful about what scale you deploy it at.
"Maybe"? That's a lot of maybes for just potentially doing the basics. Their whole approximation of a plan is 'maybe not deploying it at scale' or 'maybe' stopping training after that, and only theoretically considering sandboxing it? That seems like kind of a bare minimum, and it's like he is guessing based on having been around, not based on any real plans they have.
He then goes on to mollify: it probably won't happen in a year... it might be a whole two or three years, and this is where they are at.
>First of all, I don't think this is going to happen next year but it's still useful to have the conversation. It could be two or three years instead.
It comes off as if all their talk of Safety is complete lip service, even if he agrees with the need for Safety in theory. If you were 'pleasantly surprised and impressed', I shudder to imagine what the responses would have had to be to leave you disappointed.
Somewhat tangential, but when you list the Safety people who have departed, I'd have preferred to see some sort of comparison group or base rate, as it always raises a red flag for me when only the absolute numbers are provided.
I did a quick check by changing your prompt from 'AGI Safety or AGI Alignment' to 'AGI Capabilities or AGI Advancement' and got 60% departed (compared to your 70% for AGI Safety) with 4o. I do think what we are seeing is alarming, but it's too easy for either 'side' to accidentally exaggerate via framing if you don't watch for that sort of thing.
When considering that, my thinking was that I'd expect the last day to be slightly after, but the announcement could be slightly before, since it doesn't need to fall exactly on the last day and can, and often would, come a little earlier - e.g. on the first day of his last week.
The 21st, when Altman was reinstated, is a logical date for the resignation, and it is within a week of 6 months ago now, which is why a notice period/an agreement to wait ~half a year/something similar is the first thing I thought of, since obviously the ultimate reason he is quitting is rooted in what happened around then.
>Is there a particular reason to think that he would have had an exactly 6-month notice
You are right, there isn't, but 1, 3, or 6 months is where I would have put the highest probability a priori.
>Sora & GPT-4o were out.
Sora isn't out out, or at least not in the way 4o is out, and Ilya isn't listed as a contributor on it in any form (compared to being an 'additional contributor' for GPT-4 or 'additional leadership' for GPT-4o), and in general I doubt it had much to do with the timing.
GPT-4o, of course, makes a lot of sense timing-wise (it's literally the next day!), and he is listed on it (though not as one of the many contributors or leads). But if he wasn't in the office during that time (or is that just a rumor?), it's just not clear to me whether he was actually participating in getting it out as his final project (which, yes, is very plausible) or whether he was just asked not to announce his departure until after the release, given that the two happen to be so close in time.
>Reasons are unclear
This is happening exactly 6 months after the November fiasco (the vote to remove Altman was on Nov 17th), which is likely what his notice period was, especially if he hasn't been in the office since then.
Are the reasons really that unclear? The specifics of why he wanted Altman out might be, but he is ultimately clearly leaving because he didn't think Altman should be in charge, while Altman thinks otherwise.
I own only ~5 physical books now (I prefer digital), and 2 of them are Thinking, Fast and Slow. Despite him not being on the site, I've always thought of him as something of a founding grandfather of LessWrong.
He comes across as pretty unsympathetic and stubborn.
Did any of your views of him change?
I'm sympathetic to some of your arguments, but even if we accept that the current paradigm will lead us to an AI that is pretty similar to a human mind, even in the best case I'm already not super optimistic that a scaled-up random almost-human is a great outcome. I simply disagree where you say this:
>For example, humans are not perfectly robust. I claim that for any human, no matter how moral, there exist adversarial sensory inputs that would cause them to act badly. Such inputs might involve extreme pain, starvation, exhaustion, etc. I don't think the mere existence of such inputs means that all humans are unaligned.
Humans aren't that aligned at the extremes, and the extremes matter when we're talking about the smartest entity making every important decision about everything.
Also, your general arguments about the current paradigms not being that bad are reasonable, but again, I think our situation is a lot closer to all or nothing - if we get pretty far with RLHF or whatever, then scale up the model until it's extremely smart and thus eventually making every decision of consequence, then unless you got the alignment nearly perfect, the chance that the remaining problematic parts screw us over seems uncomfortably high to me.
I can't even get a good answer to "What's the GiveWell of AI Safety?" so I can quickly donate to a very reputable and widely agreed-upon option with little thinking, without at best getting old lists of a ton of random small orgs and giving up. I'm not very optimistic that ordinary, less convinced people who want to help are having an easier time.
It seems quite different. The main argument in that article is that climate change wouldn't make the lives of readers' children much worse or shorter, and that's not the case for AI.
>Do you have any evidence for this?
My prior is that other things are less effective, and you need evidence to show they are more effective, not vice versa.
>Not all EA's are longtermists.
Of course. I'm saying it doesn't even get to make that argument, which can sometimes muddy the waters enough to make some odd-seeming causes look at least plausibly effective.
I'm impressed by how modern EAs manage to spin any cause into being supposedly EA.
There's just no way that things like this are remotely as effective as, say, GiveWell causes (it wouldn't even meet a much lower bar), and it barely even has longtermist points going for it that could make me see why there's at least a chance it could be worth it.
EA's whole brand is massively diluted by all these causes, and I don't think they are remotely as effective as the other places your money could go, nor that they help the general message.
It's like people get into EA, realize it's a good idea, but then want to participate in the community and not just donate, so everyone tries to come up with new, clearly ineffective (compared to the alternatives) causes and spin them as EA.
While NVDA is naively the most obvious play - the vast majority of GPU-based AI systems use their chips - I fail to see why you'd expect it to outperform the market, at least in the medium term. Even if you don't believe in the EMH, I assume you acknowledge things can be more or less priced in? Well, NVDA is such an obvious choice that all the main arguments for it do seem priced in, which has helped push it to a PE ratio of 55.
I also don't see OpenAI making a huge dent in MSFT's numbers anytime soon. Almost all of MSFT's price is going to be determined by the rest of their business. Quick googling suggests revenue of $3M for OpenAI and $168B total for MSFT for 2021. Even if OpenAI were already 100 times larger, I still wouldn't see how a bet on MSFT just because of it would be justified. It seems like this was chosen just because OpenAI is popular and not out of any real analysis beyond that. Can you explain what I'm missing?
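To make that concrete, here's a quick back-of-envelope sketch using only the rough 2021 figures quoted above (both numbers come from my quick googling, so treat them as approximations, not authoritative data):

```python
# Rough 2021 figures cited above; both are approximations.
openai_revenue = 3e6    # ~$3M
msft_revenue = 168e9    # ~$168B total

share_now = openai_revenue / msft_revenue
share_100x = (openai_revenue * 100) / msft_revenue  # even if OpenAI were 100x larger

print(f"OpenAI as a share of MSFT revenue: {share_now:.5%}")   # ~0.00179%
print(f"At 100x OpenAI's current size:     {share_100x:.3%}")  # ~0.179%
```

Even in the generous 100x case, OpenAI would still be a fraction of a percent of MSFT's revenue.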
I do like your first 3 choices of TSM, Google, and Samsung (is that last one really much of an AI play, though?).
No, it's the blockchain Terra (with Luna being its main token).
https://en.wikipedia.org/wiki/Terra_(blockchain)