Posts

Have any parties in the current European Parliamentary Election made public statements on AI? 2024-05-10T10:22:48.342Z
On what research policymakers actually need 2024-04-23T19:50:12.833Z
The Filan Cabinet Podcast with Oliver Habryka - Transcript 2023-02-14T02:38:34.867Z
Open & Welcome Thread - November 2022 2022-11-01T18:47:40.682Z
Health & Lifestyle Interventions With Heavy-Tailed Outcomes? 2022-06-06T16:26:49.012Z
Open & Welcome Thread - June 2022 2022-06-04T19:27:45.197Z
Why Take Care Of Your Health? 2022-04-06T23:11:07.840Z
MondSemmel's Shortform 2022-02-02T13:49:32.844Z
Recommending Understand, a Game about Discerning the Rules 2021-10-28T14:53:16.901Z
Quotes from the WWMoR Podcast Episode with Eliezer 2021-03-13T21:43:41.672Z
Another Anki deck for Less Wrong content 2013-08-22T19:31:09.513Z

Comments

Comment by MondSemmel on Charbel-Raphaël's Shortform · 2025-04-22T07:39:41.198Z · LW · GW

Altman has already signed the CAIS Statement on AI Risk ("Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."), but OpenAI's actions almost exclusively exacerbate extinction risk, and nowadays Altman and OpenAI even downplay the very existence of this risk.

Comment by MondSemmel on Charbel-Raphaël's Shortform · 2025-04-22T06:49:35.411Z · LW · GW

Even if this strategy would work in principle among particularly honorable humans, surely Sam Altman in particular has already conclusively proven that he cannot be trusted to honor any important agreements? See: the OpenAI board drama; the attempt to turn OpenAI's nonprofit into a for-profit; etc.

Comment by MondSemmel on Pablo's Shortform · 2025-04-21T16:00:48.099Z · LW · GW

I see your point re: free speech, and I don't endorse any appeals to "I should abandon my principles because the other side has already done so" as constantly happens in politics. And I can absolutely understand why you wouldn't be interested in joining such a lawsuit, both due to free speech concerns and because the FTX litigation can't have been remotely pleasant.

That said, when people do stuff like dismissing x-risk because their top Google search result pointed them at RationalWiki, what exactly was the proper non-free-speech-limiting solution to this problem?

Comment by MondSemmel on jenn's Shortform · 2025-04-16T20:48:53.794Z · LW · GW

In case you haven't seen it, there's an essay on the EA forum about a paper by Tyler Cowen which argues that there's no way to "get off" the train to crazy town. I.e. it may be a fundamental limitation of utilitarianism plus scope sensitivity that this moral framework necessarily collapses everything into a single value (utility), to be optimized at the expense of everything else. Some excerpts:

So, the problem is this. Effective Altruism wants to be able to say that things other than utility matter—not just in the sense that they have some moral weight, but in the sense that they can actually be relevant to deciding what to do, not just swamped by utility calculations. Cowen makes the condition more precise, identifying it as the denial of the following claim: given two options, no matter how other morally-relevant factors are distributed between the options, you can always find a distribution of utility such that the option with the larger amount of utility is better. The hope that you can have ‘utilitarianism minus the controversial bits’ relies on denying precisely this claim. ...

Now, at the same time, Effective Altruists also want to emphasise the relevance of scale to moral decision-making. The central insight of early Effective Altruists was to resist scope insensitivity and to begin systematically examining the numbers involved in various issues. ‘Longtermist’ Effective Altruists are deeply motivated by the idea that ‘the future is vast’: the huge numbers of future people that could potentially exist gives us a lot of reason to try to make the future better. The fact that some interventions produce so much more utility—do so much more good—than others is one of the main grounds for prioritising them. So while it would technically be a solution to our problem to declare (e.g.) that considerations of utility become effectively irrelevant once the numbers get too big, that would be unacceptable to Effective Altruists. Scale matters in Effective Altruism (rightly so, I would say!), and it doesn’t just stop mattering after some point.

So, what other options are there? Well, this is where Cowen’s paper comes in: it turns out, there are none. For any moral theory with universal domain where utility matters at all, either the marginal value of utility diminishes rapidly (asymptotically) towards zero, or considerations of utility come to swamp all other values. ...

I hope the reasoning is clear enough from this sketch. If you are committed to the scope of utility mattering, such that you cannot just declare additional utility de facto irrelevant past a certain point, then there is no way for you to formulate a moral theory that can avoid being swamped by utility comparisons. Once the utility stakes get large enough—and, when considering the scale of human or animal suffering or the size of the future, the utility stakes really are quite large—all other factors become essentially irrelevant, supplying no relevant information for our evaluation of actions or outcomes. ...

Once you let utilitarian calculations into your moral theory at all, there is no principled way to prevent them from swallowing everything else. And, in turn, there’s no way to have these calculations swallow everything without them leading to pretty absurd results. While some of you might bite the bullet on the repugnant conclusion or the experience machine, it is very likely that you will eventually find a bullet that you don’t want to bite, and you will want to get off the train to crazy town; but you cannot consistently do this without giving up the idea that scale matters, and that it doesn’t just stop mattering after some point.

Comment by MondSemmel on A Slow Guide to Confronting Doom · 2025-04-09T19:33:51.771Z · LW · GW

My initial comment isn't really arguing for the >99% thing. Most of that comes from me sharing the same so-called pessimistic (I would say realistic) expectations as some LWers (e.g. Yudkowsky's AGI Ruin: A List of Lethalities) that the default outcome of AI progress is unaligned AGI -> unaligned ASI -> extinction, that we're fully on track for that scenario, and that it's very hard to imagine how we'd get off that track.

meaning you believe that any one of those things is sufficient enough to ensure doom on its own (which seems nowhere near obviously true to me?)

No, I didn't mean it like that. I meant that we're currently (in 2025) in the >99% doom scenario, and I meant it seemed to me like we were overdetermined (even back in e.g. 2010) to end up in that scenario (contra Ruby's "doomed for no better reason than because people were incapable of not doing something"), even if some stuff changed, e.g. because some specific actors like our leading AI labs didn't come to exist. Because we're in a world where technological extinction is possible and the default outcome of AI research, and our civilization is fundamentally unable to grapple with that fact. Plus a bunch of our virtues (like democracy, or freedom of commerce) turn from virtue to vice in a world where any particular actor can doom everyone by doing sufficient technological research; we have no mechanism whereby these actors are forced to internalize these negative externalities of their actions (like via extinction insurance or some such).

that not even an extra 50 years could move

I don't understand this part. Do you mean an alternative world scenario where compute and AI progress had been so slow, or the compute and algorithmic requirements for AGI had been so high, that our median expected time for a technological singularity would be around the year 2070? I can't really imagine a coherent world where AI alignment progress is relatively easier to accomplish than algorithmic progress (e.g. AI progress yields actual feedback, whereas AI alignment research yields hardly any feedback), so wouldn't we then in 2067 just be in the same situation as we are now?

Although, something I’d like to see would be some kind of coordination regarding setting standards for testing or minimum amounts of safety research and then have compliance reviewed by a board maybe, with both legal and financial penalties to be administered in case of violations.

I don't understand the world model where that prevents any negative outcomes. For instance, AI labs like OpenAI currently argue that they should be under zero regulations, and even petitioned the US government to be exempted from regulation; and the current US government itself cheerleads race dynamics and is strictly against safety research. Even if some AI labs voluntarily submitted themselves to some kinds of standards, that wouldn't help anyone when OpenAI and the US government don't play ball.

(Not to mention that the review board would inevitably be captured by interests like anti-AI-bias stuff, since there's neither sufficient expertise nor a sufficient constituency for anti-extinction policies.)

Something that would get me there would be actually seeing the cloud of poison spewing death drones (or whatever) flying towards me. Heck, even if I had a crystal ball right now and saw exactly that, I still wouldn’t see previously having a >99% credence as justifiable.

That's a disbelief in superintelligence. You need to deflect the asteroid (prevent unaligned ASI from coming into being) long before it crashes into earth, not only when it's already burning up in the atmosphere. From my perspective, the asteroid is already almost upon us (e.g. see the recent AI 2027 forecast), you're just not looking at it, or you're not understanding what you're seeing.

Comment by MondSemmel on A Slow Guide to Confronting Doom · 2025-04-07T20:11:17.903Z · LW · GW

I'm not that invested in defending the p>99% thing; as Yudkowsky argues in this tweet:

If you want to trade statements that will actually be informative about how you think things work, I'd suggest, "What is the minimum necessary and sufficient policy that you think would prevent extinction?"

I see the business-as-usual default outcome as AI research progressing until unaligned AGI, resulting in an intelligence explosion, and thus extinction. That would be the >99% thing.

The kinds of minimum necessary and sufficient policies I can personally imagine which might possibly prevent that default outcome, would require institutions laughably more competent than what we have, and policies utterly outside the Overton window. Like a global ban on AI research plus a similar freeze of compute scaling, enforced by stuff like countries credibly threatening global nuclear war over any violations. (Though probably even that wouldn't work, because AI research and GPU production cannot be easily detected via inspections and surveillance, unlike the case of producing nuclear weapons.)

Comment by MondSemmel on A Slow Guide to Confronting Doom · 2025-04-06T23:07:12.220Z · LW · GW

One more virtue-turned-vice for my original comment, pacifism and disarmament: the world would be a more dangerous place if more countries had more nukes etc., and we might well have had a global nuclear war by now. But also, more war means more institutional turnover, and the destruction and reestablishment of institutions is about the only mechanism of institutional reform which actually works. Furthermore, if any country could threaten war or MAD against AI development, that might be one of the few things that could possibly actually enforce an AI Stop.

Comment by MondSemmel on A Slow Guide to Confronting Doom · 2025-04-06T22:37:50.741Z · LW · GW

I know that probabilities are in the map, not in the territory. I'm just wondering if we were ever sufficiently positively justified to anticipate a good future, or if we were just uncertain about the future and then projected our hopes and dreams onto this uncertainty, regardless of how realistic that was. In particular, the Glorious Transhumanist Future requires the same technological progress that can result in technological extinction, so I question whether the former should've ever been seen as the more likely or default outcome.

I've also wondered about how to think about doom vs. determinism. A related thorny philosophical issue is anthropics: I was born in 1988, so from my perspective the world couldn't have possibly ended before then, but that's no defense whatsoever against extinction after that point.

Re: AI timelines, again this is obviously speaking from hindsight, but I now find it hard to imagine how there could've ever been 50-year timelines. Maybe specific AI advances could've come a bunch of years later, but conversely, compute progress followed Moore's Law and IIRC showed no sign of slowing down, because compute is universally economically useful. And so even if algorithmic advances had been slower, compute progress could've made up for that to some extent.

Re: solving coordination problems: some of these just feel way too intractable. Take the US constitution, which governs your political system: IIRC it was meant to be frequently updated in constitutional conventions, but instead the political system ossified and the last meaningful amendment (18-year voting age) was ratified in 1971, or 54 years ago. Or, the US Senate made itself increasingly ungovernable with the filibuster, and even the current Republican-majority Senate didn't deign to abolish it. Etc. Our political institutions lack automatic repair mechanisms, so they inevitably deteriorate over time, when what we needed was for them to improve over time instead.

Comment by MondSemmel on A Slow Guide to Confronting Doom · 2025-04-06T21:54:31.780Z · LW · GW

I personally first deeply felt the sense of "I'm doomed, I'm going to die soon" almost exactly a year ago, due to a mix of illness and AI news. It was a double-whammy of getting both my mortality, and AGI doom, for the very first time.

Re: mortality, it felt like I'd been immortal up to that point, or more accurately a-mortal or non-mortal or something. Up to that point I hadn't anticipated death happening to me as anything more than a theoretical exercise. I was 35, felt reasonably healthy, was familiar with transhumanism, had barely witnessed any deaths in the family, etc. I didn't feel like a mortal being that can die very easily, but more like some permanently existing observer watching a livestream of my life: it's easy to imagine turning off the livestream, but much harder to imagine that I, the observer, will eventually turn off.

After I felt like I'd suddenly become mortal, I experienced panic attacks for months.

Re: AGI doom: even though I've thought way less about this topic than you, I do want to challenge this part:

And doomed for no better reason than because people were incapable of not doing something.

Just as I felt non-mortal because of an anticipated transhumanist future or something, so too did it feel like the world was not doomed, until one day it was. But did the probability of doom suddenly jump to >99% in the last few years, or was the doom always the default outcome and we were just wrong to expect anything else? Was our glorious transhumanist future taken from us, or was it merely a fantasy, and the default outcome was always technological extinction?

Are we in a timeline where a few actions by key players doomed us, or was near-term doom always the default overdetermined outcome? Suppose we go back to the founding of LessWrong in 2009, or the founding of OpenAI in 2015. Would a simple change, like OpenAI not being founded, actually meaningfully change the certainty of doom, or would it have only affected the timeline by a few years? (That said, I should stress that I don't absolve anyone who dooms us in this timeline from their responsibility.)

From my standpoint now in 2025, AGI doom seems overdetermined for a number of reasons, like:

  • Humans are the first species barely smart enough to take over the planet, and to industrialize, and to climb the tech ladder. We don't have dath ilan's average IQ of 180 or whatever. The arguments for AGI doom are just too complicated and time-consuming to follow, for most people. And even when people follow them, they often disagree about them.
  • Our institutions and systems of governance have always been incapable of fully solving mundane problems, let alone extinction-level ones. And they're best at solving problems and disasters they can actually witness and learn from, which doesn't happen with extinction-level problems. And even when we faced a global disaster like Covid-19, our institutions didn't take anywhere near sufficient steps to prevent future pandemics.
  • Capitalism: our best, or even ~only, truly working coordination mechanism for deciding what the world should work on is capitalism and money, which allocates resources towards the most productive uses. This incentivizes growth and technological progress. There's no corresponding coordination mechanism for good political outcomes, incl. for preventing extinction.
  • In a world where technological extinction is possible, tons of our virtues become vices:
    • Freedom: we appreciate freedoms like economic freedom, political freedom, and intellectual freedom. But that also means freedom to (economically, politically, scientifically) contribute to technological extinction. Like, I would not want to live in a global tyranny, but I can at least imagine how a global tyranny could in principle prevent AGI doom, namely by severely and globally restricting many freedoms. (Conversely, without these freedoms, maybe the tyrant wouldn't learn about technological extinction in the first place.)
    • Democracy: politicians care about what the voters care about. But to avert extinction you need to make that a top priority, ideally priority number 1, which it can never be: no voter has ever gone extinct, so why should they care?
    • Egalitarianism: resulted in IQ denialism; if discourse around intelligence was less insane, that would help discussion of superintelligence.
    • Cosmopolitanism: resulted in pro-immigration and pro-asylum policy, which in turn precipitated both a global anti-immigration and an anti-elite backlash.
    • Economic growth: the more the better; results in rising living standards and makes people healthier and happier... right until the point of technological extinction.
    • Technological progress: I've used a computer, and played video games, all my life. So I cheered for faster tech, faster CPUs, faster GPUs. Now the GPUs that powered my games instead speed us up towards technological extinction. Oops.
  • And so on.

Yudkowsky had a glowfic story about how dath ilan prevents AGI doom, and that requires a whole bunch of things to fundamentally diverge from our world. Like a much smaller population; an average IQ beyond genius-level; fantastically competent institutions; a world government; a global conspiracy to slow down compute progress; a global conspiracy to work on AI alignment; etc.

I can imagine such a world to not blow itself up. But even if you could've slightly tweaked our starting conditions from a few years or decades ago, weren't we going to blow ourselves up anyway?

And if doom is sufficiently overdetermined, then the future we grieve for, transhumanist or otherwise, was only ever a mirage.

Comment by MondSemmel on Why Have Sentence Lengths Decreased? · 2025-04-06T20:45:13.712Z · LW · GW

I see. I guess I can appreciate that the style is aiming for a particular aesthetic, but for me it's giving up more in clarity than it gains in aesthetic. In a phrasing like "Cant you, Papa? Yes, he said. I can." I have to think about who each part of the dialogue belongs to, and which parts are even dialogue, all due to the missing quotation marks.

This style reads to me like someone removed a bunch of parentheses from a math formula, ones which may not be strictly necessary if one knows about some non-universal order of operations. This may look prettier in some sense, but in exchange it will definitely confuse a fraction of readers. I personally don't think this tradeoff is worth it.

Comment by MondSemmel on Why Have Sentence Lengths Decreased? · 2025-04-06T10:30:40.413Z · LW · GW

The colon seems optional to me, but quotation marks absolutely aren't, as evidenced by how comparatively unreadable this author's dialogue looks. From his book "The Road":

He screwed down the plastic cap and wiped the bottle off with a rag and hefted it in his hand. Oil for their little slutlamp to light the long gray dusks, the long gray dawns. You can read me a story, the boy said. Cant you, Papa? Yes, he said. I can.

That already looks unnecessarily hard to read even though the dialogue is so short. I guess the author made it work somehow, but this seems like artificially challenging oneself to write a novel without the letter 'E': intriguing, but not beneficial to either reader or prose.

Comment by MondSemmel on Why Have Sentence Lengths Decreased? · 2025-04-05T16:55:40.902Z · LW · GW

Sourcing the Orwell quote:

This mermaid of the punctuation world—period above, comma below—is viewed with suspicion by many people, including well-known writers. George Orwell deliberately avoided semicolons in his novel Coming Up for Air (London: V. Gollancz, 1939). As he explained to his editor (Roger Senhouse) at the time, “I had decided … that the semicolon is an unnecessary stop and that I would write my next book without one” (quoted in George Orwell: The Collected Essays, Journalism & Letters, ed. Sonia Orwell and Ian Angus, in Vol. 4: In Front of Your Nose, Jaffrey, NH: David R. Godine, 2000). Kurt Vonnegut had this advice for writers: “First rule: Do not use semicolons. They are transvestite hermaphrodites representing absolutely nothing. All they do is show you’ve been to college” (A Man Without a Country, New York: Seven Stories Press, 2005).

[...] British journalist Lynne Truss affirmed that “a full stop ought always to be an alternative” to the semicolon (Eats, Shoots & Leaves, New York: Gotham Books, 2004). The American writer Noah Lukeman views the semicolon as a mark more suitable for creative writing. Otherwise, he argues, “The first thing to realize is that one could always make a case for not using a semicolon. As an unnecessary form of punctuation, as the luxury item in the store, we must ask ourselves: why use it at all?” (A Dash of Style: The Art and Mastery of Punctuation, New York: Norton, 2006).

And this article has an infographic "number of semicolons per 100,000 words" for a bunch of famous authors. And it includes this claim (though note that statistics from tools like Google Books Ngram Viewer can suffer from stuff like OCR idiosyncrasies).

You probably notice the older authors I’ve selected use far more than modern authors. Google Books Ngram Viewer, which includes novels, nonfiction, and even scientific literature, shows that semicolon use has dropped by about 70 percent from 1800 to 2000.
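(As an aside, a rough sketch of how such a "semicolons per 100,000 words" figure can be computed, assuming simple whitespace tokenization; real corpus statistics are more careful than this.)

```python
def semicolons_per_100k_words(text: str) -> float:
    """Semicolon count normalized to a 100,000-word baseline (crude whitespace tokenization)."""
    words = text.split()
    if not words:
        return 0.0
    return text.count(";") * 100_000 / len(words)

sample = "I came; I saw; I conquered. Then I went home."
print(semicolons_per_100k_words(sample))  # toy input, so the absolute number means little
```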

Comment by MondSemmel on AI 2027: What Superintelligence Looks Like · 2025-04-05T08:50:31.675Z · LW · GW

Large institutions are super slow to change, and usually many years behind the technological frontier. It seems to me like the burden of proof is very obviously on your perspective. For instance, US policy only acted large-scale on Covid after we were already far along the exponential. That should be a dealbreaker for this being your dealbreaker.

Also, there is no single entity called "the government"; individuals can be more or less aware of stuff, but that doesn't mean the larger entity acts with something resembling awareness. Or cohesion, for that matter.

Comment by MondSemmel on Mo Putera's Shortform · 2025-03-30T20:23:47.493Z · LW · GW

LLMs use tokens instead of letters, so counting letters is sufficiently unnatural to them relative to their other competencies that I don't see much value in directly asking LLMs to do this kind of thing. At least give them some basic scaffolding, like a full English dictionary with a column which explicitly indicates respective word lengths. In particular, the Gemini models have a context window of 1M tokens, which should be enough to fit most of the Oxford English Dictionary in there (since it includes 171k words which are in current use).
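(A toy sketch of the kind of scaffolding I mean, with a placeholder word list standing in for a real dictionary: precompute the letter counts in ordinary code and hand the model an explicit table, instead of asking it to count letters across tokens.)

```python
# Toy sketch, not a benchmark: hand the model a precomputed word-length column
# instead of asking it to count letters across tokens.
word_list = ["strawberry", "onomatopoeia", "rhythm"]  # placeholder for a full English dictionary

scaffold = "\n".join(f"{word}\t{len(word)} letters" for word in word_list)
prompt = (
    "Answer using only the table below.\n"
    "Which of these words has the most letters?\n\n"
    "word\tlength\n" + scaffold
)
print(prompt)
```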

Comment by MondSemmel on Policy for LLM Writing on LessWrong · 2025-03-25T23:31:13.690Z · LW · GW

If you're still open to inspiration for this implementation of collapsible sections, I'll reiterate my recommendation of Notion's implementation of toggles and toggle headings, in terms of both aesthetics and effect. For example, I love having the ability to make both bullet points and headings collapsible, and I love how easy they are to create (by beginning an empty line with "> text").

Comment by MondSemmel on Linch's Shortform · 2025-03-24T22:57:06.264Z · LW · GW

There is a contingent of people who want excellence in education (e.g. Tracing Woodgrains) and are upset about e.g. the deprioritization of math and gifted education and SAT scores in the US. Does that not count?

Given that ~ no one really does this, I conclude that very few people are serious about moving towards a meritocracy.

This sounds like an unreasonably high bar for us humans. You could apply it to all endeavours, and conclude that "very few people are serious about <anything>". Which is true from a certain perspective, but also stretches the word "serious" far past how it's commonly understood.

Comment by MondSemmel on Elizabeth's Shortform · 2025-03-23T23:43:25.282Z · LW · GW

I haven't read Friendship is Optimal, but from the synopsis it sounds like it's clearly and explicitly about AI doom and AI safety, whereas HPMoR is mainly about rationality and only implicitly about x-risk and AI safety?

Comment by MondSemmel on abstractapplic's Shortform · 2025-03-13T21:23:59.637Z · LW · GW

Also see Scott Alexander's Heuristics That Almost Always Work.

Comment by MondSemmel on So how well is Claude playing Pokémon? · 2025-03-08T10:12:09.147Z · LW · GW

By this I was mainly arguing against claims to the effect that this performance is "worse than a human 6-year-old".

Comment by MondSemmel on So how well is Claude playing Pokémon? · 2025-03-08T09:39:06.786Z · LW · GW

Fair. But then also restrict it to someone who has no hands, eyes, etc.

Comment by MondSemmel on So how well is Claude playing Pokémon? · 2025-03-08T08:42:00.302Z · LW · GW

Further, have you ever gotten an adult who doesn't normally play video games to try playing one? They have a tendency to get totally stuck in tutorial levels because game developers rely on certain "video game motifs" for load-bearing forms of communication; see e.g. this video.

So much +1 on this.

Also, I've played a ton of games, and in the last few years started helping a bit with playtesting them etc. And I found it striking how games aren't inherently intuitive, but are rather made so via strong economic incentives, endless playtests to stop players from getting stuck, etc. Games are intuitive for humans because humans spend a ton of effort to make them that way. If AIs were the primary target audience, games would be made intuitive for them.

And as a separate note, I'm not sure what the appropriate human reference class for game-playing AIs is, but I challenge the assumption that it should be people who are familiar with games. Rather than, say, people picked at random from anywhere on earth.

Comment by MondSemmel on Viliam's Shortform · 2025-03-07T20:32:45.909Z · LW · GW

Right now it doesn't make sense; it is better to let the current owners keep improving their AIs.

Only if alignment progress keeps up with or exceeds AI progress, and you thus expect a controllable AI you can take over to do your bidding. But isn't all the evidence pointing towards AI progress >> alignment progress?

Comment by MondSemmel on Viliam's Shortform · 2025-03-05T09:27:09.006Z · LW · GW

A lot of things could happen, but something that has already happened is that official US AI policy is now that not racing towards AGI is bad, and that impeding AI progress is bad. Doesn't that policy imply that AI lab nationalization is now less likely, rather than more likely, than it would've been under a D president?

Conversely, your scenario assumes that the Trump administration can do whatever it wants to do, but this ability is partially dependent on it staying popular with the general public. The public may not care about AI for now, but it very much does care about economics and inflation, and once Trump's policies worsen those (e.g. via tariffs), then that severely restricts his ability to take arbitrary actions in other domains.

Comment by MondSemmel on MondSemmel's Shortform · 2025-02-28T15:21:28.708Z · LW · GW

AI assistants are weird. Here's a Perplexity Pro search I did for an EY tweet about finding the sweet spot between utilitarianism & deontology. Perplexity Pro immediately found the correct tweet:

Eliezer Yudkowsky, a prominent figure in the rationalist community, has indeed expressed a view that suggests finding a balance between utilitarianism and deontology. In a tweet, he stated: "Go three-quarters of the way from deontology to utilitarianism and then stop. You are now in the right place. Stay there at least until you..."

But I wondered why it didn't provide the full quote (which is just a few more words, namely "Stay there at least until you have become a god."), and I just couldn't get it to do so, even with requests like "Just quote the full tweet from here: <URL>". Instead, it invented alternative versions like this:

Go three-quarters of the way from deontology to utilitarianism and then stop. You are now in the right place. Stay there at least until you understand why.

or this:

Go three-quarters of the way from deontology to utilitarianism and then stop. You are now in the correct place. Stay there at least until you understand why you shouldn't go any further.

I finally provided the full quote and asked it directly:

Does the following quote represent Yudkowsky's tweet with 100% accuracy?

"Go three-quarters of the way from deontology to utilitarianism and then stop. You are now in the right place. Stay there at least until you have become a god."

And it still doubled down on the wrong version.

Comment by MondSemmel on Benito's Shortform Feed · 2025-02-25T21:51:52.436Z · LW · GW

"Copy link to highlight" is not available in Firefox. And while e.g. Bing search seems to automatically generate these "#:~:text=" links, I find they don't work with any degree of consistency. And they're even more affected by link rot than usual, since any change to the initial text (like a typo fix) will break that part of the link.

Comment by MondSemmel on Cole Wyeth's Shortform · 2025-02-25T21:47:46.478Z · LW · GW

I don't like it. Among various issues, people already muddy the waters by erroneously calling climate change an existential risk (rather than what it was, a merely catastrophic one, before AI timelines made any worries about climate change in the year 2100 entirely irrelevant), and it's extremely partisan-coded. And you're likely to hear that any mention of AI x-risk is a distraction from the real issues, which are whatever the people cared about previously.

I prefer an analogy to gain-of-function research. As in, scientists grow viruses/AIs in the lab, with promises of societal benefits, but without any commensurate acknowledgment of the risks. And you can't trust the bio/AI labs to manage these risks, e.g. even high biosafety levels can't entirely prevent outbreaks.

Comment by MondSemmel on LWLW's Shortform · 2025-02-23T17:12:32.217Z · LW · GW

This seems way overdetermined. For example, AI labs have proven extremely successful at spending arbitrary amounts of money to increase capabilities (<-> scaling laws), and there's been no similar ability to convert arbitrary amounts of money into progress on alignment.

Comment by MondSemmel on LWLW's Shortform · 2025-02-23T12:06:56.820Z · LW · GW

The question is not whether alignment is impossible (though I would be astonished if it was), but rather whether it's vastly easier to increase capabilities to AGI/ASI than it is to align AGI/ASI, and ~all evidence points to yes. And so the first AGI/ASI will not be aligned.

Comment by MondSemmel on LWLW's Shortform · 2025-02-23T10:17:08.427Z · LW · GW

i am much more worried

Why? I figure all the AI labs worry mostly about how to get the loot, without ensuring that there's going to be any loot in the first place. Thus there won't be any loot, and we'll go extinct without any human getting to play god-emperor. It seems to me like trying to build an AGI tyranny is an alignment-complete challenge, and since we're not remotely on track to solving alignment, I don't worry about that particular bad ending.

Comment by MondSemmel on Open Thread Winter 2024/2025 · 2025-02-22T15:05:42.785Z · LW · GW

The German federal election is tomorrow. I looked up whether any party was against AI or for a global AI ban or similar. From what I can tell, the answer is unfortunately no, they all want more AI and just disagree about how much. I don't want to vote for any party that wants more AI, so this is rather disappointing.

Comment by MondSemmel on The case for the death penalty · 2025-02-22T15:01:53.242Z · LW · GW

(Trollish reply. I'm not in favor of the death penalty.) I take your point that a death penalty cannot be implemented if individual executioners need to kill individual people via guns. But I'll counter that it also can't be implemented in the 21st century via gas chambers, because your executioners will realize the parallel to Nazi Germany. ("Are we the baddies?") To split the difference, how about having executioners execute people via drones armed with bullets?

Comment by MondSemmel on Arbital has been imported to LessWrong · 2025-02-20T16:35:11.287Z · LW · GW

I assume the idea of "lens" as a term is that it's one specific person's opinionated view of a topic. As in, "here's the concept seen through EY's lens". So terms like "variant" or "alternative" are too imprecise, but e.g. "perspective" might also work.

Comment by MondSemmel on ozziegooen's Shortform · 2025-02-19T22:07:33.095Z · LW · GW

Based on AI organisations frequently achieving the opposite of their chosen name (OpenAI, Safe Superintelligence, etc.), UNBIASED would be the most biased model, INTELLECT would be the dumbest model, JUSTICE would be particularly unjust, MAGA would in effect be MAWA, etc.

Comment by MondSemmel on Daniel Kokotajlo's Shortform · 2025-02-19T15:00:42.610Z · LW · GW

I'm skeptical about the extent to which the latter can be done. That's like saying an AI lab should suddenly care about AI safety. One can't really bolt a security mandate onto an existing institution and expect a competent result.

Comment by MondSemmel on Daniel Kokotajlo's Shortform · 2025-02-19T10:21:21.341Z · LW · GW

or stopped disclosing its advancements publicly

Does this matter all that much, given the lack of opsec, personal relationships between and poaching of employees across labs, corporate espionage, etc.?

Comment by MondSemmel on nikola's Shortform · 2025-02-19T10:17:53.925Z · LW · GW

A more cynical perspective is that much of this arms race, especially the international one against China (quote from above: "If we don't build fast enough, then the authoritarian countries could win."), is entirely manufactured by the US AI labs.

Comment by MondSemmel on Jay Bailey's Shortform · 2025-02-17T15:26:44.096Z · LW · GW

Thanks! I haven't had time to skim either report yet, but I thought it might be instructive to ask the same questions to Perplexity Pro's DR mode (its "Deep Research" shares OpenAI's name but is otherwise unrelated; in particular, it's free with their regular monthly subscription, so it can't be particularly powerful): see here. (This is the second report I generated, as the first one froze indefinitely while the app generated the conclusion, and thus couldn't be shared.)

Comment by MondSemmel on Jay Bailey's Shortform · 2025-02-15T16:23:29.948Z · LW · GW

How about "analyze the implications for risk of AI extinction, based on how OpenAI's safety page has changed over time"? Inspired by this comment (+ follow-up w/ Internet archive link).

Comment by MondSemmel on Mateusz Bagiński's Shortform · 2025-02-10T20:50:13.010Z · LW · GW

apparently China as a state has devoted $1 trillion to AI

Source? I only found this article about 1 trillion yuan, which is about $137 billion, i.e. roughly a factor of seven less than the claimed $1 trillion.

Comment by MondSemmel on artifex0's Shortform · 2025-02-06T10:15:06.893Z · LW · GW

If this risk is in the ballpark of a 5% chance in the next couple of years, then it seems to me entirely dominated by AI doom.

Comment by MondSemmel on ChristianKl's Shortform · 2025-01-30T14:13:52.405Z · LW · GW

Yeah. Though as a counterpoint, something I picked up from IIRC Scott Alexander or Marginal Revolution is that the FDA is not great about accepting foreign clinical trials, or demands that they always be supplemented by trials of Americans, or similar.

Comment by MondSemmel on What Goes Without Saying · 2025-01-25T22:24:46.882Z · LW · GW

Milton Friedman teaspoon joke

Total tangent: this article from 2011 attributes the quote to a bunch of people, and finds an early instance in a 1901 newspaper article.

Comment by MondSemmel on Yonatan Cale's Shortform · 2025-01-20T15:18:23.122Z · LW · GW

Law question: would such a promise among businesses, rather than an agreement mandated by / negotiated with governments, run afoul of laws related to monopolies, collusion, price gouging, or similar?

Comment by MondSemmel on Noosphere89's Shortform · 2025-01-18T19:07:13.810Z · LW · GW

I like Yudkowsky's toy example of tasking an AGI to copy a single strawberry, on a molecular level, without destroying the world as a side-effect.

Comment by MondSemmel on I'm offering free math consultations! · 2025-01-14T22:14:14.213Z · LW · GW

You're making a very generous offer of your time and expertise here. However, to me your post still feels way, way more confusing than it should be.

Suggestions & feedback:

  • Title: "Get your math consultations here!" -> "I'm offering free math consultations for programmers!" or similar.
    • Or something else entirely. I'm particularly confused how your title (math consultations) leads into the rest of the post (debuggers and programming).
  • First paragraph: As your first sentence, mention your actual, concrete offer (something like "You screenshare as you do your daily tinkering, I watch for algorithmic or theoretical squiggles that cost you compute or accuracy or maintainability." from your original post, though ideally with much less jargon). Also your target audience: math people? Programmers? AI safety people? Others?
  • "click the free https://calendly.com/gurkenglas/consultation link" -> What you mean is: "click this link for my free consultations". What I read is a dark pattern à la: "this link is free, but the consultations are paid". Suggested phrasing: something like "you can book a free consultation with me at this link"
  • Overall writing quality
    • Assuming all your users would be as happy as the commenters you mentioned, it seems to me like the writing quality of these posts of yours might be several levels below your skill as a programmer and teacher. In which case it's no wonder that you don't get more uptake.
    • Suggestion 1: feed the post into an LLM and ask it for writing feedback.
    • Suggestion 2: imagine you're a LW user in your target audience, whoever that is, and you're seeing the post "Get your math consultations here!" in the LW homepage feed, written by an unknown author. Do people in your target audience understand what your post is about, enough to click on the post if they would benefit from it? Then once they click and read the first paragraph, do they understand what it's about and click on the link if they would benefit from it? Etc.

Comment by MondSemmel on quila's Shortform · 2025-01-14T16:24:18.275Z · LW · GW

Are you saying that the 1 aligned mind design in the space of all potential mind designs is an easier target than the subspace composed of mind designs that does not destroy the world?

I didn't mean that there's only one aligned mind design, merely that almost all (99.999999...%) conceivable mind designs are unaligned by default, so the only way to survive is if the first AGI is designed to be aligned; there's no hope that a random AGI just happens to be aligned. And since we're heading for the latter scenario, it would be very surprising to me if we managed to design a partially aligned AGI and lose that way.

No, because the you who can ask (the persons in power) is themselves misaligned with the 1 alignment target that perfectly captures all our preferences.

I expect the people in power are worrying about this way more than they worry about the overwhelming difficulty of building an aligned AGI in the first place. (Case in point: the manufactured AI race with China.) As a result I expect they'll succeed at building a by-default-unaligned AGI and driving themselves and us to extinction. So I'm not worried about instead ending up in a dystopia ruled by some government or AI lab owner.

Comment by MondSemmel on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2025-01-13T22:58:19.421Z · LW · GW

Have donated $400. I appreciate the site and its team for all they've done over the years. I'm not optimistic about the future wrt AI (I'm firmly on the AGI doom side), but I nonetheless think that LW made a positive contribution on the topic.

Anecdote: In 2014 I was at a LW Community Weekend retreat in Berlin that Habryka either organized or gave a whole bunch of rationality-themed presentations at. My main impression of him was that he was the most agentic person in the room by far. Based on that experience I fully expected him to eventually accomplish some arbitrary impressive thing, though it still took me by surprise to see him specifically move to the US and eventually become the new admin/site owner of LW.

Comment by MondSemmel on Bryce Robertson's Shortform · 2025-01-09T12:23:28.404Z · LW · GW

Recommendation: make the "Last updated" timestamp on these pages way more prominent, e.g. by moving them to the top below the page title. (Like what most news websites nowadays do for SEO, or like where timestamps are located on LW posts.) Otherwise absolutely no-one will know that you do this, or that these resources are not outdated but are actually up-to-date.

The current timestamp location is so unusual that I only noticed it by accident, and was in fact about to write a comment suggesting you add a timestamp at all.

Comment by MondSemmel on OpenAI #10: Reflections · 2025-01-08T11:58:06.424Z · LW · GW

The frustrating thing is that in some ways this is exactly right (humanity is okay at resolving problems iff we get frequent feedback) and in other ways exactly wrong (one major argument for AI doom is that you can't learn from the feedback of having destroyed the world).

Comment by MondSemmel on OpenAI #10: Reflections · 2025-01-08T11:53:36.250Z · LW · GW

The implication is that you absolutely can't take Altman at his bare word, especially when it comes to any statement he makes that, if true, would result in OpenAI getting more resources. Thus you need to a) apply some interpretative filter to everything Altman says, and b) instead listen to other people who, unlike Altman, don't have a public track record of manipulation.