LessWrong 2.0 Reader

[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (2)
Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (17)
Trying to translate when people talk past each other
Kaj_Sotala · 2024-12-17T09:40:02.640Z · comments (12)
A path to human autonomy
Nathan Helm-Burger (nathan-helm-burger) · 2024-10-29T03:02:42.475Z · comments (14)
[link] A car journey with conservative evangelicals - Understanding some British political-religious beliefs
Nathan Young · 2024-12-06T11:22:45.563Z · comments (8)
MATS mentor selection
DanielFilan · 2025-01-10T03:12:52.141Z · comments (8)
Causal Undertow: A Work of Seed Fiction
Daniel Murfet (dmurfet) · 2024-12-08T21:41:48.132Z · comments (0)
AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment
DanielFilan · 2024-12-01T06:00:06.345Z · comments (0)
What happens next?
Logan Zoellner (logan-zoellner) · 2024-12-29T01:41:33.685Z · comments (19)
[question] What are the most interesting / challenging evals (for humans) available?
Raemon · 2024-12-27T03:05:26.831Z · answers+comments (13)
[link] Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake
TurnTrout · 2024-11-19T18:36:20.721Z · comments (5)
Winners of the Essay competition on the Automation of Wisdom and Philosophy
owencb · 2024-10-28T17:10:04.272Z · comments (3)
My January alignment theory Nanowrimo
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T00:07:24.050Z · comments (2)
How to use bright light to improve your life.
Nat Martin (nat-martin) · 2024-11-18T19:32:10.667Z · comments (10)
Estimating the benefits of a new flu drug (BXM)
DirectedEvolution (AllAmericanBreakfast) · 2025-01-06T04:31:16.837Z · comments (2)
Dmitry's Koan
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-10T04:27:30.346Z · comments (6)
Signaling with Small Orange Diamonds
jefftk (jkaufman) · 2024-11-07T20:20:08.026Z · comments (1)
Open Source Replication of Anthropic’s Crosscoder paper for model-diffing
Connor Kissane (ckkissane) · 2024-10-27T18:46:21.316Z · comments (4)
[question] Are You More Real If You're Really Forgetful?
Thane Ruthenis · 2024-11-24T19:30:55.233Z · answers+comments (25)
AI Safety Camp 10
Robert Kralisch (nonmali-1) · 2024-10-26T11:08:09.887Z · comments (9)
Litigate-for-Impact: Preparing Legal Action against an AGI Frontier Lab Leader
Sonia Joseph (redhat) · 2024-12-07T21:42:29.038Z · comments (7)
Drug development costs can range over two orders of magnitude
rossry · 2024-11-03T23:13:17.685Z · comments (0)
Sleep, Diet, Exercise and GLP-1 Drugs
Zvi · 2025-01-21T12:20:06.018Z · comments (3)
Why modelling multi-objective homeostasis is essential for AI alignment (and how it helps with AI safety as well)
Roland Pihlakas (roland-pihlakas) · 2025-01-12T03:37:59.692Z · comments (5)
Worries about latent reasoning in LLMs
CBiddulph (caleb-biddulph) · 2025-01-20T09:09:02.335Z · comments (3)
Lecture Series on Tiling Agents
abramdemski · 2025-01-14T21:34:03.907Z · comments (14)
Rolling Thresholds for AGI Scaling Regulation
Larks · 2025-01-12T01:30:23.797Z · comments (6)
Six Small Cohabitive Games
Screwtape · 2025-01-15T21:59:29.778Z · comments (7)
Resolving von Neumann-Morgenstern Inconsistent Preferences
niplav · 2024-10-22T11:45:20.915Z · comments (5)
The Laws of Large Numbers
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-04T11:54:16.967Z · comments (11)
Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-01-07T03:08:51.447Z · comments (2)
[link] Locally optimal psychology
Chipmonk · 2024-11-25T18:35:11.985Z · comments (7)
Evolution and the Low Road to Nash
Aydin Mohseni (aydin-mohseni) · 2025-01-22T07:06:32.305Z · comments (2)
Doing Research Part-Time is Great
casualphysicsenjoyer (hatta_afiq) · 2024-11-22T19:01:15.542Z · comments (7)
Orca communication project - seeking feedback (and collaborators)
Towards_Keeperhood (Simon Skade) · 2024-12-03T17:29:40.802Z · comments (16)
The quantum red pill or: They lied to you, we live in the (density) matrix
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-17T13:58:16.186Z · comments (34)
Intent alignment as a stepping-stone to value alignment
Seth Herd · 2024-11-05T20:43:24.950Z · comments (4)
[link] Scaling Wargaming for Global Catastrophic Risks with AI
rai (nonveumann) · 2025-01-18T15:10:39.696Z · comments (2)
Is the Power Grid Sustainable?
jefftk (jkaufman) · 2024-10-26T02:30:06.612Z · comments (38)
[link] The Way According To Zvi
Sable · 2024-12-07T17:35:48.769Z · comments (0)
[link] Big tech transitions are slow (with implications for AI)
jasoncrawford · 2024-10-24T14:25:06.873Z · comments (16)
Fireplace and Candle Smoke
jefftk (jkaufman) · 2025-01-01T01:50:01.408Z · comments (4)
Deep Learning is cheap Solomonoff induction?
Lucius Bushnaq (Lblack) · 2024-12-07T11:00:56.455Z · comments (1)
Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data
Sohaib Imran (sohaib-imran) · 2024-11-16T23:22:21.857Z · comments (11)
A Matter of Taste
Zvi · 2024-12-18T17:50:07.201Z · comments (4)
AI #98: World Ends With Six Word Story
Zvi · 2025-01-09T16:30:07.341Z · comments (2)
Childhood and Education #8: Dealing with the Internet
Zvi · 2025-01-06T14:00:09.604Z · comments (7)
Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
EuanMcLean (euanmclean) · 2024-10-29T12:16:18.448Z · comments (8)
Grammars, subgrammars, and combinatorics of generalization in transformers
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T09:37:23.191Z · comments (0)
A Straightforward Explanation of the Good Regulator Theorem
Alfred Harwood · 2024-11-18T12:45:48.568Z · comments (3)