LessWrong 2.0 Reader

AI as a powerful meme, via CGP Grey
TheManxLoiner · 2024-10-30T18:31:58.544Z · comments (8)

Motivation control
Joe Carlsmith (joekc) · 2024-10-30T17:15:50.881Z · comments (7)

[link] Analyzing how SAE features evolve across a forward pass
bensenberner · 2024-11-07T22:07:02.827Z · comments (0)

~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)

[link] The Choice Transition
owencb · 2024-11-18T12:30:56.198Z · comments (4)

[link] Dangerous capability tests should be harder
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:20:50.610Z · comments (3)

[link] Literacy Rates Haven't Fallen By 20% Since the Department of Education Was Created
Maxwell Tabarrok (maxwell-tabarrok) · 2024-11-22T20:53:59.007Z · comments (0)

Reading RFK Jr so that you don’t have to
braces · 2024-11-22T00:59:19.583Z · comments (0)

Monthly Roundup #24: November 2024
Zvi · 2024-11-18T13:20:06.086Z · comments (14)

AI #89: Trump Card
Zvi · 2024-11-07T16:30:05.684Z · comments (12)

[link] Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake
TurnTrout · 2024-11-19T18:36:20.721Z · comments (5)

Winners of the Essay competition on the Automation of Wisdom and Philosophy
owencb · 2024-10-28T17:10:04.272Z · comments (3)

How to use bright light to improve your life.
Nat Martin (nat-martin) · 2024-11-18T19:32:10.667Z · comments (10)

[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (2)

[link] FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
Tamay · 2024-11-14T06:13:22.042Z · comments (0)

Signaling with Small Orange Diamonds
jefftk (jkaufman) · 2024-11-07T20:20:08.026Z · comments (1)

Drug development costs can range over two orders of magnitude
rossry · 2024-11-03T23:13:17.685Z · comments (0)

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing
Connor Kissane (ckkissane) · 2024-10-27T18:46:21.316Z · comments (4)

AI Safety Camp 10
Robert Kralisch (nonmali-1) · 2024-10-26T11:08:09.887Z · comments (9)

Is the Power Grid Sustainable?
jefftk (jkaufman) · 2024-10-26T02:30:06.612Z · comments (38)

Doing Research Part-Time is Great
casualphysicsenjoyer (hatta_afiq) · 2024-11-22T19:01:15.542Z · comments (7)

Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
EuanMcLean (euanmclean) · 2024-10-29T12:16:18.448Z · comments (8)

[question] Feedback request: what am I missing?
Nathan Helm-Burger (nathan-helm-burger) · 2024-11-02T17:38:39.625Z · answers+comments (5)

ARENA 4.0 Impact Report
Chloe Li (chloe-li-1) · 2024-11-27T20:51:54.844Z · comments (1)

[link] Locally optimal psychology
Chipmonk · 2024-11-25T18:35:11.985Z · comments (7)

Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy
Joe Rogero · 2024-11-12T23:55:46.770Z · comments (17)

Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data
Sohaib Imran (sohaib-imran) · 2024-11-16T23:22:21.857Z · comments (5)

AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory
DanielFilan · 2024-11-27T06:30:03.821Z · comments (0)

Basics of Handling Disagreements with People
Camille Berger (Camille Berger) · 2024-11-12T17:55:08.143Z · comments (4)

[question] Are You More Real If You're Really Forgetful?
Thane Ruthenis · 2024-11-24T19:30:55.233Z · answers+comments (24)

The slingshot helps with learning
Wilson Wu (wilson-wu) · 2024-10-31T23:18:16.762Z · comments (0)

Empathy/Systemizing Quotient is a poor/biased model for the autism/sex link
tailcalled · 2024-11-04T21:11:57.788Z · comments (0)

Housing Roundup #10
Zvi · 2024-10-29T13:50:09.416Z · comments (2)

[link] Stone Age Herbalist's notes on ant warfare and slavery
trevor (TrevorWiesinger) · 2024-11-09T02:40:01.128Z · comments (0)

A path to human autonomy
Nathan Helm-Burger (nathan-helm-burger) · 2024-10-29T03:02:42.475Z · comments (12)

Intent alignment as a stepping-stone to value alignment
Seth Herd · 2024-11-05T20:43:24.950Z · comments (4)

[question] How should vegans think about Methionine needs?
ChristianKl · 2024-11-10T09:28:47.655Z · answers+comments (2)

Bay Winter Solstice 2024: Speech Auditions
ozymandias · 2024-11-04T22:31:38.680Z · comments (0)

AI #90: The Wall
Zvi · 2024-11-14T14:10:04.562Z · comments (6)

Context-dependent consequentialism
Jeremy Gillen (jeremy-gillen) · 2024-11-04T09:29:24.310Z · comments (6)

Meme Talking Points
ymeskhout · 2024-11-06T15:27:54.024Z · comments (0)

SAE Probing: What is it good for? Absolutely something!
Subhash Kantamneni (subhashk) · 2024-11-01T19:23:55.418Z · comments (0)

Incentive design and capability elicitation
Joe Carlsmith (joekc) · 2024-11-12T20:56:05.088Z · comments (0)

Compute and size limits on AI are the actual danger
Shmi (shminux) · 2024-11-23T21:29:37.433Z · comments (5)

Call for evaluators: Participate in the European AI Office workshop on general-purpose AI models and systemic risks
Tom DAVID (tom-david) · 2024-11-27T02:54:16.263Z · comments (0)

Causal inference for the home gardener
braces · 2024-11-27T17:55:52.629Z · comments (1)

[link] Why Recursion Pharmaceuticals abandoned cell painting for brightfield imaging
Abhishaike Mahajan (abhishaike-mahajan) · 2024-11-05T14:51:41.310Z · comments (1)

Option control
Joe Carlsmith (joekc) · 2024-11-04T17:54:03.073Z · comments (0)

[link] Arithmetic Models: Better Than You Think
kqr · 2024-10-26T09:42:07.185Z · comments (5)

Trading Candy
jefftk (jkaufman) · 2024-11-01T01:10:08.024Z · comments (4)


Recent comments

niplav on OpenAI Email Archives (from Musk v. Altman)

We had originally just wanted space cycles donated

I think this is a mistake, and it should be "spare cycles" instead.

annasalamon on Information vs Assurance

Seems helpful for understanding how believing-ins get formed by groups, sometimes.

vladimir_nesov on Bogdan Ionut Cirstea's Shortform

From a proliferation perspective, it reduces overhang, makes it more likely that Llama 4 gets long reasoning trace post-training in-house rather than later, and so initial capability evaluations give more relevant results. But if Llama 4 is already training, there might not be enough time for the technique to mature, and Llamas have been quite conservative in their techniques so far.

zac-hatfield-dodds on Is the mind a program?

I think your argument also has to establish that the cost of simulating any that happen to matter is also quite high.

My intuition is that capturing enough secondary mechanisms, in sufficient-but-abstracted detail that the simulated brain is behaviorally normal (e.g. a sim of me not-more-different than a very sleep-deprived me), is likely to be both feasible by your definition and sufficient for consciousness.

vladimir_nesov on LLM chatbots have ~half of the kinds of "consciousness" that humans believe in. Humans should avoid going crazy about that.

That's relevant, but about what I expected and why I hedged with "it should be possible to post-train", which that paper doesn't explore. The residual stream over many tokens is working memory: N layers of "vertical" compute over one token have only one activation vector to work with, while with more filler tokens you have many activation vectors that can work on multiple things in parallel and then aggregate. If a weaker model doesn't take advantage of this, or gets too hung up on concrete tokens to think about other things in the meantime, instead of being able to maintain multiple trains of thought simultaneously, a stronger model might.[1]

Performance on large questions (such as reading comprehension) with immediate answer (no CoT) shows that N layers across many tokens and no opportunity to get deeper serial compute is sufficient for many purposes. But a question is only fully understood when it's read completely, so some of the thinking about the answer can't start before that. If there are no more tokens, this creates an artificial constraint on working memory for thinking about the answer; filler tokens should be useful for lifting it. Repeating the question seems to help, for example (see Figure 3 and Table 5).

[1] The paper is from Jul 2023 and not from OpenAI, so it didn't get to play with 2e25+ FLOPs models, and a new wave of 2e26+ FLOPs models is currently imminent.

christiankl on Repeal the Jones Act of 1920

Congress does have leadership that's separate from the president. People like Nancy Pelosi have political power.

You also have a lot of other organizations. Organizations like the Chamber of Commerce can drive legislative change as well.

davidmanheim on Mitigating Geomagnetic Storm and EMP Risks to the Electrical Grid (Shallow Dive)

That is an interesting question, but I unfortunately do not know enough to even figure out how to answer it.

viliam on Making a conservative case for alignment

I agree with most of that, but it seems to me that respecting homosexuality is mostly a passive action; if you ignore what other people do, you are already maybe 90% there. Homosexuals don't change their names or pronouns after coming out. You don't have to pretend that ten years ago they were something other than what they appeared to you at that time.

With transsexuality, you get the taboo of deadnaming, and occasionally the weird pronouns.

Also, the reaction seems different when you try to opt out of the game. Like, if someone is uncomfortable with homosexuality, they can say "could we please just... not discuss our sexual relations here, and focus on the job (or some other reason why we are here)?" and that's usually accepted. If someone similarly says "could we please just... call everyone 'they' as a compromise solution, or simply refer to people using their names", that already got some people cancelled.

In short, with homosexuals I never felt like my free speech was under attack.

It is possible that most of the weirdness and pushing boundaries does not actually come from the transsexuals themselves, but rather from woke people who try to be their "allies". Either way, in effect, whenever a discussion about trans topics starts, I feel like "oh my, the woke hordes are coming, people are going to get cancelled". (And I am not really concerned about myself here, because I am not American, so my job is not on the line; and if some online community decides to ban me, well then fuck them. But I don't want to be in a community where people need to watch their tongues, and get filtered by political conformity.)

sil-ver on Is the mind a program?

Sort of not obvious what exactly "causal closure" means if the error tolerance is not specified. We could differentiate literally 100% perfect causal closure, almost perfect causal closure, and "approximate" causal closure. Literally 100% perfect causal closure is impossible for any abstraction due to every electron exerting nonzero force on any other electron in its future lightcone. Almost perfect causal closure (like 99.9%+) might be given for your laptop if it doesn't have a wiring issue(?), maybe if a few more details are included in the abstraction. And then whether or not there exists an abstraction for the brain with approximate causal closure (95% maybe?) is an open question.

I'd argue that almost perfect causal closure is enough for an abstraction to contain relevant information about consciousness, and approximate causal closure probably as well. Of course there's not really a bright line between those two, either. But I think insofar as OP's argument is one against approximate causal closure, those details don't really matter.

donald-hobson on "The Solomonoff Prior is Malign" is a special case of a simpler argument

That is, our experiences got more reality-measure, thus matter more, by being easier to point at them because of their close proximity to the conspicuous event of the hottest object in the Universe coming to existence.

Surely not. Surely our experiences always had more reality measure from the start because we were the sort of people who would soon create the hottest thing.

Reality measure can flow backwards in time. And our present day reality measure is being increased by all the things an ASI will do when we make one.