LessWrong 2.0 Reader

Forecasting One-Shot Games
Raemon · 2024-08-31T23:10:05.475Z · comments (0)

I finally got ChatGPT to sound like me
lsusr · 2024-09-17T09:39:59.415Z · comments (18)

[link] MIRI's September 2024 newsletter
Harlan · 2024-09-16T18:15:40.785Z · comments (0)

[link] What Ketamine Therapy Is Like
Sable · 2024-11-11T11:09:08.602Z · comments (8)

The Shallow Bench
Karl Faulks (karl-faulks) · 2024-11-05T05:07:27.357Z · comments (5)

Toy Models of Feature Absorption in SAEs
chanind · 2024-10-07T09:56:53.609Z · comments (8)

AI #88: Thanks for the Memos
Zvi · 2024-10-31T15:00:07.412Z · comments (5)

Conflating value alignment and intent alignment is causing confusion
Seth Herd · 2024-09-05T16:39:51.967Z · comments (18)

Work with me on agent foundations: independent fellowship
Alex_Altair · 2024-09-21T13:59:16.706Z · comments (5)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap
johnswentworth · 2024-09-19T22:22:05.307Z · comments (47)

~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)

AI #80: Never Have I Ever
Zvi · 2024-09-10T17:50:08.074Z · comments (20)

AI #91: Deep Thinking
Zvi · 2024-11-21T14:30:06.930Z · comments (9)

[link] Dangerous capability tests should be harder
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:20:50.610Z · comments (3)

In defense of technological unemployment as the main AI concern
tailcalled · 2024-08-27T17:58:01.992Z · comments (36)

Secular Solstice Round Up 2024
dspeyer · 2024-11-21T10:49:36.682Z · comments (10)

[link] Analyzing how SAE features evolve across a forward pass
bensenberner · 2024-11-07T22:07:02.827Z · comments (0)

[link] Epistemic status: poetry (and other poems)
Richard_Ngo (ricraz) · 2024-11-21T18:13:17.194Z · comments (4)

Start an Upper-Room UV Installation Company?
jefftk (jkaufman) · 2024-10-19T02:00:10.691Z · comments (9)

[question] "Deception Genre" What Books are like Project Lawful?
Double · 2024-08-28T17:19:52.172Z · answers+comments (20)

[link] Literacy Rates Haven't Fallen By 20% Since the Department of Education Was Created
Maxwell Tabarrok (maxwell-tabarrok) · 2024-11-22T20:53:59.007Z · comments (0)

Economics Roundup #3
Zvi · 2024-09-10T13:50:06.955Z · comments (9)

[link] The Choice Transition
owencb · 2024-11-18T12:30:56.198Z · comments (4)

Motivation control
Joe Carlsmith (joekc) · 2024-10-30T17:15:50.881Z · comments (7)

Monthly Roundup #24: November 2024
Zvi · 2024-11-18T13:20:06.086Z · comments (14)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]
Ruby · 2024-09-19T01:35:02.999Z · comments (12)

Reading RFK Jr so that you don’t have to
braces · 2024-11-22T00:59:19.583Z · comments (0)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

Minimal Motivation of Natural Latents
johnswentworth · 2024-10-14T22:51:58.125Z · comments (14)

MATS AI Safety Strategy Curriculum v2
DanielFilan · 2024-10-07T22:44:06.396Z · comments (6)

[link] Things I learned talking to the new breed of scientific institution
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-29T14:00:14.844Z · comments (6)

Australian AI Safety Forum 2024
Liam Carroll (liam-carroll) · 2024-09-27T00:40:11.451Z · comments (0)

AI #89: Trump Card
Zvi · 2024-11-07T16:30:05.684Z · comments (12)

Time Efficient Resistance Training
romeostevensit · 2024-10-07T15:15:44.950Z · comments (10)

Startup Success Rates Are So Low Because the Rewards Are So Large
AppliedDivinityStudies (kohaku-none) · 2024-10-10T20:22:01.557Z · comments (6)

Formalizing the Informal (event invite)
abramdemski · 2024-09-10T19:22:53.564Z · comments (0)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

[link] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

D&D Sci Coliseum: Arena of Data
aphyer · 2024-10-18T22:02:54.305Z · comments (23)

[link] An Interactive Shapley Value Explainer
James Stephen Brown (james-brown) · 2024-09-28T05:01:21.169Z · comments (9)

Reflections on the Metastrategies Workshop
gw · 2024-10-24T18:30:46.255Z · comments (5)

[link] Point of Failure: Semiconductor-Grade Quartz
Annapurna (jorge-velez) · 2024-09-30T15:57:40.495Z · comments (8)

[link] IAPS: Mapping Technical Safety Research at AI Companies
Zach Stein-Perlman · 2024-10-24T20:30:41.159Z · comments (12)

2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)

Live Machinery: An Interface Design Philosophy for Wholesome AI Futures
Sahil · 2024-11-01T17:24:09.957Z · comments (2)

[link] Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake
TurnTrout · 2024-11-19T18:36:20.721Z · comments (5)

Winners of the Essay competition on the Automation of Wisdom and Philosophy
AI Impacts (AI Imacts) · 2024-10-28T17:10:04.272Z · comments (3)

Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (17)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

[Linkpost] Play with SAEs on Llama 3
Tom McGrath · 2024-09-25T22:35:44.824Z · comments (2)


Recent comments

viliam on Alignment is not intelligent

LessWrong has a nice font, and the screenshots are a bit difficult to read. You could have copied the text.

(I am not really interested in debating Claude, btw.)

viliam on James Camacho's Shortform

For utilitarianism, you need to choose a utility function. This is entirely based on your preferences: what you value, and who you value get weighed and summed to create your utility function. I don't see how this differs from selfish egoism: you decide what and who you value, and take actions that maximize these values.

I see a difference in the word "summed". In practice this would probably mean things like cooperating in the Prisoner's Dilemma (maximizing the sum of utility, rather than the utility of an individual player).
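As a rough illustration of the "summed" point, here is a minimal sketch with hypothetical payoff numbers (3, 0, 5, 1 are the usual textbook values, not taken from the comment). It compares what each player picks when maximizing only their own payoff with the action pair that maximizes the sum.

```python
# Hypothetical Prisoner's Dilemma payoffs: (my payoff, opponent's payoff).
# "C" = cooperate, "D" = defect. The numbers are illustrative assumptions.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")


def best_selfish_reply(opponent_action):
    """Action that maximizes my own payoff against a fixed opponent action."""
    return max(ACTIONS, key=lambda mine: PAYOFFS[(mine, opponent_action)][0])


def best_joint_profile():
    """Action pair that maximizes the sum of both players' payoffs."""
    return max(PAYOFFS, key=lambda profile: sum(PAYOFFS[profile]))


selfish = {opp: best_selfish_reply(opp) for opp in ACTIONS}
joint = best_joint_profile()
print(selfish)  # {'C': 'D', 'D': 'D'}
print(joint)    # ('C', 'C')
```

Under these assumed payoffs, the individual maximizer defects whatever the other player does, while the sum is highest when both cooperate.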

mikhail-samin on "The Solomonoff Prior is Malign" is a special case of a simpler argument

I’d bet 1:1 that, conditional on building a CEV-aligned AGI, we won’t consider this type of problem to have been among the top-5 hardest to solve.

Reality-fluid in our universe should pretty much add up to normality, to the extent it’s Tegmark IV (and it’d be somewhat weird for your assumed amount of compute and simulations to exist but not for all computations/maths objects to exist).

If a small fraction of computers simulating this branch stop, this doesn’t make you stop. All configurations of you are computed; simulators might slightly change the relative likelihood of currently being in one branch or another, but they can’t really terminate you.

Furthermore, our physics seems very simple, and most places that compute us probably do it faithfully, on the level of the underlying physics, with no interventions.

I feel like thinking of reality-fluid as just inversely related to description length might produce wrong intuitions. In Tegmark IV, you still get more reality-fluid if someone simulates you; and it’s less intuitive why this translates into shorter description length. It might be better to think of it as: if all computation/maths exists and I open my eyes in a random place, how often would that happen here? All the places that run this world give some of their reality-fluid to this world. If a place visible from a bunch of other places starts to simulate this universe, it will be visible from slightly more places.

You can think of the entire object of everything, with all of its parts being simulated in countless other parts; or imagine a Markov process, but with worlds giving each other reality-fluid.

In that sense, the resource that we have is the reality-fluid of our future lightcone; it is our endowment, and we can use it to maximize the overall flourishing in the entire structure.

If we make decisions based on how good the overall/average use of the reality-fluid would be, you’ll gain less reality-fluid by manipulating our world the way described in the post than you’ll spend on the manipulation. It’s probably better for you to trade with us instead.

(I also feel like there might be a reasonable way to talk about causal descendants, where the probabilities are whatever abides by the math of probability theory and causality down the nodes we care about, instead of being the likelihoods of opening eyes in different branches at a particular moment of evaluation.)

ape-in-the-coat on Antropical Probabilities Are Fully Explained by Difference in Possible Outcomes

It is an observation selection effect

It's just the simple fact that the conditional probability of an event can differ from the unconditional one.

Before you toss the coin you can reason only based on priors, and therefore your credence is 1/2. But when a person hears "Hello", they've observed the event "I was selected from a large crowd", which is twice as likely when the coin is Tails. They can therefore update on this information and raise their credence in Tails to 2/3.

This is exactly as surprising as the fact that after you tossed the coin and observed that it's Heads suddenly your credence in Heads is 100%, even though before the coin toss it was merely 50%.
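The update described above can be checked with a short Bayes' rule calculation. This is a minimal sketch: the 1/100 selection chance is a hypothetical number chosen for illustration, and only the 2:1 ratio between Tails and Heads comes from the comment.

```python
from fractions import Fraction


def posterior_tails(prior_tails, likelihood_tails, likelihood_heads):
    """Bayes' rule: P(Tails | selected) from the prior and both likelihoods."""
    prior_heads = 1 - prior_tails
    evidence = prior_tails * likelihood_tails + prior_heads * likelihood_heads
    return prior_tails * likelihood_tails / evidence


# Hypothetical chance of being selected from the crowd if the coin is Heads.
selection_chance_heads = Fraction(1, 100)
# Being selected is twice as likely under Tails (the ratio stated in the comment).
selection_chance_tails = 2 * selection_chance_heads

credence = posterior_tails(Fraction(1, 2), selection_chance_tails, selection_chance_heads)
print(credence)  # 2/3
```

The result depends only on the 2:1 likelihood ratio, so any other assumed selection chance gives the same 2/3.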

mr-hire on Two flavors of computational functionalism

For me the answer is yes. There's some way of interpreting the colors of grains of sand on the beach as they swirl in the wind that would perfectly implement the Miller-Rabin primality test algorithm. So is the wind + sand computing the algorithm?

dragongod on DeepSeek beats o1-preview on math, ties on coding; will release weights

o1's reasoning trace also does this for different languages (IIRC I've seen Chinese and Japanese and other languages I don't recognise/recall), usually an entire paragraph not a word, but when I translated them it seemed to make sense in context.

steve2152 on Two flavors of computational functionalism

Do you think there are edge cases where I ask “Is such-and-such system running the Miller-Rabin primality test algorithm?”, and the answer is not a clear yes or no, but rather “Well, umm, kinda…”?

(Not rhetorical! I haven’t thought about it much.)

momom2 on Which things were you surprised to learn are not metaphors?

Top of the head like when I'm trying to frown too hard

jeremy-gillen on lemonhope's Shortform

would hopefully include many people who understand that understanding constraints is key and that past research understood some constraints.

Good point, I'm convinced by this.

build on past agent foundations research
I don't really agree with this. Why do you say this?

That's my guess at the level of engagement required to understand something. Maybe just because when I've tried to use or modify some research that I thought I understood, I always realise I didn't understand it deeply enough. I'm probably anchoring too hard on my own experience here; other people often learn faster than me.

(Also I'm confused about the discourse in this thread (which is fine), because I thought we were discussing "how / how much should grantmakers let the money flow".)

I was thinking "should grantmakers let the money flow to unknown young people who want a chance to prove themselves."

knight-lee on A better “Statement on AI Risk?”

It's true that risk alone isn't a good way to decide budgets. You're even more correct that convincing demands to spend money are something politicians learn to ignore out of necessity.

But while risk alone isn't a good way to decide budgets, you have to admit that lots of budget items have the purpose of addressing risk. For example, flood barriers address hurricane/typhoon risk. Structural upgrades address earthquake risk. Some preparations also address pandemic risk.

If you accept that some budget items are meant to address risk, shouldn't you also accept that the amount of spending should be somewhat proportional to the amount of risk? In that case, if the risk of NATO getting invaded is similar in amount to the rogue AGI risk, then the military spending to protect against invasion should be similar in amount to the spending to protect against rogue ASI.

I admit that politicians might not be rational enough to understand this, and there is a substantial probability this statement will fail. But it is still worth trying. The cost is a mere signature and the benefit may be avoiding a massive miscalculation.

Making this statement doesn't prevent others from making an even better statement. Many AI experts have signed multiple statements, e.g. the "Statement on AI Risk," and "Pause Giant AI Experiments." Some politicians and people are more convinced by one argument, while others are more convinced by another argument, so it helps to have different kinds of arguments backed by many signatories. Encouraging AI safety spending doesn't conflict with encouraging AI regulation. I think the competition between different arguments isn't actually that bad.