LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Conflating value alignment and intent alignment is causing confusion
Seth Herd · 2024-09-05T16:39:51.967Z · comments (18)

Untrustworthy models: a frame for scheming evaluations
Olli Järviniemi (jarviniemi) · 2024-08-19T16:27:11.088Z · comments (3)

[link] MIRI's September 2024 newsletter
Harlan · 2024-09-16T18:15:40.785Z · comments (0)

All The Latest Human tFUS Studies
sarahconstantin · 2024-08-09T22:20:04.561Z · comments (2)

Decision Theory in Space
lsusr · 2024-08-18T07:02:11.847Z · comments (18)

Bounty for Evidence on Some of Palisade Research's Beliefs
benwr · 2024-09-23T20:01:20.917Z · comments (4)

D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset
aphyer · 2024-10-29T01:21:03.075Z · comments (10)

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith (joekc) · 2024-10-28T21:57:12.063Z · comments (5)

How to hire somebody better than yourself
lukehmiles (lcmgcd) · 2024-08-28T08:12:53.450Z · comments (5)

AI #80: Never Have I Ever
Zvi · 2024-09-10T17:50:08.074Z · comments (20)

Startup Roundup #2
Zvi · 2024-08-06T13:30:06.554Z · comments (0)

Work with me on agent foundations: independent fellowship
Alex_Altair · 2024-09-21T13:59:16.706Z · comments (5)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap
johnswentworth · 2024-09-19T22:22:05.307Z · comments (47)

[question] "Deception Genre" What Books are like Project Lawful?
Double · 2024-08-28T17:19:52.172Z · answers+comments (20)

Start an Upper-Room UV Installation Company?
jefftk (jkaufman) · 2024-10-19T02:00:10.691Z · comments (9)

Economics Roundup #3
Zvi · 2024-09-10T13:50:06.955Z · comments (9)

In defense of technological unemployment as the main AI concern
tailcalled · 2024-08-27T17:58:01.992Z · comments (36)

Occupational Licensing Roundup #1
Zvi · 2024-10-30T11:00:04.516Z · comments (7)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

Minimal Motivation of Natural Latents
johnswentworth · 2024-10-14T22:51:58.125Z · comments (14)

[link] cancer rates after gene therapy
bhauth · 2024-10-16T15:32:53.949Z · comments (0)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]
Ruby · 2024-09-19T01:35:02.999Z · comments (12)

Debate: Get a college degree?
Ben Pace (Benito) · 2024-08-12T22:23:34.744Z · comments (14)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed
johnswentworth · 2024-08-22T19:19:28.940Z · comments (4)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

Time Efficient Resistance Training
romeostevensit · 2024-10-07T15:15:44.950Z · comments (8)

[link] An Interactive Shapley Value Explainer
James Stephen Brown (james-brown) · 2024-09-28T05:01:21.169Z · comments (9)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

MATS AI Safety Strategy Curriculum v2
DanielFilan · 2024-10-07T22:44:06.396Z · comments (6)

[link] you should probably eat oatmeal sometimes
bhauth · 2024-08-25T14:50:37.570Z · comments (31)

Formalizing the Informal (event invite)
abramdemski · 2024-09-10T19:22:53.564Z · comments (0)

[link] What's important in "AI for epistemics"?
Lukas Finnveden (Lanrian) · 2024-08-24T01:27:06.771Z · comments (0)

Superintelligent AI is possible in the 2020s
HunterJay · 2024-08-13T06:03:26.990Z · comments (3)

[link] Point of Failure: Semiconductor-Grade Quartz
Annapurna (jorge-velez) · 2024-09-30T15:57:40.495Z · comments (8)

[link] [Paper] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

[link] Things I learned talking to the new breed of scientific institution
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-29T14:00:14.844Z · comments (6)

[link] Adverse Selection by Life-Saving Charities
vaishnav92 · 2024-08-14T20:46:23.662Z · comments (16)

Startup Success Rates Are So Low Because the Rewards Are So Large
AppliedDivinityStudies (kohaku-none) · 2024-10-10T20:22:01.557Z · comments (6)

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

Californians, tell your reps to vote yes on SB 1047!
Holly_Elmore · 2024-08-12T19:50:09.817Z · comments (24)

[Linkpost] Play with SAEs on Llama 3
Tom McGrath · 2024-09-25T22:35:44.824Z · comments (2)

D&D Sci Coliseum: Arena of Data
aphyer · 2024-10-18T22:02:54.305Z · comments (22)

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow (jessica-cooper) · 2024-08-03T12:07:46.302Z · comments (2)

Monthly Roundup #23: October 2024
Zvi · 2024-10-16T13:50:05.869Z · comments (12)

Reflections on the Metastrategies Workshop
gw · 2024-10-24T18:30:46.255Z · comments (5)

2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs
Kola Ayonrinde (kola-ayonrinde) · 2024-08-23T18:52:31.019Z · comments (5)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

ete on plex's Shortform

Ayahuasca: Informed consent on brain rewriting

(based on anecdotes and general models, not personal experience)

tailcalled on Three Notions of "Power"

Your link doesn't contain a detailed description of their methodology or intermediate results. I would have to do a lot of digging to make heads or tails of it.

I guess feel free to opt out of this conversation if you want but then ultimately I don't see you as having contributed any point that is relevant for me to respond to or update with.

ete on plex's Shortform

Grantmaking models and bottlenecks

ete on plex's Shortform

An opinionated tour of the AI safety funding landscape

avturchin on avturchin's Shortform

As was said above, first you need to pick a stone from the ground or pretend that you are doing this if there is no stone around. Even if you have a stone, make the gesture that you take it from the ground.

Another important point is to do it quickly and aggressively with loud cry. Also you can pull back one's arm with a stone.

The whole trick is that dogs are so afraid of stones that they will run away before you actually throw it or they see where it fails.

christiankl on Three Notions of "Power"

I consider it worthwhile to provide sources so that you can read their methodology yourself. I don't see a good reason for me to give you my interpretation of their methodology.

ete on plex's Shortform

Mesa-Hires - Making AI Safety funding stretch much further

ete on plex's Shortform

Why SWE automation means you should probably sell most of your cryptocurrency

ete on plex's Shortform

Titles of posts I'm considering writing in comments, upvote ones you'd like to see written.

anthonyc on Occupational Licensing Roundup #1

I'm not sure, IANAL, but I have wondered the same thing. Apparently there is this lawsuit in NJ: https://pltrust.org/blog/news/interstate-commerce-professional-licensing

Makes me wonder if the temporary relaxations of standards during covid made this a more defensible claim, since it meant imposing non-reciprocity was taking something away instead of not having allowed it all along. That might give a lot of plaintiffs standing to sue where they may not have had it previously?