LessWrong 2.0 Reader

Ways to think about alignment
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2024-10-27T01:40:50.762Z · comments (0)

[question] Is there a known method to find others who came across the same potential infohazard without spoiling it to the public?
hive · 2024-10-17T10:47:05.099Z · answers+comments (6)

[link] A Logical Proof for the Emergence and Substrate Independence of Sentience
rife (edgar-muniz) · 2024-10-24T21:08:09.398Z · comments (31)

[link] The ELYSIUM Proposal - Extrapolated voLitions Yielding Separate Individualized Utopias for Mankind
Roko · 2024-10-16T01:24:51.102Z · comments (18)

Developmental Stages in Multi-Problem Grokking
James Sullivan · 2024-09-29T18:58:22.954Z · comments (0)

Effects of Non-Uniform Sparsity on Superposition in Toy Models
Shreyans Jain (shreyans-jain) · 2024-11-14T16:59:43.234Z · comments (2)

[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)

It is time to start war gaming for AGI
yanni kyriacos (yanni) · 2024-10-17T05:14:17.932Z · comments (1)

Building Safer AI from the Ground Up: Steering Model Behavior via Pre-Training Data Curation
Antonio Clarke (antonio-clarke) · 2024-09-29T18:48:23.308Z · comments (0)

[question] How do you follow AI (safety) news?
PeterH · 2024-09-24T13:58:48.916Z · answers+comments (2)

[link] Predictions as Public Works Project — What Metaculus Is Building Next
ChristianWilliams · 2024-10-22T16:35:13.999Z · comments (0)

Jailbreaking ChatGPT and Claude using Web API Context Injection
Jaehyuk Lim (jason-l) · 2024-10-21T21:34:37.579Z · comments (0)

The Case For Giving To The Shrimp Welfare Project
omnizoid · 2024-11-15T16:03:57.712Z · comments (0)

[link] Both-Sidesism—When Fair & Balanced Goes Wrong
James Stephen Brown (james-brown) · 2024-11-02T03:04:03.820Z · comments (15)

Methodology: Contagious Beliefs
James Stephen Brown (james-brown) · 2024-10-19T03:58:17.966Z · comments (0)

Transformers Explained (Again)
RohanS · 2024-10-22T04:06:33.646Z · comments (0)

[question] EndeavorOTC legit?
FinalFormal2 · 2024-10-17T01:33:12.606Z · answers+comments (0)

Bellevue Meetup
Cedar (xida-ren) · 2024-10-16T01:07:58.761Z · comments (0)

On the Practical Applications of Interpretability
Nick Jiang (nick-jiang) · 2024-10-15T17:18:25.280Z · comments (0)

Personal Philosophy
Xor · 2024-10-13T03:01:59.324Z · comments (0)

Enabling New Applications with Today's Mechanistic Interpretability Toolkit
ananya_joshi · 2024-10-25T17:53:23.960Z · comments (0)

AI Compute governance: Verifying AI chip location
Farhan · 2024-10-12T17:36:45.942Z · comments (0)

Interview with Bill O’Rourke - Russian Corruption, Putin, Applied Ethics, and More
JohnGreer · 2024-10-27T17:11:28.891Z · comments (0)

San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-10-28T05:05:36.757Z · comments (0)

Your memory eventually drives confidence in each hypothesis to 1 or 0
Crazy philosopher (commissar Yarrick) · 2024-10-28T09:00:27.084Z · comments (6)

Hamiltonian Dynamics in AI: A Novel Approach to Optimizing Reasoning in Language Models
Javier Marin Valenzuela (javier-marin-valenzuela) · 2024-10-09T19:14:56.162Z · comments (0)

Near-death experiences
Declan Molony (declan-molony) · 2024-10-08T06:34:04.107Z · comments (1)

[question] How might language influence how an AI "thinks"?
bodry (plosique) · 2024-10-30T17:41:04.460Z · answers+comments (0)

[link] AI Safety at the Frontier: Paper Highlights, October '24
gasteigerjo · 2024-10-31T00:09:33.522Z · comments (0)

(draft) Cyborg software should be open (?)
AtillaYasar (atillayasar) · 2024-11-01T07:24:51.966Z · comments (5)

[link] Higher Order Signs, Hallucination and Schizophrenia
Nicolas Villarreal (nicolas-villarreal) · 2024-11-02T16:33:10.574Z · comments (0)

San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-09-29T03:13:34.615Z · comments (0)

Distributed espionage
margetmagenta · 2024-11-04T19:43:33.316Z · comments (0)

LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (6)

Quantum Immortality: A Perspective if AI Doomers are Probably Right
avturchin · 2024-11-07T16:06:08.106Z · comments (28)

On Measuring Intellectual Performance - personal experience and several thoughts
Alexander Gufan (alexander-gufan) · 2024-09-20T17:21:19.747Z · comments (2)

Collapsing “Collapsing the Belief/Knowledge Distinction”
Jeremias (jeremias-sur) · 2024-09-20T16:11:33.558Z · comments (0)

Endogenous Growth and Human Intelligence
Nicholas D. (nicholas-d) · 2024-09-18T14:05:54.567Z · comments (0)

Theories With Mentalistic Atoms Are As Validly Called Theories As Theories With Only Non-Mentalistic Atoms
Lorec · 2024-11-12T06:45:26.039Z · comments (4)

For Limited Superintelligences, Epistemic Exclusion is Harder than Robustness to Logical Exploitation
Lorec · 2024-09-15T20:49:06.370Z · comments (9)

[question] Calibration training for 'percentile rankings'?
david reinstein (david-reinstein) · 2024-09-14T21:51:55.705Z · answers+comments (0)

[link] Podcast discussing Hanson's Cultural Drift Argument
vaishnav92 · 2024-10-20T17:58:41.416Z · comments (0)

[link] How to give effectively to US Dems
Hauke Hillebrandt (hauke-hillebrandt) · 2024-09-24T14:38:29.678Z · comments (0)

Can AI Quantity beat AI Quality?
Gianluca Calcagni (gianluca-calcagni) · 2024-10-02T15:21:45.711Z · comments (0)

Breaking beliefs about saving the world
Oxidize · 2024-11-15T00:46:03.693Z · comments (0)

Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-10-29T20:40:22.754Z · comments (0)

Advice on Communicating Concisely
EvolutionByDesign (bioluminescent-darkness) · 2024-10-20T16:45:41.053Z · comments (9)

Evaluating LLaMA 3 for political sycophancy
alma.liezenga · 2024-09-28T19:02:36.342Z · comments (2)

How to Teach Your Brain to Hate Procrastination
10xyz (10xyz-coder) · 2024-10-21T20:12:40.809Z · comments (0)

Changing the Mind of an LLM
testingthewaters · 2024-10-11T22:25:37.464Z · comments (0)

Recent comments

satron on Sabotage Evaluations for Frontier Models

If the default path is AIs taking over control from humans, then what is the current plan at leading AI labs? Surely all the work they put into AI safety is done to prevent exactly such scenarios. I would find it quite hard to believe that a large group of people would vigorously do something if they believed that their efforts would be in vain.

mondsemmel on Lao Mein's Shortform

Just as one example, OpenAI was against SB 1047, whereas Musk was for it. I'm not optimistic about regulation being enough to save us, but presumably it would be helpful, and some AI companies like OpenAI were against even the limited regulations of SB 1047. Plus SB 1047 also included stuff like whistleblower protections, and that's the kind of thing that could help policymakers make better decisions in the future.

mondsemmel on Lao Mein's Shortform

I'm sympathetic to Musk being genuinely worried about AI safety. My problem is that one of his first actions after learning about AI safety was to found OpenAI, and that hasn't worked out very well. Not just due to Altman; even the "Open" part was a highly questionable goal. Hopefully Musk's future actions in this area would have positive EV, but still.

leon-lang on johnswentworth's Shortform

What’s your opinion on the possible progress of systems like AlphaProof, o1, or Claude with computer use?

johnswentworth on johnswentworth's Shortform

I don't expect that to be particularly relevant. The data wall is still there; scaling just compute has considerably worse returns than the curves we've been on for the past few years, and we're not expecting synthetic data to be anywhere near sufficient to bring us close to the old curves.

unexpectedvalues on Seven lessons I didn't learn from election day

I don't really know, sorry. My memory is that 2023 was already pretty bad for incumbent parties (e.g. the right-wing ruling party in Poland lost power), but I'm not sure.

alexander-gietelink-oldenziel on Lao Mein's Shortform

How would removing Sam Altman significantly reduce extinction risk? Conditional on AI alignment being hard and Doom likely, the exact identity of the Shoggoth Summoner seems immaterial.

benito on Sabotage Evaluations for Frontier Models

Yes, it does imply that the default path is permanent-disempowerment or extinction.

johnswentworth on The Median Researcher Problem

unless you additionally posit an additional mechanism like fields with terrible replication rates have a higher standard deviation than fields without them

Why would that be relevant?

sherrinford on Seven lessons I didn't learn from election day

The most important fact about politics in 2024 is that across the world, it's a terrible time to be an incumbent. For the first time this year since at least World War II, the incumbent party did worse than it did in the previous election in every election in the developed world. ...

What influence does the exclusion of "years where fewer than five countries had elections" in the graph have?