LessWrong 2.0 Reader

MIT FutureTech are hiring for a Head of Operations role
peterslattery · 2024-10-02T17:11:42.960Z · comments (0)

A Taxonomy Of AI System Evaluations
Maxime Riché (maxime-riche) · 2024-08-19T09:07:45.224Z · comments (0)

[link] Metaculus's 'Minitaculus' Experiments — Collaborate With Us
ChristianWilliams · 2024-08-26T20:44:32.125Z · comments (0)

[question] Practical advice for secure virtual communication post easy AI voice-cloning?
hmys (the-cactus) · 2024-08-09T17:32:33.458Z · answers+comments (5)

Grounding self-reference paradoxes in reality
Fiora from Rosebloom · 2024-09-29T05:50:30.559Z · comments (3)

Thoughts on Evo-Bio Math and Mesa-Optimization: Maybe We Need To Think Harder About "Relative" Fitness?
Lorec · 2024-09-28T14:07:42.412Z · comments (6)

[link] Memorising molecular structures
dkl9 · 2024-07-12T22:40:42.307Z · comments (0)

The Existential Dread of Being a Powerful AI System
testingthewaters · 2024-09-26T10:56:32.904Z · comments (1)

Spark in the Dark Guest Spots
jefftk (jkaufman) · 2024-07-14T01:40:05.311Z · comments (0)

[link] A (paraconsistent) logic to deal with inconsistent preferences
B Jacobs (Bob Jacobs) · 2024-07-14T11:17:45.426Z · comments (2)

[Aspiration-based designs] A. Damages from misaligned optimization – two more models
Jobst Heitzig · 2024-07-15T14:08:15.716Z · comments (0)

Budapest Hungary - ACX Meetups Everywhere Fall 2024
Timothy Underwood (timothy-underwood-1) · 2024-08-29T18:37:41.313Z · comments (0)

[Research log] The board of Alphabet would stop DeepMind to save the world
Lucie Philippon (lucie-philippon) · 2024-07-16T04:59:14.874Z · comments (0)

GPT4o is still sensitive to user-induced bias when writing code
Reed (ThomasReed) · 2024-09-22T21:04:54.717Z · comments (0)

[question] Opinions on Eureka Labs
jmh · 2024-07-17T00:16:02.959Z · answers+comments (2)

Establishing a Connection (Ch 13-16)
a littoral wizard · 2024-07-17T23:56:23.069Z · comments (4)

Halifax Canada - ACX Meetups Everywhere Fall 2024
interstice · 2024-08-29T18:39:12.490Z · comments (0)

Activation Engineering Theories of Impact
kubanetics (jakub-nowak) · 2024-07-18T16:44:33.656Z · comments (1)

[link] Yet Another Critique of "Luxury Beliefs"
ymeskhout · 2024-07-18T18:37:28.703Z · comments (10)

Introduction to Modern Dating: Strategic Dating Advice for beginners
Jesper Lindholm · 2024-07-20T15:45:25.705Z · comments (5)

Inquisitive vs. adversarial rationality
gb (ghb) · 2024-09-18T13:50:09.198Z · comments (9)

[link] Demography and Destiny
Zero Contradictions · 2024-07-21T20:34:07.176Z · comments (11)

Food, Prison & Exotic Animals: Sparse Autoencoders Detect 6.5x Performing Youtube Thumbnails
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-09-17T03:52:43.269Z · comments (2)

[question] Can subjunctive dependence emerge from a simplicity prior?
Daniel C (harper-owen) · 2024-09-16T12:39:35.543Z · answers+comments (0)

Establishing a Connection (Ch 17-20)
a littoral wizard · 2024-07-23T21:56:48.122Z · comments (2)

Thirty random thoughts about AI alignment
Lysandre Terrisse · 2024-09-15T16:24:10.572Z · comments (1)

[link] SCP Foundation - Anti memetic Division Hub
landscape_kiwi · 2024-09-15T13:40:52.691Z · comments (1)

What does a Gambler's Verity world look like?
ErioirE (erioire) · 2024-07-25T22:03:56.447Z · comments (6)

Forever Leaders
Justice Howard (justice-howard) · 2024-09-14T20:55:39.095Z · comments (9)

[link] Optimising under arbitrarily many constraint equations
dkl9 · 2024-09-12T14:59:28.475Z · comments (0)

Limitations on the Interpretability of Learned Features from Sparse Dictionary Learning
Tom Angsten (tom-angsten) · 2024-07-30T16:36:06.518Z · comments (0)

[link] Could Things Be Very Different?—How Historical Inertia Might Blind Us To Optimal Solutions
James Stephen Brown (james-brown) · 2024-09-11T09:53:07.474Z · comments (0)

[link] Solutions to problems with Bayesianism
B Jacobs (Bob Jacobs) · 2024-07-31T14:18:27.910Z · comments (0)

[link] [Linkpost] Interpretable Analysis of Features Found in Open-source Sparse Autoencoder (partial replication)
Fernando Avalos (fernando-avalos) · 2024-09-09T03:33:53.548Z · comments (1)

[link] Contra Yudkowsky on 2-4-6 Game Difficulty Explanations
Josh Hickman (josh-hickman) · 2024-09-08T16:13:33.187Z · comments (1)

Does “Ultimate Neartermism” via Eternal Inflation dominate Longtermism in expectation?
Jordan Arel · 2024-08-17T22:28:21.849Z · comments (1)

[question] Can agents coordinate on randomness without outside sources?
Mikhail Samin (mikhail-samin) · 2024-07-06T13:43:44.633Z · answers+comments (16)

[question] Request for AI risk quotes, especially around speed, large impacts and black boxes
Nathan Young · 2024-08-02T17:49:48.898Z · answers+comments (0)

The Pragmatic Side of Cryptographically Boxing AI
Bart Jaworski (bart-jaworski) · 2024-08-06T17:46:21.754Z · comments (0)

Avoiding jailbreaks by discouraging their representation in activation space
Guido Bergman · 2024-09-27T17:49:20.785Z · comments (2)

Ethical Deception: Should AI Ever Lie?
Jason Reid (jason-reid) · 2024-08-02T17:53:38.744Z · comments (2)

[question] How do we know dreams aren't real?
Logan Zoellner (logan-zoellner) · 2024-08-22T12:41:57.380Z · answers+comments (31)

[link] The AI regulator’s toolbox: A list of concrete AI governance practices
Adam Jones (domdomegg) · 2024-08-10T21:15:09.265Z · comments (1)

[question] Can UBI overcome inflation and rent seeking?
Gordon Seidoh Worley (gworley) · 2024-08-01T00:13:51.693Z · answers+comments (34)

Meta: On viewing the latest LW posts
quiet_NaN · 2024-08-25T19:31:39.008Z · comments (2)

[link] In-Context Learning: An Alignment Survey
alamerton · 2024-09-30T18:44:28.589Z · comments (0)

[link] Launching the AI Forecasting Benchmark Series Q3 | $30k in Prizes
ChristianWilliams · 2024-07-08T17:20:54.717Z · comments (0)

Democracy beyond majoritarianism
Arturo Macias (arturo-macias) · 2024-09-03T15:10:56.284Z · comments (2)

Biasing LLM Response with Visual Stimuli
Jaehyuk Lim (jason-l) · 2024-10-03T18:04:31.474Z · comments (0)

[link] An "Observatory" For a Shy Super AI?
Sherrinford · 2024-09-27T21:22:40.296Z · comments (0)


Recent comments

leogao on leogao's Shortform

people love to find patterns in things. sometimes this manifests as mysticism- trying to find patterns where they don't exist, insisting that things are not coincidences when they totally just are. i think a weaker version of this kind of thinking shows up a lot in e.g. literature too- events occur not because of the bubbling randomness of reality, but rather carry symbolic significance for the plot. things don't just randomly happen without deeper meaning.

some people are much more likely to think in this way than others. rationalists are very far along the spectrum in the "things just kinda happen randomly a lot, they don't have to be meaningful" direction.

there are some obvious cognitive bias explanations for why people would see meaning/patterns in things. most notably, it's comforting to feel like we understand things. the idea of the world being deeply random and things just happening for no good reason is scary.

but i claim that there is something else going on here. I think an inclination towards finding latent meaning is actually quite applicable when thinking about people. people's actions are often driven by unconscious drives, and so tend to be quite strongly correlated with those drives. in fact, unconscious thoughts are often the true drivers, and the conscious thoughts are just the rationalization. but from the inside, it doesn't feel that way; from the inside it feels like having free will, and everything that is not a result of conscious thought is random or coincidental. this is a property that is not nearly as true of technical pursuits, so it's very reasonable to expect a different kind of reasoning to be ideal.

not only is this useful for modelling other people, but it's even more useful for modelling yourself. things only come to your attention if your unconscious brain decides to bring them to your attention. so even though something happening to you may be a coincidence, whether you focus on it or forget about it tells you a lot about what your unconscious brain is thinking. from the inside, this feels like things that should obviously be coincidence nonetheless having some meaning behind them. even the noticing of a hypothesis for the coincidence is itself a signal from your unconscious brain.

I don't quite know what the right balance is. on the one hand, it's easy to become completely untethered from reality by taking this kind of thing too seriously and becoming superstitious. on the other hand, this also seems like an important way of thinking about the world that is easy for people like me (and probably lots of people on LW) to underappreciate.

linda-linsefors on [Intuitive self-models] 3. The Homunculus

The way I understand it, the homunculus is part of self. So if you put the wanting in the homunculus, it's also inside self. I don't know about you, but my self concept has more than wanting. To be fair, the homunculus concept is also a bit richer than wanting (I think?) but less encompassing than the full self (I think?).

rom on Cryonics is free

I don't know.

If the aldehyde preservation method is as good as traditional cryopreservation, then this looks like a pretty glaring market inefficiency—someone should be able to swoop in and undercut the established cryo companies.

I just don't know enough about the object level arguments to say much with confidence, but I'm a bit skeptical such a gap in the market exists.

lelapin on Dialogue introduction to Singular Learning Theory

I don't strongly disagree but do weakly disagree on some points, so I guess I'll answer.

Re first- if you buy into automated alignment work by human level AGI, then trying to align ASI now seems less worth it. The strongest counterargument to this I see is that "human level AGI" is impossible to get with our current understanding, as it will be superhuman in some things and weirdly bad at others.

Re second- disagreements might be nitpicking on "few other approaches" vs "few currently pursued approaches". There are probably a bunch of things that would allow fundamental understanding if they panned out (various agent foundations agendas, provably safe AI agendas like davidad's), though one can argue they won't apply to deep learning or are less promising to explore than SLT.

crissman on Crissman's Shortform

Well, it was a bummer that my research on fasting found out I wasted my time doing 5:2 fasting for the last decade. Welp, I'll just research the next blog on calorie restriction. Everyone knows that's grea...

https://www.unaging.com/calorie-restriction/

Dammit. Why did I waste those five years doing calorie restriction before I started fasting?

aaa on MichaelDickens's Shortform

Out of curiosity - “it's because Dustin is very active in the democratic party and doesn't want to be affiliated with anything that is right-coded” Are the projects related to AI safety or just generally? And what are some examples?

cubefox on [Intuitive self-models] 3. The Homunculus

Steve writes:

the homunculus is definitionally the thing that carries “vitalistic force”, and that does the “wanting”, and that does any acts that we describe as “acts of free will”

Who wants things? Me of course. I want things. So the homunculus seems to be myself.

bohaska on Is Text Watermarking a lost cause?

Google Gemini uses a watermarking system called SynthID that claims to be able to watermark text by skewing its probability distribution. Do you think it’ll be effective? Do you think that it’s useful to have this?

benito on Extended Interview with Zhukeepa on Religion

Semi-related ACX post that came out today: Against The Cultural Christianity Argument.

mary-chernyshenko on Silliness

(I don't think all silliness is fun. I have been hearing lame jokes of the same kind for months and they drive me up the wall.)