LessWrong 2.0 Reader

How I select alignment research projects
Ethan Perez (ethan-perez) · 2024-04-10T04:33:08.092Z · comments (4)

A sketch of acausal trade in practice
Richard_Ngo (ricraz) · 2024-02-04T00:32:54.622Z · comments (4)

[link] OpenAI appoints Retired U.S. Army General Paul M. Nakasone to Board of Directors
Joel Burget (joel-burget) · 2024-06-13T21:28:18.110Z · comments (10)

List of strategies for mitigating deceptive alignment
joshc (joshua-clymer) · 2023-12-02T05:56:50.867Z · comments (2)

Open Thread – Winter 2023/2024
habryka (habryka4) · 2023-12-04T22:59:49.957Z · comments (160)

Humans aren't fleeb.
Charlie Steiner · 2024-01-24T05:31:46.929Z · comments (5)

Secondary Risk Markets
Vaniver · 2023-12-11T21:52:46.836Z · comments (4)

[link] My article in The Nation — California’s AI Safety Bill Is a Mask-Off Moment for the Industry
garrison · 2024-08-15T19:25:59.592Z · comments (0)

Empirical vs. Mathematical Joints of Nature
Elizabeth (pktechgirl) · 2024-06-26T01:55:22.858Z · comments (1)

[link] List of Collective Intelligence Projects
Chipmonk · 2024-07-02T14:10:41.789Z · comments (9)

Categories of leadership on technical teams
benkuhn · 2024-07-22T04:50:04.071Z · comments (0)

Economics Roundup #2
Zvi · 2024-07-02T12:40:05.908Z · comments (5)

Representation Tuning
Christopher Ackerman (christopher-ackerman) · 2024-06-27T17:44:33.338Z · comments (9)

Dangers of Closed-Loop AI
Gordon Seidoh Worley (gworley) · 2024-03-22T23:52:22.010Z · comments (7)

My Detailed Notes & Commentary from Secular Solstice
Jeffrey Heninger (jeffrey-heninger) · 2024-03-23T18:48:51.894Z · comments (16)

Index of rationalist groups in the Bay Area July 2024
Lucie Philippon (lucie-philippon) · 2024-07-26T16:32:25.337Z · comments (10)

[link] Twitter thread on politics of AI safety
Richard_Ngo (ricraz) · 2024-07-31T00:00:34.298Z · comments (2)

[link] On Fables and Nuanced Charts
Niko_McCarty (niko-2) · 2024-09-08T17:09:07.503Z · comments (2)

Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI?
RogerDearnaley (roger-d-1) · 2024-01-11T12:56:29.672Z · comments (4)

[link] Suffering Is Not Pain
jbkjr · 2024-06-18T18:04:43.407Z · comments (45)

[link] hydrogen tube transport
bhauth · 2024-04-18T22:47:08.790Z · comments (12)

[link] The last era of human mistakes
owencb · 2024-07-24T09:58:42.116Z · comments (2)

[link] My Apartment Art Commission Process
jenn (pixx) · 2024-08-26T18:36:44.363Z · comments (4)

If You Can Climb Up, You Can Climb Down
jefftk (jkaufman) · 2024-07-30T00:00:06.295Z · comments (9)

[link] The $100B plan with "70% risk of killing us all" w Stephen Fry [video]
Oleg Trott (oleg-trott) · 2024-07-21T20:06:39.615Z · comments (8)

[link] Robin Hanson & Liron Shapira Debate AI X-Risk
Liron · 2024-07-08T21:45:40.609Z · comments (4)

AI Safety Strategies Landscape
Charbel-Raphaël (charbel-raphael-segerie) · 2024-05-09T17:33:45.853Z · comments (1)

[link] Romae Industriae
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-19T13:03:31.536Z · comments (2)

[link] legged robot scaling laws
bhauth · 2024-01-20T05:45:56.632Z · comments (8)

AI Impacts Survey: December 2023 Edition
Zvi · 2024-01-05T14:40:06.156Z · comments (6)

LessWrong: After Dark, a new side of LessWrong
So8res · 2024-04-01T22:44:04.449Z · comments (5)

The Schumer Report on AI (RTFB)
Zvi · 2024-05-24T15:10:03.122Z · comments (3)

AXRP Episode 33 - RLHF Problems with Scott Emmons
DanielFilan · 2024-06-12T03:30:05.747Z · comments (0)

Computational Mechanics Hackathon (June 1 & 2)
Adam Shai (adam-shai) · 2024-05-24T22:18:44.352Z · comments (5)

AI #56: Blackwell That Ends Well
Zvi · 2024-03-21T12:10:05.412Z · comments (16)

[link] GPT2, Five Years On
Joel Burget (joel-burget) · 2024-06-05T17:44:17.552Z · comments (0)

Intransitive Trust
Screwtape · 2024-05-27T16:55:29.294Z · comments (15)

[link] Why Yudkowsky is wrong about "covalently bonded equivalents of biology"
titotal (lombertini) · 2023-12-06T14:09:15.402Z · comments (40)

Reflective consistency, randomized decisions, and the dangers of unrealistic thought experiments
Radford Neal · 2023-12-07T03:33:16.149Z · comments (25)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures
abstractapplic · 2024-05-17T00:25:42.950Z · comments (12)

Unpicking Extinction
ukc10014 · 2023-12-09T09:15:41.291Z · comments (10)

[link] AI governance needs a theory of victory
Corin Katzke (corin-katzke) · 2024-06-21T16:15:46.560Z · comments (6)

How to develop a photographic memory 1/3
PhilosophicalSoul (LiamLaw) · 2023-12-28T13:26:36.669Z · comments (6)

Monthly Roundup #12: November 2023
Zvi · 2023-11-14T15:20:06.926Z · comments (5)

Copyright Confrontation #1
Zvi · 2024-01-03T15:50:04.850Z · comments (7)

Linear encoding of character-level information in GPT-J token embeddings
mwatkins · 2023-11-10T22:19:14.654Z · comments (4)

Adam Smith Meets AI Doomers
James_Miller · 2024-01-31T15:53:03.070Z · comments (10)

What I Learned (Conclusion To "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-03-20T21:24:37.464Z · comments (0)

Difficulty classes for alignment properties
Jozdien · 2024-02-20T09:08:24.783Z · comments (5)

[link] Inferring the model dimension of API-protected LLMs
Ege Erdil (ege-erdil) · 2024-03-18T06:19:25.974Z · comments (3)

Recent comments

abandon on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

With regard to placebo, the strength of the effect has actually been debated here on Less Wrong: "Why I don't believe in the placebo effect" argues that the experimental evidence is quite weak and in some cases plausibly an artifact of poor study design.

nick_tarleton on Scissors Statements for President?

It sounds to me like the model is 'the candidate needs to have a (party-aligned) big blind spot in order to be acceptable to the extremists(/base)'. (Which is what you'd expect, if those voters are bucketing 'not-seeing A' with 'seeing B'.)

(Riffing off from that: I expect there's also something like, Motive Ambiguity-style, 'the candidate needs to have some, familiar/legible(?), big blind spot, in order to be acceptable/non-triggering to people who are used to the dialectical conflict'.)

startattheend on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

They did answer the question, there's just a little bit of deduction required? I understood it at a glance and didn't even notice any typos. Situations in which agents can learn something without understanding the reasons behind what they learn are quite common, it's not a novel idea, it just raises a red flag in people who are used to scientific thinking. The general bias in society against tradition/spirituality/religion is too strong compared to the utility (even if not correctness) of these three.

That useless extra text in my previous comment saves a future comment or two by taking things into account in advance. I even wrote the "I didn't understand the explanation" reaction above (as something one might have thought before downvoting the comment), so it's not that I didn't think of it; I just considered it an unlikely reaction, as I disagree with it.

abandon on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

The explanation is bad both in the sense of being unkind and in the sense of being unlikely. There are many explanations which are likelier, kinder, and simpler. I think you overestimate your skill at thinking of explanations, and commented for that reason. (Edit: that is, I think you should, if your likeliest explanation is of this quality, consider yourself not to know the true explanation, rather than believing the one you came up with).

kjz on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

I could imagine this turning into a flexible system of alliances similar to the conference system in NCAA college football and other sports (see here for a nice illustrated history of the many changes over time). Just as conferences and schools negotiate membership based on the changing quality of their sports programs, ability to generate revenue, and so on, states could form coalitions that could be renegotiated based on changing populations or voter preferences.

Thinking from that perspective, one potential Schelling point could be a "Northwest" coalition of WA/OR/ID/MT/WY/ND/SD/NE. This is quite well-balanced, as these states combined to give 21 EV to each candidate. And although the state populations are higher in WA/OR (12.0M) than the six red states (7.4M), the combined vote totals actually show a small lead for Trump (4.1M vs 3.9M, with more votes remaining to be counted in the blue states likely to close the gap).

After this, maybe the remaining "Southwest" states (NV, UT, CO, AZ, NM) decide to join forces? Here a state by state analysis is less useful, especially since two of them still haven't been called, but the current combined vote count is a very narrow Trump lead of 4.07M to 4.05M.

The eastern half of the country seems harder to predict - clearly there are large potential blocs of blue states in the northeast and red states in the southeast, but it's harder to see clear geographical groupings that make sense.

Unlikely any of this happens of course, but fun to think about.

brendan-long on AI #89: Trump Card

Finally, note to self, probably still don’t use SQLite if you have a good alternative? Twice is suspicious, although they did fix the bug same day and it wasn’t ever released.

But is this because SQLite is unusually buggy, or because its code is unusually open, short and readable and thus understandable by an AI? I would guess that MySQL (for example) has significantly worse vulnerabilities but they're harder to find.

startattheend on Alexander Gietelink Oldenziel's Shortform

This seems like an argument in favor of:

Stability over potential improvement, tradition over change, mutation over identical offspring, settling in a local maximum over shaking things up, and specialization vs generalization.

It seems like a hyperparameter. A bit like the learning rate in AI perhaps? Echo chambers are a common consequence, so I think the optimal ratio of preaching to the choir is something like 0.8-0.9 rather than 1. In fact, I personally prefer the /allPosts suburl over the LW frontpage because the first few votes result in a feedback loop of engagement and upvotes (forming a temporary consensus on which new posts are better, in a way which seems unfairly weighted towards the first few votes). If the posts chosen for the frontpage use the ratio of upvotes and downvotes rather than the absolute amount, then I don't think this bias will occur (conformity might still create a weak feedback loop though).

I'm simplifying some of these dynamics though.

avturchin on Quantum Immortality: A Perspective if AI Doomers are Probably Right

The problem with observables here is that there is another copy of me in another light cone, which has the same observables. So we can't say that another light cone is unobservable - I am already there and observing it. This is a paradoxical property of big world immortality: it requires actually existing but causally disconnected copies, which contradicts some definitions of actuality.

BTW, can you reply below to Vladimir Nesov, who seems to think that the first-person perspective is an illusion and only the third-person perspective is real?

avturchin on Quantum Immortality: A Perspective if AI Doomers are Probably Right

A more interesting counterargument is "distribution shift." My next observer-moments have some probability distribution P of properties - representing what I am most likely to do in the next moment. If I die, and MWI is false, but chaotic inflation is true, then there are many minds similar to me and to my next observer-moments everywhere in the multiverse. However, they have a distribution of properties P2 - representing what they are more likely to observe. And maybe P ≠ P2. Or maybe we can prove that P = P2 based on typicality.

dagon on Quantum Immortality: A Perspective if AI Doomers are Probably Right

I suspect we don't agree on what it means for something to matter. If something lies outside the causal/observable cone (add dimensions to cover MWI if you like), its difference or similarity is by definition not observable.

And the distinction between "imaginary" and "real, but fully causally disconnected" is itself imaginary.

There is no identity substance, and only experience-reachable things matter. All agency and observation is embedded, there is no viewpoint from outside.