LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Turning Your Back On Traffic
jefftk (jkaufman) · 2024-07-17T01:00:08.627Z · comments (7)

My best guess at the important tricks for training 1L SAEs
Arthur Conmy (arthur-conmy) · 2023-12-21T01:59:06.208Z · comments (4)

[link] Dark Skies Book Review
PeterMcCluskey · 2023-12-29T18:28:59.352Z · comments (3)

Medical Roundup #2
Zvi · 2024-04-09T13:40:05.908Z · comments (18)

Thousands of malicious actors on the future of AI misuse
Zershaaneh Qureshi (zershaaneh-qureshi) · 2024-04-01T10:08:42.357Z · comments (0)

Interview with Vanessa Kosoy on the Value of Theoretical Research for AI
WillPetillo · 2023-12-04T22:58:40.005Z · comments (0)

Otherness and control in the age of AGI
Joe Carlsmith (joekc) · 2024-01-02T18:15:54.168Z · comments (0)

[link] Increasing IQ is trivial
George3d6 · 2024-03-01T22:43:32.037Z · comments (60)

AI #66: Oh to Be Less Online
Zvi · 2024-05-30T14:20:03.334Z · comments (6)

On DeepMind’s Frontier Safety Framework
Zvi · 2024-06-18T13:30:21.154Z · comments (4)

A New Class of Glitch Tokens - BPE Subtoken Artifacts (BSA)
Lao Mein (derpherpize) · 2024-09-20T13:13:26.181Z · comments (7)

[link] Turning 22 in the Pre-Apocalypse
testingthewaters · 2024-08-22T20:28:25.794Z · comments (14)

[link] I didn't have to avoid you; I was just insecure
Chipmonk · 2024-08-17T16:41:50.237Z · comments (7)

Distinguish worst-case analysis from instrumental training-gaming
Olli Järviniemi (jarviniemi) · 2024-09-05T19:13:34.443Z · comments (0)

The murderous shortcut: a toy model of instrumental convergence
Thomas Kwa (thomas-kwa) · 2024-10-02T06:48:06.787Z · comments (0)

[link] A High Decoupling Failure
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-14T19:46:09.552Z · comments (5)

Your LLM Judge may be biased
Henry Papadatos (henry) · 2024-03-29T16:39:22.534Z · comments (9)

[question] Is there software to practice reading expressions?
lsusr · 2024-04-23T21:53:00.679Z · answers+comments (10)

[link] Twitter thread on AI takeover scenarios
Richard_Ngo (ricraz) · 2024-07-31T00:24:33.866Z · comments (0)

[link] WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals
trevor (TrevorWiesinger) · 2024-04-23T21:33:08.049Z · comments (5)

[link] [Fiction] A Confession
Arjun Panickssery (arjun-panickssery) · 2024-04-18T16:28:48.194Z · comments (2)

UDT1.01: The Story So Far (1/10)
Diffractor · 2024-03-27T23:22:35.170Z · comments (6)

AI #49: Bioweapon Testing Begins
Zvi · 2024-02-01T15:30:04.690Z · comments (11)

Free Will and Dodging Anvils: AIXI Off-Policy
Cole Wyeth (Amyr) · 2024-08-29T22:42:24.485Z · comments (12)

Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition
cmathw · 2024-04-08T11:14:43.268Z · comments (4)

Eye contact is effortless when you’re no longer emotionally blocked on it
Chipmonk · 2024-09-27T21:47:01.970Z · comments (24)

[link] Toki pona FAQ
dkl9 · 2024-03-17T21:44:21.782Z · comments (8)

An anti-inductive sequence
Viliam · 2024-08-14T12:28:54.226Z · comments (10)

[link] Claude 3 Opus can operate as a Turing machine
Gunnar_Zarncke · 2024-04-17T08:41:57.209Z · comments (2)

(Appetitive, Consummatory) ≈ (RL, reflex)
Steven Byrnes (steve2152) · 2024-06-15T15:57:39.533Z · comments (1)

But Where do the Variables of my Causal Model come from?
Dalcy (Darcy) · 2024-08-09T22:07:57.395Z · comments (1)

My disagreements with "AGI ruin: A List of Lethalities"
Noosphere89 (sharmake-farah) · 2024-09-15T17:22:18.367Z · comments (46)

Debate: Is it ethical to work at AI capabilities companies?
Ben Pace (Benito) · 2024-08-14T00:18:38.846Z · comments (21)

Good job opportunities for helping with the most important century
HoldenKarnofsky · 2024-01-18T17:30:03.332Z · comments (0)

We’re not as 3-Dimensional as We Think
silentbob · 2024-08-04T14:39:16.799Z · comments (16)

[link] Shifting Headspaces - Transitional Beast-Mode
Jonathan Moregård (JonathanMoregard) · 2024-08-12T13:02:06.120Z · comments (9)

Drone Wars Endgame
RussellThor · 2024-02-01T02:30:46.161Z · comments (71)

The Evolution of Humans Was Net-Negative for Human Values
Zack_M_Davis · 2024-04-01T16:01:10.037Z · comments (1)

On Dwarkesh’s 3rd Podcast With Tyler Cowen
Zvi · 2024-02-02T19:30:05.974Z · comments (9)

[link] "Model UN Solutions"
Arjun Panickssery (arjun-panickssery) · 2023-12-08T23:06:33.490Z · comments (5)

Finding the Wisdom to Build Safe AI
Gordon Seidoh Worley (gworley) · 2024-07-04T19:04:16.089Z · comments (10)

AI companies' commitments
Zach Stein-Perlman · 2024-05-29T11:00:31.339Z · comments (0)

Deeply Cover Car Crashes?
jefftk (jkaufman) · 2023-12-10T22:20:01.133Z · comments (31)

Introduce a Speed Maximum
jefftk (jkaufman) · 2024-01-11T02:50:04.284Z · comments (28)

AI Safety Camp final presentations
Linda Linsefors · 2024-03-29T14:27:43.503Z · comments (3)

List of strategies for mitigating deceptive alignment
joshc (joshua-clymer) · 2023-12-02T05:56:50.867Z · comments (2)

[link] Scaling laws for dominant assurance contracts
jessicata (jessica.liu.taylor) · 2023-11-28T23:11:07.631Z · comments (5)

A Socratic dialogue with my student
lsusr · 2023-12-05T09:31:05.266Z · comments (14)

Childhood and Education Roundup #5
Zvi · 2024-04-17T13:00:03.015Z · comments (4)

Please Bet On My Quantified Self Decision Markets
niplav · 2023-12-01T20:07:38.284Z · comments (6)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

simon on Magic by forgetting

This actually sounds about right. What's paradoxical here?

Not that it's necessarily inconsistent, but in my view it does seem to be pointing out an important problem with the assumptions (hence indeed a paradox if you accept those false assumptions):

(ignore this part, it is just a rehash of the path dependence paradigm. It is here to show that I am not complaining about the math, but about its relation to reality):

Imagine you are going to be split (once). It is factually the case that there are going to be two people with memories, etc. consistent with having been you. Without any important differences to distinguish them, and if you insist on coming up with some probability number for "waking up" as one particular one of them obviously it has to be ½.

And then, if one of those copies subsequently splits, if you insist on assigning a probability number for those further copies, then from the perspective of that parent copy, the further copies also have to be ½ each.

And then if you take these probability numbers seriously and insist on them all being consistent then obviously from the perspective of the original the probability numbers for the final numbers have to be ½ and ¼ and ¼. As you say "this actually sounds about right".

What's paradoxical here is that in the scenario provided we have the following facts:

you have 3 identical copies all formed from the original
all 3 copies have an equal footing going forward

and yet, the path-based identity paradigm is trying to assign different weights to these copies, based on some technical details of what happened to create them. The intuition that this is absurd is pointing at the fact that these technical details aren't what most people probably would care about, except if they insist on treating these probability numbers as real things and trying to make them follow consistent rules.

Ultimately "these three copies will each experience being a continuation of me" is an actual fact about the world, but statements like "'I' will experience being copy A (as opposed to B or C)" are not pointing to an actual fact about the world. Thus assigning a probability number to such a statement is a mental convenience that should not be taken seriously. The moment such numbers stop being convenient, like assigning different weights to copies you are actually indifferent between, they should be discarded. (and optionally you could make up new numbers that match what you actually care about instrumentally. Or just not think of it in those terms).

johnswentworth on You are not too "irrational" to know your preferences.

This post would be a LOT better with about half-a-dozen representative real examples you've run into.

dagon on You are not too "irrational" to know your preferences.

I'd enjoy some acknowledgement that there IS an interplay between cognitive beliefs (based on intelligent modeling of the universe and other people) and intuitive experienced emotions. "not a monocausal result of how smart or stupid they are" does not imply total lack of correlation or impact. Nor does it imply that cognitive ability to choose a framing or model is not effective in changing one's aliefs and preferences.

I'm fully onboard with countering the bullying and soldier-mindset debate techniques that smart people use against less-smart (or equally-smart but differently-educated) people. I don't buy that everyone is entitled to express and follow any preferences, including anti-social or harmful-to-others beliefs. Some things are just wrong in modern social contexts.

notfnofn on Two flavors of computational functionalism

I recently came across unsupervised machine translation here. It's not directly applicable, but it opens the possibility that, given enough information about "something", you can pin down what it's encoding in your own language.

So let's say now that we have a computer that simulates a human brain in a manner that we understand. Perhaps there really could be a sense in which it simulates a human brain that is independent of our interpretation of it. I'm having some trouble formulating this precisely.

chipmonk on Locally optimal psychology

Great question, thanks!

I think you're correct in pointing towards the existence of basically-all-downside genetic conditions, but I still think these are in the minority. Moreover, even most of those don't create a big issue on the object level— compared to how people might feel about the issue as a result.

This argument doesn't extend to conditions like Huntington's, but if a person is missing a pinky finger, most of the issues the person is going to face are related to social factors and their own emotions, not the physical aspect.

I also just say this from experience helping others.

ben-lang on Lao Mein's Shortform

That Iran thing is weird.

If I were guessing I might say that maybe this is happening:

Right now the more trade China has with Iran the more America might make a fuss. Either complaining politically, putting tariffs, or calling on general favours and good will for it to stop. But if America starts making a fuss anyway, or burns all its good will, then their is suddenly no downside to trading with Iran. Now substitute "China" for any and all countries (for example the UK, France and Germany, who all stayed in the Iran Nuclear Deal even after the USA pulled out).

nathan-helm-burger on Voyas32's Shortform

As a somewhat neurodivergent person who can't stand twitter-style interactions because of the difficulty of pulling gems from out of the noise.... I would love it if someone built a bot that harvested the "best of rationality and science adjacent twitter/bluesky/mastodon" and put it into a simple website. Maybe using LessWrong's codebase?

I feel like I'm missing out by being unable to comfortably read Twitter.

Some thoughts on this:

it could have a filter such that multiple semantically equivalent comments would always get merged into a single comment with an author-count
rude or non-sequitor comments would get automatically deleted
comments which have no info, just "applause" or "disapproval" would get turned into agreement/disagreement votes

All of this filtering seems like it could be adequately handled by a small-cheap LLM like Haiku or Gemini Flash in combination with some simple scripts.

nathan-helm-burger on yams's Shortform

I think I get what you're saying... That the argument you dislike is, "we should rush to AGI sooner, so that there's less compute overhang when we get there."

I agree that that argument is a pretty bad one. I personally think that we are already so far into a compute overhang regime that that ship has sailed. We are using very inefficient learning algorithms, and will be able to run millions of inference instances of any model we produce.

Does this correspond with what you are thinking?

lgs on "The Solomonoff Prior is Malign" is a special case of a simpler argument

The problem with this argument is that the oracle sucks.

The humans believe they have access to an oracle that correctly predicts what happens in the real world. However, they have access to a defective oracle which only performs well in simulated worlds, but performs terribly in the "real" universe (more generally, universes in which humans are real). This is a pretty big problem with the oracle!

Yes, I agree that an oracle which is incentivized to make correct predictions within its own vantage point (including possible simulated worlds, not restricted to the real world) is malign. I don't really agree the Solomonoff prior has this incentive. I also don't think this is too relevant to any superintelligence we might encounter in the real world, since it is unlikely that it will have this specific incentive (this is for a variety of reasons, including "the oracle will probably care about the real world more" and, more importantly, "the oracle has no incentive to say its true beliefs anyway").

zachary-1 on Ultralearning in 80 days

If you're trying to make the most of your abundant free time then learning Quenya is a mistake. Learning a language nobody speaks seriously is at best a way to signal status to a very specific group of people and at worst a party trick

Some ideas of ways to spend that time that would pay higher dividends over the course of the rest of your life:

learn a musical instrument
become skilled at a competitive game (mtg, poker, rocket league, chess, etc)
practice yoga
- personally I've found practicing yoga to be the best thing I can do for making me feel good in my body on a daily basis, which I see as a prerequisite for being effective at pursuing other goals
go running
invest in a relationship (call a friend/family member on the phone whom you otherwise wouldn't)

Read at least 20+ books on rationality, decision-making, heuristics and biases, etc.

Here are some of my favorites, in order of the impact they made on my life:

Language in Thought and Action
Yudkowsky's sequences
Class by Paul Fussel (taken with a large grain of salt. not really rationalist but caused me to update significantly)
Influence: science and practice