LessWrong 2.0 Reader


← previous page (newer posts) · next page (older posts) →

Losing Faith In Contrarianism
omnizoid · 2024-04-25T20:53:34.842Z · comments (44)
Natural abstractions are observer-dependent: a conversation with John Wentworth
Martín Soto (martinsq) · 2024-02-12T17:28:38.889Z · comments (13)
[link] [Linkpost] George Mack's Razors
trevor (TrevorWiesinger) · 2023-11-27T17:53:45.065Z · comments (8)
Australian AI Safety Forum 2024
Liam Carroll (liam-carroll) · 2024-09-27T00:40:11.451Z · comments (0)
[link] Generative ML in chemistry is bottlenecked by synthesis
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-16T16:31:34.801Z · comments (2)
AI #70: A Beautiful Sonnet
Zvi · 2024-06-27T14:40:08.087Z · comments (0)
[link] The consistent guessing problem is easier than the halting problem
jessicata (jessica.liu.taylor) · 2024-05-20T04:02:03.865Z · comments (5)
[link] AISafety.info: What is the "natural abstractions hypothesis"?
Algon · 2024-10-05T12:31:14.195Z · comments (2)
LLMs as a Planning Overhang
Larks · 2024-07-14T02:54:14.295Z · comments (8)
D&D.Sci: Whom Shall You Call?
abstractapplic · 2024-07-05T20:53:37.010Z · comments (6)
[question] What progress have we made on automated auditing?
LawrenceC (LawChan) · 2024-07-06T01:49:43.714Z · answers+comments (1)
Dialogue on What It Means For Something to Have A Function/Purpose
johnswentworth · 2024-07-15T16:28:56.609Z · comments (5)
Is This Lie Detector Really Just a Lie Detector? An Investigation of LLM Probe Specificity.
Josh Levy (josh-levy) · 2024-06-04T15:45:54.399Z · comments (0)
[link] An AI Manhattan Project is Not Inevitable
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-06T16:42:35.920Z · comments (25)
International Scientific Report on the Safety of Advanced AI: Key Information
Aryeh Englander (alenglander) · 2024-05-18T01:45:10.194Z · comments (0)
Free Will and Dodging Anvils: AIXI Off-Policy
Cole Wyeth (Amyr) · 2024-08-29T22:42:24.485Z · comments (12)
On DeepMind’s Frontier Safety Framework
Zvi · 2024-06-18T13:30:21.154Z · comments (4)
A New Class of Glitch Tokens - BPE Subtoken Artifacts (BSA)
Lao Mein (derpherpize) · 2024-09-20T13:13:26.181Z · comments (7)
Turning Your Back On Traffic
jefftk (jkaufman) · 2024-07-17T01:00:08.627Z · comments (7)
[link] A Percentage Model of a Person
Sable · 2024-10-12T17:55:07.560Z · comments (3)
AI #66: Oh to Be Less Online
Zvi · 2024-05-30T14:20:03.334Z · comments (6)
Games for AI Control
charlie_griffin (cjgriffin) · 2024-07-11T18:40:50.607Z · comments (0)
COT Scaling implies slower takeoff speeds
Logan Zoellner (logan-zoellner) · 2024-09-28T16:20:00.320Z · comments (56)
[link] Characterizing stable regions in the residual stream of LLMs
Jett Janiak (jett) · 2024-09-26T13:44:58.792Z · comments (4)
Otherness and control in the age of AGI
Joe Carlsmith (joekc) · 2024-01-02T18:15:54.168Z · comments (0)
UDT1.01: The Story So Far (1/10)
Diffractor · 2024-03-27T23:22:35.170Z · comments (6)
Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition
cmathw · 2024-04-08T11:14:43.268Z · comments (4)
[link] A High Decoupling Failure
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-14T19:46:09.552Z · comments (5)
Review Report of Davidson on Takeoff Speeds (2023)
Trent Kannegieter · 2023-12-22T18:48:55.983Z · comments (11)
AI #49: Bioweapon Testing Begins
Zvi · 2024-02-01T15:30:04.690Z · comments (11)
[question] Is a random box of gas predictable after 20 seconds?
Thomas Kwa (thomas-kwa) · 2024-01-24T23:00:53.184Z · answers+comments (35)
Striking Implications for Learning Theory, Interpretability — and Safety?
RogerDearnaley (roger-d-1) · 2024-01-05T08:46:58.915Z · comments (4)
What is wisdom?
TsviBT · 2023-11-14T02:13:49.681Z · comments (3)
Possible OpenAI's Q* breakthrough and DeepMind's AlphaGo-type systems plus LLMs
Burny · 2023-11-23T03:16:09.358Z · comments (25)
Interview with Vanessa Kosoy on the Value of Theoretical Research for AI
WillPetillo · 2023-12-04T22:58:40.005Z · comments (0)
Deconfusing In-Context Learning
Arjun Panickssery (arjun-panickssery) · 2024-02-25T09:48:17.690Z · comments (1)
Principles For Product Liability (With Application To AI)
johnswentworth · 2023-12-10T21:27:41.403Z · comments (55)
I’m confused about innate smell neuroanatomy
Steven Byrnes (steve2152) · 2023-11-28T20:49:13.042Z · comments (1)
[link] The Hippie Rabbit Hole - Nuggets of Gold in Rivers of Bullshit
Jonathan Moregård (JonathanMoregard) · 2024-01-05T18:27:01.769Z · comments (20)
Distinguish worst-case analysis from instrumental training-gaming
Olli Järviniemi (jarviniemi) · 2024-09-05T19:13:34.443Z · comments (0)
[link] Twitter thread on AI takeover scenarios
Richard_Ngo (ricraz) · 2024-07-31T00:24:33.866Z · comments (0)
[link] Dark Skies Book Review
PeterMcCluskey · 2023-12-29T18:28:59.352Z · comments (3)
Glitch Token Catalog - (Almost) a Full Clear
Lao Mein (derpherpize) · 2024-09-21T12:22:16.403Z · comments (3)
[link] WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals
trevor (TrevorWiesinger) · 2024-04-23T21:33:08.049Z · comments (5)
Your LLM Judge may be biased
Henry Papadatos (henry) · 2024-03-29T16:39:22.534Z · comments (9)
[link] [Fiction] A Confession
Arjun Panickssery (arjun-panickssery) · 2024-04-18T16:28:48.194Z · comments (2)
[question] Is there software to practice reading expressions?
lsusr · 2024-04-23T21:53:00.679Z · answers+comments (10)
Thousands of malicious actors on the future of AI misuse
Zershaaneh Qureshi (zershaaneh-qureshi) · 2024-04-01T10:08:42.357Z · comments (0)
Enhancing intelligence by banging your head on the wall
Bezzi · 2023-12-12T21:00:48.584Z · comments (26)
Medical Roundup #2
Zvi · 2024-04-09T13:40:05.908Z · comments (18)