LessWrong 2.0 Reader

Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
hugofry · 2024-04-29T20:57:35.127Z · comments (7)

Towards a formalization of the agent structure problem
Alex_Altair · 2024-04-29T20:28:15.190Z · comments (2)

Ironing Out the Squiggles
Zack_M_Davis · 2024-04-29T16:13:00.371Z · comments (34)

Super additivity of consciousness
Arturo Macias (arturo-macias) · 2024-04-29T15:41:54.742Z · comments (12)

AISC9 has ended and there will be an AISC10
Linda Linsefors · 2024-04-29T10:53:18.812Z · comments (2)

Open-Source AI: A Regulatory Review
Elliot_Mckernon (elliot) · 2024-04-29T10:10:55.779Z · comments (0)

Big-endian is better than little-endian
Menotim · 2024-04-29T02:30:48.053Z · comments (14)

San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-04-29T01:57:29.464Z · comments (0)

The Prop-room and Stage Cognitive Architecture
Robert Kralisch (nonmali-1) · 2024-04-29T00:48:17.473Z · comments (4)

How are Simulators and Agents related?
Robert Kralisch (nonmali-1) · 2024-04-29T00:22:30.751Z · comments (0)

Extended Embodiment
Robert Kralisch (nonmali-1) · 2024-04-29T00:18:12.892Z · comments (1)

Referential Containment
Robert Kralisch (nonmali-1) · 2024-04-29T00:16:00.174Z · comments (4)

Disentangling Competence and Intelligence
Robert Kralisch (nonmali-1) · 2024-04-29T00:12:50.779Z · comments (7)

Unintentionally Creating Value
abstractapplic · 2024-04-28T20:05:08.479Z · comments (3)

An Unintentional Compliment
abstractapplic · 2024-04-28T20:04:56.522Z · comments (2)

List your AI X-Risk cruxes!
Aryeh Englander (alenglander) · 2024-04-28T18:26:19.327Z · comments (7)

[link] Things I tell myself to be more agentic
DMMF · 2024-04-28T17:44:39.789Z · comments (0)

Estimating the Number of Players from Game Result Percentages
Daniel L (daniel-lyakovetsky) · 2024-04-28T17:42:03.247Z · comments (2)

[link] The Science Algorithm - AISC 2024 Final Presentation
Johannes C. Mayer (johannes-c-mayer) · 2024-04-28T14:55:50.504Z · comments (0)

[Aspiration-based designs] Outlook: dealing with complexity
Jobst Heitzig · 2024-04-28T13:06:35.841Z · comments (3)

[Aspiration-based designs] 3. Performance and safety criteria, and aspiration intervals
Jobst Heitzig · 2024-04-28T13:04:56.249Z · comments (0)

[Aspiration-based designs] 2. Formal framework, basic algorithm
Jobst Heitzig · 2024-04-28T13:02:17.253Z · comments (2)

[Aspiration-based designs] 1. Informal introduction
B Jacobs (Bob Jacobs) · 2024-04-28T13:00:43.268Z · comments (4)

Playing Northboro with Lily and Rick
jefftk (jkaufman) · 2024-04-28T02:40:03.436Z · comments (1)

[link] Release of UN's draft related to the governance of AI (a summary of the Simon Institute's response)
Sebastian Schmidt · 2024-04-27T18:34:39.836Z · comments (0)

Mercy to the Machine: Thoughts & Rights
False Name (False Name, Esq.) · 2024-04-27T16:36:06.006Z · comments (5)

Constructability: Plainly-coded AGIs may be feasible in the near future
Épiphanie Gédéon (joy_void_joy) · 2024-04-27T16:04:45.894Z · comments (12)

So What's Up With PUFAs Chemically?
J Bostock (Jemist) · 2024-04-27T13:32:52.159Z · comments (23)

[link] Link: Let's Think Dot by Dot: Hidden Computation in Transformer Language Models by Jacob Pfau, William Merrill & Samuel R. Bowman
Chris_Leong · 2024-04-27T13:22:53.287Z · comments (0)

[link] Two Vernor Vinge Book Reviews
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-27T12:14:53.917Z · comments (0)

Refusal in LLMs is mediated by a single direction
Andy Arditi (andy-arditi) · 2024-04-27T11:13:06.235Z · comments (76)

[question] Plausibility of Getting Early Warning Shots because AIs can't coordinate?
hmys (the-cactus) · 2024-04-27T08:02:10.792Z · answers+comments (0)

AI Safety Sphere
Myles H (zarsou9) · 2024-04-27T01:49:02.369Z · comments (2)

Exploring the Esoteric Pathways to AI Sentience (Part One)
jeffreycaruso · 2024-04-27T01:02:18.429Z · comments (7)

Superposition is not "just" neuron polysemanticity
LawrenceC (LawChan) · 2024-04-26T23:22:06.066Z · comments (4)

D&D.Sci Long War: Defender of Data-mocracy
aphyer · 2024-04-26T22:30:15.780Z · comments (20)

On Not Pulling The Ladder Up Behind You
Screwtape · 2024-04-26T21:58:29.455Z · comments (14)

We are headed into an extreme compute overhang
devrandom · 2024-04-26T21:38:21.694Z · comments (23)

[Concept Dependency] Edge Regular Lattice Graph
Johannes C. Mayer (johannes-c-mayer) · 2024-04-26T21:14:18.960Z · comments (1)

[Concept Dependency] Concept Dependency Posts
Johannes C. Mayer (johannes-c-mayer) · 2024-04-26T20:57:18.815Z · comments (3)

[question] Wouldn't weak AI agents provide warning?
Mandatory Topic · 2024-04-26T19:34:17.424Z · answers+comments (0)

Duct Tape security
Isaac King (KingSupernova) · 2024-04-26T18:57:05.659Z · comments (9)

Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?
Gordon Seidoh Worley (gworley) · 2024-04-26T18:10:26.517Z · comments (2)

Scaling of AI training runs will slow down after GPT-5
Maxime Riché (maxime-riche) · 2024-04-26T16:05:59.957Z · comments (5)

Spatial attention as a “tell” for empathetic simulation?
Steven Byrnes (steve2152) · 2024-04-26T15:10:58.040Z · comments (11)

Arch-anarchy
Peter lawless · 2024-04-26T15:05:14.984Z · comments (1)

Breadboarding a Whistle Synth
jefftk (jkaufman) · 2024-04-26T15:00:03.352Z · comments (2)

An Introduction to AI Sandbagging
Teun van der Weij (teun-van-der-weij) · 2024-04-26T13:40:00.126Z · comments (1)

[link] LLMs seem (relatively) safe
JustisMills · 2024-04-25T22:13:06.221Z · comments (24)

Losing Faith In Contrarianism
omnizoid · 2024-04-25T20:53:34.842Z · comments (42)

Recent comments

nevin-wetherill on Open Thread Spring 2024

Hey, I'm new to LessWrong and working on a post - however at some point the guidelines which pop up at the top of a fresh account's "new post" screen went away, and I cannot find the same language in the New Users Guide or elsewhere on the site.

Does anyone have a link to this? I recall a list of suggestions like "make the post object-level," "treat it as a submission for a university," "do not write a poetic/literary post until you've already gotten a couple object-level posts on your record."

It seems like a minor oversight if it's impossible to find certain moderation guidelines/tips and tricks if you've already saved a draft/posted a comment.

I am not terribly worried about running headfirst into a moderation filter, as I can barely manage to write a comment which isn't as high effort of an explanation as I can come up with - but I do want that specific piece of text for reference, and now it appears to have evaporated into the shadow realm.

Am I just missing a link that would appear if I searched something else?

(Edit: also, sorry if this is the wrong place for this, I would've tried the "intercom" feature, but I am currently on the mobile version of the site, and that feature appears to be entirely missing there - and yes, I checked my settings to make sure it wasn't "hidden")

fowlertm on fowlertm's Shortform

We recently released an interview with independent scholar John Wentworth.

It mostly centers around two themes: "abstraction" (forming concepts) and "agency" (dealing with goal-directed systems).

Check it out!

habryka4 on Bogdan Ionut Cirstea's Shortform

At least Eliezer has been extremely clear that he is in favor of a stop not a pause (indeed, that was like the headline of his article "Pausing AI Developments Isn't Enough. We Need to Shut it All Down"), so I am confused why you list him with anything related to "pause".

My guess is that Eliezer and I are both in favor of a pause, but mostly because a pause seems like it would slow down AGI progress, not because the next 6 months in particular will be the most risky period.

raemon on Deep Honesty

So there's "being honest" and "trying to convince people of things you think are true", and I think those are at least somewhat different projects. I feel like the first is more obviously good than the second.

I would first ask "what's my goal" (and double-check why it's your goal and whether you're being honest with yourself). Like, "I want to be able to say my true thoughts out loud and have an honest open relationship with my relatives" is different from "I don't want my relatives to believe false things" (the win-condition for the former is about you, the latter is about them). The latter is subtly different from "I want to have presented my best case to them, that they'll actually listen to, but then let them make up their own mind."

I'd also note there are additional soft skills you can gain like:

feeling safe/nonjudgmental to talk to
making it feel safe for people to give up ideology (via living-through-example as someone who is happy without being religious)
helping people grieve/orient

rotatingpaguro on Dating Roundup #3: Third Time’s the Charm

Thinking about it, I suspect I was not getting what "authenticity and openness" means. Like, it's not "being yourself and letting go", and more "being honest", I guess? Could you give me >= 2 examples of a person being "authentic and open"?

raemon on some thoughts on LessOnline

Young people (metaphorically or literally) are welcome!

seth-herd on jacquesthibs's Shortform

I think future more powerful/useful AIs will understand our intentions better IF they are trained to predict language. Text corpuses contain rich semantics about human intentions.

I can imagine other AI systems that are trained differently, and I would be more worried about those.

That's what I meant by current AI understanding our intentions possibly better than future AI.

richard_kennaway on Introducing AI Lab Watch

"AI Watch."

raemon on Raemon's Shortform

Are the disagree reacts with ‘small icons are good for this reason (enough to override other concerns)’ or ‘I didn’t update previously?’

d0themath on We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming"

I will also suggest the questions: 1) What are the things I’m really confident in? 2) What are the things that the people I often read or talk to are really confident in? 3) Are there simple arguments, which just involve bringing in little-thought-about domains of effect, that throw that confidence into question?