LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

On ‘Responsible Scaling Policies’ (RSPs)
Zvi · 2023-12-05T16:10:06.310Z · comments (3)

[link] Every Mention of EA in "Going Infinite"
KirstenH · 2023-10-07T14:42:32.217Z · comments (0)

“Why can’t you just turn it off?”
Roko · 2023-11-19T14:46:18.427Z · comments (25)

[link] JumpReLU SAEs + Early Access to Gemma 2 SAEs
Senthooran Rajamanoharan (SenR) · 2024-07-19T16:10:54.664Z · comments (10)

The Handbook of Rationality (2021, MIT press) is now open access
romeostevensit · 2023-10-10T00:30:05.589Z · comments (4)

Translations Should Invert
abramdemski · 2023-10-05T17:44:23.262Z · comments (19)

Competitive, Cooperative, and Cohabitive
Screwtape · 2023-09-28T23:25:52.723Z · comments (12)

[link] Spaced repetition for teaching two-year olds how to read (Interview)
Chipmonk · 2023-11-26T16:52:58.412Z · comments (9)

AISC 2024 - Project Summaries
NickyP (Nicky) · 2023-11-27T22:32:23.555Z · comments (3)

D&D.Sci(-fi): Colonizing the SuperHyperSphere
abstractapplic · 2024-01-12T23:36:54.248Z · comments (23)

Safety First: safety before full alignment. The deontic sufficiency hypothesis.
Chipmonk · 2024-01-03T17:55:19.825Z · comments (3)

D&D.Sci: The Mad Tyrant's Pet Turtles [Evaluation and Ruleset]
abstractapplic · 2024-04-09T14:01:34.426Z · comments (6)

Interested in Cognitive Bootcamp?
Raemon · 2024-09-19T22:12:13.348Z · comments (0)

AI and the Technological Richter Scale
Zvi · 2024-09-04T14:00:08.625Z · comments (8)

On the lethality of biased human reward ratings
Eli Tyre (elityre) · 2023-11-17T18:59:02.303Z · comments (10)

Making Bad Decisions On Purpose
Screwtape · 2023-11-09T03:36:59.611Z · comments (8)

An alternative approach to superbabies
Towards_Keeperhood (Simon Skade) · 2024-11-05T22:56:15.740Z · comments (19)

The Mom Test: Summary and Thoughts
Adam Zerner (adamzerner) · 2024-04-18T03:34:21.020Z · comments (3)

[link] Active Recall and Spaced Repetition are Different Things
Saul Munn (saul-munn) · 2024-11-08T20:14:56.092Z · comments (2)

[link] Urging an International AI Treaty: An Open Letter
Olli Järviniemi (jarviniemi) · 2023-10-31T11:26:25.864Z · comments (2)

[link] Web-surfing tips for strange times
eukaryote · 2024-05-31T07:10:25.805Z · comments (19)

Mechanistic Interpretability Workshop Happening at ICML 2024!
Neel Nanda (neel-nanda-1) · 2024-05-03T01:18:26.936Z · comments (6)

[question] If I wanted to spend WAY more on AI, what would I spend it on?
Logan Zoellner (logan-zoellner) · 2024-09-15T21:24:46.742Z · answers+comments (16)

Philosophers wrestling with evil, as a social media feed
David Gross (David_Gross) · 2024-06-03T22:25:22.507Z · comments (2)

Evaluating the truth of statements in a world of ambiguous language.
Hastings (hastings-greer) · 2024-10-07T18:08:09.920Z · comments (19)

[link] Book review: Xenosystems
jessicata (jessica.liu.taylor) · 2024-09-16T20:17:56.670Z · comments (18)

Exercise: Planmaking, Surprise Anticipation, and "Baba is You"
Raemon · 2024-02-24T20:33:49.574Z · comments (19)

How to do conceptual research: Case study interview with Caspar Oesterheld
Chi Nguyen · 2024-05-14T15:09:30.390Z · comments (5)

Highlights from Lex Fridman’s interview of Yann LeCun
Joel Burget (joel-burget) · 2024-03-13T20:58:13.052Z · comments (15)

[link] Designing for a single purpose
Itay Dreyfus (itay-dreyfus) · 2024-05-07T14:11:22.242Z · comments (12)

How to safely use an optimizer
Simon Fischer (SimonF) · 2024-03-28T16:11:01.277Z · comments (21)

Extended Interview with Zhukeepa on Religion
Ben Pace (Benito) · 2024-08-18T03:19:05.625Z · comments (59)

Environmental allergies are curable? (Sublingual immunotherapy)
Chipmonk · 2023-12-26T19:05:08.880Z · comments (10)

[link] on neodymium magnets
bhauth · 2024-01-30T15:58:24.088Z · comments (6)

2023 Prediction Evaluations
Zvi · 2024-01-08T14:40:07.377Z · comments (0)

How do we know that "good research" is good? (aka "direct evaluation" vs "eigen-evaluation")
Ruby · 2024-07-19T00:31:38.332Z · comments (21)

AI Pause Will Likely Backfire (Guest Post)
jsteinhardt · 2023-10-24T04:30:02.113Z · comments (6)

What distinguishes "early", "mid" and "end" games?
Raemon · 2024-06-21T17:41:30.816Z · comments (22)

Caring about excellence
owencb · 2024-07-22T14:24:37.892Z · comments (4)

4. Existing Writing on Corrigibility
Max Harms (max-harms) · 2024-06-10T14:08:35.590Z · comments (13)

[link] A Good Explanation of Differential Gears
Johannes C. Mayer (johannes-c-mayer) · 2023-10-19T23:07:46.354Z · comments (4)

Arguments for moral indefinability
Richard_Ngo (ricraz) · 2023-09-30T22:40:04.325Z · comments (16)

shortest goddamn bayes guide ever
lukehmiles (lcmgcd) · 2024-05-10T07:06:23.734Z · comments (8)

Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor
RogerDearnaley (roger-d-1) · 2024-01-09T20:42:28.349Z · comments (8)

Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback
Marcus Williams · 2024-11-07T15:39:06.854Z · comments (6)

Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes
Anna Gajdova (anna-gajdova) · 2024-10-09T12:56:24.856Z · comments (14)

[link] Five projects from AI Safety Hub Labs 2023
charlie_griffin (cjgriffin) · 2023-11-08T19:19:37.759Z · comments (1)

Some Experiments I'd Like Someone To Try With An Amnestic
johnswentworth · 2024-05-04T22:04:19.692Z · comments (33)

Mission Impossible: Dead Reckoning Part 1 AI Takeaways
Zvi · 2023-11-01T12:52:29.341Z · comments (13)

[link] Constructive Cauchy sequences vs. Dedekind cuts
jessicata (jessica.liu.taylor) · 2024-03-14T23:04:07.300Z · comments (23)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

simon on OpenAI Email Archives (from Musk v. Altman)

Musk did also express concern about DeepMind making Hassabis the effective emperor of humanity, which seems much stranger - Hassabis' values appear to be quite standard humanist ones, so you'd think having him in charge of a project with the clear lead would be a best-case scenario for anything other than being in charge yourself.

It seems the concern was that DeepMind would create a singleton, whereas their vision was for many people (potentially with different values) to have access to it. I don't think that's strange at all - it's only strange if you assume that Musk and Altman would believe that a singleton is inevitable.

Musk:

If they win, it will be really bad news with their one mind to rule the world philosophy.

Altman:

The mission would be to create the first general AI and use it for individual empowerment—ie, the distributed version of the future that seems the safest.

daniel-murfet on Alexander Gietelink Oldenziel's Shortform

Typo, I think you meant singularity theory :p

daniel-murfet on Alexander Gietelink Oldenziel's Shortform

Modern mathematics is less about solving problems within established frameworks and more about designing entirely new games with their own rules. While school mathematics teaches us to be skilled players of pre-existing mathematical games, research mathematics requires us to be game designers, crafting rule systems that lead to interesting and profound consequences

I don't think so. This probably describes the kind of mathematics you aspire to do, but still the bulk of modern research in mathematics is in fact about solving problems within established frameworks and usually such research doesn't require us to "be game designers". Some of us are of course drawn to the kinds of frontiers where such work is necessary, and that's great, but I think this description undervalues the within-paradigm work that is the bulk of what is going on.

shankar-sivarajan on Shortform

They do the same thing with "child pornography": that's mostly teenagers sexting. And a girl was convicted for it too, charged with distributing child pornography of herself: link.

sharmake-farah on What (if anything) made your p(doom) go down in 2024?

I'd say the main things that made my own p(Doom) went down this year are the following:

I've come to believe that data was both a major factor in capabilities and alignment, and I also believe that careful interventions on that data could be really helpful for alignment.
I've come to think that instrumental convergence is closer to a scalar quantity than a boolean, and while I don't think 0 instrumental convergence is incentivized for capabilities and domain reasons, I do think that restraining instrumental convergence/putting useful constraints on instrumental convergence like world models is helpful for capabilities to the extent that I think that power-seeking will likely be a lot more local than what humans do.
I've overall shifted towards a worldview where the common thought experiment of the second-species argument, where humans have killed over 90%+ of chimpanzees and gorillas due to them running away with intelligence and being misaligned neglects very crucial differences between the human and the AI case that makes my p(Doom) lower.

(Maybe another way to say it is I think the outcome of humans just completely running roughshod on every other species due to instrumental convergence is not the median outcome of AI development, but a deep outiler that is very uninformative to how AI outcomes will look like.)

I've come to believe that human values, or at least the generator of values, are actually simpler than a lot of people think, and that a lot of the complexity that appears to be there is because we generally don't like admitting that very simple rules can generate very complex outcomes.

romeostevensit on Ayn Rand’s model of “living money”; and an upside of burnout

Thank you!

haiku-1 on OpenAI Email Archives (from Musk v. Altman)

That requires interpretation, which can introduce unintended editorializing. If you spotted the intent, the rest of the audience can as well. (And if the audience is confused about intent, the original recipients may have been as well.)

I personally would include these sorts of notes about typos if I was writing my own thoughts about the original content, or if I was sharing a piece of it for a specific purpose. I take the intent of this post to be more of a form of accessible archiving.

cubefox on Alexander Gietelink Oldenziel's Shortform

In the past we already had examples ("logical AI", "Bayesian AI") where galaxy-brained mathematical approaches lost out against less theory-based software engineering.

seth-herd on OpenAI Email Archives (from Musk v. Altman)

I totally agree. And I also think that all involved are quite serious when they say they care about the outcomes for all of humanity. So I think in this case history turned on a knife edge; Musk would've at least not done this much harm had he and Page had clearer thinking and clearer communication, possibly just by a little bit.

But I do agree that there's some motivated reasoning happening there, too. In support of your point that Musk might find an excuse to do what he emotionally wanted to anyway (become humanity's savior and perhaps emperor for eternity): Musk did also express concern about DeepMind making Hassabis the effective emperor of humanity, which seems much stranger - Hassabis' values appear to be quite standard humanist ones, so you'd think having him in charge of a project with the clear lead would be a best-case scenario for anything other than being in charge yourself. So yes, I do think Musk, Altman, and people like them also have some powerful emotional drives toward doing grand things themselves.

It's a mix of motivations, noble and selfish, conscious and unconscious. That's true of all of us all the time, but it becomes particularly salient and worth analyzing when the future hangs in the balance.

lukas_gloor on OpenAI Email Archives (from Musk v. Altman)

I agree that it sounds somewhat premature to write off Larry Page based on attitudes he had a long time ago, when AGI seemed more abstract and far away, and then not seek/try communication with him again later on. If that were Musk's true and only reason for founding OpenAI, then I agree that this was a communication fuckup.

However, my best guess is that this story about Page was interchangeable with a number of alternative plausible criticisms of his competition on building AGI that Musk would likely have come up with in nearby worlds. People like Musk (and Altman too) tend to have a desire to do the most important thing and the belief that they can do this thing a lot better than anyone else. On that assumption, it's not too surprising that Musk found a reason for having to step in and build AGI himself. In fact, on this view, we should expect to see surprisingly little sincere exploration of "joining someone else's project to improve it" solutions.

I don't think this is necessarily a bad attitude. Sometimes people who think this way are right in the specific situation. It just means that we see the following patterns a lot:

Ambitious people start their own thing rather than join some existing thing.
Ambitious people have fallouts with each other after starting a project together where the question of "who eventually gets de facto ultimate control" wasn't totally specified from the start.

(Edited away a last paragraph that used to be here 50mins after posting. Wanted to express something like "Sometimes communication only prolongs the inevitable," but that sounds maybe a bit too negative because even if you're going to fall out eventually, probably good communication can help make it less bad.)