LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Meta: On viewing the latest LW posts
quiet_NaN · 2024-08-25T19:31:39.008Z · comments (2)
Metastrategy get-started guide
Tahp · 2024-06-25T15:04:11.542Z · comments (1)
Agency overhang as a proxy for Sharp left turn
Eris (anton-zheltoukhov) · 2024-11-07T12:14:24.333Z · comments (0)
[question] Artificial vs. Organoid Intelligence
10xyz (10xyz-coder) · 2024-10-23T14:31:46.385Z · answers+comments (0)
[link] Launching the Respiratory Outlook 2024/25 Forecasting Series
ChristianWilliams · 2024-07-17T19:51:05.380Z · comments (0)
Scattered thoughts on what it means for an LLM to believe
TheManxLoiner · 2024-11-06T22:10:29.429Z · comments (3)
New Capabilities, New Risks? - Evaluating Agentic General Assistants using Elements of GAIA & METR Frameworks
Tej Lander (tej-lander) · 2024-09-29T18:58:56.253Z · comments (0)
[link] An "Observatory" For a Shy Super AI?
Sherrinford · 2024-09-27T21:22:40.296Z · comments (0)
Introduction to Modern Dating: Strategic Dating Advice for Beginners
Jesper Lindholm · 2024-07-20T15:45:25.705Z · comments (6)
A simple text status can change something
nextcaller · 2024-06-23T18:48:58.580Z · comments (0)
[link] Join the $10K AutoHack 2024 Tournament
Paul Bricman (paulbricman) · 2024-09-25T11:54:20.112Z · comments (0)
Freedom and Privacy of Thought Architectures
JohnBuridan · 2024-07-20T21:43:11.419Z · comments (2)
Apply to be a mentor in SPAR!
agucova · 2024-11-05T21:32:45.797Z · comments (0)
Using Narrative Prompting to Extract Policy Forecasts from LLMs
Max Ghenis (MaxGhenis) · 2024-11-05T04:37:52.004Z · comments (0)
Mentorship in AGI Safety: Applications for mentorship are open!
Valentin2026 (Just Learning) · 2024-06-28T14:49:48.501Z · comments (0)
[link] AI Safety Newsletter #41: The Next Generation of Compute Scale. Plus, Ranking Models by Susceptibility to Jailbreaking, and Machine Ethics
Corin Katzke (corin-katzke) · 2024-09-11T19:14:08.274Z · comments (1)
Educational CAI: Aligning a Language Model with Pedagogical Theories
Bharath Puranam (bharath-puranam) · 2024-11-01T18:55:26.993Z · comments (1)
[question] Can UBI overcome inflation and rent seeking?
Gordon Seidoh Worley (gworley) · 2024-08-01T00:13:51.693Z · answers+comments (34)
[link] Social interaction-inspired AI alignment
Chipmonk · 2024-06-24T08:10:08.719Z · comments (2)
[link] So You've Learned To Teleport by Tom Scott
landscape_kiwi · 2024-07-17T18:04:37.272Z · comments (0)
Interest poll: A time-waster blocker for desktop Linux programs
nahoj · 2024-08-22T20:44:04.479Z · comments (5)
[link] Predictions as Public Works Project — What Metaculus Is Building Next
ChristianWilliams · 2024-10-22T16:35:13.999Z · comments (0)
[question] What are the strategic implications if aliens and Earth civilizations produce similar utilities?
Maxime Riché (maxime-riche) · 2024-08-06T21:16:21.719Z · answers+comments (1)
[link] A Logical Proof for the Emergence and Substrate Independence of Sentience
rife (edgar-muniz) · 2024-10-24T21:08:09.398Z · comments (31)
Reasoning is not search - a chess example
p.b. · 2024-08-06T09:29:40.451Z · comments (3)
[question] How do you follow AI (safety) news?
PeterH · 2024-09-24T13:58:48.916Z · answers+comments (2)
Likelihood calculation with duobels
Martin Gerdes (martin-gerdes) · 2024-10-01T16:21:01.268Z · comments (0)
Effective Empathy
Thac0 · 2024-07-11T15:14:22.430Z · comments (1)
Can Current LLMs be Trusted To Produce Paperclips Safely?
Rohit Chatterjee (rohit-c) · 2024-08-19T17:17:07.530Z · comments (0)
Ways to think about alignment
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2024-10-27T01:40:50.762Z · comments (0)
Madrid - ACX Meetups Everywhere Fall 2024
Pablo Villalobos (pvs) · 2024-08-05T18:36:55.136Z · comments (0)
[question] When do alignment researchers retire?
Jordan Taylor (Nadroj) · 2024-06-25T23:30:25.520Z · answers+comments (2)
Effects of Non-Uniform Sparsity on Superposition in Toy Models
Shreyans Jain (shreyans-jain) · 2024-11-14T16:59:43.234Z · comments (3)
[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)
[link] Game Theory and Society
Zero Contradictions · 2024-08-05T04:27:37.275Z · comments (0)
[question] Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
MrThink (ViktorThink) · 2024-07-02T20:13:24.054Z · answers+comments (23)
Contrapositive Natural Abstraction - Project Intro
Elliot Callender (javanotmocha) · 2024-06-24T18:37:21.761Z · comments (5)
The great Enigma in the sky: The universe as an encryption machine
Alex_Shleizer · 2024-08-14T13:21:58.713Z · comments (1)
What are Emotions?
Myles H (zarsou9) · 2024-11-15T04:20:27.388Z · comments (7)
Some Comments on Recent AI Safety Developments
testingthewaters · 2024-11-09T16:44:58.936Z · comments (0)
Building Safer AI from the Ground Up: Steering Model Behavior via Pre-Training Data Curation
Antonio Clarke (antonio-clarke) · 2024-09-29T18:48:23.308Z · comments (0)
It is time to start war gaming for AGI
yanni kyriacos (yanni) · 2024-10-17T05:14:17.932Z · comments (1)
[question] Isomorphisms don't preserve subjective experience... right?
notfnofn · 2024-07-03T14:22:59.679Z · answers+comments (26)
[link] Clopen sandwiches
dkl9 · 2024-07-14T13:07:58.345Z · comments (0)
Tokyo (Japanese-language) Japan - ACX Meetups Everywhere Fall 2024
Emi (emi-2) · 2024-08-29T18:35:28.013Z · comments (0)
Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)
On predictability, chaos and AIs that don't game our goals
Alejandro Tlaie (alejandro-tlaie-boria) · 2024-07-15T17:16:32.766Z · comments (8)
[question] Is there a known method to find others who came across the same potential infohazard without spoiling it to the public?
hive · 2024-10-17T10:47:05.099Z · answers+comments (6)
[question] is there a big dictionary somewhere with all your jargon and acronyms and whatnot?
KvmanThinking (avery-liu) · 2024-10-17T11:30:50.937Z · answers+comments (7)
[link] The ELYSIUM Proposal - Extrapolated voLitions Yielding Separate Individualized Utopias for Mankind
Roko · 2024-10-16T01:24:51.102Z · comments (18)