LessWrong 2.0 Reader

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)
[link] Peak Human Capital
PeterMcCluskey · 2024-09-30T21:13:30.421Z · comments (3)
[link] GPT-4o System Card
Zach Stein-Perlman · 2024-08-08T20:30:52.633Z · comments (11)
Shard Theory - is it true for humans?
Rishika (rishika-bose) · 2024-06-14T19:21:47.997Z · comments (7)
The Hessian rank bounds the learning coefficient
Lucius Bushnaq (Lblack) · 2024-08-08T20:55:36.960Z · comments (9)
Duct Tape security
Isaac King (KingSupernova) · 2024-04-26T18:57:05.659Z · comments (11)
Indecision and internalized authority figures
Kaj_Sotala · 2024-07-06T10:10:02.528Z · comments (1)
Introducing AI-Powered Audiobooks of Rational Fiction Classics
Askwho · 2024-05-04T17:32:49.719Z · comments (14)
What and Why: Developmental Interpretability of Reinforcement Learning
Garrett Baker (D0TheMath) · 2024-07-09T14:09:40.649Z · comments (4)
Don't Share Information Exfohazardous on Others' AI-Risk Models
Thane Ruthenis · 2023-12-19T20:09:06.244Z · comments (11)
"Fractal Strategy" workshop report
Raemon · 2024-04-06T21:26:53.263Z · comments (22)
o1-preview is pretty good at doing ML on an unknown dataset
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-09-20T08:39:49.927Z · comments (1)
SB 1047 Is Weakened
Zvi · 2024-06-06T13:40:41.547Z · comments (4)
Why Large Bureaucratic Organizations?
johnswentworth · 2024-08-27T18:30:07.422Z · comments (52)
AI #39: The Week of OpenAI
Zvi · 2023-11-23T15:10:04.865Z · comments (8)
[link] Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien · 2024-07-30T21:11:36.866Z · comments (1)
[link] Why not electric trains and excavators?
bhauth · 2023-11-21T00:07:17.967Z · comments (39)
Timaeus is hiring!
Jesse Hoogland (jhoogland) · 2024-07-12T23:42:28.651Z · comments (6)
AE Studio @ SXSW: We need more AI consciousness research (and further resources)
AE Studio (AEStudio) · 2024-03-26T20:59:09.129Z · comments (8)
Thoughts On (Solving) Deep Deception
Jozdien · 2023-10-21T22:40:10.060Z · comments (2)
Ophiology (or, how the Mamba architecture works)
Danielle Ensign (phylliida-dev) · 2024-04-09T19:31:09.975Z · comments (8)
AI #42: The Wrong Answer
Zvi · 2023-12-14T14:50:05.086Z · comments (6)
[link] The economics of space tethers
harsimony · 2024-08-22T16:15:22.699Z · comments (22)
EIS XIV: Is mechanistic interpretability about to be practically useful?
scasper · 2024-10-11T22:13:51.033Z · comments (4)
[link] Non-superintelligent paperclip maximizers are normal
jessicata (jessica.liu.taylor) · 2023-10-10T00:29:53.072Z · comments (4)
minutes from a human-alignment meeting
bhauth · 2024-05-24T05:01:53.904Z · comments (4)
[question] Will quantum randomness affect the 2028 election?
Thomas Kwa (thomas-kwa) · 2024-01-24T22:54:30.800Z · answers+comments (52)
[link] Funding case: AI Safety Camp
Remmelt (remmelt-ellen) · 2023-12-12T09:08:18.911Z · comments (5)
Out-of-distribution Bioattacks
jefftk (jkaufman) · 2023-12-02T12:20:05.626Z · comments (15)
Implementing activation steering
Annah (annah) · 2024-02-05T17:51:55.851Z · comments (7)
FAQ: What the heck is goal agnosticism?
porby · 2023-10-08T19:11:50.269Z · comments (36)
[link] Shane Legg interview on alignment
Seth Herd · 2023-10-28T19:28:52.223Z · comments (20)
Friendship is transactional, unconditional friendship is insurance
Ruby · 2024-07-17T22:52:41.967Z · comments (24)
OpenAI: Altman Returns
Zvi · 2023-11-30T14:10:05.469Z · comments (12)
AI #35: Responsible Scaling Policies
Zvi · 2023-10-26T13:30:02.439Z · comments (10)
OpenAI's Preparedness Framework: Praise & Recommendations
Akash (akash-wasil) · 2024-01-02T16:20:04.249Z · comments (1)
How to be an amateur polyglot
arisAlexis (arisalexis) · 2024-05-08T15:08:11.404Z · comments (16)
Reinforcement Via Giving People Cookies
Screwtape · 2023-11-15T04:34:21.119Z · comments (9)
[link] Most experts believe COVID-19 was probably not a lab leak
DanielFilan · 2024-02-02T19:28:00.319Z · comments (89)
[link] Towards Understanding Sycophancy in Language Models
Ethan Perez (ethan-perez) · 2023-10-24T00:30:48.923Z · comments (0)
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Joar Skalse (Logical_Lunatic) · 2024-05-17T19:13:31.380Z · comments (10)
An AI Race With China Can Be Better Than Not Racing
niplav · 2024-07-02T17:57:36.976Z · comments (32)
Preventing model exfiltration with upload limits
ryan_greenblatt · 2024-02-06T16:29:33.999Z · comments (21)
Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours
Seth Herd · 2024-08-05T15:38:09.682Z · comments (22)
[link] On Shifgrethor
JustisMills · 2024-10-27T15:30:13.688Z · comments (17)
[link] Static Analysis As A Lifestyle
adamShimi · 2024-07-03T18:29:37.384Z · comments (11)
Schelling game evaluations for AI control
Olli Järviniemi (jarviniemi) · 2024-10-08T12:01:24.389Z · comments (5)
[link] So you want to save the world? An account in paladinhood
Tamsin Leake (carado-1) · 2023-11-22T17:40:33.048Z · comments (19)
[link] The Perceptron Controversy
Yuxi_Liu · 2024-01-10T23:07:23.341Z · comments (18)
How a chip is designed
YM (Yannick_Muehlhaeuser_duplicate0.05902100825326273) · 2024-06-28T08:04:27.392Z · comments (4)