LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Eye contact is effortless when you’re no longer emotionally blocked on it
Chipmonk · 2024-09-27T21:47:01.970Z · comments (24)

AI companies' commitments
Zach Stein-Perlman · 2024-05-29T11:00:31.339Z · comments (0)

[link] "Model UN Solutions"
Arjun Panickssery (arjun-panickssery) · 2023-12-08T23:06:33.490Z · comments (5)

[link] Scaling laws for dominant assurance contracts
jessicata (jessica.liu.taylor) · 2023-11-28T23:11:07.631Z · comments (5)

The Evolution of Humans Was Net-Negative for Human Values
Zack_M_Davis · 2024-04-01T16:01:10.037Z · comments (1)

[question] What are your cruxes for imprecise probabilities / decision rules?
Anthony DiGiovanni (antimonyanthony) · 2024-07-31T15:42:27.057Z · answers+comments (29)

[link] UC Berkeley course on LLMs and ML Safety
Dan H (dan-hendrycks) · 2024-07-09T15:40:00.920Z · comments (1)

List of strategies for mitigating deceptive alignment
joshc (joshua-clymer) · 2023-12-02T05:56:50.867Z · comments (2)

AI #47: Meet the New Year
Zvi · 2024-01-13T16:20:10.519Z · comments (7)

[link] Big tech transitions are slow (with implications for AI)
jasoncrawford · 2024-10-24T14:25:06.873Z · comments (16)

[link] Shifting Headspaces - Transitional Beast-Mode
Jonathan Moregård (JonathanMoregard) · 2024-08-12T13:02:06.120Z · comments (9)

On Dwarkesh’s 3rd Podcast With Tyler Cowen
Zvi · 2024-02-02T19:30:05.974Z · comments (9)

[link] Searching for the Root of the Tree of Evil
Ivan Vendrov (ivan-vendrov) · 2024-06-08T17:05:53.950Z · comments (14)

Please Bet On My Quantified Self Decision Markets
niplav · 2023-12-01T20:07:38.284Z · comments (6)

AI Safety Camp final presentations
Linda Linsefors · 2024-03-29T14:27:43.503Z · comments (3)

Is the Power Grid Sustainable?
jefftk (jkaufman) · 2024-10-26T02:30:06.612Z · comments (38)

(Appetitive, Consummatory) ≈ (RL, reflex)
Steven Byrnes (steve2152) · 2024-06-15T15:57:39.533Z · comments (1)

But Where do the Variables of my Causal Model come from?
Dalcy (Darcy) · 2024-08-09T22:07:57.395Z · comments (1)

Good job opportunities for helping with the most important century
HoldenKarnofsky · 2024-01-18T17:30:03.332Z · comments (0)

Live Machinery: An Interface Design Philosophy for Wholesome AI Futures (Workshop @ EA Hotel!)
Sahil · 2024-11-01T17:24:09.957Z · comments (2)

Finding the Wisdom to Build Safe AI
Gordon Seidoh Worley (gworley) · 2024-07-04T19:04:16.089Z · comments (10)

Deeply Cover Car Crashes?
jefftk (jkaufman) · 2023-12-10T22:20:01.133Z · comments (31)

Mental Masturbation and the Intellectual Comfort Zone
Declan Molony (declan-molony) · 2024-05-07T05:47:05.257Z · comments (2)

Drone Wars Endgame
RussellThor · 2024-02-01T02:30:46.161Z · comments (71)

[link] Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation
Soroush Pour (soroush-pour) · 2023-11-07T17:59:36.857Z · comments (2)

Debate: Is it ethical to work at AI capabilities companies?
Ben Pace (Benito) · 2024-08-14T00:18:38.846Z · comments (21)

[link] Who is Sam Bankman-Fried (SBF) really, and how could he have done what he did? - three theories and a lot of evidence
spencerg · 2023-11-11T01:04:22.747Z · comments (28)

[question] Snapshot of narratives and frames against regulating AI
Jan_Kulveit · 2023-11-01T16:30:19.116Z · answers+comments (19)

Comparing representation vectors between llama 2 base and chat
Nina Panickssery (NinaR) · 2023-10-28T22:54:37.059Z · comments (5)

We are already in a persuasion-transformed world and must take precautions
trevor (TrevorWiesinger) · 2023-11-04T15:53:31.345Z · comments (14)

Childhood and Education Roundup #5
Zvi · 2024-04-17T13:00:03.015Z · comments (4)

We’re not as 3-Dimensional as We Think
silentbob · 2024-08-04T14:39:16.799Z · comments (16)

[link] Toki pona FAQ
dkl9 · 2024-03-17T21:44:21.782Z · comments (8)

AI #34: Chipping Away at Chip Exports
Zvi · 2023-10-19T15:00:03.055Z · comments (19)

The (partial) fallacy of dumb superintelligence
Seth Herd · 2023-10-18T21:25:16.893Z · comments (5)

[link] Learning coefficient estimation: the details
Zach Furman (zfurman) · 2023-11-16T03:19:09.013Z · comments (0)

Introduce a Speed Maximum
jefftk (jkaufman) · 2024-01-11T02:50:04.284Z · comments (28)

An anti-inductive sequence
Viliam · 2024-08-14T12:28:54.226Z · comments (10)

Dangers of Closed-Loop AI
Gordon Seidoh Worley (gworley) · 2024-03-22T23:52:22.010Z · comments (9)

How I select alignment research projects
Ethan Perez (ethan-perez) · 2024-04-10T04:33:08.092Z · comments (4)

Open Problems in AIXI Agent Foundations
Cole Wyeth (Amyr) · 2024-09-12T15:38:59.007Z · comments (2)

My Detailed Notes & Commentary from Secular Solstice
Jeffrey Heninger (jeffrey-heninger) · 2024-03-23T18:48:51.894Z · comments (16)

[link] Hyperreals in a Nutshell
Yudhister Kumar (randomwalks) · 2023-10-15T14:23:58.027Z · comments (27)

[question] What is an "anti-Occamian prior"?
Zane · 2023-10-23T02:26:10.851Z · answers+comments (22)

[link] On Fables and Nuanced Charts
Niko_McCarty (niko-2) · 2024-09-08T17:09:07.503Z · comments (2)

Forecasting AI (Overview)
jsteinhardt · 2023-11-16T19:00:04.218Z · comments (0)

Book Review: On the Edge: The Gamblers
Zvi · 2024-09-24T11:50:06.065Z · comments (1)

Monthly Roundup #22: September 2024
Zvi · 2024-09-17T12:20:08.297Z · comments (10)

An Introduction to Representation Engineering - an activation-based paradigm for controlling LLMs
Jan Wehner · 2024-07-14T10:37:21.544Z · comments (5)

[link] AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks
aogara (Aidan O'Gara) · 2023-10-31T19:34:54.837Z · comments (1)


Recent comments

evhub on Simple probes can catch sleeper agents

Our work here is not arguing that probing is a perfect solution in general; it's just a single datapoint of how it fares on the models from our Sleeper Agents paper.

seth-herd on OpenAI Email Archives (from Musk v. Altman)

What I'm saying is that the people you mention should put a little more time into it. When I've been involved in philosophy discussions with academics, people tend to treat it like a fun game, with the goal being more to score points and come up with clever new arguments than to converge on the truth.

I think most of the world doesn't take philosophy seriously, and they should.

I think the world thinks "there aren't real answers to philosophical questions, just personal preferences and a confusing mess of opinions". I think that's mostly wrong; LW does tend to cause convergence on a lot of issues for a lot of people. That might be groupthink, but I held almost identical philosophical views before engaging with LW - because I took the questions seriously and was truth-seeking.

I think Musk or Page are fully capable of LW-style philosophy if they put a little time into it - and took it seriously (were truth-seeking).

What would change people's attitudes? Well, I'm hoping that facing serious questions in how we create, use, and treat AI does cause at least some people to take the associated philosophical questions seriously.

anders-lindstroem on "It's a 10% chance which I did 10 times, so it should be 100%"

Yes. But I think you have mixed up expected value and expected utility. Please show your calculations.

satron on Why imperfect adversarial robustness doesn't doom AI control

Great post, very clearly written. Going to share it in my spaces.

anders-lindstroem on "It's a 10% chance which I did 10 times, so it should be 100%"

I do not understand your reasoning. Please show your calculations.

sherrinford on Monthly Roundup #24: November 2024

"Stephanie Murray reports that the village thing can still be done, and in particular has pulled off a ‘baby swapping’ system that periodically pools child care so parents can have time for themselves."

Maybe there is more detail in the linked blog but just from this post it sounds like a reinvention of Kindergarten.

egor-timatkov on "It's a 10% chance which I did 10 times, so it should be 100%"

Ah, shoot. You're right. Probably not good to use "odds" and "probability" interchangeably for percentages like I did. Should be fixed now.

zy on Rauno's Shortform

Yeah that makes sense; the knowledge should still be there, just need to re-shift the distribution "back"

raemon on OpenAI Email Archives (from Musk v. Altman)

Noting, this doesn't really engage with any of the particular other claims in the previous comment's link, just makes a general assertion.

thomas-kwa on Thomas Kwa's Shortform

The North Wind, the Sun, and Abadar

One day, the North Wind and the Sun argued about which of them was the strongest. Abadar, the god of commerce and civilization, stopped to observe their dispute. “Why don’t we settle this fairly?” he suggested. “Let us see who can compel that traveler on the road below to remove his cloak.”

The North Wind agreed, and with a mighty gust, he began his effort. The man, feeling the bitter chill, clutched his cloak tightly around him and even pulled it over his head to protect himself from the relentless wind. After a time, the North Wind gave up, frustrated.

Then the Sun tried his turn. Beaming warmly from the heavens, the Sun caused the air to grow pleasant and balmy. The man, feeling the growing heat, loosened his cloak and eventually took it off in the heat, resting under the shade of a tree. The Sun began to declare victory, but as soon as he turned away, the man put on the cloak again.

The god of commerce then approached the traveler, jingling a pouch of gold coins in his hand.

“Good sir,” Abadar called out, “that cloak of yours—how much would you sell it for?”

The man considered the offer, his eyes lighting up. “Five coins,” he replied hesitantly.

“Done,” said Abadar, handing over the coins and taking the cloak. The traveler tucked the money away and continued on his way, unbothered by either wind or heat. He soon bought a new cloak and invested the remainder in an index fund. The returns were steady, and in time he prospered far beyond the value of his simple cloak.

“See,” Abadar declared to the Wind and the Sun, “strength lies neither in force nor persuasion but in creating opportunities for mutual benefit. The cloak is mine permanently, and the man is better off as well. My solution also has minimal deadweight loss, assuming the elasticity—”

Before Abadar could say any more, the North Wind grumbled, the Sun conceded, and Abadar strode away, his wisdom proven. Thus, it was decided that commerce, when conducted wisely, can accomplish what neither force nor gentle persuasion alone can achieve.