LessWrong 2.0 Reader

Sparse MLP Distillation
slavachalnev · 2024-01-15T19:39:02.926Z · comments (3)
Inference-Only Debate Experiments Using Math Problems
Arjun Panickssery (arjun-panickssery) · 2024-08-06T17:44:27.293Z · comments (0)
[link] 2024 State of the AI Regulatory Landscape
Deric Cheng (deric-cheng) · 2024-05-28T11:59:06.582Z · comments (0)
The Intentional Stance, LLMs Edition
Eleni Angelou (ea-1) · 2024-04-30T17:12:29.005Z · comments (3)
[link] Baking vs Patissing vs Cooking, the HPS explanation
adamShimi · 2024-07-17T20:29:09.645Z · comments (16)
Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (10)
RA Bounty: Looking for feedback on screenplay about AI Risk
Writer · 2023-10-26T13:23:02.806Z · comments (6)
[link] The origins of the steam engine: An essay with interactive animated diagrams
jasoncrawford · 2023-11-29T18:30:36.315Z · comments (1)
[link] There is no IQ for AI
Gabriel Alfour (gabriel-alfour-1) · 2023-11-27T18:21:26.196Z · comments (10)
Information-Theoretic Boxing of Superintelligences
JustinShovelain · 2023-11-30T14:31:11.798Z · comments (0)
Protestants Trading Acausally
Martin Sustrik (sustrik) · 2024-04-01T14:46:26.374Z · comments (4)
AI #59: Model Updates
Zvi · 2024-04-11T14:20:06.339Z · comments (2)
AI #62: Too Soon to Tell
Zvi · 2024-05-02T15:40:04.364Z · comments (8)
The Math of Suspicious Coincidences
Roko · 2024-02-07T13:32:35.513Z · comments (3)
Against "argument from overhang risk"
RobertM (T3t) · 2024-05-16T04:44:00.318Z · comments (11)
[link] Evaluating Stability of Unreflective Alignment
james.lucassen · 2024-02-01T22:15:40.902Z · comments (3)
Interpreting the Learning of Deceit
RogerDearnaley (roger-d-1) · 2023-12-18T08:12:39.682Z · comments (14)
[link] When scientists consider whether their research will end the world
Harlan · 2023-12-19T03:47:06.645Z · comments (4)
A Case for Superhuman Governance, using AI
ozziegooen · 2024-06-07T00:10:10.902Z · comments (0)
"Full Automation" is a Slippery Metric
ozziegooen · 2024-06-11T19:56:49.855Z · comments (1)
Differential Optimization Reframes and Generalizes Utility-Maximization
J Bostock (Jemist) · 2023-12-27T01:54:22.731Z · comments (2)
[link] Managing AI Risks in an Era of Rapid Progress
Algon · 2023-10-28T15:48:25.029Z · comments (3)
[link] Our Digital and Biological Children
Eneasz · 2024-10-24T18:36:38.719Z · comments (0)
AI #85: AI Wins the Nobel Prize
Zvi · 2024-10-10T13:40:07.286Z · comments (6)
AIS terminology proposal: standardize terms for probability ranges
eggsyntax · 2024-08-30T15:43:39.857Z · comments (12)
Fun With CellxGene
sarahconstantin · 2024-09-06T22:00:03.461Z · comments (2)
[link] Safety tax functions
owencb · 2024-10-20T14:08:38.099Z · comments (0)
[link] [Paper] Hidden in Plain Text: Emergence and Mitigation of Steganographic Collusion in LLMs
Yohan Mathew (ymath) · 2024-09-25T14:52:48.263Z · comments (1)
[link] AI forecasting bots incoming
Dan H (dan-hendrycks) · 2024-09-09T19:14:31.050Z · comments (44)
[link] My Methodological Turn
adamShimi · 2024-09-29T15:01:45.986Z · comments (0)
[question] Where to find reliable reviews of AI products?
Elizabeth (pktechgirl) · 2024-09-17T23:48:25.899Z · answers+comments (6)
Examples of How I Use LLMs
jefftk (jkaufman) · 2024-10-14T17:10:04.597Z · comments (2)
Offering Completion
jefftk (jkaufman) · 2024-06-07T01:40:02.137Z · comments (6)
Reviewing the Structure of Current AI Regulations
Deric Cheng (deric-cheng) · 2024-05-07T12:34:17.820Z · comments (0)
Weekly newsletter for AI safety events and training programs
Bryce Robertson (bryceerobertson) · 2024-05-03T00:33:29.418Z · comments (0)
AI #61: Meta Trouble
Zvi · 2024-05-02T18:40:03.242Z · comments (0)
Non-myopia stories
lberglund (brglnd) · 2023-11-13T17:52:31.933Z · comments (10)
Big-endian is better than little-endian
Menotim · 2024-04-29T02:30:48.053Z · comments (17)
A Common-Sense Case For Mutually-Misaligned AGIs Allying Against Humans
Thane Ruthenis · 2023-12-17T20:28:57.854Z · comments (7)
Quick Thoughts on Our First Sampling Run
jefftk (jkaufman) · 2024-05-23T00:20:02.050Z · comments (3)
Paper Summary: Princes and Merchants: European City Growth Before the Industrial Revolution
Jeffrey Heninger (jeffrey-heninger) · 2024-07-15T21:30:04.043Z · comments (1)
Experience Report - ML4Good AI Safety Bootcamp
Kieron Kretschmar · 2024-04-11T18:03:41.040Z · comments (0)
End-to-end hacking with language models
tchauvin (timot.cool) · 2024-04-05T15:06:53.689Z · comments (0)
Please Understand
samhealy · 2024-04-01T12:33:20.459Z · comments (11)
[question] How does it feel to switch from earn-to-give?
Neil (neil-warren) · 2024-03-31T16:27:22.860Z · answers+comments (4)
[link] Anthropic: Reflections on our Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-05-20T04:14:44.435Z · comments (21)
Deception Chess: Game #2
Zane · 2023-11-29T02:43:22.375Z · comments (17)
Glomarization FAQ
Zane · 2023-11-15T20:20:49.488Z · comments (5)
“Clean” vs. “messy” goal-directedness (Section 2.2.3 of “Scheming AIs”)
Joe Carlsmith (joekc) · 2023-11-29T16:32:30.068Z · comments (1)
DPO/PPO-RLHF on LLMs incentivizes sycophancy, exaggeration and deceptive hallucination, but not misaligned powerseeking
tailcalled · 2024-06-10T21:20:11.938Z · comments (13)