LessWrong 2.0 Reader

Impact stories for model internals: an exercise for interpretability researchers
jenny · 2023-09-25T23:15:29.189Z · comments (3)

[question] Potential alignment targets for a sovereign superintelligent AI
Paul Colognese (paul-colognese) · 2023-10-03T15:09:59.529Z · answers+comments (4)

Please Understand
samhealy · 2024-04-01T12:33:20.459Z · comments (11)

Experience Report - ML4Good AI Safety Bootcamp
Kieron Kretschmar · 2024-04-11T18:03:41.040Z · comments (0)

[question] How does it feel to switch from earn-to-give?
Neil (neil-warren) · 2024-03-31T16:27:22.860Z · answers+comments (4)

End-to-end hacking with language models
tchauvin (timot.cool) · 2024-04-05T15:06:53.689Z · comments (0)

AI #61: Meta Trouble
Zvi · 2024-05-02T18:40:03.242Z · comments (0)

[question] [link] Is Bjorn Lomborg roughly right about climate change policy?
yhoiseth · 2023-09-27T20:06:30.722Z · answers+comments (14)

Deception Chess: Game #2
Zane · 2023-11-29T02:43:22.375Z · comments (17)

Is the Wave non-disparagement thingy okay?
Ruby · 2023-10-14T05:31:21.640Z · comments (13)

[link] The Poker Theory of Poker Night
omark · 2024-04-07T09:47:01.658Z · comments (13)

Dishonorable Gossip and Going Crazy
Ben Pace (Benito) · 2023-10-14T04:00:35.591Z · comments (31)

On the 2nd CWT with Jonathan Haidt
Zvi · 2024-04-05T17:30:05.223Z · comments (3)

Let's talk about Impostor syndrome in AI safety
Igor Ivanov (igor-ivanov) · 2023-09-22T13:51:18.482Z · comments (4)

Non-myopia stories
lberglund (brglnd) · 2023-11-13T17:52:31.933Z · comments (10)

[link] GDP per capita in 2050
Hauke Hillebrandt (hauke-hillebrandt) · 2024-05-06T15:14:30.934Z · comments (8)

The (partial) fallacy of dumb superintelligence
Seth Herd · 2023-10-18T21:25:16.893Z · comments (5)

Big-endian is better than little-endian
Menotim · 2024-04-29T02:30:48.053Z · comments (17)

Results from the Turing Seminar hackathon
Charbel-Raphaël (charbel-raphael-segerie) · 2023-12-07T14:50:38.377Z · comments (1)

[link] Debate helps supervise human experts [Paper]
habryka (habryka4) · 2023-11-17T05:25:17.030Z · comments (6)

[link] One: a story
Richard_Ngo (ricraz) · 2023-10-10T00:18:31.604Z · comments (0)

Offering Completion
jefftk (jkaufman) · 2024-06-07T01:40:02.137Z · comments (6)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures [Evaluation and Ruleset]
abstractapplic · 2024-05-20T09:38:55.228Z · comments (2)

[link] Anthropic: Reflections on our Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-05-20T04:14:44.435Z · comments (21)

Adam Smith Meets AI Doomers
James_Miller · 2024-01-31T15:53:03.070Z · comments (10)

[link] What fuels your ambition?
Cissy · 2024-01-31T18:30:53.274Z · comments (1)

Investigating Bias Representations in LLMs via Activation Steering
DawnLu · 2024-01-15T19:39:14.077Z · comments (4)

[question] Weighing reputational and moral consequences of leaving Russia or staying
spza · 2024-02-18T19:36:40.676Z · answers+comments (24)

Reviewing the Structure of Current AI Regulations
Deric Cheng (deric-cheng) · 2024-05-07T12:34:17.820Z · comments (0)

Wholesome Culture
owencb · 2024-03-01T12:08:17.877Z · comments (3)

A Common-Sense Case For Mutually-Misaligned AGIs Allying Against Humans
Thane Ruthenis · 2023-12-17T20:28:57.854Z · comments (7)

Throughput vs. Latency
alkjash · 2024-01-12T21:37:07.632Z · comments (2)

Representation Tuning
Christopher Ackerman (christopher-ackerman) · 2024-06-27T17:44:33.338Z · comments (4)

[link] My MATS Summer 2023 experience
James Chua (james-chua) · 2024-03-20T11:26:14.944Z · comments (0)

Two Tales of AI Takeover: My Doubts
Violet Hour · 2024-03-05T15:51:05.558Z · comments (8)

Quick Thoughts on Our First Sampling Run
jefftk (jkaufman) · 2024-05-23T00:20:02.050Z · comments (3)

Aggregative Principles of Social Justice
Cleo Nardo (strawberry calm) · 2024-06-05T13:44:47.499Z · comments (10)

DPO/PPO-RLHF on LLMs incentivizes sycophancy, exaggeration and deceptive hallucination, but not misaligned powerseeking
tailcalled · 2024-06-10T21:20:11.938Z · comments (13)

Paper Summary: Princes and Merchants: European City Growth Before the Industrial Revolution
Jeffrey Heninger (jeffrey-heninger) · 2024-07-15T21:30:04.043Z · comments (1)

Scorable Functions: A Format for Algorithmic Forecasting
ozziegooen · 2024-05-21T04:14:11.749Z · comments (0)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

[question] Where to find reliable reviews of AI products?
Elizabeth (pktechgirl) · 2024-09-17T23:48:25.899Z · answers+comments (4)

Reading More Each Day: A Simple $35 Tool
aysajan · 2024-07-24T13:54:04.290Z · comments (2)

AI #65: I Spy With My AI
Zvi · 2024-05-23T12:40:02.793Z · comments (7)

Evaporation of improvements
Viliam · 2024-06-20T18:34:40.969Z · comments (27)

I played the AI box game as the Gatekeeper — and lost
datawitch · 2024-02-12T18:39:35.777Z · comments (52)

[link] AI Impacts 2023 Expert Survey on Progress in AI
habryka (habryka4) · 2024-01-05T19:42:17.226Z · comments (1)

AI #64: Feel the Mundane Utility
Zvi · 2024-05-16T15:20:02.956Z · comments (11)

Childhood and Education Roundup #6: College Edition
Zvi · 2024-06-26T11:40:03.990Z · comments (8)

Employee Incentives Make AGI Lab Pauses More Costly
nikola (nikolaisalreadytaken) · 2023-12-22T05:04:15.598Z · comments (12)

Recent comments

gwern on Applications of Chaos: Saying No (with Hastings Greer)

See my comment. The problem with the post is revealed in the fourth sentence:

To demonstrate how chaos theory imposes some limits on the skill of an arbitrary intelligence, I will also look at a game: pinball.

Note that predicting a ball is not at all the same thing as skill in manipulating a ball. It's just a giant non sequitur being slipped in before he begins the math. Which is why he is 100% wrong when he concludes

This is not a problem that is solvable by applying more cognitive effort.

It totally is solvable. The 'cognitive effort' here is 'git gud at pinball, scrub, and stop making excuses for losing', and as he admits in the footnote he didn't include in the LW version, in real life, when adequately incentivized to win rather than find excuses involving 'well, chaos theory shows you can't predict ball bounces more than n bounces out', pinball pros learn how to win and rack up high scores despite 'muh chaos'.

And that is why I don't believe your anecdotal survey responses imply anything good. I think that several or all of those cases, if we were able to investigate them adequately, would turn out to be similar to this pinball essay: a lot of browbeating intimidation-by-math, possibly completely valid insofar as it went, but ultimately proving an irrelevant claim, with the problem in fact soluble.

quila on What does it mean for an event or observation to have probability 0 or 1 in Bayesian terms?

I am not sure if this is the answer you're looking for, but it is the one I've come to prefer, and others have given the standard answer already.

You can write out a Bayesian algorithm to determine what happens. The program specifics determine the result. Two that come to mind of many possible results:

After making an observation which had prior probability 0%, the program returns a 'cannot divide by 0' error.
The program updates on the observation, but this rules out the entirety of its probability-space, and it soon fails to find expected items in an empty set.

Bayes' theorem is an algorithm which is used (and selected for) because it helps predict the world, rather than something with metaphysical status.

This usefulness also makes it selected for (in complex worlds where agents have incomplete information, like this one). We can also imagine different mathematical worlds where Bayes' theorem is not useful.
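
The two outcomes quila lists can be sketched in a few lines of Python. This is only an illustration under assumptions: the hypotheses (`fair`, `two_headed`), the observations, and both function names are made up for the example and are not taken from the comment. `bayes_update` normalizes by the probability of the observation and so hits a division by zero, while `filter_hypotheses` skips normalization and ends up with an empty hypothesis set.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses after one observation.

    prior: dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(observation | hypothesis)
    """
    # Probability of the observation under the prior.
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    # Raises ZeroDivisionError when the observation had prior probability 0.
    return {h: prior[h] * likelihood[h] / evidence for h in prior}


def filter_hypotheses(prior, likelihood):
    """Variant that drops ruled-out hypotheses and does not normalize."""
    weights = {h: prior[h] * likelihood[h] for h in prior}
    return {h: w for h, w in weights.items() if w > 0}


# Hypothetical setup: two coin hypotheses, equally likely a priori.
prior = {"fair": 0.5, "two_headed": 0.5}
heads = {"fair": 0.5, "two_headed": 1.0}          # an ordinary observation
lands_on_edge = {"fair": 0.0, "two_headed": 0.0}  # prior probability 0

print(bayes_update(prior, heads))

# Outcome 1: the normalizing version cannot divide by 0.
try:
    bayes_update(prior, lands_on_edge)
except ZeroDivisionError as err:
    print("update failed:", err)

# Outcome 2: the filtering version rules out every hypothesis,
# and later code that expects a best hypothesis finds an empty set.
survivors = filter_hypotheses(prior, lands_on_edge)
try:
    best = max(survivors, key=survivors.get)
except ValueError as err:
    print("no hypotheses left:", err)
```

Which behaviour you get depends entirely on how the program is written, which is the point of the comment: the theorem itself does not say what should happen.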

wyatt-s on Hammertime Day 5: Comfort Zone Expansion

I didn't so much find any specific thing in particular, as I found a sensation of excitement that had been missing from my life. I thought about people I could talk to, games I could play, shows I could watch. I am excited to do this again next Hammertime repeat.

yanni-kyriacos on yanni's Shortform

I am 90% sure that most AI Safety talent isn't thinking hard enough about Neglectedness. The industry is so nascent that you could look at 10 analogous industries, see what processes or institutions are valuable and missing, and build an organisation around the highest impact one.

The highest impact job ≠ the highest impact opportunity for you!

benito on Did Christopher Hitchens change his mind about waterboarding?

Curated. I appreciated reading this attempt to actually get to the bottom of a simple, widely popularized narrative. It's a helpful datapoint about how reliable narratives are that are spread around our civilization, and how much work is actually involved in checking what actually happened.

robo on Making Eggs Without Ovaries

I suspect experiments with almost-genetically identical twin tests might advance our understanding about almost all genes except sex chromosomes.

Sex chromosomes are independent coin flips with huge effect sizes. That's amazing! Nature provided us with experiments everywhere! Most alleles are confounded (e.g., correlated with socioeconomic status for no causal reason) and have very small effect sizes.

Example: Imagine an allele which is common in East Asians, uncommon in Europeans, and makes people 1.1 mm taller. Even though the allele causally makes people taller, the average height of the people with the allele (mostly Asian) would be less than the average height of the people without the allele (mostly European). The +1.1 mm in causal height gain would be drowned out by the ≈-50 mm in Simpson's paradox. Your almost-twin experiment gives signal where observational regression gives error.

That's not needed for sex differences. Poor people tend to have poor children. Caucasian people tend to have Caucasian children. Male people do not tend to have male children. It's pretty easy to extract signal about sex differences.

(far from my area of expertise)
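
The allele example in the comment above can be worked through numerically. The sketch below uses made-up inputs: the group sizes, allele frequencies (90% and 10%), and baseline heights (1700 mm and 1750 mm) are assumptions chosen only to reproduce the shape of the argument, and `carrier_gap` is a hypothetical helper. Only the +1.1 mm causal effect comes from the comment.

```python
def carrier_gap(groups, effect_mm):
    """Compare carriers with non-carriers two ways.

    groups: list of (population_size, allele_frequency, baseline_height_mm)
    effect_mm: causal height gain from carrying the allele
    Returns (naive_gap, within_group_gap), both in mm.
    """
    # Pooled comparison: ignore group membership entirely.
    carriers_n = sum(n * f for n, f, _ in groups)
    noncarriers_n = sum(n * (1 - f) for n, f, _ in groups)
    carriers_mean = sum(n * f * (h + effect_mm) for n, f, h in groups) / carriers_n
    noncarriers_mean = sum(n * (1 - f) * h for n, f, h in groups) / noncarriers_n
    naive_gap = carriers_mean - noncarriers_mean

    # Stratified comparison: compare carriers and non-carriers inside each
    # group, then average the per-group gaps weighted by group size.
    total_n = sum(n for n, _, _ in groups)
    within_gap = sum(n * ((h + effect_mm) - h) for n, _, h in groups) / total_n
    return naive_gap, within_gap


# Hypothetical populations (size, allele frequency, baseline height in mm).
groups = [
    (1000, 0.9, 1700.0),  # allele common, shorter baseline
    (1000, 0.1, 1750.0),  # allele uncommon, taller baseline
]

naive, within = carrier_gap(groups, effect_mm=1.1)
print(f"pooled carriers minus non-carriers: {naive:+.1f} mm")
print(f"within-group carriers minus non-carriers: {within:+.1f} mm")
```

With these assumed numbers the pooled comparison comes out strongly negative even though the within-group gap equals the causal effect, which is the confounding the comment describes.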

patrickdfarley on Laziness death spirals

I agree, but I'd lump all of that into "Analyze the circumstances that caused it". Maybe I should've included more external examples like these.

jimrandomh on Ozyrus's Shortform

Ah, sorry that one went unfixed for as long as it did; a fix is now written and should be deployed pretty soon.

patrickdfarley on Laziness death spirals

This method is interesting to me and I'd like to get into it someday. Personally I keep finding that whenever I decline to write something down, that one thing will come back to bite me a few days later (because I'd forgotten it). Do you find that you're able to mentally keep track of things better than before, even if they're just vaguely in the back of your mind?

elizabeth-1 on Applications of Chaos: Saying No (with Hastings Greer)

Remember the proof that humans can't get high scores playing pinball because 'chaos theory'?

Can you point to where the post says this? Because I read it as saying "It is impossible to predict a game of pinball for more than 12 bounces in the future" and "Professional pinball players try to avoid the parts of the board where the motion is chaotic."