LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Individually incentivized safe Pareto improvements in open-source bargaining
Nicolas Macé (NicolasMace) · 2024-07-17T18:26:43.619Z · comments (2)

[link] [Paper] Language Models Don't Learn the Physical Manifestation of Language
Bruce W. Lee (bruce-lee) · 2024-02-22T18:52:32.237Z · comments (23)

Instrumental deception and manipulation in LLMs - a case study
Olli Järviniemi (jarviniemi) · 2024-02-24T02:07:01.769Z · comments (13)

[link] Understanding Gödel’s completeness theorem
jessicata (jessica.liu.taylor) · 2024-05-27T18:55:02.079Z · comments (0)

Stitching SAEs of different sizes
Bart Bussmann (Stuckwork) · 2024-07-13T17:19:20.506Z · comments (12)

[Interim research report] Evaluating the Goal-Directedness of Language Models
Rauno Arike (rauno-arike) · 2024-07-18T18:19:04.260Z · comments (4)

Forget Everything (Statistical Mechanics Part 1)
J Bostock (Jemist) · 2024-04-22T13:33:35.446Z · comments (6)

Losing Faith In Contrarianism
omnizoid · 2024-04-25T20:53:34.842Z · comments (44)

LLMs as a Planning Overhang
Larks · 2024-07-14T02:54:14.295Z · comments (8)

Monthly Roundup #14: January 2024
Zvi · 2024-01-24T12:50:09.231Z · comments (22)

China-AI forecasts
[deleted] · 2024-02-25T16:49:33.652Z · comments (29)

[question] How would you navigate a severe financial emergency with no help or resources?
Tigerlily · 2024-05-02T18:27:51.329Z · answers+comments (22)

[link] Elon files grave charges against OpenAI
mako yass (MakoYass) · 2024-03-01T17:42:13.963Z · comments (10)

AI #48: The Talk of Davos
Zvi · 2024-01-25T16:20:26.625Z · comments (9)

[link] Things You're Allowed to Do: At the Dentist
rbinnn · 2024-01-28T18:39:33.584Z · comments (16)

Are we so good to simulate?
KatjaGrace · 2024-03-04T05:20:03.535Z · comments (24)

Inducing Unprompted Misalignment in LLMs
Sam Svenningsen (sven) · 2024-04-19T20:00:58.067Z · comments (6)

You're a Space Wizard, Luke
lsusr · 2024-08-18T05:35:39.238Z · comments (6)

[link] Increasing IQ is trivial
George3d6 · 2024-03-01T22:43:32.037Z · comments (60)

Evaluating Sparse Autoencoders with Board Game Models
Adam Karvonen (karvonenadam) · 2024-08-02T19:50:21.525Z · comments (1)

[link] Simple Kelly betting in prediction markets
jessicata (jessica.liu.taylor) · 2024-03-06T18:59:18.243Z · comments (3)

Dialogue on What It Means For Something to Have A Function/Purpose
johnswentworth · 2024-07-15T16:28:56.609Z · comments (5)

0.202 Bits of Evidence In Favor of Futarchy
niplav · 2024-09-29T21:57:59.896Z · comments (0)

Compelling Villains and Coherent Values
Cole Wyeth (Amyr) · 2024-10-06T19:53:47.891Z · comments (4)

Toy Models of Feature Absorption in SAEs
chanind · 2024-10-07T09:56:53.609Z · comments (7)

Tort Law Can Play an Important Role in Mitigating AI Risk
Gabriel Weil (gabriel-weil) · 2024-02-12T17:17:59.135Z · comments (9)

[link] On what research policymakers actually need
MondSemmel · 2024-04-23T19:50:12.833Z · comments (0)

Is This Lie Detector Really Just a Lie Detector? An Investigation of LLM Probe Specificity.
Josh Levy (josh-levy) · 2024-06-04T15:45:54.399Z · comments (0)

Requirements for a Basin of Attraction to Alignment
RogerDearnaley (roger-d-1) · 2024-02-14T07:10:20.389Z · comments (9)

Mud and Despair (Part 4 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-03-07T00:14:23.975Z · comments (0)

[link] [Linkpost] George Mack's Razors
trevor (TrevorWiesinger) · 2023-11-27T17:53:45.065Z · comments (8)

D&D.Sci: Whom Shall You Call?
abstractapplic · 2024-07-05T20:53:37.010Z · comments (6)

[question] What progress have we made on automated auditing?
LawrenceC (LawChan) · 2024-07-06T01:49:43.714Z · answers+comments (1)

[link] Win Friends and Influence People Ch. 2: The Bombshell
gull · 2024-01-28T21:40:47.986Z · comments (13)

From Finite Factors to Bayes Nets
J Bostock (Jemist) · 2024-01-23T20:03:51.845Z · comments (7)

[link] Tinker
Richard_Ngo (ricraz) · 2024-04-16T18:26:38.679Z · comments (0)

Making a Secular Solstice Songbook
jefftk (jkaufman) · 2024-01-23T19:40:05.055Z · comments (6)

Text Posts from the Kids Group: 2021
jefftk (jkaufman) · 2023-11-09T17:50:25.782Z · comments (1)

D&D.Sci(-fi): Colonizing the SuperHyperSphere [Evaluation and Ruleset]
abstractapplic · 2024-01-22T19:20:05.001Z · comments (7)

AI #70: A Beautiful Sonnet
Zvi · 2024-06-27T14:40:08.087Z · comments (0)

The Fundamental Theorem for measurable factor spaces
Matthias G. Mayer (matthias-georg-mayer) · 2023-11-12T19:25:25.583Z · comments (2)

[link] The consistent guessing problem is easier than the halting problem
jessicata (jessica.liu.taylor) · 2024-05-20T04:02:03.865Z · comments (5)

[link] An AI Manhattan Project is Not Inevitable
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-06T16:42:35.920Z · comments (25)

Natural abstractions are observer-dependent: a conversation with John Wentworth
Martín Soto (martinsq) · 2024-02-12T17:28:38.889Z · comments (13)

Stop talking about p(doom)
Isaac King (KingSupernova) · 2024-01-01T10:57:28.636Z · comments (22)

International Scientific Report on the Safety of Advanced AI: Key Information
Aryeh Englander (alenglander) · 2024-05-18T01:45:10.194Z · comments (0)

[link] Characterizing stable regions in the residual stream of LLMs
Jett Janiak (jett) · 2024-09-26T13:44:58.792Z · comments (4)

[link] Generative ML in chemistry is bottlenecked by synthesis
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-16T16:31:34.801Z · comments (2)

[link] [Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF
Leon Lang (leon-lang) · 2024-10-22T13:57:41.125Z · comments (0)

Australian AI Safety Forum 2024
Liam Carroll (liam-carroll) · 2024-09-27T00:40:11.451Z · comments (0)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

alexej-gerstmaier-1 on The Case For Bullying

Hi Justin, I already read both the posts you linked there.

My desire for Truth is overwhelmingly strong, I would change my stance if anyone would present some actual counter-arguments that go beyond the surface level.

Will give longer rebuttal later, am currently on vacation in Spain 🤝

alexej-gerstmaier-1 on The Case For Bullying

Thanks for linking, I love Worm

tailcalled on Alexander Gietelink Oldenziel's Shortform

For everyday life, flat earth is more convenient than round earth geocentrism, which in turn is more convenient than heliocentrism. Like we don't constantly change our city maps based on the time of year, for instance, which we would have to do if we used a truly heliocentric coordinate system as the positions of city buildings are not even approximately constant within such a coordinate system.

This is mainly because the sun and the earth are powerful enough to handle heliocentrism for you, e.g. the earth pulls you and the cities towards the earth so you don't have to put effort into staying on it.

The sun and the planetary motion does remain the most important governing factor for predicting activities on earth, though, even given this coordinate change. We just mix them together into ~epicyclic variables like "day"/"night" and "summer"/"autumn"/"winter"/"spring" rather than talking explicitly about the sun, the earth, and their relative positions.

tailcalled on Three Notions of "Power"

Can you explain what this coordination would look like?

khafra on Three Notions of "Power"

Your definition seems like it fits the Emperor of China example--by reputation, they had few competitors for being the most willing and able to pessimize another agent's utility function; e.g. 9 Familial Exterminations.
And that seems to be a key to understanding this type of power, because if they were able to pessimize all other agents' utility functions, that would just be an evil mirror of bargaining power. Being able to choose a sharply limited number of unfortunate agents, and punish them severely pour encourager les autres, seems like it might just stop working when the average agent is smart enough to implicitly coordinate around a shared understanding of payoff matrices.
So I think I might have arrived back to the "all dominance hierarchies will be populated solely by scheming viziers" conclusion.

fread2281 on Alexander Gietelink Oldenziel's Shortform

I guess this is sorta about your 3, which I disbelieve (though algorithms for tasks other than learning are also important). Currently, Bayesian inference vs SGD is a question of how much data you have (where SGD wins except for very little data). For small to medium amounts of data, even without AGI, I expect SGD to lose eventually due to better inference algorithms. For many problems I have the intuition that it's ~always possible to improve performance with more complicated algorithms (eg sat solvers). All that together makes me expect there to be inference algorithms that scale to very large amounts of data (that aren't going to be doing full Bayesian inference but rather some complicated approximation).

bolverk on I got dysentery so you don’t have to

Sequence 1 length:3 

Sequence 2 length:6 

Alignment length: 6 

Identity: 3/6 (50.00%) 

Similarity: 3/6 (50.00%) 

Gaps: 3/6 (50.00%)

---AGC
   |||
AGCAGC

Like this. Difference between lengths is considered non-matching.

https://en.vectorbuilder.com/tool/sequence-alignment.html

inquilinekea on What TMS is like

https://pmc.ncbi.nlm.nih.gov/articles/PMC8122027/

raemon on JargonBot Beta Test

I've reverted the part that automatically generates jargon for drafts until we've figured out a better overall solution.

yair-halberstadt on Trading Candy

Counterpoint: when I was about 12, I was too old to collect candy at my Synagogue on Simchat Torah, so I would beg a single candy from someone, then trade it up (Dutch book style) with naive younger kids until I had a decent stash. I was particularly pleased whenever my traded up stash included the original sweet.