LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[question] Does reducing the amount of RL for a given capability level make AI safer?
Chris_Leong · 2024-05-05T17:04:01.799Z · answers+comments (22)

[link] How bad is chlorinated water?
bhauth · 2023-12-13T18:00:12.640Z · comments (18)

Job Listing: Managing Editor / Writer
Gretta Duleba (gretta-duleba) · 2024-02-21T23:41:26.818Z · comments (2)

2023 LessWrong Community Census, Request for Comments
Screwtape · 2023-11-01T16:32:19.102Z · comments (37)

Incidental polysemanticity
Victor Lecomte (victor-lecomte) · 2023-11-15T04:00:00.000Z · comments (7)

[question] Where is the Town Square?
Gretta Duleba (gretta-duleba) · 2024-02-13T03:53:18.205Z · answers+comments (8)

New intro textbook on AIXI
Alex_Altair · 2024-05-11T18:18:50.945Z · comments (8)

Childhood and Education Roundup #4
Zvi · 2024-01-30T13:50:06.033Z · comments (10)

Why does generalization work?
Martín Soto (martinsq) · 2024-02-20T17:51:10.424Z · comments (16)

The Next ChatGPT Moment: AI Avatars
kolmplex (luke-man) · 2024-01-05T20:14:10.074Z · comments (10)

[link] Why Georgism Lost Its Popularity
Zero Contradictions · 2024-07-20T15:08:41.469Z · comments (50)

Ambiguity in Prediction Market Resolution is Still Harmful
aphyer · 2024-07-31T20:32:40.217Z · comments (17)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

Understanding Positional Features in Layer 0 SAEs
bilalchughtai (beelal) · 2024-07-29T09:36:40.701Z · comments (0)

The need for multi-agent experiments
Martín Soto (martinsq) · 2024-08-01T17:14:16.590Z · comments (3)

My intellectual journey to (dis)solve the hard problem of consciousness
Charbel-Raphaël (charbel-raphael-segerie) · 2024-04-06T09:32:41.612Z · comments (41)

Sci-Fi books micro-reviews
Yair Halberstadt (yair-halberstadt) · 2024-06-24T09:49:28.523Z · comments (27)

Concrete empirical research projects in mechanistic anomaly detection
Erik Jenner (ejenner) · 2024-04-03T23:07:21.502Z · comments (3)

The Case for Predictive Models
Rubi J. Hudson (Rubi) · 2024-04-03T18:22:20.243Z · comments (7)

New Executive Team & Board — PIBBSS
Nora_Ammann · 2024-07-01T19:30:45.261Z · comments (1)

[link] Non-alignment project ideas for making transformative AI go well
Lukas Finnveden (Lanrian) · 2024-01-04T07:23:13.658Z · comments (1)

[link] rapid growth
Chipmonk · 2024-06-05T00:43:51.501Z · comments (0)

Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders
Evan Anders (evan-anders) · 2024-02-27T02:43:22.446Z · comments (16)

Housing Roundup #7
Zvi · 2024-03-04T15:00:08.192Z · comments (1)

[link] Soviet comedy film recommendations
Nina Panickssery (NinaR) · 2024-06-09T23:40:58.536Z · comments (11)

Protocol evaluations: good analogies vs control
Fabien Roger (Fabien) · 2024-02-19T18:00:09.794Z · comments (10)

Trust as a bottleneck to growing teams quickly
benkuhn · 2024-07-13T18:00:04.579Z · comments (3)

When fine-tuning fails to elicit GPT-3.5's chess abilities
Theodore Chapman · 2024-06-14T18:50:52.855Z · comments (3)

Take SCIFs, it’s dangerous to go alone
latterframe · 2024-05-01T08:02:38.067Z · comments (1)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

Wholesomeness and Effective Altruism
owencb · 2024-02-28T20:28:22.175Z · comments (3)

[question] What rationality failure modes are there?
Ulisse Mini (ulisse-mini) · 2024-01-19T09:12:57.924Z · answers+comments (11)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

Case studies on social-welfare-based standards in various industries
HoldenKarnofsky · 2024-06-20T13:33:44.780Z · comments (0)

Estimating efficiency improvements in LLM pre-training
Daan · 2024-01-19T19:32:45.124Z · comments (3)

Koan: divining alien datastructures from RAM activations
TsviBT · 2024-04-05T18:04:57.280Z · comments (10)

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers
Jeffrey Heninger (jeffrey-heninger) · 2024-07-09T16:50:05.776Z · comments (2)

[link] Surgery Works Well Without The FDA
Maxwell Tabarrok (maxwell-tabarrok) · 2024-01-26T13:31:29.968Z · comments (28)

Evidential Cooperation in Large Worlds: Potential Objections & FAQ
Chi Nguyen · 2024-02-28T18:58:25.688Z · comments (5)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed
johnswentworth · 2024-08-22T19:19:28.940Z · comments (4)

NYT is suing OpenAI&Microsoft for alleged copyright infringement; some quick thoughts
Mikhail Samin (mikhail-samin) · 2023-12-27T18:44:33.976Z · comments (17)

[link] An Interactive Shapley Value Explainer
James Stephen Brown (james-brown) · 2024-09-28T05:01:21.169Z · comments (9)

Navigating emotions in an uncertain & confusing world
Akash (akash-wasil) · 2023-11-20T18:16:09.492Z · comments (1)

[link] Project ideas: Epistemics
Lukas Finnveden (Lanrian) · 2024-01-05T23:41:23.721Z · comments (4)

MonoPoly Restricted Trust
ymeskhout · 2024-01-02T23:02:55.066Z · comments (37)

Taking responsibility and partial derivatives
Ruby · 2023-12-31T04:33:51.419Z · comments (1)

[link] AI Girlfriends Won't Matter Much
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-23T15:58:30.308Z · comments (22)

Deep and obvious points in the gap between your thoughts and your pictures of thought
KatjaGrace · 2024-02-23T07:30:07.461Z · comments (6)

[Valence series] 5. “Valence Disorders” in Mental Health & Personality
Steven Byrnes (steve2152) · 2023-12-18T15:26:29.970Z · comments (12)

Upgrading the AI Safety Community
trevor (TrevorWiesinger) · 2023-12-16T15:34:26.600Z · comments (9)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

denkenberger on Is the Power Grid Sustainable?

It would be helpful to see a calculation with your rates, the installed cost of batteries, cost of the space taken up, losses in the batteries, any cost of maintenance, lifetime of batteries, and cost (or benefit) of disposal.

nathan-helm-burger on johnswentworth's Shortform

I like it. I do work that it, and The Narrow Path, are both missing how hard it will be to govern and restrict AI.

aprilsr on Alexander Gietelink Oldenziel's Shortform

I think it's pretty good to keep it in mind that heliocentrism is literally speaking just a change in what coordinate system you use, but it is legitimately a much more convenient coordinate system.

raemon on JargonBot Beta Test

Mmm, that does seem reasonable.

johnswentworth on Trading Candy

To be clear, I don't really think of myself as libertarian these days, though I guess it'd probably look that way if you just gave me a political alignment quiz.

To answer your question: I'm two years older than my brother, who is two years older than my sister.

zy on Trading Candy

I think that is probably not a good reason to be libertarian in my opinion? Could you also share maybe how much older were your than you siblings? If you are not that far apart, you and your siblings came from the same starting line, distributing is not going to happen in real life economically nor socially even if not libertarian (in real life, where we need equity is when the starting line is not the same and is not able to be changed by choice. A more similar analogy might be some kids are born with large ears, and large ears are favored by the society, and the large eared kids always get more candy). If you are ages apart with you being a lot older, it may make some limited sense to for your parents to re-distribute.

quila on JargonBot Beta Test

Let us know what you think!

the grey text feels disruptive to normal reading flow but idk why green link text wouldn't also be, maybe i'm just not used to it. e.g., in this post's "Curating technical posts" where 'Curating' is grey, my mind sees "<Curating | distinct term> technical posts" instead of [normal meaning inference not overfocused on individual words]

Is this useful, as a reader?

if the authors make sure they agree with all the definitions they allow into the glossary, yes. author-written definitions would be even more useful because how things are worded can implicitly convey things like, the underlying intuition, ontology, or related views they may be using wording to rule in or out.

Whenever an author with 100+ karma saves a draft of a post, our database queries a language model to:

i would prefer this be optional too, for drafts which are meant to be private (e.g. shared with a few other users, e.g. may contain possible capability-infohazards), where the author doesn't trust LM companies

johnswentworth on Ryan Kidd's Shortform

Y'know @Ryan [LW · GW], MATS should try to hire the PIBBSS folks to help with recruiting. IMO they tend to have the strongest participants of the programs on this chart which I'm familiar with (though high variance).

johnswentworth on Ryan Kidd's Shortform

... WOW that is not an efficient market.

johnswentworth on Trading Candy

My two siblings and I always used to trade candy after Halloween and Easter. We'd each lay out our candy on table, like a little booth, and then haggle a lot.

My memories are fuzzy, but apparently the way this most often went was that I tended to prioritize quantity moreso than my siblings, wanting to make sure that I had a stock of good candy which would last a while. So naturally, a few days later, my siblings had consumed their tastiest treats and complained that I had all the best candy. My mother then stepped in and redistributed the candy.

And that was how I became a libertarian at a very young age.

Only years later did we find out that my mother would also steal the best-looking candies after we went to bed and try to get us to blame each other, which... is altogether too on-the-nose for this particular analogy.