LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[link] Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)
Kaj_Sotala · 2024-01-23T14:05:40.986Z · comments (2)

AI #54: Clauding Along
Zvi · 2024-03-07T16:00:05.066Z · comments (11)

[link] AlphaGeometry: An Olympiad-level AI system for geometry
alyssavance · 2024-01-17T17:17:30.913Z · comments (9)

Some open-source dictionaries and dictionary learning infrastructure
Sam Marks (samuel-marks) · 2023-12-05T06:05:21.903Z · comments (7)

A starting point for making sense of task structure (in machine learning)
Kaarel (kh) · 2024-02-24T01:51:49.227Z · comments (2)

[link] How people stopped dying from diarrhea so much (& other life-saving decisions)
Writer · 2024-03-16T16:00:47.830Z · comments (0)

Quick thoughts on the implications of multi-agent views of mind on AI takeover
Kaj_Sotala · 2023-12-11T06:34:06.395Z · comments (14)

[link] Paper: Tell, Don't Show- Declarative facts influence how LLMs generalize
Owain_Evans · 2023-12-19T19:14:26.423Z · comments (4)

[link] Towards Evaluating AI Systems for Moral Status Using Self-Reports
Ethan Perez (ethan-perez) · 2023-11-16T20:18:51.730Z · comments (3)

We ran an AI safety conference in Tokyo. It went really well. Come next year!
Blaine (blaine-rogers) · 2024-07-17T06:55:39.620Z · comments (1)

AI #80: Never Have I Ever
Zvi · 2024-09-10T17:50:08.074Z · comments (20)

Work with me on agent foundations: independent fellowship
Alex_Altair · 2024-09-21T13:59:16.706Z · comments (5)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap
johnswentworth · 2024-09-19T22:22:05.307Z · comments (47)

[link] Rational Animations' intro to mechanistic interpretability
Writer · 2024-06-14T16:10:57.015Z · comments (1)

AI #72: Denying the Future
Zvi · 2024-07-11T15:00:05.865Z · comments (8)

[link] Slightly More Than You Wanted To Know: Pregnancy Length Effects
JustisMills · 2024-10-21T01:26:02.030Z · comments (4)

[link] AI Rights for Human Safety
Simon Goldstein (simon-goldstein) · 2024-08-01T23:01:07.252Z · comments (6)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

Startup Roundup #2
Zvi · 2024-08-06T13:30:06.554Z · comments (0)

Things Solenoid Narrates
Solenoid_Entity · 2024-04-12T23:57:16.169Z · comments (2)

[link] Book review: Deep Utopia
PeterMcCluskey · 2024-04-23T19:55:50.417Z · comments (14)

Dating Roundup #3: Third Time’s the Charm
Zvi · 2024-05-08T13:30:03.232Z · comments (26)

New intro textbook on AIXI
Alex_Altair · 2024-05-11T18:18:50.945Z · comments (5)

Monthly Roundup #18: May 2024
Zvi · 2024-05-13T12:30:04.863Z · comments (10)

[link] Book review: Everything Is Predictable
PeterMcCluskey · 2024-05-27T03:33:53.857Z · comments (0)

Announcing Atlas Computing
miyazono · 2024-04-11T15:56:31.241Z · comments (4)

ProLU: A Nonlinearity for Sparse Autoencoders
Glen Taggart · 2024-04-23T14:09:21.592Z · comments (4)

D&D.Sci Long War: Defender of Data-mocracy Evaluation & Ruleset
aphyer · 2024-05-14T03:35:10.586Z · comments (3)

Apply to LASR Labs: a London-based technical AI safety research programme
Erin Robertson · 2024-04-09T17:34:06.847Z · comments (1)

[link] Against Student Debt Cancellation From All Sides of the Political Compass
Maxwell Tabarrok (maxwell-tabarrok) · 2024-05-13T14:55:57.525Z · comments (16)

Back to Basics: Truth is Unitary
lsusr · 2024-03-29T21:10:33.399Z · comments (13)

[link] LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery (arjun-panickssery) · 2024-04-17T21:09:12.007Z · comments (1)

An Introduction to AI Sandbagging
Teun van der Weij (teun-van-der-weij) · 2024-04-26T13:40:00.126Z · comments (8)

[link] Level up your spreadsheeting
angelinahli · 2024-05-25T14:57:19.730Z · comments (11)

Higher-Order Forecasts
ozziegooen · 2024-05-22T21:49:42.802Z · comments (1)

AI #60: Oh the Humanity
Zvi · 2024-04-18T14:10:02.281Z · comments (7)

In defense of technological unemployment as the main AI concern
tailcalled · 2024-08-27T17:58:01.992Z · comments (36)

D&D.Sci Long War: Defender of Data-mocracy
aphyer · 2024-04-26T22:30:15.780Z · comments (20)

Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders
Gytis Daujotas (gytis-daujotas) · 2024-08-01T21:08:38.800Z · comments (6)

[question] "Deception Genre" What Books are like Project Lawful?
Double · 2024-08-28T17:19:52.172Z · answers+comments (20)

Start an Upper-Room UV Installation Company?
jefftk (jkaufman) · 2024-10-19T02:00:10.691Z · comments (9)

Simplifying Corrigibility – Subagent Corrigibility Is Not Anti-Natural
Rubi J. Hudson (Rubi) · 2024-07-16T22:44:17.128Z · comments (27)

Economics Roundup #3
Zvi · 2024-09-10T13:50:06.955Z · comments (9)

[link] Open Sourcing Metaculus
ChristianWilliams · 2024-07-02T22:30:01.339Z · comments (0)

Truthseeking, EA, Simulacra levels, and other stuff
Elizabeth (pktechgirl) · 2023-10-27T23:56:49.198Z · comments (12)

Userscript to always show LW comments in context vs at the top
Vlad Sitalo (harcisis) · 2023-11-21T17:53:30.418Z · comments (8)

On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche
Zack_M_Davis · 2024-01-09T23:12:20.349Z · comments (31)

What does davidad want from «boundaries»?
Chipmonk · 2024-02-06T17:45:42.348Z · comments (1)

AI #38: Let’s Make a Deal
Zvi · 2023-11-16T19:50:05.442Z · comments (2)

On Trust
johnswentworth · 2023-12-06T19:19:07.680Z · comments (26)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

sherrinford on Was the K-T event a Great Filter?

Does this question require that there is only one big filter per species?

1stuserhere on A Rocket–Interpretability Analogy

on the one hand, mechanistic understanding has historically underperformed as a research strategy,

Are you talking about ML or in general? What are you deriving this from?

gwern on davekasten's Shortform

Well, what's the alternative? I think it might be helpful to try to make the case for doing these things via some of the alternatives:

a peer-reviewed Nature paper which would be published 3 years from now, maybe, behind a paywall
a Tiktok video
a 5-minute excerpted interview on CNN
a WSJ or NYT op-ed
an Arxiv paper in the standard LaTeX template
a Twitter thread of 500 tweets (which can only be read by logged-in users)
a Medium post (which can't be read because it is written in a light gray font illegible to anyone over the age of 20. Also, it's paywalled 90% of the time.)
interpretive dance in front of the Lincoln Memorial
...

robert-cousineau on Lonely Dissent

http://www.overcoming-bias.com/2007/06/against_free_th.html.

This link should be: https://www.overcomingbias.com/p/against_free_thhtml (removing the hyphen will allow a successful redirect).

christiankl on Why I’m not a Bayesian

Most of the time, the data you gather about the world is that you have a bunch of facts about the world and probabilities about the individual data points and you would want as an outcome also probabilities over individual datapoints.

As far as my own background goes, I have not studied logic or the math behind the AI algorithm that David Chapman wrote. I did study bioinformatics in that that study we did talk about probabilities calculations that are done in bioinformatics, so I have some intuitions from that domain, so I take a bioinformatics example even if I don't know exactly how to productively apply predicate calculus to the example.

If you for example get input data from gene sequencing and billions of probabilities (a_1, a_2, ..., a_n) and want output data about whether or not individual genetic mutations exist (b_1, b_2, ..., b_m) and not just P(B) = P(b_1) * P(b_2) * ... * P(b_m).

If you have m = 100,000 in the case of possible genetic mutations, P(B) is a very small number with little robustness to error. A single bad b_x will propagate to make your total P(B) unreliable. You might have an application where getting a b_234, b_9538 and b _33889 wrong is an acceptable error because most of the values where good.

davekasten on davekasten's Shortform

It seems like the current meta is to write a big essay outlining your opinions about AI (see, e.g., Gladstone Report, Situational Awareness, various essays recently by Sam Altman and Dario Amodei, even the A Narrow Path report I co-authored).

Why do we think this is the case?
I can imagine at least 3 hypotheses:
1. Just path-dependence; someone did it, it went well, others imitated

2. Essays are High Status Serious Writing, and people want to obtain that trophy for their ideas

3. This is a return to the true original meaning of an essay, under Montaigne, that it's an attempt to write thinking down when it's still inchoate, in an effort to make it more comprehensible not only to others but also to oneself. And AGI/ASI is deeply uncertain, so the essay format is particularly suited for this.

What do you think?

habryka4 on A Defense of Peer Review

An exciting recent development is community peer review, also called “open peer review.” Under this system, preprints are uploaded to a server, wherein a pool of reviewers can look them over and decide which, if any, they would like to review. Articles that have made it out of this pool are then selected for publication. This differs from the “upload PDFs to the internet” ideas because it is more structured, results in a definitive outcome, and allows gatekeeping in terms of the composition of the pool of reviewers.

That... seems like a weird framing of what is going on? Community peer-review was the standard before anonymous and random peer review ended up being forced on the scientific institution, the way this article describes. Post-publication community peer review was the standard in most fields until the mid of the 20th century, and describing it as an exciting recent development feels like it's conceding the whole debate.

Yes, just do post-publication peer review. Let journals and authors curate which papers they think are good at the same time as everyone else gets to read them. That's what science did before various large government funding bodies demanded more objectivity in the process (with, as this article and other articles it links to, great harm to the process of science).

viliam on TurnTrout's shortform feed

Morality means sometimes not following the incentives.

I am not saying that people who always follow the incentives are immoral. Maybe they are just lucky and their incentives happen to be aligned with doing the right thing. Too much luck is suspicious though.

eggsyntax on Exploring SAE features in LLMs with definition trees and token lists

My main insight from all this is that we should be thinking in terms of taxonomisation of features. Some are very token-specific, others are more nuanced and context-specific (in a variety of ways). The challenge of finding maximally activating text samples might be very different from one category of features to another.

Joseph and Johnny did some interesting work on this in 'Understanding SAE Features with the Logit Lens' [LW · GW], taxonomizing features as partition features vs suppression features vs prediction features, and using summary statistics to distinguish them.

jonas-hallgren on Liquid vs Illiquid Careers

No sorry, I meant from the perspective of the person with less legible skills.