LessWrong 2.0 Reader


The case for stopping AI safety research
catubc (cat-1) · 2024-05-23T15:55:18.713Z · comments (38)
The Geometry of Feelings and Nonsense in Large Language Models
7vik (satvik-golechha) · 2024-09-27T17:49:27.420Z · comments (10)
Pseudonymity and Accusations
jefftk (jkaufman) · 2023-12-21T19:20:19.944Z · comments (20)
AI #43: Functional Discoveries
Zvi · 2023-12-21T15:50:04.442Z · comments (26)
Can we build a better Public Doublecrux?
Raemon · 2024-05-11T19:21:53.326Z · comments (6)
BatchTopK: A Simple Improvement for TopK-SAEs
Bart Bussmann (Stuckwork) · 2024-07-20T02:20:51.848Z · comments (0)
[link] OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns
Seth Herd · 2023-11-20T14:20:33.539Z · comments (28)
Parental Writing Selection Bias
jefftk (jkaufman) · 2024-10-13T14:00:03.225Z · comments (3)
Gradient Descent on the Human Brain
Jozdien · 2024-04-01T22:39:24.862Z · comments (5)
Anthropical Paradoxes are Paradoxes of Probability Theory
Ape in the coat · 2023-12-06T08:16:26.846Z · comments (18)
The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-11-07T16:12:20.031Z · comments (20)
On Lex Fridman’s Second Podcast with Altman
Zvi · 2024-03-25T12:20:08.780Z · comments (10)
[link] Slightly More Than You Wanted To Know: Pregnancy Length Effects
JustisMills · 2024-10-21T01:26:02.030Z · comments (4)
Model evals for dangerous capabilities
Zach Stein-Perlman · 2024-09-23T11:00:00.866Z · comments (9)
[link] Prices are Bounties
Maxwell Tabarrok (maxwell-tabarrok) · 2024-10-12T14:51:40.689Z · comments (12)
Rewilding the Gut VS the Autoimmune Epidemic
GGD · 2024-08-16T18:00:46.239Z · comments (0)
Does literacy remove your ability to be a bard as good as Homer?
Adrià Garriga-alonso (rhaps0dy) · 2024-01-18T03:43:14.994Z · comments (19)
Two LessWrong speed friending experiments
mikko (morrel) · 2024-06-15T10:52:26.081Z · comments (3)
Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.
Andrew_Critch · 2024-09-11T04:41:24.872Z · comments (7)
Provably Safe AI: Worldview and Projects
bgold · 2024-08-09T23:21:02.763Z · comments (43)
The Shutdown Problem: Incomplete Preferences as a Solution
EJT (ElliottThornley) · 2024-02-23T16:01:16.378Z · comments (22)
Llama Llama-3-405B?
Zvi · 2024-07-24T19:40:07.565Z · comments (9)
Will 2024 be very hot? Should we be worried?
A.H. (AlfredHarwood) · 2023-12-29T11:22:50.200Z · comments (12)
The Assumed Intent Bias
silentbob · 2023-11-05T16:28:03.282Z · comments (13)
On OpenAI’s Preparedness Framework
Zvi · 2023-12-21T14:00:05.144Z · comments (4)
[link] Bed Time Quests & Dinner Games for 3-5 year olds
Gunnar_Zarncke · 2024-06-22T07:53:38.989Z · comments (0)
[link] Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Gunnar_Zarncke · 2024-05-16T13:09:39.265Z · comments (20)
Polysemantic Attention Head in a 4-Layer Transformer
Jett Janiak (jett) · 2023-11-09T16:16:35.132Z · comments (0)
[link] Electrostatic Airships?
DaemonicSigil · 2024-10-27T04:32:34.852Z · comments (11)
Applying refusal-vector ablation to a Llama 3 70B agent
Simon Lermen (dalasnoin) · 2024-05-11T00:08:08.117Z · comments (14)
D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues Evaluation & Ruleset
aphyer · 2024-06-17T21:29:08.778Z · comments (11)
[link] how birds sense magnetic fields
bhauth · 2024-06-27T18:59:35.075Z · comments (4)
How to Give in to Threats (without incentivizing them)
Mikhail Samin (mikhail-samin) · 2024-09-12T15:55:50.384Z · comments (25)
[link] Anthropic's updated Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-10-15T16:46:48.727Z · comments (3)
[link] The Good Balsamic Vinegar
jenn (pixx) · 2024-01-26T19:30:57.435Z · comments (4)
Cooperating with aliens and AGIs: An ECL explainer
Chi Nguyen · 2024-02-24T22:58:47.345Z · comments (8)
[link] A starter guide for evals
Marius Hobbhahn (marius-hobbhahn) · 2024-01-08T18:24:23.913Z · comments (2)
[link] Can AI Outpredict Humans? Results From Metaculus's Q3 AI Forecasting Benchmark
ChristianWilliams · 2024-10-10T18:58:46.041Z · comments (2)
On Overhangs and Technological Change
Roko · 2023-11-05T22:58:51.306Z · comments (19)
Apply to the Conceptual Boundaries Workshop for AI Safety
Chipmonk · 2023-11-27T21:04:59.037Z · comments (0)
Consent across power differentials
Ramana Kumar (ramana-kumar) · 2024-07-09T11:42:03.177Z · comments (12)
Unlearning via RMU is mostly shallow
Andy Arditi (andy-arditi) · 2024-07-23T16:07:52.223Z · comments (3)
Applications of Chaos: Saying No (with Hastings Greer)
Elizabeth (pktechgirl) · 2024-09-21T16:30:07.415Z · comments (16)
Scenario Forecasting Workshop: Materials and Learnings
elifland · 2024-03-08T02:30:46.517Z · comments (3)
Paper in Science: Managing extreme AI risks amid rapid progress
JanB (JanBrauner) · 2024-05-23T08:40:40.678Z · comments (2)
GPT-2030 and Catastrophic Drives: Four Vignettes
jsteinhardt · 2023-11-10T07:30:06.480Z · comments (5)
[link] on the dollar-yen exchange rate
bhauth · 2024-04-07T04:49:53.920Z · comments (21)
Goal-Completeness is like Turing-Completeness for AGI
Liron · 2023-12-19T18:12:29.947Z · comments (26)
The Shortest Path Between Scylla and Charybdis
Thane Ruthenis · 2023-12-18T20:08:34.995Z · comments (8)
AI #82: The Governor Ponders
Zvi · 2024-09-19T13:30:04.863Z · comments (8)