LessWrong 2.0 Reader

Linear encoding of character-level information in GPT-J token embeddings
mwatkins · 2023-11-10T22:19:14.654Z · comments (4)

[link] Inferring the model dimension of API-protected LLMs
Ege Erdil (ege-erdil) · 2024-03-18T06:19:25.974Z · comments (3)

(Maybe) A Bag of Heuristics is All There Is & A Bag of Heuristics is All You Need
Sodium · 2024-10-03T19:11:58.032Z · comments (17)

Augmenting Statistical Models with Natural Language Parameters
jsteinhardt · 2024-09-20T18:30:10.816Z · comments (0)

ARENA4.0 Capstone: Hyperparameter tuning for MELBO + replication on Llama-3.2-1b-Instruct
25Hour (aaron-kaufman) · 2024-10-05T11:30:11.953Z · comments (2)

[link] Book review: On the Edge
PeterMcCluskey · 2024-08-30T22:18:39.581Z · comments (0)

[question] If I have some money, whom should I donate it to in order to reduce expected P(doom) the most?
KvmanThinking (avery-liu) · 2024-10-03T11:31:19.974Z · answers+comments (34)

The Cognitive Bootcamp Agreement
Raemon · 2024-10-16T23:24:05.509Z · comments (0)

What AI companies should do: Some rough ideas
Zach Stein-Perlman · 2024-10-21T14:00:10.412Z · comments (10)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (26)

AI Safety Camp 10
Robert Kralisch (nonmali-1) · 2024-10-26T11:08:09.887Z · comments (4)

My disagreements with "AGI ruin: A List of Lethalities"
Noosphere89 (sharmake-farah) · 2024-09-15T17:22:18.367Z · comments (44)

[link] Information dark matter
Logan Kieller (logan-kieller) · 2024-10-01T15:05:41.159Z · comments (4)

More on the Apple Vision Pro
Zvi · 2024-02-13T17:40:05.388Z · comments (5)

Takeaways from a Mechanistic Interpretability project on “Forbidden Facts”
Tony Wang (tw) · 2023-12-15T11:05:23.256Z · comments (8)

Disentangling four motivations for acting in accordance with UDT
Julian Stastny · 2023-11-05T21:26:22.514Z · comments (3)

Rational Animations offers animation production and writing services!
Writer · 2024-03-15T17:26:07.976Z · comments (0)

[question] Do websites and apps actually generally get worse after updates, or is it just an effect of the fear of change?
lillybaeum · 2023-12-10T17:26:34.206Z · answers+comments (34)

We have promising alignment plans with low taxes
Seth Herd · 2023-11-10T18:51:38.604Z · comments (9)

Update #2 to "Dominant Assurance Contract Platform": EnsureDone
moyamo · 2023-11-28T18:02:50.367Z · comments (2)

Regrant up to $600,000 to AI safety projects with GiveWiki
Dawn Drescher (Telofy) · 2023-10-28T19:56:06.676Z · comments (1)

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5
VipulNaik · 2023-11-29T18:11:53.252Z · comments (16)

An illustrative model of backfire risks from pausing AI research
Maxime Riché (maxime-riche) · 2023-11-06T14:30:58.615Z · comments (3)

Effectively Handling Disagreements - Introducing a New Workshop
Camille Berger (Camille Berger) · 2024-04-15T16:33:50.339Z · comments (2)

LLMs can strategically deceive while doing gain-of-function research
Igor Ivanov (igor-ivanov) · 2024-01-24T15:45:08.795Z · comments (4)

Helpful examples to get a sense of modern automated manipulation
trevor (TrevorWiesinger) · 2023-11-12T20:49:57.422Z · comments (3)

Sparse autoencoders find composed features in small toy models
Evan Anders (evan-anders) · 2024-03-14T18:00:43.339Z · comments (12)

[question] Is AlphaGo actually a consequentialist utility maximizer?
faul_sname · 2023-12-07T12:41:05.132Z · answers+comments (8)

Love, Reverence, and Life
Elizabeth (pktechgirl) · 2023-12-12T21:49:04.061Z · comments (7)

[link] On Lies and Liars
Gabriel Alfour (gabriel-alfour-1) · 2023-11-17T17:13:03.726Z · comments (4)

The Consciousness Box
GradualImprovement · 2023-12-11T16:45:08.172Z · comments (22)

UDT1.01: Logical Inductors and Implicit Beliefs (5/10)
Diffractor · 2024-04-18T08:39:13.368Z · comments (2)

"Which chains-of-thought was that faster than?"
Emrik (Emrik North) · 2024-05-22T08:21:00.269Z · comments (4)

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols
Arjun Panickssery (arjun-panickssery) · 2024-01-15T21:21:03.962Z · comments (0)

Monthly Roundup #20: July 2024
Zvi · 2024-07-23T12:50:07.991Z · comments (9)

Important open problems in voting
Closed Limelike Curves · 2024-07-01T02:53:44.690Z · comments (1)

How good are LLMs at doing ML on an unknown dataset?
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-07-01T09:04:03.687Z · comments (4)

5. Moral Value for Sentient Animals? Alas, Not Yet
RogerDearnaley (roger-d-1) · 2023-12-27T06:42:09.130Z · comments (41)

Templates I made to run feedback rounds for Ethan Perez’s research fellows.
Henry Sleight (ResentHighly) · 2024-03-28T19:41:15.506Z · comments (0)

[link] Vacuum: Theory and Technologies
ethanmorse · 2024-01-21T17:23:49.257Z · comments (0)

[link] The Cancer Resolution?
PeterMcCluskey · 2024-07-24T00:25:17.322Z · comments (24)

AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them
Roman Leventov · 2023-12-27T14:51:37.713Z · comments (9)

[link] FTX expects to return all customer money; clawbacks may go away
Mikhail Samin (mikhail-samin) · 2024-02-14T03:43:13.218Z · comments (1)

[link] patent process problems
bhauth · 2024-07-14T21:12:04.953Z · comments (13)

Confusing the metric for the meaning: Perhaps correlated attributes are "natural"
NickyP (Nicky) · 2024-07-23T12:43:18.681Z · comments (3)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

One way violinists fail
Solenoid_Entity · 2024-05-29T04:08:17.675Z · comments (5)

AI Safety Strategies Landscape
Charbel-Raphaël (charbel-raphael-segerie) · 2024-05-09T17:33:45.853Z · comments (1)

2024 ACX Predictions: Blind/Buy/Sell/Hold
Zvi · 2024-01-09T19:30:06.388Z · comments (2)

Boston Solstice 2023 Retrospective
jefftk (jkaufman) · 2024-01-02T03:10:05.694Z · comments (0)

Recent comments

raemon on On Shifgrethor

“Advice can be violating” is the concept-handle I think I will take away.

cubefox on What are some good ways to form opinions on controversial subjects in the current and upcoming era?

Some issues that seem to be controversial are really taboo, or arise due to an underlying taboo. For this case I have two general recommendations here.

Related to this: some opinions may be expressed often because of virtue signalling, e.g. because the opposite is taboo, or for other reasons. Hearing such opinions doesn't provide significant testimonial evidence for their truth, since people hold them not because of evidence they encountered but because holding them feels virtuous. That said, it is not easy to recognize why a particular opinion is being expressed, or whether it is motivated by signalling at all.

sarahconstantin on sarahconstantin's Shortform

links 10/28/2024: https://roamresearch.com/#/app/srcpublic/page/10-28-2024

Vincent deVita, chemotherapy pioneer, reflecting on how cancer research has changed (and become more bureaucratic) since the 1960s:
Michael Levin has his own team (of ~20) at Tufts working on morphogenetics: https://allencenter.tufts.edu/
- with a $10M founding grant from the Allen Foundation, which I expect will not be enough to complete this research program. https://alleninstitute.org/news/the-paul-g-allen-frontiers-group-announces-allen-discovery-center-at-tufts-university/

abramdemski on Why I’m not a Bayesian

One thing I don't understand / don't agree with here is the move from propositions to models. It seems to me that models can be (and usually are) understood in terms of propositions.

For example, Solomonoff understands models as computer programs which generate predictions. However, computer programs are constructed out of bits, which can be understood as propositions. The bits are not very meaningful in isolation; the claim "program-bit number 37 is a 1" has almost no meaning in the absence of further information about the other program bits. However, this isn't much of an issue for the formalism.

Similarly, I expect that any attempt to formally model "models" can be broken down into propositions. EG, if someone claimed that humans understand the world in terms of systems of differential equations, this would still be well-facilitated by a concept of propositions (ie, the equations).

It seems to me like a convincing abandonment of propositions would have to be quite radical, abandoning the idea of formalism entirely. This is because you'd have to explain why your way of thinking about models is not amenable to a mathematical treatment (since math is commonly understood in terms of propositions).

So (a) I'm not convinced that thinking in terms of propositions makes it difficult to think in terms of models; (b) it seems to me that refusing to think in terms of propositions would make it difficult to think in terms of models.

donald-hobson on On Shifgrethor

You have given various examples of advice being unwanted/unhelpful. But there are also plenty of examples of it being wanted/helpful. Including lots of cases where the person doesn't know they need it.

Why do you think advice is rarer than it should be?

donald-hobson on Your memory eventually drives confidence in each hypothesis to 1 or 0

But if I only remember the most significant bit, I am going to treat it more like 25%/75% as opposed to 0/1

sharmake-farah on Labs should be explicit about why they are building AGI

I broadly suspect that this is the actual answer:

Maybe controlling a real human body is an incredibly compute-intensive task

More specifically, the reason is that latency requirements are on the order of milliseconds. That is a hard constraint, so you need more compute specifically for motor processing.

abramdemski on Why I’m not a Bayesian

"X is false" has to be modeled as something that is value 1 if and only if X is value 0, but continuously decreases in value as X continuously increases in value. The simplest formula is value(X is false) = 1-value(X). However, we can make "sharper" formulas which diminish in value more rapidly as X increases in value. Hartry Field constructs a hierarchy of such predicates which he calls "definitely false", "definitely definitely false", etc.

Proof systems for the logic should have the property that sentences are derivable only when they have value 1; so "X is false" or "X is definitely false" etc all share the property that they're only derivable when X has value zero.
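A minimal sketch of the idea above, under stated assumptions: the "sharper" formula max(0, 1 - 2^n * v) is one choice that has the properties described (value 1 exactly when X has value 0, continuous, falling faster as n grows). It is used here only for illustration and is not claimed to be Field's exact construction. The names `value_false` and `derivable` are hypothetical.

```python
def value_false(v: float, n: int = 0) -> float:
    """Value of 'X is false' (n = 0) or 'X is definitely^n false' (n > 0).

    v is the value of X in [0, 1]. The sharpening rule used here,
    max(0, 1 - 2**n * v), is an assumption made for illustration.
    """
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must lie in [0, 1]")
    return max(0.0, 1.0 - (2 ** n) * v)


def derivable(value: float) -> bool:
    """A sentence is derivable only when it has value exactly 1."""
    return value == 1.0


if __name__ == "__main__":
    for n in range(3):
        # Each level is 1 only at v = 0 and drops faster as n grows.
        print(n, [value_false(v, n) for v in (0.0, 0.25, 0.5, 1.0)])
```

Under this assumed rule, every level of the hierarchy is derivable only when X has value 0, which matches the property stated for the proof system.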

minusgix on johnswentworth's Shortform

Finally, the speed at which you communicate vibing means you're communicating almost purely from System 1, expressing your actual felt beliefs. It makes deception both of yourself and others much harder. It's much more likely to reveal your true colors. This allows it to act as a values screening mechanism as well.

I'm personally skeptical of this. I've found I'm far more likely to lie than I'd endorse when vibing: saying "sure, I'd be happy to join you on X event" when it is clear with some thought that I'd end up disliking it, or exaggerating stories because it fits with the vibe.
I view System 1 as less concerned with truth here. It is the one that is more likely to produce a fake argument in response to a suggested problem, and more likely to play social games regardless of whether they make sense.

nathan-helm-burger on Electrostatic Airships?

What about a hot air blimp with a quilted membrane filled with aerogel? The super light, super insulating aerogel combined with the large volume-to-surface ratio would make it pretty efficient to keep hot.

https://en.m.wikipedia.org/wiki/Thermal_airship

With aerogel insulation, the hot air plus steam idea seems quite plausible. Claude s3.6 says:

With modern aerogel insulation (U-value ~0.015 W/m²K):

For 10m radius: Heat loss = 0.015 × 1,257 × (100-15) = 1,602 W ≈ 5,500 BTU/hr

For 20m radius: Heat loss = 0.015 × 5,027 × (100-15) = 6,409 W ≈ 22,000 BTU/hr

Converting to fuel consumption (using propane as example):

Propane contains ~91,500 BTU/gallon
Assuming 80% heating efficiency:

10m radius: ~0.08 gallons/hour
20m radius: ~0.30 gallons/hour

The efficiency improves dramatically with size due to the cubic/square relationship. Each doubling of radius:

Increases volume (and lift) by 8×
Increases surface area (and heat loss) by 4×
Improves fuel efficiency per kg of lift by ~2×
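The arithmetic above can be checked with a short sketch. It assumes a spherical envelope (surface area 4πr²), which is what the quoted areas of 1,257 m² and 5,027 m² imply, and a conversion of about 3.412 BTU/hr per watt. The function names are hypothetical, and the other constants are the ones quoted in the comment.

```python
import math

U_VALUE = 0.015               # W/(m^2*K), aerogel figure quoted above
T_INSIDE_C = 100.0            # interior temperature from the comment
T_OUTSIDE_C = 15.0            # ambient temperature from the comment
BTU_HR_PER_WATT = 3.412       # standard conversion (approximate)
PROPANE_BTU_PER_GAL = 91_500  # figure quoted above
BURNER_EFFICIENCY = 0.80      # assumption quoted above


def sphere_area(radius_m: float) -> float:
    """Surface area of a sphere in m^2 (assumed envelope shape)."""
    return 4.0 * math.pi * radius_m ** 2


def sphere_volume(radius_m: float) -> float:
    """Volume of a sphere in m^3; lift scales with this."""
    return 4.0 / 3.0 * math.pi * radius_m ** 3


def heat_loss_watts(radius_m: float) -> float:
    """Steady-state conductive loss: U * A * delta T."""
    return U_VALUE * sphere_area(radius_m) * (T_INSIDE_C - T_OUTSIDE_C)


def propane_gal_per_hour(radius_m: float) -> float:
    """Propane needed to replace the heat lost through the envelope."""
    btu_per_hour = heat_loss_watts(radius_m) * BTU_HR_PER_WATT
    return btu_per_hour / (PROPANE_BTU_PER_GAL * BURNER_EFFICIENCY)


if __name__ == "__main__":
    for r in (10.0, 20.0):
        print(f"r={r:.0f} m: {heat_loss_watts(r):.0f} W, "
              f"{propane_gal_per_hour(r):.3f} gal/h")
```

With these inputs the sketch reproduces the quoted heat losses. The unrounded fuel figure for the 10 m case comes out near 0.075 gallons/hour, which the comment rounds to 0.08. Doubling the radius multiplies volume by 8 and heat loss by 4, so fuel per unit of lift halves.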