LessWrong 2.0 Reader


[link] MIRI's September 2024 newsletter
Harlan · 2024-09-16T18:15:40.785Z · comments (0)

We ran an AI safety conference in Tokyo. It went really well. Come next year!
Blaine (blaine-rogers) · 2024-07-17T06:55:39.620Z · comments (1)

Startup Roundup #2
Zvi · 2024-08-06T13:30:06.554Z · comments (0)

AI #80: Never Have I Ever
Zvi · 2024-09-10T17:50:08.074Z · comments (20)

[link] Open Sourcing Metaculus
ChristianWilliams · 2024-07-02T22:30:01.339Z · comments (0)

In defense of technological unemployment as the main AI concern
tailcalled · 2024-08-27T17:58:01.992Z · comments (36)

Simplifying Corrigibility – Subagent Corrigibility Is Not Anti-Natural
Rubi J. Hudson (Rubi) · 2024-07-16T22:44:17.128Z · comments (27)

Humanity isn't remotely longtermist, so arguments for AGI x-risk should focus on the near term
Seth Herd · 2024-08-12T18:10:56.543Z · comments (10)

[link] Why Georgism Lost Its Popularity
Zero Contradictions · 2024-07-20T15:08:41.469Z · comments (50)

New Executive Team & Board — PIBBSS
Nora_Ammann · 2024-07-01T19:30:45.261Z · comments (1)

[question] "Deception Genre" What Books are like Project Lawful?
Double · 2024-08-28T17:19:52.172Z · answers+comments (20)

The need for multi-agent experiments
Martín Soto (martinsq) · 2024-08-01T17:14:16.590Z · comments (3)

Economics Roundup #3
Zvi · 2024-09-10T13:50:06.955Z · comments (5)

Understanding Positional Features in Layer 0 SAEs
bilalchughtai (beelal) · 2024-07-29T09:36:40.701Z · comments (0)

Sci-Fi books micro-reviews
Yair Halberstadt (yair-halberstadt) · 2024-06-24T09:49:28.523Z · comments (27)

AI #82: The Governor Ponders
Zvi · 2024-09-19T13:30:04.863Z · comments (8)

Conflating value alignment and intent alignment is causing confusion
Seth Herd · 2024-09-05T16:39:51.967Z · comments (17)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers
Jeffrey Heninger (jeffrey-heninger) · 2024-07-09T16:50:05.776Z · comments (2)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed
johnswentworth · 2024-08-22T19:19:28.940Z · comments (4)

Formalizing the Informal (event invite)
abramdemski · 2024-09-10T19:22:53.564Z · comments (0)

Trust as a bottleneck to growing teams quickly
benkuhn · 2024-07-13T18:00:04.579Z · comments (3)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]
Ruby · 2024-09-19T01:35:02.999Z · comments (11)

How ARENA course material gets made
CallumMcDougall (TheMcDouglas) · 2024-07-02T18:04:00.209Z · comments (2)

Superintelligent AI is possible in the 2020s
HunterJay · 2024-08-13T06:03:26.990Z · comments (3)

Interoperable High Level Structures: Early Thoughts on Adjectives
johnswentworth · 2024-08-22T21:12:38.223Z · comments (1)

(Approximately) Deterministic Natural Latents
johnswentworth · 2024-07-19T23:02:12.306Z · comments (0)

[link] [Paper] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

[link] Progress Conference 2024: Toward Abundant Futures
jasoncrawford · 2024-06-26T15:39:45.267Z · comments (2)

Sherlockian Abduction Master List
Cole Wyeth (Amyr) · 2024-07-11T20:27:00.000Z · comments (59)

[link] you should probably eat oatmeal sometimes
bhauth · 2024-08-25T14:50:37.570Z · comments (29)

[link] What's important in "AI for epistemics"?
Lukas Finnveden (Lanrian) · 2024-08-24T01:27:06.771Z · comments (0)

[link] Things I learned talking to the new breed of scientific institution
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-29T14:00:14.844Z · comments (6)

Applying Force to the Wrong End of a Causal Chain
silentbob · 2024-06-22T18:06:32.364Z · comments (0)

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow (jessica-cooper) · 2024-08-03T12:07:46.302Z · comments (2)

Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders
Gytis Daujotas (gytis-daujotas) · 2024-08-01T21:08:38.800Z · comments (6)

Californians, tell your reps to vote yes on SB 1047!
Holly_Elmore · 2024-08-12T19:50:09.817Z · comments (24)

Individually incentivized safe Pareto improvements in open-source bargaining
Nicolas Macé (NicolasMace) · 2024-07-17T18:26:43.619Z · comments (2)

Medical Roundup #3
Zvi · 2024-07-09T13:10:06.862Z · comments (4)

[link] Adverse Selection by Life-Saving Charities
vaishnav92 · 2024-08-14T20:46:23.662Z · comments (14)

You're a Space Wizard, Luke
lsusr · 2024-08-18T05:35:39.238Z · comments (6)

Stitching SAEs of different sizes
Bart Bussmann (Stuckwork) · 2024-07-13T17:19:20.506Z · comments (12)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap
johnswentworth · 2024-09-19T22:22:05.307Z · comments (9)

Whiteboard Pen Magazines are Useful
Johannes C. Mayer (johannes-c-mayer) · 2024-07-12T17:15:33.200Z · comments (6)

How to Give in to Threats (without incentivizing them)
Mikhail Samin (mikhail-samin) · 2024-09-12T15:55:50.384Z · comments (19)

[question] What progress have we made on automated auditing?
LawrenceC (LawChan) · 2024-07-06T01:49:43.714Z · answers+comments (1)

Recent comments

sdm on AI #82: The Governor Ponders

For months, those who want no regulations of any kind placed upon themselves have hallucinated and fabricated information about the bill’s contents and intentionally created an internet echo chamber, in a deliberate campaign to create the impression of widespread opposition to SB 1047, and that SB 1047 would harm California’s AI industry.

There is another significant angle to add here. Namely: Many of the people in this internet echo chamber or behind this campaign are part of the network of neoreactionaries, MAGA supporters, and tech elites who want to be unaccountable that you've positioned yourself as a substantial counterpoint to.

Obviously this invokes a culture-war fight, which has its downsides, but it's not just rhetoric: the charge that many bill opponents are basing their decisions on an ideology that Newsom opposes and sees as dangerous for the country is true.

A16z and many of the other most dishonest opponents of the bill are part of a Trump-supporting network with lots of close ties to neoreactionary thought, which opposes SB 1047 for precisely the same reason that they want Trump and Republicans to win: to remove restraints on their own power in the short-to-medium term, and more broadly because they see it as a step towards making our society into one where wealthy oligarchs are given favorable treatment and can get away with anything.

It also serves as a counterpoint against the defense and competition angle, at least if it's presented by a16z (this argument doesn't work for e.g. OpenAI, but there are many other good counterarguments). The claims they make about the bill harming competitiveness, e.g. for defense and security against China and other adversaries, ring hollow when most of them are against supporting Ukraine or against NATO, making it clear they don't generally care about the US maintaining its global leadership.

I think this might compel Newsom, who's positioned himself as an anti-MAGA figure.

eggsyntax on eggsyntax's Shortform

"simulations or training situations" doesn't necessarily sound like fun.

Seems like some would be and some wouldn't. Although those are the 'medium significance' ones; the largest category is the 188 that used 'low significance' tasks. Still doesn't map exactly to 'fun', but I expect those ones are at least very low stress.

Generally, comparing kids vs adults could be interesting, although it is difficult to say what would be an equivalent mental effort. Specifically I am curious about the impact of school. Oh, we should also compare homeschooled kids vs kids in school, to separate the effects of school and age.

That would definitely be interesting; it wouldn't surprise me if at least a couple of the studies in the meta-analysis did that.

tailcalled on tailcalled's Shortform

Thesis: in addition to probabilities, forecasts should include entropies (how many different conditions are included in the forecast) and temperatures (how intense is the outcome addressed by the marginal constraint in this forecast, i.e. the big-if-true factor).

I say "in addition to" rather than "instead of" because you can't compute probabilities just from these two numbers. If we assume a Gibbs distribution, there's the free parameter of energy: ln(P) = S - U/T. But I'm not sure whether this energy parameter has any sensible meaning with more general events that aren't some thermal chemical equilibrium type thing.
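A minimal sketch of why the energy term is a free parameter, assuming the relation ln(P) = S - U/T is taken at face value. The function names and the numeric inputs are hypothetical and chosen only for illustration; they are not from the comment.

```python
import math


def log_probability(entropy: float, energy: float, temperature: float) -> float:
    """Return ln(P) = S - U/T for the given entropy S, energy U, temperature T."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return entropy - energy / temperature


def probability(entropy: float, energy: float, temperature: float) -> float:
    """Exponentiate ln(P) to recover P itself."""
    return math.exp(log_probability(entropy, energy, temperature))


# Hypothetical inputs: the same S and T paired with two different energies U.
# The resulting probabilities differ, so S and T alone do not pin down P.
p_low_energy = probability(entropy=1.0, energy=2.0, temperature=1.0)
p_high_energy = probability(entropy=1.0, energy=4.0, temperature=1.0)
```

With entropy and temperature held fixed, changing the energy changes the probability, which is the sense in which the two extra numbers supplement the probability rather than replace it.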

Follow-up thesis: a major problem with rationalist forecasting wisdom is that it focuses on gaining accuracy by increasing S (e.g. addressing conjunction fallacy/base-rates/antipredictions). Meanwhile, the signed interestingness of a forecast is something like P ln(T/T_baseline) or P U. I guess implicitly the assumption is the event is already preselected for high temperature, but then surprising predictions get selected for high entropy, and this leads to resolution difficulty as to what "counts".

gb on What you know when you know nothing

We can only make that inference about conjunctions if we know that the statements are independent. Since (by assumption) we don’t know anything about said world, we don’t know that either, so the conclusion does not follow.
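The point about independence can be made concrete with a small sketch. Without any independence assumption, the probability of a conjunction is only constrained to lie between the standard Fréchet bounds; the product rule applies only under independence. The function names and the example values below are hypothetical.

```python
from typing import Tuple


def conjunction_bounds(p_a: float, p_b: float) -> Tuple[float, float]:
    """Bounds on P(A and B) when nothing is known about dependence."""
    lower = max(0.0, p_a + p_b - 1.0)  # A and B overlap as little as possible
    upper = min(p_a, p_b)              # one event is contained in the other
    return lower, upper


def conjunction_if_independent(p_a: float, p_b: float) -> float:
    """P(A and B) under the additional assumption that A and B are independent."""
    return p_a * p_b


# Hypothetical example: two statements, each assigned probability 0.5.
bounds = conjunction_bounds(0.5, 0.5)
independent_value = conjunction_if_independent(0.5, 0.5)
```

Here the independent value is one point inside the allowed range, so asserting it requires knowing something about how the two statements relate.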

johannes-c-mayer on We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap

reward is the evidence from which we learn about our values

A sadist might feel good each time they hurt somebody. I am pretty sure it is possible for a sadist to exist who does not endorse hurting people, meaning they feel good if they hurt people, but they avoid it nonetheless.

So to what extent is hurting people a value? It's like the sadist's brain tries to tell them that they ought to want to hurt people, but they don't want to. Intuitively the "they don't want to" seems to be the value.

mateusz-baginski on Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.

It's better but still not quite. When you play on two levels, sometimes the best strategy involves a pair of (level 1 and 2) substrategies that are seemingly opposites of each other. I don't think there's anything hypocritical about that.

Similarly, hedging is not hypocrisy.

cousin_it on Slave Morality: A place for every man and every man in his place

I think Nietzsche would agree that "slave morality" originated with Jesus. The main new idea that Jesus brought as a moral philosopher was compassion, feeling for the other person. It's pretty hard to find in earlier sources; for example, the heroes of the Iliad hurt weaker people without a second thought.

To me it feels obvious that the idea of compassion needs to exist, and needs to have force. Because otherwise we'd have a human society operating by the laws of the natural world, and if you look at what animals do to each other, there's no limit to how bad things can get.

Can compassion also become a tool of power and abuse? Sure. But let's not go back to a world without compassion, please.

xpym on A Nonconstructive Existence Proof of Aligned Superintelligence

This isn’t really a problem with alignment

I'd rather put it that resolving that problem is a prerequisite for the notion of "alignment problem" to be meaningful in the first place. It's not technically a contradiction to have an "aligned" superintelligence that does nothing, but clearly nobody would in practice be satisfied with that.

quetzal_rainbow on What's the Deal with Logical Uncertainty?

Logical uncertainty was brought up in the first place because of decision theory: to give a crisp formal expression to the intuitive "I cooperate with you conditional on you cooperating with me". Here "you cooperating with me" is the result of analyzing a probability distribution over the possible algorithms that control your opponent's actions. You can't actually run these algorithms due to computational constraints, and you want to do all this reasoning in non-arbitrary ways.

devrandom on Is "superhuman" AI forecasting BS? Some experiments on the "539" bot from the Centre for AI Safety

There seem to be substantial problems with low probability events, coherent predictions over time, short term events, probabilities adding up to more than 100%, etc

A probabilistic oracle being inconsistent is completely beside the point. If I have a probabilistic oracle that has high accuracy but is sometimes inconsistent, I can just post-process the predictions to force them into a consistent format. For example, I can normalize the probabilities so that they sum to 100%.

The economic value is in the overall accuracy. Being consistent is a cosmetic consideration.
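A minimal sketch of the normalization step mentioned above, assuming the forecasts cover a set of mutually exclusive and exhaustive outcomes. The function name and the example forecasts are hypothetical and are not output from any real forecasting bot.

```python
from typing import Dict


def normalize(probs: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw probabilities so that they sum to 1, keeping their ratios."""
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    return {outcome: p / total for outcome, p in probs.items()}


# Hypothetical raw forecasts for three mutually exclusive outcomes.
# They sum to 1.2, so they are inconsistent as stated.
raw = {"A": 0.5, "B": 0.4, "C": 0.3}
normalized = normalize(raw)
```

Rescaling fixes the sum while preserving the relative odds between outcomes. It only addresses this one kind of inconsistency and does not by itself change how accurate the underlying forecasts are.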