LessWrong 2.0 Reader

next page (older posts) →

[link] The rise of AI in cybercrime
BobyResearcher · 2023-07-30T20:19:34.867Z · comments (1)
SSA vs. SIA: how future population may provide evidence for or against the foundations of political liberalism
[deleted] · 2023-07-30T20:18:59.444Z · comments (10)
[link] Rationalization Maximizes Expected Value
Kevin Dorst · 2023-07-30T20:11:26.377Z · comments (10)
Apollo Neuro Results
Elizabeth (pktechgirl) · 2023-07-30T18:40:05.213Z · comments (16)
Hilbert's Triumph, Church and Turing's failure, and what it means (Post #2)
Noosphere89 (sharmake-farah) · 2023-07-30T14:33:25.180Z · comments (16)
[question] Specific Arguments against open source LLMs?
Iknownothing · 2023-07-30T14:27:13.116Z · answers+comments (2)
Socialism in large organizations
Adam Zerner (adamzerner) · 2023-07-30T07:25:57.736Z · comments (16)
How to make real-money prediction markets on arbitrary topics (Outdated)
yutaka · 2023-07-30T02:11:47.050Z · comments (13)
[question] Does decidability of a theory imply completeness of the theory?
Noosphere89 (sharmake-farah) · 2023-07-29T23:53:08.166Z · answers+comments (12)
[question] If I showed the EQ-SQ theory's findings to be due to measurement bias, would anyone change their minds about it?
tailcalled · 2023-07-29T19:38:13.285Z · answers+comments (13)
Self-driving car bets
paulfchristiano · 2023-07-29T18:10:01.112Z · comments (41)
[link] The Parable of the Dagger - The Animation
Writer · 2023-07-29T14:03:12.023Z · comments (6)
Are Guitars Obsolete?
jefftk (jkaufman) · 2023-07-29T13:20:01.482Z · comments (8)
NAMSI: A promising approach to alignment
[deleted] · 2023-07-29T07:03:51.930Z · comments (6)
Understanding and Aligning a Human-like Inductive Bias with Cognitive Science: a Review of Related Literature
Claire Short (claire-short) · 2023-07-29T06:10:38.353Z · comments (0)
[link] Universal and Transferable Adversarial Attacks on Aligned Language Models [paper link]
Sodium · 2023-07-29T03:21:15.477Z · comments (0)
[link] Why You Should Never Update Your Beliefs
Arjun Panickssery (arjun-panickssery) · 2023-07-29T00:27:01.899Z · comments (17)
Thoughts about the Mechanistic Interpretability Challenge #2 (EIS VII #2)
RGRGRG · 2023-07-28T20:44:36.868Z · comments (5)
Because of LayerNorm, Directions in GPT-2 MLP Layers are Monosemantic
ojorgensen · 2023-07-28T19:43:12.235Z · comments (3)
When can we trust model evaluations?
evhub · 2023-07-28T19:42:21.799Z · comments (9)
Yes, It's Subjective, But Why All The Crabs?
johnswentworth · 2023-07-28T19:35:36.741Z · comments (15)
Semaglutide and Muscle
5hout · 2023-07-28T18:36:22.036Z · comments (14)
Double Crux in a Box
Screwtape · 2023-07-28T17:55:08.794Z · comments (3)
[link] AI Safety 101 : Introduction to Vision Interpretability
jeanne_ (jeanne_s) · 2023-07-28T17:32:11.545Z · comments (0)
Visible loss landscape basins don't correspond to distinct algorithms
Mikhail Samin (mikhail-samin) · 2023-07-28T16:19:05.279Z · comments (13)
[link] Progress links digest, 2023-07-28: The decadent opulence of modern capitalism
jasoncrawford · 2023-07-28T14:36:26.382Z · comments (3)
AI Awareness through Interaction with Blatantly Alien Models
VojtaKovarik · 2023-07-28T08:41:07.776Z · comments (5)
You don't get to have cool flaws
Neil (neil-warren) · 2023-07-28T05:37:31.414Z · comments (16)
Reducing sycophancy and improving honesty via activation steering
Nina Rimsky (NinaR) · 2023-07-28T02:46:23.122Z · comments (14)
Mech Interp Puzzle 2: Word2Vec Style Embeddings
Neel Nanda (neel-nanda-1) · 2023-07-28T00:50:00.297Z · comments (4)
[link] ETFE windows
bhauth · 2023-07-28T00:46:55.556Z · comments (4)
A Short Memo on AI Interpretability Rainbows
scasper · 2023-07-27T23:05:50.196Z · comments (0)
Pulling the Rope Sideways: Empirical Test Results
Daniel Kokotajlo (daniel-kokotajlo) · 2023-07-27T22:18:01.072Z · comments (18)
[link] A $10k retroactive grant for VaccinateCA
Austin Chen (austin-chen) · 2023-07-27T18:14:44.305Z · comments (0)
Preference Aggregation as Bayesian Inference
beren · 2023-07-27T17:59:36.270Z · comments (1)
AI #22: Into the Weeds
Zvi · 2023-07-27T17:40:02.184Z · comments (8)
[link] SSA rejects anthropic shadow, too
jessicata (jessica.liu.taylor) · 2023-07-27T17:25:17.728Z · comments (38)
[question] What are examples of someone doing a lot of work to find the best of something?
chanamessinger (cmessinger) · 2023-07-27T15:58:02.114Z · answers+comments (15)
[link] AI-Plans.com 10-day Critique-a-Thon
Iknownothing · 2023-07-27T11:44:01.660Z · comments (2)
Privacy in a Digital World
Faustify (nikolay-blagoev) · 2023-07-27T10:46:38.887Z · comments (0)
[link] Cultivating a state of mind where new ideas are born
Henrik Karlsson (henrik-karlsson) · 2023-07-27T09:16:42.566Z · comments (18)
[link] Partial Transcript of Recent Senate Hearing Discussing AI X-Risk
Daniel_Eth · 2023-07-27T09:16:01.168Z · comments (0)
AXRP Episode 24 - Superalignment with Jan Leike
DanielFilan · 2023-07-27T04:00:02.106Z · comments (3)
[question] Have you ever considered taking the 'Turing Test' yourself?
Super AGI (super-agi) · 2023-07-27T03:48:30.407Z · answers+comments (6)
AXRP Episode 23 - Mechanistic Anomaly Detection with Mark Xu
DanielFilan · 2023-07-27T01:50:02.808Z · comments (0)
GPT-4 can catch subtle cross-language translation mistakes
Michael Tontchev (michael-tontchev-1) · 2023-07-27T01:39:23.492Z · comments (1)
Social Balance through Embracing Social Credit
dhruvv · 2023-07-26T20:07:02.953Z · comments (9)
[link] Why no Roman Industrial Revolution?
jasoncrawford · 2023-07-26T19:34:41.682Z · comments (30)
Why you can't treat decidability and complexity as a constant (Post #1)
Noosphere89 (sharmake-farah) · 2023-07-26T17:54:33.294Z · comments (13)
A response to the Richards et al.'s "The Illusion of AI's Existential Risk"
Harrison Fell (harrison-fell) · 2023-07-26T17:34:20.409Z · comments (0)