LessWrong 2.0 Reader

[link] Was Partisanship Good for the Environmental Movement?
Jeffrey Heninger (jeffrey-heninger) · 2024-05-15T17:30:54.796Z · comments (0)

[link] Truth is Universal: Robust Detection of Lies in LLMs
Lennart Buerger · 2024-07-19T14:07:25.162Z · comments (3)

[question] Could there be "natural impact regularization" or "impact regularization by default"?
tailcalled · 2023-12-01T22:01:46.062Z · answers+comments (6)

Foresight Institute: 2023 Progress & 2024 Plans for funding beneficial technology development
Allison Duettmann (allison-duettmann) · 2023-11-22T22:09:16.956Z · comments (1)

Defense Against The Dark Arts: An Introduction
Lyrongolem (david-xiao) · 2023-12-25T06:36:06.278Z · comments (36)

Paper Summary: The Koha Code - A Biological Theory of Memory
jakej (jake-jenks) · 2023-12-30T22:37:13.865Z · comments (2)

[question] Would you have a baby in 2024?
martinkunev · 2023-12-25T01:52:04.358Z · answers+comments (76)

[link] Secret US natsec project with intel revealed
Nathan Helm-Burger (nathan-helm-burger) · 2024-05-25T04:22:11.624Z · comments (0)

[link] Robert Caro And Mechanistic Models In Biography
adamShimi · 2024-07-14T10:56:42.763Z · comments (5)

AI Safety University Organizing: Early Takeaways from Thirteen Groups
agucova · 2024-10-02T15:14:00.137Z · comments (0)

[link] Tokyo AI Safety 2025: Call For Papers
Blaine (blaine-rogers) · 2024-10-21T08:43:38.467Z · comments (0)

Improving Model-Written Evals for AI Safety Benchmarking
Sunishchal Dev (sunishchal-dev) · 2024-10-15T18:25:08.179Z · comments (0)

Winning isn't enough
JesseClifton · 2024-11-05T11:37:39.486Z · comments (14)

[link] overengineered air filter shelving
bhauth · 2024-11-08T22:04:39.987Z · comments (2)

[question] How should vegans think about Methionine needs?
ChristianKl · 2024-11-10T09:28:47.655Z · answers+comments (1)

[link] "25 Lessons from 25 Years of Marriage" by honorary rationalist Ferrett Steinmetz
CronoDAS · 2024-10-02T22:42:30.509Z · comments (2)

A Basic Economics-Style Model of AI Existential Risk
Rubi J. Hudson (Rubi) · 2024-06-24T20:26:09.744Z · comments (3)

An evaluation of Helen Toner’s interview on the TED AI Show
PeterH · 2024-06-06T17:39:40.800Z · comments (2)

Even if we lose, we win
Morphism (pi-rogers) · 2024-01-15T02:15:43.447Z · comments (17)

UDT1.01: Local Affineness and Influence Measures (2/10)
Diffractor · 2024-03-31T07:35:52.831Z · comments (0)

D&D.Sci Hypersphere Analysis Part 2: Nonlinear Effects & Interactions
aphyer · 2024-01-14T19:59:37.911Z · comments (0)

Scientific Method
Andrij “Androniq” Ghorbunov (andrij-androniq-ghorbunov) · 2024-02-18T21:06:45.228Z · comments (4)

[link] Scenario planning for AI x-risk
Corin Katzke (corin-katzke) · 2024-02-10T00:14:11.934Z · comments (12)

I didn't think I'd take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!
mako yass (MakoYass) · 2024-08-02T22:35:21.136Z · comments (2)

[link] Altruism and Vitalism Aren't Fellow Travelers
Arjun Panickssery (arjun-panickssery) · 2024-08-09T02:01:11.361Z · comments (2)

[question] What percent of the sun would a Dyson Sphere cover?
Raemon · 2024-07-03T17:27:50.826Z · answers+comments (26)

Language and Capabilities: Testing LLM Mathematical Abilities Across Languages
Ethan Edwards · 2024-04-04T13:18:54.909Z · comments (2)

How Congressional Offices Process Constituent Communication
Tristan Williams (tristan-williams) · 2024-07-02T12:38:41.472Z · comments (0)

A brief review of China's AI industry and regulations
Elliot Mckernon (elliot) · 2024-03-14T12:19:00.775Z · comments (0)

Building Trust in Strategic Settings
StrivingForLegibility · 2023-12-28T22:12:24.024Z · comments (0)

Technology path dependence and evaluating expertise
bhauth · 2024-01-05T19:21:23.302Z · comments (2)

[link] Clickbait Soapboxing
DaystarEld · 2024-03-13T14:09:29.890Z · comments (15)

[link] The absence of self-rejection is self-acceptance
Chipmonk · 2023-12-21T21:54:52.116Z · comments (1)

Seeking Mechanism Designer for Research into Internalizing Catastrophic Externalities
c.trout (ctrout) · 2024-09-11T15:09:48.019Z · comments (2)

My Alignment "Plan": Avoid Strong Optimisation and Align Economy
VojtaKovarik · 2024-01-31T17:03:34.778Z · comments (9)

Weeping Agents
pleiotroth · 2024-06-06T12:18:54.978Z · comments (2)

[link] Extinction Risks from AI: Invisible to Science?
VojtaKovarik · 2024-02-21T18:07:33.986Z · comments (7)

[link] Review of Alignment Plan Critiques- December AI-Plans Critique-a-Thon Results
Iknownothing · 2024-01-15T19:37:07.984Z · comments (0)

[link] my theory of the industrial revolution
bhauth · 2024-02-28T21:07:55.274Z · comments (7)

Would you benefit from, or object to, a page with LW users' reacts?
Raemon · 2024-08-20T16:35:47.568Z · comments (6)

[link] [Talk transcript] What “structure” is and why it matters
Alex_Altair · 2024-07-25T15:49:00.844Z · comments (0)

I Want XMP But I Know Why I Can't Have It
jefftk (jkaufman) · 2024-01-19T15:30:07.492Z · comments (0)

[question] Why do so many think deception in AI is important?
Prometheus · 2024-01-13T08:14:58.671Z · answers+comments (12)

Eliminating Cookie Banners is Hard
jefftk (jkaufman) · 2024-01-13T03:00:04.843Z · comments (15)

AI #77: A Few Upgrades
Zvi · 2024-08-20T00:20:09.717Z · comments (3)

[link] 11 diceware words is enough
DanielFilan · 2024-02-15T00:13:43.420Z · comments (6)

New social credit formalizations
KatjaGrace · 2024-03-11T19:00:06.201Z · comments (3)

Analogy Bank for AI Safety
utilistrutil · 2024-01-29T02:35:13.746Z · comments (0)

GPT-3.5 judges can supervise GPT-4o debaters in capability asymmetric debates
Charlie George (charlie-george) · 2024-08-27T20:44:08.683Z · comments (7)

[LDSL#2] Latent variable models, network models, and linear diffusion of sparse lognormals
tailcalled · 2024-08-09T19:57:56.122Z · comments (2)


Recent comments

yams on yams's Shortform

I (and maybe you) have historically underrated the density of people with religious backgrounds in secular hubs. Most of these people don't 'think differently', in a structural sense, from their forebears; they just don't believe in that God anymore.

The hallmark here is a kind of naive enlightenment approach that ignores ~200 years of intellectual history (and a great many thinkers from before that period, including canonical philosophers they might claim to love/respect/understand). This type of thing.

They're no less tribal or dogmatic, or more critical, than the place they came from. They just vote the other way and can maybe talk about one or two levels of abstraction beyond the stereotype they identify against (although they can't really think about those levels).

You should still be nice to them, and honest with them, but you should understand what you're getting into.

The mere biographical detail of having a religious background or being religious isn't a strong mark against someone's thinking on other topics, but it is a sign you may be talking to a member of a certain meta-intellectual culture, and need to modulate your style. I have definitely had valuable conversations with people that firmly belong in this category, and would not categorically discourage engagement. Just don't be so surprised when the usual jutsu falls flat!

mako-yass on nikola's Shortform

Timelines are a result of a person's intuitions about a technical milestone being reached in the future; it is super obviously impossible for us to have a consensus about that kind of thing.

Talking only synchronises beliefs if you have enough time to share all of the relevant information; with technical matters, you usually don't.

alok-singh on Derivative AT a discontinuity

added some open circles

alok-singh on Derivative AT a discontinuity

I adjusted H to use Heaviside's 1/2 convention, good catch.

yams on nikola's Shortform

I agree with this in the world where people are being epistemically rigorous/honest with themselves about their timelines and where there's a real consensus view on them. I've observed that it's pretty rare for people to make decisions truly grounded in their timelines, as opposed to only nominally, and I think there's a lot of social signaling going on when (especially younger) people state their timelines.

I appreciate that more experienced people are willing to give advice within a particular frame ("if timelines were x", "if China did y", "if Anthropic did z", "If I went back to school", etc etc), even if they don't agree with the frame itself. I rely on more experienced people in my life to offer advice of this form ("I'm not sure I agree with your destination, but admit there's uncertainty, and love and respect you enough to advise you on your path").

Of course they should voice their disagreement with the frame (and I agree this should happen more for timelines in particular), but to gate direct counsel on urgent, object-level decisions behind the resolution of background disagreements is broadly unhelpful.

When someone says "My timelines are x, what should I do?", I actually hear like three claims:

Timelines are x
I believe timelines are x
I am interested in behaving as though timelines are x

Evaluation of the first claim is complicated and other people do a better job of it than I do so let's focus on the others.

"I believe timelines are x" is a pretty easy roll to disbelieve. Under relatively rigorous questioning, nearly everyone (particularly everyone 'career-advice-seeking age') will either say they are deferring (meaning they could just as easily defer to someone else tomorrow), or admit that it's a gut feel, especially for their ~90 percent year, and especially for more and more capable systems (this is more true of ASI than weak AGI, for instance, although those terms are underspecified). Still others will furnish 0 reasoning transparency and thus reveal their motivations to be principally social (possibly a problem unique to the bay, although online e/acc culture has a similar Thing).

"I am interested in behaving as though timelines are x" is an even easier roll to disbelieve. Very few people act on their convictions in sweeping, life-changing ways without concomitant benefits (money, status, power, community), including people within AIS (sorry friends).

With these uncertainties, piled on top of the usual uncertainties surrounding timelines, I'm not sure I'd want anyone to act so nobly as to refuse advice to someone with different timelines.

If Alice is a senior AIS professional who gives advice to undergrads at parties in Berkeley (bless her!), how would her behavior change under your recommendation? It sounds like maybe she would stop fostering a diverse garden of AIS saplings and instead become the awful meme of someone who just wants to fight about a highly speculative topic. Seems like a significant value loss.

Their timelines will change some other day; everyone's will. In the meantime, being equipped to talk to people with a wide range of safety-concerned views (especially for more senior, or just Older, people) seems useful.

harder to converge

Converge for what purpose? It feels like the marketplace of ideas is doing an ok job of fostering a broad portfolio of perspectives. If anything, we are too convergent and, as a consequence, somewhat myopic internally. Leopold mind-wormed a bunch of people until Tegmark spoke up (and that only somewhat helped). Few thought governance was a good idea until pretty recently (~3 years ago), and it would be going better if those interested in the angle weren't shouted down so emphatically to begin with.

If individual actors need to cross some confidence threshold in order to act, but the reasonable confidence interval is in fact very wide, I'd rather have a bunch of actors with different timelines, which roughly sum to the shape of the reasonable thing*, than have everyone working on the same overconfident assumption that later comes back to bite us (when we've made mistakes in the past, this is often why).

*Which is, by the way, closer to flat than most people's individual timelines

dagon on Why Bayesians should two-box in a one-shot

note: this was 7 years ago and I've refined my understanding of CDT and the Newcomb problem since.

My current understanding of CDT is that it does effectively assign a confidence of 1 to the decision not being causally upstream of Omega's action, and that is the whole of the problem. It's "solved" by just moving Omega's action downstream (by cheating and doing a rapid switch). It's ... illustrated? ... by the transparent version, where a CDT agent just sees the second box as empty before it even realizes it's decided. It's also "solved" by acausal decision theories, because they move the decision earlier in time to get the jump on Omega.

For non-rigorous DTs (like human intuition, and what I personally would want to do), there's a lot of evidence in the setup that Omega is going to turn out to be correct, and one-boxing is an easy call. If the setup is somewhat different (say, neither Omega nor anyone else makes any claims about predictions, just says "sometimes both boxes have money, sometimes only one"), then it's a pretty straightforward EV calculation based on kind of informal probability assignments.

But it does require not using strict CDT, which rejects the idea that the choice has backward-causality.
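
A rough sketch of the informal EV calculation described above, for the case where Omega's track record is treated as evidence about the box contents. The payoffs ($1,000,000 and $1,000) are the conventional Newcomb values and are an assumption here, as are the function names and the choice to condition the box contents on the agent's decision (which is exactly the step strict CDT refuses to take).

```python
def expected_values(p_correct, big=1_000_000, small=1_000):
    """Informal EV of each choice, given an assumed probability that Omega predicts correctly.

    big   -- assumed payoff in the opaque box if Omega predicted one-boxing
    small -- assumed payoff that is always in the transparent box
    """
    # One-boxing: the opaque box is full only if Omega predicted correctly.
    one_box = p_correct * big
    # Two-boxing: the small payoff is guaranteed; the opaque box is full
    # only if Omega predicted incorrectly.
    two_box = small + (1 - p_correct) * big
    return one_box, two_box


def break_even_accuracy(big=1_000_000, small=1_000):
    """Accuracy at which both choices have equal EV under the assumptions above."""
    # Solve p * big == small + (1 - p) * big for p.
    return 0.5 + small / (2 * big)


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.9, 0.99):
        one, two = expected_values(p)
        better = "one-box" if one > two else "two-box"
        print(f"p={p}: one-box EV={one:,.0f}, two-box EV={two:,.0f} -> {better}")
    print("break-even accuracy:", break_even_accuracy())
```

Under these assumed payoffs the break-even accuracy sits only slightly above one half, which is one way to read "one-boxing is an easy call" once there is any real evidence that Omega tends to be right.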

unexpectedvalues on Seven lessons I didn't learn from election day

It's a little hard to know what you mean by that. Do you mean something like: given the information known at the time, but allowing myself the hindsight of noticing facts about that information that I may have missed, what should I have thought the probability was?

If so, I think my answer isn't too different from what I believed before the election (essentially 50/50). Though I welcome takes to the contrary.

joe-rogero on Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy

I'm assuming the Cosmic Flipper is offering, not a doubling of the universe's current value, but a doubling of its current expected value (including whatever you think the future is worth) plus a little more. If it's just doubling current niceness or something, then yeah, that's not nearly enough.

joe-rogero on Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy

That is an interesting reframing of this wager!

joe-rogero on Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy

Alas, I am not familiar with Lara Buchak's arguments, and the high-level summary I can get from Googling them isn't sufficient to tell me how it's supposed to capture something utility maximizing can't. Was there a specific argument you had in mind?