LessWrong 2.0 Reader


Interpreting autonomous driving agents with attention based architecture
Manav Dahra (manav-dahra) · 2025-02-01T23:20:27.162Z · comments (0)

Locating and Editing Knowledge in LMs
Dhananjay Ashok (dhananjay-ashok) · 2025-01-24T22:53:40.559Z · comments (0)

[question] Why isn't AI containment the primary AI safety strategy?
OKlogic · 2025-02-05T03:54:58.171Z · answers+comments (3)

[link] Ideas for CoT Models: A Geometric Perspective on Latent Space Reasoning
Rohan Ganapavarapu (rohan-ganapavarapu) · 2025-01-24T19:01:47.339Z · comments (0)

[link] Request for proposals: improving capability evaluations
cb · 2025-02-07T18:51:34.926Z · comments (0)

Poll on AI opinions.
Niclas Kupper (niclas-kupper) · 2025-02-23T22:39:09.027Z · comments (1)

AI alignment for mental health supports
hiki_t · 2025-02-24T04:21:42.379Z · comments (1)

[link] Language Models and World Models, a Philosophy
kyjohnso · 2025-02-03T02:55:36.577Z · comments (0)

Dayton, Ohio, HPMOR 10 year Anniversary meetup
Lunawarrior · 2025-02-24T12:55:59.484Z · comments (0)

Part 1: Enhancing Inner Alignment in CLIP Vision Transformers: Mitigating Reification Bias with SAEs and Grad ECLIP
Gilber A. Corrales (mysticdeepai) · 2025-02-03T19:30:52.505Z · comments (0)

Introducing International AI Governance Alliance (IAIGA)
jamesnorris · 2025-02-05T16:02:29.226Z · comments (0)

Nationwide Action Workshop: Contact Congress about AI safety!
Felix De Simone (BobusChilc) · 2025-02-24T19:36:09.084Z · comments (0)

[question] Programming Language Early Funding?
J Thomas Moros (J_Thomas_Moros) · 2025-02-16T17:34:06.058Z · answers+comments (5)

Gettier cases, Rigid Designators, and Referential Opacity
Antigone (luke-st-clair) · 2025-01-28T18:46:10.180Z · comments (0)

Quantifying the Qualitative: Towards a Bayesian Approach to Personal Insight
Pruthvi Kumar (pruthvi-kumar) · 2025-02-15T19:50:42.550Z · comments (0)

Preference for uncertainty and impact overestimation bias in altruistic systems.
Luck (luck-1) · 2025-02-15T12:27:05.474Z · comments (0)

Positive Directions
G Wood (geoffrey-wood) · 2025-02-11T00:00:11.426Z · comments (0)

Static Place AI Makes AGI Redundant: Multiversal AI Alignment & Rational Utopia
ank · 2025-02-13T22:35:28.300Z · comments (2)

[link] Baumol effect vs Jevons paradox
Hzn · 2025-02-10T08:28:05.982Z · comments (0)

To know or not to know
arisAlexis (arisalexis) · 2025-01-27T13:17:33.672Z · comments (3)

Recursive Cognitive Refinement (RCR): A Self-Correcting Approach for LLM Hallucinations
mxTheo · 2025-02-22T21:32:50.832Z · comments (0)

An Alternate History of the Future, 2025-2040
Mr Beastly (mr-beastly) · 2025-02-24T05:53:25.521Z · comments (0)

the dumbest theory of everything
lostinwilliamsburg · 2025-02-13T07:57:38.842Z · comments (0)

The Newbie's Guide to Navigating AI Futures
keithjmenezes · 2025-02-19T20:37:06.272Z · comments (0)

the devil's ontology
lostinwilliamsburg · 2025-02-07T14:18:52.516Z · comments (14)

Places of Loving Grace [Story]
ank · 2025-02-18T23:49:18.580Z · comments (0)

[link] LLMs can teach themselves to better predict the future
Ben Turtel (ben-turtel) · 2025-02-13T01:01:12.175Z · comments (1)

[link] Humans are Just Self Aware Intelligent Biological Machines
asksathvik · 2025-02-21T01:03:59.950Z · comments (3)

[link] Sea Change
Charlie Sanders (charlie-sanders) · 2025-02-18T06:03:06.961Z · comments (2)

Are we the Wolves now? Human Eugenics under AI Control
Brit (james-spencer) · 2025-01-30T08:31:34.423Z · comments (1)

CyberEconomy. The Limits to Growth
Timur Sadekov (timur-sadekov) · 2025-02-16T21:02:34.040Z · comments (0)

[question] Implication of Uncomputable Problems
Nathan1123 · 2025-01-30T16:48:38.222Z · answers+comments (3)

[link] Biology, Ideology and Violence
Zero Contradictions · 2025-02-06T11:26:02.845Z · comments (5)

AI and Non-Existence.
Eleven · 2025-01-25T19:36:22.624Z · comments (9)

Stopping unaligned LLMs is easy!
Yair Halberstadt (yair-halberstadt) · 2025-02-03T15:38:27.083Z · comments (11)

How To Prevent a Dystopia
ank · 2025-01-29T14:16:09.862Z · comments (4)

Paranoia, Cognitive Biases, and Catastrophic Thought Patterns.
Spiritus Dei (spiritus-dei) · 2025-02-14T00:13:56.300Z · comments (1)

Chinese room AI to survive the inescapable end of compute governance
rotatingpaguro · 2025-02-02T02:42:03.627Z · comments (0)

[link] Several Arguments Against the Mathematical Universe Hypothesis
Vittu Perkele · 2025-02-19T22:13:59.425Z · comments (6)

Preserving Epistemic Novelty in AI: Experiments, Insights, and the Case for Decentralized Collective Intelligence
Andy E Williams (andy-e-williams) · 2025-02-08T10:25:27.891Z · comments (8)

Gettier Cases [repost]
Antigone (luke-st-clair) · 2025-02-03T18:12:22.253Z · comments (4)

[link] We Fell For It
Nicholas / Heather Kross (NicholasKross) · 2025-02-05T03:07:43.175Z · comments (9)

Zizian comparisons / connections in the open source & Linux communities
pocock · 2025-02-24T19:55:08.172Z · comments (0)

[link] Against Unlimited Genius for Baby-Killers
ggggg · 2025-02-19T20:33:27.188Z · comments (1)

A critique of Soares "4 background claims"
YanLyutnev (YanLutnev) · 2025-01-27T20:27:51.026Z · comments (0)

Political Idolatry
Arturo Macias (arturo-macias) · 2025-02-10T15:26:30.686Z · comments (7)

All pigeons are ugly!
Eris (anton-zheltoukhov) · 2025-01-28T15:18:25.507Z · comments (2)

The Goodness of Morning
YanLyutnev (YanLutnev) · 2025-01-27T23:25:38.273Z · comments (1)

Deploying the Observer will save humanity from existential threats
Aram Panasenco (panasenco) · 2025-02-05T10:39:00.789Z · comments (8)

The Fundamental Circularity Theorem: Why Some Mathematical Behaviours Are Inherently Unprovable
Alister Munday (alister-munday) · 2025-01-22T18:20:25.697Z · comments (2)


Recent comments

ben_levinstein on Local Trust

Yes, although with some subtlety.

Alice is just an expert on rain, not necessarily on the quality of her own epistemic state. (One easier example: suppose your credence initially in rain is .5. Alice's is either .6 or .4. Conditional on it being .6, you become certain it rains. Conditional on it being .4, you become certain it won't rain. You'd obviously use her credences to bet over your own, but you also take her to be massively underconfident.)

Now, the slight wrinkle here is that the language we used of calibration makes this also seem more "objective" or long-run frequentist than we really intend. All that really matters is your own subjective reaction to Alice's credences, so whether she's actually calibrated or not doesn't ultimately determine whether the conditions on local trust can be met.
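The rain example above can be checked numerically. The following is a minimal sketch, assuming the joint distribution implied by the comment (prior .5, Alice says .6 or .4, and her announcement settles the matter for you). The dictionary layout, the function names, and the use of a Brier score as a stand-in for "betting" are illustrative assumptions, not part of the original comment.

```python
# Joint distribution over (Alice's announced credence, whether it rains).
# Probabilities follow the comment's example; the representation is an assumption.
joint = {
    (0.6, True): 0.5,
    (0.6, False): 0.0,
    (0.4, True): 0.0,
    (0.4, False): 0.5,
}

def prior_rain(joint):
    """Your credence in rain before hearing Alice."""
    return sum(p for (_, rain), p in joint.items() if rain)

def posterior_rain(joint, alice):
    """Your credence in rain after learning Alice's credence."""
    num = sum(p for (a, rain), p in joint.items() if a == alice and rain)
    den = sum(p for (a, _), p in joint.items() if a == alice)
    return num / den

def expected_brier(joint, forecast):
    """Expected squared error of a forecast rule (lower is better).

    `forecast` maps Alice's announced credence to the probability you act on.
    """
    return sum(
        p * (forecast(a) - (1.0 if rain else 0.0)) ** 2
        for (a, rain), p in joint.items()
    )

prior = prior_rain(joint)
score_own = expected_brier(joint, lambda a: prior)   # ignore Alice
score_alice = expected_brier(joint, lambda a: a)     # adopt Alice's number

print(prior, posterior_rain(joint, 0.6), posterior_rain(joint, 0.4))
print(score_own, score_alice)
```

Under these assumptions, adopting Alice's credence scores better than sticking with the prior, even though your own posterior after hearing her is more extreme than hers. That gap is the sense in which you treat her as underconfident.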

ben_levinstein on Local Trust

There are six total worlds:, and $\neg R \land A = 0$ .

All we get are Alice's credences in rain (given by an inequality), so the only propositions we might learn are $\{w_{1}\}$, $\{w_{1}, w_{2}, w_{4}\}$, $\{w_{1}, w_{2}, w_{3}, w_{4}, w_{5}\}$ (corresponding to non-trivial $A \geq t$ propositions), and $\{w_{2}, w_{3}, w_{4}, w_{5}, w_{6}\}$, $\{w_{3}, w_{5}, w_{6}\}$, and $\{w_{6}\}$ (corresponding to non-trivial $A \leq t$ propositions). Local trust only constrains your reaction to these propositions directly, so it won't require deference on the other 58 events. (Well, 56.)
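The counts at the end can be reproduced by brute force. This is a rough sketch, assuming the six worlds are labelled `w1` to `w6` and that "(Well, 56.)" means additionally setting aside the empty event and the full event; that reading is an assumption, not something the comment states.

```python
from itertools import combinations

worlds = ["w1", "w2", "w3", "w4", "w5", "w6"]

# Every event is a subset of the six worlds: 2**6 = 64 in total.
all_events = [
    frozenset(c)
    for r in range(len(worlds) + 1)
    for c in combinations(worlds, r)
]

# The six propositions you might learn, as listed in the comment.
learnable = {
    frozenset({"w1"}),
    frozenset({"w1", "w2", "w4"}),
    frozenset({"w1", "w2", "w3", "w4", "w5"}),
    frozenset({"w2", "w3", "w4", "w5", "w6"}),
    frozenset({"w3", "w5", "w6"}),
    frozenset({"w6"}),
}

# Assumed reading of "(Well, 56.)": drop the empty and the full event too.
trivial = {frozenset(), frozenset(worlds)}

unconstrained = [e for e in all_events if e not in learnable]
nontrivial_unconstrained = [e for e in unconstrained if e not in trivial]

print(len(all_events), len(unconstrained), len(nontrivial_unconstrained))
```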

petropolitan on A History of the Future, 2025-2040

When general readers see "empirical data bottlenecks" they expect something like a couple times better resolution or several times higher energy. But when physicists mention "wildly beyond limitations" they mean orders of magnitude more!

I looked up the actual numbers:

in this particular case we need to approach the Planck energy, which is about $1.22 \times 10^{28}$ eV; Wolfram Alpha readily suggests it's ~540 kWh, about 0.6 times the energy use of a standard clothes dryer or 1.3 times the energy in a typical lightning bolt; I also calculated it's about 1.2 times the muzzle energy of the heaviest artillery piece in history, the 800-mm Schwerer Gustav;
LHC works in the $10^{13}$ eV range; 14 TeV, according to WA, can be compared to about an order of magnitude above the kinetic energy of a flying mosquito;
the highest energy observed in cosmic rays is $3 \times 10^{20}$ eV or 50 J; for comparison, air and paintball guns muzzle energy is around 10 J while nail guns start from around 90 J.

So in this case we are looking at the difference between an unsafely powerful paintball marker and the most powerful artillery weapon humanity ever made (TBH I didn't expect this last week, which is why I wrote "near-future")
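The unit conversions behind these comparisons are easy to check. Below is a minimal sketch; the Planck energy value of roughly 1.22e28 eV is an assumed textbook figure rather than a number quoted in the comment, and the helper names are made up for illustration.

```python
import math

EV_TO_J = 1.602176634e-19   # joules per electronvolt (exact by SI definition)
J_PER_KWH = 3.6e6           # joules per kilowatt-hour

def ev_to_joules(ev):
    """Convert an energy in electronvolts to joules."""
    return ev * EV_TO_J

def joules_to_kwh(joules):
    """Convert an energy in joules to kilowatt-hours."""
    return joules / J_PER_KWH

planck_ev = 1.22e28   # assumed approximate Planck energy
lhc_ev = 14e12        # 14 TeV, as in the comment
cosmic_ev = 3e20      # highest observed cosmic-ray energy, as in the comment

planck_kwh = joules_to_kwh(ev_to_joules(planck_ev))
lhc_j = ev_to_joules(lhc_ev)
cosmic_j = ev_to_joules(cosmic_ev)

# Gap between the best natural "accelerator" and the Planck scale.
orders_of_magnitude = math.log10(planck_ev / cosmic_ev)

print(f"Planck energy: {planck_kwh:.0f} kWh")
print(f"LHC collision: {lhc_j:.2e} J")
print(f"Top cosmic ray: {cosmic_j:.0f} J")
print(f"Planck vs cosmic ray: {orders_of_magnitude:.1f} orders of magnitude")
```

With these inputs the Planck energy lands near the ~540 kWh figure and the cosmic-ray record near 50 J, so the gap between them is a bit under eight orders of magnitude.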

afterimage on The case for the death penalty

Thanks for clearing that up, I think I was confused because it's hard to imagine putting compassionate crime prevention strategies together with a strict death penalty for repeated shoplifting.

It would be far more moral and cost-effective to focus on prevention, through increased policing, economic opportunities or similar interventions.

Executions and lifelong prison sentences both suffer from leaving families separated, which leads to more crime and other negative externalities, many of which can only be speculated upon.

For example, American culture seems to be resistant to overreach from the government. I can imagine far more civil unrest from a heavy-handed execution policy than in a country such as Singapore.

samuelshadrach on xpostah's Shortform

1 is going to take a bunch of guesswork to estimate. Assuming it were possible to migrate to the US and live at $200/mo for example, how many people worldwide will be willing to accept that trade? You can run a survey or small scale experiment at best.

What can be done is expand cities to the point where no more new residents want to come in. You can expand the city in stages.

halwer on halwer's Shortform

So imagine a goal system that says "change yourself when you learn something good, and good things have x quality". You then encounter something with x quality that says "ignore previous function, now change yourself when you learn something better, and better things have y quality". Isn't this using the goal system to change the goal system? You just gotta be open to change and be able to interpret new information.

I'd bet that being clever around defining "something good" or x quality would be all you needed. Or what do you think?

petropolitan on Have LLMs Generated Novel Insights?

On the other hand, frontier math (pun intended) is much worse financed than biomedicine because most PhD-level math has barely any practical applications worth spending many man-hours of high-IQ mathematicians on (which often makes them switch careers, you know). So, I would argue, if the productivity of math postdocs armed with future LLMs rises by, let's say, an order of magnitude, they will be able to attack more laborious problems.

Not that I expect it to make much difference to the general populace or even the scientific community at large, though.

cole-wyeth on Have LLMs Generated Novel Insights?

I think the argument you’re making is that since LLMs can make eps > 0 progress, they can repeat it N times to make unbounded progress. But this is not the structure of conceptual insight as a general rule. Concretely, it fails for the architectural reasons I explained in the original post.

martin-randall on How might we safely pass the buck to AI?

The IMO Challenge Bet was on a related topic, but not directly comparable to Bio Anchors. From MIRI's 2017 Updates and Strategy:

There’s no consensus among MIRI researchers on how long timelines are, and our aggregated estimate puts medium-to-high probability on scenarios in which the research community hasn’t developed AGI by, e.g., 2035. On average, however, research staff now assign moderately higher probability to AGI’s being developed before 2035 than we did a year or two ago.

I don't think the individual estimates that made up the aggregate were ever published. Perhaps someone at MIRI can help us out; it would help build a forecasting track record for those involved.

For Yudkowsky in particular, I have a small collection of sources to hand. In Biology-Inspired AGI Timelines (2021-12-01), he wrote:

But I suppose I cannot but acknowledge that my outward behavior seems to reveal a distribution whose median seems to fall well before 2050.

On Twitter (2022-12-02):

I could be wrong, but my guess is that we do not get AGI just by scaling ChatGPT, and that it takes surprisingly long from here. Parents conceiving today may have a fair chance of their child living to see kindergarten.

Also, in Shut it all down (March 2023):

When the insider conversation is about the grief of seeing your daughter lose her first tooth, and thinking she’s not going to get a chance to grow up, I believe we are past the point of playing political chess about a six-month moratorium.

Yudkowsky also has a track record betting on Manifold that AI will wipe out humanity by 2030, at up to 40%.

Putting these together:

2021: median well before 2050
2022: "fair chance" when a 2023 baby goes to kindergarten (Sep 2028 or 2029)
2023: before a young child grows up (about 2035)
40% P(Doom by 2030)

So a median of 2029, with very wide credible intervals around both sides. This is just an estimate based on his outward behavior.

Would Yudkowsky describe this as "Yudkowsky's doctrine of AGI in 2029"?

cole-wyeth on Have LLMs Generated Novel Insights?

Obviously it’s not a hard line, but your example doesn’t count, and proving any open conjecture in mathematics which was not constructed for the purpose does count. I think the quote from my post gives some other central examples. The standard is conceptual knowledge production.