LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Occupational Licensing Roundup #1
Zvi · 2024-10-30T11:00:04.516Z · comments (11)

The Third Fundamental Question
Screwtape · 2024-11-15T04:01:33.770Z · comments (7)

AI Craftsmanship
abramdemski · 2024-11-11T22:17:01.112Z · comments (7)

[Intuitive self-models] 8. Rooting Out Free Will Intuitions
Steven Byrnes (steve2152) · 2024-11-04T18:16:26.736Z · comments (16)

[link] Electrostatic Airships?
DaemonicSigil · 2024-10-27T04:32:34.852Z · comments (13)

SAEs are highly dataset dependent: a case study on the refusal direction
Connor Kissane (ckkissane) · 2024-11-07T05:22:18.807Z · comments (4)

[link] electric turbofans
bhauth · 2024-11-02T22:50:59.807Z · comments (2)

Why imperfect adversarial robustness doesn't doom AI control
Buck · 2024-11-18T16:05:06.763Z · comments (27)

Toward Safety Cases For AI Scheming
Mikita Balesni (mykyta-baliesnyi) · 2024-10-31T17:20:06.019Z · comments (1)

Why our politicians aren't Median
Yair Halberstadt (yair-halberstadt) · 2024-11-03T14:03:33.779Z · comments (15)

Perils of Generalizing from One's Social Group
localdeity · 2024-11-24T15:31:18.332Z · comments (1)

Training AI agents to solve hard problems could lead to Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-11-19T00:10:55.522Z · comments (12)

[link] The Alignment Trap: AI Safety as Path to Power
crispweed · 2024-10-29T15:21:26.545Z · comments (17)

AI #87: Staying in Character
Zvi · 2024-10-29T07:10:08.212Z · comments (3)

Seeking Collaborators
abramdemski · 2024-11-01T17:13:36.162Z · comments (14)

U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative
Phib · 2024-11-19T18:42:43.296Z · comments (7)

[link] The Evals Gap
Marius Hobbhahn (marius-hobbhahn) · 2024-11-11T16:42:46.287Z · comments (7)

Toward Safety Case Inspired Basic Research
Lucas Teixeira · 2024-10-31T23:06:32.854Z · comments (2)

[question] Could orcas be (trained to be) smarter than humans? 
Towards_Keeperhood (Simon Skade) · 2024-11-04T23:29:26.677Z · answers+comments (11)

Neuroscience of human social instincts: a sketch
Steven Byrnes (steve2152) · 2024-11-22T16:16:52.552Z · comments (0)

Win/continue/lose scenarios and execute/replace/audit protocols
Buck · 2024-11-15T15:47:24.868Z · comments (2)

[link] How Likely Are Various Precursors of Existential Risk?
NunoSempere (Radamantis) · 2024-10-28T13:27:31.620Z · comments (4)

A Qualitative Case for LTFF: Filling Critical Ecosystem Gaps
Linch · 2024-11-18T00:44:57.133Z · comments (2)

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith (joekc) · 2024-10-28T21:57:12.063Z · comments (5)

[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations
Steven Byrnes (steve2152) · 2024-10-29T13:36:16.325Z · comments (2)

Metastatic Cancer Treatment Since 2010: The Success Stories
sarahconstantin · 2024-11-04T22:50:09.386Z · comments (2)

A Conflicted Linkspost
Screwtape · 2024-11-21T00:37:54.035Z · comments (0)

[link] Active Recall and Spaced Repetition are Different Things
Saul Munn (saul-munn) · 2024-11-08T20:14:56.092Z · comments (2)

An alternative approach to superbabies
Towards_Keeperhood (Simon Skade) · 2024-11-05T22:56:15.740Z · comments (19)

On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
Marcus Williams · 2024-11-07T15:39:06.854Z · comments (6)

D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset
aphyer · 2024-10-29T01:21:03.075Z · comments (12)

Which evals resources would be good?
Marius Hobbhahn (marius-hobbhahn) · 2024-11-16T14:24:48.012Z · comments (4)

Secular Solstice Round Up 2024
dspeyer · 2024-11-21T10:49:36.682Z · comments (10)

AI #88: Thanks for the Memos
Zvi · 2024-10-31T15:00:07.412Z · comments (5)

AI as a powerful meme, via CGP Grey
TheManxLoiner · 2024-10-30T18:31:58.544Z · comments (8)

Looking back on the Future of Humanity Institute - Asterisk
jakeeaton · 2024-11-19T00:44:40.928Z · comments (0)

[link] What Ketamine Therapy Is Like
Sable · 2024-11-11T11:09:08.602Z · comments (8)

The Shallow Bench
Karl Faulks (karl-faulks) · 2024-11-05T05:07:27.357Z · comments (5)

~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)

[link] Analyzing how SAE features evolve across a forward pass
bensenberner · 2024-11-07T22:07:02.827Z · comments (0)

AI #91: Deep Thinking
Zvi · 2024-11-21T14:30:06.930Z · comments (9)

[link] Epistemic status: poetry (and other poems)
Richard_Ngo (ricraz) · 2024-11-21T18:13:17.194Z · comments (5)

[link] The Choice Transition
owencb · 2024-11-18T12:30:56.198Z · comments (4)

[link] Literacy Rates Haven't Fallen By 20% Since the Department of Education Was Created
Maxwell Tabarrok (maxwell-tabarrok) · 2024-11-22T20:53:59.007Z · comments (0)

[link] Dangerous capability tests should be harder
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:20:50.610Z · comments (3)

Monthly Roundup #24: November 2024
Zvi · 2024-11-18T13:20:06.086Z · comments (14)

Motivation control
Joe Carlsmith (joekc) · 2024-10-30T17:15:50.881Z · comments (7)

Reading RFK Jr so that you don’t have to
braces · 2024-11-22T00:59:19.583Z · comments (0)

[link] a space habitat design
bhauth · 2024-11-25T17:28:48.481Z · comments (5)

AI #89: Trump Card
Zvi · 2024-11-07T16:30:05.684Z · comments (12)

Recent comments

lao-mein on Lao Mein's Shortform

US markets are not taking the Trump tariff proposals very seriously - stock prices increased after the election and 10-year Treasury yields have returned to pre-election levels, although they did spike ~0.1% after the election. Maybe the Treasury pick reassured investors?

https://www.cnbc.com/quotes/US10Y

If you believe otherwise, I encourage you to bet on it! I expected both yields and stocks to go up and am quite surprised.

I'm not sure what the markets expect to happen. Does Trump use the threat of tariffs to bully the Europeans into diplomatic concessions, and they back down? Or does Trump back down? There's also talk about Trump's policies strengthening the dollar, which makes sense. But again, net zero inflation from the tariffs is pretty wild.

The Iranian stock market also spiked after the US elections, which... what?

https://tradingeconomics.com/iran/stock-market

The Iranian government has tried to kill Trump multiple times since he authorized the assassination of Soleimani. Trump tightened sanctions against Iran in his first term. He pledges even tougher sanctions against Iran in his second. There is no possible way he can be good for the Iranian economy. Maybe this is just a hedge against inflation?

nadroj on [bounty $100] Why are there no interesting (1D, 2-state) quantum cellular automata?

There are many articles on quantum cellular automata. See for example "A review of Quantum Cellular Automata", or "Quantum Cellular Automata, Tensor Networks, and Area Laws".
I think that, compared to the literature, you're using an overly restrictive and nonstandard definition of quantum cellular automata. Specifically, it only makes sense to me to write the update as a product of operators, as you have, if all of the terms act on spatially disjoint regions.

Consider defining quantum cellular automata instead as local quantum circuits composed of identical two-site unitary operators everywhere.

If you define them like this, then basically any kind of energy- and momentum-conserving local quantum dynamics can be discretized into a quantum cellular automaton, because any two-site, time- and space-independent quantum Hamiltonian can be decomposed into steps with identical unitaries like this using the Suzuki-Trotter decomposition.
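
As a rough sketch of that construction, in notation that is not from the comment itself (the symbols $h$, $U$, $\delta t$ and the site index $j$ are assumptions made for illustration): for a translation-invariant nearest-neighbour Hamiltonian, the first-order Suzuki-Trotter decomposition gives a brickwork circuit in which every gate is the same two-site unitary.

```latex
\begin{align}
  H &= \sum_{j} h_{j,j+1},
  \qquad
  U_{j,j+1} = e^{-i\, h_{j,j+1}\, \delta t}, \\
  e^{-i H t} &\approx
  \Bigl[\, \prod_{j\ \mathrm{even}} U_{j,j+1} \;
          \prod_{j\ \mathrm{odd}} U_{j,j+1} \Bigr]^{t/\delta t},
  \qquad
  \text{error } O(t\,\delta t).
\end{align}
```

Within each layer the gates act on disjoint pairs of sites, so they commute and can be applied in parallel; the approximation error comes only from the non-commuting terms in neighbouring layers and shrinks as $\delta t \to 0$.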

mako-yass on a space habitat design

I guess since it sounds like they're going to be about a km long and 20 stories deep there'll be enough room for a nice running track with minimal upspin/downspin sections.

seth-herd on How can we prevent AGI value drift?

I think that's a pretty reasonable worry. And a lot of people share it. Here's my brief take.

Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours

I'm less worried about that because it seems like one questionable group with tons of power is way better than a bunch of questionable groups with tons of power - if the offense-defense balance tilts toward offense, which I think it does. The more groups, the more chance that someone uses it for ill.

Here's one update on my thinking: mutually assured destruction will still work for most of the world. ICBMs with nuclear payloads will become obsolete at some point, but AGIs will also likely be told to find even better/worse ways to destroy stuff. So possibly everyone with an AGI will go ahead and hold the whole earth hostage, just so whoever starts a war doesn't get to keep any of the stuff they were keeping on the planet. That creates an incentive to get off-planet and possibly keep going.

It's really hard to see how this stuff plays out, but I suspect that in retrospect it will be obvious what the constraints, incentives, and distribution of psychologies were. So I appreciate your help in thinking through it. We don't have answers yet, but they may be out there.

I don't think it would be much harder for a group to give it up if they were the only ones who had it. And maybe there's not much difference between a full renunciation of control and just saying "oh fine, I'm tired of running the world, do whatever it seems like everybody wants but check major changes with me in case I decide to throw my weight around instead of hanging out in the land of infinite fun".

steve2152 on Counting AGIs

I understand that you’re basically assuming that the “initial AGI population” is running on only the same amount of compute that was used to train that very AGI. It’s fine to make that assumption but I think you should emphasize it more. There are a lot of situations where that’s not an appropriate assumption, but rather the relevant question is “what’s the AGI population if most of the world’s compute is running AGIs”.

For example, if the means to run AGIs (code, weights, whatever) gets onto the internet, then everybody all over the world would be doing that immediately. Or if a power-seeking AGI escapes human control, then a possible thing it might do is work to systematically get copies of itself running on most of the world’s compute. Or another possible thing it might do is wipe out humanity and then get copies of itself running on most of the world’s compute, and then we’ll want to know if that’s enough AGIs for a self-sufficient stable supply chain (see “Argument 2” here). Or if we’re thinking more than a few months after AGI becomes possible at all, in a world like today’s where the leader is only slightly ahead of a gaggle of competitors and open-source projects, then AGI would again presumably be on most of the world’s compute. Or if we note that a company with AGI can make unlimited money by renting more and more compute to run more AGIs to do arbitrary remote-work jobs, then we might guess that they would decide to do so, which would lead to scaling up to as much compute around the world as money can buy.

OK, here’s the part of the post where you justified your decision to base your analysis on one training run worth of compute rather than one planet worth of compute, I think:

“One reason the training run imputation approach is likely still solid is that competition between firms or countries will crowd out compute or compute will be excluded on national security grounds. Consider the two main actors that could build AGI. If a company builds AGI, they are unlikely to have easy access to commodified compute that they have not themselves built, since they will be in fierce competition with other firms buying chips and obtaining compute. If a government builds AGI, it seems plausible they would impose strict security measures on their compute, reducing the likelihood that anything not immediately in the project would be employable at inference.”

The first part doesn’t make sense to me:

Let’s say Company A can make AGIs that are drop-in replacements for highly-skilled humans at any existing remote job (including e.g. “company founder”), and no other company can. And Company C is a cloud provider. Then Company A will be able to outbid every other company for Company C’s cloud compute, since Company A is able to turn cloud compute directly into massive revenue. It can just buy more and more cloud compute from C and every other company, funding itself with rapid exponential growth, until the whole world is saturated.

If Company A and Company B can BOTH make AGIs that are drop-in replacements for highly-skilled humans, and Company C doesn’t do AI research but is just a giant cloud provider, then Company A and Company B will bid against each other to rent Company C’s compute, and no other bidders will be anywhere close to those two. It doesn’t matter whether Company A or Company B wins the auction—Company C’s compute is going to be running AGIs either way. Right?

Next, the second part.

Yes it’s possible that a government would be sufficiently paranoid about IP theft (or loss of control or other things) that it doesn’t want to run its AGI code on random servers that it doesn’t own itself. (We should be so lucky!) It’s also possible that a company would make the same decision for the same reason. Yeah OK, that’s indeed a scenario where one might be interested in the question of what AGI population you get for its training compute. But that’s really only relevant if the government or company rapidly does a pivotal act, I think. Otherwise that’s just an interesting few-month period of containment before AGIs are on most of the world’s compute as above.

“we found three existing attempts to estimate the initial AGI population”

FWIW Holden Karnofsky wrote a 2022 blog post “AI Could Defeat All Of Us Combined” that mentions the following: “once the first human-level AI system is created, whoever created it could use the same computing power it took to create it in order to run several hundred million copies for about a year each.” Brief justification in his footnote 5. Not sure that adds much to the post, it just popped into my head as a fourth example.

~ ~ ~

For what it’s worth, my own opinion is that 1e14 FLOP/s is a better guess than 1e15 FLOP/s for human brain compute, and also that we should divide all the compute in the world including consumer PCs by 1e14 FLOP/s to guess (what I would call) “initial AGI population”, for all planning purposes apart from pivotal acts. But you’re obviously assuming that AGI will be an LLM, and I’m assuming that it won’t, so you should probably ignore my opinion. We’re talking about different things. Just thought I’d share anyway ¯\_(ツ)_/¯
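
The rule of thumb in that last paragraph is a single division, and a minimal sketch may make the sensitivity to the per-AGI figure easier to see. Only the 1e14 and 1e15 FLOP/s values come from the comment; the function name and the world-compute total are hypothetical placeholders, since the comment gives no estimate of total world compute.

```python
def initial_agi_population(total_flops_per_s: float,
                           flops_per_agi: float = 1e14) -> float:
    """Return how many AGIs could run at once on the given compute.

    total_flops_per_s: all compute assumed available, in FLOP/s.
    flops_per_agi: compute needed to run one AGI, in FLOP/s
        (1e14 is the commenter's preferred guess; 1e15 is the alternative).
    """
    if total_flops_per_s < 0 or flops_per_agi <= 0:
        raise ValueError("compute must be non-negative and per-AGI cost positive")
    return total_flops_per_s / flops_per_agi


# Hypothetical placeholder, NOT an estimate taken from the comment or any source.
PLACEHOLDER_WORLD_FLOPS_PER_S = 1e21

if __name__ == "__main__":
    for per_agi in (1e14, 1e15):
        population = initial_agi_population(PLACEHOLDER_WORLD_FLOPS_PER_S, per_agi)
        print(f"{per_agi:.0e} FLOP/s per AGI -> {population:.1e} AGIs")
```

Whatever total is plugged in, moving from 1e15 to 1e14 FLOP/s per AGI multiplies the resulting population by ten.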

daemonicsigil on a space habitat design

Running parallel to the spin axis would be fine, though.

sharmake-farah on Dalcy's Shortform

"or really, where we're just not working in a context where it's natural to think in terms of algorithms with run-times."

In what contexts is it not natural to think in terms of algorithms with specific run-times?

shankar-sivarajan on Ultralearning in 80 days

Quenya resources you might find useful (though you've probably seen most of these):

The Vinyë Lambengolmor Discord - Extremely helpful community.
Eldamo - The main dictionary (includes lots of useful neologisms), but it also has an intro to Quenya.
Quettali - Constructs the declension and conjugation tables for the words in Eldamo's dictionary.

If you also want to write in the Tengwar, see Tecendil and BSSScribe.

Thinking in Quenya might not be a reasonable goal.

cleo-nardo on Counting AGIs

Thanks for putting this together — very useful!

cesiumquail on Ultralearning in 80 days

Cool! I’m going to add my thoughts here, but I’m no authority so feel free to ignore and do whatever feels best.

Waking up early is fine as long as you’re also going to bed early. Chronic sleep deprivation is bad.

If you’re studying CS, give special attention to machine learning and the current AI landscape. It’s hard to predict what AI will look like in five years, but it’s the most important thing to be tracking.

If learning Quenya is fun and intrinsically rewarding, then that’s great, but if you’re doing it for practical reasons there are probably more efficient options. I actually have a system for writing things I don’t want anyone to read. I write in English, but I replace key words with other words based on associations that only I would find meaningful. This requires no preparatory memorization and is basically impossible to decrypt without my brain, as long as I don’t give away the meaning with context clues.

For writing, the two essential things are to have good ideas and to communicate them clearly. In my opinion Scott Alexander is the best example of this, so here’s his guide to nonfiction writing. I endorse just copying his style unless you find something you like better.

I would add a few things about writing:

Make everything predictable and standard except the most important parts that you want to emphasize.
Be honest and use the tone that feels most natural.
Spend most of your effort searching for the best ideas. Then just write them down clearly.

For general rationality, books aren’t all that helpful in my opinion. There’s a sensitivity to the specifics of each situation that’s hard to transmit except by direct example. I think you would get more out of following people who seem smart. I endorse Eliezer Yudkowsky, Scott Alexander, Wei Dai, Gwern, Connor Leahy, Dwarkesh Patel, and Stefan Schubert.