LessWrong 2.0 Reader

Comparing Quantized Performance in Llama Models
NickyP (Nicky) · 2024-07-15T16:01:24.960Z · comments (2)

[question] If I wanted to spend WAY more on AI, what would I spend it on?
Logan Zoellner (logan-zoellner) · 2024-09-15T21:24:46.742Z · answers+comments (11)

Book Review: What Even Is Gender?
Joey Marcellino · 2024-09-01T16:09:27.773Z · comments (14)

[link] On the Role of Proto-Languages
adamShimi · 2024-09-22T16:50:34.720Z · comments (0)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (23)

Games for AI Control
charlie_griffin (cjgriffin) · 2024-07-11T18:40:50.607Z · comments (0)

Glitch Token Catalog - (Almost) a Full Clear
Lao Mein (derpherpize) · 2024-09-21T12:22:16.403Z · comments (3)

I was raised by devout Mormons, AMA [&|] Soliciting Advice
ErioirE (erioire) · 2024-03-13T16:52:19.130Z · comments (41)

[link] New report: A review of the empirical evidence for existential risk from AI via misaligned power-seeking
Harlan · 2024-04-04T23:41:26.439Z · comments (5)

[link] Anthropic, Google, Microsoft & OpenAI announce Executive Director of the Frontier Model Forum & over $10 million for a new AI Safety Fund
Zach Stein-Perlman · 2023-10-25T15:20:52.765Z · comments (8)

Game Theory without Argmax [Part 2]
Cleo Nardo (strawberry calm) · 2023-11-11T16:02:41.836Z · comments (14)

Mapping the semantic void II: Above, below and between token embeddings
mwatkins · 2024-02-15T23:00:09.010Z · comments (4)

Quick evidence review of bulking & cutting
jp · 2024-04-04T21:43:48.534Z · comments (5)

Video and transcript of presentation on Scheming AIs
Joe Carlsmith (joekc) · 2024-03-22T15:52:03.311Z · comments (1)

[question] When did Eliezer Yudkowsky change his mind about neural networks?
[deactivated] (Yarrow Bouchard) · 2023-11-14T21:24:00.000Z · answers+comments (15)

Features and Adversaries in MemoryDT
Joseph Bloom (Jbloom) · 2023-10-20T07:32:21.091Z · comments (6)

AI's impact on biology research: Part I, today
octopocta · 2023-12-23T16:29:18.056Z · comments (6)

Different views of alignment have different consequences for imperfect methods
Stuart_Armstrong · 2023-09-28T16:31:20.239Z · comments (0)

The Byronic Hero Always Loses
Cole Wyeth (Amyr) · 2024-02-22T01:31:59.652Z · comments (4)

[link] A Narrative History of Environmentalism's Partisanship
Jeffrey Heninger (jeffrey-heninger) · 2024-05-14T16:51:01.029Z · comments (3)

Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs”
Miles Turpin (miles) · 2023-10-03T02:22:00.199Z · comments (0)

Superforecasting the premises in “Is power-seeking AI an existential risk?”
Joe Carlsmith (joekc) · 2023-10-18T20:23:51.723Z · comments (3)

[link] Thoughts on Zero Points
depressurize (anchpop) · 2024-04-23T02:22:27.448Z · comments (1)

Late-talking kid part 3: gestalt language learning
Steven Byrnes (steve2152) · 2023-10-17T02:00:05.182Z · comments (5)

[link] Abs-E (or, speak only in the positive)
dkl9 · 2024-02-19T21:14:32.095Z · comments (20)

Falling fertility explanations and Israel
Yair Halberstadt (yair-halberstadt) · 2024-04-03T03:27:38.564Z · comments (4)

[link] Self-Resolving Prediction Markets
PeterMcCluskey · 2024-03-03T02:39:42.212Z · comments (0)

[link] The Cancer Resolution?
PeterMcCluskey · 2024-07-24T00:25:17.322Z · comments (24)

Mentorship in AGI Safety (MAGIS) call for mentors
Valentin2026 (Just Learning) · 2024-05-23T18:28:03.173Z · comments (3)

Attention Output SAEs Improve Circuit Analysis
Connor Kissane (ckkissane) · 2024-06-21T12:56:07.969Z · comments (0)

AI Safety Strategies Landscape
Charbel-Raphaël (charbel-raphael-segerie) · 2024-05-09T17:33:45.853Z · comments (1)

[link] introduction to thermal conductivity and noise management
bhauth · 2024-03-06T23:14:02.288Z · comments (1)

[link] Aaron Silverbook on anti-cavity bacteria
DanielFilan · 2023-11-20T03:06:19.524Z · comments (3)

On Not Requiring Vaccination
jefftk (jkaufman) · 2024-02-01T19:20:12.657Z · comments (21)

UDT1.01: Plannable and Unplanned Observations (3/10)
Diffractor · 2024-04-12T05:24:34.435Z · comments (0)

[link] [Linkpost] Statement from Scarlett Johansson on OpenAI's use of the "Sky" voice, that was shockingly similar to her own voice.
Linch · 2024-05-20T23:50:28.138Z · comments (8)

On "Geeks, MOPs, and Sociopaths"
alkjash · 2024-01-19T21:04:48.525Z · comments (35)

Retrospective: PIBBSS Fellowship 2023
DusanDNesic · 2024-02-16T17:48:32.151Z · comments (1)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures
abstractapplic · 2024-05-17T00:25:42.950Z · comments (12)

Extracting SAE task features for in-context learning
Dmitrii Kharlapenko (dmitrii-kharlapenko) · 2024-08-12T20:34:13.747Z · comments (1)

How Would an Utopia-Maximizer Look Like?
Thane Ruthenis · 2023-12-20T20:01:18.079Z · comments (23)

Why wasn't preservation with the goal of potential future revival started earlier in history?
Andy_McKenzie · 2024-01-16T16:15:08.550Z · comments (1)

Good Bings copy, great Bings steal
dr_s · 2024-04-21T09:52:46.658Z · comments (6)

Music in the AI World
Martin Sustrik (sustrik) · 2024-08-16T04:20:01.706Z · comments (8)

[link] self-fulfilling prophecies when applying for funding
Chipmonk · 2024-03-01T19:01:40.991Z · comments (0)

[LDSL#1] Performance optimization as a metaphor for life
tailcalled · 2024-08-08T16:16:27.349Z · comments (4)

Some Things That Increase Blood Flow to the Brain
romeostevensit · 2024-03-27T21:48:46.244Z · comments (14)

"Full Automation" is a Slippery Metric
ozziegooen · 2024-06-11T19:56:49.855Z · comments (1)

A more systematic case for inner misalignment
Richard_Ngo (ricraz) · 2024-07-20T05:03:03.500Z · comments (4)

Adversarial Robustness Could Help Prevent Catastrophic Misuse
aogara (Aidan O'Gara) · 2023-12-11T19:12:26.956Z · comments (18)

Recent comments

antanaclasis on Economics Roundup #3

Scenario: you have equity worth (say) $100 million in expectation, but of no realized value at the moment.

You are forced to pay unrealized gains tax on that amount, and so are now $25 million in the hole. Even if you avoid this crashing you immediately (such as by getting a loan), if your equity goes to $0 you’re still out for the $25 million you paid, with no assets to back it.

The fact that this could be counted as a prepayment for a hypothetical later realized gain doesn’t help you: you can’t actually get your money back.
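The arithmetic of the scenario can be sketched in a few lines. This is only an illustration: the 25% rate is inferred from the $25 million figure above, and the function name `net_position` is hypothetical.

```python
def net_position(taxed_value, tax_rate, final_value):
    """Cash position after paying tax on an unrealized gain.

    taxed_value: paper value of the equity when the tax is assessed
    tax_rate:    assumed flat rate on the unrealized gain (an assumption, not from any statute)
    final_value: what the equity actually ends up being worth
    """
    tax_paid = taxed_value * tax_rate
    return final_value - tax_paid

# Equity valued at $100 million on paper, taxed at an assumed 25%,
# and later going to zero: the holder is out the full tax payment.
loss_case = net_position(100_000_000, 0.25, 0)
print(loss_case)
```

With the equity at zero there are no assets left to offset the tax already paid, which is the point of the comment.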

anthonyc on Lost in Innovation: The Case of Phlogiston

I guess I'm confused by the assertion that phlogiston explains things about metal properties, that isn't equally explained by "metals are calxes with the oxygen removed." Both explanations are descriptive, not predictive, and yes that remains true until we figured out quantum mechanics. Neither will tell you how a metal will behave when burned, what color flame it'll produce, why you can reduce iron ore with charcoal but not aluminum, what alloys you can make under what conditions and what their behavior will be, and so on.

I don't disagree with "you can't explain the properties of metals based on Lavoisier's chemistry paradigm without quantum mechanics." That's just straightforwardly true. I remember very well one quantum mechanics lecture where my professor said, after about a week of derivations, "and that's why metals are shiny." What I disagree with is the assertion that phlogiston does explain this, in any sense other than just postulating the existence of a substance that tautologically, exactly matches whatever is observed in all its complexity. Understanding oxygen's role better serves to highlight where the gaps in useful understanding already were, whether or not anyone had the tools yet to fill them.

Even if we do agree to identify phlogiston with electrons, then the phlogiston theorists were still mistaken to think of it as a substance separate from the other reactants. Electrons, and free energy too, are part of the reactant and product substances in question. "Atoms" aren't actually atomic, or unbreakable. Neither side of this disagreement had that truth in its toolbox, and that truth is the central one that allows quantum mechanics to improve on what came before.

review-bot on Laziness death spirals

The LessWrong Review runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2025. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?

amalthea on Applications of Chaos: Saying No (with Hastings Greer)

I don't have any experience with actual situations where this could be relevant, but it does feel like you're overly focusing on the failure case where everyone is borderline incompetent and doing arbitrary things (which of course happens on LessWrong sometimes, since the variation here is quite large!). There's clearly a huge upside to being able to spot when you're trying to do something that's impossible for theoretical reasons, and being extra sceptical in these situations (e.g. someone trying to construct a perpetual motion machine). I'm open to the argument that there's a lot to be wished for in the way people in practice apply these things.

ruby on Ruby's Quick Takes

Yeah, I think a question is whether I want to say "that kind of wireheading isn't myopic" vs "that isn't wireheading". Probably fine either way if you're consistent / taboo adequately.

adamshimi on Lost in Innovation: The Case of Phlogiston

Is there any empirical question the phlogiston theorists got right that compositional chemistry did not? AFAIK, no, but it's a real question and I'd like to know if I'm wrong here.

Although I haven't dug into the historical literature that much, I think there are two main candidates here: explaining the behavior of metals, and potential chemical energy.

On explaining the behavior of metals, here is Chang (Is Water H2O? p.43)

Phlogistonists explained the common properties of metals by saying that all metals were rich in phlogiston; this explanation was lost through the Chemical Revolution, as it does not work if we make the familiar substitution of phlogiston with the absence of oxygen (or, as Lavoisier had it, a strong affinity for oxygen). As Paul Hoyningen-Huene puts it (2008, 110): “Only after more than a 100 years could the explanatory potential of the phlogiston theory be regained in modern chemistry. One had to wait until the advent of the electron theory of metals”.

(Is Water H2O? p.21)

One salient case was the explanation of why metals (which were compounds for phlogistonists) had a set of common properties (Kuhn 1970, 148). Actually by the onset of the Chemical Revolution this was no longer a research problem in the phlogiston paradigm, as it was accepted almost as common sense that metals had their common metallic properties (including shininess, malleability, ductility, electrical conductivity) because of the phlogiston they contained. The oxygenist side seems to have rejected not so much this answer as the question itself; chemistry reclaimed this stretch of territory only in the twentieth century.

And on potential chemical energy, here are the quotes from Chang again

(Is Water H2O? p.46)

William Odling made the same point in a most interesting paper from 1871. Although not a household name today, Odling was one of the leading theoretical chemists of Victorian Britain, and at that time the Fullerian Professor of Chemistry at the Royal Institution. According to Odling (1871, 319), the major insight from the phlogistonists was that “combustible bodies possess in common a power or energy capable of being elicited and used”, and that “the energy pertaining to combustible bodies is the same in all of them, and capable of being transferred from the combustible body which has it to an incombustible body which has it not”. Lavoisier had got this wrong by locating the energy in the oxygen gas in the form of caloric, without a convincing account of why caloric contained in other gases would not have the ability to cause combustion.

(Is Water H2O? p.47)

Although phlogiston was clearly not exactly chemical potential energy as understood in 1871, Odling (p. 325) argued that “the phlogistians had, in their time, possession of a real truth in nature which, altogether lost sight of in the intermediate period, has since crystallized out in a definite form.” He ended his discourse by quoting Becher: “I trust that I have got hold of my pitcher by the right handle.” And that pitcher (or Becher, cup?), the doctrine of energy, was of course “the grandest generalization in science that has ever yet been established.”

As a summary, let's quote Chang one last time. (Is Water H2O? p.47-48)

All in all, I think it is quite clear that killing phlogiston off had two adverse effects: one was to discard certain valuable scientific problems and solutions; the other was to close off certain theoretical and experimental avenues for future scientific work. Perhaps it’s all fine from where we sit, since I think the frustrated potential of the phlogistonist system was quite fully realized eventually, by some very circuitous routes. But it seems to me quite clear that the premature death of phlogiston retarded scientific progress in quite tangible ways. If it had been left to develop, I think the concept of phlogiston would have split into two. On the one hand, by the early nineteenth century someone might well have hit upon energy conservation, puzzling over this imponderable entity which seemed to have an elusive sort of reality which could be passed from one ponderable substance to another.
In that parallel universe, we would be talking about the conservation of phlogiston, and how phlogiston turned out to have all sorts of different forms, but all interconvertible with each other. This would be no more awkward than what we have in our actual universe, in which we still talk about the role of “oxygen” (acid-generator, Sauerstoff ) in supporting combustion, and the “oxidation” number of ions. On the other hand, the phlogiston concept could have led to a study of electrons without passing through such a categorical and over-simplified atomic theory as Dalton’s. Chemists might have skipped right over from phlogiston to elementary particles, or at least found an alternative path of development that did not pass through the false simplicity of the atom–molecule–bulk matter hierarchy. Keeping the phlogiston theory would have led chemists to pay more attention to the “fourth state of matter”, starting with flames, and served as a reminder that the durability of compositionist chemical building-blocks may only be an appearance. Keeping phlogiston alive could have challenged the easy Daltonian assumption that chemical atoms were physically unbreakable units. The survival of phlogiston into the nineteenth century would have sustained a vigorous alternative tradition in chemistry and physics, which would have allowed scientists to recognize with more ease the wonderful fluidity of matter, and to come to grips sooner with the nature of ions, solutions, metals, plasmas, cathode rays, and perhaps even radioactivity.

raemon on Skills from a year of Purposeful Rationality Practice

Yeah I do think it is basically a reformulation of that idea, but tailored for a different cluster of problems. (I also think Leave a Line of Retreat and some other Sequences posts cover similar ground).

jacques-thibodeau on Bogdan Ionut Cirstea's Shortform

I completely agree, and we should just obviously build an organization around this. Automating alignment research while also getting a better grasp on maximum current capabilities (and a better picture of how we expect it to grow).

(This is my intention, and I have had conversations with Bogdan about this, but I figured I'd make it more public in case anyone has funding or ideas they would like to share.)

sharmake-farah on Another argument against utility-centric alignment paradigms

My short answer is that the argument would be that human values are quite simple and are most likely a reasonably natural abstraction, and that the felt complexity comes from adding together the complexity of the generators and of the data, which people wouldn't do for AI capabilities. This means the bitter lesson holds for human values and morals as well.

Also, the way an AI is aligned depends far more on the data it is given, and our control over synthetic data means we can get AIs that follow human values before they get capable enough to take over everything. Evolutionary psychology mispredicted this and the above point pretty hard, making it lose many Bayes points compared to the Universal Learning Machine/Blank Slate hypotheses.

Alignment generalizes further than capabilities for pretty deep reasons, contra Nate Soares: basically, it's way easier to have an AI care about human values than it is to get it to be capable in real-world domains, and verification is easier than generation.

Finally, there is evidence that AIs are far more robust to errors than people thought 15-20 years ago.

In essence, it's a negation of the following:

Fragility and Complexity of Value
Pretty much all of evolutionary psychology literature.
Capabilities generalizing further than alignment.
The Sharp Left Turn.

mako-yass on Release: Optimal Weave (P1): A Prototype Cohabitive Game

It might be good to have a suggestion that people can't talk if it's not their turn

I notice I haven't really offered a way of governing table noise. The contract system is too formal, so there's an incentive to shout over others to get the most negotiation bandwidth. I don't think this will result in people shouting a lot, but it may result in them failing to apprehend the incentives (i.e., the game).

Maybe the rule should be, the person whose turn it is decides who can talk.

It might be good to explain why the turn timer.

Wasn't it already explained?

manual.md:

Turns should be limited to 1 minute, as everything that's tricky about real world negotiation is about the way it strains under time constraints. After 1 minute, you must carry out your choice. You don't need to be strict about it, but it is very important. Perfection isn't always attainable. In life, the *efficiency* of your negotiation process matters a whole lot, you don't just want to be able to negotiate, you want to be able to negotiate fast.
Having an appropriate tolerance for error and capacity for forgiveness also matters.
Without the 1 minute rule, most of the thinking of the game will be crammed in before the first turn, which won't leave much for the rest of the game. [edited just now:] It's easier to digest if it's spread out. If you try to do all of your decision-making at once, well, that's a lot of decisions!

Describing how to normalize points may be good

This seems incompatible with relaxed power balancing requirements. Tightening power balance increases the design load... although an argument needs to be explored as to whether power balance is just too similar to balanced access to strategic depth for game design to separate them.