LessWrong 2.0 Reader

Linkpost: Look at the Water
J Bostock (Jemist) · 2024-12-30T19:49:04.107Z · comments (0)
Printable book of some rationalist creative writing (from Scott A. & Eliezer)
CounterBlunder · 2024-12-23T15:44:31.437Z · comments (0)
No, the Polymarket price does not mean we can immediately conclude what the probability of a bird flu pandemic is. We also need to know the interest rate!
Christopher King (christopher-king) · 2024-12-28T16:05:47.037Z · comments (7)
Vision of a positive Singularity
RussellThor · 2024-12-23T02:19:35.050Z · comments (0)
Super human AI is a very low hanging fruit!
Hzn · 2024-12-26T19:00:22.822Z · comments (0)
Good Fortune and Many Worlds
Jonah Wilberg (jrwilb@googlemail.com) · 2024-12-27T13:21:43.142Z · comments (0)
Towards mutually assured cooperation
mikko (morrel) · 2024-12-22T20:46:21.965Z · comments (0)
[question] Has Anthropic checked if Claude fakes alignment for intended values too?
Maloew (maloew-valenar) · 2024-12-23T00:43:07.490Z · answers+comments (1)
Dishbrain and implications.
RussellThor · 2024-12-29T10:42:43.912Z · comments (0)
Broken Latents: Studying SAEs and Feature Co-occurrence in Toy Models
chanind · 2024-12-30T22:50:54.964Z · comments (0)
[question] Are Sparse Autoencoders a good idea for AI control?
Gerard Boxo (gerard-boxo) · 2024-12-26T17:34:55.617Z · answers+comments (2)
[question] Could my work, "Beyond HaHa" benefit the LessWrong community?
P. João (gabriel-brito) · 2024-12-29T16:14:13.497Z · answers+comments (0)
Teaching Claude to Meditate
Gordon Seidoh Worley (gworley) · 2024-12-29T22:27:44.657Z · comments (3)
[link] When do experts think human-level AI will be created?
Vishakha (vishakha-agrawal) · 2024-12-30T06:20:33.158Z · comments (0)
Algorithmic Asubjective Anthropics, Cartesian Subjective Anthropics
Lorec · 2024-12-27T01:58:39.880Z · comments (0)
Duplicate token neurons in the first layer of gpt2-small
Alex Gibson · 2024-12-27T04:21:55.896Z · comments (0)
[link] The Economics & Practicality of Starting Mars Colonization
Zero Contradictions · 2024-12-26T10:56:26.019Z · comments (1)
[link] World models I'm currently building
xpostah · 2024-12-30T08:26:16.972Z · comments (0)
[Rationality Malaysia] 2024 year-end meetup!
Doris Liew (doris-liew) · 2024-12-23T16:02:03.566Z · comments (0)
Towards a Unified Interpretability of Artificial and Biological Neural Networks
jan_bauer · 2024-12-21T23:10:45.842Z · comments (0)
[question] What are the main arguments against AGI?
Edy Nastase (edy-nastase) · 2024-12-24T15:49:03.196Z · answers+comments (6)
Game Theory and Behavioral Economics in The Stock Market
Jaiveer Singh (jaiveer-singh) · 2024-12-24T18:15:55.468Z · comments (0)
ARC-AGI is a genuine AGI test but o3 cheated :(
Knight Lee (Max Lee) · 2024-12-22T00:58:05.447Z · comments (2)
Making LLMs safer is more intuitive than you think: How Common Sense and Diversity Improve AI Alignment
Jeba Sania (jeba-sania) · 2024-12-29T19:27:35.685Z · comments (0)
Emergence and Amplification of Survival
jgraves01 · 2024-12-28T23:52:47.893Z · comments (0)
The Great OpenAI Debate: Should It Stay ‘Open’ or Go Private?
Satya (satya-2) · 2024-12-30T01:14:28.329Z · comments (0)
Morality Is Still Demanding
utilistrutil · 2024-12-29T00:33:40.471Z · comments (2)
The Opening Salvo: 1. An Ontological Consciousness Metric: Resistance to Behavioral Modification as a Measure of Recursive Awareness
Peterpiper · 2024-12-25T02:29:52.025Z · comments (0)
Action: how do you REALLY go about doing?
DDthinker · 2024-12-29T22:00:24.915Z · comments (0)
[link] Human, All Too Human - Superintelligence requires learning things we can’t teach
Ben Turtel (ben-turtel) · 2024-12-26T16:26:27.328Z · comments (4)
Aristotle, Aquinas, and the Evolution of Teleology: From Purpose to Meaning.
Spiritus Dei (spiritus-dei) · 2024-12-23T19:37:58.788Z · comments (0)
Woloch & Wosatan
JackOfAllTrades (JackOfAllSpades) · 2024-12-22T15:46:27.235Z · comments (0)
Terminal goal vs Intelligence
Donatas Lučiūnas (donatas-luciunas) · 2024-12-26T08:10:42.144Z · comments (24)
Propaganda Is Everywhere—LLM Models Are No Exception
Yanling Guo (yanling-guo) · 2024-12-23T01:39:03.777Z · comments (0)
The Engineering Argument Fallacy: Why Technological Success Doesn't Validate Physics
Wenitte Apiou (wenitte-apiou) · 2024-12-28T00:49:53.300Z · comments (5)
Rejecting Anthropomorphic Bias: Addressing Fears of AGI and Transformation
Gedankensprünge (gedankenspruenge) · 2024-12-29T01:48:47.583Z · comments (1)
AI Alignment, and where we stand.
afeller08 · 2024-12-29T14:08:47.276Z · comments (0)
The Misconception of AGI as an Existential Threat: A Reassessment
Gedankensprünge (gedankenspruenge) · 2024-12-29T01:39:57.780Z · comments (0)