LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[link] October 2024 Progress in Guaranteed Safe AI
Quinn (quinn-dougherty) · 2024-10-28T23:34:51.689Z · comments (0)

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations
ozziegooen · 2024-10-28T21:44:42.352Z · comments (0)

Quantitative Trading Bootcamp [Nov 6-10]
Ricki Heicklen (bayesshammai) · 2024-10-28T18:39:58.480Z · comments (0)

[question] somebody explain the word "epistemic" to me
KvmanThinking (avery-liu) · 2024-10-28T16:40:24.275Z · answers+comments (8)

Avoiding jailbreaks by discouraging their representation in activation space
Guido Bergman · 2024-09-27T17:49:20.785Z · comments (2)

'Chat with impactful research & evaluations' (Unjournal NotebookLMs)
david reinstein (david-reinstein) · 2024-09-28T00:32:16.845Z · comments (0)

Thoughts on Evo-Bio Math and Mesa-Optimization: Maybe We Need To Think Harder About "Relative" Fitness?
Lorec · 2024-09-28T14:07:42.412Z · comments (6)

A gentle introduction to sparse autoencoders
Nick Jiang (nick-jiang) · 2024-09-02T18:11:47.086Z · comments (0)

Exploring Shard-like Behavior: Empirical Insights into Contextual Decision-Making in RL Agents
Alejandro Aristizabal (alejandro-aristizabal) · 2024-09-29T00:32:42.161Z · comments (0)

[question] How to cite LessWrong as an academic source?
PhilosophicalSoul (LiamLaw) · 2024-11-06T08:28:26.309Z · answers+comments (6)

Food, Prison & Exotic Animals: Sparse Autoencoders Detect 6.5x Performing Youtube Thumbnails
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-09-17T03:52:43.269Z · comments (2)

[link] LLMs Do Not Think Step-by-step In Implicit Reasoning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-11-28T09:16:57.463Z · comments (0)

Budapest Hungary - ACX Meetups Everywhere Fall 2024
Timothy Underwood (timothy-underwood-1) · 2024-08-29T18:37:41.313Z · comments (0)

[question] how to truly feel my beliefs?
KvmanThinking (avery-liu) · 2024-11-11T00:04:30.994Z · answers+comments (6)

Against Job Boards: Human Capital and the Legibility Trap
vaishnav92 · 2024-10-24T20:50:50.266Z · comments (1)

Forever Leaders
Justice Howard (justice-howard) · 2024-09-14T20:55:39.095Z · comments (9)

[link] SCP Foundation - Anti memetic Division Hub
landscape_kiwi · 2024-09-15T13:40:52.691Z · comments (1)

[question] Is School of Thought related to the Rationality Community?
Shoshannah Tekofsky (DarkSym) · 2024-10-15T12:41:33.224Z · answers+comments (6)

2025 Q1 Pivotal Research Fellowship (Technical & Policy)
Tobias H (clearthis) · 2024-11-12T10:56:24.858Z · comments (0)

GPT4o is still sensitive to user-induced bias when writing code
Reed (ThomasReed) · 2024-09-22T21:04:54.717Z · comments (0)

Retrieval Augmented Genesis
João Ribeiro Medeiros (joao-ribeiro-medeiros) · 2024-10-01T20:18:01.836Z · comments (0)

Thirty random thoughts about AI alignment
Lysandre Terrisse · 2024-09-15T16:24:10.572Z · comments (1)

[question] Why would ASI share any resources with us?
Satron · 2024-11-13T23:38:36.535Z · answers+comments (8)

[question] Can subjunctive dependence emerge from a simplicity prior?
Daniel C (harper-owen) · 2024-09-16T12:39:35.543Z · answers+comments (0)

[link] Internal music player: phenomenology of earworms
dkl9 · 2024-11-14T23:29:48.383Z · comments (4)

Introducing Kairos: a new AI safety fieldbuilding organization (the new home for SPAR and FSP)
agucova · 2024-10-25T21:59:08.782Z · comments (0)

Halifax Canada - ACX Meetups Everywhere Fall 2024
interstice · 2024-08-29T18:39:12.490Z · comments (0)

[link] AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels
Corin Katzke (corin-katzke) · 2024-10-28T16:03:39.258Z · comments (0)

Americans are fat and sick—and it’s their fault…right?
Declan Molony (declan-molony) · 2024-11-19T06:41:36.648Z · comments (3)

Another UFO Bet
codyz · 2024-11-01T01:55:27.301Z · comments (11)

[link] [Linkpost] Interpretable Analysis of Features Found in Open-source Sparse Autoencoder (partial replication)
Fernando Avalos (fernando-avalos) · 2024-09-09T03:33:53.548Z · comments (1)

Inquisitive vs. adversarial rationality
gb (ghb) · 2024-09-18T13:50:09.198Z · comments (9)

[link] Optimising under arbitrarily many constraint equations
dkl9 · 2024-09-12T14:59:28.475Z · comments (0)

[link] Contra Yudkowsky on 2-4-6 Game Difficulty Explanations
Josh Hickman (josh-hickman) · 2024-09-08T16:13:33.187Z · comments (1)

Increasing the Span of the Set of Ideas
Jeffrey Heninger (jeffrey-heninger) · 2024-09-13T15:52:39.132Z · comments (1)

[question] why won't this alignment plan work?
KvmanThinking (avery-liu) · 2024-10-10T15:44:59.450Z · answers+comments (7)

The Existential Dread of Being a Powerful AI System
testingthewaters · 2024-09-26T10:56:32.904Z · comments (1)

Differential knowledge interconnection
Roman Leventov · 2024-10-12T12:52:36.267Z · comments (0)

AI Compute governance: Verifying AI chip location
Farhan · 2024-10-12T17:36:45.942Z · comments (0)

[link] How long should political (and other) terms be?
ohmurphy · 2024-10-14T21:38:43.050Z · comments (0)

Biasing VLM Response with Visual Stimuli
Jaehyuk Lim (jason-l) · 2024-10-03T18:04:31.474Z · comments (0)

[link] Should we abstain from voting? (In nondeterministic elections)
B Jacobs (Bob Jacobs) · 2024-10-02T10:07:43.167Z · comments (6)

[question] AMA: International School Student in China
Novice · 2024-10-01T06:00:16.282Z · answers+comments (0)

How to solve the misuse problem assuming that in 10 years the default scenario is that AGI agents are capable of synthetizing pathogens
jeremtti · 2024-11-27T21:17:56.687Z · comments (0)

New Capabilities, New Risks? - Evaluating Agentic General Assistants using Elements of GAIA & METR Frameworks
Tej Lander (tej-lander) · 2024-09-29T18:58:56.253Z · comments (0)

[link] Linkpost: Hypocrisy standoff
Chris_Leong · 2024-09-29T14:27:19.175Z · comments (1)

[link] An "Observatory" For a Shy Super AI?
Sherrinford · 2024-09-27T21:22:40.296Z · comments (0)

[link] Join the $10K AutoHack 2024 Tournament
Paul Bricman (paulbricman) · 2024-09-25T11:54:20.112Z · comments (0)

[question] Artificial V/S Organoid Intelligence
10xyz (10xyz-coder) · 2024-10-23T14:31:46.385Z · answers+comments (0)

Using LLM's for AI Foundation research and the Simple Solution assumption
Donald Hobson (donald-hobson) · 2024-09-24T11:00:53.658Z · comments (0)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

chris_leong on New o1-like model (QwQ) beats Claude 3.5 Sonnet with only 32B parameters

If it works, maybe it isn't slop?

alexander-gietelink-oldenziel on John Fisher's Shortform

I was about to delete my message because I was afraid it was a bit much but then the likes started streaming in and god knows how much of a sloot i am for internet validation.

mondsemmel on Repeal the Jones Act of 1920

Maybe our disagreement is that I'm more skeptical about the legislature proactively suggesting any good legislation? My default assumption is that without leadership, hardly anything of value gets done. Like, it's an obviously good idea to repeal the Jones Act, and yet it's persisted for a hundred years.

christiankl on Repeal the Jones Act of 1920

If you don't think it relates to the question at hand, why did you brought up the point in the first place?

I think you are too much focused on Trump (likely because the media likes to focus on Trump) and not on how a successful campaign to repeal the act would look like. It's unlikely that Trump makes it his agenda, but that's not required given that the legislature is independent from the executive.

q-home on Making a conservative case for alignment

I'll describe my general thoughts, like you did.

I think about transness in a similar way to how I think about homo/bisexuality.

If homo/bisexuality is outlawed, people are gonna suffer. Bad.
If I could erase homo/bisexuality from existence without creating suffering, I wouldn't anyway. Would be a big violation of people's freedom to choose their identity and actions (even if in practice most people don't actually "choose" to be homo/bisexual).
Different people have homo/bisexuality of different "strength" and form. One man might fall in love with another man, but dislike sex or even kissing. Maybe he isn't a real homosexual, if he doesn't need to prove it physically? Another man might identify as a bisexual, but be in a relationship with a woman... he doesn't get to prove his bisexuality (sexually or romantically). Maybe we shouldn't trust him unless he walks the talk? As a result of all such situations, we might have certain "inconsistencies": some people identifying as straight have done more "gay" things than people identifying as gay. My opinion on this? I think all of this is OK. Pushing for an "objective gay test" would be dystopian and suffering-inducing. I don't think it's an empirical matter (unless we choose it to be, which is a value-laden choice). Even if it was, we might be very far away from resolving it. So just respecting people's self-identification in the meantime is best, I believe. Moreover, a lot of this is very private information anyway. Less reason to try measuring it "objectively".

My thoughts about transness specifically:

We strive for gender equality (I hope). Which makes the concept of gender less important for society as a whole.
The concept of gender is additionally damaged by all the things a person can decide to do in their social/sexual life. For example, take an "assigned male at birth" (AMAB) person. AMAB can appear and behave very feminine without taking hormones. Or vice-versa (take hormones, get a pair of boobs, but present masculine). Additionally there are different degrees of medical transition and different types of sexual preferences.
A lot of things which make someone more or less similar to a man/woman (behavior with friends, behavior with romantic partners, behavior with sexual partners, thoughts) are private. Less reason to try measuring those "objectively".
I have a choice to respect people's self-identified genders or not. I decide to respect them. Not just because I care about people's feelings, but also because of points 1 & 2 & 3 and because of my general values (I show similar respect to homo/bisexuals). So I respect pronouns, but on top of that I also respect if someone identifies as a man/woman/nonbinary. I believe respect is optimal in terms of reducing suffering and adhering to human values.

When I compare your opinion to mine, most of my confusion is about two things: what exactly do you see as an empirical question? how does the answer (or its absence) affect our actions?

Zack insists that Blanchard is right, and that I fail at rationality if I disagree with him. People on Twitter and Reddit insist that Blanchard is wrong, and that I fail at being a decent human if I disagree with them. My opinion is that I have no comparative advantage at figuring out who is right and who is wrong on this topic, or maybe everyone is wrong, anyway it is an empirical question and I don't have the data. I hope that people who have more data and better education will one day sort it out, but until that happens, my position firmly remains "I don't know (and most likely neither do you), stop bothering me".

I think we need to be careful to not make a false equivalence here:

Trans people want us to respect their pronouns and genders.
I'm not very familiar with Blanchard, so far it seems to me like Blanchard's work is (a) just a typology for predicting certain correlations and (b) this work is sometimes used to argue that trans people are mistaken about their identities/motivations.

2A is kinda tangential to 1. So is this really a case of competing theories? I think uncertainty should make one skeptical of Blanchard work's implications rather than make one skeptical about respecting trans people.

(Note that this is about the representatives, not the people being represented. Two trans people can have different opinions, but you are required to believe the woke one and oppose the non-woke one.) Otherwise, you are transphobic. I completely reject that.

Two homo/bisexuals can have different opinions on what's "true homo/bisexuality" is too. Some opinions can be pretty negative. Yes, that's inconvenient, but that's just an expected course of events.

Shortly: disagreement is not hate. But it often gets conflated, especially in environments that overwhelmingly contain people of one political tribe.

I feel it's just the nature of some political questions. Not in all questions, not in all spaces you can treat disagreement as something benign.

But if there is a person who actually feels dysphoria from not being addressed as "ve" (someone who would be triggered by calling them any of: "he", "she", or "they"), then I believe that this is between them and their psychiatrist, and I want to be left out of this game.

Agree. Also agree that lynching for accidental misgendering is bad.

(That's when you get the "attack helicopters" as an attempt to point out the absurdity of the system.)

I'm pretty sure the helicopter argument began as an argument against trans people, not as an argument against weird-ass novel pronouns.

lao-mein on Lao Mein's Shortform

I found a good summary of OpenAI's nonprofit restructuring.

will-taylor on Counting AGIs

Plateau: There may be unexpected development plateaus that come into effect at around human-level intelligence. These plateaus could be architecture-specific (scaling laws break down; getting past AGI requires something outside the deep learning paradigm) or fundamental to the nature of machine intelligence.

That doesn’t prevent any of those four things I mentioned: it doesn’t prevent (1) the AGIs escaping control and self-reproducing, nor (2) the code / weights leaking or getting stolen, nor (3) other companies reinventing the same thing, nor (4) the AGI company (or companies) having an ability to transform compute into profits at a wildly higher exchange rate than any other compute customer, and thus making unprecedented amounts of money off their existing models, and thus buying more and more compute to run more and more copies of their AGI

It doesn't prevent (1) but it does make it less likely. A 'barely general' AGI is less likely to be able to escape control than an ASI. It doesn't prevent (2). We acknowledge (3) in section IV: "We can also incorporate multiple firms or governments building AGI, by multiplying the initial AGI population by the number of such additional AGI projects. For example, 2x if we believe China and the US will be the only two projects, or 3x if we believe OpenAI, Anthropic, and DeepMind each achieve AGI." We think there are likely to be a small number of companies near the frontier, so this is likely to be a modest multiplier. Re. (4), I think ryan_b made relevant points. I would expect some portion of compute to be tied up in long-term contracts. I agree that I would expect the developer of AGI to be able to increase their access to compute over time, but it's not obvious to me how fast that would be.

Pause: Government intervention could pause frontier AI development. Such a pause could be international. It is plausible that achieving or nearly achieving an AGI system would constitute exactly the sort of catalyzing event that would inspire governments to sharply and suddenly restrict frontier AI development.

That definitely doesn’t prevent (1) or (2), and it probably doesn’t prevent (3) or (4) either depending on implementation details.

I mostly agree on this one, though again think it makes (1) less likely for the same reason. As you say, the implementation details matter for (3) and (4), and it's not clear to me that it 'probably' wouldn't prevent them. It might be that a pause would target all companies near the frontier, in which case we could see a freeze at AGI for its developer, and near AGI for competitors.

Abstention: Many frontier AI firms appear to take the risks of advanced AI seriously, and have risk management frameworks in place (see those of Google DeepMind, OpenAI, and Anthropic). Some contain what Holden Karnofsky calls if-then commitments: “If an AI model has capability X, risk mitigations Y must be in place. And, if needed, we will delay AI deployment and/or development to ensure the mitigations can be present in time.” Commitments to pause further development may kick at human-level capabilities. AGI firms might avoid recursive self-improvement to avoid existential or catastrophic risks.

That could be relevant to (1,2,4) with luck. As for (3), it might buy a few months, before Meta and the various other firms and projects that are extremely dismissive of the risks of advanced AI catch up to the front-runners.

Again, mostly agreed. I think it's possible that the development of AGI would precipitate a wider change in attitude towards it, including at other developers. Maybe it would be exactly what is needed to make other firms take the risks seriously. Perhaps it's more likely it would just provide a clear demonstration of a profitable path and spur further acceleration though. Again, we see (3) as a modest multiplier.

Windup: There are hard-to-reduce windup times in the production process of frontier AI models. For example, a training run for future systems may run into the hundreds of billions of dollars, consuming vast amounts of compute and taking months of processing. Other bottlenecks, like the time it takes to run ML experiments, might extend this windup period.

That doesn’t prevent any of (1,2,3,4). Again, we’re assuming the AGI already exists, and discussing how many servers will be running copies of it, and how soon. The question of training next-generation even-more-powerful AGIs is irrelevant to that question. Right?

The question of training next-generation even-more-powerful AGIs is relevant to containment, and is therefore relevant to how long a relatively stable period running a 'first generation AGI' might last. It doesn't prevent (2) ad (3). It doesn't prevent (4) either, though presumably a next-gen AGI would further increase a company's ability in this regard.

jeremy-gillen on John Fisher's Shortform

These seem right, but more importantly I think it would eliminate investing in new scalable companies. Or dramatically reduce it in the 50% case. So there would be very few new companies created.

(As a side note: Maybe our response to this proposal was a bit cruel. It might have been better to just point toward some econ reading material).

sil-ver on Is the mind a program?

No software/hardware separation in the brain: empirical evidence

I feel like the evidence in this section isn't strong enough to support the conclusion. Neuroscience is like nutrition -- no one agrees on anything, and you can find real people with real degrees and reputations supporting just about any view. Especially if it's something as non-committal as "this mechanism could maybe matter". Does that really invalidate the neuron doctrine? Maybe if you don't simulate ATP, the only thing that changes is that you have gotten rid of an error source. Maybe it changes some isolated neuron firings, but the brain has enough redundancy that it basically computes the same functions.

Or even if it does have a desirable computational function, maybe it's easy to substitute with some additional code.

I feel like the required standard of evidence is to demonstrate that there's a mechanism-not-captured-by-the-neuron-doctrine that plays a major computational role, not just any computational role. (Aren't most people talking about neuroscience still basically assuming that this is not the case?)

We can expect natural selection to result in a web of contingencies between different levels of abstraction.[6]

Mhh yeah I think the plausibility argument has some merit.

lucas-teixeira on Bogdan Ionut Cirstea's Shortform

I'm curious how these claims relate to what's proposed by this paper. (note, I haven't read either in depth)