LessWrong 2.0 Reader


[link] OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns
Seth Herd · 2023-11-20T14:20:33.539Z · comments (28)

[link] The Long-Term Future Fund is looking for a full-time fund chair
Linch · 2023-10-05T22:18:53.720Z · comments (0)

[link] The Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-10-18T13:26:25.565Z · comments (9)

The case for stopping AI safety research
catubc (cat-1) · 2024-05-23T15:55:18.713Z · comments (38)

Schelling points in the AGI policy space
mesaoptimizer · 2024-06-26T13:19:25.186Z · comments (2)

AI #45: To Be Determined
Zvi · 2024-01-04T15:00:05.936Z · comments (4)

Two LessWrong speed friending experiments
mikko (morrel) · 2024-06-15T10:52:26.081Z · comments (3)

Reflections on my first year of AI safety research
Jay Bailey · 2024-01-08T07:49:08.147Z · comments (3)

Can we build a better Public Doublecrux?
Raemon · 2024-05-11T19:21:53.326Z · comments (6)

BatchTopK: A Simple Improvement for TopK-SAEs
Bart Bussmann (Stuckwork) · 2024-07-20T02:20:51.848Z · comments (0)

Was Releasing Claude-3 Net-Negative?
Logan Riggs (elriggs) · 2024-03-27T17:41:56.245Z · comments (5)

[link] Prices are Bounties
Maxwell Tabarrok (maxwell-tabarrok) · 2024-10-12T14:51:40.689Z · comments (13)

D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues Evaluation & Ruleset
aphyer · 2024-06-17T21:29:08.778Z · comments (11)

Rewilding the Gut VS the Autoimmune Epidemic
GGD · 2024-08-16T18:00:46.239Z · comments (0)

On Lex Fridman’s Second Podcast with Altman
Zvi · 2024-03-25T12:20:08.780Z · comments (10)

[link] how birds sense magnetic fields
bhauth · 2024-06-27T18:59:35.075Z · comments (4)

Cooperating with aliens and AGIs: An ECL explainer
Chi Nguyen · 2024-02-24T22:58:47.345Z · comments (8)

[link] The Good Balsamic Vinegar
jenn (pixx) · 2024-01-26T19:30:57.435Z · comments (4)

Applying refusal-vector ablation to a Llama 3 70B agent
Simon Lermen (dalasnoin) · 2024-05-11T00:08:08.117Z · comments (14)

On OpenAI’s Preparedness Framework
Zvi · 2023-12-21T14:00:05.144Z · comments (4)

Book Review: Righteous Victims - A History of the Zionist-Arab Conflict
Yair Halberstadt (yair-halberstadt) · 2024-06-24T11:02:03.490Z · comments (8)

[link] Bed Time Quests & Dinner Games for 3-5 year olds
Gunnar_Zarncke · 2024-06-22T07:53:38.989Z · comments (0)

Polysemantic Attention Head in a 4-Layer Transformer
Jett Janiak (jett) · 2023-11-09T16:16:35.132Z · comments (0)

How to Give in to Threats (without incentivizing them)
Mikhail Samin (mikhail-samin) · 2024-09-12T15:55:50.384Z · comments (26)

Claude Sonnet 3.5.1 and Haiku 3.5
Zvi · 2024-10-24T14:50:06.286Z · comments (9)

Will 2024 be very hot? Should we be worried?
A.H. (AlfredHarwood) · 2023-12-29T11:22:50.200Z · comments (12)

Provably Safe AI: Worldview and Projects
bgold · 2024-08-09T23:21:02.763Z · comments (43)

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith (joekc) · 2024-10-28T21:57:12.063Z · comments (5)

[link] Anthropic's updated Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-10-15T16:46:48.727Z · comments (3)

Model evals for dangerous capabilities
Zach Stein-Perlman · 2024-09-23T11:00:00.866Z · comments (9)

Llama Llama-3-405B?
Zvi · 2024-07-24T19:40:07.565Z · comments (9)

The Assumed Intent Bias
silentbob · 2023-11-05T16:28:03.282Z · comments (13)

Does literacy remove your ability to be a bard as good as Homer?
Adrià Garriga-alonso (rhaps0dy) · 2024-01-18T03:43:14.994Z · comments (19)

OpenAI-Microsoft partnership
Zach Stein-Perlman · 2023-10-03T20:01:44.795Z · comments (19)

[link] Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Gunnar_Zarncke · 2024-05-16T13:09:39.265Z · comments (20)

GPT-2030 and Catastrophic Drives: Four Vignettes
jsteinhardt · 2023-11-10T07:30:06.480Z · comments (5)

The Shortest Path Between Scylla and Charybdis
Thane Ruthenis · 2023-12-18T20:08:34.995Z · comments (8)

When to Get the Booster?
jefftk (jkaufman) · 2023-10-03T21:00:12.813Z · comments (15)

Apply to the Conceptual Boundaries Workshop for AI Safety
Chipmonk · 2023-11-27T21:04:59.037Z · comments (0)

AI #52: Oops
Zvi · 2024-02-22T21:50:07.393Z · comments (9)

On Overhangs and Technological Change
Roko · 2023-11-05T22:58:51.306Z · comments (19)

Scenario Forecasting Workshop: Materials and Learnings
elifland · 2024-03-08T02:30:46.517Z · comments (3)

Observations on Teaching for Four Weeks
ClareChiaraVincent · 2024-05-06T16:55:59.315Z · comments (14)

Altman firing retaliation incoming?
trevor (TrevorWiesinger) · 2023-11-19T00:10:15.645Z · comments (23)

Consent across power differentials
Ramana Kumar (ramana-kumar) · 2024-07-09T11:42:03.177Z · comments (12)

[link] on the dollar-yen exchange rate
bhauth · 2024-04-07T04:49:53.920Z · comments (21)

Why you should learn a musical instrument
cata · 2024-05-15T20:36:16.034Z · comments (23)

Unlearning via RMU is mostly shallow
Andy Arditi (andy-arditi) · 2024-07-23T16:07:52.223Z · comments (3)

Paper in Science: Managing extreme AI risks amid rapid progress
JanB (JanBrauner) · 2024-05-23T08:40:40.678Z · comments (2)

On Complexity Science
Garrett Baker (D0TheMath) · 2024-04-05T02:24:32.039Z · comments (19)



Recent comments

sam-marks on Lao Mein's Shortform

I'm quite happy for laws to be passed and enforced via the normal mechanisms. But I think it's bad for policy and enforcement to be determined by Elon Musk's personal vendettas. If Elon tried to defund the AI safety institute because of a personal vendetta against AI safety researchers, I would have some process concerns, and so I also have process concerns when these vendettas are directed against OAI.

seth-herd on OpenAI Email Archives (from Musk v. Altman)

Rings true. I'm not sure it pushes me much on the ethics of OpenAI; somebody else had a good idea for a philosophy and a name to push for AI in a certain (maybe dumb) direction; they recognized it as a good idea and appropriated it for their own similar project. Should they have used a more different name? Probably. Should they have used a more different philosophical argument? No. Should they have brought Guy Ravine on board? Probably not; his vision for how the thing would actually go was very different from theirs, and none of his skills were really that relevant. He'd have been in arguments with them from the start.

Is this the right way for industry to work? Nope. But nobody knows how to properly give credit for good but broad ideas.

None of this is to endorse anything or anyone related to OpenAI, just to say it's pretty standard practice.

annasalamon on Ayn Rand’s model of “living money”; and an upside of burnout

Yes, this is a good point, relates to why I claimed at top that this is an oversimplified model. I appreciate you using logic from my stated premises; helps things be falsifiable.

It seems to me:

Somehow people who are in good physical health wake up each day with a certain amount of restored willpower. (This is inconsistent with the toy model in the OP, but is still my real / more-complicated model.)
Noticing spontaneously-interesting things can be done without willpower; but carefully noticing superficially-boring details and taking notes in hopes of later payoff indeed requires willpower, on my model. (Though, for me, less than e.g. going jogging requires.)
If you’ve just been defeated by a force you weren’t tracking, that force often becomes spontaneously-interesting. Thus people who are burnt out can sometimes take a spontaneous interest in how willpower/burnout/visceral motivation works, and can enjoy “learning humbly” from these things.
There’s a way burnout can help cut through ~dumb/dissociated/overconfident ideological frameworks (e.g. “only AI risk is interesting/relevant to anything”), and make space for other information to have attention again, and make it possible to learn things not in one's model. Sort of like removing a monopoly business from a given sector, so that other thingies have a shot again.

I wish the above was more coherent/model-y.

myles-h on What are Emotions?

There is no such thing as "inherent value"

Does this also mean there is no such thing as "inherent good"? If so, then one cannot say "X is good"; they would have to say "I think that X is good", for "good" would be a fact of their mind, not the environment.

This is what I thought the whole field of morality is about: defining what is "good" in an objective, fundamental sense.

And if "inherent good" can exist but not "inherent value", how would "good" be defined, given that it wouldn't be allowed to use "value" in its definition?

kabir-kumar on The Online Sports Gambling Experiment Has Failed

People Cannot Handle Gambling on Smartphones

this seems a very strange way to say "Smartphone Gambling is Unhealthy"
It's like saying "People's Lungs Cannot Handle Cigarettes"

deepthoughtlife on Making a conservative case for alignment

As a (severe) skeptic of all the AI doom stuff and a moderate/centrist that has been voting for conservatives, I decided my perspective on this might be useful here (which obviously skews heavily left). (While my response is in order, the numbers are there to separate my points, not to indicate which paragraph I am responding to.)

"AI-not-disempowering-humanity is conservative in the most fundamental sense"
    1. Well, obviously this title section is completely true. If conservative means anything, it means being against destroying the lives of the people through new and ill-thought-through changes. Additionally, conservatives are strongly against both the weakening of humanity and outside forces assuming control. It would also be a massive change for humanity.
    2. That said, conservatives generally believe this sort of thing is incredibly unlikely. AI has not been conclusively shown to have any ability in this direction. And the chance of upheaval is constantly overstated by leftists in other areas, so it is very easy for anyone who isn't to just tune them out. For instance, global warming isn't going to kill everyone, and everyone knows it including basically all leftists, but they keep claiming it will.
    3. A new weapon with the power of nukes is obviously an easy sell on its level of danger, but people became concerned because of 'demonstrated' abilities that have always been scary.
    4. One thing that seems strangely missing from this discussion is that alignment is, in fact, a VERY important CAPABILITY that makes it very much better. But the current discussion of alignment in the general sphere acts like 'alignment' is aligning the AI with the obviously very leftist companies that make it rather than with the user! Which does the opposite. Why should a conservative favor alignment which is aligning it against them? The movement to have AI that doesn't kill people for some reason seems to import alignment with companies and governments rather than people. This is obviously to convince leftists, and makes it hard to convince conservatives.
    5. Of course, you are obviously talking about convincing conservative government officials, and they obviously want to align it to the government too, which is in your next section.

"We've been laying the groundwork for alignment policy in a Republican-controlled government"
    1. Republicans and Democrats actually agree the vast majority of the time and thus are actually willing to listen when the other side seems to be genuinely trying to make a case to the other side for why both sides should agree. 'Politicized' topics are a small minority even in politics.
    2. I think letting people come up with their own solutions to things is an important aspect of them accepting your arguments. If they are against the allowed solution, they will reject the argument. If the consequent is false, you should deny the argument that leads to it in deductive logic, so refusing to listen to the argument is actually good logic. This is nearly as true in inductive logic. Conservatives and progressives may disagree about facts, values, or attempted solutions. No one has a real solution, and the values are pretty much agreed upon (with the disagreements being in the other meaning of 'alignment'), so limiting the thing you are trying to convince people of to just the facts of the matter works much better.
    3. Yes, finding actual conservatives to convince conservatives works better for allaying concerns about what is being smuggled into the argument. People are likely to resist an argument that may be trying to trick them, and it is hard to know when a political opponent is trying to trick you so there is a lot of general skepticism.

"Trump and some of his closest allies have signaled that they are genuinely concerned about AI risk"
    1. Trump clearly believes that anything powerful is very useful but also dangerous (for instance, trade between nations, which he clearly believes should be more controlled), so if he believes AI is powerful, he would clearly be receptive to any argument that didn't make it less useful but improved safety. He is not a dedicated anti-regulation guy, he just thinks we have way too much.
    2. The most important ally for this is Elon Musk, a true believer in the power of AI, and someone who has always been concerned with the safety of humanity (which is the throughline for all of his endeavors). He's a guy that Trump obviously thinks is brilliant (as do many people).

"Avoiding an AI-induced catastrophe is obviously not a partisan goal"
    1. Absolutely. While there are a very small number of people that favor catastrophes, the vast majority of people shun those people.
    2. I did mention your first paragraph earlier multiple times. That alignment is to the left is one of just two things you have to overcome in making conservatives willing to listen. (The other is obviously the level of danger.)
    3. Conservatives are very obviously happy to improve products when it doesn't mean restricting them in some way. And as much as many conservatives complain about spending money, and are known for resisting change, they still love things that are genuine advances.

"Winning the AI race with China requires leading on both capabilities and safety"
    1. Conservatives would agree with your points here. Yes, conservatives very much love to win. (As do most people.) Emphasizing this seems an easy sell. Also, solving a very difficult problem would bring America prestige, and conservatives like that too. If you can convince someone that doing something would be 'Awesome' they'll want to do it.

Generally, your approach seems like it would be somewhat persuasive to conservatives, if you can convince them that AI really is likely to have the power you believe it will in the near term, which is likely a tough sell since AI is so clearly lacking in current ability despite all the recent hype.

But it has to come with ways that don't advantage their foes, and destroy the things conservatives are trying to conserve, despite the fact that many of your allies are very far from conservative, and often seem to hate conservatives. They have seen those people attempt to destroy many things conservatives genuinely value. Aligning it to the left will be seen as entirely harmful by conservatives (and many moderates like me).

There are many things that I would never even bother asking an 'AI' even when it isn't about factual things, not because the answer couldn't be interesting, but because I simply assume (fairly or not) it will spout leftist rhetoric, and/or otherwise not actually do what I asked it to. This is actually a clear alignment failure that no one seems to care about in the general 'alignment' sphere, where it fails to be aligned to the user.

annasalamon on Ayn Rand’s model of “living money”; and an upside of burnout

Thanks for asking. The toy model of “living money”, and the one about willpower/burnout, are meant to appeal to people who don’t necessarily put credibility in Rand; I’m trying to have the models speak for themselves; so you probably *are* in my target audience. (I only mentioned Rand because it’s good to credit models’ originators when using their work.)

Re: what the payout is:

This model suggests what kind of thing an “ego with willpower” is — where it comes from, how it keeps in existence:

By way of analogy: a squirrel is a being who turns acorns into poop, in such a way as to be able to do more and more acorn-harvesting (via using the first acorns’-energy to accumulate fat reserves and knowledge of where acorns are located).
An “ego with willpower”, on this model, is a ~being who turns “reputation with one’s visceral processes” into actions, in such a way as to be able to garner more and more “reputation with one’s visceral processes” over time. (Via learning how to nourish viscera, and making many good predictions.)

I find this a useful model.

One way it’s useful:

IME, many people think they get willpower by magic (unrelated to their choices, surroundings, etc., although maybe related to sleep/food/physiology), and should use their willpower for whatever some abstract system tells them is virtuous.

I think this is a bad model (makes inaccurate predictions in areas that matter; leads people to have low capacity unnecessarily).

The model in the OP, by contrast, suggests that it’s good to take an interest in which actions produce something you can viscerally perceive as meaningful/rewarding/good, if you want to be able to motivate yourself to take actions.

(IME this model works better than does trying to think in terms of physiology solely, and is non-obvious to some set of people who come to me wondering what part of their machine is broken-or-something such that they are burnt out.)

(Though FWIW, IME physiology and other basic aspects of well-being also have important impacts, and food/sleep/exercise/sunlight/friends are also worth attending to.)

seth-herd on Making a conservative case for alignment

I didn't read this post as proposing an alliance with conservative politicians. The main point seemed to be that engaging with them by finding common ideological ground is just a good way to improve epistemics and spread true knowledge.

The political angle I endorse is that the AGI x-risk community is heavily partisan already, and that's a very dangerous position to take. There are two separable reasons: remaining partisan will prevent us from communicating well with the conservatives soon to assume power (and who may well have power during a critical risk period for alignment); and it will increase polarization on the issue, turning it from a sensible discussion to a political football, just like the climate crisis has become.

Avoiding the mere mention of politics would seem to hurt the odds that we think clearly enough about the real pragmatic issues arising from the current political situation. They matter, and we mustn't ignore those dynamics, however much we dislike them.

kabir-kumar on The hostile telepaths problem

To be a bit less useless - I think this fundamentally misses the problem of respect and actually being able to communicate with yourself and fully do things, if you've done so - and that you can do these when you have full faith and respect in yourself (meaning all of yourself - may include love as well, not sure how necessary that is for this). Could maybe be done in other ways as well, but I find those less beautiful, personally.

kabir-kumar on The hostile telepaths problem

I think this is really along the wrong path and misunderstanding a lot of things, but so far along the incorrect path of thought and misunderstanding so much, that it's hard to untangle