LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[link] Species as Canonical Referents of Super-Organisms
Yudhister Kumar (randomwalks) · 2024-10-18T07:49:52.944Z · comments (8)

NYU Debate Training Update: Methods, Baselines, Preliminary Results
samarnesen · 2024-07-06T18:28:54.053Z · comments (0)

AI Labs Wouldn't be Convicted of Treason or Sedition
Matthew Khoriaty (matthew-khoriaty) · 2024-06-23T21:34:50.135Z · comments (2)

[link] Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent
Karolis Jucys (karolis-ramanauskas) · 2024-07-18T17:02:06.179Z · comments (0)

Fix simple mistakes in ARC-AGI, etc.
Oleg Trott (oleg-trott) · 2024-07-09T17:46:50.364Z · comments (9)

[question] How bad would AI progress need to be for us to think general technological progress is also bad?
Jim Buhler (jim-buhler) · 2024-07-09T10:43:45.506Z · answers+comments (5)

Prediction markets and Taxes
Edmund Nelson (edmund-nelson) · 2024-11-01T17:39:35.191Z · comments (7)

[question] A Different Perspective on Rationality - Would This Be Valuable?
Gabriel Brito (gabriel-brito) · 2024-10-26T18:47:46.416Z · answers+comments (4)

Fundamental Uncertainty: Epilogue
Gordon Seidoh Worley (gworley) · 2024-11-16T00:57:48.823Z · comments (0)

Keeping it (less than) real: Against ℶ₂ possible people or worlds
quiet_NaN · 2024-09-13T17:29:44.915Z · comments (0)

how to rapidly assimilate new information
dhruvmethi · 2024-10-24T02:18:00.648Z · comments (3)

Thinking About a Pedalboard
jefftk (jkaufman) · 2024-10-08T11:50:02.054Z · comments (2)

Rationalist Gnosticism
tailcalled · 2024-10-10T09:06:34.149Z · comments (10)

[link] Testing Genetic Engineering Detection with Spike-Ins
jefftk (jkaufman) · 2024-10-22T17:20:54.947Z · comments (0)

[link] In Praise of the Beatitudes
robotelvis · 2024-09-24T05:08:21.133Z · comments (7)

[link] Physics of Language models (part 2.1)
Nathan Helm-Burger (nathan-helm-burger) · 2024-09-19T16:48:32.301Z · comments (2)

[link] Markets Are Information - Beating the Sportsbooks at Their Own Game
JJXW · 2024-11-07T20:58:43.389Z · comments (1)

[link] Anthropic teams up with Palantir and AWS to sell AI to defense customers
Matrice Jacobine · 2024-11-09T11:50:34.050Z · comments (0)

Force Sequential Output with SCP?
jefftk (jkaufman) · 2024-11-09T12:40:06.098Z · comments (4)

[question] Change My Mind: Thirders in "Sleeping Beauty" are Just Doing Epistemology Wrong
DragonGod · 2024-10-16T10:20:22.133Z · answers+comments (67)

[link] It's important to know when to stop: Mechanistic Exploration of Gemma 2 List Generation
Gerard Boxo (gerard-boxo) · 2024-10-14T17:04:57.010Z · comments (0)

An open response to Wittkotter and Yampolskiy
Donald Hobson (donald-hobson) · 2024-09-24T22:27:21.987Z · comments (0)

[link] What is autonomy? Why boundaries are necessary.
Chipmonk · 2024-10-21T17:56:33.722Z · comments (1)

New UChicago Rationality Group
Noah Birnbaum (daniel-birnbaum) · 2024-11-08T21:20:34.485Z · comments (0)

[link] Jailbreaking language models with user roleplay
loops (smitop) · 2024-09-28T23:43:10.870Z · comments (0)

Two new datasets for evaluating political sycophancy in LLMs
alma.liezenga · 2024-09-28T18:29:49.088Z · comments (0)

[link] Triangulating My Interpretation of Methods: Black Boxes by Marco J. Nathan
adamShimi · 2024-10-09T19:13:26.631Z · comments (0)

[link] Nerdtrition: simple diets via spreadsheet abuse
dkl9 · 2024-10-27T21:45:15.117Z · comments (0)

Steering LLMs' Behavior with Concept Activation Vectors
Ruixuan Huang (sprout_ust) · 2024-09-28T09:53:19.658Z · comments (0)

MIT FutureTech are hiring for a Head of Operations role
peterslattery · 2024-10-02T17:11:42.960Z · comments (0)

[link] Michael Streamlines on Buddhism
Chris_Leong · 2024-08-09T04:44:52.126Z · comments (0)

Dario Amodei's "Machines of Loving Grace" sound incredibly dangerous, for Humans
Super AGI (super-agi) · 2024-10-27T05:05:13.763Z · comments (1)

On Intentionality, or: Towards a More Inclusive Concept of Lying
Cornelius Dybdahl (Kalciphoz) · 2024-10-18T10:37:32.201Z · comments (0)

[link] Can AI agents learn to be good?
Ram Rachum (ram@rachum.com) · 2024-08-29T14:20:04.336Z · comments (0)

Thoughts On the Nature of Capability Elicitation via Fine-tuning
Theodore Chapman · 2024-10-15T08:39:19.909Z · comments (0)

[link] Contagious Beliefs—Simulating Political Alignment
James Stephen Brown (james-brown) · 2024-10-13T00:27:08.084Z · comments (0)

HDBSCAN is Surprisingly Effective at Finding Interpretable Clusters of the SAE Decoder Matrix
Jaehyuk Lim (jason-l) · 2024-10-11T23:06:14.340Z · comments (2)

[link] Cooperation and Alignment in Delegation Games: You Need Both!
Oliver Sourbut · 2024-08-03T10:16:51.716Z · comments (0)

Bad lessons learned from the debate
bayesyatina · 2024-06-26T11:54:44.953Z · comments (5)

[link] New fast transformer inference ASIC — Sohu by Etched
lukehmiles (lcmgcd) · 2024-06-26T09:56:08.649Z · comments (9)

[link] AI Safety at the Frontier: Paper Highlights, July '24
gasteigerjo · 2024-08-05T13:00:46.028Z · comments (0)

Three main arguments that AI will save humans and one meta-argument
avturchin · 2024-10-02T11:39:08.910Z · comments (8)

[question] Is it Legal to Maintain Turing Tests using Data Poisoning, and would it work?
Double · 2024-09-05T00:35:39.504Z · answers+comments (9)

Foresight Vision Weekend 2024
Allison Duettmann (allison-duettmann) · 2024-10-01T21:59:55.107Z · comments (0)

[link] AI Safety Newsletter #42: Newsom Vetoes SB 1047 Plus, OpenAI’s o1, and AI Governance Summary
Corin Katzke (corin-katzke) · 2024-10-01T20:35:32.399Z · comments (0)

[link] In-Context Learning: An Alignment Survey
alamerton · 2024-09-30T18:44:28.589Z · comments (0)

[link] Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities
Jonathan N (derpyplops) · 2024-11-05T01:01:08.083Z · comments (0)

LLMs are likely not conscious
research_prime_space · 2024-09-29T20:57:26.111Z · comments (8)

My covid-related beliefs and questions
Severin T. Seehrich (sts) · 2024-07-23T03:27:09.348Z · comments (0)

[link] Models of life
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-29T19:24:40.060Z · comments (0)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

seth-herd on OpenAI Email Archives (from Musk v. Altman)

I sometimes feel we spend too much time on philosophy and communication in the x-risk community. But thinking through the OpenAI drama suggests that it's crucial.

Now the world is in more and more immediate danger because a couple of smart guys couldn't get their philosophy or their communication right-enough, and didn't spend the time necessary to clarify. Instead Musk followed his combative and entrepreneurial instincts. The result was dramatically heating up the race for AGI, which previously had no real competition to DeepMind.

OpenAI wouldn't have launched without Musk's support, and he gave it because he was afraid of Larry Page being in charge of a successful Google AGI effort.

From Musk's interview with Tucker Carlson (automated transcript, sorry!):

I mean the the reason open AI exists at all is that um Larry Paige and I used to be close friends and I would stay at his house in pal Alto and I would talk to him late into the night about uh AI safety and at least my (01:12) perception was that Larry was not taking uh AI safety seriously enough um and um what did he say about it he really seemed to be um one want want sort of digital super intelligence basically digital God if you will and at one point uh I said well what about you know we're going to make sure humanity is okay here um and and and um uh and then he called me a speciest

Musk was afraid of what Page would do with AGI because Page called Musk a speciesist (specist?) when they were talking about AGI safety. What did Page mean by this? He probably hadn't worked it all the way through.

These guys stopped being friends, and Musk put a bunch of money and effort into developing an org that could rival DeepMind's progress toward AGI.

That org was captured by Altman. But it was always based on a stupid idea: make AGI open source. That's the dumbest thing you could do with something really dangerous - unless you believed that it would otherwise wind up in hands that just don't care about humanity.

That's probably not what Page meant. He probably mean that AI that included what we value about humanity would be a worthy successor. He probably wasn't even clear on his own philosophy at the time.

A little more careful conversation would've prevented this whole thing, and we'd be in a much better strategic position.

cata on OpenAI Email Archives (from Musk v. Altman)

It seems like Musk in 2018 dramatically underestimated the ability of OpenAI to compete with Google in the medium term.

cata on OpenAI Email Archives (from Musk v. Altman)

Thanks for not only doing this but noting the accuracy of the unchecked transcript, it's always hard work to build a mental model of how good LLM tools are at what stuff.

sam-marks on Lao Mein's Shortform

I'm quite happy for laws to be passed and enforced via the normal mechanisms. But I think it's bad for policy and enforcement to be determined by Elon Musk's personal vendettas. If Elon tried to defund the AI safety institute because of a personal vendetta against AI safety researchers, I would have some process concerns, and so I also have process concerns when these vendettas are directed against OAI.

seth-herd on OpenAI Email Archives (from Musk v. Altman)

Rings true. I'm not sure it pushes me much on the ethics of OpenAI; somebody else had a good idea for a philosophy and a name to push for AI in a certain (maybe dumb) direction; they recognized it as a good idea and appropriated it for their own similar project. Should they have used a more different name? Probably. Should they have used a more different philosophical argument? No. Should they have brought Guy Ravine on board? Probably not; his vision for how the thing would actually go was very different from theirs, and none of his skills were really that relevant. He'd have been in arguments with them from the start.

Is this the right way for industry to work? Nope. But nobody knows how to properly give credit for good but broad ideas.

None of this is to endorse anything or anyone related to OpenAI, just to say it's pretty standard practice.

annasalamon on Ayn Rand’s model of “living money”; and an upside of burnout

Yes, this is a good point, relates to why I claimed at top that this is an oversimplified model. I appreciate you using logic from my stated premises; helps things be falsifiable.

It seems to me:

Somehow people who are in good physical health wake up each day with a certain amount of restored willpower. (This is inconsistent with the toy model in the OP, but is still my real / more-complicated model.)
Noticing spontaneously-interesting things can be done without willpower; but carefully noticing superficially-boring details and taking notes in hopes of later payoff indeed requires willpower, on my model. (Though, for me, less than e.g. going jogging requires.)
If you’ve just been defeated by a force you weren’t tracking, that force often becomes spontaneously-interesting. Thus people who are burnt out can sometimes take a spontaneous interest in how willpower/burnout/visceral motivation works, and can enjoy “learning humbly” from these things.
There’s a way burnout can help cut through ~dumb/dissociated/overconfident ideological frameworks (e.g. “only AI risk is interesting/relevant to anything”), and make space for other information to have attention again, and make it possible to learn things not in one's model. Sort of like removing a monopoly business from a given sector, so that other thingies have a shot again.

I wish the above was more coherent/model-y.

myles-h on What are Emotions?

There is no such thing as "inherent value"

Does this also mean there is no such thing as "inherent good"? If so, then one cannot say, "X is good", they would have to say "I think that X is good", for "good" would be a fact of their mind, not the environment.

This is what I thought the whole field of morality is about. Defining what is "good" in an objective fundamental sense.

And if "inherent good" can exist but not "inherent value", how would "good" be defined for it wouldn't be allowed to use "value" in its definition.

kabir-kumar on The Online Sports Gambling Experiment Has Failed

People Cannot Handle Gambling on Smartphones

this seems a very strange way to say "Smartphone Gambling is Unhealthy"
It's like saying "People's Lungs Cannot Handle Cigarettes"

deepthoughtlife on Making a conservative case for alignment

As a (severe) skeptic of all the AI doom stuff and a moderate/centrist that has been voting for conservatives I decided my perspective on this might be useful here (which obviously skews heavily left). (While my response is in order, the numbers are there to separate my points, not to give which paragraph I am responding to.)

"AI-not-disempowering-humanity is conservative in the most fundamental sense"
    1.Well, obviously this title section is completely true. If conservative means anything, it means being against destroying the lives of the people through new and ill-though through changes. Additionally, conservatives are both strongly against the weakening of humanity and of outside forces assuming control. It would also be a massive change for humanity.
    2.That said, conservatives generally believe this sort of thing is incredibly unlikely. AI has not been conclusively shown to have any ability in this direction. And the chance of upheaval is constantly overstated by leftists in other areas, so it is very easy for anyone who isn't to just tune them out. For instance, global warming isn't going to kill everyone, and everyone knows it including basically all leftists, but they keep claiming it will.
    3.A new weapon with the power of nukes is obviously an easy sell on its level of danger, but people became concerned because of 'demonstrated' abilities that have always been scary.
    4.One thing that seems strangely missing from this discussion is that alignment is in fact, a VERY important CAPABILITY that makes it very much better. But the current discussion of alignment in the general sphere acts like 'alignment' is aligning the AI with the obviously very leftist companies that make it rather than with the user! Which does the opposite. Why should a conservative favor alignment which is aligning it against them? The movement to have AI that doesn't kill people for some reason seems to import alignment with companies and governments rather than people. This is obviously to convince leftists, and makes it hard to convince conservatives.
    5.Of course, you are obviously talking about convincing conservative government officials, and they obviously want to align it to the government too, which is in your next section.

"We've been laying the groundwork for alignment policy in a Republican-controlled government"
    1.Republicans and Democrats actually agree the vast majority of the time and thus are actually willing to listen when the other side seems to be genuinely trying to make a case to the other side for why both sides should agree. 'Politicized' topics are a small minority even in politics.
    2.I think letting people come up with their own solutions to things is an important aspect of them accepting your arguments. If they are against the allowed solution, they will reject the argument. If the consequent is false, you should deny the argument that leads to it in deductive logic, so refusing to listen to the argument is actually good logic. This is nearly as true in inductive logic. Conservatives and progressives may disagree about facts, values, or attempted solutions. No one has a real solution, and the values are pretty much agreed upon (with the disagreements being in the other meaning of 'alignment'), so limiting the thing you are trying to convince people of to just the facts of the matter works much better.
    3.Yes, finding actual conservatives to convince conservatives works better for allaying concerns about what is being smuggled into the argument. People are likely to resist an argument that may be trying to trick them, and it is hard to know when a political opponent is trying to trick you so there is a lot of general skepticism.

"Trump and some of his closest allies have signaled that they are genuinely concerned about AI risk"
1.Trump clearly believes that anything powerful is very useful but also dangerous (for instance, trade between nations, which he clearly believes should be more controlled), so if he believes AI is powerful, he would clearly be receptive to any argument that didn't make it less useful but improved safety. He is not a dedicated anti-regulation guy, he just thinks we have way too much.
2.The most important ally for this is Elon Musk, a true believer in the power of AI, and someone who has always been concerned with the safety of humanity (which is the throughline for all of his endeavors). He's a guy that Trump obviously thinks is brilliant (as do many people).

"Avoiding an AI-induced catastrophe is obviously not a partisan goal"
    1.Absolutely. While there are a very small number of people that favor catastrophes, the vast majority of people shun those people.
    2.I did mention your first paragraph earlier multiple times. That alignment is to the left is one of just two things you have to overcome in making conservatives willing to listen. (The other is obviously the level of danger.)
    3.Conservatives are very obviously happy to improve products when it doesn't mean restricting them in some way. And as much as many conservatives complain about spending money, and are known for resisting change, they still love things that are genuine advances.

"Winning the AI race with China requires leading on both capabilities and safety"
1.Conservatives would agree with your points here. Yes, conservatives very much love to win. (As do most people.) Emphasizing this seems an easy sell. Also, solving a very difficult problem would bring America prestige, and conservatives like that too. If you can convince someone that doing something would be 'Awesome' they'll want to do it.

Generally, your approach seems like it would be somewhat persuasive to conservatives, if you can convince them that AI really is likely to have the power you believe it will in the near term, which is likely a tough sell since AI is so clearly lacking in current ability despite all the recent hype.

But it has to come with ways that don't advantage their foes, and destroy the things conservatives are trying to conserve, despite the fact that many of your allies are very far from conservative, and often seem to hate conservatives. They have seen those people attempt to destroy many things conservatives genuinely value. Aligning it to the left will be seen as entirely harmful by conservatives (and many moderates like me).

There are many things that I would never even bother asking an 'AI' even when it isn't about factual things, not because the answer couldn't be interesting, but because I simply assume (fairly or not) it will spout leftist rhetoric, and/or otherwise not actually do what I asked it to. This is actually a clear alignment failure that no one seems to care about in the general 'alignment' sphere where It fails to be aligned to the user.

annasalamon on Ayn Rand’s model of “living money”; and an upside of burnout

Thanks for asking. The toy model of “living money”, and the one about willpower/burnout, are meant to appeal to people who don’t necessarily put credibility in Rand; I’m trying to have the models speak for themselves; so you probably *are* in my target audience. (I only mentioned Rand because it’s good to credit models’ originators when using their work.)

Re: what the payout is:

This model suggests what kind of thing an “ego with willpower” is — where it comes from, how it keeps in existence:

By way of analogy: a squirrel is a being who turns acorns into poop, in such a way as to be able to do more and more acorn-harvesting (via using the first acorns’-energy to accumulate fat reserves and knowledge of where acorns are located).
An “ego with willpower”, on this model, is a ~being who turns “reputation with one’s visceral processes” into actions, in such a way as to be able to garner more and more “reputation with one’s visceral processes” over time. (Via learning how to nourish viscera, and making many good predictions.)

I find this a useful model.

One way it’s useful:

IME, many people think they get willpower by magic (unrelated to their choices, surroundings, etc., although maybe related to sleep/food/physiology), and should use their willpower for whatever some abstract system tells them is virtuous.

I think this is a bad model (makes inaccurate predictions in areas that matter; leads people to have low capacity unnecessarily).

The model in the OP, by contrast, suggests that it’s good to take an interest in which actions produce something you can viscerally perceive as meaningful/rewarding/good, if you want to be able to motivate yourself to take actions.

(IME this model works better than does trying to think in terms of physiology solely, and is non-obvious to some set of people who come to me wondering what part of their machine is broken-or-something such that they are burnt out.)

(Though FWIW, IME physiology and other basic aspects of well-being also has important impacts, and food/sleep/exercise/sunlight/friends are also worth attending to.)