LessWrong 2.0 Reader

Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter
Nate Thomas (nate-thomas) · 2023-10-26T03:07:34.118Z · comments (10)

Formalizing the Informal (event invite)
abramdemski · 2024-09-10T19:22:53.564Z · comments (0)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed
johnswentworth · 2024-08-22T19:19:28.940Z · comments (4)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]
Ruby · 2024-09-19T01:35:02.999Z · comments (11)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

Concrete empirical research projects in mechanistic anomaly detection
Erik Jenner (ejenner) · 2024-04-03T23:07:21.502Z · comments (0)

Koan: divining alien datastructures from RAM activations
TsviBT · 2024-04-05T18:04:57.280Z · comments (10)

Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems
Sonia Joseph (redhat) · 2024-03-13T17:09:17.027Z · comments (13)

Housing Roundup #7
Zvi · 2024-03-04T15:00:08.192Z · comments (1)

[link] Soviet comedy film recommendations
Nina Panickssery (NinaR) · 2024-06-09T23:40:58.536Z · comments (11)

When fine-tuning fails to elicit GPT-3.5's chess abilities
Theodore Chapman · 2024-06-14T18:50:52.855Z · comments (3)

Trust as a bottleneck to growing teams quickly
benkuhn · 2024-07-13T18:00:04.579Z · comments (3)

[link] Post series on "Liability Law for reducing Existential Risk from AI"
Nora_Ammann · 2024-02-29T04:39:50.557Z · comments (1)

Wholesomeness and Effective Altruism
owencb · 2024-02-28T20:28:22.175Z · comments (3)

Evidential Cooperation in Large Worlds: Potential Objections & FAQ
Chi Nguyen · 2024-02-28T18:58:25.688Z · comments (5)

Navigating emotions in an uncertain & confusing world
Akash (akash-wasil) · 2023-11-20T18:16:09.492Z · comments (1)

How I internalized my achievements to better deal with negative feelings
Raymond Koopmanschap · 2024-02-27T15:10:24.149Z · comments (7)

[link] Project ideas: Epistemics
Lukas Finnveden (Lanrian) · 2024-01-05T23:41:23.721Z · comments (4)

[link] We Need Major, But Not Radical, FDA Reform
Maxwell Tabarrok (maxwell-tabarrok) · 2024-02-24T16:54:33.061Z · comments (12)

Deep and obvious points in the gap between your thoughts and your pictures of thought
KatjaGrace · 2024-02-23T07:30:07.461Z · comments (6)

US Presidential Election: Tractability, Importance, and Urgency
kuhanj · 2024-05-29T23:52:22.420Z · comments (2)

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers
Jeffrey Heninger (jeffrey-heninger) · 2024-07-09T16:50:05.776Z · comments (2)

Was Releasing Claude-3 Net-Negative?
Logan Riggs (elriggs) · 2024-03-27T17:41:56.245Z · comments (5)

Case studies on social-welfare-based standards in various industries
HoldenKarnofsky · 2024-06-20T13:33:44.780Z · comments (0)

Estimating efficiency improvements in LLM pre-training
Daan · 2024-01-19T19:32:45.124Z · comments (3)

[question] What rationality failure modes are there?
Ulisse Mini (ulisse-mini) · 2024-01-19T09:12:57.924Z · answers+comments (11)

What makes teaching math special
Viliam · 2023-12-17T14:15:01.136Z · comments (27)

NYT is suing OpenAI&Microsoft for alleged copyright infringement; some quick thoughts
Mikhail Samin (mikhail-samin) · 2023-12-27T18:44:33.976Z · comments (17)

AI Risk and the US Presidential Candidates
Zane · 2024-01-06T20:18:04.945Z · comments (22)

[link] Podcast with Yoshua Bengio on Why AI Labs are “Playing Dice with Humanity’s Future”
garrison · 2024-05-10T17:23:20.436Z · comments (0)

(Approximately) Deterministic Natural Latents
johnswentworth · 2024-07-19T23:02:12.306Z · comments (0)

On plans for a functional society
kave · 2023-12-12T00:07:46.629Z · comments (8)

GPT-4o My and Google I/O Day
Zvi · 2024-05-16T17:50:03.040Z · comments (2)

[question] What did you change your mind about in the last year?
mike_hawke · 2023-11-23T20:53:45.664Z · answers+comments (16)

[link] Beyond the Board: Exploring AI Robustness Through Go
AdamGleave · 2024-06-19T16:40:06.594Z · comments (2)

How ARENA course material gets made
CallumMcDougall (TheMcDouglas) · 2024-07-02T18:04:00.209Z · comments (2)

Surviving Seveneves
Yair Halberstadt (yair-halberstadt) · 2024-06-19T13:11:55.414Z · comments (4)

[Aspiration-based designs] 1. Informal introduction
B Jacobs (Bob Jacobs) · 2024-04-28T13:00:43.268Z · comments (4)

[link] What's new at FAR AI
AdamGleave · 2023-12-04T21:18:03.951Z · comments (0)

Matrix completion prize results
paulfchristiano · 2023-12-20T15:40:04.281Z · comments (0)

How Emergency Medicine Solves the Alignment Problem
StrivingForLegibility · 2023-12-26T05:24:35.579Z · comments (4)

Notes on control evaluations for safety cases
ryan_greenblatt · 2024-02-28T16:15:17.799Z · comments (0)

D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues
aphyer · 2024-06-07T19:02:06.859Z · comments (14)

Superintelligent AI is possible in the 2020s
HunterJay · 2024-08-13T06:03:26.990Z · comments (3)

[link] [Paper] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

Interoperable High Level Structures: Early Thoughts on Adjectives
johnswentworth · 2024-08-22T21:12:38.223Z · comments (1)

[link] Things I learned talking to the new breed of scientific institution
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-29T14:00:14.844Z · comments (6)


Recent comments

nathan-helm-burger on Does life actually locally *increase* entropy?

I kinda think of 'free energy' and 'entropy' as being things that living creatures in some sense 'consume'. We use the 'order' present in the universe to advance our goals (e.g. homeostasis) and leave behind a trail of higher entropy. We harness a gradient of incoming energy and order.

The sunlight which a plant absorbs might counterfactually have been turned into heat after being absorbed by the ground, and ended up in the same entropy state (from the perspective of the universe). Or it might have reflected, traveled light-years through space, and warmed some other thing. The leaf managed to insert itself in this process, intercepting the free energy, and more rapidly-than-counterfactually-expected increased the entropy of the universe.

And living multi-cellular beings are basically made up of tiny entities, cells, which are generally doing the metabolism process internally. And then mitochondria and chloroplasts within cells. But it would be a mistake to say that the living thing is causing itself to be disordered because it's increasing entropy in parts of itself. It's spending free energy (and 'excreting' entropy) in order to accomplish things. For instance, using a muscle (converting some of its stored energy to motion and waste heat) in order to bring food to the creature's mouth. The creature is creating an anti-entropic state, pursuing its specific goals, by increasing the probability of the universe corresponding to its goal state, by causing other things to be extra entropic (always with some extra loss along the way, like from friction). You are missing the order that the agent is creating in the world though if you aren't analyzing the world with the frame of how likely the agent's goals were to be achieved by random chance (e.g. Brownian motion) versus by active optimization efforts by the agent. Anytime a living creature agentically does anything, they are consuming free energy and excreting entropy.

That's my understanding anyway, but I may be using the physics terms wrong since I'm not a physicist.

drocta on A Nonconstructive Existence Proof of Aligned Superintelligence

Not if the point of the argument is to establish that a superintelligence is compatible with achieving the best possible outcome.

Here is a parody of the issue, which is somewhat unfair and leaves out almost all of your argument, but which I hope makes clear the issue I have in mind:

"Proof that a superintelligence can lead to the best possible outcome: Suppose by some method we achieved the best possible outcome. Then, there's no properties we would want a superintelligence to have beyond that, so let's call however we achieved the best possible outcome, 'a superintelligence'. Then, it is possible to have a superintelligence produce the best possible outcome, QED."

In order for an argument to be compelling for the conclusion "It is possible for a superintelligence to lead to good outcomes", you need to use a meaning of "a superintelligence" in the argument such that the statement "It is possible for a superintelligence to lead to good outcomes", when interpreted with that meaning of "a superintelligence", produces the meaning you want that sentence to have. If I argue "it is possible for a superintelligence, by which I mean a computer with a clock speed faster than N, to lead to good outcomes", then, even if I convincingly argue that a computer with a clock speed faster than N can lead to good outcomes, that shouldn't convince people that a superintelligence, in the sense that they have in mind (presumably not defined as "a computer with a clock speed faster than N"), is compatible with good outcomes.

Now, in your argument you say that a superintelligence would presumably be some computational process. True enough! If you then showed that some predicate is true of every computational process, you would then be justified in concluding that that predicate is (presumably) true of every possible superintelligence. But instead, you seem to have argued that a predicate is true of some computational process, and then concluded that it is therefore true of some possible superintelligence. This does not follow.

daniel-murfet on yanni's Shortform

It might be worth knowing that some countries are participating in the "network" without having formal AI safety institutes.

gordon-seidoh-worley on The Other Existential Crisis

What will I do when I grow up, if AI can do everything?

One interesting thing about this question is that it comes from an implicit frame in which humans must do something to support their survival.

This is deeply ingrained in our biology and culture. As animals, we carry in us the well-worn drives to survive and reproduce, without which we would not exist, because our ancestors would never have created the unbroken chain of billions of years that led to us. And with those drives comes the need to do something useful to those ends.

As humans, we are enmeshed in a culture that exists at the frontier of a long process of becoming ever better at working together to get better at surviving, because those cultures that did it better outcompeted those that were worse at it. And so we approach our entire lives with this question in our minds: what actions will I take that contribute to my survival and the survival of my society?

Transformative AI stands to break the survival frame, where the problem of our survival is put into the hands of beings more powerful than ourselves. And so then the question becomes, what do we do if we don't have to do anything to survive?

I imagine quite a lot of things! Consider what it is like to be a pet kept by humans. They have all their survival needs met for them. Some of them are so inexperienced at surviving that they'd probably die if their human caretakers disappeared, and others would make it but without the experience of years of caring for their own survival to make them experts at it. What do they do given they don't have to fight to survive? They live in luxury and happiness, if their caretakers love them and are skillful, or suffering and sorrow, if their caretakers don't or aren't.

So perhaps like a dog who lives to chase a ball or a cat who lives for napping in the sun, we will one day live to tell stories, to play games, or to simply enjoy the pleasures of being alive. Let us hope that's the world we manage to create!

drocta on A Nonconstructive Existence Proof of Aligned Superintelligence

Yes, I knew the cardinalities in question were finite. The point applies regardless though. For any set X, there is no injection from 2^X to X. In the finite case, this is 2^n > n for all natural numbers n.

If there are N possible states, then the number of functions from possible states to {0,1} is 2^N , which is more than N, so there is some function from the set of possible states to {0,1} which is not implemented by any state.
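The counting argument above can also be made constructive with a diagonal construction. The following is a minimal sketch, assuming each state is identified with the {0,1}-valued function it implements, written as a row of a table; the name `diagonal_predicate` and the three-state table are hypothetical and chosen only for illustration.

```python
from typing import List


def diagonal_predicate(implements: List[List[int]]) -> List[int]:
    """Return a 0/1 function on states that no state implements.

    implements[s][t] is the value (0 or 1) that the function implemented
    by state s assigns to state t. The result flips the diagonal entry of
    every row, so it differs from row s at position s.
    """
    return [1 - implements[s][s] for s in range(len(implements))]


# Hypothetical table for N = 3 states (illustrative values only).
implements = [
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
]

missing = diagonal_predicate(implements)
print(missing)  # → [1, 0, 1]

# The diagonal function is not implemented by any of the N states.
assert all(missing != row for row in implements)

# Counting view: there are 2**N such functions but only N states.
assert 2 ** len(implements) > len(implements)
```

Whatever table is supplied, the returned function disagrees with row s at entry s, which is the finite case of the statement that there is no surjection from X onto 2^X (and hence no injection from 2^X to X).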

nathan-helm-burger on If I wanted to spend WAY more on AI, what would I spend it on?

Personally, I have some long lists of ideas for things I haven't got time for including: research projects in AI, research projects in other subjects which could be advanced entirely by work on a computer (e.g. collecting and summarizing relevant facts from papers, running physical simulations of potential designs, etc), games, books, productivity tools, etc.

I've tried some of the current AI agent stuff, and nothing I've tried is quite good enough with the current set of models to automate enough of actualizing my ideas to make it worth my time. I'm prioritizing saving the lives of everyone on Earth, including everyone I love, by attempting to reduce the risk of AI catastrophe. Maybe next year, the critical point will be reached where spending a lot on inference to make many tries at each necessary step will become effective. If I could just dump a couple thousand dollars a month into AI agent inference working on my ideas, and get a handful of mostly complete projects out, then I'd be making tons of money even if my success rate for the ideas taking off were 1 in 1000.

If you aren't the sort of person who does have lists of potentially valuable projects sitting around waiting for intelligent workers to breathe life into them... I dunno. Maybe the next generation of models will be good enough to also help you with the ideation phase?

kenoubi on "Wanting" and "liking"

Thank you for writing this. It has a lot of stuff I haven't seen before (I'm only really interested in neurology insofar as it's the substrate for literally everything I care about, but that's still plenty for "I'd rather have a clue than treat the whole area as spooky stuff that goes bump in the night").

As I understand it, you and many scientists are treating energy consumption by anatomical part of the brain (as proxied by blood flow) as the main way to see "what the brain is doing". It seems possible to me that there are other ways that specific thoughts could be kept compartmentalized, e.g. which neurotransmitters are active (although I guess this correlates pretty strongly to brain region anyway) or microtemporal properties of neural pulses; but the fact that we've found any kind of reasonably consistent relationship between [brain region consuming energy] and [mental state as reported or as predicted by the situation] means that brain region is a factor used for separating / modularizing cognition, if not that it's the only such part. So, I'll take brain region = mental module for granted for now and get to my actual question:

Do you know whether anyone has compiled data, across a wide variety of experiments or other data-gathering opportunities, of which brain regions have which kinds of correlations with one another? E.g. "these two tend to be active simultaneously", "this one tends to become active just after this one", etc.

I'm particularly interested in this for the brain regions you mention in this article, those related in various senses to good and/or to bad. If one puts both menthol and capsaicin in one's mouth at the same time, the menthol will stimulate cold receptors and the capsaicin will stimulate heat receptors, and one will have an experience out of range of what the sensors usually encounter: hot and cold, simultaneously in the same location. What I actually want to know is: are good and bad (or some forms of them, anyway) also represented in a way where one isn't actually the opposite of the other, neurologically speaking? If so, are there actual cases that are clearly best described as "good and bad", where to pick a single number instead would inevitably miss the intensity of the experience?

lao-mein on Glitch Token Catalog - (Almost) a Full Clear

It does!

'What is \'████████\'?\n\nThis term comes from the Latin for "to know". It'
'What is \'████████\'?\n\n"████████" is a Latin for "I am not",'

Putting it in the middle of code causes it to sometimes spontaneously switch to an SCP story

' for i in █████.\n\n"I\'m not a scientist!"\n\n- Dr'

' for i in █████,\n\n[REDACTED]\n\n[REDACTED]\n\n[REDACTED] [REDACTED]\n\n[REDACTED]'

yanni-kyriacos on yanni's Shortform

Big AIS news imo: “The initial members of the International Network of AI Safety Institutes are Australia, Canada, the European Union, France, Japan, Kenya, the Republic of Korea, Singapore, the United Kingdom, and the United States.”

https://www.commerce.gov/news/press-releases/2024/09/us-secretary-commerce-raimondo-and-us-secretary-state-blinken-announce

H/T @shakeel

dagon on How Often Does Taking Away Options Help?

This seems really dependent on the distribution of games and choices faced by participants. Also the specifics of why external limits are possible but normal commitments aren’t.