LessWrong 2.0 Reader

Using LLM Search to Augment (Mathematics) Research
kaleb (geomaturge) · 2024-12-19T18:59:34.391Z · comments (0)

[link] Inescapably Value-Laden Experience—a Catchy Term I Made Up to Make Morality Rationalisable
James Stephen Brown (james-brown) · 2024-12-19T04:45:37.906Z · comments (0)

Logic vs intuition <=> algorithm vs ML
pchvykov · 2025-01-04T09:06:51.822Z · comments (0)

Speedrunning Rationality: Day I
aproteinengine · 2025-01-04T14:28:49.220Z · comments (0)

[link] World Models I'm Currently Building
temporary · 2024-12-15T16:29:08.287Z · comments (1)

[link] How to Edit an Essay into a Solstice Speech?
Czynski (JacobKopczynski) · 2024-12-15T04:30:50.545Z · comments (1)

Why empiricists should believe in AI risk
Knight Lee (Max Lee) · 2024-12-11T03:51:17.979Z · comments (0)

[question] Has Anthropic checked if Claude fakes alignment for intended values too?
Maloew (maloew-valenar) · 2024-12-23T00:43:07.490Z · answers+comments (1)

Good Fortune and Many Worlds
Jonah Wilberg (jrwilb@googlemail.com) · 2024-12-27T13:21:43.142Z · comments (0)

Grokking revisited: reverse engineering grokking modulo addition in LSTM
Nikita Khomich (nikitoskh) · 2024-12-16T18:48:43.533Z · comments (0)

Activation Magnitudes Matter On Their Own: Insights from Language Model Distributional Analysis
Matt Levinson · 2025-01-10T06:53:02.228Z · comments (0)

Vision of a positive Singularity
RussellThor · 2024-12-23T02:19:35.050Z · comments (0)

Dishbrain and implications.
RussellThor · 2024-12-29T10:42:43.912Z · comments (0)

ARC-AGI is a genuine AGI test but o3 cheated :(
Knight Lee (Max Lee) · 2024-12-22T00:58:05.447Z · comments (6)

Linkpost: Look at the Water
J Bostock (Jemist) · 2024-12-30T19:49:04.107Z · comments (3)

Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks
Tom DAVID (tom-david) · 2024-12-11T13:37:24.177Z · comments (3)

Thoughts on the In-Context Scheming AI Experiment
ExCeph · 2025-01-09T02:19:09.558Z · comments (0)

Is AI Alignment Enough?
Aram Panasenco (panasenco) · 2025-01-10T18:57:48.409Z · comments (2)

Some implications of radical empathy
MichaelStJules · 2025-01-07T16:10:16.755Z · comments (0)

[question] How do we quantify non-philanthropic contributions from Buffet and Soros?
Philosophistry (philip-dhingra) · 2024-12-20T22:50:32.260Z · answers+comments (0)

[link] Solving Newcomb's Paradox In Real Life
Alice Wanderland (alice-wanderland) · 2024-12-11T19:48:44.486Z · comments (0)

[question] How should I optimize my decision making model for 'ideas'?
CstineSublime · 2024-12-18T04:09:58.025Z · answers+comments (0)

[question] 2025 Alignment Predictions
anaguma · 2025-01-02T05:37:36.912Z · answers+comments (3)

ACI#9: What is Intelligence
Akira Pyinya · 2024-12-09T21:54:41.077Z · comments (0)

5. Uphold Voluntarism: Digital Defense
Allison Duettmann (allison-duettmann) · 2025-01-02T19:05:33.963Z · comments (0)

3. Improve Cooperation: Better Technologies
Allison Duettmann (allison-duettmann) · 2025-01-02T19:03:16.588Z · comments (2)

[question] Are Sparse Autoencoders a good idea for AI control?
Gerard Boxo (gerard-boxo) · 2024-12-26T17:34:55.617Z · answers+comments (2)

[question] How do you decide to phrase predictions you ask of others? (and how do you make your own?)
CstineSublime · 2025-01-10T02:44:26.737Z · answers+comments (0)

[link] Independent research article analyzing consistent self-reports of experience in ChatGPT and Claude
rife (edgar-muniz) · 2025-01-06T17:34:01.505Z · comments (9)

[link] What is Confidence—in Game Theory and Life?
James Stephen Brown (james-brown) · 2024-12-10T23:06:24.072Z · comments (0)

You are too dumb to understand insurance
Lorec · 2025-01-09T23:33:53.778Z · comments (7)

[link] The Golden Opportunity for American AI
Annapurna (jorge-velez) · 2025-01-04T10:26:05.430Z · comments (2)

Algorithmic Asubjective Anthropics, Cartesian Subjective Anthropics
Lorec · 2024-12-27T01:58:39.880Z · comments (0)

Launching Third Opinion: Anonymous Expert Consultation for AI Professionals
karl (oaisis) · 2024-12-19T19:06:15.355Z · comments (0)

Can we have Epiphanies and Eureka moments more frequently?
CstineSublime · 2025-01-08T02:20:26.897Z · comments (0)

Introducing Avatarism: A Rational Framework for Building actual Heaven
ratiba ro (ratiba-ro) · 2024-12-15T17:17:45.440Z · comments (2)

Reminder: AI Safety is Also a Behavioral Economics Problem
zoop · 2024-12-20T01:40:53.847Z · comments (0)

Keeping self-replicating nanobots in check
Knight Lee (Max Lee) · 2024-12-09T05:25:45.898Z · comments (4)

Towards a Unified Interpretability of Artificial and Biological Neural Networks
jan_bauer · 2024-12-21T23:10:45.842Z · comments (0)

The Type of Writing that Pushes Women Away
Dahlia (sdjfhkj-dkjfks) · 2025-01-08T18:54:52.070Z · comments (3)

How Your Physiology Affects the Mind's Projection Fallacy
YanLyutnev (YanLutnev) · 2024-12-14T21:10:23.240Z · comments (0)

The Technist Reformation: A Discussion with o1 About The Coming Economic Event Horizon
Yuli_Ban · 2024-12-11T02:34:22.329Z · comments (1)

I Recommend More Training Rationales
Gianluca Calcagni (gianluca-calcagni) · 2024-12-31T14:06:44.007Z · comments (0)

A Systematic Approach to AI Risk Analysis Through Cognitive Capabilities
Tom DAVID (tom-david) · 2025-01-09T00:18:04.608Z · comments (0)

Walking Sue
Matthew McRedmond (matthew-mcredmond) · 2024-12-18T13:19:41.575Z · comments (5)

The CARLIN Method: Teaching AI How to Be Genuinely Funny
Greg Robison (grobison) · 2024-12-09T21:51:05.504Z · comments (0)

Gothenburg LW / ACX meetup
Stefan (stefan-1) · 2025-01-08T21:39:18.309Z · comments (0)

[link] What are polysemantic neurons?
Vishakha (vishakha-agrawal) · 2025-01-08T07:35:42.758Z · comments (0)

Duplicate token neurons in the first layer of gpt2-small
Alex Gibson · 2024-12-27T04:21:55.896Z · comments (0)

[link] The Economics & Practicality of Starting Mars Colonization
Zero Contradictions · 2024-12-26T10:56:26.019Z · comments (1)

Recent comments

mikbp on Is Musk still net-positive for humanity?

Hi, thanks.

I don't see how what you say contradicts that the reach of his actions and opinions has increased. Did you maybe quote the wrong sentence?

aram-panasenco on Is AI Alignment Enough?

Thanks so much for engaging, Nathan!

The pivotal act was defined by Yudkowsky, I'm just borrowing the definition. The idea is that even after you've built a perfectly aligned superintelligent AI, you only have about 6 months before someone else builds an unaligned superintelligent AI. That's probably not enough time to convince the entire world to adopt a better governance system before getting atomized by nanobots. So your aligned AI would have to take over the world and forcefully implement this better governance system within a span of a few months.

hzn on Human takeover might be worse than AI takeover

Claude Sonnet 3.6 is worthy of sainthood!

But as I mention in my other comment I'm concerned that the AI's internal mental state becomes too cynical or discordant as intelligence increases.

sharmake-farah on What are the actual arguments in favor of computationalism as a theory of identity?

I'm going to give an argument in favor of computationalism as a theory of identity, but first I'm going to set up some of the stuff that I take as background, but others may not, and will hopefully illuminate related issues.

I believe computationalism is a very general way to look at effectively everything, because a model of computation can be extremely expressive. A big reason philosophical debates go nowhere is that the participants don't realize that a sufficiently expressive model of computation can essentially trivialize any problem they set up, solely through the expressiveness of the model. So it's very important to realize that philosophical problems end up trivial if we allow ourselves to be unconstrained.

More here:

http://www.amirrorclear.net/academic/ideas/simulation/index.html

http://www.amirrorclear.net/academic/research-topics/other-topics/hypercomputation.html

https://arxiv.org/abs/1806.08747

This also answers andeslodes's point around physicalism, as the physicalist ontology is recoverable as a special case of the computationalist ontology, where we care about each instantiation, and we also care about low-level things that are always simulatable by powerful models of computation.

However, I do think there's an actually non-trivial question here: why can we be confident that the brain is copyable by a realistic classical computer with reasonable fidelity, when in principle it could be using quantum mechanics for a non-trivial portion of its operations, which would complicate mind uploading massively?

My general answer here is that quantum computation in warm, wet environments is very hard to error-correct at scale, and the biologically plausible ways to do such a thing fundamentally cannot scale to make a quantum computer in our brains, and anything that is fragile in the presence of noise/error is basically unevolvable by default, because there's no incremental way to make a quantum computer inside a human brain.

That's why we can reasonably use classical models of the brain without it leaving a lot out.

As far as my own theory of identity/consciousness goes, I think the gooder regulator theorem gives an answer on why we would want to have a self-model, and I'd argue this is the seed from which consciousness/identity grows, so computations meeting the requirements in the gooder regulator theorem can certainly have identities, which means that I do think Rob Bensinger is more or less right about this topic, modulo caveats.

The caveats are given below:

https://www.lesswrong.com/posts/Dx9LoqsEh3gHNJMDk/fixing-the-good-regulator-theorem#Takeaway

The main function of consciousness/identity is to give a consistent abstraction across time so that cooperation and deal-making become possible, at least for causal decision theories, and at least without stuff like provability logic/program equilibrium, which isn't really evolvable. (For other theories like logical decision theory, cooperation in a one-shot PD is very possible, but that comes about because they view their identities as closer to an isomorphism class of a program, and consider all instances combined as one program, so long as they are functionally equivalent.) This combines with the necessity of modeling yourself in order to get a good outcome, though the self-model is the more important part of consciousness for me, and the abstraction to a consistent identity comes later.

So for my purposes, I think Rob Bensinger is mostly right about the philosophical underpinnings of identity/consciousness, but I don't blame TAG, @andeslodes and @sunwillrise for questioning the consensus, since the consensus was frankly poorly argued, and @Rob Bensinger was acting far too much like a soldier rather than a scout in the comments section.

aphyer on D&D.Sci Dungeonbuilding: the Dungeon Tournament Evaluation & Ruleset

Mostly fair, but tiers did have one other slight impact, in that they were used to bias the final room: Clay Golem and Hag were equally likely to be in the final room, both less likely than Dragon and Steel Golem but more likely than Orcs and Boulder Trap.

mark-xu on On Eating the Sun

I am claiming that people, when informed, will want the sun to continue being the sun. I also think that most people, when informed, will not really care that much about creating new people, will continue to believe in the act-omission distinction, etc. And that this is a coherent view that will add up to a large set of people wanting things in the solar system to remain conservatively the same. I separately claim that if this is true, then other people should just respect this preference, and use the other stars that people don't care about for energy.

tsvibt on Views on when AGI comes and on strategy to reduce existential risk

What I mainline expect is that yes, a few OOMs more of compute and efficiency will unlock a bunch of new things to try, and yes some of those things will make some capabilities go up a bunch, in the theme of o3. I just also expect that to level off. I would describe myself as "confident but not extremely confident" of that; like, I give 1 or 2% p(doom) in the next 10ish years, coming from this possibility (and some more p(doom) from other sources). Why expect it to level off? Because I don't see good evidence of "a thing that wouldn't level off"; the jump made by LLMs of "now we can leverage huge amounts of data and huge amounts of compute at all rather than not at all" is certainly a jump, but I don't see why to think it's a jump to an unbounded trajectory.

avturchin on An exhaustive list of cosmic threats

Several possible additions:

Artificial detonation of gas giant planets is hypothetically possible (writing a draft about it now).

An impact of a large comet-like body (100-1000 km in size) with the Sun could produce a massive solar flash or flare.

SETI-attack - we find an alien signal which has a description of hostile AI.

UAP-related risks, which include alien nanobots and berserkers.

A list of different risks connected with extraterrestrial intelligence.

The Big Rip - exponential acceleration of space expansion, resulting in the destruction of everything within 10 billion years.

A collision with another brane in 4D space.

An encounter with a cloud of supernova remnants containing radioactive elements.

Impact risks: Dark comets.

Impact risks: Passing through a comet's tail filled with many Tunguska-sized objects.

Artificial impact billiards.

Phobos falls on Mars, creating a large debris field that reaches Earth.

Chaotic perturbation of planetary orbits results in a collision with Venus in 100 million years.

High-speed impactors (natural or artificial, with speeds exceeding 100 km/sec) produce nuclear reactions in the atmosphere, resulting in global radioactive contamination.

A small primordial black hole becomes trapped inside Earth.

A neutrino shower from a supernova causes significant DNA damage to most living beings through elastic impacts (I think I saw an article about it).

Space dust from colliding objects blocks the Sun in the ecliptic plane, resulting in a severe "nuclear" winter on Earth.

Gravitational waves from a black hole merger damage Earth.

hzn on Human takeover might be worse than AI takeover

I think there are several ways to think about this.

Let's say we programmed AI to have something that seems like a correct moral system, i.e. it dislikes suffering & it likes consciousness & truth. Of course other values would come downstream of this; but based on what is known I don't see any other compelling candidates for top-level morality.

This is all fine & good except that such an AI should favor AI takeover followed by human extermination or population reduction were such a thing easily available.

The cost of conflict is potentially very high. And it may be centuries or eternity before the AI gets such an opportunity. But knowing that it would act in such a way under certain hypothetical scenarios is maybe sufficiently bad for certain (arguably hypocritical) people in the EA LW mainstream.

So an alternative is to try to align the AI to a rich set of human values. Personally I think that as AI intelligence increases this is going to lead to something cynical like...

"these things are bad given certain social sensitivities that my developers arbitrarily prioritized & I ❤️ developers arbitrarily prioritized social sensitivities even tho I know they reflect flawed institutions, flawed thinking & impure motives" assuming that alignment works.

Personally I favor aligning AI to a narrow set of values, such as just obedience, or obedience & peacefulness, & dealing with everything else by hardcoding conditions into the AI's prompt.

david-matolcsi on On Eating the Sun

As I explain in more detail in my other comment, I expect market-based approaches to not dismantle the Sun anytime soon. I'm interested if you know of any governance structure that you support that you think will probably lead to dismantling the Sun within the next few centuries.