LessWrong 2.0 Reader
I appreciate that AI 2027 named their model Safer-1, rather than Safe-1.
That's because they can read its thoughts like an open book.
gunnar_zarncke on eggsyntax's Shortform
I thought it would be good to have some examples where having a type signature in mind would be useful, so I asked ChatGPT. I think these are too wishy-washy, but together with the given explanations, they seem to make sense.
Would you say that this level of "having a type signature in mind" would count?
ChatGPT 4o suggesting examples
Phenomenon → (Theory, Mechanism)
Features → Label
These have different type signatures: a model that predicts well might not explain, yet people often conflate the two roles. Type signatures remind us that these are different input-output relationships. (A minimal code sketch of this pair appears below, after the source link.)
Action → Good/Bad
(State × Action) → (New State × Externalities)
People often act as if "this action is wrong" implies "we must ban it," but that only follows if the second signature supports the first. You can disagree about outcomes while agreeing on morals, or vice versa.
(Action × Impact) → Updated Mental Model
People often act as if the type signature is just Action → Judgment. That’s blame, not feedback. This reframing can help structure nonviolent communication.
(Goal × Constraints) → Best Action
Void → (Goal × Constraints × Ideas)
The creative act generates the very goal and constraints. Treating creative design like optimization prematurely can collapse valuable search space.
Speaker → (Concepts × StudentMemory)
(Student × Task × Environment) → Insight
If the type signature of insight requires active construction, then lecture-only formats may be inadequate. This helps justify pedagogy choices.
Source: https://chatgpt.com/share/67f836e2-1280-8001-a7ad-1ef1e2a7afa7
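To make the first pair of signatures concrete, here is a minimal sketch of how they might be written as actual type aliases in Python. The placeholder types (Phenomenon, Theory, Mechanism, Features, Label) are my own stand-ins, not anything from the linked chat.

```python
from typing import Callable, Tuple


# Hypothetical placeholder types, purely to make the signatures concrete.
class Phenomenon: ...
class Theory: ...
class Mechanism: ...
class Features: ...
class Label: ...


# Phenomenon -> (Theory, Mechanism): explaining returns a theory plus a mechanism.
Explain = Callable[[Phenomenon], Tuple[Theory, Mechanism]]

# Features -> Label: predicting just maps features to a label.
Predict = Callable[[Features], Label]

# A function of type Predict can score well on a benchmark without being usable
# as an Explain; writing out the aliases makes the conflation visible.
```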
max-harms on Thoughts on AI 2027
I think if there are 40-IQ humanoid creatures (even having been shaped somewhat by the genes of existing humans) running around in habitats, very excited and happy about what the AIs are doing, this counts as an existentially bad ending comparable to death. I think if everyone's brains are destructively scanned and stored on a hard drive that eventually decays in the year 1 billion, having never been run, this is effectively death. I could go on if it would be helpful.
Do you think these sorts of scenarios are worth describing as "everyone is effectively dead"?
I don't think AI personhood will be a mainstream cause area (i.e., most people will think it's weird or not true, similar to animal rights), but I do think there will be a vocal minority. I already know some people like this, and as capabilities progress and things get less controlled by the labs, I do think we'll see this become an important issue.
Want to make a bet? I'll take 1:1 odds that if we poll 200 people in mid-September 2027 on whether they think AIs are people, at least 3 of them will say "yes, and this is an important issue." (Other proposed options: "yes, but not important", "no", and "unsure".) Feel free to name a dollar amount and an arbitrator to use in case of disputes.
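As a rough way to see what that 3-of-200 threshold implies, here is a back-of-the-envelope sketch (mine, not the commenter's) that treats the poll as a binomial draw under a few assumed base rates; the base rates are purely illustrative.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


# Purely illustrative base rates for "yes, and this is an important issue".
for base_rate in (0.005, 0.01, 0.015, 0.02):
    print(f"base rate {base_rate:.1%}: "
          f"P(at least 3 of 200) = {prob_at_least(3, 200, base_rate):.2f}")
```

At even odds the bet is roughly break-even for the side betting "at least 3" wherever that probability crosses 50%.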
I came here to say "look at octopods!" but you already have. Yay team! :-)
One of the alignment strategies I have been researching in parallel with many others involves finding examples of human-and-animal benevolence and tracing convergent evolution therein, then proposing that the shared abstraction (across these genomes, these brains, these creatures all convergently doing these things) is probably algorithmically simple, with algorithm-to-reality shims that might also matter, and that we should study it and lean in the direction of doing "more of that".
There is an octopod cognate of oxytocin (the "maternal love and protection" hormone), but from what I can tell they did NOT reuse it in the ways that we did. They also mostly lay eggs and abandon the individual babies to their own survival, rather than raising children carefully.
By contrast, birds and mammals share a relatively similar kind of "high parental investment"!
frank-bellamy on Who wants to bet me $25k at 1:7 odds that there won't be an AI market crash in the next year?
Suggesting specific odds without being able to define a threshold seems a bit, um, confused. Being willing to take the word of a stranger on the internet when these quantities of money are at stake seems outright stupid. I'm staying out of this market. I suggest that you withdraw your offer.
knight-lee on Disempowerment spirals as a likely mechanism for existential catastrophe
My very uncertain opinion is that humanity may be very irrational and a little stupid, but it isn't that stupid.
The reason people do not take AI risk and other existential risks seriously is the complete lack of direct evidence (despite plenty of indirect evidence). It's easy for you to consider the risk obvious because of the curse of knowledge, but this kind of reasoning from first principles (nothing disproves the risk, therefore the risk is likely) is very hard for normal people to do.
Before the September 11th attacks, people didn't take airport security seriously because they lacked imagination about how things could go wrong. They treated worst-case outcomes as speculative fiction, however logically plausible, because "it had never happened before."
After the attacks, the government actually overreacted and created a massive amount of surveillance.
Once a threat starts doing real and serious damage to the systems that defend against threats, those systems do wake up and start fighting in earnest. They are like animals that react when attacked, not trees that can simply be chopped down.
Right now the effort against existential risk is extremely tiny. For example, AI safety spending is only $0.1 to $0.2 billion, while the US military budget is $800–$1,000 billion and world GDP is $100,000 billion ($25,000 billion in the US). It's not just spending that is tiny, but effort in general.
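To make the scale gap explicit, a quick sketch using the figures quoted above (the dollar amounts are the comment's; the $900 billion midpoint and the ratio calculation are my own simplification):

```python
# Dollar figures from the comment, in billions of dollars.
ai_safety_low, ai_safety_high = 0.1, 0.2
us_military = 900        # rough midpoint of the $800-$1,000 billion range
world_gdp = 100_000

print(f"AI safety vs US military budget: "
      f"{ai_safety_low / us_military:.4%} to {ai_safety_high / us_military:.4%}")
print(f"AI safety vs world GDP: "
      f"{ai_safety_low / world_gdp:.5%} to {ai_safety_high / world_gdp:.5%}")
```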
I'm more worried about a very sudden threat that destroys these systems in a single "strike," where the damage goes from 0% to 100% in one day, rather than one that gradually passes the point of no return.
But I may be wrong.
Edit: one form of point of no return is an AI that behaves more and more aligned even as it is secretly misaligned (as in the AI 2027 story).
danielechlin on Misinformation is the default, and information is the government telling you your tap water is safe to drink
The problem is the reception of reliable information, not the production of reliable information.
I've actually just wondered whether you need to move judgments of scientific veracity to some external, right-leaning institution, through something like betting on scientific markets or voting on replication experiments.
knight-lee on A collection of approaches to confronting doom, and my thoughts on them
I agree that it's useful in practice to anticipate the experiences of the future you that you can actually influence the most. It makes life much more intuitive and simple, and is a practical, fundamental assumption to make.
I don't think it is "supported by our experience," since if you experienced becoming someone else you wouldn't actually know it had happened; you would think you had been them all along.
I admit that although it's a subjective choice, it's useful. It's just that you're allowed to anticipate becoming anyone else when you die or otherwise cease to have influence.
ram-potham on Ram Potham's Shortform
I argue that the optimal ethical stance is to become a rational Bodhisattva: a synthesis of effective altruism, two‑level utilitarianism, and the Bodhisattva ideal.
A rational Bodhisattva combines the strengths of each and cancels out their weaknesses.
Illustration
Your grandparent needs $50,000 for a life‑saving treatment, but the same money could save ten strangers through a GiveWell charity.
Thus, the rational Bodhisattva unites rigorous impact with deep inner peace.