LessWrong 2.0 Reader
I appreciate that AI 2027 named their model Safer-1, rather than Safe-1.
That's because they can read its thoughts like an open book.
gunnar_zarncke on eggsyntax's Shortform
I thought it would be good to have some examples where having a type signature in mind would be useful, so I asked ChatGPT. I think these are too wishy-washy, but together with the given explanations, they seem to make sense.
Would you say that this level of "having a type signature in mind" would count?
ChatGPT 4o suggesting examples
Phenomenon → (Theory, Mechanism)
Features → Label
These have different type signatures: a model that predicts well might not explain, yet people often conflate the two roles. Type signatures remind us that these are different input-output relationships. (A minimal code sketch of this pair appears below, after the source link.)
Action → Good/Bad
(State × Action) → (New State × Externalities)
People often act as if "this action is wrong" implies "we must ban it," but that only follows if the second signature supports the first. You can disagree about outcomes while agreeing on morals, or vice versa.
(Action × Impact) → Updated Mental Model
People often act as if the type signature is just Action → Judgment. That’s blame, not feedback. This reframing can help structure nonviolent communication.
(Goal × Constraints) → Best Action
Void → (Goal × Constraints × Ideas)
The creative act generates the very goal and constraints. Treating creative design like optimization prematurely can collapse valuable search space.
Speaker → (Concepts × StudentMemory)
(Student × Task × Environment) → Insight
If the type signature of insight requires active construction, then lecture-only formats may be inadequate. This helps justify pedagogy choices.
Source: https://chatgpt.com/share/67f836e2-1280-8001-a7ad-1ef1e2a7afa7
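To make the first pair of signatures concrete, here is a minimal sketch of how they might be written as actual type aliases in Python. The placeholder types (Phenomenon, Theory, Mechanism, Features, Label) are my own stand-ins, not anything from the linked chat.

```python
from typing import Callable, Tuple


# Hypothetical placeholder types, purely to make the signatures concrete.
class Phenomenon: ...
class Theory: ...
class Mechanism: ...
class Features: ...
class Label: ...


# Phenomenon -> (Theory, Mechanism): explaining returns a theory plus a mechanism.
Explain = Callable[[Phenomenon], Tuple[Theory, Mechanism]]

# Features -> Label: predicting just maps features to a label.
Predict = Callable[[Features], Label]

# A function of type Predict can score well on a benchmark without being usable
# as an Explain; writing out the aliases makes the conflation visible.
```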
max-harms on Thoughts on AI 2027
I think if there are 40-IQ humanoid creatures (even having been shaped somewhat by the genes of existing humans) running around in habitats, very excited and happy about what the AIs are doing, this counts as an existentially bad ending comparable to death. I think if everyone's brains are destructively scanned and stored on a hard drive that eventually decays in the year 1 billion, having never been run, this is effectively death. I could go on if it would be helpful.
Do you think these sorts of scenarios are worth describing as "everyone is effectively dead"?
I don't think AI personhood will be a mainstream cause area (i.e., most people will think it's weird or not true, similar to animal rights), but I do think there will be a vocal minority. I already know some people like this, and as capabilities progress and things get less controlled by the labs, I do think we'll see this become an important issue.
Want to make a bet? I'll take 1:1 odds that if we poll 200 people in mid-September 2027 on whether they think AIs are people, at least 3 of them will say "yes, and this is an important issue." (Other proposed options: "yes, but not important", "no", and "unsure".) Feel free to name a dollar amount and an arbitrator to use in case of disputes.
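As a rough way to see what that 3-of-200 threshold implies, here is a back-of-the-envelope sketch (mine, not the commenter's) that treats the poll as a binomial draw under a few assumed base rates; the base rates are purely illustrative.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


# Purely illustrative base rates for "yes, and this is an important issue".
for base_rate in (0.005, 0.01, 0.015, 0.02):
    print(f"base rate {base_rate:.1%}: "
          f"P(at least 3 of 200) = {prob_at_least(3, 200, base_rate):.2f}")
```

At even odds the bet is roughly break-even for the side betting "at least 3" wherever that probability crosses 50%.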
I came here to say "look at octopods!" but you already have. Yay team! :-)
One of the alignment strategies I have been researching in parallel with many others involves finding examples of human-and-animal benevolence and tracing convergent evolution therein, then proposing that the shared abstraction (across these genomes, these brains, these creatures all convergently doing these things) is probably algorithmically simple, with algorithm-to-reality shims that might also matter, and that we should study it and lean in the direction of doing "more of that".
There is an octopod cognate of oxytocin (the "maternal love and protection" hormone), but from what I can tell they did NOT reuse it in the ways that we did. They also mostly lay eggs and abandon the individual babies to their own survival, rather than raising children carefully.
By contrast, birds and mammals share a relatively similar kind of "high parental investment"!
frank-bellamy on Who wants to bet me $25k at 1:7 odds that there won't be an AI market crash in the next year?
Suggesting specific odds without being able to define a threshold seems a bit, um, confused. Being willing to take the word of a stranger on the internet when these quantities of money are at stake seems outright stupid. I'm staying out of this market. I suggest that you withdraw your offer.
knight-lee on Disempowerment spirals as a likely mechanism for existential catastrophe
My very uncertain opinion is that humanity may be very irrational and a little stupid, but it isn't that stupid.
The reason people do not take AI risk and other existential risks seriously is the complete lack of direct evidence (despite plenty of indirect evidence). It's easy for you to consider the risk obvious because of the curse of knowledge, but this kind of reasoning from first principles (nothing disproves the risk, therefore the risk is likely) is very hard for normal people to do.
Before the September 11th attacks, people didn't take airport security seriously because they lacked imagination about how things could go wrong. They treated worst-case outcomes as speculative fiction, however logically plausible, because "it had never happened before."
After the attacks, the government actually overreacted and created a massive amount of surveillance.
Once a threat starts doing real and serious damage to the systems that defend against threats, those systems do wake up and start fighting in earnest. They are like animals that react when attacked, not trees that can simply be chopped down.
Right now the effort against existential risk is extremely tiny. For example, AI safety spending is only $0.1 to $0.2 billion, while the US military budget is $800–$1,000 billion and world GDP is $100,000 billion ($25,000 billion in the US). It's not just spending that is tiny, but effort in general.
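To make the scale gap explicit, a quick sketch using the figures quoted above (the dollar amounts are the comment's; the $900 billion midpoint and the ratio calculation are my own simplification):

```python
# Dollar figures from the comment, in billions of dollars.
ai_safety_low, ai_safety_high = 0.1, 0.2
us_military = 900        # rough midpoint of the $800-$1,000 billion range
world_gdp = 100_000

print(f"AI safety vs US military budget: "
      f"{ai_safety_low / us_military:.4%} to {ai_safety_high / us_military:.4%}")
print(f"AI safety vs world GDP: "
      f"{ai_safety_low / world_gdp:.5%} to {ai_safety_high / world_gdp:.5%}")
```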
I'm more worried about a very sudden threat that destroys these systems in a single "strike," where the damage goes from 0% to 100% in one day, rather than one that gradually passes the point of no return.
But I may be wrong.
Edit: one form of point of no return is an AI that behaves more and more aligned even as it is secretly misaligned (as in the AI 2027 story).
danielechlin on Misinformation is the default, and information is the government telling you your tap water is safe to drink
The problem is the reception of reliable information, not the production of reliable information.
I've actually just wondered whether you need to move judgments of scientific veracity to some external, right-leaning institution, through something like betting on scientific markets or voting on replication experiments.
knight-lee on A collection of approaches to confronting doom, and my thoughts on them
I agree that it's useful in practice to anticipate the experiences of the future you that you can actually influence the most. It makes life much more intuitive and simple, and is a practical, fundamental assumption to make.
I don't think it is "supported by our experience," since if you experienced becoming someone else you wouldn't actually know it had happened; you would think you had been them all along.
I admit that although it's a subjective choice, it's useful. It's just that you're allowed to anticipate becoming anyone else when you die or otherwise cease to have influence.
ram-potham on Ram Potham's Shortform
I argue that the optimal ethical stance is to become a rational Bodhisattva: a synthesis of effective altruism, two‑level utilitarianism, and the Bodhisattva ideal.
A rational Bodhisattva combines the strengths of each and cancels out their weaknesses.
Illustration
Your grandparent needs $50,000 for a life‑saving treatment, but the same money could save ten strangers through a GiveWell charity.
Thus, the rational Bodhisattva unites rigorous impact with deep inner peace.