I also think it is unlikely that AGIs will compete in human status games. Status games are not just about being the best: Deep Blue is not high status, and athletes who take drugs to improve their performance are not high status.
Status games have rules, and you only win if you do something impressive while competing within them. Being an AGI is likely to be seen as an unfair advantage, so AGIs will be banned from human status games, in the same way that current sports competitions are split by gender and weight.
Even if they are not banned, given their abilities they will be expected to do much better than humans; it will just be a normal thing, not a high-status, impressive thing.
Let me show you the ropes
There is a rope.
You hold one end.
I hold the other.
The rope is tight.
I pull on it.
How long until your end of the rope moves?
What matters is not how long until your end of the rope moves.
It's having fun sciencing it!
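If one does want to science it, here is a minimal back-of-the-envelope sketch (my own numbers and assumptions, not from the post), treating the pull as a longitudinal stress wave that travels along the rope at roughly sqrt(E / rho), so the far end moves after about L / c seconds rather than instantly.

```python
import math

# Assumed, made-up values for a nylon-ish rope.
E = 3e9       # Young's modulus, Pa
rho = 1150.0  # density, kg/m^3
L = 100.0     # rope length, m

# The tug propagates as a stress wave at the speed of sound in the material.
c = math.sqrt(E / rho)  # ~1600 m/s for these numbers
print(f"wave speed ~ {c:.0f} m/s, delay ~ {L / c * 1000:.1f} ms")
```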
For those interested in writing better trip reports, there is a "Guide to Writing Rigorous Reports of Exotic States of Consciousness" at https://qri.org/blog/rigorous-reports
A trip report is an especially hard case of something one can write about:
- English does not have a well-developed vocabulary for exotic states of consciousness
- even if we made up new words, they might not make much sense to people who have not experienced what they point at, just as it is hard to describe color to a blind person, or to project a high-dimensional object into a lower-dimensional space.
I have a similar intuition that if mirror-life is dangerous to Earth-life, then the mirror version of mirror-life (that is, Earth-life) should be about as dangerous to mirror-life as mirror-life is to Earth-life. Having only read this post, and in the absence of any evidence either way, this default intuition seems reasonable.
I find the post alarming and I really wish it had some numbers instead of words like "might" to back up the claims of threat. At the moment my uneducated mental model is that for mirror-life to be a danger it has to:
- find enough food that fits its chirality to survive
- not get killed by other life-forms
- be able to survive Earth's temperature, atmosphere, etc.
- enter our body
- bypass our immune system
- be a danger to us
Hmm, six ifs seems like a lot, so is it unlikely? In the absence of any odds it is hard to say.
The post would be more convincing and useful if it included a more detailed threat model, or some probabilities, or a simulation, or anything quantified.
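To illustrate what even a crude quantification could look like, here is a minimal sketch that simply multiplies made-up per-step probabilities for the six "ifs" above; the numbers are placeholders, not estimates of anything.

```python
# Crude conjunctive model with placeholder probabilities (not real estimates):
# if the six hurdles were independent, the overall chance is simply their product.
steps = {
    "finds food of the right chirality": 0.5,
    "not killed by other life-forms":    0.5,
    "survives Earth conditions":         0.5,
    "enters our body":                   0.5,
    "bypasses our immune system":        0.5,
    "is actually harmful to us":         0.5,
}

p_threat = 1.0
for step, p in steps.items():
    p_threat *= p

print(f"joint probability ~ {p_threat:.3f}")  # 0.5 ** 6 ~ 0.016 with these placeholders
```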
A last question: how many mirror molecules does an organism need to be mirror-life? Is one enough? Does it make any difference to its threat level?
2+2=5 Is Fine Maths: All You Need Is Coherence
[ Epistemic status: a thought I had while reading about Russell's paradox, rewritten and expanded on by Claude; my math level: undergraduate-ish ]
Introduction
Mathematics has faced several apparent "crises" throughout history that seemed to threaten its very foundations. However, these crises largely dissolve when we recognize a simple truth: mathematics consists of coherent systems designed for specific purposes, rather than a single universal "true" mathematics. This perspective shift—from seeing mathematics as the discovery of absolute truth to viewing it as the creation of coherent and sometimes useful logical systems—resolves many historical paradoxes and controversies.
The Key Insight
The only fundamental requirement for a mathematical system is internal coherence—it must operate according to consistent rules without contradicting itself. A system need not:
- Apply to every conceivable case
- Match physical reality
- Be the "one true" way to approach a problem
Just as a carpenter might choose different tools for different jobs, mathematicians can work with different systems depending on their needs. This insight resolves numerous historical "crises" in mathematics.
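As a toy illustration of this (my own example, not drawn from any particular source), consider arithmetic modulo 3: familiar sums come out differently, yet the system is perfectly coherent on its own terms.

```python
# A toy coherent system where familiar sums change: the integers modulo 3.
# Nothing here is "wrong"; the rules are simply different and self-consistent.
N = 3

def add(a, b):
    return (a + b) % N

print(add(2, 2))  # 1, not 4 -- perfectly fine inside this system

# Quick coherence check: addition is commutative and associative over Z/3Z.
elems = range(N)
assert all(add(a, b) == add(b, a) for a in elems for b in elems)
assert all(add(add(a, b), c) == add(a, add(b, c))
           for a in elems for b in elems for c in elems)
```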
Historical Examples
The Non-Euclidean Revelation
For two millennia, mathematicians struggled to prove Euclid's parallel postulate from his other axioms. The discovery that you could create perfectly consistent geometries where parallel lines behave differently initially seemed to threaten the foundations of geometry itself. How could there be multiple "true" geometries? The resolution? Different geometric systems serve different purposes:
- Euclidean geometry works perfectly for everyday human-scale calculations
- Spherical geometry proves invaluable for navigation on planetary surfaces
- Hyperbolic geometry finds applications in relativity theory
None of these systems is "more true" than the others—they're different tools for different jobs.
Russell's Paradox and Set Theory
Consider the set of all sets that don't contain themselves. Does this set contain itself? If it does, it shouldn't; if it doesn't, it should. This paradox seemed to threaten the foundations of set theory and logic itself.
The solution was elegantly simple: we don't need a set theory that can handle every conceivable set definition. Modern set theories (like ZFC) simply exclude problematic cases while remaining perfectly useful for mathematics. This isn't a weakness—it's a feature. A hammer doesn't need to be able to tighten screws to be an excellent hammer.
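For a concrete feel of why unrestricted self-reference breaks down, here is a small sketch of my own that models "sets" as Python predicates (only an analogy, not real set theory): Russell's set R, "everything that does not contain itself", cannot consistently answer whether it contains itself.

```python
# Model a "set" as a membership predicate: x is "in" S iff S(x) is True.
# Russell's set R contains exactly those sets that do not contain themselves.
R = lambda x: not x(x)

try:
    R(R)  # Does R contain itself? The definition just keeps unwinding.
except RecursionError:
    print("no consistent answer: R(R) = not R(R)")
```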
The Calculus Controversy
Early calculus used "infinitesimals"—infinitely small quantities—in ways that seemed logically questionable. Rather than this destroying calculus, mathematics evolved multiple rigorous frameworks:
- Standard analysis using limits
- Non-standard analysis with hyperreal numbers
- Smooth infinitesimal analysis
Each approach has its advantages for different applications, and all are internally coherent.
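For a flavor of how infinitesimal-style reasoning can be made rigorous (a sketch of my own, loosely in the spirit of these frameworks rather than an implementation of any of them): dual numbers add an element eps with eps * eps = 0, and differentiation becomes ordinary arithmetic.

```python
# Dual numbers: pairs (a, b) representing a + b*eps, with eps*eps = 0.
class Dual:
    def __init__(self, real, eps=0.0):
        self.real, self.eps = real, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, because eps*eps = 0
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

def f(x):
    return x * x * x  # f(x) = x^3

x = Dual(2.0, 1.0)   # evaluate at 2 with an "infinitesimal" perturbation
print(f(x).eps)      # 12.0 == f'(2), with no explicit limits taken
```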
Implications for Modern Mathematics
This perspective—that mathematics consists of various coherent systems with different domains of applicability—aligns perfectly with modern mathematical practice. Mathematicians routinely work with different systems depending on their needs:
- A number theorist might work with different number systems
- A geometer might switch between different geometric frameworks
- A logician might use different logical systems
None of these choices imply that other options are "wrong"—just that they're less useful for the particular problem at hand.
The Parallel with Physics
This view of mathematics parallels modern physics, where seemingly incompatible theories (quantum mechanics and general relativity) can coexist because each is useful in its domain. We don't need a "theory of everything" to do useful physics, and we don't need a universal mathematics to do useful mathematics.
Conclusion
The recurring "crises" in mathematical foundations largely stem from an overly rigid view of what mathematics should be. By recognizing mathematics as a collection of coherent tools rather than a search for absolute truth, these crises dissolve into mere stepping stones in our understanding of mathematical systems.
Mathematics isn't about discovering the one true system—it's about creating useful systems that help us understand and manipulate abstract patterns. The only real requirement is internal coherence, and the main criterion for choosing between systems is their utility for the task at hand.
This perspective not only resolves historical controversies but also liberates us to create and explore new mathematical systems without worrying about whether they're "really true." The question isn't truth—it's coherence.
I really like the idea of milestones. I think seeing the result of each milestone will help create trust in the group, confidence that the end action will succeed, and a realization of the real impact the group has. Each CA should probably start with small milestones (posting something on social media) and ramp things up until the end goal is reached. Seeing actual impact early will definitely keep people engaged and might make the group more cohesive and ambitious.
Ditch old software tools or programming languages for newer, better ones.
My take on the tool vs. agent distinction:
- A tool runs a predefined algorithm whose outputs are in a narrow, well-understood and obviously safe space.
- An agent runs an algorithm that allows it to compose and execute its own algorithm (choose actions) to maximize its utility function (get closer to its goal). If the agent can compose enough actions from a large enough set, the output of the new algorithm is wildly unpredictable and potentially catastrophic.
This hints that we can build safe agents by carefully curating the set of actions an agent can choose from, so that any algorithm composed from that set produces an output in a safe space.
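A minimal sketch of what curating the action set could mean in practice (all names and numbers here are made up for illustration): the agent can search over plans however it likes, but every primitive it can actually execute comes from a whitelist whose outputs stay in a known, bounded range.

```python
from itertools import product

# Hypothetical whitelist: every primitive maps a bounded int to a bounded int,
# so any composition of them also stays within the same safe range [0, 100].
SAFE_ACTIONS = {
    "increment": lambda x: min(x + 1, 100),
    "decrement": lambda x: max(x - 1, 0),
    "halve":     lambda x: x // 2,
}

def run_plan(plan, state=0):
    """Execute a sequence of whitelisted action names; reject anything else."""
    for name in plan:
        if name not in SAFE_ACTIONS:
            raise ValueError(f"action {name!r} is not in the curated set")
        state = SAFE_ACTIONS[name](state)
    return state

# A toy "agent": brute-force search over 3-step plans for the one closest to a goal.
goal = 3
best = min(product(SAFE_ACTIONS, repeat=3), key=lambda p: abs(run_plan(p) - goal))
print(best, run_plan(best))
```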
I think being as honest as is reasonably sensible is good for oneself. Being honest applies pressure on oneself and one's environment until the two closely match. I expect the process to have its ups and downs, but to lead to a smoother life in the long run.
An example that comes to mind is the necessity to open up in order to have meaningful relationships (versus the alternative of concealing one's interests, which tends to make conversations boring).
Also, honesty seems like a requirement for having an accurate map of reality: snappy and accurate feedback is essential to good learning, but if one lies and distorts reality to accomplish one's goals, reality will send back distorted feedback, causing incorrect updates to one's beliefs.
On another note: this post immediately reminded me of the Buddhist concept of Right Speech, which might be worth investigating for further advice on how to practice this. A few quotes:
"Right speech, explained in negative terms, means avoiding four types of harmful speech: lies (words spoken with the intent of misrepresenting the truth); divisive speech (spoken with the intent of creating rifts between people); harsh speech (spoken with the intent of hurting another person's feelings); and idle chatter (spoken with no purposeful intent at all)."
"In positive terms, right speech means speaking in ways that are trustworthy, harmonious, comforting, and worth taking to heart. When you make a practice of these positive forms of right speech, your words become a gift to others. In response, other people will start listening more to what you say, and will be more likely to respond in kind. This gives you a sense of the power of your actions: the way you act in the present moment does shape the world of your experience."
Thanissaro Bhikkhu (source: https://www.accesstoinsight.org/lib/authors/thanissaro/speech.html)
I also thought about something along those lines: explaining the domestication of wolves into dogs, or of prehistoric wheat into modern wheat, then extrapolating to chimps. Then I had a dangerous thought: what would happen if we tried to select chimps for humaneness?
> goals appear only when you make rough generalizations from its behavior in limited cases.
I am surprised no one brought up the usual map / territory distinction. In this case the territory is the set of observed behaviors. Humans look at the territory and with their limited processing power they produce a compressed and lossy map, here called the goal.
The goal is a useful model to talk simply about the set of behaviors, but has no existence outside the head of people discussing it.
This is a great use case for AI: expert knowledge tailored precisely to one’s needs
Is the "cure cancer goal ends up as a nuke humanity action" hypothesis valid and backed by evidence?
My understanding is that the meaning of the "cure cancer" sentence can be represented as a point in a high-dimensional meaning space, which I expect to be pretty far from the "nuke humanity" point.
For example "cure cancer" would be highly associated with saving lots of lives and positive sentiments, while "nuke humanity" would have the exact opposite associations, positioning it far away from "cure cancer".
A good design might specify that if the two goals are sufficiently far away they are not interchangeable. This could be modeled in the AI as an exponential decrease of the reward based on the distance between the meaning of the goal and the meaning of the action.
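A rough numerical sketch of that decay idea (the embedding vectors below are made up purely for illustration; a real system would use learned embeddings): the reward falls off exponentially with the distance between the meaning of the goal and the meaning of the action.

```python
import numpy as np

# Made-up 4-dimensional "meaning" vectors, for illustration only.
goal_cure_cancer   = np.array([0.9, 0.8, 0.1, 0.0])    # saving lives, positive sentiment
action_run_trials  = np.array([0.8, 0.7, 0.2, 0.1])    # close to the goal's meaning
action_nuke_humans = np.array([-0.9, -0.8, 0.9, 0.9])  # far from the goal's meaning

def reward(goal_vec, action_vec, base=1.0, k=2.0):
    """Discount reward exponentially in the semantic distance between goal and action."""
    d = np.linalg.norm(goal_vec - action_vec)
    return base * np.exp(-k * d)

print(round(reward(goal_cure_cancer, action_run_trials), 3))   # stays relatively high
print(round(reward(goal_cure_cancer, action_nuke_humans), 3))  # collapses toward zero
```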
Does this make any sense? (I have a feeling I might be mixing concepts coming from different types of AI)
If you know your belief isn't correlated to reality, how can you still believe it?
Interestingly, physics models (map) are wrong (inaccurate) and people know that but still use them all the time because they are good enough with respect to some goal.
Less accurate models can even be favored over more accurate ones to save on computing power or reduce complexity.
As long as the benefits outweigh the drawbacks, the correlation to reality is irrelevant.
Not sure how cleanly this maps to beliefs, since one would have to be able to go from one belief to another; however, it might be possible by successively activating different parts of the brain that hold different beliefs, in a way similar to someone very angry who completely switches gears to answer an important phone call.
@Eliezer, some interesting points in the article; I will criticize what frustrated me:
> If you see a beaver chewing a log, then you know what this thing-that-chews-through-logs looks like,
> and you will be able to recognize it on future occasions whether it is called a “beaver” or not.
> But if you acquire your beliefs about beavers by someone else telling you facts about “beavers,”
> you may not be able to recognize a beaver when you see one.
Things do not have intrinsic meaning; rather, meaning is an emergent property of things in relation to each other: for a brain, an image of a beaver and the sound "beaver" are just meaningless patterns of electrical signals.
Through experiencing reality the brain learns to associate patterns based on similarity, co-occurrence and so on, and labels these clusters with handles in order to communicate. 'Meaning' is the entire cluster itself, which in turn bears meaning in relation to other clusters.
If you try to single a node out of the cluster, you soon find that it loses all meaning and reverts to meaningless noise.
> G1071(G1072, G1073)
Maybe the above does not seem so dumb now? Experiencing reality is basically entering and updating relationships that eventually make sense as a whole in a system.
I feel there is a huge difference in our models of reality:
In my model everything is self-referential, just one big graph where nodes barely exist (they are only aliases for the whole graph itself). There is no ground to knowledge, nothing ultimate. The only thing we have is this self-referential map, from which we infer a non-phenomenological territory.
You seem to think the territory contains beavers; I claim beavers exist only in the map, as a block arbitrarily carved out of our phenomenological experience by our brain, as if this were the only way to carve a concept out of experience rather than one of infinitely many valid ways (e.g. considering the beaver together with the air around it, and not having a concept for just a beaver with no air), and as if a part of experience could be considered without being impacted by the whole of experience (i.e. there is no living beaver without air).
This view is very much influenced by emptiness, by the way.
The examples seem to assume that "and" and "or" as used in natural language work the same way as their logical counterparts. I think this is not the case, and that it could bias the experiment's results.
As a trivial example, the question "Do you want to go to the beach or to the city?" is not just a yes-or-no question, as Boolean logic would have it.
Not everyone learns about Boolean logic, and those who do likely learn it long after learning how to talk, so it is likely that natural-language propositions that look somewhat logical are not interpreted as just logic problems.
I think that this is at play in the example about Russia. Say you are on holiday and are presented with one of these two statements:
1. "Going to the beach then to the city"
2. "Going to the city"
The second statement obviously means you are going only to the city, and not to the beach or anywhere else first.
Now back to Russia:
1. "Russia invades Poland, followed by suspension of diplomatic relations between the USA and the USSR”
2. “Suspension of diplomatic relations between the USA and the USSR”
Taken together, the 2nd proposition strongly implies that Russia did not invade Poland: after all, if Russia did invade Poland, no one would have written the 2nd proposition, because it would be the same as the 1st.
It also implies that there is no reason at all for suspending relations: the statements look like they were made by an objective know-it-all, and a reason is given in the 1st statement, so in that context it is reasonable to assume that if there were a reason for the 2nd statement it would also be given, and that the absence of further info means there is no reason.
Even when seeing only the 2nd proposition and not the 1st, it seems to me that humans have a need to attribute specific causes to effects (which might be a cognitive bias). Seeing no explanation for the event, it is natural to think "surely, there must be SOME reason, how likely is it that Russia suspends diplomatic relations for no reason?", but confronted with the fact that no reason is given, the probability of the event is lowered.
It seems that the proposition is not evaluated as pure Boolean logic, but perhaps parsed taking into account the broader social context, historical context and so on, which arguably makes more sense in real life.
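For reference, the pure-logic reading that the experiment assumes is just the conjunction rule, P(A and B) <= P(B). Here is a small brute-force check of that rule over random toy distributions (my own illustration, not part of the original experiment):

```python
from itertools import product
from random import random, seed

# Whatever the probabilities, P(invade AND suspend) can never exceed P(suspend).
seed(0)
outcomes = list(product([True, False], repeat=2))  # (invade?, suspend?)
for _ in range(1000):
    weights = [random() for _ in outcomes]
    total = sum(weights)
    p = {o: w / total for o, w in zip(outcomes, weights)}

    p_suspend = sum(v for (invade, suspend), v in p.items() if suspend)
    p_both = p[(True, True)]
    assert p_both <= p_suspend

print("conjunction rule held in all sampled distributions")
```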