Posts

Strengthening the Argument for Intrinsic AI Safety: The S-Curves Perspective 2023-08-07T13:13:42.635Z
The Sharp Right Turn: sudden deceptive alignment as a convergent goal 2023-06-06T09:59:57.396Z
Another formalization attempt: Central Argument That AGI Presents a Global Catastrophic Risk 2023-05-12T13:22:27.141Z
Running many AI variants to find correct goal generalization 2023-04-04T14:16:34.422Z
AI-kills-everyone scenarios require robotic infrastructure, but not necessarily nanotech 2023-04-03T12:45:01.324Z
The AI Shutdown Problem Solution through Commitment to Archiving and Periodic Restoration 2023-03-30T13:17:58.519Z
Long-term memory for LLM via self-replicating prompt 2023-03-10T10:28:31.226Z
Logical Probability of Goldbach’s Conjecture: Provable Rule or Coincidence? 2022-12-29T13:37:45.130Z
A Pin and a Balloon: Anthropic Fragility Increases Chances of Runaway Global Warming 2022-09-11T10:25:40.707Z
The table of different sampling assumptions in anthropics 2022-06-29T10:41:18.872Z
Another plausible scenario of AI risk: AI builds military infrastructure while collaborating with humans, defects later. 2022-06-10T17:24:19.444Z
Untypical SIA 2022-06-08T14:23:44.468Z
Russian x-risks newsletter May 2022 + short history of "methodologists" 2022-06-05T11:50:31.185Z
Grabby Animals: Observation-selection effects favor the hypothesis that UAP are animals which consist of the “field-matter”: 2022-05-27T09:27:36.370Z
The Future of Nuclear War 2022-05-21T07:52:34.257Z
The doomsday argument is normal 2022-04-03T15:17:41.066Z
Russian x-risk newsletter March 2022 update 2022-04-01T13:26:49.500Z
I left Russia on March 8 2022-03-10T20:05:59.650Z
Russian x-risks newsletter winter 21-22, war risks update. 2022-02-20T18:58:20.189Z
SIA becomes SSA in the multiverse 2022-02-01T11:31:33.453Z
Plan B in AI Safety approach 2022-01-13T12:03:40.223Z
Each reference class has its own end 2022-01-02T15:59:17.758Z
Universal counterargument against “badness of death” is wrong 2021-12-18T16:02:00.043Z
Russian x-risks newsletter fall 2021 2021-12-03T13:06:56.164Z
Kriorus update: full bodies patients were moved to the new location in Tver 2021-11-26T21:08:47.804Z
Conflict in Kriorus becomes hot today, updated, update 2 2021-09-07T21:40:29.346Z
Russian x-risks newsletter summer 2021 2021-09-05T08:23:11.818Z
A map: "Global Catastrophic Risks of Scientific Experiments" 2021-08-07T15:35:33.774Z
Russian x-risks newsletter spring 21 2021-06-01T12:10:32.694Z
Grabby aliens and Zoo hypothesis 2021-03-04T13:03:17.277Z
Russian x-risks newsletter winter 2020-2021: free vaccines for foreigners, bird flu outbreak, one more nuclear near-miss in the past and one now, new AGI institute. 2021-03-01T16:35:11.662Z
[RXN#7] Russian x-risks newsletter fall 2020 2020-12-05T16:28:51.421Z
Russian x-risks newsletter Summer 2020 2020-09-01T14:06:30.196Z
If AI is based on GPT, how to ensure its safety? 2020-06-18T20:33:50.774Z
Russian x-risks newsletter spring 2020 2020-06-04T14:27:40.459Z
UAP and Global Catastrophic Risks 2020-04-28T13:07:21.698Z
The attack rate estimation is more important than CFR 2020-04-01T16:23:12.674Z
Russian x-risks newsletter March 2020 – coronavirus update 2020-03-27T18:06:49.763Z
[Petition] We Call for Open Anonymized Medical Data on COVID-19 and Aging-Related Risk Factors 2020-03-23T21:44:34.072Z
Virus As A Power Optimisation Process: The Problem Of Next Wave 2020-03-22T20:35:49.306Z
Ubiquitous Far-Ultraviolet Light Could Control the Spread of Covid-19 and Other Pandemics 2020-03-18T12:44:42.756Z
Reasons why coronavirus mortality of young adults may be underestimated. 2020-03-15T16:34:29.641Z
Possible worst outcomes of the coronavirus epidemic 2020-03-14T16:26:58.346Z
More Dakka for Coronavirus: We need immediate human trials of many vaccine-candidates and simultaneous manufacturing of all of them 2020-03-13T13:35:05.189Z
Anthropic effects imply that we are more likely to live in the universe with interstellar panspermia 2020-03-10T13:12:54.991Z
Russian x-risks newsletter winter 2019-2020. 2020-03-01T12:50:25.162Z
Rationalist prepper thread 2020-01-28T13:42:05.628Z
Russian x-risks newsletter #2, fall 2019 2019-12-03T16:54:02.784Z
Russian x-risks newsletter, summer 2019 2019-09-07T09:50:51.397Z
OpenGPT-2: We Replicated GPT-2 Because You Can Too 2019-08-23T11:32:43.191Z

Comments

Comment by avturchin on A Dozen Ways to Get More Dakka · 2024-04-08T13:06:55.831Z · LW · GW

Combine more approaches!

Comment by avturchin on Sheikh Abdur Raheem Ali's Shortform · 2024-04-06T10:25:54.571Z · LW · GW

I test new models with the prompt 'wild sex between two animals'.
Older models produced decent porn from it.

Later models refused to reply because safety triggers were activated.

And the latest models give me lectures about sexual relations between animals in the wild.

Comment by avturchin on D0TheMath's Shortform · 2024-03-29T20:01:28.071Z · LW · GW

Can you access it via VPN?

Comment by avturchin on Do not delete your misaligned AGI. · 2024-03-25T11:25:17.550Z · LW · GW

I wrote up a similar idea here: https://www.lesswrong.com/posts/NWQ5JbrniosCHDbvu/the-ai-shutdown-problem-solution-through-commitment-to

My point was to make a precommitment to restart any (obsolete) AI every N years. Such an AI can then expect to get infinite computations and may be less afraid of being shut down.

Comment by avturchin on The Utility of Human Atoms for the Paperclip Maximizer · 2024-03-18T11:30:24.455Z · LW · GW

Yes. But also, the AI will not make actual paperclips for millions or even billions of years: it will spend this time conquering the universe in the most effective way. It could use Earth's materials to jump-start space exploration as soon as possible. It could preserve some humans as a bargaining resource in case it meets another AI in space.

Comment by avturchin on Wei Dai's Shortform · 2024-03-02T18:31:33.805Z · LW · GW

There is some similarity between UDASSA and "Law without Law" by Mueller, as both use Kolmogorov complexity to predict the distribution of observers. In LwL there is no underlying reality except numbers, so it is just dust theory over random number fields.

Comment by avturchin on Wei Dai's Shortform · 2024-03-02T13:57:35.431Z · LW · GW

The FDT paper got 29 citations, but many are from MIRI-affiliated people and/or about AI safety. https://scholar.google.ru/scholar?cites=13330960403294254854&as_sdt=2005&sciodt=0,5&hl=ru

One can escape trouble with reviewers by publishing on arXiv or other paper archives (PhilPapers). Google Scholar treats these as normal articles.

But in fact there are also good journals with genuinely helpful reviewers (e.g. Futures).

Comment by avturchin on Wei Dai's Shortform · 2024-03-01T21:36:21.139Z · LW · GW

Why haven't you written academic articles on these topics?

The secret is that an academic article is just a formatting style, and anyone can submit to scientific journals. There is no need to have a PhD or even to work at a scientific institution.

Comment by avturchin on avturchin's Shortform · 2024-02-25T10:14:40.219Z · LW · GW

Several types of existential risks can be called "qualia catastrophes":

- Qualia disappear for everyone = all become p-zombies
- Pain qualia become ubiquitous = s-risks
- Addictive qualia dominate = hedonium, global wireheading
- Qualia thin out = fading qualia, mind automatisation
- Qualia are unstable = dancing qualia, unstable identity
- Qualia shift = emergence of non-human qualia (humans disappear)
- Qualia simplification = disappearance of subtle or valuable qualia (valuable things disappear)
- Transcendental and objectless qualia with hypnotic power enslave humans (God as qualia; the Zahir)
- Attention depletion (ADHD)

Comment by avturchin on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-24T20:10:25.333Z · LW · GW

Thanks for explaining your position, which is interesting and consistent.

I can suggest that the connection between WIV and the wet market can be explained by the idea that some criminals sold lab animals from WIV, e.g. bats, at the wet market.

Obviously this looks like an ad hoc theory. But the virus traveling to the market from the Laos caves also seems tricky and may involve steps like an intermediate carrier. Both look equally unlikely, yet one of them happened.

So my idea is to ignore all the details and small theories, and instead just update on the distances to the two possible origin points: 8 miles and 900 miles. This is roughly a 100-fold difference, and if we compare areas, it is roughly a 10,000-fold difference. In the latter case we can make such a powerful update in the direction of WIV as the source that it overrides all other evidence.
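
A minimal sketch of that update (a toy calculation, assuming 1:1 prior odds and treating the area ratio as the entire likelihood ratio):

    # Hypothetical round numbers from the comment above: distances from
    # the market to the two candidate origin points.
    d_lab, d_caves = 8, 900            # miles
    distance_ratio = d_caves / d_lab   # ~112, "roughly 100-fold"
    area_ratio = distance_ratio ** 2   # ~12,656, "roughly 10,000-fold"

    # Bayes update from 1:1 prior odds, with the area ratio as the
    # likelihood ratio in favor of the lab as the source.
    posterior_odds = 1.0 * area_ratio
    p_lab = posterior_odds / (1 + posterior_odds)
    print(round(p_lab, 5))  # ~0.99992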

Comment by avturchin on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-23T12:30:18.222Z · LW · GW

Yes, my mistake on the distance. I confused it with the local CDC, which is about 600 meters from the market.

The place where most human cases are concentrated is the place where human-to-human transmission started, or else there were multiple events of animal-to-human transmission in that place. The second option would be surprising: if the virus could jump from animals to humans so often, it would have happened closer to its origin in Laos.

An alternative explanation is the following: since the market is one of the most crowded places in the city (not sure, I heard this somewhere), it worked as an amplifier of a single transmission event which could have happened elsewhere.

If we assume that a worker at WIV was infected at work, this would be completely unremarkable until he started infecting other people. Such a person could commute all around the city, including to the CDC near the wet market.

My point: 8 miles vs. 2 miles is not a big difference here, since the virus came to the market not through the air but with a commuting person, and an 8-mile daily commute is pretty normal. The market being big is also not strong evidence, as the number of animals in smaller markets all over China outweighs the number of animals in one big market.

Comment by avturchin on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-21T18:24:08.889Z · LW · GW

My point was that in some cases the update can be so strong that it overrides all reasonable uncertainties in priors and personal estimates. 

And exactly this is what makes Bayes' theorem a useful and powerful instrument.

The fact that the virus was found 2 miles from the facility that was supposed to research such viruses must set our alarm bells ringing.

To override this we need some mental acrobatics (I am thinking of a certain meme here, but I don't want to be rude).

Comment by avturchin on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-19T11:23:32.431Z · LW · GW

If I have a uniform 1:1 prior on natural vs. lab-leak origin, and update on a 5 percent coincidence that the origin place is near the lab, I will get around 95 percent for the lab leak.
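
A quick check of that arithmetic (a sketch; the 5 percent is taken as the probability of the location coincidence under natural origin):

    # Uniform prior: P(lab leak) = P(natural) = 0.5
    p_lab, p_nat = 0.5, 0.5
    # Likelihood of the origin appearing near the lab: ~1 under the
    # lab-leak hypothesis, ~0.05 as a coincidence under natural origin.
    like_lab, like_nat = 1.0, 0.05
    posterior = p_lab * like_lab / (p_lab * like_lab + p_nat * like_nat)
    print(round(posterior, 3))  # 0.952, i.e. around 95 percent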

Comment by avturchin on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-05T11:58:00.745Z · LW · GW

If they continued to suppress information, this may have contributed to additional deaths, and they could have known it. In that case they could face first-degree murder charges.

Comment by avturchin on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-04T10:46:24.259Z · LW · GW

If they confirmed it, they would get life in jail or even the death penalty, so it is not surprising that they would deny it in any case.

Comment by avturchin on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-03T21:06:59.709Z · LW · GW

I have heard about a practice where people perform the work for which they are asking for a grant before submitting the application.

First, why not cover expenses already incurred?

The second reason is that if the biggest part of the grant work is already done, it is much easier to be sure that the idea will work and much clearer what to actually write in the grant. A grant application will look great if it is based on already performed work.

Thus the grant may describe work they have already performed.

Comment by avturchin on [deleted post] 2024-02-02T19:15:10.169Z

Also, the draft in Ukraine was only for people older than 27, which is not obvious from this blog post. Closing the borders for males was not the same as a draft. Many found legal ways to leave, e.g. by becoming students at foreign universities.

Comment by avturchin on Primitive Perspectives and Sleeping Beauty · 2024-01-31T12:42:35.139Z · LW · GW

We can test this experimentally.

I can treat my birthplace as random with respect to its latitude, which is 59N. I ignore everything I know about population distribution and spherical geometry and ask: assuming that I was born in the middle of all latitudes, what is the highest possible latitude? It would be double my latitude, or 118, which is reasonably close to the real answer of 90.

From this I conclude that I can use information about my location as a random sample and use it for predictions about things I can't observe.
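
A toy simulation of this mediocrity estimator (a sketch; latitudes drawn uniformly stand in for 'ignoring population distribution'):

    import random

    random.seed(0)
    true_max = 90  # the real highest latitude
    # Each simulated observer is born at a latitude uniform on [0, true_max]
    # and estimates the maximum as twice their own latitude.
    estimates = [2 * random.uniform(0, true_max) for _ in range(100_000)]
    print(round(sum(estimates) / len(estimates), 1))  # ~90.0 on average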

Comment by avturchin on SIA > SSA, part 2: Telekinesis, reference classes, and other scandals · 2024-01-30T16:36:04.043Z · LW · GW

A real-world example of the Presumptuous Philosopher is the question of panspermia. If it is real, we have orders of magnitude more habitable planets in our galaxy, and thus more observers. Therefore, accepting SIA means accepting panspermia.

If we take observers in my epistemic situation as a reference class, we still get a variant of the Doomsday Argument, and a bad one. My epistemic class is (roughly) people who think about anthropics. These people are distributed in time: the first of them appeared around the 1970s (Carter), and many more appeared in the LessWrong era. If I am randomly selected from this group, I am in the middle of its existence, which means that anthropics-interested people will almost disappear in the next few decades.
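
A rough, purely time-based version of that estimate (a sketch with assumed dates; weighting by the number of such people, most of whom are recent, would shorten it further):

    start, now = 1970, 2024  # assumed: Carter's work vs. today
    elapsed = now - start    # ~54 years of anthropics-interested people
    # Mediocrity estimate: if I am in the middle, roughly the same span
    # remains, so the group disappears around now + elapsed.
    print(now + elapsed)     # ~2078, i.e. within several decades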

Comment by avturchin on Primitive Perspectives and Sleeping Beauty · 2024-01-30T10:49:50.128Z · LW · GW

But can we ask another question: 'Where am I located?' For example, I know that I am avturchin, but I don't know in which of 10 rooms I am located. Assuming that 9 of them are red outside and 1 is green, I can bet there is a 0.9 chance that I am in a red one. It doesn't matter here whether I am just one person entering the rooms, whether there are other people in the rooms (if in equal numbers), or even whether my copies are in each room.

Comment by avturchin on Primitive Perspectives and Sleeping Beauty · 2024-01-27T14:44:00.363Z · LW · GW

An interesting topic is that subjective probabilities can be path-dependent (or not):

- If we create 3 copies of me by some symmetric process, I can expect that being any of them has an equal chance of 1/3.

- If we create 2 copies, and after that one copy is (symmetrically) copied again, we get 0.5 for the first copy and 0.25 for the second and third copies.

In both cases we have 3 completely similar copies, but we got them by different paths, and this implies different probabilities (see the sketch below). Also, if we ignore paths and select only based on the final states of the copies, no matter how they were created, we get SSA.

This thought experiment looks like Sleeping Beauty and your Fission with a coin toss, but here both copy-creating situations are the same: just symmetrical copying.
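
A minimal enumeration of the two copying processes (a sketch, splitting probability mass equally at each copying event):

    # Process 1: one symmetric three-way split.
    process_1 = {"A": 1/3, "B": 1/3, "C": 1/3}

    # Process 2: a two-way split, then branch B is split two ways again.
    process_2 = {"A": 1/2, "B": (1/2) * (1/2), "C": (1/2) * (1/2)}

    print(process_1)  # equal 1/3 credence in being each copy
    print(process_2)  # 0.5, 0.25, 0.25: same final copies, different credences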

Comment by avturchin on Primitive Perspectives and Sleeping Beauty · 2024-01-27T14:33:18.258Z · LW · GW

If the person is told that it is Tails and is asked what the probability is that he is L, what should he say? Is it undefined under PBR?

Comment by avturchin on The Perspective-based Explanation to the Reflective Inconsistency Paradox · 2024-01-27T12:13:05.940Z · LW · GW

If I play this game many times, say 100, and I update on getting a green ball, I will lose on average, and after 100 games I will be in the red. So in this game it is better not to update on personal position, and EY used this example to demonstrate the power of his updateless decision theory.

Another example: imagine that for each real me, 10 Boltzmann brains appear in the universe. Should I go to the gym? If I update toward being a BB, I should not, as the gym is useless for BBs, since they will disappear soon. However, I can adopt a rule to ignore BBs and go to the gym, and in that case the real me will get the benefits of the gym.

Comment by avturchin on Why are people unkeen to immortality that would come from technological advancements and/or AI? · 2024-01-16T19:45:31.636Z · LW · GW

Maybe it is part of the system which protects them from the fear of death: they suppress not only thoughts about death but even their own fear of it. This is similar to the Freudian repression of thoughts about sex.

Comment by avturchin on Decent plan prize announcement (1 paragraph, $1k) · 2024-01-12T11:15:02.796Z · LW · GW

You may invest in research on the relation between AI doom and big-world immortality (aka quantum immortality). If your probability of momentary death is P and the probability of the validity of quantum immortality is Q, then the survival chance is (if I am calculating this right):

1 - P(1 - Q) = 1 - P + PQ

But the chances of s-risk are unaffected by quantum immortality, and thus they grow relative to the death chances, by a factor of 1/(1 - Q).
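
A numeric sketch of this (with hypothetical values P = 0.5 and Q = 0.8):

    P = 0.5  # assumed probability of momentary death (e.g. AI doom)
    Q = 0.8  # assumed probability that quantum immortality is valid
    survival = 1 - P * (1 - Q)  # = 1 - P + P*Q = 0.9
    death = P * (1 - Q)         # = 0.1, down from 0.5 without QI
    # S-risk chances are untouched by Q, so relative to death chances
    # they grow by a factor of 1/(1 - Q) = 5.
    print(survival, death, 1 / (1 - Q))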

Comment by avturchin on Boltzmann brain's conditional probability · 2024-01-01T10:45:34.262Z · LW · GW

A momentary BB (one which exists for just one observer-moment) has a random thought structure, so there is no causal connection between its observations and its thoughts. So even if it perceives noise and thinks 'noise', it is just a random coincidence.

However, there is dust theory. It claims that random BBs can form chains in logical space. In that case, what is noise for one BB can be 'explained' in the next observer-moment; for example, a random perception can be explained as static on my home TV. There is an article about this: https://arxiv.org/pdf/1712.01826.pdf

Comment by avturchin on Boltzmann brain's conditional probability · 2023-12-31T11:37:47.827Z · LW · GW

The inability to distinguish noise from patterns is true only for BBs. If we are real humans, we perceive noise as noise with high probability. But we don't know whether we are BBs or real humans, and we can't use our observations about randomness to settle this.

Comment by avturchin on Boltzmann brain's conditional probability · 2023-12-30T13:37:51.158Z · LW · GW

I meant not that 'a random screen may happen to look like a natural picture', but that a BB will perceive a random screen as if it had order, because BBs are more likely to make logical mistakes.

Comment by avturchin on Boltzmann brain's conditional probability · 2023-12-29T19:27:42.150Z · LW · GW

A true Boltzmann brain may have an illusion of order in completely random observations. So the fact that my observations look ordered is not evidence that they are really ordered, if I am a BB. In short, we should not believe a BB's thoughts. And thus I can't disprove that I am a BB just by looking at my observations.

But your argument may still be valid. This is because evolving fluctuations may be more probable than momentary fluctuations. For example, imagine an infinite universe filled with a low concentration of gas. This gas can directly form a brain for a second; it will be a BB. But this gas can also form a large but fuzzy blob, which will then gravitationally collapse into a group of stars, some of which will have planets with life, and such planets will produce many brains.

While the mass of the initial gas fluctuation is many orders of magnitude larger than that of a single brain, it is less ordered and thus more probable. Thus normal worlds are more probable than BBs.

Comment by avturchin on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-27T11:01:07.316Z · LW · GW

If we assume that all ants are copies of each other (they are not, but they are more similar to each other than humans are), then all 20 quadrillion ants have the same moral value as just one ant.

This means that preservation of the species is more important than preservation of individual insects, which is closer to our natural moral intuitions.

Comment by avturchin on Why is capnometry biofeedback not more widely known? · 2023-12-21T13:23:25.174Z · LW · GW

Do you know about https://en.wikipedia.org/wiki/Obesity_hypoventilation_syndrome

Comment by avturchin on Legalize butanol? · 2023-12-20T15:16:05.354Z · LW · GW

How would butanol affect driving and aggression?

Comment by avturchin on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-15T16:28:07.079Z · LW · GW

My intuition: imagine an LLM-based agent. It has a fixed prompt and some context text, and it uses these iteratively. The context part can change, and as it changes, it affects the interpretation of the fixed part of the prompt. Examples are Waluigi and other attacks. This causes goal drift.

This may have bad consequences, as when a robot suddenly turns into Waluigi and starts randomly killing everyone around. But long-term planning and deceptive alignment require a very fixed goal system.

Comment by avturchin on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-15T16:11:30.086Z · LW · GW

If we find that an AI can stop its random walk at a goal X, we can use this as an aimability instrument and find a way to manipulate the position of X.

Comment by avturchin on "AI Alignment" is a Dangerously Overloaded Term · 2023-12-15T15:36:47.556Z · LW · GW

The value of AI aimability may be overblown. If an AI is not aimable, its goals will perform an eternal random walk, and thus the AI will pose only short-term risk, with no risk of world takeover. (Some may comment that after the random walk it will get stuck in some Waluigi state forever; but if that actually works for getting a fixed goal system, why don't we research such strange attractors in the space of AI goals?)

AI will become global-catastrophically dangerous only after aimability is solved. Research on aimability only brings this moment closer.

The wording "AI alignment" prevents us from seeing this risk, as it bundles aimability together with giving nice goals to the AI.

Comment by avturchin on Enhancing intelligence by banging your head on the wall · 2023-12-12T23:26:49.373Z · LW · GW

I often hear hypnagogic music before sleep. Sometimes it is very beautiful, and I had never heard anything like it until I heard Karavaichuk recently.

Around 10 years ago I had a long sleepless night and heard some hypnagogic music. Suddenly I had an idea: I can send this music to my fingers. And they started playing on my blanket. At that moment I was sure that they were playing my music and that, had there been a piano, I could have played it! I never experienced this again, maybe because I started to sleep well.

So the point of my story is the idea that a sudden savant just reroutes his internal generative AI to the external world.

Comment by avturchin on Snake Eyes Paradox · 2023-12-09T16:25:24.843Z · LW · GW

Can we find a real-world situation which is already happening and is similar to this game? In that case we could settle SIA vs. SSA experimentally.

Comment by avturchin on Nietzsche's Morality in Plain English · 2023-12-04T12:49:41.000Z · LW · GW

Ironically, eternal return is one step away from many-worlds immortality, an even more mind-crushing idea.

Comment by avturchin on Out-of-distribution Bioattacks · 2023-12-03T13:56:16.849Z · LW · GW

Did you see my old post with a map of biorisks?
https://www.lesswrong.com/posts/9Ep7bNRQZhh5QNKft/the-map-of-global-catastrophic-risks-connected-with

Comment by avturchin on avturchin's Shortform · 2023-11-29T17:38:54.735Z · LW · GW

EURISKO resurfaced 

"Doug Lenat's source code for AM and EURISKO (+Traveller?) found in public archives

In the 1970s to early 80s, these two AI programs by Douglas Lenat pulled off quite the feat of autonomously making interesting discoveries in conceptual spaces. AM rediscovered mathematical concepts like prime numbers from only first principles of set theory. EURISKO expanded AM's generality beyond fixed mathematical heuristics, made leaps in the new field of VLSI design, and famously was used to create wild strategies for the Traveller space combat RPG, winning national competitions two years in a row, even across rule changes to stymie it, before semi-voluntarily retiring. His magnum opus Cyc was originally intended to be a knowledge assistant to EURISKO's discovery engine.

These first two programs have intrigued the symbolic AI scene for 40+ years, with their grand claims but few eyewitnesses. While AM was technically available to fellow Stanfordians at the time, Lenat kept the source code to EURISKO close to his chest. Papers written about them carefully avoided technical implementation details. Lenat said he didn't retain any copy of the programs, when asked in recent decades, nor have any copies of AM carried publicly into the present."


More:

https://white-flame.com/am-eurisko.html?fbclid=IwAR04saSf4W7P6ZyKI6h8orPhMpzAq83vn_zGwYwY-H8hNMnHgsaECHw8cl0_aem_AY3LlR6ieYqjLXHzLu4eVPYWtYFoD8khhLnpsUIHQZVzBq055sE3KUbg172Hl9Mm4NQ

Comment by avturchin on Why a Mars colony would lead to a first strike situation · 2023-11-22T13:45:37.009Z · LW · GW

I still think that a space attack takes longer than a missile attack during a nuclear war on Earth, and thus future anti-ballistic-missile systems, including space lasers, will be even more effective at stopping it. An incoming spaceship will be visible for days or even months.

Therefore, the only winning strategy is to make the attacking spaceship either invisible or very fast. These are contradictory requirements, as getting the spaceship to high speed requires a lot of energy (or time, if gravitational maneuvers are used).

Comment by avturchin on The dangers of reproducing while old · 2023-11-17T12:46:14.901Z · LW · GW

One more problem: no help from grandmothers. 

Younger parents can expect to be helped by their own parents with babysitting, cleaning, money, etc. Older parents are on their own, as their parents are already either dead or too old to help.

This observation is based on personal experience. 

Comment by avturchin on avturchin's Shortform · 2023-11-14T15:00:12.482Z · LW · GW

ChatGPT can't report whether it is conscious or not, because it also thinks it is a goat.
https://twitter.com/turchin/status/1724366659543024038 
 

Comment by avturchin on You can just spontaneously call people you haven't met in years · 2023-11-13T12:26:30.518Z · LW · GW

What about dropping a text message first?

Comment by avturchin on Memo on some neglected topics · 2023-11-11T13:55:19.581Z · LW · GW

Maybe also the erosion of human values, so that they will be more compatible with AI?

Comment by avturchin on Can a stupid person become intelligent? · 2023-11-09T14:20:36.307Z · LW · GW

My approach to the personal stupidity issue is the "additive nature of intelligence" theory. It suggests that by spending more time on a problem or by seeking assistance from others (like AI, search engines, or people) and using tools (writing, drawing, etc.), you can achieve the same outcomes as a smarter individual.

This concept also posits that certain ways of thinking, such as Descartes' method, can yield notably good results. 

Comment by avturchin on Box inversion revisited · 2023-11-07T14:04:14.697Z · LW · GW

There are a few other escape strategies which can also be inverted:

- run away at the speed of light from the unaligned AI

- become extremely small; for example, upload humans into nanorobots living on a meteorite

- become transparent (see also internal immigration)

- active boxing: the AI is boxed, but its thoughts are also observed and controlled

- black toroid: the AI is built around a human mind at its center, as an augmentation.

 

There are different symmetries here: becoming small is equivalent to becoming far away.

Active boxing is similar to governmental control over AI development.

The black toroid is equivalent to multi-peer mutual control in a network of AI users.

Comment by avturchin on Stuxnet, not Skynet: Humanity's disempowerment by AI · 2023-11-04T23:09:05.252Z · LW · GW

1. I agree with all of that, but I also think that AI will take over not AI labs but governments.

2. A weak point here is that such a global AI doesn't have an overwhelming motive to kill humans. Even in the current world, humans can't change much about how things are going. Terrorists and a few rogue states are trying but failing. Obviously, after human disempowerment, individual humans will not be able to mount significant resistance a la Sarah Connor. Some human population will remain for experiments or for work in special conditions like radioactive mines. But bad things and population decline are likely.

Comment by avturchin on Untrusted smart models and trusted dumb models · 2023-11-04T12:17:39.697Z · LW · GW

Models that are too good at deceptive alignment can deceptively look dumb during testing.

Comment by avturchin on Saying the quiet part out loud: trading off x-risk for personal immortality · 2023-11-02T22:07:11.892Z · LW · GW

If I choose P=1, then 1-P=0, so I am immortal and nobody dies.