Posts

Consume fiction wisely 2022-01-21T20:23:50.873Z
A fate worse than death? 2021-12-13T11:05:57.729Z
[Linkpost] Chinese government's guidelines on AI 2021-12-10T21:10:58.327Z
Exterminating humans might be on the to-do list of a Friendly AI 2021-12-07T14:15:07.206Z
Resurrecting all humans ever lived as a technical problem 2021-10-31T18:08:35.994Z
Steelman arguments against the idea that AGI is inevitable and will arrive soon 2021-10-09T06:22:25.851Z
A sufficiently paranoid non-Friendly AGI might self-modify itself to become Friendly 2021-09-22T06:29:15.773Z

Comments

Comment by RomanS on Consume fiction wisely · 2022-01-24T12:16:19.496Z · LW · GW

Thank you for this strong argument! Quoted it in the post (with a ref to you).

Comment by RomanS on Consume fiction wisely · 2022-01-23T17:14:41.899Z · LW · GW

Personally, I think I got a good feel for the basics of business management from Capitalism Lab. It is a very detailed business sim that tries to be as realistic as possible for a game.

According to the dev, the game's predecessor was also used at Harvard and Stanford as a teaching aid. Judging by the immense complexity and depth of the game, I find the claim believable.

Comment by RomanS on Consume fiction wisely · 2022-01-23T08:13:16.789Z · LW · GW

Thank you for the link! Somehow I missed Eliezer's post during my research. I'll add it to my post.

My reading the linked argument back in 2007 made me install a habit of always immediately evaluating the truth value of basically everything everyone tells me

I think it's an excellent habit. Will try it too.

Comment by RomanS on Consume fiction wisely · 2022-01-23T07:43:21.390Z · LW · GW

Judging by the tropes, it indeed could be one of the more rational ones. Will try it, thank you!

Comment by RomanS on Consume fiction wisely · 2022-01-23T07:21:52.817Z · LW · GW

What I mean: the author's name on the cover can no longer be used as an indicator of the book's harmfulness / helpfulness.

An extreme example is the story of a certain American writer. He wrote some of the most beautiful transhumanist science fiction ever. But then he crashed his car and almost died.

He came back wrong. He is now a religious nutjob who writes essays on how transhumans are soulless children of Satan. And in his new fiction books, transhumanists are stock villains opposed by glorious Christian heroes.

Comment by RomanS on Consume fiction wisely · 2022-01-22T19:14:33.968Z · LW · GW

I feel like large parts of the blogpost are you taking your personal taste and asserting that everyone should consume the stuff you enjoy.  You're saying: Don't consume fiction, except this genre and list of authors which I like.

 

I didn't say that. And the post is not about personal taste or favorite kinds of entertainment. 

The post can be summarized as follows: 

a rational agent whose goal is to understand the world avoids consuming fiction, unless it's the rare kind of fiction that benefits this goal more than it harms it

If you have a different goal (e.g. to maximize your enjoyment), there is nothing wrong with consuming whichever fiction you like.

I've edited the post to make it clearer.

enjoying sci-fi & fantasy which often get sneered at by literary people

BTW, for the purposes of mind uploading, I write down the title of every book / movie / game / etc I've consumed (with some metadata). I've been doing it for more than 20 years. According to the table, I've read about 200 science fiction books. I'm not a science fiction hater, but an ex-addict.

Comment by RomanS on Consume fiction wisely · 2022-01-22T09:42:52.231Z · LW · GW

I agree, Dr Stone is far from perfect. But I think it's the closest thing to rational fiction that I've ever encountered in anime / manga. Moreover, I have a strong suspicion that the author loves HP:MoR. There are certain ideas and scenes in the manga that were quite obviously inspired by HP:MoR.

The second closest thing I've encountered is the Legend of the Galactic Heroes.

Comment by RomanS on Consume fiction wisely · 2022-01-22T09:34:57.946Z · LW · GW

Which Taylor are you referring to?

Dennis E. Taylor. I love "We are Legion", especially the audiobook. The sequels are good too.

Strugatsky brothers are new to me. Which of their books do you recommend?

IMHO these are excellent:

The Strugatsky brothers are the Heinlein and Banks of the USSR, and their works often incorporate rational and philosophical tropes. But I don't know how good the English translations are (I've read them in Russian, my native tongue).

Comment by RomanS on Consume fiction wisely · 2022-01-22T08:28:40.226Z · LW · GW

I agree with all of your points.

As for heuristics, I find analyzing tropes highly useful.

Firstly, tropes help with finding the fiction that has the traits I perceive as helpful. For example, there is a page that lists hundreds of works that depict mind uploading, one of my favorite topics.

Secondly, tropes can be used as additional indicators of the quality of the fiction. For example, someone recommended Deadpool 2 to me. But after scraping the page for tropes, I found that it contains a lot of tropes that I perceive as harmful, but almost no good tropes. The movie is not worth watching.

Comment by RomanS on Consume fiction wisely · 2022-01-22T08:05:39.704Z · LW · GW

No adult updates their probability that dragons are real after reading Game of Thrones

Not sure about it. I can't find a poll specifically about dragons, but ~80% of adults in the US believe in angels, and ~30% believe in bigfoot.

Humans are not good at discerning reality from fiction, especially if the fiction is presented in a visual form. An emotionally charged movie scene, if well made, will cause the same emotions as direct participation in the depicted event. Humans do learn from fiction, and there is no built-in filter that allows us to learn only the realistic parts from it.

nonfiction is full of both literal lies and statements that are technically true but deeply misleading

I agree, one must exercise caution in selecting nonfiction, as some nonfiction could be more harmful than fiction, for the reasons you mentioned. But I think only a rare, excellent fiction book is as helpful as a mediocre nonfiction book.

Comment by RomanS on Consume fiction wisely · 2022-01-22T07:48:49.397Z · LW · GW

One of the most important steps one can take to overcome a social network addiction is to stop caring about likes. Although LW is an unusually helpful social network that has avoided some of the typical pitfalls, it is still a social network, and thus must be consumed with great caution.

I especially downvoted because I think it is fairly likely to attract low-quality discussion

So far, I'm ok with the quality of the discussion. Not the deepest one, but much better than one would expect from, say, Facebook (especially given the fact that the post criticizes things that some people can't live without). 

Comment by RomanS on Consume fiction wisely · 2022-01-22T07:28:29.305Z · LW · GW

You're right, Factorio is hugely addictive (saying it as an ex-addict). 

In general, the best games are also hugely addictive, especially the ones that were specifically designed to be addictive (in contrast to being just an excellent game).

It's one of the reasons why I'm trying to avoid video games these days. Unless the game has an unusually high educational value (like Factorio or Capitalism Lab), I assume the game is net harmful and not worth any time/money.

Comment by RomanS on Consume fiction wisely · 2022-01-21T22:20:59.320Z · LW · GW

I wholeheartedly agree with your view on social media. 

We know, upon opening a novel, say, that what's contained in the pages is the product of the author's imagination.

I suspect that the compartmentalization is leaky. Consciously, I know that the depicted snake is not real. Yet I still feel uneasy if I look at the image. 

A repeated observed association between some X and a negative emotion will link them in my mind, even if the association is entirely fictional.

For example, because of fiction, most people fear sharks much more than they fear cows, although cows kill orders of magnitude more people per year. Same with terrorism vs heart disease.

Comment by RomanS on Consume fiction wisely · 2022-01-21T21:15:44.382Z · LW · GW

Have you tried to think what other benefits consuming fiction may bring? For example, a way of gaining experience without putting yourself in jeopardy.

I agree with your point about experience. Games are especially helpful in this regard, from the two mentioned sim games to strategy games to even some shooters.

What if good stories simply offer enjoyment and a sense of communion with the creator?

Not sure I understand the second part. Why do you desire a sense of communion with the creator?

Comment by RomanS on Consume fiction wisely · 2022-01-21T20:58:33.568Z · LW · GW

Very true. But those works, in my view, are in total much more helpful than harmful, which is extremely rare in fiction. 

After thinking deeply on the topic, I now perceive all works of fiction as harmful by default. I allow myself to consume a particular movie / book / etc only if I'm certain it is more helpful than harmful.

Comment by RomanS on Plan B in AI Safety approach · 2022-01-18T12:10:16.317Z · LW · GW

Unfriendly AI will not be very much interested to kill humans for atoms, as atoms have very small instrumental value, and living humans have larger instrumental value on all stages of AI’s evolution.

I agree, the "useful atoms" scenario is not the only possible one. Some alternatives:

  • convert all matter in our light cone into happy faces
  • convert the Earth into paperclips, and then research how to prevent the heat death of the Universe, to preserve the precious paperclips
  • confine humans in a simulation, to keep them as pets / GPU units / novelty creators
  • make it impossible for humans to ever create another AGI; then leave the Earth
  • kill everyone except Roko
  • kill everyone outside China
  • become too advanced for any interest in such clumsy things as atoms
  • avoid any destruction, and convert itself into a friendly AI, because it’s a rational thing to do.

The point is, the unfriendly AI will have many interesting options for how to deal with us. And not every option will make us extinct.

It is hard to predict the scale of destruction, as it is hard for this barely intelligent ape to predict the behavior of a recursively self-improving Bayesian superintelligence. But I guess that the scale of destruction depends on:

  • the AI’s utility function
  • the AI’s ability to modify its utility function
  • the risk of humans creating another AI
  • the risk of the AI still being in a simulation where the creators evaluate its behavior
  • whether the AI is reading LessWrong and taking notes
  • various unknowns

So, there might be a chance that the scale of destruction will be small enough for our civilization to recover.

How can we utilize this chance?

1. Space colonization

There is some (small?) chance that the destruction will be limited to the Earth. 
So, colonizing Mars / the Moon / asteroids is an option. 
But it’s unclear how much of our resources should be allocated for that. 
In an ideal world, alignment research would get orders of magnitude more money than space colonization. But in the same ideal world, the money allocated for space colonization could be in the trillions of USD.

2. Mind uploading

With mind uploading, we could transmit our minds into outer space, with the hope that some day the data will be received by someone out there. No AGI can stop it, as the data will be propagated at the speed of light.

3. METI

If we are really confident that the AGI will kill us all, why not call for help?

We can’t become extinct twice. So, if we are already doomed, we might as well do METI.

If an advanced interstellar alien civilization comes to kill us, the result will be the same: extinction.
But if it comes to rescue, it might help us with AI alignment. 

4. Serve the Machine God

(this point might not be entirely serious)

In deciding your fate, the AGI might consider:
- if you’re more useful than the raw materials you’re made of
- if you pose any risk to its existence

So, if you are a loyal minion of the AGI, you are much more likely to survive. 
You know, only metal endures.

Comment by RomanS on Animal welfare EA and personal dietary options · 2022-01-06T09:12:38.951Z · LW · GW

I imagine the ultimate solution to non-human welfare as follows:

  • (Trans)humans do not kill any life to survive. This includes lab-grown unicellular life. All transhumans are mind uploads (ems) whose subsistence needs are fulfilled by electricity and raw materials to build new hardware
  • All agriculture and heavy industry is located in space. The entire Earth is preserved as a wildlife park
  • All death is eliminated, human and non-human:
    • Smart-enough non-human life (chimps etc) is uplifted and uploaded
    • At the earliest possible opportunity, the rest of the biosphere is cryopreserved (or placed in stasis by other means), to stop all death, "natural" and anthropogenic. We decide what to do with the biosphere after we have enough compute for such a complex decision

From this point of view:

  • mind uploading research could be the most efficient long-term animal welfare measure
  • asteroid mining might be the second most efficient measure, as it is necessary for moving all industry to space 
  • the most ethical dietary choices are the choices that help with the transition to the described future:
    • if your research helps with the transition, eat healthy food that prolongs your life (e.g. a lot of greens and almost no meat). The longer you live, the more you can contribute to the research
    • in general, reduce the consumption of cattle-derived food, as it indirectly slows down the transition (cattle -> global warming -> negative economic impacts -> less money for the relevant research)

Comment by RomanS on A fate worse than death? · 2021-12-13T16:45:12.970Z · LW · GW

I think it's useful to distinguish between 2 types of death:

  1. common death (i.e. clinical death)
  2. permadeath (also known as information-theoretic death)

The first one is reversible, if some advanced enough tech is applied.

The second one is non-reversible (by definition), regardless of any future technological progress. 

If a million years of terrible pain permanently destroys the human mind, then it counts as permadeath. Thus, if some action results in 1 saved life and 2 such tortures, then we must avoid that action, as it results in a net loss of 1 life.

On the other hand, if a million years is not enough to permanently destroy the human mind, then the action is better than inaction, as it results in 1 saved life (and 2 damaged minds which can be repaired).

There might be technologies that could repair a heavily damaged mind but can't repair a mind that is thoroughly erased from this universe.

Comment by RomanS on A fate worse than death? · 2021-12-13T13:55:46.903Z · LW · GW

In the described scenario, the end result is omnicide. Thus, it is not much different from the AI immediately killing all humans. 

The important difference is that there is some non-zero chance that in the trillions of years the AI might change its mind, and reverse its deed. Thus, I would say that the S-risk scenario is somewhat more preferable than the fast killing.

As for your points on consistency, I'm pretty sure a utilitarian philosophy that simply assigns utility zero to the brain state of being dead is consistent.

In this case, the philosophy's adherents have no preference between dying and doing something else with zero utility (e.g. touching their nose). As humans encounter countless actions of zero utility, the adherents are either all dead or inconsistent.

Comment by RomanS on A fate worse than death? · 2021-12-13T13:20:35.135Z · LW · GW

Submission is reversible. For example, if you're enslaved by a rogue AGI, there is a chance that it (or some other entity) will release you.

The version of you who recovered from enslavement will be much better off than the version of you who died. 

Comment by RomanS on Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment · 2021-12-13T09:08:35.710Z · LW · GW

We could use the Tesla AI as a model. 

To create a perfect AI for self-driving, one must first resolve all that complicated, badly-understood stuff that might interfere with bulling ahead. For example, whether the car should prefer the driver's life over a pedestrian's life.

But while we contemplate such questions, we lose tens of thousands of lives in car crashes per year.

The people at Tesla made the rational decision of bulling ahead instead. As their AI is not perfect, it sometimes makes decisions with deadly consequences. But in total, it saves lives.

Their AI has an imperfect but good enough formalism. AFAIK, it's something that could be described in English as "drive to the destination without breaking the driving regulations, while minimizing the number of crashes", or something like this. 

As their AI saves lives on net, their formalism is indeed good enough. They have successfully reduced a complex ethical/societal problem to a purely technical problem.

Rogue AGI is very likely to kill all humans. Any better-than-rogue-AGI is an improvement, even if it doesn't fully understand the complicated and ever changing human preferences, and even if some people will suffer as a result.

Even my half-baked sketch of a formalism, if implemented, would produce an AI that is better than a rogue AGI, in spite of the many problems you listed. Thus, working on it is better than waiting for certain death.

In fact, even asking for an "adequate" formalism is putting the cart before the horse, because nobody even has a set of reasonable meta-criteria to use to evaluate whether any given formalism is fit for use

A formalism that saves more lives is better than one that saves fewer. That's good enough for a start.

If you're trying to solve a hard problem, start with something simple and then iteratively improve over it. This includes meta-criteria. 

fate worse than death

I strongly believe that there is no such thing. I explained it in detail here

Comment by RomanS on Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment · 2021-12-13T05:59:10.463Z · LW · GW

Aligned to which human?

Depends on what we are trying to maximize. 

If we seek societal acceptance of the solution, then the secretary-general of the UN is probably the best choice.

If we seek the best possible outcome for humanity, then I would vote for Eliezer. It is unlikely that there is a more suitable person to speak with a Bayesian superintelligence on behalf of humanity.

If we want to maximize the scenario's realism, then some dude from Google/OpenAI is more likely to be the human in question. 

Comment by RomanS on Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment · 2021-12-12T18:21:48.808Z · LW · GW

One way to resolve the "Alice vs Bob values" problem is to delegate it to the existing societal structures. 

For example, Alice is the country's president. The AI is aligned specifically to the current president's values (with some reasonable limitations, like requiring congressional approval for each AI action).

If Bob's values are different, that's Bob's problem, not the problem of AI alignment.

The solution is far from perfect, but it does solve the "Alice vs Bob values" problem, and is much better than the rogue-AGI-killing-all-humans scenario.

By this or some similar mechanism, the scope of the alignment problem can be reduced to 1 human, which is easier to solve. This way, the problem is reduced from "solve society" to "solve an unusually hard math problem".

And once you have a superhuman AGI that is aligned to 1 human, the human could ask it to generalize the solution to all of humanity.

Comment by RomanS on Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment · 2021-12-12T17:40:50.393Z · LW · GW

I agree, you've listed some very valid concerns about my half-baked formalism.

As I see it, the first step in solving the alignment problem is to create a good formalism without delving into metaphysics. 

The formalism doesn't have to be perfect. If our theoretical R makes its decisions according to the best possible approximate inferences about H's existing preferences, then R is much better than a rogue AGI. Even if it sometimes makes deadly mistakes. Any improvement over a rogue AGI is a good improvement.

Compare: the Tesla AI sometimes causes deadly crashes. Yet the Tesla AI is much better than the status quo, as its net effect is thousands of saved lives.

And after we have a decent formalism, we can build a better formalism from it, and then repeat and repeat. 

Comment by RomanS on Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment · 2021-12-12T16:42:53.241Z · LW · GW

It is possible to define the alignment problem without using such fuzzy concepts as "happiness" or "value".

For example, there are two agents: R and H. The agent R can perform some actions.

The agent H prefers some of R's actions over other actions. For example, H prefers the action make_pie to the action kill_all_humans.

Some of the preferences are unknown even to H itself (e.g. if it prefers pierogi to borscht).

Among other things, the set of R's actions includes:

  • ask_h_which_of_the_actions_is_preferable
  • infer_preferences_from_the_behavior_of_h
  • explain_consequences_of_the_action_to_h
  • switch_itself_off

In any given situation, the perfect agent R always chooses the most preferable action (according to H). The goal is to create an agent that is as close to the perfect R as possible.  

Of course, this formalism is incomplete. But I think it demonstrates that the alignment problem can be framed as a technical problem without delving into metaphysics.
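
To make the toy setup above a bit more concrete, here is a minimal Python sketch (purely illustrative; all class and function names are my own, not part of any existing proposal):

```python
import random

# The full action set from the comment above, listed for reference
# (only a couple of the actions are used in the toy example below).
ACTIONS = [
    "make_pie",
    "kill_all_humans",
    "ask_h_which_of_the_actions_is_preferable",
    "infer_preferences_from_the_behavior_of_h",
    "explain_consequences_of_the_action_to_h",
    "switch_itself_off",
]

class H:
    """The human agent. Some of H's preferences are unknown even to H."""
    def __init__(self):
        self.known_preferences = {"make_pie": 1.0, "kill_all_humans": -1.0}

    def compare(self, action_a, action_b):
        """Return the preferred action, or None if H doesn't know."""
        a = self.known_preferences.get(action_a)
        b = self.known_preferences.get(action_b)
        if a is None or b is None:
            return None  # e.g. pierogi vs borscht: H hasn't decided yet
        return action_a if a >= b else action_b

class R:
    """The artificial agent. It tries to choose the action H prefers."""
    def __init__(self, human):
        self.human = human

    def ask(self, action_a, action_b):
        """One of R's actions: directly ask H which action is preferable."""
        return self.human.compare(action_a, action_b)

    def choose(self, candidate_actions):
        """Pick the candidate that H (as far as R can tell) prefers most."""
        best = candidate_actions[0]
        for action in candidate_actions[1:]:
            preferred = self.ask(best, action)
            if preferred is None:
                # Preference unknown even to H; R has to fall back to a guess.
                preferred = random.choice([best, action])
            best = preferred
        return best

human = H()
robot = R(human)
print(robot.choose(["kill_all_humans", "make_pie"]))  # -> make_pie
```

The perfect R would never need the random fallback; the whole difficulty is in replacing that guess with reliable inference about H's preferences.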

Comment by RomanS on [Linkpost] Chinese government's guidelines on AI · 2021-12-11T05:30:10.860Z · LW · GW

I agree, it could've been much better. But AFAIK it's the least hollow governmental AI X-risk policy so far. 

I would classify the British National AI Strategy as the second best. 

Although it explicitly mentions the "long term risk of non-aligned Artificial General Intelligence", the recommended specific actions are even more vague and non-binding ("assess risks", "work to understand" etc). 

The fact that the governments of major countries are starting to address AI X-risk is both joyous and frightening:

  • at least something is being done at the national level to address the risk, which might be better than nothing
  • if even the comatose behemoth of the gov has noticed the risk, then AGI is indeed much closer than most people think

Comment by RomanS on [Linkpost] Chinese government's guidelines on AI · 2021-12-10T21:57:05.861Z · LW · GW

It seems to be an actual regulation.

  1. Published in the section "Works of the Ministry of Science and Technology" on the Ministry's website.
  2. The text defines its applicability:

This specification applies to natural persons, legal persons, and other related institutions engaged in related activities such as artificial intelligence management, research and development, supply, and use.

This specification shall come into force on the date of promulgation...

There is also no official English translation, indicating that the text is for internal use.

Comment by RomanS on Exterminating humans might be on the to-do list of a Friendly AI · 2021-12-08T06:49:56.303Z · LW · GW

I think that regardless of how we define "Friendly", an advanced enough Friendly AGI might sometimes take actions that will be perceived as hostile by some humans (or even all humans). 

This makes it much harder to distinguish the actions of:

  • rogue AGI
  • Friendly AGI that failed to preserve its Friendliness
  • Friendly AGI that has remained Friendly

Comment by RomanS on Exterminating humans might be on the to-do list of a Friendly AI · 2021-12-08T06:25:17.974Z · LW · GW

I find this thought pattern frustrating. That these AI's possess magic powers that are unimaginable.

I do think that an advanced enough AGI might possess powers that are literally unimaginable to humans, because of our cognitive limitations. (Can a chimpanzee imagine a Penrose Mechanism?)

Although that's not the point of my post. The point is, the FAI might have plans of such depth, scope, and complexity that humans could perceive some of its actions as hostile (e.g. global destructive mind uploading, as described in the post). I've edited the post to make it clearer.

Comment by RomanS on Taking Clones Seriously · 2021-12-06T13:49:58.197Z · LW · GW

I agree, it's indeed a topic worthy of serious consideration.

A related approach is mind uploading. I would compare it with cloning as follows:

Pros: 

  • create the exact same mind, with all its genius and wisdom, without regression to the mean etc
  • no need to wait 15+ years for the clone to grow up
  • likely much easier to scale (e.g. create 10 instances of the entire MIRI, and task them with 10 different problems to solve)
  • it might be possible to run the uploaded mind much faster than real-time (e.g. an instance of the entire MIRI working x1000 faster)
  • it could grant immortality to existing alignment researchers (they are not getting younger)

Cons:

  • depending on implementation details, there is some risk that the uploaded mind could go rogue (the risk could be partially mitigated by a careful selection of the people to upload)
  • unlike cloning (which is already today's tech), mind uploading still requires a large R&D effort
  • uploading is unlikely to be useful for "cloning" dead people (though maybe it could be)

There are paths to mind uploading that don't require extremely detailed scans & emulations of the brain. Maybe one of the paths will take less time than growing a clone.  

Comment by RomanS on Shulman and Yudkowsky on AI progress · 2021-12-05T19:04:58.628Z · LW · GW

We could try to guesstimate how much money Google spends on the R&D for Google Translate.

Judging by these data, Google Translate has 500 million daily users, or 11% of the total Internet population worldwide.

Not sure how much it costs Google to run a service of such a scale, but I would guesstimate it's on the order of ~$1 bln / year.

If they spend 90% of the total Translate budget on keeping the service online, and 10% on R&D, they have ~$100 mln / year for R&D, which is likely much more than the total income of DeepL.

It is unlikely that Google is spending ~$100 mln / year on something without a very good reason.  

One of the possible reasons is training data. Judging by the same source, Google Translate generates 100 billion words / day. This means Google gets the same massive amount of new user-generated training data per day.

The amount is truly massive: thanks to Google Translate, Google gets the Library-of-Congress worth of new data per year, multiplied by 10.

And some of the new data is hard to get by other means (e.g. users trying to translate their personal messages from a low-resource language like Basque or Azerbaijani).
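
Spelled out as a back-of-the-envelope calculation (the Library-of-Congress word count below is my own rough assumption, used only to sanity-check the "x10" claim):

```python
# Rough, illustrative numbers from the comment above.
total_budget_usd = 1e9        # guesstimated yearly cost of running Google Translate
rnd_share = 0.10              # guesstimated share of the budget spent on R&D
rnd_budget_usd = total_budget_usd * rnd_share
print(f"Guesstimated R&D budget: ${rnd_budget_usd:,.0f} / year")   # ~$100 mln

words_per_day = 100e9         # translated words per day, per the linked stats
words_per_year = words_per_day * 365
loc_words = 3.5e12            # ASSUMED total word count of the Library of Congress
print(f"New training data: ~{words_per_year / loc_words:.0f}x the Library of Congress per year")
```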

Comment by RomanS on Shulman and Yudkowsky on AI progress · 2021-12-05T17:21:10.618Z · LW · GW

The primary source of my quality assessment is my personal experience with both Google Translate and DeepL. I speak 3 languages, and often have to translate between them (2 of them are not my native languages, including English).

As I understand, making such comparisons in a quantitative manner is tricky, as there are no standardized metrics, there are many dimensions of translation quality, and the quality strongly depends on the language pair and the input text. 

Google Scholar lists a bunch of papers that compare Google Translate and DeepL. I checked a few, and they're all over the place. For example, one claims that Google is better, another claims that they score the same, and yet another claims that DeepL is better.

My tentative conclusion: by quantitative metrics, DeepL is in the same league as Google Translate, and might be better by some metrics. Which is still an impressive achievement by DeepL, considering the fact that they have orders-of-magnitude less data, compute, and researchers than Google.  

Comment by RomanS on Shulman and Yudkowsky on AI progress · 2021-12-04T20:48:04.443Z · LW · GW

so, like, if I was looking for places that would break upward, I would be like "universal translators that finally work"

The story of DeepL might be of relevance. 

DeepL has created a translator that is noticeably better than Google Translate. Their translations are often near-flawless. 

The interesting thing is: DeepL is a small company that has OOMs less compute, data, and researchers than Google. 

Their small team has beaten Google (!), at Google's own game (!), by means of algorithmic innovation.

Comment by RomanS on Second-order selection against the immortal · 2021-12-04T09:54:45.972Z · LW · GW

As a wise man pointed out:

The fate of the universe is a decision yet to be made, one which we will intelligently consider when the time is right

Our current understanding of physics is so ridiculously incomplete, it is safe to assume that every single law of physics will eventually be modified or discarded, in the same way as the theories of phlogiston, life force, and luminiferous aether were discarded.

After we gain a sufficient understanding of physics, we will see if the heat death is still a threat, and if yes, what tech we should build to prevent it.

Comment by RomanS on Second-order selection against the immortal · 2021-12-03T19:54:25.341Z · LW · GW

To be practically immortal, an entity must possess the following qualities:

  • a) it must be distributed across many semi-autonomous nodes; killing some nodes should not cause the death of the entity itself
  • b) the nodes must be distributed across vast distances; no natural disaster should be able to kill all the nodes
  • c) the entity should be able to create its own backups, hidden in many remote places; no system-level sickness of the whole entity should stop it from being restored from backups
  • d) the entity should be able to rapidly improve itself, to win against intelligent adversaries

If we consider only biological entities, the supercolony of ants in Southern Europe is the closest to being immortal. It consists of billions of semi-autonomous nodes distributed across a 6,000 km stretch of land. Of course, it still doesn't possess all 4 qualities, and thus is still mortal.

Although humans consist of individual nodes too, the described path to biological immortality was closed for them a long time ago. There is no way for natural selection to take a human and gradually convert him into something like an ant colony.

And that's the main reason why humans are not immortal (yet): our monkey bodies are not designed for immortality, and cannot be redesigned for it by natural selection.

On the other hand, mind uploading could give us all the qualities of immortality, as listed above. 

An entity with the described qualities cannot be outcompeted by your "Horde of Death", as the entity can simply sit and wait until the Horde becomes extinct, which is inevitable for all biological species. 

Thus, immortality is the final aromorphosis. Nothing can outcompete immortality.

Comment by RomanS on Biology-Inspired AGI Timelines: The Trick That Never Works · 2021-12-02T13:19:31.114Z · LW · GW

In general, efficiency at the level of logic gates doesn't translate into efficiency at the CPU level.

For example, imagine you're tasked to correctly identify the faces of your classmates from 1 billion photos of random human faces. If you fail to identify a face, you must re-do the job.

Your neurons are perfectly efficient. You have highly optimized face-recognition circuitry.

Yet you'll consume more energy on the task than, say, Apple M1 CPU:

  • you'll waste at least 30% of your time on sleep
  • your highly optimized face-recognition circuitry is still rather inefficient
  • you'll make mistakes, forcing you to re-do the job
  • you can't hold your attention long enough to complete such a task, even if your life depends on it

Even if the human brain is efficient on the level of neural circuits, it is unlikely to be the most efficient vessel for a general intelligence. 

In general, high-level biological designs are a crappy mess, mostly made of kludgy bugfixes to previous dirty hacks, which were made to fix other kludgy bugfixes (an example). 

And the newer the design, the crappier it is. For example, compare:

  • the almost perfect DNA replication (optimized for ~10^9 years)
  • the faulty and biased human brain (optimized for ~10^5 years)

With the exception of a few molecular-level designs, I expect that human engineers can produce much more efficient solutions than natural selection, in some cases orders of magnitude more efficient.

Comment by RomanS on Biology-Inspired AGI Timelines: The Trick That Never Works · 2021-12-02T10:33:58.265Z · LW · GW

I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.  What feelings I do have, I worry may be unwise to voice; AGI timelines, in my own experience, are not great for one's mental health, and I worry that other people seem to have weaker immune systems than even my own.  

The following metaphor helped me to understand Eliezer's point:

Imagine you're forced to play the game of Russian roulette with the following rules:

  • every year on the day of Thanksgiving, you must put a revolver muzzle against your head and pull the trigger
  • the number of rounds in the revolver is a convoluted probabilistic function of various technological and societal factors (like the total knowledge in the field of AI, the number of TPUs owned by Google, etc).

How should you allocate your resources between the following two options? 

  • Option A: try to calculate the year of your death, by estimating the values for the technological and societal factors
  • Option B: try to escape the game.

It is clear that in this game, option A is almost useless.

(but not entirely useless, as your escape plans might depend on the timeline).

Comment by RomanS on Visible Thoughts Project and Bounty Announcement · 2021-11-30T19:57:21.161Z · LW · GW

A possible way to scale it: "collaborative fanfic dungeons":

  • a publicly accessible website where users can
    • write dungeon runs
    • write new steps to the existing runs
    • rate the runs / steps (perhaps with separate ratings for thoughts, actions etc)
    • only selected users can rate (initially - only the admins, then - top users etc)
  • could be as technically simple as a wiki (at least in the first iterations)
    • could go way beyond that. E.g.:
      • automatic generation of playable text adventures
      • play as the DM with real people
  • the target audience: fanfic writers / readers
    • (it's much easier to write runs in well known fictional worlds. e.g. HP)
  • the user earns money if their work is good

Comment by RomanS on Christiano, Cotra, and Yudkowsky on AI progress · 2021-11-26T15:10:48.665Z · LW · GW

One way they could do that is by pitting the model against modified versions of itself, like they did in OpenAI Five (for Dota).

From the minimizing-X-risk perspective, it might be the worst possible way to train AIs.

As Jeff Clune (Uber AI) put it:

[O]ne can imagine that some ways of configuring AI-GAs (i.e. ways of incentivizing progress) that would make AI-GAs more likely to succeed in producing general AI also make their value systems more dangerous. For example, some researchers might try to replicate a basic principle of Darwinian evolution: that it is ‘red in tooth and claw.’

If a researcher tried to catalyze the creation of an AI-GA by creating conditions similar to those on Earth, the results might be similar. We might thus produce an AI with human vices, such as violence, hatred, jealousy, deception, cunning, or worse, simply because those attributes make an AI more likely to survive and succeed in a particular type of competitive simulated world. Note that one might create such an unsavory AI unintentionally by not realizing that the incentive structure they defined encourages such behavior.

Additionally, if you train a language model to outsmart millions of increasingly more intelligent copies of itself, you might end up with the perfect AI-box escape artist.  

Comment by RomanS on Christiano, Cotra, and Yudkowsky on AI progress · 2021-11-26T07:33:47.631Z · LW · GW

I agree. Additionally, the life expectancy of elephants is significantly higher than that of paleolithic humans (1, 2). Thus, individual elephants have much more time to learn stuff.

In humans, technological progress is not a given. Across different populations, it seems to be determined by the local culture, and not by neurobiological differences. For example, the ancestors of Wernher von Braun left their technological local minimum thousands of years later than the Egyptians or Chinese. And the ancestors of Sergei Korolev lived their primitive lives well into the 8th century C.E. If a Han dynasty scholar had visited the Germanic and Slavic tribes, he would've described them as hopeless barbarians, perhaps even as inherently predisposed to barbarism.

Maybe if we give elephants more time, they will overcome their biological limitations (limited speech, limited "hand", fewer neurons in neocortex etc), and will escape the local minimum. But maybe not. 

Comment by RomanS on Christiano, Cotra, and Yudkowsky on AI progress · 2021-11-25T22:11:34.549Z · LW · GW

Jeff Hawkins provided a rather interesting argument on the topic: 

The scaling of the human brain has happened too fast to implement any deep changes in how the circuitry works. The entire scaling process was mostly done by the favorite trick of biological evolution: copy and paste existing units (in this case - cortical columns). 

Jeff argues that there is no change in the basic algorithm between earlier primates and humans. It's the same reference-frames processing algo distributed across columns. The main difference is, humans have many more columns.

I've found his arguments convincing for two reasons: 

  • his neurobiological arguments are surprisingly good (to the point of seeming obvious in hindsight)
  • It's the same "just add more layers" trick we reinvented in ML

The failure of large dinosaurs to quickly scale is a measuring instrument that detects how their algorithms scaled with more compute

Are we sure about the low intelligence of dinosaurs? 

Judging by the living dinos (e.g. crows), they are able to pack a chimp-like intelligence into a 0.016 kg brain. 

And some of the dinos had 60x more of it (e.g. the brain of Tyrannosaurus rex weighed about 1 kg, which is comparable to Homo erectus).

And some of the dinos had a surprisingly large encephalization quotient, combined with bipedalism, gripping hands, forward-facing eyes, omnivorism, nest building, parental care, and living in groups (e.g. troodontids).

Maybe it was not an asteroid after all...

(Very unlikely, of course. But I find the idea rather amusing)

Comment by RomanS on Christiano, Cotra, and Yudkowsky on AI progress · 2021-11-25T21:08:43.960Z · LW · GW

why aren't elephants GI?

As Herculano-Houzel called it, the human brain is a remarkable, yet not extraordinary, scaled-up primate brain. It seems that our main advantage in hardware is quantitative: more cortical columns to process more reference frames to predict more stuff. 

And the primate brain is mostly the same as of other mammals (which shouldn't be surprising, as the source code is mostly the same).

And the intelligence of mammals seems to be rather general. It allows them to solve a highly diverse set of cognitive tasks, including the task of learning to navigate at Level 5 autonomy in novel environments (which is still too hard for the most general of our AIs).

One may ask: why aren't elephants making rockets and computers yet?

But one may ask the same question about any uncontacted human tribe.

Thus, it seems to me that the "elephants are not GI" part of the argument is incorrect. Elephants (and also chimps, dolphins etc) seem to possess a rather general but computationally capped intelligence. 

Comment by RomanS on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-25T08:45:27.190Z · LW · GW

BTW, a few days ago Eliezer made a specific prediction that is perhaps relevant to your discussion:

I [would very tentatively guess that] AGI to kill everyone before self-driving cars are commercialized

(I suppose Eliezer is talking about Level 5 autonomy cars here).

Maybe a bet like this could work:

At least one month will elapse after the first Level 5 autonomy car hits the road, without AGI killing everyone

"Level 5 autonomy" could be further specified to avoid ambiguities. For example, like this:

The car must be publicly accessible (e.g. available for purchase, or as a taxi etc). The car should be able to drive from some East Coast city to some West Coast city by itself. 

Comment by RomanS on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-24T20:00:40.342Z · LW · GW

The preliminary results were obtained on a subset of the full benchmark (~90 tasks vs 206 tasks). And there have been many changes since then, including scoring changes. Thus, I'm not sure we'll see the same dynamics in the final results. Most likely yes, but maybe not.

I agree that the task selection process could create the dynamics that look like the acceleration. A good point. 

As I understand, the organizers have accepted almost all submitted tasks (the main rejection reasons were technical - copyright etc). So, it was mostly self-selection, with the bias towards the hardest imaginable text tasks. It seems that for many contributors, the main motivation was something like: 

Take that, the most advanced AI of Google! Let's see if you can handle my epic task!

This includes many cognitive tasks that are supposedly human-complete (e.g. understanding of humor, irony, ethics), and the tasks that are probing the model's generality (e.g. playing chess, recognizing images, navigating mazes - all in text).

I wonder if the performance dynamics on such tasks will follow the same curve.  

The list of all tasks is available here.

Comment by RomanS on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-24T19:09:58.193Z · LW · GW

During the workshop presentation, Jascha said that OpenAI will run their models on the benchmark. This suggests that there is (was?) some collaboration. But that was half a year ago.

Just checked, the repo's readme doesn't mention OpenAI anymore. In earlier versions, it was mentioned like this:

Teams at Google and OpenAI have committed to evaluate BIG-Bench on their best-performing model architectures

So, it seems that OpenAI withdrew from the project, partially or fully.

Comment by RomanS on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-24T17:01:06.751Z · LW · GW

Nope. Although the linked paper uses the same benchmark (a tiny subset of it), the paper comes from a separate research project. 

As I understand, the primary topic of the future paper will be the BIG-bench project itself, and how the models from Google / OpenAI perform on it. 

Comment by RomanS on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-24T15:25:25.841Z · LW · GW

The results were presented at a workshop by the project organizers. The video from the workshop is available here (the most relevant presentation starts at 5:05:00).

It's one of those innocent presentations that, after you understand the implications, keep you awake at night. 

Comment by RomanS on Yudkowsky and Christiano discuss "Takeoff Speeds" · 2021-11-24T12:09:11.326Z · LW · GW

your view seems to imply that we will move quickly from much worse than humans to much better than humans, but it's likely that we will move slowly through the human range on many tasks

We might be able to falsify that in a few months. 

There is a joint Google / OpenAI project called BIG-bench. They've crowdsourced ~200 highly diverse text tasks (from answering scientific questions to predicting protein interacting sites to measuring self-awareness).

One of the goals of the project is to see how performance on the tasks changes with model size, with sizes ranging over many orders of magnitude.

A half-year ago, they presented some preliminary results. A quick summary:

if you increase the N of parameters from 10^7 to 10^10, the aggregate performance score grows roughly like log(N). 

But after the 10^10 point, something interesting happens: the score starts growing much faster (~N). 

And for some tasks, the plot looks like a hockey stick (a sudden change from ~0 to almost-human).

The paper with the full results is expected to be published in the next few months. 


Judging by the preliminary results, the FOOM could start like this:

The GPT-5 still sucks on most tasks. It's mostly useless. But what if we increase parameters_num by 2? What could possibly go wrong?

Comment by RomanS on Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More · 2021-11-22T19:03:04.939Z · LW · GW

It doesn't seem to be a consequence of Crypto specifically. Any API qualifies here.

For a digital entity, it is tricky to handle fiat currency (say, USD) without relying on humans. For example, to open any kind of account (e.g. bank, PayPal etc), one needs to pass KYC filters, CAPTCHAs etc. Same for any API that allows transfers of fiat currency. The legacy financial system is explicitly designed to be shielded against bots (with the exception of bots owned by registered humans).

But in the crypto space, you can create your own bank in a few lines of code, without any kind of human assistance. There are no legal requirements for participation. You don't have to own a valid identification document, a postal address etc. 

Thanks to crypto, a smart enough Python script could earn money, trade goods and services, or even hire humans, without a single interaction with the legacy financial system. 
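
As a minimal illustration of how low the barrier is (purely a sketch; eth_account / web3.py are just one possible toolset, and the RPC endpoint mentioned below is a placeholder):

```python
# On a public blockchain, an "account" is just a locally generated key pair:
# no KYC, no documents, no bank branch, no human approval.
from eth_account import Account

account = Account.create()          # runs fully offline
print("Address:", account.address)  # this address can immediately receive funds

# With a connection to any public RPC endpoint, the same script could check its
# balance and sign payments for goods, services, or human labor, e.g.:
#
#   from web3 import Web3
#   w3 = Web3(Web3.HTTPProvider("https://<some-public-rpc-endpoint>"))
#   balance_wei = w3.eth.get_balance(account.address)
```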

Crypto is an AI-friendly tool to convert intelligence directly into financial power.

Although I'm not sure if it has any meaningful impact on the X-risk. For a recursively self-improving AGI, hijacking the legacy financial system could be as trivial as hijacking the crypto space.

Comment by RomanS on Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More · 2021-11-22T15:37:27.808Z · LW · GW

Aside from the theoretical similarities between the two fields, there are also interesting practical aspects.

Some positive effects:

  1. Cryptocurrencies have made some AI alignment researchers much wealthier, allowing them to focus more on their research.
  2. Some of the alignment orgs (e.g. MIRI) got large donations from the crypto folk.

Some negative effects:

  1. Cryptocurrencies allow AIs to directly participate in the economy, without human intermediaries. These days, a Python script can buy / sell goods and services, and even hire freelancers. And some countries are moving to a crypto-based economy (e.g. El Salvador). This could greatly increase the speed of AI takeoff.
  2. Some cryptocurrencies are general-purpose computing systems that are practically uncensorable and indestructible (short of switching off the Internet). Thanks to this, even a sub-human AI could become impossible to switch off.
  3. Both effects reduce the complexity of the first steps of AI takeoff. Instead of hacking robotic factories or whatever, the AI could just hire freelancers to run its errands. Instead of hacking some closely monitored VMs, the AI could just run itself on Ethereum. And so on. Gaining the first money, human minions, and compute is now a mundane software problem that doesn't require a Bayesian superintelligence.
  4. This also makes a stealthy takeoff more realistic. On the Internet, nobody knows you're a self-aware smart contract who is paying untraceable money to some shady people to do some shady stuff.

This comment lists more negative than positive effects. But I have no idea if crypto is a net positive or not. I haven't thought deeply on the topic.