Thanks for the elaboration, looking forward to the next posts. :)
For another thing, out-of-control AGIs will have asymmetric advantages over good AGIs—like the ability to steal resources, to manipulate people and institutions via lying and disinformation; to cause wars, pandemics, blackouts, gray goo, and so on; and to not have to deal with coordination challenges across different (human) actors with different beliefs and goals. More on this topic here.
Is your claim that out-of-control AGIs will all-things-considered have an advantage? Because I expect the human environment to be very hostile towards AGIs that are not verified to be good, or that turn out to lie, cheat and steal, or act uncooperatively in other ways.
Thanks for asking, I just read the post and was also interested in other people's thoughts.
My thoughts while reading:
- Is the emergence of humans really a good example of a significantly discontinuous jump? I spontaneously imagined that the first humans weren't actually performing much better than other apes, and that it took a long period of cultural development before humans started clearly dominating by using their increased strategizing/planning/coordinating capabilities.
- Paul seemed unconvinced of the potential for major insights (or a "secret sauce") about how to design discontinuously superior AIs. He wondered about analogous examples where major insights led to significant technological advances. This is probably covered well by the AI Impacts project on discontinuous technological developments, which found 10 relatively clear instances; e.g., the bridge length discontinuity was "based on a new theory of bridge design".
- Regarding his argument for why recursive self-improvement doesn't lead to fast takeoff: "Summary of my response: Before there is AI that is great at self-improvement there will be AI that is mediocre at self-improvement." I had the thought that there might be a "capability overhang" regarding self-improvement, because the ML field might currently underrate the progress that can be had here and instead spend its time on other applications. I personally also find it plausible that a stable recursively self-improving architecture might be a candidate for a major insight that somebody might have someday.