This looks like a great list of risk factors leading to AI lethalities, of reasons why making AI safe is a hard problem, and of reasons why we are failing. But the post is also not what I would have expected from taking the title at face value. I thought it would be about detailed and credible scenarios of how AI could lead to extinction, where, for example, each scenario could represent a class of AI x-risks that we want to reduce. I suspect such an article would also be really helpful, because we have probably not been very good at generating detailed and credible scenarios of doom so far. There are info-hazard risks associated with that, for sure. I am also sympathetic to the argument that "AI does not think like you do" and that AI is likely to lead to doom in ways we cannot think of because of its massive strategic advantage.

Still, I think it could be very helpful to write some detailed and credible stories of doom, so that a larger part of the AI community takes extinction risks from AI really seriously and approaches AI capability research more like working in a high-security biohazard lab. Perverse incentives might still lead many people in the AI community not to take these concerns seriously. It is also true that there are some posts going in that direction, e.g. "What failure looks like" and "It looks like you're trying to take over the world", but I don't think we have done enough on that front, and that probably hinders our capacity to have x-risks from AI be taken seriously.
higher levels of complexity should increase our credence somewhat that consciousness-related computations are being performed
To nuance this a bit: while some amount of complexity might be necessary for consciousness, a system having high complexity does not necessarily imply more consciousness. So it is not clear how much we should update our credence that consciousness-related computations are being performed just because a system has higher complexity; this likely depends on the details of the system's cognitive architecture.
It might also be that systems that are very high in complexity have a harder time being conscious (because consciousness plausibly requires some integration/coordination within the system), so there might be a sweet spot of complexity for instantiating consciousness.
For example, consciousness could be roughly modeled as a kind of virtual reality that our brain computes. This "virtual world" is a sparse model of the complex world we live in: while the model is generated by a complex neural network, it is a sparse representation of a complex signal. The macroscopic information flow in the network that is relevant for detecting consciousness might therefore not be that complex, even though it "emerges" on top of a complex architecture.
The point is that consciousness is perhaps closer to some combination of complexity and sparsity than to complexity alone.
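To make the complexity/sparsity distinction a bit more concrete, here is a toy sketch (my own illustration, not something from the post): it uses a crude Lempel-Ziv-style phrase count as a stand-in for complexity and the fraction of inactive units as a stand-in for sparsity, applied to three hypothetical activity patterns. It is not a measure of consciousness, just a way to show that the two axes come apart: dense noise maximizes complexity with no sparsity, a silent pattern is sparse but trivial, and a sparse structured code is high on sparsity with moderate complexity.

```python
import random

def lz_complexity(s: str) -> int:
    """Crude Lempel-Ziv-style phrase count: parse s left to right, starting a
    new phrase whenever the current chunk has not appeared in the already-parsed
    prefix. A rough stand-in for 'signal complexity', illustrative only."""
    i, n, count = 0, len(s), 0
    while i < n:
        length = 1
        while i + length <= n and s[i:i + length] in s[:i]:
            length += 1
        count += 1
        i += length
    return count

def sparsity(bits: str) -> float:
    """Fraction of inactive ('0') units -- a crude proxy for sparseness."""
    return bits.count("0") / len(bits)

random.seed(0)
n = 2000
dense_noise = "".join(random.choice("01") for _ in range(n))                     # complex, not sparse
silent      = "0" * n                                                            # sparse, not complex
sparse_code = "".join("1" if random.random() < 0.05 else "0" for _ in range(n))  # sparse, moderately complex

for name, s in [("dense noise", dense_noise), ("silent", silent), ("sparse code", sparse_code)]:
    print(f"{name:12s} complexity={lz_complexity(s):4d}  sparsity={sparsity(s):.2f}")
```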
Slow Takeoffs (years, decades).
The optimal strategy will likely change substantially depending on whether takeoff happens over years or decades, so it might make sense to conceptually separate these time scales.
when in practice there will be an intermediate regime where the system {the humans building the AI + the AI}
It seems that we are already in this {H, AI} regime, and probably have been for as long as the system {H, AI} has existed (although with different growth dynamics and with a growing influence of AI helping humans).
The system {H, AI} is already self-improving, with humans using AI to make progress in AI. One question is, for example, whether {H, AI} will keep improving at an accelerating rate until AI can foom autonomously, or whether we will see diminishing returns and then some re-acceleration.
More specifically, models of how AI labor (researchers + engineers) and AI capability tech (compute, models, datasets, environments) interact to grow AI capability are the sort of models it could be useful to make more crisp and to check empirically.
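To sketch what such a model could look like, here is a minimal toy (my own construction, in the spirit of semi-endogenous growth models; every parameter value is hypothetical): capability A grows with effective research effort H + a*A, i.e. human labor plus AI assistance proportional to current capability, and an exponent controls how much harder each further capability gain is. Depending on the exponents, the {H, AI} loop either keeps accelerating or shows diminishing relative growth.

```python
# Toy model of the {humans + AI} loop (my own sketch; all parameter values hypothetical).
# Capability A grows with effective research effort E = H + a*A
# (human labor plus AI assistance proportional to current capability):
#     dA/dt = r * E**lam * A**phi
# lam + phi > 1  -> AI assistance compounds and relative growth keeps accelerating;
# lam + phi < 1  -> each further gain is harder and relative growth declines.

def simulate(lam: float, phi: float, steps: int = 200, dt: float = 0.1,
             H: float = 1.0, a: float = 0.1, r: float = 0.1, A0: float = 1.0):
    """Return the capability trajectory [A_0, A_1, ...] under the toy dynamics."""
    A, traj = A0, [A0]
    for _ in range(steps):
        effort = H + a * A                       # human labor plus AI assistance
        A = A + dt * r * effort**lam * A**phi    # explicit Euler step
        traj.append(A)
    return traj

accelerating = simulate(lam=1.0, phi=1.0)   # returns to capability do not diminish
diminishing  = simulate(lam=1.0, phi=-0.5)  # further capability gains get much harder

for name, traj in [("accelerating", accelerating), ("diminishing", diminishing)]:
    # fractional capability growth over one time unit, sampled along the run
    growth = [(traj[t + 10] - traj[t]) / traj[t] for t in range(0, len(traj) - 10, 50)]
    print(name, [round(g, 3) for g in growth])
```

Fitting something in this family to empirical data on AI R&D inputs and outputs would be one way to check which regime we are currently in.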