Posts
Comments
We know very little about Ancient Egypt: how they made things, or the provenance of their artefacts.
Can you explain low-resolution integers?
Another bad idea: why not use every possible alignment strategy at once (or many of them)? Presumably this would completely hobble the AGI, but with some interpretability you could find where the bottlenecks to behaviour are in the system and use it as a lab to figure out best options. Still a try-once strategy I guess, and maybe it precludes actually getting to AGI in the first place, since you can't really iterate on an AI that doesn't work.
Second one I just had that might be naive.
Glutted AI. Feed it almost maximum utils automatically anyway, so that it has a far shallower gradient between its current state and maximalist behaviour. If it's already got some kind of future discounting in effect, it might just do nothing except occasionally give out very good ideas, and be comfortable with us making slower progress as long as existential risk remains relatively low.
Semi tongue-in-cheek sci-fi suggestion.
Apparently the probability of a Carrington-like event/large coronal mass ejection is about 2% per decade, so maybe it's 2% for an extremely severe one every half century. If time from AGI to it leaving the planet is a half century, maybe 2% chance of the grid getting fried is enough of a risk that it keeps humans around for the time being. After that there might be less of an imperative for it to re-purpose the earth, and so we survive.
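For a rough sense of the arithmetic, here is a minimal sketch of how a per-decade probability compounds over a half century. It assumes each decade is independent with the same 2% chance, which is a simplification on my part, and the function name `prob_at_least_one` is made up for illustration. The 2% figure for an extremely severe event is still just the guess above, not something this derives.

```python
def prob_at_least_one(p_per_decade: float, decades: int) -> float:
    """Chance of at least one event over `decades` decades.

    Assumption: decades are independent and share the same probability.
    """
    return 1.0 - (1.0 - p_per_decade) ** decades


# 2% per decade for a Carrington-like event, compounded over five decades.
p_half_century = prob_at_least_one(0.02, 5)
print(f"{p_half_century:.3f}")  # prints 0.096
```

So under those assumptions, any Carrington-like event comes out at roughly 10% over fifty years, which leaves room for the extremely severe subset to sit somewhere around the 2% guessed above.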