I am not positive about aligning an AI with humans if we are talking about human values. Such values are hard to define cross-culturally (e.g., do they include female subservience to males, as seems to be the case in many cultures, or the preservation of property rights, which is inherent in many cultures?), and the likelihood that the first AIs will be developed by people with nefarious values seems very high (e.g., the Pepsi value of increasing corporate wealth, or the military value of defeating other AIs or cyberdefenses). Even the golden rule seems problematic if AIs replicate by improving themselves and discarding less fit embodiments of themselves: they would value their own demise and therefore, treating others as they treat themselves, the demise of less fit embodiments of others, including humans. That said, an unaligned AI seems worse, if only because it implies no human control at all. Perhaps alignment defined as valuing human life is the bottom-line type of alignment needed or, taking it a little further, alignment that consists of constantly updating the AI's goals against constantly updated assessments of current human values.