Introduction: Bias in Evaluating AGI X-Risks

post by Remmelt (remmelt-ellen), flandry19 · 2022-12-27

Contents

  About Forrest Landry
  Introduction
  A Partial List Of Affecting Bias...

The rationality community has a tradition of checking for biases, particularly when it comes to evaluating the non-intuitive risks of general AI.

We thought you might like this list, adapted from a 2015 essay by Forrest Landry[1]. Many of the bias names listed may already be familiar to you. If you "boggle" at the text a while longer, you may find curious new connections to the risks of upcoming AI developments.


About Forrest Landry

Forrest is a polymath working on civilisation design and on mitigating the risks of auto-scaling/catalysing technology (e.g. the Dark Fire scenario). About 15 years ago, he started researching how to build deep existential alignment into the internals of AGI, applying his understanding of programming, embodied ethics, and metaphysics. Then Forrest discovered the substrate-needs convergence argument (as distinct from, yet much enabled by, instrumental convergence). Unfortunately, because of substrate-needs convergence, any approach to aligning AI at the embedded level turned out to be unsound in practice (and moreover, inconsistent with physical theory). To inquire further, see this project.


Introduction

Note on unusual formatting: sentences are split into lines so you can parse each part precisely.


  Ideally, 
  in any individual or group decision making,
  there would be some means, processes, 
  and procedures in place to ensure that
  the kinds of distortions and inaccuracies
  introduced by individual and collective
  psychological and social bias
  do not lead to incorrect results,
  and thus poor (risk-prone) choices,
  with potentially catastrophic outcomes.

  While many types of bias
  are known to science
  and have been observed
  to be common to all people
  and all social groups the world over,
  in all working contexts, regardless
  of background, training, etc.,
  they are also largely unconscious,
  being 'built-in' by long-term
  evolutionary processes.

  These unconscious cognitive biases,
  while adaptive for the purposes
  of surviving in
  non-technological environments,
  do not serve us equally well
  when we attempt to survive our
  current technological contexts.

  The changes in our 
  commonly experienced world
  continue to occur far too fast 
  for our existing evolutionary
  and cognitive adaptations
  to adjust naturally.
  We will therefore need to
  add the necessary corrections to
  our thinking and choice-making processes,
  and thus to our own evolution, 'manually'.
  The hope is that
  these 'adjustments' might
  make it possible to mitigate,
  to the maximum extent possible,
  the distortions and inaccuracies
  introduced by the human condition.

  Bear in mind
  that biases do not
  just affect individuals –
  they also arise through specific
  interpersonal and trans-personal effects
  seen only in larger groups.[2]
  These bias aspects affect
  all of us, and in all sorts of ways,
  many of which are complex.
  It is important for everyone involved
  in critical decisions and projects
  to be aware of these general 
  and mutual concerns.
 

     We all run on corrupted hardware.
     Our minds are composed of many modules,
     and the modules that evolved to make 
     us seem impressive and gather allies
     are also evolved to subvert the ones 
     holding our conscious beliefs.

     Even when we believe that  
     we are working on something
     that may ultimately determine 
     the fate of humanity, our signaling 
     modules may hijack our goals so as 
     to optimize for persuading outsiders
     that we are working on the goal,
     instead of optimizing for
     achieving the goal.
 

  What is intended herein
  is to make some of these
  unconscious processes conscious,
  to provide a basis, and
  to identify the need,  
  for clear conversation 
  about these topics.
  Hopefully, as a result 
  of these conversations,
  and with the possibility 
  of a reasonable consensus reached,
  we will be able to identify (or create)
  a good general practice of decision making
  which, when implemented both
  individually and collectively
  (though perhaps not easily),
  can materially improve
  our mutual situation.

  The need for these practices
  of accuracy, precision, and correctness
  in decision making is especially
  acute in proportion to the degree
  that we all find ourselves faced
  with a seemingly ever-increasing
  number of situations for
  which our evolution has
  not yet prepared us.
  Where the true goal
  is making rational, realistic, 
  and reasonably good choices
  about matters that may
  potentially involve many people,
  larger groups and tribes, etc.,
  many specific and strong 
  cognitive and social biases
  will need to be compensated for.
 

      Particularly in regard
      to category 1 and 2 extinction risks,
      nothing less than complete and full
      compensation for all bias,
      and the complete application
      of correct reason,
      can be accepted.
 

  This sequence will not attempt 
  to outline or validate any of the 
  specific risk possibilities and outcomes
  for which there is significant concern
  (this is done elsewhere).
  Nor will it attempt to outline or define
  which means, processes, or procedures
  should be used for effective individual
  or group decision making.

  As with the 'general problem of governance',
  the main issue remains the identification,
  development, and testing/refinement of
  means and methods by which
  all bias can be compensated for,
  and a basis for clear reason
  thereby created.
  Hopefully this will  
  lead to real techniques of
  group decision making –
  and high quality decisions –
  that can be realistically defined, 
  outlined, and implemented.
 

  A Partial List Of Affecting Bias...

  The next posts cover a list
  of some of the known types of bias
  that have a significant and real potential to
  harmfully affect the accuracy and correctness
  of extinction risk assessments.

  Each bias will be given its
  common/accepted consensus name,
  along with relevant links to Wikipedia  
  articles with more details.[3]
  Each bias will be briefly described
  with particular regard to its potential impact
  on risk assessment in an existential context.[4]

  1. ^

      Some of the remarks and observations herein
      have been derived from content posted 
      to the website LessWrong.com –
      no claim of content originality by 
      this author is implied or intended.
      Content has been duplicated 
      and edited/expanded here for 
      informational and research purposes only.

  2. ^

      Nothing herein is intended
      to implicate or impugn any
      specific individual, group, or institution.
      The author has not specifically encountered
      these sorts of issues in regard to
      just one person or project.

      Most people are actually well-intentioned.
      Unfortunately,
      'good intentions' are not equivalent to
      (nor do they necessarily yield)
      'good results', particularly where
      the possibility of extinction risks
      is concerned.

  3. ^

      All of the descriptive notations regarding
      the specific characteristics of each bias
      have been derived from Wikipedia.

  4. ^

      These descriptions, explanations, and discussions
      are not intended to be comprehensive
      or authoritative – they are merely
      indicative, for the purposes of stimulating
      relevant/appropriate conversation.
