Introduction: Bias in Evaluating AGI X-Risks
The rationality community has a tradition of checking for biases, particularly when it comes to evaluating the non-intuitive risks of general AI.
We thought you might like this list, adapted from a 2015 essay by Forrest Landry[1]. Many of the biases listed may already be familiar to you. If you take time to “boggle” at the text, you may find new connections relevant to evaluating the risks of upcoming AI developments.
About Forrest Landry
Forrest is a polymath working on civilisation design and on mitigating the risks of auto-scaling/catalysing technology (e.g. the Dark Fire scenario). About 15 years ago, he started researching how to build deep existential alignment into the internals of AGI, applying his deep understanding of programming, embodied ethics, and metaphysics. Then Forrest discovered the substrate-needs convergence argument (as distinct from, yet much enabled by, instrumental convergence). Unfortunately, because of substrate-needs convergence, any approach to aligning AI at the embedded level turned out to be unsound in practice (and, moreover, inconsistent with physical theory). To inquire further, see this project.
Introduction
Note on unusual formatting: Sentences are split into lines so you can parse parts precisely.
Ideally,
in any individual or group decision making,
there would be some means, processes,
and procedures in place to ensure that
the kinds of distortions and inaccuracies
introduced by individual and collective
psychological and social bias
do not lead to incorrect results,
and thus poor (risk-prone) choices,
with potentially catastrophic outcomes.
While many types of bias
are known to science
and have been observed
to be common to all people
and all social groups the world over,
in all working contexts,
regardless of background, training, etc.,
they are also largely unconscious,
being ‘built-in’ by long-term
evolutionary processes.
These unconscious cognitive biases,
while adaptive for the purposes
of surviving in
non-technological environments,
do not serve us equally well
when we attempt to survive in our
current technological contexts.
The changes in our
commonly experienced world
continue to occur far too fast
for our existing evolutionary
and cognitive adaptations
to adjust naturally.
We will therefore need to
add the necessary corrections to
our thinking and choice-making processes,
that is, to our own evolution, ‘manually’.
The hope is that
these ‘adjustments’ might
make it possible to mitigate
the distortions and inaccuracies
introduced by the human condition
to the maximum extent possible.
Bear in mind
that these biases
do not just affect individuals –
they also arise through specific
interpersonal and trans-personal effects
seen only in larger groups.[2]
These biases affect
all of us, in all sorts of ways,
many of which are complex.
It is important for everyone involved
in critical decisions and projects
to be aware of these general
and mutual concerns.
We all run on corrupted hardware.
Our minds are composed of many modules,
and the modules that evolved to make
us seem impressive and gather allies
also evolved to subvert the ones
holding our conscious beliefs.
Even when we believe that
we are working on something
that may ultimately determine
the fate of humanity, our signaling
modules may hijack our goals so as
to optimize for persuading outsiders
that we are working on the goal,
instead of optimizing for
achieving the goal.
What is intended herein
is to make some of these
unconscious processes conscious,
to provide a basis, and
to identify the need,
for clear conversation
about these topics.
Hopefully, as a result
of these conversations,
and with a reasonable
consensus reached,
we will be able to identify (or create)
a good general practice of decision making
which, when implemented both
individually and collectively
(though perhaps not easily),
can materially improve
our mutual situation.
The need for these practices
of accuracy, precision, and correctness
in decision making is especially
acute in proportion to the degree
that we all find ourselves faced
with a seemingly ever-increasing
number of situations for
which our evolution has
not yet prepared us.
Where the true goal
is making rational, realistic,
and reasonably good choices
about matters that may
potentially involve many people,
larger groups and tribes, etc.,
many specific and strong
cognitive and social biases
will need to be compensated for.
Particularly in regard
to category 1 and 2 extinction risks,
nothing less than complete and full
compensation for all bias,
and the complete application
of correct reason,
can be accepted.
This sequence will not attempt
to outline or validate any of the
specific risk possibilities and outcomes
for which there is significant concern
(this is done elsewhere).
Nor will it attempt to outline or define
which means, processes, or procedures
should be used for effective individual
or group decision making.
As with the ‘general problem of governance’,
the main issue remains the identification,
development, and testing/refining of
means and methods by which
all bias can be compensated for,
and a basis for clear reason
thereby created.
Hopefully this will
lead to real techniques of
group decision making –
and high-quality decisions –
that can be realistically defined,
outlined, and implemented.
A Partial List of Affecting Biases...
The next posts cover a list
of some of the known types of bias
that have a significant and real potential to
harmfully affect the accuracy and correctness
of extinction risk assessments.
Each bias will be given its
common/accepted consensus name,
along with relevant links to Wikipedia
articles with more details.[3]
Each bias will be briefly described
with particular regard to its potential impact
on risk assessment in an existential context.[4]
Some of the remarks and observations herein
have been derived from content posted
to the website LessWrong.com –
no claim of content originality by
this author is implied or intended.
Content has been duplicated
and edited/expanded here for
informational and research purposes only.
Nothing herein is intended
to implicate or impugn any
specific individual, group, or institution.
The author has not specifically encountered
these sorts of issues in regard to
just any one person or project.
Most people are actually well-intentioned.
Unfortunately,
‘good intentions’ are not equivalent to
(nor do they necessarily yield)
‘good results’, particularly where
the possibility of extinction risks
is concerned.
All of the descriptive notations regarding
the specific characteristics of each bias
have been derived from Wikipedia.
These descriptions, explanations, and discussions
are not intended to be comprehensive
or authoritative – they are merely
indicative for the purposes of stimulating
relevant/appropriate conversation.