I don’t suggest you start hacking away at these fields until you have at least one sound ontology from which you can bootstrap by adding concepts in a coherent fashion (without becoming attached to that ontology or its implicit metaontology). Unfortunately, this branch of rationality has not been discussed much on Less Wrong. In the meantime, just reading a huge effing amount of diverse material and looking for interesting connections and patterns is probably the next best bet, though I’m not really sure. Reading the Wikipedia articles on all of the fields listed below seems like a decent place to start, though I give no assurance of quality.
As far as I know, nobody in the world is making a systematic effort to do Friendliness philosophy, though some have at least started to ask some fundamental questions: What is a preference? What is a decision? What is reality? We do not yet know how to get an AI to do that kind of philosophy for us, nor how to give the AI metaphilosophy. Meanwhile, none of these problems seem like obstacles for those whose accidental aim is unFriendly AI (uFAI). Just some food for thought.
Fields related to Friendliness philosophy
I just realized that my ‘Interests’ list on The Facebook made an okay ad hoc list of fields potentially related to Friendliness-like philosophy. They sorta kinda flow into each other by relatedness and are not particularly prioritized. This is my own incomplete list, though it was largely inspired by many conversations with Singularity Institute folk. Plus signs mean I’m moderately more certain that the existing field (or some rationalist re-interpretation of it) has useful insights.
Friendliness philosophy:
+Epistemology (formal, Bayesian, reflective, group)
Axiology
+Singularity (seed AI, universal AI drives, neuromorphic/emulation/de novo/kludge timelines, etc)
+Cosmology (Tegmark-like stuff, shake vigorously with decision theory, don’t get attached to ontologies/intuitions)
Physics (Quantum MWI, etc)
Metaphysics
+Ontology of agency (Yudkowsky (kind of), Parfit, Buddha; seemingly little good condensed material)
+Ontology (probably grounded in algorithmic information theory / theoretical computer science ideas)
Ontologyology (abstract Turing equivalence, et cetera)
+Metaphilosophy (teaching ourselves to teach an AI to do philosophy)
+Cognitive science (computational cognitive science especially)
Neuroscience (affective neuroscience)
Machine learning (reinforcement learners, Monte Carlo)
+Computer science (super theoretical)
+Algorithmic probability theory (algorithmic information theory, universal induction, etc)
+Decision theory (updateless-like)
Optimal control theory (stochastic, distributed; interestingly harder than it looks)
+Bayesian probability theory (for building intuitions, mostly, but generally useful)
Rationality
Dynamical systems (attractors, stability)
+Complex systems (multilevel selection, hierarchical stuff, convergent patterns / self-similarity)
Cybernetics (field kind of disintegrated AFAIK, complex systems took over)
Microeconomics (AGI negotiation stuff, human preference negotiation at different levels of organization)
+Meta-ethics (Bostrom)
Morality (Parfit)
Moral psychology
Evolutionary game theory
+Evolutionary psychology (where human preferences come from (although again, universal/convergent patterns))
+Evolutionary biology (how preferences evolve, convergent features, etc)
Evolutionary developmental biology
Dual inheritance theory (where preferences come from, different ontology and level of organization, see also memetics)
Computational sociology (how cultures’ preferences change over time)
Epidemiology (for getting intuitions about how beliefs/preferences (memes) spread)
Aesthetics (elegance, Occam-ness, useful across many domains)
Buddhism (Theravada, to a lesser extent Zen; basically rationality with a different ontology and more emphasis on understanding oneself/onenotself)
Jungian psychology (mostly archetypes)
Psychoanalysis (id/ego/super-ego, defense mechanisms)
Transpersonal psychology (Maslow’s hierarchy, convergent spiritual experiences, convergent superstimuli for reinforcement learners, etc)
Et cetera