Agent, oracle and tool are not clearly differenciated. I question wether we should differenciate these types the way Bostrums does. Katja last week drew a 4-quadrant classification scheme with dimensions “goal-directedness” and “oversight”. Realisations of AI would be classified into sovereign|genie|autonomous tool|oracle(tool) by some arbitrarily defined thresholds.
I love her idea to introduce dimensions, but I think this entire classification scheme is not helpful for our control debate. AI realisations will have a multitude of dimensions. Tagging certain realisations with a classification title may help to explain dimensions by typified examples. We should not discuss safety of isolated castes. We do not have castes, we will have different kinds of AIs that will be different in their capabilities and their restrictions. The higher the capability, the more sophisticated restrictive measures must be.
On the dimension goal directedness: Bostrum seems to love the concept of final goal (German: “Endziel”). After achieving a final goal there is emptiness, nothing remains to be done. This concept that is foreign to evolution. Evolution is not about final goals. Evolution has an ethernal goal: survival. To survive it is neccessary to be fit enough to survive long enough to generate offspring and protect and train it long enough until it can protect itself. If grandparent generation is available they serve as backup for parent generation and further safeguard and source of experience for the young endangered offspring.
Instrumental goals in evolution are: Nutrition, looking for protection, learning, offspring generation, protecting, teaching. These instrumental goals are paired with senses, motivations and drives:
hunger/thirst, heat-sense/smelling/tasting/vision/hearing/fear, curiosity/playing, social behavior/sexuality, dominance behaviour/physical activity, teaching motivation.
All instrumental goals have to be met at least for a certain amount to achieve the ethernal goal: survival of species.
To define final goals as Bostrum points out on many occasions is dangerous and could lead to UFAI. To debate non-goal-directed types of AI is leading to nowhere. Non-goal-directed AI would do nothing else than thermodynamics: entropy will rise. To clarify our discussion we should state:
Any AGI has goal directedness. Number and complexity of goals will differ significantly.
Goals are fuzzy and can be contradictory. Partial solutions are acceptable for most goals.
Goal-directedness is a priority measure in a diversity of goals.
Any AGI has learning functionality.
Safe FAI will have repellent behavior towards dangerous actions or states. (Anti-goals or taboos)
Oversight over goals and taboos should be done by independent entities. (non-accessible to the AI)
Bostrum uses often goal and puts aside that we do not have to discuss about the end of the way but about the route and how to steer development if possible. A goal can be a “guiding star” if a higher entity knows it guides toward e.g. Bethlehem. Bostrums guiding star seems to be CE via FAI. Our knowledge about FAI is not advanced enough that we could formulate final goals or utility functions. Therefore I recommend not to focus our debate on diffuse final goal but on dimensions and gradients that point away from UFAI and towards controllability, transparency and friendliness.
Agent, oracle and tool are not clearly differenciated. I question wether we should differenciate these types the way Bostrums does. Katja last week drew a 4-quadrant classification scheme with dimensions “goal-directedness” and “oversight”. Realisations of AI would be classified into sovereign|genie|autonomous tool|oracle(tool) by some arbitrarily defined thresholds.
I love her idea to introduce dimensions, but I think this entire classification scheme is not helpful for our control debate. AI realisations will have a multitude of dimensions. Tagging certain realisations with a classification title may help to explain dimensions by typified examples. We should not discuss safety of isolated castes. We do not have castes, we will have different kinds of AIs that will be different in their capabilities and their restrictions. The higher the capability, the more sophisticated restrictive measures must be.
On the dimension goal directedness: Bostrum seems to love the concept of final goal (German: “Endziel”). After achieving a final goal there is emptiness, nothing remains to be done. This concept that is foreign to evolution. Evolution is not about final goals. Evolution has an ethernal goal: survival. To survive it is neccessary to be fit enough to survive long enough to generate offspring and protect and train it long enough until it can protect itself. If grandparent generation is available they serve as backup for parent generation and further safeguard and source of experience for the young endangered offspring.
Instrumental goals in evolution are: Nutrition, looking for protection, learning, offspring generation, protecting, teaching.
These instrumental goals are paired with senses, motivations and drives: hunger/thirst, heat-sense/smelling/tasting/vision/hearing/fear, curiosity/playing, social behavior/sexuality, dominance behaviour/physical activity, teaching motivation.
All instrumental goals have to be met at least for a certain amount to achieve the ethernal goal: survival of species.
To define final goals as Bostrum points out on many occasions is dangerous and could lead to UFAI. To debate non-goal-directed types of AI is leading to nowhere. Non-goal-directed AI would do nothing else than thermodynamics: entropy will rise. To clarify our discussion we should state:
Any AGI has goal directedness. Number and complexity of goals will differ significantly.
Goals are fuzzy and can be contradictory. Partial solutions are acceptable for most goals.
Goal-directedness is a priority measure in a diversity of goals.
Any AGI has learning functionality.
Safe FAI will have repellent behavior towards dangerous actions or states. (Anti-goals or taboos)
Oversight over goals and taboos should be done by independent entities. (non-accessible to the AI)
Bostrum uses often goal and puts aside that we do not have to discuss about the end of the way but about the route and how to steer development if possible. A goal can be a “guiding star” if a higher entity knows it guides toward e.g. Bethlehem. Bostrums guiding star seems to be CE via FAI. Our knowledge about FAI is not advanced enough that we could formulate final goals or utility functions. Therefore I recommend not to focus our debate on diffuse final goal but on dimensions and gradients that point away from UFAI and towards controllability, transparency and friendliness.