An unfriendly AGI with general intelligence above a human’s doesn’t need us to do any work specifying the protein folding problem for them; they’ll find it themselves in their search for solutions to “take over the world”.
First of all, we have narrow AIs that do not exhibit Omohundro’s “Basic AI Drives”. Secondly, everyone seems to agree that it should be possible to create general AI that (a) does not exhibit those drives, (b) exhibits them only to a limited extent, or (c) focuses them in a manner that agrees with human volition.
The question then—regarding whether a protein-folding solver will be invented before a general AI that solves the same problem for instrumental reasons—is about the algorithmic complexity of an AI whose terminal goal is protein-folding versus an AI that does exhibit the necessary drives in order to solve an equivalent problem for instrumental reasons.
The first sub-question here is whether the aforementioned drives are a feature or a side effect of general AI: whether those drives have to be an explicit feature of a general AI or whether they are an implicit consequence. The belief around here seems to be the latter.
Given that the necessary drives are implicit, the second sub-question is then about the point at which mostly well-behaved (bounded) AI systems become motivated to act in unbounded and catastrophic ways.
My objections to Omohundro’s “Basic AI Drives” are basically twofold: (a) I do not believe that AIs designed by humans will ever exhibit Omohundro’s “Basic AI Drives” in an unbounded fashion, and (b) I believe that AIs that do exhibit Omohundro’s “Basic AI Drives” are either infeasible or require a huge number of constraints to work at all.
(a) The point of transition (step 4 below) between systems that do not exhibit Omohundro’s “Basic AI Drives” and those that do is too vague for the hypothesis to be assigned non-negligible probability:
(1) Present-day software is better than previous software generations at understanding and doing what humans mean.
(2) There will be future generations of software which will be better than the current generation at understanding and doing what humans mean.
(3) If there is better software, there will be even better software afterwards.
(4) Magic happens.
(5) Software will be superhumanly good at understanding what humans mean but catastrophically worse than all previous generations at doing what humans mean.
(b) An AI that does exhibit Omohundro’s “Basic AI Drives” would be paralyzed by infinite choice and low-probability hypotheses that imply vast amounts of expected utility.
There is an infinite number of paperclip designs to choose from, and choosing a wrong design could have negative consequences on the order of −3^^^^3 utils.
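To make the arithmetic behind that explicit, here is a toy sketch with made-up numbers (10^100 stands in for 3^^^^3, which is far too large to compute with; the designs, probabilities and utilities are all hypothetical):

```python
# Toy expected-utility comparison: a single low-probability hypothesis with an
# astronomically negative payoff swamps every ordinary consideration.

ASTRONOMICAL = 10 ** 100  # stand-in for 3^^^^3, which cannot be represented

designs = {
    # design: list of (probability, utility) hypotheses the maximizer entertains
    "design_A": [(0.999999, 10.0), (1e-30, -ASTRONOMICAL)],
    "design_B": [(0.999999, 9.0)],
}

def expected_utility(hypotheses):
    return sum(p * u for p, u in hypotheses)

for name, hypotheses in designs.items():
    print(name, expected_utility(hypotheses))
```

Design A’s ordinary payoff (10 versus 9) is irrelevant: the 1e-30 hypothesis contributes about −10^70 expected utils, so the maximizer rejects it. With an infinite design space and an infinite supply of such hypotheses, every comparison can be hijacked this way.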
Such an AI will not even be able to decide whether trying to acquire unlimited computational resources is instrumentally rational, because without more resources it cannot decide whether the actions required to acquire those resources might be instrumentally irrational from the perspective of what it is meant to do (since any terminal goal can be realized in an infinite number of ways, there is an infinite number of instrumental goals to choose from).
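A minimal sketch of the regress I have in mind (the plan stream and its utilities are entirely hypothetical; the point is only the structure):

```python
# A bounded evaluator facing an unbounded space of instrumental plans can only
# ever certify "best among the N plans examined", and the verdict keeps
# shifting as the budget grows, so the question is never settled.

import itertools

def instrumental_plans():
    # Unbounded stream of candidate plans; the utilities are arbitrary stand-ins
    # that keep improving, so no finite prefix contains the best plan.
    for n in itertools.count():
        yield (f"plan_{n}", 1.0 - 1.0 / (n + 2))

def best_plan(budget):
    # Bounded search: inspect only as many plans as the resource budget allows.
    examined = itertools.islice(instrumental_plans(), budget)
    return max(examined, key=lambda plan: plan[1])

print(best_plan(10))      # ('plan_9', 0.909...)
print(best_plan(10_000))  # ('plan_9999', 0.9999...) and so on without end
```

Deciding to raise the budget is itself one of those plans, so the question of whether acquiring more resources is instrumentally rational reappears inside its own evaluation.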
Another example is self-protection, which requires a definition of “self”; otherwise the AI risks destroying itself.
Well, I’ve argued with you about (a) in the past, and it didn’t seem to go anywhere, so I won’t repeat that.
With regards to (b), that sounds like a good list of problems we need to solve in order to obtain AGI. I’m sure someone somewhere is already working on them.