You have correctly identified the area in which we do not agree.
The most relevant knowledge in this case is of game theory and human behaviour. They also need to know that ‘friendliness is a very hard problem’. They then need to ask themselves the following question (a toy payoff model after the options below makes the game-theoretic stakes concrete):
What is likely to happen if people have the ability to create an AGI but do not have a proven mechanism for implementing friendliness? Is it:
1. Shelve the AGI, don’t share the research, and set to work on creating a framework for friendliness. Don’t rush the research; act as if the groundbreaking AGI work you just created was a mere toy problem and the only real challenge is friendliness. Spend an even longer period verifying the friendliness design, and never let on that you have AGI capabilities.
2. Something else.
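The game-theoretic point can be made concrete with a toy model. The sketch below is a minimal one-shot race game between two labs; every payoff number is an assumption invented for illustration, chosen only so that launching first beats waiting regardless of what the rival does. Under that single assumption the game has the shape of a prisoner’s dilemma: mutual shelving is better for everyone, but mutual deployment is the unique Nash equilibrium.

```python
# A toy payoff model of an AGI race between two labs, as a one-shot game.
# All numbers are illustrative assumptions, not estimates; the only structural
# claim encoded is that launching first beats waiting, whatever the rival does.

from itertools import product

ACTIONS = ("shelve", "deploy")

# payoffs[(my_action, rival_action)] -> my payoff (higher is better)
payoffs = {
    ("shelve", "shelve"): 3,  # both wait and work on friendliness: safest world
    ("shelve", "deploy"): 0,  # I wait; the rival launches an unproven AGI anyway
    ("deploy", "shelve"): 4,  # I launch first and capture the advantage
    ("deploy", "deploy"): 1,  # everyone rushes: worst odds for friendliness
}

def best_response(rival_action: str) -> str:
    """The action maximizing my payoff, given the rival's action."""
    return max(ACTIONS, key=lambda mine: payoffs[(mine, rival_action)])

def nash_equilibria():
    """Profiles where neither lab gains by deviating unilaterally."""
    return [
        (mine, rival)
        for mine, rival in product(ACTIONS, repeat=2)
        if best_response(rival) == mine and best_response(mine) == rival
    ]

print(nash_equilibria())  # [('deploy', 'deploy')] under these assumed numbers
```

Nothing here depends on the particular numbers, only on their ordering: so long as launching first dominates waiting, ‘shelve the AGI and quietly verify friendliness’ is individually noble but not the outcome the incentives select.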
“What are your reasons for believing that friendliness can be formalized practically, and an AGI based on that formalization built before any other sort of AGI?”
I don’t (with that phrasing). I actually suspect that the problem is too difficult to get right and far too easy to get wrong. We’re probably all going to die. However, I think we’re even more likely to die if some fool goes and invents an AGI before they have a proven theory of friendliness.