I propose that we ought to have less faith in our ability to control AI or its worldview and place more effort into making sure that potential AIs exist in a sociopolitical environment where it is to their benefit not to destroy us.
This is probably the crux of our disagreement. If an AI is indeed powerful enough to wrest power from humanity, the catastrophic convergence conjecture implies that it will by default. And if the AI is indeed powerful enough to wrest power from humanity, I have difficulty envisioning anything we could offer it in trade that it couldn't just unilaterally obtain for itself more cheaply and efficiently.
As an intuition pump for this, I think the AI-human power differential will be more similar to the human-animal differential than to the company-human differential. In the latter case, the company actually relies on humans for continued support (something an AI that can roll out human-level AI will eventually not need to do) and thus has to maintain a level of trust. In the former case, well… people don't really negotiate with animals at all.