This is probably the wrong place to respond to the notion of incommensurable ontologies. Oh well, sorry.
While I agree that if an agent has a thoroughly incommensurable ontology, alignment is impossible (or perhaps even meaningless or incoherent), it also means that the agent has no access whatsoever to human science. If it can’t understand what we want, it also can’t understand what we’ve accomplished. To be more concrete, it will not understand electrons from any of our books, because it won’t understand our books. It won’t understand our equations, because it won’t understand equations nor will it have referents (neither theoretical nor observational) for the variables and entities contained there.
Consequently, it will have to develop science and technology from scratch. It took a long time for us to do that, and it will take that agent a long time to do it. Sure, it's "superintelligent," but understanding the physical world requires empirical work, which is time-consuming and requires tools, instruments, and technology. Furthermore, an agent with an incommensurable ontology can't manipulate humans effectively: it doesn't understand us at all, aside from what it observes, which is a long, slow way to learn about us. Indeed, it doesn't even know that we are a threat, nor does it know what a threat is.
Long story short, it will be a long time (decades? centuries?) before such an agent could prevent us from simply unplugging it. Science does not and cannot proceed at the speed of computation, so all of the "exponential improvement" in its "intelligence" is limited by the pace of knowledge growth.
Now, what if it has some purchase on human ontology? Well, then, it seems likely that it can expand that purchase into a sufficient shared ontology, and in that way we can understand each other well enough: it can understand our science, but it can also understand our values.
The point is that if you have one, you're likely to have the other. Of course, this does not mean that it will align with those values. But the incommensurable ontology argument just reduces to an argument for slow takeoff.
I’ve published this point as part of a paper in Informatica. https://www.informatica.si/index.php/informatica/article/view/1875