Seems like there’s a potential solution to ELK-like problems: force the information to move from the AI’s ontology into (its model of) a human’s ontology, and then force it to move back again.
This gets around “basic” deception, since we can always compare the AI’s ontology before and after the round trip.
The open questions are how we force the knowledge to actually pass through the (modeled) human’s ontology, and how we know the forward and backward translators aren’t misbehaving in some way. A rough sketch of the round-trip check is below.
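A minimal sketch of the round-trip consistency idea, assuming the AI’s latent state can be treated as a vector and the two translators as small networks. All names, dimensions, and architectures here are hypothetical placeholders, not a proposal for what the real translators would look like:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for the AI's latent state and the (modeled) human ontology.
AI_DIM, HUMAN_DIM = 512, 64

# Stand-in forward translator: AI ontology -> human ontology.
forward_translator = nn.Sequential(
    nn.Linear(AI_DIM, HUMAN_DIM), nn.ReLU(), nn.Linear(HUMAN_DIM, HUMAN_DIM)
)

# Stand-in backward translator: human ontology -> AI ontology.
backward_translator = nn.Sequential(
    nn.Linear(HUMAN_DIM, AI_DIM), nn.ReLU(), nn.Linear(AI_DIM, AI_DIM)
)

def round_trip_consistency_loss(ai_state: torch.Tensor) -> torch.Tensor:
    """Push the AI's latent state through the human-ontology bottleneck and
    back, then penalize any difference from the original state.  A large gap
    suggests the translators dropped or distorted information."""
    human_view = forward_translator(ai_state)        # AI -> human ontology
    reconstructed = backward_translator(human_view)  # human ontology -> AI
    return nn.functional.mse_loss(reconstructed, ai_state)
```

Note this only measures whether the round trip preserves the latent state; by itself it doesn’t guarantee the intermediate representation is genuinely in the human ontology rather than an arbitrary encoding, which is exactly the open question above.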