Say that you’re an inhabitant of Flatland, but then you suddenly become aware that the world is actually three-dimensional and has a height dimension as well. That raises the question: how should the “forbidden” or “allowed” area be understood in this new three-dimensional world?
This is an interesting question. I know that you plan to suggest some ideas in your next post, but let me pre-empt that with some alternatives:
i) it’s underspecified: if your training set has n dimensions and your test set has n+1 dimensions, then you don’t have any infallible way to learn the relationship between that (n+1)th dimension and the labels.
ii) it’s underspecified, but we can guess anyway: try extending the walls up infinitely (i.e. projecting the (n+1)-dimensional space down to the familiar plane), make predictions for some of these points in (n+1)-dimensional space, and see how well those predictions hold up. Check whether this new dimension is correlated with any of the existing dimensions; if so, maybe you can collapse the collinear axes down to one and test the results again (see the sketch after this list). The problem with these kinds of suggestions is that they require you to have at least some labelled data in the higher-dimensional space in which to test your extrapolations.
iii) model some ideal reasoning agent discovering this new dimension, and behave as that agent would.
iv) find a preferable abstraction: in some Occam’s-Razor-esque sense, there should be a ‘simplest’ abstraction that can make sense of the new world without diverging too far from your existing priors. Which information-theoretic tool to use here is unclear to me. Would AIXI/Kolmogorov complexity make sense here?
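To make (ii) a bit more concrete, here is a minimal sketch (the data and names are hypothetical, and I’m using scikit-learn just for convenience): train an “allowed/forbidden” classifier on 2-D points, reuse it on 3-D points by simply dropping the height coordinate, and check how the new dimension correlates with the old ones.

```python
# Minimal sketch of option (ii), assuming a toy 2-D "Flatland" classifier
# that we try to reuse on 3-D points. All data here is made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Training data lives in 2 dimensions, labelled allowed (1) / forbidden (0).
X_train = rng.uniform(-1, 1, size=(500, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)  # toy "wall"

clf = LogisticRegression().fit(X_train, y_train)

# Test data has a new third dimension (height), with no labels.
X_test_3d = rng.uniform(-1, 1, size=(200, 3))

# "Extend the walls up infinitely": project the 3-D points down by dropping
# the height coordinate and reuse the 2-D classifier unchanged.
y_pred = clf.predict(X_test_3d[:, :2])

# Check whether the new dimension is correlated with the existing ones;
# if it is, a lower-dimensional representation may still capture it.
corr = np.corrcoef(X_test_3d, rowvar=False)
print("correlation of height with x and y:", corr[2, :2])

# The catch noted above: to actually *test* any of this, we'd need at least
# some labelled 3-D points, which by assumption we don't have.
```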
Anyhow, will be interested to see where this leads.
I think we have to be really careful when bringing AIXI/Kolmogorov complexity into the discussion. My strong intuition is that AIXI is precisely the type of alien intelligence we want to avoid here. Most programming languages have very short programs that behave in wild and complex ways when analyzed from a human perspective. To an AIXI mind built on those programming languages, this kind of behavior is by definition simple and preferred by Occam’s Razor. The generalizations to n+1 dimensions that this kind of intelligence makes are almost guaranteed to be incomprehensible and totally alien to a human mind.
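As a toy illustration of the “short program, wild behavior” point (my example, not anything from the post): Wolfram’s Rule 30 cellular automaton fits in a few lines, yet its output pattern is famously complex, so a program-length prior rates behavior like this as very simple.

```python
# A few lines of code whose output is notoriously complex (Rule 30):
# new cell = left XOR (center OR right), on a wrapped row of cells.
def rule30(width=64, steps=32):
    row = [0] * width
    row[width // 2] = 1  # single "on" cell in the middle
    for _ in range(steps):
        print("".join("#" if c else "." for c in row))
        row = [row[i - 1] ^ (row[i] | row[(i + 1) % width])
               for i in range(width)]

rule30()
```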
I’m not saying that information theory is a bad approach in itself. I’m just saying that we have to be really careful about what kind of primitives we give our AI to reason with. It’s important to remember that even though human concepts are only a constant factor away from concepts based on Turing machines, those constant factors are huge, and the behavior and reasoning of an information-theoretic mind are always exponentially dominated by what it finds simple in its own terms.
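The “constant factor” here is, as I understand it, the invariance theorem of Kolmogorov complexity: for any two universal machines U and V there is a constant c_{UV}, independent of the string x, such that

$$K_U(x) \le K_V(x) + c_{UV}.$$

Nothing stops c_{UV} from being astronomically large, which is exactly why “simple for the machine” and “simple in a human-like description language” can come apart on any particular x.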
Thanks for another interesting post.