Building off Chris’ suggestion about Kolmogorov complexity, what if we consider the Kolmogorov complexity of the thing we want knowledge about (e.g. the location of an object), given the ‘knowledge-containing’ thing (e.g. a piece of paper with the location coordinates written on it) as input?
Wikipedia tells me this is called the ‘conditional Kolmogorov complexity’ of x (the thing we want knowledge about) given r (the state of the region potentially containing knowledge), written K(x|r).
(Chris I’m not sure if I understood all of your comment, so maybe this is what you were already gesturing at.)
It seems like the problem you (Alex) see with mutual information as a metric for knowledge is that it doesn’t take into account how “useful and accessible” that information is. I’m guessing that by ‘useful’ you mean ‘able to be used’ (i.e. if the information was ‘not useful’ to an agent simply because the agent didn’t care about it, I’m guessing we wouldn’t want to say the knowledge isn’t there), so I’m going to take the liberty of saying “usable” here to capture the “useful and accessible” notion (but please correct me if I’m misunderstanding you).
I can see two ways that information can be less easily “usable” for a given agent:
1. Physical constraint: e.g. a map is locked in a safe so it’s hard for the agent to get to it, or the map is very far away.
2. Complexity: e.g. rather than a map, we have a whole bunch of readings from sensors on a go-kart that has driven around the area whose layout we want to know. This is less easily “usable” than a map, because we need a longer algorithm to extract the answers we want from it (e.g. “what road will this left turn take me to?”); see the toy sketch after this list. [EDIT: Though maybe I’m equivocating between Kolmogorov complexity and runtime complexity here?]

This second way of being less easily usable is what duck_master articulates in their comment (I think!).
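To make that contrast concrete, here’s a toy sketch (entirely my own illustration, with made-up names): extracting the answer from a map is a short program, while extracting it from raw sensor readings needs a long reconstruction step first.

```python
# From a map, the extraction program is short: a lookup.
road_map = {("High St", "left"): "Station Rd"}
print(road_map[("High St", "left")])  # -> Station Rd

# From raw go-kart sensor readings, answering the same question needs a much
# longer program: reconstruct the map first (pose estimation, road detection,
# ...), then do the lookup. The pipeline is only stubbed; its length is the point.
def reconstruct_map(sensor_readings):
    raise NotImplementedError("stands in for a long SLAM-style pipeline")
```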
It makes sense to me not to use sense 1 (physical constraint) in our definition of knowledge, because it seems like we want to say a map contains knowledge regardless of whether it is, as in your example from another post, at the bottom of the ocean.
So then we’re left with sense 2, for which we can use the conditional Kolmogorov complexity to make a metric.
To be more specific, perhaps we could say that for a variable X (e.g. the location of an object), and the state r of some physical region (e.g. a map), the knowledge which r contains about X is
K(x)−K(x|r)
where x is the value of the variable X.
This seems like the kind of thing that would already have a name, so I did some Googling, and yes, it looks like this is “absolute mutual information”, notated I_K(x, r).
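Since K is uncomputable this can’t be evaluated exactly, but here’s a minimal sketch of the idea, using zlib compressed length as a very crude stand-in for Kolmogorov complexity and the chain-rule approximation K(x|r) ≈ K(rx) − K(r). (The compression proxy and all the names here are my own illustration, not something from the post.)

```python
import zlib

def C(data: bytes) -> int:
    # Compressed length as a crude computable proxy for K(data).
    return len(zlib.compress(data, 9))

def cond_C(x: bytes, r: bytes) -> int:
    # Proxy for K(x|r) via the chain rule: K(x|r) ~ K(rx) - K(r).
    return max(C(r + x) - C(r), 0)

def absolute_mutual_information(x: bytes, r: bytes) -> int:
    # Proxy for I_K(x, r) = K(x) - K(x|r): how much r helps in describing x.
    return C(x) - cond_C(x, r)

# Toy example: a region r that literally contains the coordinates x.
x = b"object location: (51.5074, -0.1278)"
r = b"field notes ... object location: (51.5074, -0.1278) ... end"
print(absolute_mutual_information(x, r))        # clearly positive
print(absolute_mutual_information(x, b"zzzz"))  # near zero: r tells us nothing
```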
Choosing this way to define knowledge means we include cases where the knowledge is encoded by chance. For example, if someone draws a dot on a map at random and the dot coincidentally matches the position of an object, this metric would say the map now contains knowledge about the object’s position. I think this is a good thing: it means we can, for example, look at a rock that came in from outer space with an inscription on it and say whether it contains knowledge, without having to know about the causal process that produced the inscription. But if we wanted to include only cases where there’s a reliable correlation rather than chance, we could modify the metric (perhaps just modify it to the expected absolute mutual information, E(I_K(X, R))).
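To spell out that last modification (my own formalisation, assuming a joint distribution p over X and the region state R):

E(I_K(X, R)) = Σ_{x,r} p(x, r) · (K(x) − K(x|r))

i.e. we average the absolute mutual information over how X and R actually co-vary, so a single lucky dot placement stops counting once we take the expectation.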
P.S. I commented on another post in this sequence with a different idea last night, but I like this idea better :)
Hm, on reflection I actually don’t think this does what I thought it did. Specifically, I don’t think it captures the amount of ‘complexity barrier’ reducing the usability of the information. I think I was indeed equivocating between computational (space and time) complexity and Kolmogorov complexity; my suggestion captures the latter, not the former.
Also, some further Googling has told me that the expected absolute mutual information, my other suggestion at the end, is “close” to Shannon mutual information (https://arxiv.org/abs/cs/0410002), so it doesn’t seem to be significantly different from the mutual information option you already discussed.