Trying to think of some examples, it seems to me that what matters is simply the presence of features that are “decision-relevant with respect to the agent’s goals”. [...]
So, I think my motivation (which didn’t make it into the parable) for the “cheap-to-detect features that correlate with decision-relevant expensive-to-detect features” heuristic is that I’m thinking in terms of naïve Bayes models. You imagine a “star-shaped” causal graph with a central node (whose various values represent the possible categories you might want to assign an entity to), with arrows pointing to various other nodes (which represent various features of the entity). (That is, we’re assuming that the features of the entity are conditionally independent given category membership: P(X|C) = Π_i P(X_i|C).) Then when we observe some subset of features, we can use that to update our probabilities of category-membership, and use that to update our probabilities of the features we haven’t observed yet. The “category” node doesn’t actually “exist” out there in the world—it’s something we construct to help factorize our probability distribution over the features (which do “exist”).
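A minimal sketch of that update as code (the feature names and probability numbers below are made up for illustration; they aren’t the parable’s actual statistics):

```python
def posterior_category(prior, likelihoods, observed):
    """P(C | observed features), using the naive Bayes factorization
    P(X|C) = prod_i P(X_i|C): multiply the prior by each observed
    feature's likelihood, then normalize."""
    post = {}
    for c in prior:
        p = prior[c]
        for feat, val in observed.items():
            p *= likelihoods[c][feat][val]
        post[c] = p
    z = sum(post.values())
    return {c: p / z for c, p in post.items()}

def predict_feature(prior, likelihoods, observed, feat):
    """P(an unobserved feature | observed features): update on the
    category node, then marginalize the category back out."""
    post = posterior_category(prior, likelihoods, observed)
    values = next(iter(likelihoods.values()))[feat]
    return {v: sum(post[c] * likelihoods[c][feat][v] for c in post)
            for v in values}

# Illustrative numbers (assumed, not from the original parable):
prior = {"blegg": 0.5, "rube": 0.5}
likelihoods = {
    "blegg": {"color": {"blue": 0.9, "red": 0.1},
              "metal": {"vanadium": 0.95, "palladium": 0.05}},
    "rube":  {"color": {"blue": 0.1, "red": 0.9},
              "metal": {"vanadium": 0.05, "palladium": 0.95}},
}
```

Observing the cheap feature (color) shifts the posterior over the category node, which in turn shifts our probability for the expensive, unobserved feature (metal content), without ever cutting the object open.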
So, as AI designers, we’re faced with the question of how we want the “category” node to work. I’m pretty sure there’s going to be a mathematically correct answer to this that I just don’t know (yet) because I don’t study enough and haven’t gotten to Chapter 17 of Daphne Koller and the Methods of Rationality. Since I’m not there yet, if I just take an intuitive amateur guess at how I might expect this to work, it seems pretty intuitively plausible that we’re going to want the category node to be especially sensitive to cheap-to-observe features that correlate with goal-relevant features? Like, yes, we ultimately just want to know as much as possible about the decision-relevant variables, but if some observations are more expensive to make than others, that seems like the sort of thing the network should be able to take into account, right??
Remember those 2% of otherwise ordinary bleggs that contain palladium? Personally, I’d want a category for those
I agree that “things that look like ‘bleggs’ that contain palladium” is a concept that you want to be able to think about. (I just described it in words, therefore it’s representable!) But while working on the sorting line, your visual system’s pattern-matching faculties aren’t going to spontaneously invent “palladium-containing bleggs” as a thing to look out for if you don’t know any way to detect them, whereas if adapted bleggs tend to look different in ways you can see, then that category is something your brain might just “learn from experience.” In terms of the naïve Bayes model, I’m sort of assuming that the 2% of palladium-containing non-adapted bleggs are “flukes”: that variable takes that value with that probability independently of the other blegg features. I agree that if that assumption were wrong, then that would be really valuable information, and if you suspect that assumption is wrong, then you should definitely be on the lookout for ways to spot palladium-containing bleggs.
But like, see this thing I’m at least trying to do here, where I think there’s learnable statistical structure in the world that I want to describe using language? That’s pretty important! I can totally see how, from your perspective, on certain object-level applications, you might suspect that the one who says, “Hey! Categories aren’t even ‘somewhat’ arbitrary! There’s learnable statistical structure in the world; that’s what categories are for!” is secretly being driven by nefarious political motivations. But I hope you can also see how, from my perspective, I might suspect that the one who says, “Categories are somewhat arbitrary; the one who says otherwise is secretly being driven by nefarious political motivations” is secretly being driven by political motivations that have pretty nefarious consequences for people like me trying to use language to reason about the most important thing in my life, even if the psychological foundation of the political motivation is entirely kindhearted.
Since I’m not there yet, if I just take an intuitive amateur guess at how I might expect this to work, it seems pretty intuitively plausible that we’re going to want the category node to be especially sensitive to cheap-to-observe features that correlate with goal-relevant features? Like, yes, we ultimately just want to know as much as possible about the decision-relevant variables, but if some observations are more expensive to make than others, that seems like the sort of thing the network should be able to take into account, right??
I think the mathematically correct thing here is to use something like the expectation maximization algorithm. Let’s say you have a dataset that is a list of elements, each of which has some subset of its attributes known to you, and the others unknown. EM does the following:
1. Start with some parameters (parameters tell you things like what the cluster means/covariance matrices are; it’s different depending on the probabilistic model).
2. Use your parameters, plus the observed variables, to infer the unobserved variables (and cluster assignments) and put Bayesian distributions over them.
3. Do something mathematically equivalent to generating a bunch of “virtual” datasets by sampling the unobserved variables from these distributions, then setting the parameters to assign high probability to the union of these virtual datasets (EM isn’t usually described this way, but it’s easier to think about IMO).
4. Repeat starting from step 2.
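A bare-bones sketch of that loop for a two-component 1-D Gaussian mixture, where the unobserved variable is each point’s cluster assignment (the initialization and iteration count below are arbitrary choices, not part of the algorithm’s definition):

```python
import math

def gauss_pdf(x, m, v):
    """Density of a 1-D Gaussian with mean m and variance v."""
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def em_gmm_1d(data, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture; each point's
    cluster assignment is the unobserved variable."""
    # Step 1: start with some parameters (means, variances, mixture weights).
    mu = [min(data), max(data)]  # arbitrary initialization
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(n_iter):
        # Step 2 (E-step): use current parameters plus the observed data to
        # put a posterior distribution over each point's cluster assignment.
        resp = []
        for x in data:
            p = [pi[k] * gauss_pdf(x, mu[k], var[k]) for k in (0, 1)]
            z = p[0] + p[1]
            resp.append([p[0] / z, p[1] / z])
        # Step 3 (M-step): refit the parameters to assign high probability
        # to the data, weighted by those posteriors.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
            pi[k] = nk / len(data)
        # Step 4: repeat from step 2.
    return mu, var, pi
```

(This sketch computes the posterior responsibilities in closed form rather than literally sampling virtual datasets, which is the usual presentation; the two are mathematically equivalent for this model.)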
This doesn’t assign any special importance to observed features. Since step 3 is just a function of the virtual datasets (it doesn’t take into account additional info about which variables are easy to observe), the updated parameters are going to take all the features, observable or not, into account. However, the hard-to-observe features are going to carry more uncertainty, which affects the virtual datasets. With enough data, this shouldn’t matter much, but the argument for that is a little complicated.
Another way to solve this problem (which is easier to reason about) is to fully observe a sufficiently high number of samples. Then there isn’t a need for EM: you can just do clustering (or whatever other parameter fitting) on the dataset. (Clustering can be framed in terms of EM, but doesn’t have to be.) Of course, this assigns no special importance to easy-to-observe features. (After learning the parameters, we can use them to infer the unobserved variables probabilistically.)
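For the fully observed case, the simplest instance is k-means, which is the hard-assignment special case of EM for spherical Gaussians (the data and the first-k-points initialization below are arbitrary illustrations):

```python
def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def mean(points):
    """Componentwise mean of a nonempty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def kmeans(points, k=2, n_iter=20):
    """k-means: alternate hard assignment of each point to its nearest
    center (a degenerate E-step) with moving each center to its group's
    mean (the M-step)."""
    centers = list(points[:k])  # naive init: first k points
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: dist2(p, centers[i]))
            groups[j].append(p)
        # keep a center unchanged if its group came up empty
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers
```

Nothing in the fitting loop knows which coordinates were cheap to measure; it just fits the fully observed data, exactly as the comment above argues.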
Philosophically, “functions of easily-observed features” seem more like percepts than concepts (this post describes the distinction). These are still useful, and neural nets are automatically going to learn high-level percepts (i.e. functions of observed features), since that’s what the intermediate layers are optimized for. However, a Bayesian inference method isn’t going to assign special importance to observed features, as it treats the observations as causally downstream of the ontological reality rather than causally upstream of it.
I share jessicata’s feeling that the best set of concepts to work with may not be very sensitive to what’s easy to detect. This might depend a little on how we define “concepts”, and you’re right that your visual system or some other fairly “early” bit of processing may well come up with ways of lumping things together, and that that will be dependent on what’s easy to detect, whether or not we want to call those things concepts or categories or percepts or whatever else.
But in the cases I can think of where it’s become apparent that some set of categories needs refinement, there doesn’t seem to be a general pattern of basing that refinement on the existence of convenient detectable features. (Except in the too-general sense that everything ultimately comes down to empirical observation.)
I don’t think your political motivations are nefarious, and I don’t think there’s anything wrong with a line of thinking that goes “hmm, it seems like the way a lot of people think about X makes them misunderstand an important thing in my life really badly; let’s see what other ways one could think about X, because they might be better”—other than that “hard cases make bad law”, and that it’s easy to fall into an equal-and-opposite error where you think about X in a way that would make you misunderstand a related important thing in other people’s lives. The political hot potato we’re discussing here demonstrably is one where some people have feelings that (so far as I can tell) are as strong as yours and of opposite sign, after all. (Which may suggest, by the way, that if you want an extra category then you may actually need two or more extra categories: “adapted bleggs” may have fundamental internal differences from one another. [EDITED to add:] … And indeed your other writings on this topic do propose two or more extra categories.)
I am concerned that we are teetering on the brink of—if we have not already fallen into—exactly the sort of object-level political/ideological/personal argument that I was worried about when you first posted this. Words like “nefarious” and “terrorist” seem like a warning sign. So I’ll limit my response to that part of what you say to this: It is not at all my intention to endorse any way of talking to you, or anyone else, that makes you, or anyone else, feel the way you describe feeling in that “don’t negotiate with terrorist memeplexes” article.
I share jessicata’s feeling that the best set of concepts to work with may not be very sensitive to what’s easy to detect. [...] there doesn’t seem to be a general pattern of basing that refinement on the existence of convenient detectable features
Yeah, I might have been on the wrong track there. (Jessica’s comment is great! I need to study more!)
I am concerned that we are teetering on the brink of—if we have not already fallen into—exactly the sort of object-level political/ideological/personal argument that I was worried about
I think we’re a safe distance from the brink.
Words like “nefarious” and “terrorist” seem like a warning sign
“Nefarious” admittedly was probably a high-emotional-temperature warning sign (oops). But in this case, “I don’t negotiate with terrorists” is mostly functioning as the standard stock phrase for the timeless-decision-theoretic “don’t be extortable” game-theory intuition, which I don’t think should count as a warning sign: it would be harder to communicate if people had to avoid genuinely useful metaphors just because they happen to use high-emotional-valence words.