Great examples! I buy them to varying extents:
Features like “edge” or “dappled” were IIRC among the first things people found when they started doing interp on CNNs back around 2016 or so. So they might be specific to a data modality (i.e. vision), but they’re not specific to the human brain’s learning algorithm.
“Behind” seems similar to “edge” and “dappled”, but at a higher level of abstraction; it’s something which might require a specific data modality but probably isn’t learning algorithm specific.
I buy your claim a lot more for value-loaded words, like “I’m feeling down”, the connotations of “contaminate”, and “much”. (Note that an alien mind might still reify human-value-loaded concepts in order to model humans, but that still probably involves modeling a lot of the human learning algorithm, so your point stands.)
I buy that “salient” implies an attentional spotlight, but I would guess that an attentional spotlight can be characterized without modeling the bulk of the human learning algorithm.
I buy that the semantics of “and” or “but” are pretty specific to humans’ language-structure, but I don’t actually care that much about the semantics of connectives like that. What I care about is the semantics of e.g. sentences containing “and” or “but”.
I definitely buy that analogies like “butterfingers” are a pretty large chunk of language in practice, and it sure seems hard to handle semantics of those without generally understanding analogy, and analogy sure seems like a big central piece of the human learning algorithm.
At the meta-level: I’ve been working on this natural abstraction business for four years now, and your list of examples in that comment is one of the most substantive and useful pieces of pushback I’ve gotten in that time. So the semantics frame is definitely proving useful!
One mini-project in this vein which would potentially be high-value would be for someone to go through a whole crapton of natural language examples and map out some guesses at which semantics would/wouldn’t be convergent across minds in our environment.
I think a big aspect of salience arises from dealing with commensurate variables that have a natural zero-point (e.g. physical size), because then one can rank the variables by their distance from zero, and the ones furthest from zero are inherently more salient. Attentional spotlights are also probably mainly useful in cases where the variables have high skewness, so that there are clearly relevant places to put the spotlight.
I don’t expect this model to capture all of salience, but I expect it to capture a big chunk, and to be relevant in many other contexts too. E.g. an important aspect of “misleading” communication is to talk about the smaller-magnitude variables while staying silent about the larger-magnitude ones.
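To make that ranking picture concrete, here’s a minimal Python sketch; the function name and all the numbers are made up purely for illustration, not anything from the discussion above.

```python
# Minimal sketch of "salience as distance from a natural zero-point".
# The function name and all numbers here are invented for illustration.

def rank_by_salience(variables: dict[str, float]) -> list[tuple[str, float]]:
    """Rank commensurate variables (shared units, natural zero-point)
    by their absolute distance from zero: biggest deviation first."""
    return sorted(variables.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Toy example: physical sizes (in meters) of objects in a scene.
# A heavily skewed distribution like this is exactly where an attentional
# spotlight earns its keep: one or two variables dominate all the rest.
sizes = {"mountain": 2000.0, "house": 8.0, "mailbox": 1.0, "pebble": 0.02}
print(rank_by_salience(sizes))
# [('mountain', 2000.0), ('house', 8.0), ('mailbox', 1.0), ('pebble', 0.02)]
```

On this picture, the “misleading communication” move is just to report the tail of that ranking while staying quiet about its head.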
For example, if I had gotten attacked by a squirrel ten years ago, and it had been a very traumatic experience for me, then the possibility-of-getting-attacked-by-a-squirrel would be very salient in my mind whenever I’m making decisions, even if it’s not salient to anyone else. (Squirrels are normally shy and harmless.)
In this case, under my model of salience as the biggest-deviating variables, the variable I’d consider would be something like “likelihood of attacking”. It is salient to you in the presence of squirrels because everything else nearby (e.g. computers or trees) is, according to your probabilistic model, much less likely to attack, and because the risk of getting attacked by something matters much more than many other things (e.g. merely seeing something); see the toy numerical sketch at the end of this comment.
In a sense there’s some subjectivity here, because different people might have different traumas; but this subjectivity isn’t such a big problem, because there is a “correct” frequency with which squirrels attack under various conditions, and we’d expect the main disagreement with a superintelligence to be that it has a better estimate of that frequency than we do.
A deeper subjectivity is that we care about whether we get attacked by squirrels, and we’re not so powerful that squirrel attacks on us and our allies are completely trivial and ignorable, so squirrel attacks are less likely to be of negligible magnitude relative to our activities.
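To put rough numbers on the squirrel example (every probability and importance weight below is invented purely for illustration), the same ranking can be applied to “likelihood of attacking”, with each variable weighted by how much the corresponding event matters:

```python
# Toy application of the same ranking idea to the squirrel example.
# Every probability and importance weight here is invented for illustration.

# (entity, event) -> subjective probability of that event happening soon.
p_event = {
    ("squirrel", "attacks me"): 0.05,   # trauma-inflated estimate
    ("tree", "attacks me"): 1e-9,
    ("computer", "attacks me"): 1e-9,
    ("squirrel", "is seen by me"): 0.9,
    ("tree", "is seen by me"): 0.9,
}

# How much each kind of event matters (being attacked >> merely seeing something).
importance = {"attacks me": 100.0, "is seen by me": 0.1}

# Salience score = magnitude of the variable, weighted by how much it matters.
scores = {key: p * importance[key[1]] for key, p in p_event.items()}
for (entity, event), score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{entity} {event}: {score:.3g}")
# The squirrel-attack variable tops the ranking for the traumatized observer;
# with a better-calibrated p(attack), it would drop far down the list.
```

In this toy version, the first kind of subjectivity lives in the (possibly miscalibrated) probabilities, while the deeper subjectivity lives in the importance weights.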