“Myopia” involves competitiveness reduction to the extent that the sort of side effects it’s trying to rule out are useful. Is the real-world example of speculative execution (and related tech) informative as a test case?
One simple version is that when doing computation A, you adjust your probability of having to do computations like computation A, so that in the future you can do that computation more quickly. But this means there’s a side channel that can be used to extract information that should be private; this is what Spectre and Meltdown were about. Various mitigations were proposed, various additional attacks developed, and so on; at one point I saw an analysis that suggested 10 years of improvements had to be thrown out because of these attacks, whereas others suggest that mitigation could be quite cheap.
One downside from the comparison’s point of view is that the scale is very low-level, and some of the exploits mostly deal with communication between two contemporary processes in a way that matters for some versions of factored cognition and not others (but matters a lot for private keys on public servers). Would it even be useful for parallel executions of the question-answerer to use this sort of exploit as a shared buffer?
[That is, there’s a clearly seen direct cost to algorithms that don’t use a shared buffer over algorithms that do; this question is closer to “can we estimate the unseen cost of having to be much more stringent in our other assumptions to eliminate hidden shared buffers?”.]
Is that a core part of the definition of myopia in AI/ML? I understood it only to mean that models lose accuracy if the environment (the non-measured inputs to real-world outcomes) changes significantly from the training/testing set.
Is that a core part of the definition of myopia in AI/ML?
To the best of my knowledge, the use of ‘myopia’ in the AI safety context was introduced by evhub, maybe here, and is not a term used more broadly in ML.
I understood it only to mean that models lose accuracy if the environment (the non-measured inputs to real-world outcomes) changes significantly from the training/testing set.
This is typically referred to as ‘distributional shift.’
“Myopia” involves competitiveness reduction to the extent that the sort of side effects it’s trying to rule out are useful. Is the real-world example of speculative execution (and related tech) informative as a test case?
One simple version is that when doing computation A, you adjust your probability of having to do computations like computation A, so that in the future you can do that computation more quickly. But this means there’s a side channel that can be used to extract information that should be private; this is what Spectre and Meltdown were about. Various mitigations were proposed, various additional attacks developed, and so on; at one point I saw an analysis that suggested 10 years of improvements had to be thrown out because of these attacks, whereas others suggest that mitigation could be quite cheap.
One downside from the comparison’s point of view is that the scale is very low-level, and some of the exploits mostly deal with communication between two contemporary processes in a way that matters for some versions of factored cognition and not others (but matters a lot for private keys on public servers). Would it even be useful for parallel executions of the question-answerer to use this sort of exploit as a shared buffer?
[That is, there’s a clearly seen direct cost to algorithms that don’t use a shared buffer over algorithms that do; this question is closer to “can we estimate the unseen cost of having to be much more stringent in our other assumptions to eliminate hidden shared buffers?”.]
Is that a core part of the definition of myopia in AI/ML? I understood it only to mean that models lose accuracy if the environment (the non-measured inputs to real-world outcomes) changes significantly from the training/testing set.
To the best of my knowledge, the use of ‘myopia’ in the AI safety context was introduced by evhub, maybe here, and is not a term used more broadly in ML.
This is typically referred to as ‘distributional shift.’