I think this post would be stronger if it covered at least basic metrology and statistics.
It’s incorrect to say that billions of variables aren’t affecting a sled sliding down a hill. Of course they’re affecting the speed, even if most of them only change it by a few Planck lengths per hour. But, crucially, most are not affecting it by a detectable amount. The detectability threshold is the key to the argument.
For detectability, whether you notice the effects of outside variables comes down to the resolution of the instrument you’re using to measure your output. If you’re using a radar gun that gives readings to the nearest MPH, for example, you won’t perceive a difference between 10.1 and 10.2 MPH, and so to you the two are equivalent. Nonetheless, outside variables have absolutely influenced the two readings differently.
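A minimal sketch of that quantization (Python; the 1 MPH resolution is just the example’s assumption):

    # Two physically different speeds collapse to one instrument reading.
    readings = {v: round(v) for v in (10.1, 10.2)}
    print(readings)  # {10.1: 10, 10.2: 10} -- equivalent at 1 MPH resolution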
Equally critical is the number of measurements you’re taking. For example, if you’re taking repeated measurements after controlling a certain set of variables, you may be able to say with some stated confidence that no other variables are causing enough variation in speed to push the output outside the parameters you’ve set. But that is a very different thing than saying that those other variables simply don’t exist! One is a statement of probability, the other is a statement of certainty. Maybe there’s a confluence of variables that occurs only once every thousand times, which you won’t pick up in an initial evaluation.
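A rough sketch of the probability-vs-certainty point (Python; the 1-in-1000 confluence rate and the 100-trial evaluation are hypothetical):

    import random

    random.seed(0)
    P_RARE = 1 / 1000  # hypothetical once-in-a-thousand confluence
    N = 100            # a typical initial evaluation

    anomalies = sum(random.random() < P_RARE for _ in range(N))
    print(f"anomalies seen in {N} trials: {anomalies}")  # almost always 0

    # Rule of three: zero observations in N trials gives a ~95% upper
    # bound of about 3/N on the true rate. That is a probability
    # statement, not proof that the rare confluence doesn't exist.
    print(f"~95% upper bound on rate: {3 / N}")  # 0.03

Note that the bound (0.03) is thirty times the true rate here, and it never reaches zero no matter how many clean trials you run.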
> If you’re using a radar gun that gives readings to the nearest MPH, for example, you won’t perceive a difference between 10.1 and 10.2 MPH, and so to you the two are equivalent.
As an aside, this is one of the reasons why some sensing systems deliberately inject random noise (dithering).
If it turns out that, for instance, your system’s actual states are always X.4 MPH, then a radar gun that rounds to the nearest MPH gives you a systematic bias: every reading comes back X, a constant error of −0.4 MPH. If, however, you inject ±0.5 MPH of random noise before quantization, the bias averages away. (Of course, this requires repeated sampling to pick up on.)
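A quick simulation of that dithering effect (Python; the 10.4 MPH state and the sample count are illustrative assumptions):

    import random

    random.seed(1)
    TRUE_SPEED = 10.4  # the actual state is always X.4 MPH
    N = 100_000

    # Without dither: every reading rounds to 10, a constant -0.4 bias.
    plain = [round(TRUE_SPEED) for _ in range(N)]

    # With +-0.5 MPH uniform dither injected before quantization, the
    # reading is 10 with probability 0.6 and 11 with probability 0.4,
    # so the mean converges to 10.4 -- but only across repeated samples.
    dithered = [round(TRUE_SPEED + random.uniform(-0.5, 0.5)) for _ in range(N)]

    print(sum(plain) / N)     # 10.0
    print(sum(dithered) / N)  # ~10.4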
> But that is a very different thing than saying that those other variables simply don’t exist! One is a statement of probability, the other is a statement of certainty. Maybe there’s a confluence of variables that occurs only once every thousand times, which you won’t pick up in an initial evaluation.
As an extreme example of that, consider:
    f(x) := SHA512(x) == SHA512(someRandomLongConstant)

Under black-box testing, this function is indistinguishable from f(x) := false.
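Here’s a runnable version of that black-box point (Python; SECRET stands in for someRandomLongConstant, which the tester never sees):

    import hashlib
    import secrets

    # The hidden constant; the black-box tester has no access to it.
    SECRET = secrets.token_bytes(64)
    TARGET = hashlib.sha512(SECRET).digest()

    def f(x: bytes) -> bool:
        # True only on the secret input (or a SHA-512 collision).
        return hashlib.sha512(x).digest() == TARGET

    def g(x: bytes) -> bool:
        # The constant-false function that f is indistinguishable from.
        return False

    # Any feasible number of probes returns False from both functions,
    # unless a probe happens to hit SECRET itself.
    probes = [b"sled", b"10.1 MPH", secrets.token_bytes(64)]
    print([f(p) for p in probes])  # [False, False, False]
    print([g(p) for p in probes])  # [False, False, False]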
> It’s incorrect to say that billions of variables aren’t affecting a sled sliding down a hill. Of course they’re affecting the speed, even if most of them only change it by a few Planck lengths per hour. But, crucially, most are not affecting it by a detectable amount. The detectability threshold is the key to the argument.
It is important to differentiate between billions of uncorrelated variables, billions of correlated variables, and billions of anticorrelated variables. A grain of sand on the hill may not detectably influence the sled. A truckload of sand, on the other hand, very likely will.
You are correct in the case of uncorrelated variables with a mean of zero; it is an interesting fact about the real world that almost all variables appear to fall into this category.
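A quick sketch of why the distinction matters (Python; the grain-sized perturbation magnitude is an illustrative assumption): N uncorrelated zero-mean effects sum to roughly sqrt(N) times one effect, while N correlated effects sum to N times one effect.

    import random

    random.seed(2)
    N = 1_000_000  # "billions" scaled down for a quick run

    # Uncorrelated, zero-mean perturbations (stray grains of sand):
    # the sum grows like sqrt(N), so individual effects wash out.
    grains = [random.uniform(-1e-3, 1e-3) for _ in range(N)]
    print(sum(grains))     # close to 0 (on the order of +-1)

    # Perfectly correlated perturbations (a truckload: every grain
    # pushes the same way): the sum grows like N and is detectable.
    truckload = [1e-3] * N
    print(sum(truckload))  # 1000.0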