philip_b comments on Understanding Machine Learning (I)

philip_b Dec 23, 2019, 10:22 AM
4 points
Two nitpicks:

“like if you want it to recognize spam emails, but you only show it aspects of the emails such that there is at best a statistically weak correlation between them and whether the email is spam or not-spam”

Here “statistically weak correlation” should be “not a lot of mutual information”, since correlation is only about linear dependence between random variables.
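
To make the distinction concrete, here is a minimal sketch, assuming a toy feature X that is uniform on {-2, -1, 0, 1, 2} and a label Y = X². The helper names `pearson` and `mutual_information` are only illustrative and come from no particular library. Y is fully determined by X, yet the Pearson correlation is zero, while the mutual information is clearly positive (about 1.52 bits, the entropy of Y).

```python
import math
from collections import Counter

def pearson(pairs):
    """Pearson correlation of equally weighted (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / math.sqrt(vx * vy)

def mutual_information(pairs):
    """Mutual information in bits of the empirical joint distribution."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # I(X;Y) = sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

# Toy assumption: X uniform on {-2, ..., 2}, Y = X^2 (a purely nonlinear dependence).
pairs = [(x, x * x) for x in (-2, -1, 0, 1, 2)]

print("correlation:", pearson(pairs))
print("mutual information (bits):", mutual_information(pairs))
```

So a feature can have zero correlation with the label and still tell you a lot about it, which is why mutual information is the better term here.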

“i.d.d.”

Should be i.i.d.
- Rafael Harth Dec 23, 2019, 11:42 AM
  1 point
  Thanks; fixed the second one.
  On the first – I don’t understand why correlation doesn’t apply. “Spam” is a random variable, and whatever feature you show, like “length”, is also a random variable – right?