Correlation space runs from −1 to 1, with 1 meaning the same (definitely true), −1 the opposite (definitely false), and 0 orthogonal (maximum uncertainty). I had the idea that you could take maximum uncertainty to be 0 in correlation space, and 1/n (the uniform distribution) in probability space.
Not sure what you mean here, but p × 2 − 1 would linearly transform a probability p from [0..1] to [−1..1]. You could likewise transform a correlation coefficient ϕ to [0..1] with (ϕ(A,B) + 1)/2. For P(A) = P(B) = 1/2, this would correspond to the probability of A occurring if and only if B occurs. I.e. (ϕ(A,B) + 1)/2 = P(A↔B) when P(A) = P(B) = 0.5.
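That identity is easy to check numerically. A minimal sketch (the joint probabilities below are arbitrary, chosen so that P(A) = P(B) = 0.5; `phi` is just an illustrative helper name):

```python
import math

def phi(p11, p10, p01, p00):
    """Phi coefficient computed from the 2x2 joint distribution of binary A, B."""
    pa, pb = p11 + p10, p11 + p01  # marginals P(A), P(B)
    return (p11 * p00 - p10 * p01) / math.sqrt(pa * (1 - pa) * pb * (1 - pb))

# A joint distribution with P(A) = P(B) = 0.5:
p11, p10, p01, p00 = 0.4, 0.1, 0.1, 0.4
p_iff = p11 + p00  # P(A <-> B): both true or both false

print((phi(p11, p10, p01, p00) + 1) / 2)  # 0.8
print(p_iff)                              # 0.8
```

Both quantities come out equal (0.8 here), matching the claimed (ϕ + 1)/2 = P(A↔B) under equal marginals of 0.5.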
So, my main idea is that the principle of maximum entropy, aka the principle of indifference, suggests a prior of 1/n, where n is the number of possibilities or classes. Inverting p × 2 − 1, i.e. p = (c + 1)/2, gives p = 0.5 for c = 0. What I want is for c = 0 to lead to p = 1/n rather than 0.5, so that it works in multiclass cases where n is greater than 2.
What’s the solution?
p = (n^c * (c + 1)) / (2^c * n)
As far as I know, this is unpublished in the literature. It’s a pretty obscure use case, so that’s not surprising. I have doubts I’ll ever get around to publishing the paper I wanted to write that uses this in an activation function to replace softmax in neural nets, so it probably doesn’t matter much if I show it here.