My attempts at putting LaTeX notation here didn’t work out, so I hope this is at all readable.
I would not call the data you gave me a distribution. I think of a distribution as being something like a Gaussian; some function f where, if I keep collecting data, and I take the average sum of powers of that data, it looks like the integral over some topological group of that function.
so: \lim_{n \to \infty} \sum_{k=1}^{n} g(x_k, y_k) = \int_{R^2} f(x,y)\, g(x,y)\, dx \wedge dy
for any function g on R^2
Usually, rather than integrating over R^2, I would be integrating over SU(2) or some other matrix group, meaning the group structure isn't additive. Usually I'd expect the data to be something like traces of matrices; for example, on the appropriate subgroup of GL(2,R)^+ those traces should never be below two, and that sort of kinematic constraint should translate into insight about what group you're integrating over.
When you say “fitting distributions” I assume you’re looking for the appropriate f(x) (at least, after a fashion) in the above equality; minimizing a variable which should be the difference between the limits in some sense.
I may be a little out of my depth here, though.
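For concreteness, here is a minimal numerical sketch of that identity, writing the left-hand side as an average (with an explicit 1/n so it converges) and taking f to be a standard Gaussian on R in one dimension with g(x) = cos(x); both choices are purely illustrative assumptions, not anything specific to the discussion.

```python
import numpy as np

# Toy check of: (1/n) * sum_k g(x_k)  ->  integral of f(x) * g(x) dx,
# with f a standard Gaussian density on R and g(x) = cos(x)
# (both are illustrative assumptions).
rng = np.random.default_rng(0)

def g(x):
    return np.cos(x)

n = 1_000_000
samples = rng.normal(loc=0.0, scale=1.0, size=n)  # data generated by f
empirical = g(samples).mean()                      # (1/n) * sum_k g(x_k)

# Riemann-sum approximation of the right-hand side on a wide grid.
x = np.linspace(-10.0, 10.0, 200_001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
integral = np.sum(f * g(x)) * dx

print(empirical, integral)  # both should be near exp(-1/2) ~ 0.6065
```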
Sorry I didn’t mean harmonic analysis, I meant Fourier analysis. I am under the impression that this is everywhere in physics and electrical engineering?
I was a little sloppy in my language; strictly speaking ‘distribution’ does refer to a generating function, not to the generated data.
When you say “fitting distributions” I assume you’re looking for the appropriate f(x) (at least, after a fashion) in the above equality; minimizing a variable which should be the difference between the limits in some sense.
Yes, exactly.
Sorry I didn’t mean harmonic analysis, I meant Fourier analysis.
We certainly do partial waves, but not on absolutely everything. Take a detector resolution with unknown parameters; it can usually be well modelled by a simple Gaussian, and then there are no partial waves, just the two parameters and the exponential.
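A minimal sketch of that case, assuming some hypothetical resolution data that really are Gaussian-distributed; for a Gaussian, the maximum-likelihood values of the two parameters are simply the sample mean and the (1/n) sample standard deviation.

```python
import numpy as np

# Hypothetical detector-resolution measurements: residuals assumed Gaussian
# with unknown mean (offset) and sigma (resolution).
rng = np.random.default_rng(1)
residuals = rng.normal(loc=0.02, scale=0.15, size=5000)  # toy data

# For a Gaussian, the maximum-likelihood estimates of the two parameters
# are the sample mean and the biased (1/n) sample standard deviation.
mu_hat = residuals.mean()
sigma_hat = residuals.std(ddof=0)

print(f"offset ~ {mu_hat:.4f}, resolution ~ {sigma_hat:.4f}")
```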
\lim_{n \to \infty} \sum_{k=1}^{n} g(x_k, y_k) = \int_{R^2} f(x,y)\, g(x,y)\, dx \wedge dy for any function g on R^2
Maybe something got lost in the notation? In the limit of n going to infinity the sum should likewise go to infinity, while the integral may converge. Also it’s not clear to me what the function g is doing. I prefer to think in terms of probabilities: We seek some function f such that, in the limit of infinite data, the fraction of data falling within (x0, x0+epsilon) equals the integral on (x0, x0+epsilon) of f with respect to x, divided by the integral over all x. Generalise to multiple dimensions as required; taking the limit epsilon->0 is optional.
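A small sketch of that definition, using a deliberately unnormalised f (a bare exp(-x^2/2), an illustrative assumption) so that the division by the integral over all x actually does something.

```python
import numpy as np

# Fraction of data in (x0, x0 + eps) vs. integral of f over that interval
# divided by the integral of f over all x.  f is an unnormalised Gaussian
# shape, so the division is what turns it into a probability.
rng = np.random.default_rng(2)
data = rng.normal(size=200_000)          # data whose true density is the normalised f
x0, eps = 0.5, 0.2

frac = np.mean((data > x0) & (data < x0 + eps))

x = np.linspace(-10.0, 10.0, 400_001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)                    # unnormalised on purpose
num = np.sum(f[(x > x0) & (x < x0 + eps)]) * dx
den = np.sum(f) * dx

print(frac, num / den)                   # should agree closely
```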
average sum of powers of that data,
I’m not sure what an average sum of powers is; where do you do this in the formula you gave? Is it encapsulated in the function g? Does it reduce to “just count the events” (as in the fraction-of-events goal above) in some limit?
Yes, there was supposed to be a 1/n in the sum, sorry!
Essentially what the g is doing is taking the place of the interval probabilities; for example, if I think of g as being the characteristic function on an interval (one on that interval and zero elsewhere) then the sum and integral should both be equal to the probability of a point landing in that interval. Then one can approximate all measurable functions by characteristic functions or somesuch to make the equivalence.
In practice (for me) in Fourier analysis you prove this for a basis, such as integer powers of cosine on a closed interval, or simply integer powers on an open interval (these are the moments of a distribution).
I’m not sure what an average sum of powers is; where do you do this in the formula you gave? Is it encapsulated in the function g?
Yes; once you add in the 1/n, hopefully the “average” part makes sense, and then, for a single variable, just take g to be x^k and vary over integers k. And as I mentioned above, yes, I believe it does reduce to just “count the events”; it’s just that if you want to prove things, you need to count using a countable basis of function space rather than looking at intervals.
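A sketch of the moment version, assuming for concreteness that the data come from a standard Gaussian, whose odd moments vanish and whose even moments are 1, 3, 15, ...

```python
import numpy as np

# Taking g(x) = x^k: the averaged sums (1/n) * sum_j x_j^k are the empirical
# moments, which should converge to the moments of the underlying distribution.
rng = np.random.default_rng(3)
data = rng.normal(size=1_000_000)        # standard Gaussian, for concreteness

# Analytic moments of N(0,1): 0 for odd k, (k-1)!! for even k.
analytic = {1: 0.0, 2: 1.0, 3: 0.0, 4: 3.0, 5: 0.0, 6: 15.0}

for k, exact in analytic.items():
    empirical = np.mean(data**k)         # (1/n) * sum_j g(x_j) with g(x) = x^k
    print(f"k={k}: empirical {empirical:+.3f}, exact {exact:+.3f}")
```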
It looks to me like we’ve bridged the gap between the approaches. We are doing the same thing, but the physics case is much more specific: We have a generating function in mind and just want to know its parameters, and we look only at the linear average, we don’t vary the powers (*). So we don’t use the tools you mentioned in the comment that started this thread, because they’re adapted to the much more general case.
(*) Edit to add: Actually, on further thought, that’s not entirely true. There are cases where we take moments of distributions and whatnot; a friend of mine who was a PhD student at the same time as me worked on such an analysis. It’s just sufficiently rare (or maybe just rare in my experience!) that it didn’t come to my mind right away.
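As a hypothetical instance of such a moments-based analysis: one can extract parameters by matching sample moments to analytic ones, e.g. the endpoints of a uniform distribution from its first two moments (the choice of distribution and values here is purely illustrative).

```python
import numpy as np

# Method-of-moments sketch: estimate the endpoints (a, b) of a uniform
# distribution from the first two sample moments.
# For U(a, b): mean = (a + b) / 2, variance = (b - a)^2 / 12.
rng = np.random.default_rng(4)
data = rng.uniform(2.0, 5.0, size=100_000)   # toy data with a=2, b=5

mean, var = data.mean(), data.var(ddof=0)
half_width = np.sqrt(3.0 * var)              # (b - a) / 2
a_hat, b_hat = mean - half_width, mean + half_width

print(f"a ~ {a_hat:.3f}, b ~ {b_hat:.3f}")   # should be close to 2 and 5
```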
Okay, so it seems my hypothesis was essentially right: basically all of the things I care about get swept under the rug, because you only care about what I would call trivial cases.
And it definitely makes sense that if you’ve already restricted to a specific function and you just want parameters that you really don’t need to deal with higher moments.