Birnbaum-Saunders is an interesting one. For the purposes of fatigue analysis, the assumptions which bring about the three models are:
Weibull—numerous failure modes (of similar failure speed) racing to see which causes the component to fail first
Log-normal—Damage per cycle is proportional to current damage
Birnbaum-Saunders—Damage per cycle is normally distributed and independent of previous damage
My engineering gut says that this component is probably somewhere between Log-normal and Birnbaum-Saunders (I think the proportionality will decay as damage increases), which is maybe why I don’t have a clear winner yet.
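If I do end up coding this, all three candidates are available in scipy.stats (fatiguelife is scipy’s name for the Birnbaum-Saunders distribution), so a quick maximum-likelihood comparison might look something like the rough sketch below (the life data here is made up):

```python
import numpy as np
from scipy import stats

# Made-up cycles-to-failure data, standing in for the real (small) sample.
lives = np.array([1.2e5, 1.8e5, 2.1e5, 2.9e5, 3.4e5, 4.7e5])

candidates = {
    "Weibull": stats.weibull_min,
    "Log-normal": stats.lognorm,
    "Birnbaum-Saunders": stats.fatiguelife,
}

for name, dist in candidates.items():
    params = dist.fit(lives, floc=0)              # fix location at zero: two free parameters each
    loglik = np.sum(dist.logpdf(lives, *params))  # maximised log-likelihood for a rough comparison
    print(f"{name:17s} params={params}  log-likelihood={loglik:.1f}")
```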
***
I think I understand now where my original reasoning was incorrect when I was calculating the expected worst in a million. I was just calculating the worst in a million for each model and taking a weighted average of the answers, which meant that bad values from the outlying candidate pdfs were massively suppressed.
I’ve done some sampling of the worst in a million by repeatedly creating two random numbers from 0 to 1. I use the first to select a μ,σ combination based on the posterior for each pair. I use the second random number as a p-value. I then take the inverse cdf (icdf) of that p-value under the selected μ,σ to get an x.
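Roughly, one draw looks like this (a rough sketch, assuming the log-normal model so that μ,σ are the mean and standard deviation of log-life, and assuming the posterior over the μ,σ grid is already tabulated as normalised weights):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng()

def draw_x(pairs, weights):
    """One posterior-predictive draw: pick a (mu, sigma) pair according to its
    posterior weight, then push a uniform p-value through that pair's icdf.
    `weights` must sum to 1."""
    idx = rng.choice(len(pairs), p=weights)   # first random number: which pair
    mu, sigma = pairs[idx]
    p = rng.random()                          # second random number: the p-value
    return stats.lognorm.ppf(p, s=sigma, scale=np.exp(mu))   # icdf gives x

# Worst in a million is then roughly the smallest of a million such draws:
# worst = min(draw_x(pairs, weights) for _ in range(1_000_000))
```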
Is this an ok sampling method? I’m not sure if I’m missing something or should be using MCMC. I definitely need to read up on this stuff!
The worst in a million is currently dominated by those occasions where the sampled probability distribution is an outlier. In those cases the p-value doesn’t need to be particularly extreme to produce a low x.
I think my initial estimates were based either mainly on uncertainty in p or mainly on uncertainty in μ,σ. The sampling method allows me to account for uncertainty in both, which definitely makes more sense. The model seems to react sensibly when I add potential new data, so I think I can now assess much better how many data points I require.
That sampling method sounds like it should work, assuming it’s all implemented correctly (not sure what method you’re using to sample from the posterior distribution of μ, σ).
Worst case in a million being dominated by parameter uncertainty definitely makes sense, given the small sample size and the rate at which those distributions fall off.
For μ,σ I effectively created a quasi-cumulative distribution with the parameter pairs as the x-axis.
(μ1,σ1), (μ2,σ1), (μ3,σ1), … (μ1,σ2), (μ2,σ2), (μ3,σ2), … (μn,σm)
The random number defines the relevant point on the y-axis. From there I get the corresponding μ,σ pair from the x-axis.
If this method works, I’ll probably have to code the whole thing instead of using a spreadsheet, as I don’t have nearly enough μ,σ values to get a good answer currently.
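Coded up, that quasi-cdf might look something like the sketch below (the grid values and weights are placeholders, not my real numbers):

```python
import numpy as np

# Placeholder grid of candidate parameter values; much finer in practice.
mus = np.linspace(11.0, 13.0, 81)       # candidate means of log-life
sigmas = np.linspace(0.05, 0.60, 56)    # candidate standard deviations of log-life

# Flatten the grid in the same order as the spreadsheet x-axis:
# (mu1,s1), (mu2,s1), (mu3,s1), ... (mu1,s2), (mu2,s2), ... (mun,sm)
pairs = [(mu, sigma) for sigma in sigmas for mu in mus]

# weights[i] is the relative probability of pairs[i]; flat placeholder here.
weights = np.ones(len(pairs))

# The quasi-cumulative distribution: running total of the weights, scaled to end at 1.
quasi_cdf = np.cumsum(weights)
quasi_cdf = quasi_cdf / quasi_cdf[-1]

def pick_pair(u):
    """Map a random number on the y-axis back to the corresponding (mu, sigma) pair on the x-axis."""
    return pairs[np.searchsorted(quasi_cdf, u)]
```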
At what point is the data used?
I use it to determine the relative probabilities of each μ,σ pair, which in turn create the pseudo cdf.
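In sketch form it would be something like this (assuming a flat prior over the grid and the log-normal likelihood; the function name is just illustrative):

```python
import numpy as np
from scipy import stats

def relative_probabilities(pairs, data):
    """Relative probability of each (mu, sigma) pair given the observed lives,
    assuming a flat prior over the grid and a log-normal likelihood."""
    w = np.array([
        np.prod(stats.lognorm.pdf(data, s=sigma, scale=np.exp(mu)))
        for mu, sigma in pairs
    ])
    return w / w.sum()   # normalised weights; accumulating them gives the pseudo cdf
```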
Ok, that sounds right.