The proposition we were actually going for was $\lim_{B\to\infty} P[(s_a, s_1, \dots, s_B)] = 0$, i.e. the probability without the end of the bridge!
In that case, I agree the monotonically decreasing version of the statement is correct. I think the limit still isn’t necessarily zero, for the reasons I mention in my original comment. (Though I do agree it will be zero under somewhat reasonable assumptions, and in particular for LMs)
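To make the "not necessarily zero" point concrete, here is a toy construction of my own (not from the proposition itself): a Markov chain whose state 0 deterministically self-loops, so the joint probability never drops below the start probability no matter how long the bridge gets.

```python
# Toy counterexample sketch (my construction, illustrating why the limit
# need not be zero without extra assumptions): a Markov chain whose
# state 0 self-loops with probability 1. Then the joint probability
# P(s_a, s_1, ..., s_B) equals P(s_a) for every bridge length B, so it
# is monotonically non-increasing but converges to P(s_a) > 0, not 0.

def joint_prob(transition, start_prob, path):
    """P(path) = P(s_0) * prod_t P(s_{t+1} | s_t) for a Markov chain."""
    p = start_prob[path[0]]
    for a, b in zip(path, path[1:]):
        p *= transition[a][b]
    return p

# Deterministic self-loop on state 0.
T = [[1.0, 0.0],
     [1.0, 0.0]]
start = [0.5, 0.5]

for B in [1, 10, 100]:
    path = [0] * (B + 1)  # s_a = 0 followed by B more zeros
    print(B, joint_prob(T, start, path))  # stays at 0.5 for every B
```

An LM with softmax outputs never puts probability exactly 1 on a token, which is one way to see why the limit plausibly is zero for LMs specifically.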
So Proposition II implies something like $P(s_b) \sim \exp\left[-(B+1) \max P(s_a, s_1, \dots, s_B, s_b)\right]$, or that in the limit "the probability of the most likely sequence ending in $s_b$ will be (when appropriately normalized) proportional to the probability of $s_b$", which seems sensible?
One crux here is the "appropriately normalized": why should the normalization be linear, i.e. just $B+1$? I buy that there are some important systems where this holds, and maybe it even holds for LMs, but it certainly won't be true in general (e.g. sometimes you need exponential normalization). Even modulo that issue, the claim still isn't obvious to me, but that may be a good point to start (i.e. an explanation of where the normalization factor comes from would plausibly also clear up my remaining skepticism).
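A minimal sketch of the "sometimes you need exponential normalization" point, under a toy model I'm assuming for illustration (tokens drawn i.i.d. uniformly from a vocabulary of size $V$, not anything from the thread): every bridge of length $B$ is equally likely, so the max bridge probability decays exponentially in $B$, and recovering $P(s_b)$ requires a factor exponential in $B$ rather than the linear $B+1$.

```python
# Toy model (my assumption, for illustration): tokens drawn i.i.d.
# uniformly from a vocabulary of size V. Every length-(B+2) sequence
# (s_a, s_1, ..., s_B, s_b) has probability V**-(B+2), so the max over
# bridges decays exponentially in B. Recovering P(s_b) = 1/V then needs
# a factor of V**(B+1), exponential in B, not the linear B + 1.

V = 4  # vocabulary size (arbitrary choice)

def max_bridge_prob(B, V):
    # Under the uniform i.i.d. model every bridge is equally likely,
    # so the max equals the common probability of any single sequence.
    return V ** -(B + 2)

for B in [1, 5, 10]:
    m = max_bridge_prob(B, V)
    # Exponential normalization recovers P(s_b) exactly:
    print(B, m * V ** (B + 1))  # always 1/V = 0.25
    # Linear normalization does not:
    print(B, m * (B + 1))       # shrinks toward 0 as B grows
```

Of course this uniform model is exactly the kind of system where the linear claim fails; the interesting question is what property of LMs (if any) makes the linear version hold for them.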