Comment by
Peter Schmidt-Nielsen (peter-schmidt-nielsen) on
[Simulators seminar sequence] #2 Semiotic physics - revamped ·
2023-01-05T05:43:51.930Z ·
LW ·
GW
So, a softmax can never emit a probability of exactly 0 or 1; maybe they were implicitly assuming the model ends in a softmax (as is the common case)? Regardless, the proof is still wrong if the model is allowed unbounded context, since an infinite product of positive numbers less than 1 can still be nonzero. For example, if the probability of emitting another " 0" is at least $1 - \frac1{n^{1.001}}$ after already having emitted $n \ge 2$ copies of " 0", then the limiting probability is still nonzero, because $\sum_n \frac1{n^{1.001}}$ converges.
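To make the infinite-product point concrete, here is a minimal numerical sketch. The helper names `log_survival` and `survival` are hypothetical, and the exponent 2 is a stand-in I chose for illustration: with the exponent 1.001 from the comment the limit is still positive, but it is extremely small and the partial products converge far too slowly to show numerically. The product starts at $n = 2$ because the $n = 1$ factor $1 - \frac1{1^p}$ would be exactly 0. With exponent 2 the partial products telescope to $\frac{N+1}{2N} \to \frac12$, while with exponent 1 they telescope to $\frac1N \to 0$.

```python
import math


def log_survival(p: float, n_max: int) -> float:
    """Log of prod_{n=2}^{n_max} (1 - 1/n**p).

    Interpreted as the log-probability of continuing to emit the same
    token, if the continuation probability after n copies is 1 - 1/n**p.
    Summing logs avoids underflow for long products.
    """
    return sum(math.log1p(-1.0 / n**p) for n in range(2, n_max + 1))


def survival(p: float, n_max: int) -> float:
    """Partial product prod_{n=2}^{n_max} (1 - 1/n**p)."""
    return math.exp(log_survival(p, n_max))


if __name__ == "__main__":
    for n_max in (10, 1_000, 100_000):
        # p = 2: sum of 1/n**2 converges, so the product stays bounded away from 0.
        # p = 1: sum of 1/n diverges, so the product tends to 0.
        print(n_max, survival(2.0, n_max), survival(1.0, n_max))
```

The general fact being illustrated is that, for $0 \le a_n < 1$, the product $\prod_n (1 - a_n)$ is positive exactly when $\sum_n a_n$ converges.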
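But if the model has a finite context and ends in a softmax, then I think there is some minimum probability of transitioning to any given token (there are only finitely many possible contexts, and each one gives every token positive probability), and then the proposition is true. Maybe that was implicitly assumed?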
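A rough sketch of why a uniform minimum probability is enough, under assumptions that are mine rather than the post's: all logits lie in $[-B, B]$ (which holds for some $B$ when there are finitely many contexts and a finite vocabulary of size $V$), so every softmax output is at least $\frac{e^{-2B}}{V}$. The function names below are hypothetical.

```python
import math


def min_softmax_prob(logit_bound: float, vocab_size: int) -> float:
    """Lower bound on any softmax output when every logit lies in
    [-logit_bound, logit_bound] (an assumed bound, not from the post).

    Worst case: the token's own logit is -B and all others are +B, giving
    exp(-B) / (exp(-B) + (V - 1) * exp(B)) >= exp(-2B) / V.
    """
    return math.exp(-2.0 * logit_bound) / vocab_size


def repeat_prob_upper_bound(eps: float, steps: int) -> float:
    """Upper bound on emitting the same token `steps` times in a row,
    if at every step some other token has probability at least eps."""
    return (1.0 - eps) ** steps


if __name__ == "__main__":
    eps = min_softmax_prob(logit_bound=1.0, vocab_size=10)
    for steps in (10, 1_000, 10_000):
        # The bound decays geometrically, so it tends to 0 as steps grows.
        print(steps, repeat_prob_upper_bound(eps, steps))
```

Because the bound $(1 - \varepsilon)^k$ goes to 0 for any fixed $\varepsilon > 0$, the counterexample from the unbounded-context case cannot occur here.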