Why Almost Zero Temperature?
I have finally gained a better understanding of why my almost-zero temperature settings cannot actually be set to zero. This also explains why playground environments that claim to allow setting the temperature to zero most likely do not apply a true zero: the graphical user interface merely displays it as zero.
softmax(xi) = exp(xi/T) / sum(exp(xj/T))
In the softmax function with temperature T shown above, it is not possible to input a value of zero for T, because every exponent would require dividing by zero, which results in an error.
As also explained in this post: https://www.baeldung.com/cs/softmax-temperature
When T = 1, the output distribution will be the same as a standard softmax output. The higher the value of T, the “softer” the output distribution will become. For example, if we wish to increase the randomness of the output distribution, we can increase the value of the parameter T.
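To make the role of T in softmax(xi) = exp(xi/T) / sum(exp(xj/T)) concrete, here is a minimal sketch in plain Python. The helper name softmax_t and the example logits are made up for illustration and do not come from any particular library. The helper subtracts the largest logit before exponentiating, which is a common stability trick and does not change the result.

```python
import math

def softmax_t(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats (illustrative helper)."""
    # Subtracting the max logit leaves the probabilities unchanged
    # but keeps exp() from overflowing for small temperatures.
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical example values

# Standard softmax without any temperature, written out directly.
denom = sum(math.exp(x) for x in logits)
plain = [math.exp(x) / denom for x in logits]

standard = softmax_t(logits, 1.0)  # T = 1: should match the plain softmax
soft = softmax_t(logits, 5.0)      # higher T: flatter, more random
sharp = softmax_t(logits, 0.5)     # lower T: more peaked

print(plain)
print(standard)
print(soft)
print(sharp)
```

With T = 1 the two lists agree up to floating-point rounding, while T = 5 spreads the probability mass more evenly and T = 0.5 concentrates it on the largest logit.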
So, the standard softmax function without temperature:
softmax(xi) = exp(xi) / sum(exp(xj))
is the same as a softmax function with a temperature of 1:
softmax(xi) = exp(xi/1) / sum(exp(xj/1))
For the experiments I am conducting, it is impossible to input zero as a value (again, this is very different from what playground environments show). To achieve an effectively deterministic output, an almost-zero temperature is the ideal setting, for example a temperature of 0.000000000000001.
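The difference between a true zero and an almost-zero temperature can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper softmax_t, the logits, and the value 1e-15 are chosen for the example and do not reflect how any specific playground or API implements temperature.

```python
import math

def softmax_t(logits, temperature):
    """Temperature-scaled softmax over a list of floats (illustrative helper)."""
    # Subtract the max logit so exp() only ever sees values <= 0.
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical example values

# Almost-zero temperature: all probability mass lands on the largest logit.
near_zero = softmax_t(logits, 1e-15)
print(near_zero)

# Exactly zero temperature: the division inside the exponent fails.
try:
    softmax_t(logits, 0.0)
    zero_worked = True
except ZeroDivisionError as err:
    zero_worked = False
    print("T = 0 failed:", err)
```

In this sketch the almost-zero temperature behaves like picking the highest-scoring entry every time, which is the deterministic behavior I am after, while T = 0 cannot be evaluated by the formula at all.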