I think you might apply this to the hot neurons, i.e. the neurons that are used most often. Some recent models already handle those differently for speed reasons.