Let X_A be a random variable whose value is a model with bilinear layers, trained from a random initialization on training data A. I would then like to know whether various estimated upper bounds on various entropies of X_A are much lower than they would be if X_A were a more typical machine learning model in which a linear layer is composed with a ReLU. Entropy seems like a good objective measure of a lack of decipherability.
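As a rough illustration of the kind of estimate involved, here is a minimal, hypothetical sketch. Everything in it is an assumption rather than something specified above: the bilinear layer is taken to be the elementwise product (W1 x) * (W2 x), the training data A is a small synthetic regression set, X_A is sampled by retraining from independent random seeds, and the "upper bound" is the crude sum-of-marginal-Gaussian-entropies bound H(X_A) <= sum_i 0.5 * log(2*pi*e * Var(theta_i)) over the flattened parameters.

```python
# Hypothetical sketch: compare a crude entropy upper bound for a bilinear
# model vs. a linear+ReLU model, both trained on the same fixed data "A"
# from many independent random initializations. All modeling choices here
# are illustrative assumptions, not taken from the original text.
import math
import torch
import torch.nn as nn

def make_data(n=256, d_in=8, seed=0):
    # Fixed synthetic training data playing the role of "A".
    g = torch.Generator().manual_seed(seed)
    x = torch.randn(n, d_in, generator=g)
    y = torch.sin(x.sum(dim=1, keepdim=True))
    return x, y

class Bilinear(nn.Module):
    # One elementwise-bilinear hidden layer: (W1 x) * (W2 x).
    def __init__(self, d_in=8, d_h=16):
        super().__init__()
        self.w1 = nn.Linear(d_in, d_h, bias=False)
        self.w2 = nn.Linear(d_in, d_h, bias=False)
        self.out = nn.Linear(d_h, 1)
    def forward(self, x):
        return self.out(self.w1(x) * self.w2(x))

class ReluMLP(nn.Module):
    # The "more typical" baseline: linear layer composed with ReLU.
    def __init__(self, d_in=8, d_h=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, d_h), nn.ReLU(), nn.Linear(d_h, 1))
    def forward(self, x):
        return self.net(x)

def train(model, x, y, steps=500, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()
    # Return the trained model as one flat parameter vector.
    return torch.cat([p.detach().flatten() for p in model.parameters()])

def entropy_upper_bound(samples):
    # samples: (n_models, n_params) parameter vectors from independent seeds.
    # Joint differential entropy <= sum of marginal entropies, and each
    # marginal entropy <= that of a Gaussian with the same variance.
    var = samples.var(dim=0, unbiased=True).clamp_min(1e-12)
    return float(0.5 * torch.log(2 * math.pi * math.e * var).sum())

if __name__ == "__main__":
    x, y = make_data()
    for name, cls in [("bilinear", Bilinear), ("linear+ReLU", ReluMLP)]:
        thetas = []
        for seed in range(20):  # draws of X_A over random initializations
            torch.manual_seed(seed)
            thetas.append(train(cls(), x, y))
        bound = entropy_upper_bound(torch.stack(thetas))
        print(f"{name}: entropy upper bound (nats) ~ {bound:.1f}")
```

This particular bound ignores all correlations between parameters and is insensitive to symmetries such as neuron permutations, so it should be read only as one cheap stand-in for the "various estimated upper bounds" mentioned above; tighter estimators would be needed for a serious comparison.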