A very simple task, like MNIST or CIFAR classification, but the final score is:
Score = Performance + λ · (number of nonzero weights)
where λ is a normalization factor chosen to make the tradeoff as interesting as possible. This is relevant to AI safety because a small and/or very sparse model is much more interpretable, and thus safer, than a large/dense one. You can work on this for a very long time, trying simple fully connected neural nets, CNNs, ResNets, transformers, autoencoders of any kind, and so on. If the task looks too easy you might change it to ImageNet classification, image generation, or anything else, depending on the skill level of the contestants.
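To make the scoring rule concrete, here is a minimal sketch in PyTorch. It assumes "Performance" is measured as test cross-entropy (lower is better, consistent with the numbers discussed later in the thread) and that weights below a small tolerance count as zero; the threshold, the λ value, and the tiny example model are illustrative choices, not part of the proposal.

```python
import torch
import torch.nn as nn

def count_nonzero_weights(model: nn.Module, tol: float = 1e-8) -> int:
    """Count parameters whose magnitude exceeds a small tolerance."""
    return sum(int((p.abs() > tol).sum()) for p in model.parameters())

def hackathon_score(test_cross_entropy: float, model: nn.Module, lam: float) -> float:
    """Score = Performance + lambda * (number of nonzero weights); lower is better."""
    return test_cross_entropy + lam * count_nonzero_weights(model)

# Illustrative use with a tiny fully connected MNIST classifier:
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 32), nn.ReLU(), nn.Linear(32, 10))
print(hackathon_score(test_cross_entropy=0.05, model=model, lam=1e-5))
```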
Is this like “have the hackathon participants do manual neural architecture search and train with L1 loss”?
In the simplest possible way to participate, yes, but a hackathon is made to elicit imaginative and novel ways to approach the problem (how? I do not know; it is the participants' job to find out).
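For reference, a minimal sketch of that baseline approach, ordinary cross-entropy training with an added L1 penalty that pushes many weights toward zero; the penalty coefficient is an illustrative value, not something specified in the thread.

```python
import torch

def l1_regularized_loss(logits, targets, model, l1_coeff=1e-4):
    """Cross-entropy plus an L1 penalty on all weights to encourage sparsity."""
    ce = torch.nn.functional.cross_entropy(logits, targets)
    l1 = sum(p.abs().sum() for p in model.parameters())
    return ce + l1_coeff * l1
```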
I like the idea.
But this already exists: https://github.com/ruslangrimov/mnist-minimal-model
Cool GitHub repository, thanks for the link.
I'm still thinking about this idea. We could try to do the same thing but on CIFAR-10. I do not know if it would be possible to construct the layers by hand.
On MNIST, for a network with 99 percent accuracy (LeNet, ~60k parameters), the cross-entropy is about 0.05.
If we take the formula: Score = CE + λ · log(number of nonzero parameters),
a good λ is equal to 100 (chosen to equalize the cross-entropy and the regularization terms).
In the MNIST minimal-number-of-weights competition, 99 percent accuracy is reached with about 2,000 weights, so λ is equal to 80.
If we want to stress the importance of sparsity, we can choose a λ equal to 300.
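A small sketch of this log variant of the score, evaluated at the two reference points mentioned above. The thread does not pin down the log base or whether the cross-entropy is averaged or summed over the test set, so both are left as explicit arguments here; the printed numbers are purely illustrative.

```python
import math

def log_variant_score(cross_entropy: float, n_nonzero: int, lam: float, base: float = 10.0) -> float:
    """Score = CE + lambda * log(number of nonzero parameters)."""
    return cross_entropy + lam * math.log(n_nonzero, base)

# Reference points from the discussion (illustrative only):
print(log_variant_score(0.05, 60_000, lam=100))  # LeNet-scale network
print(log_variant_score(0.05, 2_000, lam=80))    # minimal-weights competition entry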
Probably, even if not completely by hand, MNIST is simple enough that hybrid human-machine optimization could be possible, maybe with a UI where you can see, in (almost) real time, the effect on validation loss of changing a particular weight with a slider. I do not know whether it would be possible to improve the final score by changing the weights one by one. Or maybe the human could use intuitive visual knowledge to improve the convolutional filters.
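A minimal sketch of that "slider" idea: sweep a single chosen weight over a range of values and record the validation loss at each value, which is what an interactive UI would display in (almost) real time. The model, data loader, and the particular weight coordinate are all hypothetical placeholders.

```python
import copy
import torch

@torch.no_grad()
def sweep_single_weight(model, val_loader, param_name, index, values):
    """Return (value, validation cross-entropy) pairs for one weight coordinate."""
    results = []
    for v in values:
        trial = copy.deepcopy(model)
        # Set the chosen weight coordinate to the slider value.
        dict(trial.named_parameters())[param_name].view(-1)[index] = v
        total, n = 0.0, 0
        for x, y in val_loader:
            total += torch.nn.functional.cross_entropy(trial(x), y, reduction="sum").item()
            n += y.numel()
        results.append((v, total / n))
    return results
```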
On CIFAR this looks very hard to do manually, given that the dataset is much harder than MNIST.
I think that choosing λ too large is better than choosing it too small: if λ is too big, the results will still be interesting (which model architecture is best under extremely strong regularization?), while if it is too small you will just get a normal architecture that is slightly more regularized.