As you probably know, there are multiple theoretically-interesting ML ideas that achieve very good results on MNIST. Have you tried more challenging image recognition benchmarks, such as CIFAR-100, or some non-CV benchmark? Since you posted your code, I wouldn’t mind spending a bit of time looking over what you’ve accomplished. However, MNIST, which is now considered pretty much a toy benchmark (I don’t consider PI-MNIST to be a better benchmark), will likely be an obstacle to get others to also look at it in-depth, as it will be considered quite preliminary. Another practical point: using C and CUDA kernels also makes it less accessible to a good percentage of researchers.
I have deliberately removed everything relating to geometry from this presentation.
It took 12 years (1986-1998), and who knows how much research effort, to go from BP/SGD to convolutions.
This is a one-man effort, on my own personal time (20,000 hours over the past 6 years), that I am giving away for the community to freely take over. I am really sorry if it is not enough. That is their choice.
It is not an add-on to something that exists, but a complete restart. One thing at a time.
As for CUDA: if you have a lot of threads, it is bearable, and you can use old, cheap GPUs with very little loss (GPUs have recently been optimised for FP multiply/add, at the expense of integer ADD, which is what this code uses).
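To be concrete, here is the kind of kernel this amounts to (a sketch of the idea only, with made-up names, not the actual repository code): one thread per neuron, and a hot loop that is pure integer accumulation.

    // Sketch only (hypothetical names, not the repository code): the kind of
    // integer-accumulation kernel described above. One thread per neuron;
    // the hot loop is pure INT ADD, which old GPUs handle with little loss.
    __global__ void accumulate_potentials(const int *weights,  // n_neurons * n_synapses
                                          const int *active,   // 0/1 flag per synapse
                                          int *potential,      // n_neurons
                                          int n_neurons, int n_synapses)
    {
        int n = blockIdx.x * blockDim.x + threadIdx.x;
        if (n >= n_neurons) return;
        int sum = 0;
        for (int s = 0; s < n_synapses; s++)
            if (active[s])
                sum += weights[n * n_synapses + s];  // integer ADD, no FP
        potential[n] = sum;
    }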
FYI, I got >99.3% just by adding a fixed layer of simple preset filters (no augmentation), and the idea behind it can readily be extended. You can also train convolutions, unsupervised.
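By "a fixed layer of simple preset filters" I mean something in the spirit of this sketch (the filter values here are the usual edge detectors, picked for the example, not the exact ones I used): a small, untrained filter bank whose outputs feed the network as extra inputs.

    /* Illustrative sketch: a fixed, untrained bank of simple 3x3 filters
     * applied to a 28x28 MNIST image. The values are standard edge
     * detectors, chosen for the example; they are never trained. */
    #define W  28
    #define NF 4

    static const int filters[NF][3][3] = {
        {{-1,-1,-1},{ 0, 0, 0},{ 1, 1, 1}},  /* horizontal edge */
        {{-1, 0, 1},{-1, 0, 1},{-1, 0, 1}},  /* vertical edge   */
        {{ 0, 1, 1},{-1, 0, 1},{-1,-1, 0}},  /* diagonal /      */
        {{ 1, 1, 0},{ 1, 0,-1},{ 0,-1,-1}},  /* diagonal \      */
    };

    /* out must hold NF * (W-2) * (W-2) ints */
    void preset_filter_layer(const unsigned char img[W][W], int *out)
    {
        int k = 0;
        for (int f = 0; f < NF; f++)
            for (int y = 1; y < W - 1; y++)
                for (int x = 1; x < W - 1; x++) {
                    int sum = 0;
                    for (int dy = -1; dy <= 1; dy++)
                        for (int dx = -1; dx <= 1; dx++)
                            sum += filters[f][dy+1][dx+1] * img[y+dy][x+dx];
                    out[k++] = sum;  /* fixed feature, never trained */
                }
    }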
I didn’t mean “CUDA kernels” as in requiring NVIDIA GPUs—that’s fine. I meant that you’re limiting the readability of your code to a subset of people who understand both ML and CUDA programming. In my experience, this limits the reach, especially among younger researchers (I’ve hired C++ programmers and ML researchers for my business).
But, of course, you can choose to promote (or not) your work however you prefer.
The only function that is implemented in CUDA is the test one (test_gpu).
It is also implemented, identically, for the CPU (test_mt_one).
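Schematically, the two versions share the same per-sample computation and differ only in how it is dispatched (this is a sketch of the structure, not the actual test_gpu / test_mt_one code):

    /* Structural sketch only -- not the actual test_gpu / test_mt_one. */

    /* Shared per-sample work, compiled for both host and device. */
    __host__ __device__ static int score_sample(const int *w, const int *s, int len)
    {
        int sum = 0;
        for (int j = 0; j < len; j++)
            if (s[j]) sum += w[j];  /* placeholder scoring */
        return sum;
    }

    /* GPU version: one thread per test sample. */
    __global__ void test_gpu_sketch(const int *weights, const int *samples,
                                    int *scores, int n, int len)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) scores[i] = score_sample(weights, samples + i * len, len);
    }

    /* CPU version: each worker thread takes a slice [lo, hi) of the samples. */
    void test_mt_sketch(const int *weights, const int *samples,
                        int *scores, int lo, int hi, int len)
    {
        for (int i = lo; i < hi; i++)
            scores[i] = score_sample(weights, samples + i * len, len);
    }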
Everything that matters is (I hope) clearly explained in the text. It is simple enough that its reach is not limited to ML researchers; it is clearly within that of a lot of coders. The IT revolution started when amateurs got PCs.
In this version of the code, I had to make a tradeoff between completeness, usability, and practicality. Write your own code if you like; the code does not matter. The concept does.
The (upcoming) website will give separate, readable versions. I am waiting to get a proper idea of what is wanted before I do that, so thank you for that input.