There are a number of ways to combine this approach with learning, but I haven’t had time to try any of them yet. Some ideas I have thought of:
Use hard-coded weights, plus some random noise, to initialize the weights of a transformer that you then train in the traditional fashion
Doesn’t really help with interpretability or alignment, but might (I'm not sure) help with performance
Write out all the weight and bias parameters as combinations of semes and outer products of semes, then learn seme embeddings by gradient descent
Semantic seme embeddings could be initialized from something like WordNet relationships, or learned with word2vec, so that they don't all have to be written by hand
You could do smallish amounts of gradient descent to suggest new rules to add, but then add them by hand
Still would be very slow
Perhaps it is possible to start with a strong learned transformer, gradually identify human-legible rules that it is using, and replace those specific parts with hard-coded weights
Could prove very difficult!
It seems almost certain to me that hard-coding weights would at least help us build the muscles needed to recognize what is going on, to the extent that we are able to
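To make the first two ideas a bit more concrete, here is a minimal NumPy sketch. Everything in it is an assumption made for illustration and has not been tried on a real transformer: the function names, the `(coeff, i, j)` rule format, the one-hot starting semes, and the squared-error toy loss that stands in for a real training loss are all hypothetical.

```python
import numpy as np


def noisy_init(hard_coded, noise_scale=0.01, seed=0):
    """Idea 1: hand-written weights plus small Gaussian noise as a starting point."""
    rng = np.random.default_rng(seed)
    return hard_coded + noise_scale * rng.standard_normal(hard_coded.shape)


def build_weight(semes, rules):
    """Idea 2: write a weight matrix as a sum of outer products of semes.

    semes: array of shape (num_semes, d), one embedding per seme.
    rules: list of (coeff, i, j), each contributing coeff * outer(semes[i], semes[j]).
    """
    d = semes.shape[1]
    W = np.zeros((d, d))
    for coeff, i, j in rules:
        W += coeff * np.outer(semes[i], semes[j])
    return W


def seme_gradient(semes, rules, grad_W):
    """Chain rule: turn dLoss/dW into dLoss/dsemes for the rule structure above."""
    grad = np.zeros_like(semes)
    for coeff, i, j in rules:
        grad[i] += coeff * (grad_W @ semes[j])
        grad[j] += coeff * (grad_W.T @ semes[i])
    return grad


def fit_semes(target_W, init_semes, rules, lr=0.05, steps=200):
    """Plain gradient descent on the semes only; the rules stay fixed and legible."""
    semes = init_semes.copy()
    losses = []
    for _ in range(steps):
        residual = build_weight(semes, rules) - target_W
        losses.append(0.5 * np.sum(residual ** 2))  # toy squared-error loss
        semes -= lr * seme_gradient(semes, rules, residual)
    return semes, losses


if __name__ == "__main__":
    # Hypothetical setup: three one-hot semes in a 4-dimensional space, two rules.
    true_semes = np.eye(3, 4)
    rules = [(1.0, 0, 1), (-0.5, 2, 2)]
    target_W = build_weight(true_semes, rules)

    # Start from perturbed semes and see whether descent recovers the target matrix.
    start = noisy_init(true_semes, noise_scale=0.3, seed=1)
    _, losses = fit_semes(target_W, start, rules)
    print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.6f}")
```

The point of this design is that only the seme embeddings are trainable. The list of rules, which is the human-readable part, never changes during training, so whatever gradient descent learns still has to be expressed through the hand-written structure.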