Artyom Karpov comments on Inducing human-like biases in moral reasoning LMs

Artyom Karpov 4 Mar 2024 11:59 UTC
2 points
Thanks for your comment. This was weeks, if not months, of hard work for us. Unfortunately, this text doesn’t yet include the part about how we calculated the brain score, though you can find it in our code, which should match the way others calculate it (see our references). The models with ‘none’ fine-tuning have a somewhat higher brain score, but the difference is within the error range of the other models, partly because we didn’t run enough calculations to reduce the std for ‘none’. Also, our main target was accuracy on the ETHICS dataset.