I’m more active on Twitter than LW/AF these days: https://twitter.com/DavidSKrueger
Bio from https://www.davidscottkrueger.com/:
I am an Assistant Professor at the University of Cambridge and a member of Cambridge’s Computational and Biological Learning lab (CBL). My research group focuses on Deep Learning, AI Alignment, and AI Safety. I’m broadly interested in work (including in areas outside of Machine Learning, e.g. AI governance) that could reduce the risk of human extinction (“x-risk”) resulting from out-of-control AI systems. Particular interests include:
Reward modeling and reward gaming
Aligning foundation models
Understanding learning and generalization in deep learning and foundation models, especially via “empirical theory” approaches
Preventing the development and deployment of socially harmful AI systems
Elaborating and evaluating speculative concerns about more advanced future AI systems
OK, so it’s not really just your results? You are aggregating across these studies (and presumably ones of “Westerners” as well)? I do wonder how directly comparable things are… Did you make an effort to translate a study or its questions across studies, or were the questions independently conceived and formulated?