Currently, we have zero concrete feedback about which strategies can effectively align complex systems of equal or greater intelligence to humans.
Actually, I now suspect this is to a significant extent disinformation. You can tell when ideas make sense if you think hard about them. There is plenty of feedback about the questions of alignment, at the level of “abstract, high-level philosophy of mind”, that is not already being taken advantage of.