My mind keeps returning to exercises which could clarify parts of alignment, both for me and others. Some of them are obvious: think about what kind of proof you’d need to solve alignment and what type of objects it would have to talk about, then see whether that implies a maths oracle would make the problem easier. Or try to come up with a list of human values to build a utility function from, and see how it breaks down under greater optimization pressure.
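A toy version of that second exercise can be sketched in a few lines. Everything here is an assumption made up for illustration: the two values, the square-root utilities, the fixed budget and the greedy optimizer are hypothetical, not a claim about how real values or real optimizers behave. The point is only that a utility function built from an incomplete list can score better and better while the thing you actually care about gets worse.

```python
import math

# Hypothetical setup: a fixed budget of resources is split between one value
# that made it onto our list and one value we forgot to write down.
BUDGET = 10.0


def proxy_utility(listed):
    """The utility function we wrote down: it only sees the listed value."""
    return math.sqrt(listed)


def true_utility(listed, omitted):
    """What we actually care about: both values, with diminishing returns."""
    return math.sqrt(listed) + math.sqrt(omitted)


def optimize(steps, step_size=0.5):
    """Greedy optimizer: more steps means more optimization pressure.

    Each step moves resources from the omitted value to the listed one
    whenever that raises the proxy utility.
    """
    listed = BUDGET / 2  # start from an even split
    for _ in range(steps):
        candidate = min(BUDGET, listed + step_size)
        if proxy_utility(candidate) > proxy_utility(listed):
            listed = candidate
    return listed, BUDGET - listed


if __name__ == "__main__":
    for steps in (0, 2, 5, 10):
        listed, omitted = optimize(steps)
        print(
            f"steps={steps:2d}  proxy={proxy_utility(listed):.3f}  "
            f"true={true_utility(listed, omitted):.3f}"
        )
```

With no optimization the split stays even; as the number of steps grows, the proxy score climbs while the true score falls, because the optimizer drains the value that never made it onto the list.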
But what about new exercises, for skillsets I’ve never learnt? Well, there’s the security mindset, which I don’t have. I think it is about “trying to break everything you see”, so presumably I should either spend a bunch of time breaking things, or read the thoughts of people who deeply inhabit this perspective to pick up more tacit knowledge. For the former, I could do something like computer security exercises, e.g. reverse engineering with IDA Free (https://hex-rays.com/ida-free/). For the latter, I’ve heard “Silence on the Wire” is good: the author is supposedly a hacker’s hacker, and writes about solutions to security challenges which defy classification. Seeing solutions to complex, real-world problems is very important for developing expertise.
But I just had a better thought: wouldn’t watching someone hack something be a better example of the security mindset? You see the problem they’re tackling, guess where the flaws will be, and then check your guess against what they actually do. That’s the way to really acquire tacit knowledge. In fact, looking at the LockPickingLawyer’s channel is what kicked off this post. There, you can see every lock under the sun picked apart in minutes. Clearly, the expertise is believable. So maybe a good exercise for showing people that the security mindset exists, and perhaps for developing it, would be getting a bunch of these locks and their design specs, giving people some tools, and asking them to break the locks. Then show them how the LockPickingLawyer does it. Again, and again, and again.