Ok. But don’t you think “reverse engineering human instincts” is a necessary part of the solution?
My intuition is that value is fragile, so we need to specify it. If we want to specify it correctly, either we learn it or we reverse engineer it, no?
I don’t know; I don’t have a coherent idea for a solution. Here’s one of my best ideas (it’s not great).
Yudkowsky split up the solutions in his post; see point 24. The first sub-bullet there is about inferring human values.
Maybe someone else will have different opinions.