I feel we are repeating ourselves, which may mean this discussion has reached the end of its usefulness. So let me address just what I see as the most important points:
You are assuming that human morality is something which can be specified by a set of exact decision theory equations, or at least roughly approximated by such. I am saying that there is no reason to believe this, especially given that we know that is not how the human mind works. There are cases (like turbulence) where we know the underlying governing equations, but still can’t make predictions beyond a certain threshold. It is possible that human ethics work the same way—that you can’t write down a single utility function describing human ethics as separate from the operation of the brain itself.
In other words, you believe that human morality is fundamentally simple, and we know more than enough details of it to specify it in morality-space to within a small tolerance? That seems likely to be the main disagreement between you and Eliezer & crowd.
I’m not sure how you came to that conclusion, as my position is quite the opposite: I suspect that human morality is very, very complex. So complex that it may not even be possible to construct a model of human morality short of emulating a variety of human minds. In other words, morality itself is AI-hard or worse.
If that were true, MIRI’s current strategy is a complete waste of time (and a waste of human lives in opportunity cost, as smart people are persuaded not to work on AGI).
You are assuming that human morality is something which can be specified by a set of exact decision theory equations, or at least roughly approximated by such.
No, I’m not. At least, not by a human: an AI could work out a human’s implicit utility function, but it would be extremely long and complicated.
There are cases (like turbulence) where we know the underlying governing equations, but still can’t make predictions beyond a certain threshold.
Human morality is a difficult thing to predict. If you build your AI the same way, it will also be difficult to predict. They will not end up being the same.
If human morality is too complicated for an AI to understand, then let it average over the possibilities. Or at least let it guess. Don’t tell it to come up with something on its own. That will not end well.
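A minimal sketch of what "averaging over the possibilities" could mean: score each action by its expected utility under a probability distribution over candidate utility functions, rather than letting the AI invent a single guess. The candidate functions, weights, and action names below are all invented for illustration.

```python
# Hypothetical candidate utility functions the AI considers plausible,
# each paired with the probability that it captures human morality.
candidate_utils = [
    (0.5, lambda a: {"help": 1.0, "ignore": 0.0, "harm": -2.0}[a]),
    (0.3, lambda a: {"help": 0.8, "ignore": 0.2, "harm": -1.0}[a]),
    (0.2, lambda a: {"help": 0.6, "ignore": 0.5, "harm": -3.0}[a]),
]

def expected_utility(action):
    """Weight each candidate theory's verdict by its probability."""
    return sum(p * u(action) for p, u in candidate_utils)

# Choose the action that does best on average across the candidates.
best = max(["help", "ignore", "harm"], key=expected_utility)
print(best)  # "help" scores highest under every candidate, so it wins
```

The point of the design is that the AI never commits to one guess: an action that is catastrophic under even one high-probability candidate (like "harm" here) gets penalized in the average.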
I’m not sure how you came to that conclusion
It was the line:
what we normally think of as human morals is not very compressed, so specifying many of them inconsistently and leaving a few out would still have a high likelihood of resulting in an acceptable moral value function.
In order for this to work, whatever statements we make about our morality must have more information content than morality itself. That is, we not only describe all of our morality, we repeat ourselves several times. Sort of like how, if you want to describe gravity and you give the position of a falling ball at fifty points in time, there’s significantly more information in there than you need to describe gravity, so you can work out the law of gravity from just that data.
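The falling-ball analogy can be sketched numerically. Assuming made-up observation times and noise levels, fifty redundant data points overdetermine the one parameter (g) needed to state the law, so a least-squares fit recovers it despite per-point error:

```python
import numpy as np

g_true = 9.81                      # the one-parameter "law" we hope to recover
t = np.linspace(0.0, 3.0, 50)      # 50 observation times (invented for the sketch)
rng = np.random.default_rng(0)
# Noisy observed drop distances: y = (1/2) g t^2 plus measurement error.
y = 0.5 * g_true * t**2 + rng.normal(0.0, 0.05, size=t.shape)

# Least-squares fit of the single coefficient a in y = a*t^2;
# the redundancy in 50 points averages the noise away.
a = np.sum(y * t**2) / np.sum(t**4)
g_est = 2.0 * a
print(g_est)  # close to 9.81 despite noise in every individual point
```

This is the "more information than the law itself" situation: the data overdetermine the law, so errors cancel. The worry in the surrounding argument is the opposite case, where the law (morality) has more parameters than our statements about it pin down.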
If our morality is complicated, then specifying many of our morals only approximately would result in the AI finding some point in morality space that’s a little off in every area we specified, and completely off in all the areas we forgot about.
If that were true, MIRI’s current strategy is a complete waste of time
Their strategy is not to figure out human morality and explicitly program that into an AI. It’s to find some way of saying “figure out human morality and do that” that’s not rife with loopholes. Once they have that down, the AI can emulate a variety of human minds, or do whatever it is it needs to do.