Most artefacts today don’t need many human values to function reasonably safely. The liquidiser needs to know to stop spinning when you take the lid off—but that’s about it.
Cars make a good example of a failure to program in human values: they respect the values of both drivers and pedestrians poorly, because they are too stupid to know how to behave.
As cars get smarter, their understanding of driver and pedestrian values seems likely to improve dramatically. Here, one of the primary functions of the machine brains will be to respect human values better: drivers do not like crashing into other vehicles or pedestrians any more than they like being lost. It is also a case where the programming is relatively complex and difficult.
There’s a reasonable case to be made that this kind of thing is the rule rather than the exception, in which case any diverting of funds away from rapidly developing machine intelligence would fairly directly cause harm.
Absolutely agreed that whatever instrumental human values we think about explicitly enough to encode into our machines (like not killing passengers or pedestrians while driving from point A to point B), or that are implicit enough in the task itself that optimizing for that task will necessarily implement them as well (like not crashing and exploding between A and B), will most likely be instantiated in machine intelligence as we develop it.
Agreed that if that’s the rule rather than the exception—that is, if all or almost all of the things we care about are either things we understand explicitly or things that are implicit in the tasks we attempt to optimize—then building systems that attempt to optimize those things, with explicit safety features, is likely to alleviate more suffering than it causes.
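To make that distinction concrete, here is a minimal sketch of a toy route-scoring objective in Python. It is not anyone's actual driving software, and the Route and route_cost names are hypothetical: the point is only that the pedestrian penalty is an explicitly encoded value, while "don't crash" is implicit in the task, since a crashed car never completes the trip.

```python
# A minimal sketch (not from the original discussion) of the distinction drawn
# above: some values are encoded explicitly as constraints or penalties, while
# others fall out of the task objective itself. All names here are hypothetical.

from dataclasses import dataclass

@dataclass
class Route:
    travel_time: float          # minutes from A to B
    pedestrian_conflicts: int   # predicted near-misses with pedestrians
    crashes: bool               # whether the route ends in a collision

def route_cost(route: Route) -> float:
    """Lower is better. Shows two ways a value can enter the objective."""
    cost = route.travel_time

    # Explicitly encoded value: we decided to penalise endangering pedestrians,
    # so it appears as a hand-written term in the objective.
    cost += 1_000.0 * route.pedestrian_conflicts

    # Implicit value: a route that ends in a crash never delivers the passenger,
    # so "don't crash" is enforced by the task itself, not by a separate rule.
    if route.crashes:
        return float("inf")

    return cost

# Choosing the best of several candidate routes:
candidates = [
    Route(travel_time=12.0, pedestrian_conflicts=0, crashes=False),
    Route(travel_time=9.0, pedestrian_conflicts=2, crashes=False),
    Route(travel_time=7.0, pedestrian_conflicts=0, crashes=True),
]
best = min(candidates, key=route_cost)
print(best)  # the 12-minute route wins: fast-but-risky options are priced out
```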