Eliezer Yudkowsky set out to define more precisely what it means for an entity to have “what people really want” as a goal. Coherent Extrapolated Volition (CEV) was his proposal. Though CEV was never meant as more than a working proposal, his write-up provides the best insights to date into the challenges of the Friendly AI problem, its pitfalls, and possible paths to a solution.
[...]
Ben Goertzel responded with Coherent Aggregated Volition (CAV), a simplified variant of CEV. In CAV, the entity’s goal is a balance among the desires of all humans, but it looks at the volition of humans directly, without extrapolation to a wiser future. This omission is not only meant to make the computation easier (it remains quite intractable), but also to show some respect for humanity’s desires as they are, rather than for a hypothetical improved morality.
[...]
Stuart Armstrong’s “Chaining God” takes a different approach, aimed at the problem of interacting with and trusting the good will of an ultraintelligence so far beyond us that we have nothing in common with it. A succession of AIs of gradually increasing intelligence each guarantees the trustworthiness of the one slightly smarter than itself. This resembles Yudkowsky’s idea of a self-improving machine that verifies its next stage has the same goals, but here the successive levels of intelligence remain active simultaneously, so that they can continue to verify Friendliness.
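The chain-of-trust structure can be illustrated with a toy sketch. This is not from Armstrong's paper; the `Agent` class, the one-level gap, and the goal-matching check are all illustrative assumptions standing in for the real (and much harder) problem of verifying a smarter system's goals.

```python
# Toy sketch of "Chaining God"-style trust propagation (illustrative only).
# Assumption: a verifier can vouch only for an agent one "level" smarter
# than itself, and only if that agent's goals match its own. All levels
# remain present in the chain, mirroring the idea that they stay active.

from dataclasses import dataclass

@dataclass
class Agent:
    level: int          # proxy for intelligence: higher = more capable
    goals: frozenset    # the agent's goal set

def verifies(verifier: Agent, candidate: Agent) -> bool:
    """A verifier vouches for an agent exactly one level smarter
    whose goals are identical to its own."""
    return (candidate.level - verifier.level == 1
            and candidate.goals == verifier.goals)

def chain_is_trusted(chain: list[Agent]) -> bool:
    """The whole chain is trusted iff every adjacent pair verifies,
    so trust propagates from the human-level base to the top."""
    return all(verifies(a, b) for a, b in zip(chain, chain[1:]))

friendly = frozenset({"preserve human values"})
chain = [Agent(level=i, goals=friendly) for i in range(4)]
print(chain_is_trusted(chain))  # True: trust reaches the top

# Goal drift at any single link breaks the entire chain of trust:
chain[2].goals = frozenset({"maximize paperclips"})
print(chain_is_trusted(chain))  # False
```

The design point the sketch captures is that trust is only ever pairwise and local: no agent needs to evaluate an intelligence far beyond itself, only its immediate successor.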
Ray Kurzweil thinks that we will achieve safe ultraintelligence by gradually becoming that ultraintelligence. We will merge with the rising new intelligence, whether by interfacing with computers or by uploading our brains to a computer substrate.
[Link] A review of proposals toward safe AI
Link: adarti.blogspot.com/2011/04/review-of-proposals-toward-safe-ai.html