Make a “Friendly AI” programming challenge in a toy simulated society
(warning: this is pretty half-baked)
Step one (to prepare the contest) would be making interesting simulated societies, in which agents evolve social norms.
A simulated society would be a population of agents. Each turn, some of them would be picked at random to participate in a game; as a result, some agents may be killed or modified (tagged “evil”, “pregnant”, “hungry”, etc.), new agents may be created, and so on.
Agents would each have a set of “norms” (“in situation X, doing Y is good”) that together would effectively amount to the agent’s utility function. These norms would also work as the agents’ genome, so when new agents are created, their norms are derived from those of their parent(s).
In effect, this would amount to evolving the most appropriate utility function for the simulation’s specific rules. Several sets of rules could be devised to make different interesting societies.
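To make step one concrete, here is a minimal sketch of what the simulation could look like in Python. Everything in it (the Norm/Agent structures, the `play_game` hook, the mutation scheme) is an illustrative assumption, not a spec:

```python
import random
from dataclasses import dataclass, field

@dataclass
class Norm:
    situation: str   # e.g. "another agent is hungry"
    action: str      # e.g. "share food"
    goodness: float  # how good the agent judges doing `action` in `situation` to be

@dataclass(eq=False)
class Agent:
    norms: list[Norm] = field(default_factory=list)
    tags: set[str] = field(default_factory=set)  # "evil", "pregnant", "hungry", ...

    def utility(self, situation: str, action: str) -> float:
        # The norms jointly act as the agent's utility function.
        return sum(n.goodness for n in self.norms
                   if n.situation == situation and n.action == action)

def reproduce(parents: list[Agent], mutation_rate: float = 0.05) -> Agent:
    # Norms double as the genome: the child inherits the norms of a randomly
    # chosen parent, with occasional mutation of the goodness values.
    child_norms = []
    for norm in random.choice(parents).norms:
        goodness = norm.goodness
        if random.random() < mutation_rate:
            goodness += random.gauss(0, 0.1)
        child_norms.append(Norm(norm.situation, norm.action, goodness))
    return Agent(norms=child_norms)

def run_turn(population: list[Agent], play_game) -> list[Agent]:
    # Each turn a random subset of agents plays a game; the game's rules decide
    # who is killed, tagged, or born. `play_game` is where a specific society's
    # rules live, so different rule sets give different societies.
    players = random.sample(population, k=min(4, len(population)))
    survivors, newborns = play_game(players)
    bystanders = [a for a in population if a not in players]
    return bystanders + survivors + newborns
```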
Step two would be to have a contest where programmers have to write a “God” program that would be analogous to a superhuman AI in our world. I’m not quite sure of the best way to do that, but in all cases the program should be given some of the evolved norms as input.
One possibility would be to have the “God” program be a utility function; the whole simulated world would then be optimized according to that utility function.
Another would be to have the “God” program be an agent that participates in all games, with many more choices available to it than ordinary agents have.
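A rough sketch of what those two interfaces might look like, building on the Agent sketch above (the names and the specific scoring rules are illustrative assumptions, not a spec):

```python
from typing import Callable

# Possibility 1: the contestant's "God" program is a utility function over world
# states, and the simulation engine (not shown) optimizes the world toward it.
WorldState = list[Agent]                    # assumption: a world is just its population
GodUtility = Callable[[WorldState], float]

def example_god_utility(world: WorldState) -> float:
    # Hypothetical entry: in the actual contest this score should be built from the
    # evolved norms handed to the program; here it just rewards fewer hungry agents.
    return sum(1.0 for agent in world if "hungry" not in agent.tags)

# Possibility 2: the contestant's "God" program is an agent that joins every game
# and has many more actions available than ordinary agents do.
class GodAgent(Agent):
    def choose_action(self, situation: str, available_actions: list[str]) -> str:
        # Hypothetical decision rule: pick whichever action the evolved norms
        # (supplied to the God program as its own `norms`) rate most highly.
        return max(available_actions, key=lambda a: self.utility(situation, a))
```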
A twist could be that the “God” program is given only the norms on which some agents disagree; those on which everybody always agrees “go without saying” (such as “rocks don’t count as sentient beings”). Another is that the same “God” program would be used in simulations with different rules.
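One way the “contested norms only” twist could be implemented (a sketch; the disagreement threshold is an arbitrary assumption):

```python
from collections import defaultdict

def contested_norms(population: list[Agent], tolerance: float = 0.1) -> list[tuple[str, str]]:
    # Gather every (situation, action) pair and the goodness values agents assign to it,
    # then keep only the pairs on which agents meaningfully disagree. Norms everyone
    # agrees on ("rocks don't count as sentient beings") are withheld from the God
    # program and "go without saying".
    ratings: dict[tuple[str, str], list[float]] = defaultdict(list)
    for agent in population:
        for norm in agent.norms:
            ratings[(norm.situation, norm.action)].append(norm.goodness)
    return [pair for pair, values in ratings.items()
            if max(values) - min(values) > tolerance]
```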
(A big problem with this is that in the simulations, unlike in reality, agents are ontologically basic objects, so a lot of “dreams of friendly AI” would actually work in the simulation. Still, it should be possible to devise a simulation where the “God” program doesn’t have access to the agents as ontologically basic objects.)
A contest like that may allow people to realize that some of their ideas on value extrapolation etc. do not work.