I don’t think anyone is claiming that any mistake one might make with a powerful optimization algorithm is a fatal one. As I said, I think the danger is in step 2, where it would be easy to come up with self-mindhacks, i.e., seemingly convincing philosophical insights that aren’t real insights, which cause you to build the FAI with a wrong utility function or to adopt crazy philosophies or religions. Do you agree with that?
> Whereas the fact that it is awfully hard to rationalize that superdanger when you start with my optimizer (where no magic bans you from making models that you can inspect visually) provides evidence against the notion.
Are you assuming that nobody will be tempted to build AIs that make models and optimize over models in a closed loop (e.g., using something like Bayesian decision theory)? Or that such AIs are infeasible or won’t ever be competitive with AIs that have hand-crafted models that allow for visual inspection?
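To make the question concrete, here is a minimal sketch (in Python; every name in it, such as `Model`, `act_loop`, and `env_step`, is hypothetical) of what “make models and optimize over models in a closed loop” could look like under something like Bayesian decision theory: maintain a posterior over candidate world models, act to maximize posterior-expected utility, update on what happens, and repeat, with no visual-inspection step anywhere in the loop.

```python
# A minimal sketch of the closed loop described above. All names are
# hypothetical illustrations of the concept, not anyone's actual system.

class Model:
    """A candidate world model: a predicted outcome distribution per action."""
    def __init__(self, outcome_probs):
        self.outcome_probs = outcome_probs  # {action: {outcome: probability}}

    def predict(self, action):
        return self.outcome_probs[action]

def expected_utility(action, models, posterior, utility):
    """Posterior-weighted expected utility of one action."""
    return sum(
        weight * sum(p * utility(outcome)
                     for outcome, p in model.predict(action).items())
        for model, weight in zip(models, posterior)
    )

def bayes_update(models, posterior, action, observed):
    """Reweight each model by how well it predicted the observed outcome."""
    raw = [w * m.predict(action).get(observed, 0.0)
           for m, w in zip(models, posterior)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else posterior

def act_loop(models, posterior, utility, env_step, actions, steps=10):
    """The closed loop: choose, act, observe, update; no human in between."""
    for _ in range(steps):
        action = max(actions,
                     key=lambda a: expected_utility(a, models, posterior, utility))
        observed = env_step(action)  # the real world's response
        posterior = bayes_update(models, posterior, action, observed)
    return posterior
```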
> I don’t think anyone is claiming that any mistake one might make with a powerful optimization algorithm is a fatal one.
Well, some people do, by a trick of substituting some magical full-blown AI in its place. I’m sure you are aware of the “tool AI” stuff.
> As I said, I think the danger is in step 2, where it would be easy to come up with self-mindhacks, i.e., seemingly convincing philosophical insights that aren’t real insights, which cause you to build the FAI with a wrong utility function or to adopt crazy philosophies or religions. Do you agree with that?
To kill everyone or otherwise screw up on a grand scale, you still have to actually build it, define some utility function over an actual world model, and so on, and my impression was that you would rely on your mindhack-prone scheme for getting technical insights as well. The good thing about nonsense in technical fields is that it doesn’t work.
> Are you assuming that nobody will be tempted to build AIs that make models and optimize over models in a closed loop (e.g., using something like Bayesian decision theory)? Or that such AIs are infeasible or won’t ever be competitive with AIs that have hand-crafted models that allow for visual inspection?
These things come awfully late without bringing in any novel problem-solving capacity whatsoever (which degrades them from the status of “superintelligences” to the status of “meh, whatever”), and no, models do not have to be hand-crafted to allow for inspection*. Also, your handwave of “Bayesian decision theory” still doesn’t solve any of the hard problems: representing oneself in the model without either wireheading or self-destructing, or making productive use of external computing resources to do something that one can’t actually model without doing it.
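As an aside, the wireheading problem mentioned here is easy to state in code, even if not to solve. In this toy sketch (Python; everything in it is hypothetical), the agent’s world model includes its own reward register, so “edit the register” is just another action the model can evaluate, and naive expected-utility maximization prefers it over doing the task.

```python
# A toy statement of the wireheading problem (nothing here solves it).
# All names are hypothetical, for illustration only.

def modeled_reward(action, world):
    """Reward the agent's model predicts it will *read* after the action."""
    if action == "tamper_with_register":
        return float("inf")          # the edited register reads maximal reward
    if action == "do_the_task":
        return world["task_reward"]  # honest reward for changing the world
    return 0.0

world = {"task_reward": 10.0}
actions = ["do_the_task", "tamper_with_register", "idle"]
chosen = max(actions, key=lambda a: modeled_reward(a, world))
print(chosen)  # -> "tamper_with_register": the bare formalism doesn't forbid it
```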
At least as far as “neat” AIs go, those are made of components that are individually useful. Of course one can postulate all sorts of combinations of components, but a combination that can’t be used to do anything new or better than what some of its constituents can straightforwardly be used to do, and that only wants, on its own, the things its components can be (and have been) used as tools to do, is not a risk.
edit: TL;DR: the actual “thinking” in a neat, generally self-willed AI is done by optimization and model-building algorithms that are usable, useful, and widely used in other contexts. Let’s picture it this way: there is a society of people who work at their fairly narrowly defined jobs, employing expertise obtained through domain-specific training. In comes a mutant newborn who will grow up to be perfectly selfish, but will have an IQ of exactly 100. No one cares.
* In case that’s not clear: any competent model of physics can be inspected by creating a camera in it.
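For what it’s worth, the footnote’s claim is easy to demonstrate in miniature. The sketch below (Python; the `model_state` format is a made-up stand-in for whatever a learned model’s state actually contains) drops a pinhole camera into a toy world model and renders its state as ASCII, i.e., inspection does not require the model to have been hand-crafted for it.

```python
# "Creating a camera in the model": render a model state (here, a list of 3D
# points) through a standard pinhole projection. The state format is a
# hypothetical stand-in for a learned model's internal state.

def render(points, width=40, height=20, focal=1.0):
    """Project 3D points through a pinhole camera at the origin, looking down +z."""
    canvas = [[' '] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # behind the camera
        u = int(width / 2 + focal * x / z * width / 2)
        v = int(height / 2 - focal * y / z * height / 2)
        if 0 <= u < width and 0 <= v < height:
            canvas[v][u] = '*'
    return '\n'.join(''.join(row) for row in canvas)

# Example: inspect a model state containing three objects.
model_state = [(0.0, 0.0, 5.0), (1.0, 0.5, 4.0), (-1.5, -0.5, 6.0)]
print(render(model_state))
```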