This is the kind of issue that makes me skeptical about FAI. We can teach a decision algorithm to an AI, but can we teach the AI how to think about decision algorithms? How to decide whether one concept of fairness is better than another? Whether one interpretation of probability theory is better than another? Whether one Bayesian prior is better than another? To see the obviousness that coding conventions shouldn’t matter?
If we can’t, if we have to hard code all these answers into the AI, then that’s hardly safe, since it seems unlikely that we (i.e. ordinary humans) can achieve a high confidence in all of the answers. But as far as I know, no one has attempted or even proposed to formalize the process of philosophy as an algorithm.
Yes, we do think about these things, and not in an offhand after-work bull session either. The further you go back, the more compact the problem becomes.
What I’m saying is that if you can get the AI to choose between decision algorithms, then you can code a decision-algorithm-chooser that searches through many decision algorithms and picks the best one, instead of having to do all that work yourself. Like the difference between the amount of work put in by human chessplayers to build up the human skill of chess over centuries, versus the effort of the programmers to build Deep Blue to search the game tree for good strategies. It’s conceptually much harder work to find the source of the source of the source and build the seed of the seed of the seed, it’s more confusing labor, but if you don’t have a few centuries to spare, you had better look for the generator and not try to duplicate the output.
This merits a long response, please bear with me...
Deep Blue wasn’t built on the single insight of searching the game tree. It incorporates a whole lot of grandmaster-level advice from humans. Computers did not (and AFAIK cannot yet) regenerate the opening book. Also, perhaps more relevantly to AI, the evaluation function for positions is mostly designed by human master players, not inferred all the way backward from winning and losing positions.
One important insight like the minimax tree rarely takes you all the way to victory. We nibble away at the problem, taking out small digestible chunks. Game theory (or Bayesian statistics, or any other insight as of now) doesn’t explain all of human behavior, but it’s a first step without which further steps would’ve been that much harder.
See, I have this persistent idea—maybe stemming from my math upbringing—that the way to advancement of knowledge leads through small problems conclusively solved. That as beliefs should pay rent in predictions, so directions of inquiry should pay rent in small problems conclusively solved, building on one another. Failing to meet this standard is a pattern of FAIL that I’ve seen many, many times, and no doubt so did you. And even if you see your quest as reductionism, searching for the small computational core that would suffice to regenerate game theory and insights beyond, you’d still do well to heed the pattern.
A long quote from my textbook on game theory, Ken Binmore’s “Fun and Games”:
People who do not understand how human knowledge is advanced are often scornful of such claims. What is the point, they say, of studying a silly parlor game like Poker if the aim is to gain insight into the broad sweep of human affairs? I like to characterize such folk as naive positivists. They have seldom received any serious mathematical training. Otherwise they would have learned that difficult problems do not commonly succumb to frontal attacks. One has to lay siege to them. In graduate school, mathematicians are taught to look at simpler versions of their problems first. Examples help to rid the mind of misconceptions and clarify what the difficulties are. Such a search for insight may take one far afield. A political scientist may end up studying the Prisoners’ Dilemma. Economists may look at the literature on Poker models. Naive positivists will think their labors ridiculous. But this is because naive positivists do not understand that nobody ever solved a difficult problem who had not learned to solve simple problems first.
Maybe I should’ve made it a toplevel post, but it sounds too much like dancing about architecture. Better leave it in a comment and think of something mathy to post instead :-)
Interestingly, recent advances in computer Go have come from ignoring much human expertise and incorporating Monte Carlo approaches into the evaluation function.
It works like this:
1) Choose the move to be evaluated.
2a) Search the game tree by having players make random moves for many turns after the initial move is made.
2b) Score each randomly generated “far future” game state by using a very simple evaluation function, such as counting the number of stones on the board of each color.
3) Repeat steps 2a and 2b lots of times, generating lots of randomly played games.
4) Average the score of the randomly generated games. The result is the score for the move.
Very little human knowledge is incorporated into this method, but it has produced some of the strongest computer Go players so far.
Reason for the success of this method is propably something not all-that-interesting. The random sampling and evaluation method works because it allows using simple brute force and random chance against a problem that computer can’t (yet) handle otherwise. I am dan-player myself, and I have watched games where supercomputers with monte carlo -method and go professionals have dueled, and as far as I can tell, the go program makes repeated weird moves that seem to be doing something(hyperactive agent detector), but after a while, original idea is lost and the move becomes bad. I’d estimate that this defect could easily be used against the bot, but even with just good, honest moves, professionals did good against the bot.
My opinion, that I think is widely accepted(not sure though, haven’t been following this much lately), is that Monte Carlo method won’t produce the programs that beat humans. Reason it’s doing so well is not because the method is awesome, but because good go-players use so much of the potential of human brain, encorporating pattern-detection, strategy-planning, many different level of abstractions etcetc(all this effectively. That’s important point, I think), that our current know-how simply can’t match that at all. Which is why bots that try to simulate humans end up losing to moderately experienced(1 year) players even with multiple handicap stones.
Gonna edit/remake that post to make it clearer, but the point anyway was: go is difficult, current go programs do very poorly against even moderately skilled players, so even random sampling beats those attempts. But random sampling is not going to take go programs past human players, and I believe this hype settles as soon as some better decision-making algorhitms are invented.
It’s not clear to me that the decision-algorithm-chooser problem is more compact than the decision-algorithm problem. I think the right decision algorithm (or at least an AIXI-style idealized version of it) may be pretty short, but to choose between decision algorithms seems to involve most areas of philosophy.
ETA: Do you have the intuition that philosophy can be formalized, and in a compact way?
It’s not clear to me that the decision-algorithm-chooser problem is more compact than the decision-algorithm problem.
Perhaps they’re not different problems. If we solve the decision problem and then the algorithm just replicates itself when we tell it to decide between decision algorithms, this suggests that the philosophy “behind” was actually not behind at all. But one way or the other, you don’t want to be stuck with the problem of trying to choose according to a higher criterion using only human wisdom. A quining algorithm could plausibly represent having hit bottom. But if there are leftover dilemmas then you haven’t hit bottom. Then you want to step back and try to compact further.
ETA: Do you have the intuition that philosophy can be formalized, and in a compact way?
I have an intuition that the functional parts can be formalized in a compact way, and that the rest is our own confusion.
Perhaps they’re not different problems. If we solve the decision problem and then the algorithm just replicates itself when we tell it to decide between decision algorithms, this suggests that the philosophy “behind” was actually not behind at all. But one way or the other, you don’t want to be stuck with the problem of trying to choose according to a higher criterion using only human wisdom.
The set of quining algorithms is infinite, right? Restricting to the set of quining algorithms rules out many pathological cases, but using human wisdom to choose between them is still unavoidable.
But if there are leftover dilemmas then you haven’t hit bottom. Then you want to step back and try to compact further.
How will you know when there are really no leftover dilemmas? In other words, how can you tell that a candidate quining decision algorithm is free of errors? For example, you once thought that it was harmless for a decision algorithm to use the Solomonoff prior (which assumes that the universe is computable). (I hope I’ve convinced you that’s an error.) Such a decision algorithm can certainly be quining, so don’t we need a way to prevent this type of error in the future?
I have an intuition that the functional parts can be formalized in a compact way, and that the rest is our own confusion.
Human philosophy is error prone, but also error tolerant and correcting in a way that no decision algorithm that anyone has ever proposed is. If you give a typical decision algorithm the wrong prior, utility function, or implicit theory of fairness, it will happily go on and do its thing without complaint. Restricting to the set of quining algorithms doesn’t solve this problem. Human beings can somehow sense such errors and try to correct them. I think this ability needs to be understood and programmed into an AI before it can be considered safe.
These are old questions, and certainly they’ll need to be answered. There is no running away from having a good theory that explains why the take-off process can be trusted to do the right thing. Ideally, this theory will be as clear-cut as classical mechanics.
The data is in the process of humanity, but so is the right way of interpreting it. How much do you need to specify by hand to launch the process of extracting preference of humanity formally, that knows what to look for, and is able to find enough to continue looking for more? It’d take at least a technical theory of preference generalized to non-formal agents (of which the humanity is an instance) to answer that question.
This is the kind of issue that makes me skeptical about FAI. We can teach a decision algorithm to an AI, but can we teach the AI how to think about decision algorithms? How to decide whether one concept of fairness is better than another? Whether one interpretation of probability theory is better than another? Whether one Bayesian prior is better than another? To see the obviousness that coding conventions shouldn’t matter?
If we can’t, if we have to hard code all these answers into the AI, then that’s hardly safe, since it seems unlikely that we (i.e. ordinary humans) can achieve a high confidence in all of the answers. But as far as I know, no one has attempted or even proposed to formalize the process of philosophy as an algorithm.
Yes, we do think about these things, and not in an offhand after-work bull session either. The further you go back, the more compact the problem becomes.
Sorry, that’s too elliptical for me to understand. If you have any insights to the problem, I’d love to see a longer write-up.
What I’m saying is that if you can get the AI to choose between decision algorithms, then you can code a decision-algorithm-chooser that searches through many decision algorithms and picks the best one, instead of having to do all that work yourself. Like the difference between the amount of work put in by human chessplayers to build up the human skill of chess over centuries, versus the effort of the programmers to build Deep Blue to search the game tree for good strategies. It’s conceptually much harder work to find the source of the source of the source and build the seed of the seed of the seed, it’s more confusing labor, but if you don’t have a few centuries to spare, you had better look for the generator and not try to duplicate the output.
This merits a long response, please bear with me...
Deep Blue wasn’t built on the single insight of searching the game tree. It incorporates a whole lot of grandmaster-level advice from humans. Computers did not (and AFAIK cannot yet) regenerate the opening book. Also, perhaps more relevantly to AI, the evaluation function for positions is mostly designed by human master players, not inferred all the way backward from winning and losing positions.
One important insight like the minimax tree rarely takes you all the way to victory. We nibble away at the problem, taking out small digestible chunks. Game theory (or Bayesian statistics, or any other insight as of now) doesn’t explain all of human behavior, but it’s a first step without which further steps would’ve been that much harder.
See, I have this persistent idea—maybe stemming from my math upbringing—that the way to advancement of knowledge leads through small problems conclusively solved. That as beliefs should pay rent in predictions, so directions of inquiry should pay rent in small problems conclusively solved, building on one another. Failing to meet this standard is a pattern of FAIL that I’ve seen many, many times, and no doubt so did you. And even if you see your quest as reductionism, searching for the small computational core that would suffice to regenerate game theory and insights beyond, you’d still do well to heed the pattern.
A long quote from my textbook on game theory, Ken Binmore’s “Fun and Games”:
Maybe I should’ve made it a toplevel post, but it sounds too much like dancing about architecture. Better leave it in a comment and think of something mathy to post instead :-)
Interestingly, recent advances in computer Go have come from ignoring much human expertise and incorporating Monte Carlo approaches into the evaluation function.
It works like this:
1) Choose the move to be evaluated.
2a) Search the game tree by having players make random moves for many turns after the initial move is made.
2b) Score each randomly generated “far future” game state by using a very simple evaluation function, such as counting the number of stones on the board of each color.
3) Repeat steps 2a and 2b lots of times, generating lots of randomly played games.
4) Average the score of the randomly generated games. The result is the score for the move.
Very little human knowledge is incorporated into this method, but it has produced some of the strongest computer Go players so far.
Source: Wired
Reason for the success of this method is propably something not all-that-interesting. The random sampling and evaluation method works because it allows using simple brute force and random chance against a problem that computer can’t (yet) handle otherwise. I am dan-player myself, and I have watched games where supercomputers with monte carlo -method and go professionals have dueled, and as far as I can tell, the go program makes repeated weird moves that seem to be doing something(hyperactive agent detector), but after a while, original idea is lost and the move becomes bad. I’d estimate that this defect could easily be used against the bot, but even with just good, honest moves, professionals did good against the bot.
My opinion, that I think is widely accepted(not sure though, haven’t been following this much lately), is that Monte Carlo method won’t produce the programs that beat humans. Reason it’s doing so well is not because the method is awesome, but because good go-players use so much of the potential of human brain, encorporating pattern-detection, strategy-planning, many different level of abstractions etcetc(all this effectively. That’s important point, I think), that our current know-how simply can’t match that at all. Which is why bots that try to simulate humans end up losing to moderately experienced(1 year) players even with multiple handicap stones.
Gonna edit/remake that post to make it clearer, but the point anyway was: go is difficult, current go programs do very poorly against even moderately skilled players, so even random sampling beats those attempts. But random sampling is not going to take go programs past human players, and I believe this hype settles as soon as some better decision-making algorhitms are invented.
That’s the most fascinating thought I’ve read on LW for a long time. Thanks a lot!
It’s not clear to me that the decision-algorithm-chooser problem is more compact than the decision-algorithm problem. I think the right decision algorithm (or at least an AIXI-style idealized version of it) may be pretty short, but to choose between decision algorithms seems to involve most areas of philosophy.
ETA: Do you have the intuition that philosophy can be formalized, and in a compact way?
Perhaps they’re not different problems. If we solve the decision problem and then the algorithm just replicates itself when we tell it to decide between decision algorithms, this suggests that the philosophy “behind” was actually not behind at all. But one way or the other, you don’t want to be stuck with the problem of trying to choose according to a higher criterion using only human wisdom. A quining algorithm could plausibly represent having hit bottom. But if there are leftover dilemmas then you haven’t hit bottom. Then you want to step back and try to compact further.
I have an intuition that the functional parts can be formalized in a compact way, and that the rest is our own confusion.
The set of quining algorithms is infinite, right? Restricting to the set of quining algorithms rules out many pathological cases, but using human wisdom to choose between them is still unavoidable.
How will you know when there are really no leftover dilemmas? In other words, how can you tell that a candidate quining decision algorithm is free of errors? For example, you once thought that it was harmless for a decision algorithm to use the Solomonoff prior (which assumes that the universe is computable). (I hope I’ve convinced you that’s an error.) Such a decision algorithm can certainly be quining, so don’t we need a way to prevent this type of error in the future?
Human philosophy is error prone, but also error tolerant and correcting in a way that no decision algorithm that anyone has ever proposed is. If you give a typical decision algorithm the wrong prior, utility function, or implicit theory of fairness, it will happily go on and do its thing without complaint. Restricting to the set of quining algorithms doesn’t solve this problem. Human beings can somehow sense such errors and try to correct them. I think this ability needs to be understood and programmed into an AI before it can be considered safe.
Basic paradigm on which I operate: http://lesswrong.com/lw/l9/artificial_addition/
These are old questions, and certainly they’ll need to be answered. There is no running away from having a good theory that explains why the take-off process can be trusted to do the right thing. Ideally, this theory will be as clear-cut as classical mechanics.
The data is in the process of humanity, but so is the right way of interpreting it. How much do you need to specify by hand to launch the process of extracting preference of humanity formally, that knows what to look for, and is able to find enough to continue looking for more? It’d take at least a technical theory of preference generalized to non-formal agents (of which the humanity is an instance) to answer that question.