This merits a long response, please bear with me...
Deep Blue wasn’t built on the single insight of searching the game tree. It incorporated a whole lot of grandmaster-level advice from humans. Computers did not (and AFAIK cannot yet) regenerate the opening book. Also, perhaps more relevantly to AI, the evaluation function for positions was mostly designed by human master players, not inferred all the way backward from winning and losing positions.
One important insight like the minimax tree rarely takes you all the way to victory. We nibble away at the problem, taking out small digestible chunks. Game theory (or Bayesian statistics, or any other insight as of now) doesn’t explain all of human behavior, but it’s a first step without which further steps would’ve been that much harder.
See, I have this persistent idea—maybe stemming from my math upbringing—that the way to advancement of knowledge leads through small problems conclusively solved. That as beliefs should pay rent in predictions, so directions of inquiry should pay rent in small problems conclusively solved, building on one another. Failing to meet this standard is a pattern of FAIL that I’ve seen many, many times, and no doubt so did you. And even if you see your quest as reductionism, searching for the small computational core that would suffice to regenerate game theory and insights beyond, you’d still do well to heed the pattern.
A long quote from my textbook on game theory, Ken Binmore’s “Fun and Games”:
People who do not understand how human knowledge is advanced are often scornful of such claims. What is the point, they say, of studying a silly parlor game like Poker if the aim is to gain insight into the broad sweep of human affairs? I like to characterize such folk as naive positivists. They have seldom received any serious mathematical training. Otherwise they would have learned that difficult problems do not commonly succumb to frontal attacks. One has to lay siege to them. In graduate school, mathematicians are taught to look at simpler versions of their problems first. Examples help to rid the mind of misconceptions and clarify what the difficulties are. Such a search for insight may take one far afield. A political scientist may end up studying the Prisoners’ Dilemma. Economists may look at the literature on Poker models. Naive positivists will think their labors ridiculous. But this is because naive positivists do not understand that nobody ever solved a difficult problem who had not learned to solve simple problems first.
Maybe I should’ve made it a toplevel post, but it sounds too much like dancing about architecture. Better leave it in a comment and think of something mathy to post instead :-)
Interestingly, recent advances in computer Go have come from ignoring much human expertise and incorporating Monte Carlo approaches into the evaluation function.
It works like this (a toy code sketch follows the list):
1) Choose the move to be evaluated.
2a) Search the game tree by having players make random moves for many turns after the initial move is made.
2b) Score each randomly generated “far future” game state with a very simple evaluation function, such as counting the number of stones of each color on the board.
3) Repeat steps 2a and 2b lots of times, generating lots of randomly played games.
4) Average the score of the randomly generated games. The result is the score for the move.
Very little human knowledge is incorporated into this method, but it has produced some of the strongest computer Go players so far.
Source: Wired
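To make the steps concrete, here is a minimal Python sketch of this kind of “flat” Monte Carlo evaluation, under heavy simplifying assumptions: it ignores captures, ko, and suicide entirely, so it illustrates the sampling idea rather than a playable Go engine, and all the names (`playout`, `evaluate_move`, etc.) are made up for illustration.

```python
import random

BOARD_SIZE = 5  # toy board for speed; real programs played 9x9 and 19x19

def opponent(color):
    return 'W' if color == 'B' else 'B'

def legal_moves(board):
    """All empty points. (Toy model: ignores captures, ko, and suicide.)"""
    return [p for p, stone in enumerate(board) if stone == '.']

def playout(board, to_move, max_turns=40):
    """Step 2a: both sides play uniformly random moves to a 'far future' state."""
    board = board[:]
    for _ in range(max_turns):
        moves = legal_moves(board)
        if not moves:
            break
        board[random.choice(moves)] = to_move
        to_move = opponent(to_move)
    return board

def naive_score(board, color):
    """Step 2b: a very simple evaluation -- count each side's stones."""
    return board.count(color) - board.count(opponent(color))

def evaluate_move(board, move, color, n_playouts=50):
    """Steps 1, 3, 4: play the move, run many random games, average the scores."""
    board = board[:]
    board[move] = color                      # step 1: the move under evaluation
    total = 0
    for _ in range(n_playouts):              # step 3: repeat many times
        final = playout(board, opponent(color))
        total += naive_score(final, color)   # step 2b on each finished playout
    return total / n_playouts                # step 4: the average is the move's score

if __name__ == "__main__":
    empty = ['.'] * (BOARD_SIZE * BOARD_SIZE)
    best = max(legal_moves(empty),
               key=lambda m: evaluate_move(empty, m, 'B'))
    print("highest-scoring opening point for Black (toy model):", best)
```

The striking thing is how little domain knowledge the loop contains: the only Go-specific parts are the legality check and the terminal scoring, which is exactly the point made above.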
The reason for this method’s success is probably something not all that interesting. The random sampling and evaluation method works because it lets simple brute force and random chance loose on a problem that computers can’t (yet) handle otherwise. I am a dan player myself, and I have watched games where supercomputers running the Monte Carlo method dueled Go professionals. As far as I can tell, the program makes repeated weird moves that seem to be up to something (my hyperactive agent detector, perhaps), but after a while the original idea is lost and the move turns out to be bad. I’d estimate that this defect could easily be exploited against the bot, but even with just good, honest moves the professionals did well against it.
My opinion, which I think is widely accepted (not sure though, I haven’t been following this closely lately), is that the Monte Carlo method won’t produce programs that beat humans. The reason it’s doing so well is not that the method is awesome, but that good Go players use so much of the brain’s potential, incorporating pattern detection, strategy planning, many different levels of abstraction, and so on (and all of it effectively, which I think is the important point), that our current know-how simply can’t match that at all. Which is why bots that try to simulate humans end up losing to moderately experienced (one-year) players even with multiple handicap stones.
I’m going to edit/remake that post to make it clearer, but the point anyway was: Go is difficult, and current Go programs do very poorly against even moderately skilled players, so even random sampling beats those attempts. But random sampling is not going to take Go programs past human players, and I believe the hype will settle as soon as better decision-making algorithms are invented.
That’s the most fascinating thought I’ve read on LW for a long time. Thanks a lot!