Just a thought on chess playing. Rather than looking at an extreme like Kasparov vs. the World, it would be interesting to me to have teams of two, three, and four players with well-established individual ratings. These teams could then play many games against individuals and against each other, and each team’s effective rating could be determined from its results. In this way we could get some sense of “how much smarter” a team is than its individual members. Ideally, the team would not be rated until it had significant experience playing together: we are interested in what a team could accomplish, and there is no strong reason to think it would take less time to optimize a team than to optimize an individual.
Along the same lines, teams could be assembled to take IQ tests and other tests correlated with general intelligence, to see how much smarter a few people working together are than a single human. Would the results have implications for optimal AI design?
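To make the “effective rating from results” idea concrete, here is a rough sketch using the standard Elo expected-score formula. The opponent ratings, scores, and the bisection helper below are illustrative assumptions, not data or method from any actual team experiment.

```python
# Rough sketch: backing out a team's effective Elo rating from its results.
# The opponent ratings and scores below are made up purely for illustration.

def expected_score(r_team, r_opponent):
    """Standard Elo expected score for the team against one opponent."""
    return 1.0 / (1.0 + 10 ** ((r_opponent - r_team) / 400.0))

def performance_rating(results, lo=0.0, hi=4000.0, iters=60):
    """Bisect for the rating at which the team's expected total score matches
    its actual total score. `results` is a list of (opponent_rating, score)
    pairs, with score 1 for a win, 0.5 for a draw, 0 for a loss."""
    actual = sum(score for _, score in results)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, r) for r, _ in results)
        if expected < actual:
            lo = mid  # the team did better than a mid-rated player would
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical example: a two-player team scores 7/10 against 1500-rated opponents.
games = [(1500, 1)] * 7 + [(1500, 0)] * 3
print(round(performance_rating(games)))  # about 1647
```

Given enough games against opponents of known strength, the same calculation yields a team performance rating that can be compared directly with the members’ individual ratings.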
I think that teams of up to five people can scale “pretty well by human standards”—not too far from linearly. It’s going up to a hundred, a thousand, a million, a billion that we start to run into incredibly sublinear returns.
As group size increases you have to spend more and more of your effort getting your ideas heard and keeping up with the worthwhile ideas being proposed by other people, as opposed to coming up with your own good ideas.
Depending on the relevant infrastructure and collaboration mechanisms, it’s fairly easy for each additional person to make a net-negative contribution to the project. If someone is trying to say something, then someone else has to listen, even if all the listener does is filter the contribution out so that it doesn’t lower the signal-to-noise ratio.
You correctly describe the problems of coordinating the selection of the best result produced. But there’s another big problem: coordinating the division of work.
When you add another player to a huge team of 5,000 people, he won’t start exploring a completely new line of moves that no one else has considered. Instead, he will likely spend most of his time on moves already examined by some of the existing players. That’s another reason his marginal contribution will be so low.
Unlike humans, computers are good at managing divide-and-conquer problems. In chess, a lot of the search for the next move is local in the move tree. That’s what makes it a particularly good example of human groups not scaling where computers would.
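A toy way to see that overlap effect (the numbers are arbitrary assumptions, not a model of real chess analysis): if each player independently examines a random subset of the candidate lines, the expected coverage saturates, and the marginal coverage added by one more player shrinks toward zero.

```python
# Toy overlap model: M candidate lines worth examining; each player independently
# examines a random k of them. Expected distinct lines covered by n players is
# M * (1 - (1 - k/M)**n), so each extra player's marginal contribution decays
# geometrically. M and k are arbitrary illustrative choices.

M = 1000  # candidate lines (assumed)
k = 50    # lines each player examines (assumed)

def covered(n):
    return M * (1 - (1 - k / M) ** n)

for n in (1, 5, 100, 5000):
    print(f"{n:5d} players cover {covered(n):7.1f} lines; "
          f"player {n + 1} adds {covered(n + 1) - covered(n):.4f} more")
```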
That’s parallelism for you. It’s like the way that four-core chips are popular, while million-core chips are harder to come by.
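The chip analogy can be made quantitative with Amdahl’s law: if a fraction p of the work parallelizes cleanly and the remaining (1 − p) is inherently serial (coordination overhead, in the human case), the speedup from n workers is 1 / ((1 − p) + p/n), capped at 1 / (1 − p). The value of p below is an arbitrary assumption chosen purely for illustration.

```python
# Amdahl's law: speedup from n parallel workers when a fraction p of the work
# parallelizes and the remaining (1 - p) is serial. p = 0.95 is an arbitrary
# illustrative assumption, not an estimate for chess teams.

def speedup(n, p=0.95):
    return 1.0 / ((1.0 - p) + p / n)

for n in (1, 4, 5, 100, 1_000_000):
    print(f"{n:>9} workers -> speedup {speedup(n):5.2f}")
# With p = 0.95, 4 workers give about 3.5x, 100 give about 16.8x, and even a
# million workers cannot exceed 1 / (1 - p) = 20x.
```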
I assume by ‘linear’ you mean directly proportional to population size.
The diminishing marginal returns of some tasks, like the “wisdom of crowds” (which is concerned with forming accurate estimates), are well established and taper off quickly regardless of the difficulty of the task: the benefit basically follows the law of large numbers and sampling error (see “A Note on Aggregating Opinions”, Hogarth, 1978). This glosses over some potential complexity, but you’re unlikely to ever get much benefit from more than a few hundred people, if that many.
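To put a number on how quickly that averaging benefit tapers off, here is a minimal sketch. Independent, unbiased individual errors with a fixed spread are simplifying assumptions, and sigma = 10 is arbitrary.

```python
# If individual estimates are unbiased with standard deviation sigma, the
# standard error of the mean of n estimates is sigma / sqrt(n). Independence,
# unbiasedness, and sigma = 10 are simplifying assumptions.
import math

sigma = 10.0  # assumed spread of individual estimates around the true value

for n in (1, 4, 16, 100, 400, 10_000):
    print(f"n = {n:6d}: standard error of the crowd's mean estimate = "
          f"{sigma / math.sqrt(n):5.2f}")
# The first 16 people remove three quarters of the error; the remaining 9,984
# can only ever remove the last quarter.
```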
Other tasks, such as problem solving on a complex fitness landscape, do not see such quickly diminishing returns (see the work on “Exploration and Exploitation”, especially in NK space). Supposing the number of possible solutions to a problem is much greater than the number of people who could feasibly work on it (e.g., the population of creative and engaged humans), then as the number of people increases, the probability of finding the optimal solution keeps increasing. Coordinating all those people is another issue, as is the potential opportunity cost of having so many people work on the same problem.
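By contrast, here is an equally minimal sketch of a search-type task (a random rugged ring rather than a real NK landscape, so purely illustrative): with independent hill-climbers started at random points, the chance that at least one of them reaches the global optimum keeps climbing with the number of searchers rather than flattening out after a handful.

```python
# Minimal sketch (not a real NK model): independent greedy hill-climbers on a
# random rugged landscape arranged on a ring. The landscape, its size, and the
# searcher behaviour are all illustrative assumptions; the point is only that
# P(someone finds the global optimum) keeps rising with the number of searchers.
import random

random.seed(0)
SIZE = 200  # number of candidate solutions on the ring (assumed)

def hill_climb(landscape, start):
    """Move to the better neighbour until stuck at a local optimum."""
    pos = start
    while True:
        left, right = (pos - 1) % SIZE, (pos + 1) % SIZE
        best = max((left, pos, right), key=lambda i: landscape[i])
        if best == pos:
            return landscape[pos]
        pos = best

def p_find_optimum(n_searchers, trials=2000):
    hits = 0
    for _ in range(trials):
        landscape = [random.random() for _ in range(SIZE)]
        peak = max(landscape)
        starts = (random.randrange(SIZE) for _ in range(n_searchers))
        if any(hill_climb(landscape, s) == peak for s in starts):
            hits += 1
    return hits / trials

for n in (1, 5, 25, 125):
    print(f"{n:4d} independent searchers: P(global optimum found) ~ {p_find_optimum(n):.2f}")
```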
However, in my experience, this difference between problem-solving and wisdom-of-crowds tasks is often glossed over in collective intelligence research.
Regarding the apparent non-scaling benefits of history: what you call the “most charitable” explanation seems to me the most likely. Thousands of people work at places like CERN and spend 20 years contributing to a single paper, doing things that simply could not be done by a small team. Models of problem-solving on “NK space”-type fitness landscapes also support this interpretation: fitness improvements become increasingly hard to find over time. As you’ve noted elsewhere, it’s easier to pluck low-hanging fruit.
Are you or anyone else aware of any work along these lines, showing the intelligence of groups of people?
Any sense of what the intelligence of the planet as a whole, or the largest effective intelligence of any group on the planet, might be?
If groups of up to 5 scale well, and above 5 we get sublinear but still positive returns up to some point, does this prove that an AI won’t FOOM until its intelligence exceeds the largest effective intelligence of any group of humans? That is, until the AI is smarter than the group, will the group of humans dominate the rate at which new AIs are improved?
There is the MIT Center for Collective Intelligence.
Update: this is a pretty large field of research now. The Collective Intelligence Conference is going into its 7th year.
As far as empirically finding the optimum group size goes, it’d be cheaper to count the researchers in a scientific sub-discipline and measure the productive work they do in that field. They are teams that review work for general distribution, read up on others’ progress, and contribute to the discussion. Larger sub-fields that would be more efficient if divided up would have strong incentives to split, since defectors to a sub-sub-field would have higher productivity (and less irrelevant work to read up on).
Does anyone play (rated) chess on freechess.org? If so, do you want to get together to play some team games for the purposes of adding hard data to this discussion?
My blitz rating is in the high 1200s. My teammate should have a blitz rating close to that to make the data valuable. I play 8-minute games, and am not interested in playing enough non-blitz games to get my rating to be an accurate reflection of my (individual) skill. (Non-blitz games would take too much time and take too much out of me. “Non-blitz” games are defined as games with at least 15 minutes on the clock for each player.)
I envision the team being co-located while playing, which limits my teammate to someone who is or will be in San Francisco or Berkeley.
I’ve played a little “team chess” before. Was a lot of fun.
My contact info is here.