Alexandros,
Not surprised that we’re thinking along the same lines, if we both read this blog! ;)
I love your questions. Let’s do this:
Keynesian Beauty Contest: I don’t have a silver bullet for it, but I do have a lot of mitigation tactics. First of all, I envision offering a cascading set of progressively more fine-grained rating attributes, so that, while you can still upvote or downvote, or rate something with stars, you can also rate it on truthfulness, entertainment value, fairness, rationality (and countless other attributes)… More nuanced ratings would probably carry more influence (again, subject to others’ cross-rating). Therefore, to gain the highest levels of influence, you’d need to be nuanced in your ratings of content… gaming the system with nuanced, detailed opinions might be effectively the same as providing value to the system. I don’t mind someone trying to figure out the general population’s nuanced preferences… that’s actually a valuable service!
Secondly, your ratings are also cross-related to the semantic metadata (folksonomy of tags) of the content, so that your influence is limited to the topic at hand. Gaining a high influence score as a fashion celebrity doesn’t put your political or scientific opinions at the top of search results. Hopefully, this works as a sort of structural Palin-filter. ;)
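To make the topic-scoping idea a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration (the function name, the dictionary shapes, the rating scale from -1.0 to 1.0, and the choice to simply sum a rater's influence over the item's tags); it is not a description of any actual implementation.

```python
def topic_weighted_score(ratings, influence, item_tags):
    """Average the ratings of one item, weighting each rater only by the
    influence they hold under that item's tags (hypothetical data shapes).

    ratings:   {rater: rating value, assumed to lie in [-1.0, 1.0]}
    influence: {rater: {tag: influence earned under that tag}}
    item_tags: set of tags attached to the item
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rater, value in ratings.items():
        per_tag = influence.get(rater, {})
        # Influence earned under unrelated tags contributes nothing here.
        weight = sum(per_tag.get(tag, 0.0) for tag in item_tags)
        weighted_sum += weight * value
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


influence = {"fashion_star": {"fashion": 9.0}, "physicist": {"physics": 3.0}}
ratings = {"fashion_star": -1.0, "physicist": 1.0}

print(topic_weighted_score(ratings, influence, {"physics"}))  # 1.0
print(topic_weighted_score(ratings, influence, {"fashion"}))  # -1.0
```

In this toy run the fashion celebrity's large influence score has no effect at all on the physics item, which is the filtering behavior described above.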
The third mitigation has to do with your second question: How do we handle the processing of millions of real-time preference data points, when all of them should (in theory) get cross-related to all others, with (theoretically) endless recursion?
The typical web-based service approach of centralized crunching doesn’t make sense. I’m envisioning a distributed system where each influence node talks with a few others (a dozen?), and does some cross-processing with them to agree on some temporary local norms, means and averages. That cluster does some more higher-level processing in concert with other close-by clusters, and they negotiate some “regional” aggregates… those get propagated back down to the local level, and up to the next level of abstraction… up until you reach some set of a dozen superclusters that span the globe, and which trade in high-level aggregates.
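One way to picture the local-to-regional-to-global roll-up is with aggregates that can be merged without ever passing individual ratings upward. The Python sketch below is only an assumption about how that might look: the `Aggregate` class, `local_aggregate`, and `merge` are hypothetical names, and a simple count-plus-sum summary stands in for whatever statistics the nodes would really negotiate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    """A mergeable summary of ratings: no individual data points survive."""
    count: int
    total: float

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


def local_aggregate(ratings):
    """Summarize the raw ratings held by a single node."""
    return Aggregate(len(ratings), float(sum(ratings)))


def merge(aggregates):
    """Combine lower-level aggregates into the next level up."""
    aggregates = list(aggregates)
    return Aggregate(
        sum(a.count for a in aggregates),
        sum((a.total for a in aggregates), 0.0),
    )


# Two nodes form cluster A, one node forms cluster B (made-up ratings).
regional_a = merge([local_aggregate([4, 5]), local_aggregate([3])])
regional_b = merge([local_aggregate([1, 2, 3])])
global_view = merge([regional_a, regional_b])

print(regional_a.mean, regional_b.mean, global_view.mean)  # 4.0 2.0 3.0
```

Because only counts and sums travel upward, each level sees a coarser picture than the one below it, and nothing at the top can be traced back to a single rating.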
All that is regulated, in terms of clock ticks, by activity: Content that is being rated/shared/commented on by many people will be accessed and cached by more local nodes, and processed by more clusters, and its cross-processing will be accelerated because it’s “hot”. Whereas one little opinion on one obscure item might not get processed by servers on the other side of the world until someone there requests it. We also decay data this way: If nobody cares, the system eventually forgets. (Your personal node will remember your preferences, but the network, after having consumed their influence effects, might forget their data points.)
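The "if nobody cares, the system eventually forgets" rule could be sketched as a weight that halves for every fixed stretch of inactivity, with data dropped once it falls below a cutoff. This is purely an illustrative assumption: the half-life model, the threshold, and the names `decayed_weight` and `prune` are not part of the design described here.

```python
def decayed_weight(weight, idle_seconds, half_life_seconds):
    """Halve the weight for every half-life that passes without activity."""
    return weight * 0.5 ** (idle_seconds / half_life_seconds)


def prune(items, now, half_life_seconds, threshold):
    """Keep only items whose decayed weight is still at or above threshold.

    items: {item_id: (weight, last_activity_timestamp)}
    The stored weight and timestamp are left unchanged for kept items, so
    decay is always computed from the last real activity.
    """
    kept = {}
    for item_id, (weight, last_active) in items.items():
        current = decayed_weight(weight, now - last_active, half_life_seconds)
        if current >= threshold:
            kept[item_id] = (weight, last_active)
    return kept


items = {"hot_story": (1.0, 90.0), "obscure_item": (1.0, 0.0)}
remaining = prune(items, now=100.0, half_life_seconds=10.0, threshold=0.1)
print(sorted(remaining))  # ['hot_story']
```

Any new rating, share, or comment would reset the item's last-activity timestamp, so "hot" content never decays while it stays hot.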
So: a distributed propagation system, batch-processed, not real-time, not atomic but aggregated. That means you can’t go back and change old ratings or individual data points, because they get consumed by the aggregates. That means you can’t inspect what made your score go up and down at the atomic level. That means your score isn’t the same everywhere on the planet at the same time. So gaming the system is harder because there’s no real-time feedback loop, there’s no single source of absolute truth (truth is local and propagates lazily), and there’s no audit trail of the individual effects of your influence.
All of this hopefully makes the system so fluid that it holds innumerable beauty contests, always ongoing, always local, and the results are different depending on when and where you are. Hopefully this makes the search for the Nash equilibrium a futile exercise, and people give up and just say what they actually think is valuable to others, as opposed to just expected by others.
That’s my wishful thinking at this point. Am I fooling myself?
I’d create a simplified evolutionary model of the system using a GA to create the agents. If groups can find a way to game your system to create infinite interesting-ness/insightful-ness for specific topics, then you need to change it.
You’re right: A system like that could be genetically evolved for optimization.
On the other hand, I was hoping to create an open optimization algorithm, governable by the community at large… based on their influence scores in the field of “online influence governance.” So the community would have to notice abuse and gaming of the system, and modify policy (as expressed in the algorithm, in the network rules, in laws and regulations and in social mores) to respond to it. Kind of like democracy: Make a good set of rules for collaborative rule-making, give it to the people, and hope they don’t break it.
But of course the Huns could take over. I’m trusting us to protect ourselves. In some way this would be poetic justice: If crowds can’t be wise, even when given a chance to select and filter among the members for wisdom, then I’ll give up on bootstrapping humanity and wait patiently for the singularity. Until then, though, I’d like to see how far we could go if given a useful tool for collaboration, and left to our own devices.
I think you are closer to a strong solution than you realize. You have mentioned the pieces but I think you haven’t put them together yet. In short, the solution I see is to depend on local (individual) decisions rather than group ones. If each node has its own ranking algorithm and its own set of trust relations, there is no reason to create complex group-cooperation mechanisms. A user that spams gets negative feedback and therefore eventually gets isolated in the graph. Even if automated users outnumber real users, the best they can do is vote each other up and therefore end up with their own cluster of the network, with real users only strongly connected to each other. Of course, if a bot provides value, it can be incorporated in that graph. “sufficiently advanced spam...”, etc. etc. This also means that the graph splinters into various clusters depending on worldview (your Rush Limbaugh example). This deals with Keynesian beauty contests, as there is no ‘average’ to aim at. Your values simply cluster you with people who share them. If you value quality, you go closer to quality. If you value ‘republican-ness’, you move closer to that. The price you pay is that there is no ‘objective’ view of the system. There is no ‘top 10 articles’, only ‘top 10 articles for user X’.
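A small Python sketch can show why ‘top 10 articles for user X’ resists bots that only vote each other up. It assumes a deliberately simple model that is not taken from the discussion: trust spreads outward from the viewer for at most two hops, is damped at each hop, and items are scored by trust-weighted ratings. The names `personal_trust` and `top_items_for`, the `damping` value, and the sample users are all hypothetical.

```python
def personal_trust(trust, user, max_hops=2, damping=0.5):
    """Spread trust outward from `user` along that user's own trust edges.

    trust: {truster: {trustee: weight in [0, 1]}}
    Returns {other_user: strongest damped trust reaching them from `user`}.
    """
    reach = {user: 1.0}
    frontier = {user: 1.0}
    for _ in range(max_hops):
        next_frontier = {}
        for source, strength in frontier.items():
            for target, weight in trust.get(source, {}).items():
                gained = strength * weight * damping
                if gained > reach.get(target, 0.0):
                    reach[target] = gained
                    next_frontier[target] = gained
        frontier = next_frontier
    reach.pop(user)  # the viewer's own entry is not a recommendation source
    return reach


def top_items_for(user, trust, ratings, n=10):
    """Rank items by ratings from people `user` trusts, directly or not.

    ratings: {rater: {item: rating value}}
    """
    scores = {}
    for rater, weight in personal_trust(trust, user).items():
        for item, value in ratings.get(rater, {}).items():
            scores[item] = scores.get(item, 0.0) + weight * value
    return sorted(scores, key=scores.get, reverse=True)[:n]


trust = {
    "alice": {"bob": 1.0},
    "bob": {"carol": 0.8},
    "bot1": {"bot2": 1.0},  # the bots only trust each other
    "bot2": {"bot1": 1.0},
}
ratings = {
    "bob": {"essay": 1.0},
    "carol": {"essay": 0.5, "paper": 1.0},
    "bot1": {"spam": 1.0},
    "bot2": {"spam": 1.0},
}

print(top_items_for("alice", trust, ratings))  # ['essay', 'paper']
print(top_items_for("bot1", trust, ratings))   # ['spam']
```

Since no real user trusts either bot, their votes never reach alice's list; they only shape the rankings inside their own cluster.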
Another thing I see with your design is that it is complex and attempts to boil at least a few oceans (emergent ontologies/folksonomies for one, distributing identity, storage, etc.). I have some experience with defining complex architectures for distributed systems (e.g. http://arxiv.org/abs/0907.2485 ) and the problem is that they need years of work by many people to reach some theoretical purity, and even then bootstrapping will be a bitch. The system I have in mind is extremely simple by comparison, definitely more pragmatic (and therefore makes compromises) and is based on established web technologies. As a result, it should bootstrap itself quite easily. I find myself not wanting to publicly share the full details until I can start working on the thing (I am currently writing up my PhD thesis and my deadline is Oct. 1; after that, I’m focusing on this project). If you want to talk more details, we should probably take this to a private discussion.
You are right: This needs to be a fully decentralized system, with no center, and processing happening at the nodes. I was conceiving of “regional” aggregates mostly as a guess as to what may relieve network congestion if every node calls out to thousands of others.
Thank you for setting me right: My thinking has been so influenced by over a decade of web app dev that I’m still working on integrating the full principles of decentralized systems.
As for boiling oceans… I wish you were wrong, but you are probably right. Some of these architectures are likely to be enormously hard to fine-tune for effectiveness. At the same time, I am also hoping to piggyback on existing standards and systems.
Anyway, let’s certainly talk offline!