We got to discussing this on #lesswrong recently. I don’t see anyone here pointing this out yet directly, so:
Can you technically Strong Upvote everything? Well, we can’t stop you. But we’re hoping a combination of mostly-good-faith + trivial inconveniences will result in people using Strong Upvotes when they feel it’s actually important.
This approach, hoping that good faith will prevent people from using Strong votes “too much”, is a good example of an Asshole Filter (linkposted on LW last year). You’ve set some (unclear) boundaries, and then, by not enforcing them, you reward those who violate them with increased control over the site conversation. Chris_Leong gestures towards this without directly naming it in a sibling comment.
In my opinion “maybe put limits on strong upvotes if this seems to be a problem” is not the correct response to this problem, nor would be banning or otherwise ‘disciplining’ users who use strong votes “too much”. The correct response is to remove the asshole filter by altering the incentives to match what you want to happen. Options include:
1. Making votes normal by default but encouraging users to use strong votes freely, up to 100% of the time, so that good-faith users are not disadvantaged. (Note: still disenfranchises users who don’t notice that this feature exists, but maybe that’s ok.)
2. Making votes strong by default, so that it’s making a “weak” vote that takes extra effort. (Note: this gives users who carefully make weak votes when they have weak opinions less weight, but at least they do this with eyes open and in the absence of perverse incentives.)
3. #2, but with some algorithmic adjustment to give careful users more weight instead of less. This seems extremely difficult to get right (cf. Slashdot metamoderation). Probably the correct answer there is some form of collaborative filtering (a very rough sketch follows below).
Personally I favour solution #1.
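For whatever it’s worth, here is a very rough sketch of what “some form of collaborative filtering” could mean here, assuming a hypothetical scheme that weights each voter by how often their past votes agree in sign with other voters on shared items. All names and the weighting rule are invented for illustration; this is not anything the site does.

```python
from itertools import combinations

def agreement_weights(votes):
    """votes: {user: {item: signed vote value, e.g. +1/-1/+2/-2}}.
    Returns a crude per-user weight based on average sign-agreement with
    other users on items they both voted on. Purely illustrative."""
    users = list(votes)
    agree = {u: [] for u in users}
    for a, b in combinations(users, 2):
        shared = set(votes[a]) & set(votes[b])
        for item in shared:
            same_sign = (votes[a][item] > 0) == (votes[b][item] > 0)
            agree[a].append(1.0 if same_sign else 0.0)
            agree[b].append(1.0 if same_sign else 0.0)
    # Users with no overlap get a neutral weight of 0.5.
    return {u: (sum(v) / len(v) if v else 0.5) for u, v in agree.items()}

def weighted_score(item, votes, weights):
    """Karma of `item` with each vote scaled by its caster's weight."""
    return sum(weights[u] * votes[u][item] for u in votes if item in votes[u])
```

Note that this conflates “careful” with “agrees with others”, which is one illustration of why getting this right is extremely difficult.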
I’ll add that this is not just a hypothetical troll-control issue. This is also a UX issue. Forcing users to navigate an unclear ethical question and prisoner’s dilemma—how much strong voting is “too much”—in order to use the site is unpleasant and a bad user experience. There should not be a “wrong” action available in the user interface.
PS. I’ll concede that making strong votes an actually limited resource, enforced by the site economically (e.g. with a token-bucket quota), would in a way also work, since it eliminates the perceived need for strong votes to be limited by “good faith”. But IMO the need is only perceived, and not real. Voting is for expressing preferences, and preferences are unlimited.
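To make the token-bucket idea concrete, here is a minimal sketch under assumed numbers (the class, capacity, and refill rate are all hypothetical; nothing like this is implemented on the site):

```python
import time

class StrongVoteBucket:
    """Hypothetical token-bucket quota for strong votes (illustrative only).

    Each user starts with `capacity` strong votes and regains them at
    `refill_per_hour` tokens per hour, up to the cap. Normal votes are
    never limited.
    """

    def __init__(self, capacity=10, refill_per_hour=1.0):
        self.capacity = capacity
        self.refill_per_hour = refill_per_hour
        self.tokens = float(capacity)
        self.last_refill = time.time()

    def _refill(self):
        # Add tokens proportional to elapsed time, clamped to capacity.
        now = time.time()
        elapsed_hours = (now - self.last_refill) / 3600.0
        self.tokens = min(self.capacity,
                          self.tokens + elapsed_hours * self.refill_per_hour)
        self.last_refill = now

    def try_strong_vote(self):
        """Spend a token and return True if one is available, else False."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The point is not this particular mechanism, but that the limit would be enforced by the system rather than by each user’s guess about what “good faith” requires.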
Note: I would never punish anyone for their vote-actions on the site, both because I agree that you should not give people options and then punish them for using those options when no downside was communicated, but more importantly, because I think it is really important that votes form an independent assessment for which people do not feel like they have to justify themselves. Any punishment of voting would involve some kind of public discussion of vote-patterns, which is definitely off-limits for us, and something we are very very very hesitant to do. (This seemed important to say, since I think the independence of voting is quite important for site integrity.)
(Note: still disenfranchises users who don’t notice that this feature exists, but maybe that’s ok.)
It is not difficult to make people notice the feature exists; cf. the GreaterWrong implementation. (Some people will, of course, still fail to notice it, somehow. There are limits to how much obliviousness can be countered via reasonable UX design decisions.)
This is also a UX issue. Forcing users to navigate an unclear ethical question and prisoner’s dilemma—how much strong voting is “too much”—in order to use the site is unpleasant and a bad user experience. There should not be a “wrong” action available in the user interface.
[emphasis mine]
This is a good point, but a subtle and easily-mistakable one.
There is a misinterpretation of the bolded claim, which goes like this:
The UI should not permit an action which the user would not want to take.
The response to this, of course, is that the designers of the UI do not necessarily know in advance what actions the user does or does not want to take. Therefore let the UI permit all manner of actions; let the user decide what he wishes to do.
But that is not what (I am fairly sure) nshepperd meant. Rather, the right interpretation is:
The UI should not permit an action which the user, having taken, will (predictably) be informed was a wrong action.
In other words, if it’s known, by the system, that a certain action should not be taken by the user, then make it so that action cannot be taken! If you know the action is wrong, don’t wait until after the user does it to inform him of this! Say, in advance: “No, you may not do this.”
And with this view I entirely agree.
Voting is for expressing preferences, and preferences are unlimited.
It is my understanding that some or all of the LW team (as well as, possibly, others?) do not take this view. As I understand it, the contrary view is that the purpose of voting is to adjust the karma that a post/comment ends up with to some perceived “proper” value, rather than to express an independent opinion of it. The former may involve voting up, or down, strongly or weakly… I confess that I find this view perplexing, myself, so I will let its proponents defend it further, if they wish.
I don’t think it’s super productive to go into this with a ton of depth, but I do also think that voting is for expressing preferences, just that it’s better to model the preference as “on a scale from 1 to 1000, how good is this post?”, instead of “is this post good or bad?”. You implement the former by upvoting when the post’s current karma is below your assessment and downvoting when it is above, with the strong version used when the karma is particularly far from your assessment (a rough sketch of this rule follows below). This gives you access to a bunch more data than if everyone just votes independently (i.e. voting independently results in a post that is just above the “good enough to strong-upvote” threshold for a lot of users getting the same karma as a post that is in the top 5 of all-time favorite posts for everyone who upvoted it).
In either case I am interested in an independent assessment, just that the assessment moves from “binary good/bad” to “numerical ordering of preferences”.
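Concretely, the voting rule described above might look something like the following sketch (the 1-1000 scale is from the comment; the margin of 50 and the function name are arbitrary choices for illustration):

```python
def choose_vote(current_karma, my_assessment, strong_margin=50):
    """Vote so as to push the post's karma toward my assessed value.

    `my_assessment` is where I think the post belongs on the (say) 1-1000
    scale; the strong version is used when the post is far from that value.
    The margin of 50 is an arbitrary illustrative choice.
    """
    gap = my_assessment - current_karma
    if gap == 0:
        return "no vote"
    direction = "upvote" if gap > 0 else "downvote"
    strength = "strong" if abs(gap) >= strong_margin else "normal"
    return f"{strength} {direction}"

# A post sitting at 30 karma that I think belongs around 200 gets a strong
# upvote; one at 45 that I think belongs at 60 gets a normal upvote.
print(choose_vote(30, 200))  # strong upvote
print(choose_vote(45, 60))   # normal upvote
print(choose_vote(90, 60))   # normal downvote
```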
The problem with this view is that there does not seem to be any way to calibrate the scale. What should be the karma of a good post? A bad post? A mediocre one? What does 20 mean? What does 5 mean? Don’t the answers to these questions depend on how many users are voting on the post, and what their voting behavior is? Suppose you and I both hold the view you describe, but I think a good post should have 100 karma and you think a good post should have 300 karma—how should our voting behavior be interpreted? What does it mean, when a post ends up with, say, 75 karma? Do people think it’s good? Bad? Do we know?
This gets very complicated. It seems like the signal is degraded, not improved, by this.
i.e. voting independently results in a post that is just above the “good enough to strong-upvote” threshold for a lot of users getting the same karma as a post that is in the top 5 of all-time favorite posts for everyone who upvoted it
It seems to me like your perspective results in an improved signal only if everyone who votes has the same opinions on everything.
If people do not have the same opinions, then there will be a distribution across people’s “good enough to strong-upvote” thresholds; a post’s karma will then reflect its position along that distribution. A “top 5 all-time favorite for many people” post will be “good enough to strong-upvote” for most people, and will have a high score. A post that is “just good enough to upvote” for many people will cross that threshold for fewer, i.e. will be lower along that distribution, and will end up with a lower score. (In other words, you’re getting strong-upvote weight × probability of a strong upvote, summed across all voters; the small simulation after this comment illustrates the point.)
If everyone has the same opinion, then this will simply result in either everyone strong-upvoting it or no one strong-upvoting it—and in that case, my earlier concern about differently calibrated scales also does not apply.
So, your interpretation seems optimal if adopted by a user population with extremely homogeneous opinions. It is strongly sub-optimal, however, if adopted by a user population with a diverse range of opinions; in that scenario, the “votes independently indicate one’s own evaluation” interpretation is optimal.
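To make the distributional point concrete, here is a small simulation under made-up numbers (vote weights of 1 and 2, and assumed probabilities of a post clearing each voter’s upvote and strong-upvote thresholds):

```python
import random

random.seed(0)

WEAK, STRONG = 1, 2  # illustrative vote weights

def simulate_karma(p_upvote, p_strong_given_upvote, n_voters=1000):
    """Sampled karma when each voter independently applies their own
    thresholds: upvote with probability p_upvote, and strong-upvote with
    probability p_strong_given_upvote conditional on upvoting."""
    karma = 0
    for _ in range(n_voters):
        if random.random() < p_upvote:
            karma += STRONG if random.random() < p_strong_given_upvote else WEAK
    return karma

# A "top-5 all-time favorite" post: nearly everyone upvotes, most strongly.
print(simulate_karma(p_upvote=0.95, p_strong_given_upvote=0.9))
# A "just good enough to upvote" post: many upvote, few strongly.
print(simulate_karma(p_upvote=0.6, p_strong_given_upvote=0.1))
```

Even with everyone voting independently, the two posts separate cleanly, because each post’s score sums vote weight × probability of that vote across the whole distribution of thresholds.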
Overall, agree on the whole asshole filter thing. After a few months of operation, we now have a bunch more data on how people vote, and so might make some adjustments to the system after we’ve analyzed the data a bunch more.
I am currently tending towards a system where your strong-upvotes get weaker the more often you use them, using some kind of “exhaustion” mechanic. I think this would still cause a small amount of overrepresentation by people who use it a lot, but I think it would lessen the strength of the effect. I am mostly worried about the UI complexity of this, and about communicating it clearly to the user.
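As a rough sketch of what such an “exhaustion” mechanic could look like (the base weight, window, and decay rate are all invented for illustration; nothing here is decided or implemented):

```python
from collections import deque
import time

class ExhaustingStrongVote:
    """Hypothetical 'exhaustion' mechanic: the more strong votes you cast
    within a recent window, the less each additional one is worth."""

    def __init__(self, base_weight=8, window_hours=24, halving_count=5):
        self.base_weight = base_weight     # weight of a fresh strong vote
        self.window = window_hours * 3600  # look-back window in seconds
        self.halving_count = halving_count # votes needed to halve the weight
        self.recent = deque()              # timestamps of recent strong votes

    def strong_vote_weight(self):
        """Weight of the next strong vote, then record that it was cast."""
        now = time.time()
        while self.recent and now - self.recent[0] > self.window:
            self.recent.popleft()          # forget votes outside the window
        weight = self.base_weight * 0.5 ** (len(self.recent) / self.halving_count)
        self.recent.append(now)
        return max(1.0, weight)            # never drop below a normal vote
```

Even in this sketch the UI-complexity worry is visible: the weight a user’s next strong vote carries depends on state they cannot easily see.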
Also still open to other suggestions. I am not a huge fan of just leaving them unlimited, mostly because I think it’s somewhat arbitrary to what degree someone will perceive them as a trivial inconvenience, and then we would just be introducing a bunch of random noise into our karma system by overrepresenting people who don’t find click-and-hold to be a large inconvenience.
Your interpretation of the bolded part is correct.