(Re-rendering the fancy table is a bit annoying, but for the immediate future: Circling as Cousin to Rationality came in at rank 105. It showed up in the full results, and you can look at how it compared to others.)
You might want to edit the writeup to indicate that the fancy HTML table only displays votes by users with 1000+ karma.
Also, have you checked how big the discrepancy is between the 1000+ karma votes vs. all votes? I know the initial writeup mentioned that the former would be weighted more heavily, but the HTML table implies that the latter got ignored entirely, which doesn’t seem like the right (or fair) approach at all. It would be one thing to only open the vote for users with 1000+ karma, but another entirely to open it to all users and then ignore those votes.
One thing you could do would be to commit to a specific vote weight (e.g. votes at 1000+ karma are weighted 2x as much as other votes), then calculate a ranking from that. Incidentally, potential math errors notwithstanding, a weight of 2x for the 1000+ karma scores would correspond to the final adjusted score simply being the average between the 1000+ karma score and the All score.
Anyway, here’s a copy of the above-mentioned spreadsheet with some extra columns: “Final Score Adjusted (1000+ karma weighted 2x other votes)” is just what it says. “Rank 1000 minus Rank All” and “Rank 1000 minus Rank Adjusted” display the rank discrepancy based on how votes are weighted.
For instance, microCOVID.org lands on rank 1 in both Rank All and Rank Adjusted, but not in the 1000+ karma ranking. That makes sense: you’d expect the 1000+ karma users to favor technical posts on AI alignment more highly than the broader community does.
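The extra columns described above can be sketched roughly as follows. This is a minimal sketch, not the spreadsheet's actual formulas: it assumes that scores are additive and that the All score already includes the 1000+ karma votes. The function names, post names, and all numbers are made up for illustration.

```python
from typing import Dict


def adjusted_scores(score_1000: Dict[str, float],
                    score_all: Dict[str, float],
                    weight: float = 2.0) -> Dict[str, float]:
    """Count 1000+ karma votes `weight` times as much as other votes.

    Assumes score_all already includes the 1000+ karma votes, so the
    votes from other users are score_all - score_1000.
    """
    adjusted = {}
    for post, s_all in score_all.items():
        s_1000 = score_1000[post]
        s_other = s_all - s_1000
        # Normalized so that a 1000+ karma vote keeps weight 1.
        adjusted[post] = (weight * s_1000 + s_other) / weight
    return adjusted


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank posts by descending score; rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {post: i + 1 for i, post in enumerate(ordered)}


# Hypothetical scores, NOT taken from the real vote.
score_1000 = {"Post A": 90.0, "Post B": 70.0, "Post C": 40.0}
score_all = {"Post A": 120.0, "Post B": 160.0, "Post C": 60.0}

adj = adjusted_scores(score_1000, score_all, weight=2.0)
rank_1000, rank_all, rank_adj = ranks(score_1000), ranks(score_all), ranks(adj)

# The two discrepancy columns: "Rank 1000 minus Rank All/Adjusted".
rank_1000_minus_all = {p: rank_1000[p] - rank_all[p] for p in score_all}
rank_1000_minus_adj = {p: rank_1000[p] - rank_adj[p] for p in score_all}
```

In this toy data, "Post B" plays the role of a post that wins both the All and the Adjusted ranking while coming second among 1000+ karma voters.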
Thanks for exploring this. :)
Quick note:
I had intended to convey that with “Complete Voting Results (1000+ Karma). You can see more detailed results, including non-1000+ karma votes, here.” (It’s written a few paragraphs before the results, so you might have missed it.)
For some historical context, this is the third Review.
In the first review, only 1000+ karma users could vote at all.
In the second year, we displayed the results of the 1000+ users, and then we looked at all the different voting results, but there weren’t actually major differences between either the posts in the top 10ish (which is what we award prizes to) or posts in the top 40-50ish (which is what went into the book). The book curation process depends a lot on which posts actually fit, and some other design considerations.
This year, as with last year, I think there aren’t major differences in “which posts made it to top-10” (the one difference is whether #10 and #11 are “Ground of Optimization” and “Simulacra Levels and their Interactions”, or vice versa).
What is different this year is a very major difference in the #1 post. The “All Voters” outcome was overwhelmingly “Microcovid”, while the “1000+ karma voters” outcome was “Draft of AI Timelines.” Notably, “Draft of AI Timelines” is also the massive winner when you look at the Alignment Forum voters.
So, my overall plan had been “take in both votes, and then apply some design and prizegiving intuitions about what exactly to do with them.” This year, I think this translates into something like: “Microcovid” and “Draft of AI Timelines” should maybe both get extra prize money as the #1 winners of different votes.
… I am apparently blind. My apologies.
Other than that, I agree that if the main outcome of interest is which post is #1 and which are the top 10, there’s little difference between the various vote rankings, except for the microCOVID.org thing.
I think you’re probably right (but want to think more about it) that the results printed in the “Voting Results” post should be more of a weighted average. Prior to reading your comment, I was thinking I might weight 1000+ karma votes as 3x the “All” votes (whereas you had 2x). But in this case 3x still results in “Microcovid” winning the “weighted average”, so the result is kinda the same.
FYI here’s my personal spreadsheet where I’ve been futzing around with various display options. It includes my own methodology for weighting the results and combining them, which I think is different from yours although I didn’t delve too deeply into your spreadsheet architecture.
https://docs.google.com/spreadsheets/d/1L05yz0Y7ST4klK2riBKExBL1AbxGn8VnE-HL-zxdyiA/edit#gid=1406116027
To be clear, I didn’t do anything smart in my take on the spreadsheet. I picked the weight of 2x for no special reason, but was then amused to discover that this choice was mathematically equivalent to taking the average of the All score and the 1000+ karma score:
$$S_{\mathrm{adj}} := \frac{S_{\mathrm{all}} - S_{1000}}{2} + S_{1000} = \frac{S_{\mathrm{all}} + S_{1000}}{2}$$
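The same algebra can be written for a general weight, which may help when comparing the 2x and 3x options. This is only a sketch: the symbol $w$ is an assumed name rather than something used in either spreadsheet, and it assumes, as the formula above does, that the All score includes the 1000+ karma votes.

```latex
S_{\mathrm{adj}}(w)
  := \frac{S_{\mathrm{all}} - S_{1000}}{w} + S_{1000}
  = \frac{S_{\mathrm{all}} + (w - 1)\, S_{1000}}{w}
% w = 2:  (S_all + S_1000) / 2      (the plain average)
% w = 3:  (S_all + 2 S_1000) / 3
```

Under these assumptions, only $w = 2$ reduces to the plain average of the two scores. Larger weights pull the adjusted score further toward the 1000+ karma score.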
Other than that, I only computed the rank difference of the various scoring rules, e.g. Rank_1000 minus Rank_All.
Regarding your new spreadsheet, it’s too individualized for me to understand much. But I did notice that cell Q2 in the Graphs sheet uses a formula of “=(O2/R$1)*8000” while all subsequent cells multiply by 10000 instead. Maybe that’s a tiny spreadsheet error?