Also known as Max Harms. (I post AI alignment content under my other account.)
Not the same person as MaxH!
Just wanted to remind folks that this is coming up on Saturday! I’m looking forward to seeing y’all at the park. It should be sunny and warm. Feel free to send me requests for snacks or whatever.
Is there a minimal thing that Claude could do which would change your mind about whether it’s conscious?
Edit: My question was originally aimed at Richard, but I like Mikhail’s answer.
Thanks! The creators also apparently have a substack: https://forecasting.substack.com/
Value of information
If you have multiple quality metrics, then you need a way to aggregate them (barring more radical proposals). Let’s say you sum them (the specifics of how they combine are irrelevant here). What you’ve created is essentially a 25-star system (five 5-star metrics summed) with a more explicit breakdown, which is roughly what I was suggesting: rate each post on 5 dimensions from 0 to 2, add the values together, and divide by two (with a floor of 0.5), and you have my proposed system. Perhaps you think the interface should make the distinct dimensions of quality explicit, but I think UI simplicity is pretty important, and I’m wary of asking people to click 5+ times to rate a post.
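To make the arithmetic concrete, here’s a minimal sketch in Python. The dimension names are placeholders, not a real proposal; only the 0–2 scale and the half-star floor come from what I describe above.

```python
# A minimal sketch of the proposed aggregation: five quality dimensions,
# each scored 0-2, summed and halved, with a floor of half a star.
# The dimension names are placeholders, not a real proposal.

DIMENSIONS = ["clarity", "novelty", "rigor", "relevance", "civility"]

def aggregate_rating(scores: dict[str, int]) -> float:
    """Combine per-dimension 0-2 scores into a single 0.5-5 star rating."""
    assert set(scores) == set(DIMENSIONS)
    assert all(0 <= s <= 2 for s in scores.values())
    stars = sum(scores.values()) / 2   # range 0-5 in half-star steps
    return max(stars, 0.5)             # never display less than half a star

# A post that's clear and civil but unoriginal:
print(aggregate_rating({"clarity": 2, "novelty": 0, "rigor": 1,
                        "relevance": 1, "civility": 2}))  # 3.0
```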
I addressed the issue of overcompensating in an edit: if the weighting is a median then users are incentivized to select their true rating. Good thought. ☺️
Thanks for your support and feedback!
I agree that there are benefits to hiding karma, but it seems like there are two major costs. The first is reduced transparency; I claim that people like knowing why something is selected for them, and if karma becomes invisible, that information is hidden in a way people won’t like. (One could argue it should be hidden despite people’s desires, but that seems less obvious.) The other major cost is one cited by Habryka: creating common knowledge. Visible karma scores help people gain a shared understanding of what’s valued across the site. Rankings aren’t sufficient for this, because they can’t distinguish relative quality from absolute quality (eg I’m much more likely to read a post with 200 karma than one with 50, even if staleness ranks the former lower).
I suggested the 5-star interface because it’s the most common way of giving things scores on a fixed scale. From my perspective, we could easily use a slider or a number between 0 and 100 instead. I think we want to err towards intuitive/easy interfaces even if it means porting over some bad intuitions from Amazon or whatever, but I’m not confident on this point.
I toyed with the idea of having a strong-bet option, which lets a user put down a stronger quality-judge-reputation (QJR) bet than normal, and thus influence the community rating more than they would by default (albeit exposing them to higher risk). I mainly avoided it in the above post because it seemed like unnecessary complexity, though I appreciate the point about people overcompensating in order to have more influence.
One idea that I just had: instead of having the community rating set by the weighted mean, perhaps it should be the weighted median. The effect would be that voting 5 stars on a 2-star post has exactly the same amount of sway as voting 3.5, right up until the 3.5 line is crossed. I really like this idea, and will edit the post body to mention it. Thanks!
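Here’s a rough sketch of what I mean, with weights standing in for however much sway each voter has:

```python
# A rough sketch of a weighted median, assuming each vote is a
# (stars, weight) pair, where weight is however much sway that voter has.

def weighted_median(votes: list[tuple[float, float]]) -> float:
    """Return the star value at which half the total weight sits on each side."""
    votes = sorted(votes)                       # sort by star value
    total = sum(weight for _, weight in votes)
    running = 0.0
    for stars, weight in votes:
        running += weight
        if running >= total / 2:
            return stars
    return votes[-1][0]

# On a 2-star post, a 5-star vote and a 3.5-star vote pull the community
# rating identically until the median itself crosses 3.5:
print(weighted_median([(2.0, 3.0), (5.0, 2.0)]))    # 2.0
print(weighted_median([(2.0, 3.0), (3.5, 2.0)]))    # also 2.0
```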
I agree with the expectation that many posts/comments would be nearly indistinguishable on a five-star scale. I’m not sure there’s a way around this while keeping most of the desirable properties of having a range of options, though perhaps increasing it from 10 options (half-stars) to 14 or 18 options would help.
My basic thought is that if I can see a bunch of 4.5-star posts, I don’t really need the signal as to whether one is 4.3 stars vs 4.7 stars, even if 4.7 is much harder to achieve. As a reader, I mostly just want a filter for bad/mediocre posts; the high end of the scale is just “stuff I want to read”. If I really want to distinguish within it, I can still see which posts are more uncontroversially good, and which have received more gratitude.
I’m not sure how a power-law system would work. It seems like if there’s still a fixed scale, you’re marking down a number of zeroes instead of a number of stars. …Unless you’re just suggesting linear voting (ie karma)?
Ah! This looks good! I’m excited to try it out.
Yep. I’m aware of that. Our karma system is better in that regard, and I should have mentioned that.
Nice. Thank you. How would you feel about me writing a top-level post reconsidering alternative systems and brainstorming/discussing solutions to the problems you raised?
I also want to note that this proposal isn’t mutually exclusive with other ideas, including other karma systems. It seems fine to have an additional indicator of popularity that is distinct from quality. Or, more to my liking, there could be a button that simply marks that you found a post interesting and/or expresses gratitude toward the writer, without making a statement about how bulletproof the reasoning was. (This might help capture the essence of Rule Thinkers In, Not Out and reward newbies for posting.)
One obvious flaw with this proposal is that the quality-indicator would only be a measure of a post’s expected rating by a moderator. But who says that our moderators are the best judges of quality? Like, the scheme is ripe for corruption, and risks simply pushing the popularity contest one level up to a small group of elites.
One answer is that if you don’t like the mods, you can go somewhere else. Vote with your feet, etc.
A more turtles-all-the-way-down answer is that the stakeholders of LW (the users, and possibly influential community members/investors?) agree on an aggregate set of metrics for how well the moderators are collectively capturing quality. Then, for each unit of time (eg year) and each potential moderator, set up a conditional prediction market with real dollars on whether that person being a moderator causes the metrics to go up/down compared to the previous time unit. Hire the ones that people predict will be best for the site.
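To be concrete, here’s a toy model of how one such market might settle. The refund-when-not-hired rule mirrors how conditional markets usually resolve; the names and the 1:1 payout are just made up for simplicity.

```python
# A toy model of one conditional market: "If <candidate> is hired as a
# moderator, will the site's quality metric go up next period?" The 1:1
# payout is purely for simplicity; a real market would price continuously.

def resolve_market(candidate: str, hired: str,
                   metric_before: float, metric_after: float,
                   bets: list[tuple[str, float, bool]]) -> dict[str, float]:
    """Settle (user, dollars_staked, bet_that_metric_rises) bets."""
    if hired != candidate:
        # Condition unmet: the market never resolves, so refund everyone.
        return {user: stake for user, stake, _ in bets}
    metric_rose = metric_after > metric_before
    return {user: (2 * stake if bet_up == metric_rose else 0.0)
            for user, stake, bet_up in bets}

# Alice bets $10 that hiring Bob raises the metric; Bob is hired and it does:
print(resolve_market("Bob", "Bob", 3.1, 3.4, [("alice", 10.0, True)]))
# {'alice': 20.0}
```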
To my mind the primary features of this system that bear on Duncan’s top-level post are:
- High-reputation judges can confidently set the quality signal for a piece of writing, even if they’re in the minority. The truth is not a popularity contest, even when it comes to quality.
- The emphasis on betting means that people who “upvote” low-quality posts or “downvote” high-quality ones are punished, making “this made me feel things, and so I’m going to bandwagon” a dangerous mental move; people who habitually make that move would be efficiently sidelined.
In concert, I expect that these features would make it much easier to bring concentrated force down on low-quality writing. That would, in turn, make the quality price/signal a much more meaningful piece of information than the current karma score, which, as others have noted, is overloaded as a measure.
First of all, thank you, Duncan, for this post. I feel like it captures important perspectives I’ve had and problems I can see, and puts them together in a pretty good way. (I also share your perspective that the post Could Be Better in several ways, but I respect you not letting the perfect be the enemy of the good.)
I find myself irritated right now (bothered, not angry) that our community’s primary method of highlighting quality writing is by karma-voting. It’s a similar kind of feeling to living in a democracy—yes, there are lots of systems that are worse, but really? Is this really the best we can do? (No particular shade on Ruby or the Lightcone team—making things is hard and I’m certainly glad LW exists and is as good as it is.)
Like, I think I have an idea that might make things substantially better, and that isn’t terrible: make the standard signal for quality be a high price on a quality-arbitrated betting market. This is essentially applying the concept of Futarchy to internet forums (h/t ACX and Hanson). (If this is familiar to you, dear reader, feel free to skip to responses to this comment, where I talk about features of this proposal and other ideas.) Here’s how I could see it working:
When a user makes a post or comment or whatever, they also name a number between 0 and 100. This number is essentially a self-assessment of quality, where 0 means “I know this is flagrant trolling” and 100 means “this is obviously something that any interested party should read”. As an example, let’s say that I assign this comment an 80.

Now let’s say that you are reading and you see my comment and think “An 80? Bah! More like a 60!” You can then “downvote” the comment, which nudges the number down, or enter your own numeric estimate, which dramatically shifts the value towards your estimate (similar to a “strong” vote). Behind the scenes, the site tracks the disagreement. Each user is essentially making a bet about the true value of the post’s quality. (The downvote is a bet that it’s “less than 80”.)

What are they betting? Reputation as judges! New users start with 0 judge-of-quality reputation, unless they get existing users to vouch for them and donate a bit of reputation. (We can call this “karma,” but I think it is very important to distinguish good-judge karma from high-quality-writing karma!) When voting/betting on a post/comment, they stake some of that reputation (maybe 10%, up to a cap of 50? I’m just making up numbers here for the sake of clarity; I’d suggest actually running experiments).
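A rough sketch of the mechanics in code, with the caveat that every number here (the 10% stake, the cap of 50, the nudge sizes) is invented for illustration:

```python
# A rough sketch of the betting mechanic. The 10% stake and cap of 50
# are the made-up numbers from above; the nudge sizes are also invented.

from dataclasses import dataclass, field

@dataclass
class Bet:
    user: str
    estimate: float   # the 0-100 quality the bettor is betting toward
    stake: float      # judge-reputation put at risk

@dataclass
class Post:
    author: str
    rating: float     # starts at the author's 0-100 self-assessment
    bets: list[Bet] = field(default_factory=list)

def place_bet(post: Post, user: str, reputation: float,
              estimate: float, strong: bool = False) -> float:
    """Stake 10% of reputation (capped at 50) on a quality estimate.

    A plain vote nudges the rating a little toward the estimate; an
    explicit numeric estimate ("strong" vote) pulls it much further.
    Returns the user's remaining reputation.
    """
    stake = min(reputation * 0.10, 50.0)
    post.bets.append(Bet(user, estimate, stake))
    pull = 0.5 if strong else 0.05   # invented nudge factors
    post.rating += (estimate - post.rating) * pull
    return reputation - stake

# I post a comment and claim it's an 80; you think it's a 60:
comment = Post(author="max", rating=80.0)
remaining = place_bet(comment, "you", reputation=200.0, estimate=60.0, strong=True)
print(round(comment.rating, 1), remaining)   # 70.0 180.0
```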
Then, you have the site randomly sample pieces of writing, weighting the sampling towards those that are most controversial (ie have the most reputation on the line). Have the site assign these pieces of writing to moderators whose sole job is to study that piece of writing and the surrounding context and to score its quality. (Perhaps you want multiple moderators. Perhaps there should be appeals, in the form of people betting against the value set by the moderator. Etc. More implementation details are needed.) That judgment then resolves all the bets, and results in users gaining/losing reputation.
Users who run out of reputation can’t actually bet, and so lose the ability to influence the quality-indicator. However, all people who place bets (or try to place bets when at zero/negative reputation) are subsidized a small amount of reputation just for participating. (This inflation is a feature, encouraging participation in the site.) Thus, even a new user without any vouch can build up ability to influence the signal by participating and consistently being right.
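And a companion sketch of the audit step: sample contested writing in proportion to the reputation at stake, have a moderator score it, and settle the bets. The tolerance band, the 2x payout, and the subsidy size are all made up.

```python
# A sketch of the audit/resolution step. The "book" maps each post id to
# its bets, stored as (user, estimate, stake) tuples. The tolerance band,
# the 2x payout, and the subsidy size are all invented for illustration.

import random

SUBSIDY = 1.0  # small reputation grant just for participating (deliberate inflation)

def pick_for_audit(book: dict[str, list[tuple[str, float, float]]]) -> str:
    """Sample one post for moderator review, weighted by reputation at stake."""
    post_ids = list(book)
    weights = [sum(stake for _, _, stake in book[pid]) + 1e-9 for pid in post_ids]
    return random.choices(post_ids, weights=weights, k=1)[0]

def resolve(bets: list[tuple[str, float, float]], mod_score: float,
            reputation: dict[str, float], tolerance: float = 10.0) -> None:
    """Settle bets against the moderator's 0-100 score, then add the subsidy."""
    for user, estimate, stake in bets:
        if abs(estimate - mod_score) <= tolerance:
            reputation[user] = reputation.get(user, 0.0) + 2 * stake  # winner
        # losers simply forfeit the stake they already put down
        reputation[user] = reputation.get(user, 0.0) + SUBSIDY

# The moderator judges the contested comment a 65, so your bet of 60 wins:
rep = {"you": 180.0}
resolve([("you", 60.0, 20.0)], mod_score=65.0, reputation=rep)
print(rep)   # {'you': 221.0}
```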
Update: I decided that I like the grass south of the baseball diamond better. Let’s meet there.
Hey all, Max here. I was bad/busy on the weekend when I was supposed to provide a more specific location, so I’ve updated the what3words to a picnic table near the dog/skate park. I reserve the right to continue to adjust the meetup location in the coming weeks if I find even better places, so be sure to check on the 18th for specifics.
I’m an AI safety researcher and author of Crystal Society. I did a bunch of community leading/organizing in Ohio, including running a rationality dojo. I moved out to the Bay Area in 2016, and to Grass Valley in June. If you feel like introducing yourself in the comments here, please do! (But also no pressure.)

Do people want food? I’ll probably make it happen, so if you have preferences, let me know ahead of time by email or by comment here. (No need to request vegetarian options; that’s a given.)
Issue 2 is about to be fixed: https://github.com/Discordius/Lesswrong2/pull/188
I picked 7 Habits because it’s pretty clearly rationality in my eyes, but is distinctly not LW style Rationality. Perhaps I should have picked something worse to make my point more clear.
Manifold market here: https://manifold.markets/MaxHarms/will-ai-be-recursively-self-improvi