I think the plan for critique voting and user karma is fine, but the way this is used to rank AI plans is highly incomplete.
The problem is that good critiques are not the same as severe problems with the plan. If someone writes an excellent critique of an alignment plan, but it’s a critique of a minor flaw in an otherwise excellent plan, that critique will, and should, get upvotes. The alignment plan it’s attached to, however, shouldn’t effectively receive that many downvotes.
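To make the distinction concrete, here is a minimal sketch of the difference between penalizing a plan by raw critique upvotes and penalizing it by upvotes scaled by how serious the flaw is. Everything in it is an assumption for illustration: the `Critique` class, the `severity` field (a hypothetical 0 to 1 rating), and both scoring functions are mine, not part of the actual ranking scheme.

```python
from dataclasses import dataclass


@dataclass
class Critique:
    upvotes: int     # how good readers judged the critique itself to be
    severity: float  # hypothetical 0..1 rating of how damaging the flaw is


def naive_penalty(critiques):
    # Treats every critique upvote as a downvote on the plan.
    return sum(c.upvotes for c in critiques)


def severity_weighted_penalty(critiques):
    # Scales each critique's upvotes by how serious the flaw is (assumed rating).
    return sum(c.upvotes * c.severity for c in critiques)


# An excellent critique of a minor flaw in an otherwise strong plan.
plan_a = [Critique(upvotes=50, severity=0.05)]
# A decent critique of a near-fatal flaw.
plan_b = [Critique(upvotes=20, severity=0.9)]

a_looks_worse_naively = naive_penalty(plan_a) > naive_penalty(plan_b)
b_looks_worse_weighted = (
    severity_weighted_penalty(plan_b) > severity_weighted_penalty(plan_a)
)
```

Under the naive rule, plan A is penalized more than plan B simply because its critique was better written. With a severity term, the ordering flips. How severity would actually be collected (a separate vote, a reviewer rating, something else) is left open here.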