Richard_Ngo comments on Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart’s Law

Richard_Ngo 6 Jul 2023 1:17 UTC
10 points
4
why would something being poorly defined make it harder to optimize
Well, you need to know what you’re optimizing! In a two-player game, if the second player gets to redefine the rules after the first player has moved, then they get a huge advantage. That’s essentially what happens by defining “the spirit of cricket” vaguely.
- Daniel Kokotajlo 6 Jul 2023 13:36 UTC
  6 points
  0
  How is this different from games with a referee? A foul is what the referee says it is; the spirit of cricket is what the cricket-lovers say it is. In both cases a savvy optimizer would start modelling the relevant humans and predicting what they would and wouldn’t judge illegal.
  
  I agree that different rules or optimization targets have different complexity levels, and the spirit of cricket seems more complicated than ordinary fouls which are more complicated than “did the ball hit the pegs.”
  
  I think the two-player-game-but-player2-gets-to-modify-the-rules is not a fair analogy here. Like I said it’s the cricket-loving public that decides, not player 2.
  - Richard_Ngo 6 Jul 2023 15:40 UTC
    8 points
    0
    Ah, sorry for unclarity. The game I’m referring to is the one between the player who’s trying to game the rules, and the referee/rule-judging body that’s trying to avoid being Goodharted. The judging body can either “move first” by specifying the rules precisely, or “move second” by judging whether or not an action broke the rules according to illegible criteria. The latter is straightforwardly much harder to Goodhart. Or they can do a combination: I think of referees as doing a combination of these things, because they’re meant to interpret fixed, well-defined rules, but there’s still some room for judgment calls.
    - Daniel Kokotajlo 10 Jul 2023 18:49 UTC
      2 points
      0
      Ahhh, I see, yes that makes sense.
  - A.H. 6 Jul 2023 14:35 UTC
    1 point
    0
    I think the two-player-game-but-player2-gets-to-modify-the-rules is not a fair analogy here. Like I said it’s the cricket-loving public that decides, not player 2.
    Broadly, I agree with Richard Ngo’s characterisation. You are right that the ‘cricket-loving public’ plays some part in determining what counts as ‘within the spirit’, but it is often the decision of the players themselves that is most important.
    How is this different from games with a referee? A foul is what the referee says it is; the spirit of cricket is what the cricket-lovers say it is. In both cases a savvy optimizer would start modelling the relevant humans and predicting what they would and wouldn’t judge illegal.
    
    I agree that different rules or optimization targets have different complexity levels, and the spirit of cricket seems more complicated than ordinary fouls which are more complicated than “did the ball hit the pegs.”
    I agree with you that the complexity is an important factor. I think you are correct that in principle this can still be Goodharted, but in practice it doesn’t seem to happen, as it is much harder than Goodharting the written rules of the game, due to the increased complexity. There is nothing to prevent a superintelligent player from brainwashing the opposing team and general public into agreeing that their actions are legitimate. It’s just that doing this is a lot harder than normal ways of ‘gaming the system’. This is why I used the term ‘resists Goodhart’s law’ as opposed to ‘defeats Goodhart’s law’ or something similar.
    - RamblinDash 6 Jul 2023 15:32 UTC
      1 point
      0
      It may be that there isn’t big enough money in cricket for it to be attractive to hypercompetitive athletes and coaches who are most likely to apply that optimization pressure?
- Charlie Steiner 7 Jul 2023 1:02 UTC
  2 points
  0
  Second player judging is in some ways an advantage, and in other ways a disadvantage (and either way not really what’s happening in cricket, because the players are actually cooperating rather than trying to exploit the rules but being foiled).
  
  The disadvantage is that, lacking rules, it’s hard to communicate what you want the first player to do at all! You don’t get children to play soccer by not telling them any rules or giving any demonstrations, only judging their actions as legal or illegal.
  
  If you do manage to communicate your preferences about what kind of game is even being played to player 1, then there’s no qualitative barrier to communicating enough information about your standards that you can get goodharted.