Downvoted because, conditional on this being true, it is harmful to publish. Don’t take it personally, but this is content I don’t want to see on LW.
Why harmful?
Because it’s capability research. It shortens the TAI timeline with little compensating benefit.
It’s capability research that is coupled to alignment:

Furthermore, it seems like a win for interpretability and alignment, as it gives greater feedback on how the AI intends to earn rewards, and a better ability to control those rewards.
Coupling alignment to capabilities is basically what we need in order to survive: the danger of capabilities work comes from the fact that it is self-funding and therefore risks outracing alignment. If alignment can absorb enough of capabilities’ success, we survive.
I missed that paragraph on first reading, mea culpa. I think your story about how it’s a win for interpretability and alignment is very unconvincing, but I don’t feel like hashing it out at the moment. Revised to a weak downvote.
Also, if you expect this to take off, then by your own admission you are mostly accelerating the current trajectory (which I consider mostly doomed) rather than changing it. Unless you expect it to take off mostly thanks to you?
Surely your expectation that the current trajectory is mostly doomed depends on your expectations about the technical details of how that trajectory extends. If technical specifics emerge that show the current trajectory heading in a more alignable direction, it may be fine to accelerate it.
Sure, if after updating on your discovery the current trajectory no longer seems doomed, that might imply accelerating is good. But here, that is very far from being the case.