Nothing in that announcement suggests that this is limited to intelligence analysis.
U.S. intelligence and defense agencies do run disinformation campaigns, such as the anti-vax campaign in the Philippines, and everything that’s public suggests there is nothing blocking the use of Claude offensively in that fashion.
If Anthropic has gotten promises that Claude is not being used offensively under this agreement, it should be public about those promises and about the mechanisms that regulate the use of Claude by U.S. intelligence and defense agencies.
COI: I work at Anthropic

I confirmed internally (which felt personally important for me to do) that our partnership with Palantir is still subject to the same terms outlined in the June post “Expanding Access to Claude for Government”:
For example, we have crafted a set of contractual exceptions to our general Usage Policy that are carefully calibrated to enable beneficial uses by carefully selected government agencies. These allow Claude to be used for legally authorized foreign intelligence analysis, such as combating human trafficking, identifying covert influence or sabotage campaigns, and providing warning in advance of potential military activities, opening a window for diplomacy to prevent or deter them. All other restrictions in our general Usage Policy, including those concerning disinformation campaigns, the design or use of weapons, censorship, and malicious cyber operations, remain.
The contractual exceptions are explained here (very short, easy to read): https://support.anthropic.com/en/articles/9528712-exceptions-to-our-usage-policy

The core of that page is as follows, emphasis added by me:
For example, with carefully selected government entities, we may allow foreign intelligence analysis in accordance with applicable law. All other use restrictions in our Usage Policy, including those prohibiting use for disinformation campaigns, the design or use of weapons, censorship, domestic surveillance, and malicious cyber operations, remain.
This is all public (in Anthropic’s up-to-date support.anthropic.com portal). Additionally, it was announced back in June, when Anthropic first set out its intentions and approach around government use.
The United States has laws that prevent US intelligence and defense agencies from spying on their own population. The Snowden revelations showed us that those agencies did not abide by those limits.
Facebook has a usage policy that forbids running misinformation campaigns on its platform. That did not stop US intelligence and defense agencies from running disinformation campaigns on its platform.
Instead of just trusting contracts, Anthropic could add oversight mechanisms, so that a few Anthropic employees can look at how the models are used in practice and check whether that use stays within the bounds Anthropic expects.
If all usage of the models is classified and out of reach of any checking by Anthropic employees, there’s no good reason to expect the contract to constrain US intelligence and defense agencies if they find it important to use the models outside of how Anthropic expects them to be used.
For example, with carefully selected government entities, we may allow foreign intelligence analysis in accordance with applicable law. All other use restrictions in our Usage Policy, including those prohibiting use for disinformation campaigns, the design or use of weapons, censorship, domestic surveillance, and malicious cyber operations, remain.
This sounds to me like a very carefully worded non-denial denial.
If you say that one example of an exception to your terms is allowing a selected government entity to do foreign intelligence analysis in accordance with applicable law, without disinformation campaigns, you are not denying that another exception might allow disinformation campaigns.
If Anthropic were sincere about this being the only exception that has been made, it would be easy to add a promise to “Exceptions to our Usage Policy” that Anthropic will publish every exception it makes, for the sake of transparency.
Don’t forget that probably only a tiny number of Anthropic employees have seen the actual contracts, and there’s a good chance that those employees are barred by classification from talking with other Anthropic employees about what’s in them.
At Anthropic you are a group of people who are supposed to think about AI safety and alignment in general. You could treat this as a test case for how to design alignment mechanisms, and “Exceptions to our Usage Policy” looks like a complete failure in that regard: it contains neither a mechanism to make all exceptions public nor any mechanism to make sure the policies are followed in practice.