gwern comments on RobertM’s Shortform

gwern Oct 3, 2024, 5:34 PM
10 points
2
The reason for these hacks seems pretty interesting: https://krebsonsecurity.com/2024/10/a-single-cloud-compromise-can-feed-an-army-of-ai-sex-bots/ https://permiso.io/blog/exploiting-hosted-models

Apparently this isn’t a simple theft of service as I had assumed, but is driven by the partial success of LLM jailbreaks: hackers are now incentivized to hack any API-enabled account they can in order to use it not for generic LLM purposes, but specifically for NSFW & child porn chat services, both draining & burning the accounts.

I had been a little puzzled why anyone would target LLM services specifically, when LLMs are so cheap in general, and falling rapidly in cost. Was there really that much demand to economize on LLM calls of a few dozen or hundred dollars, by people who needed a lot of LLMs (enough to cover the large costs of hacking and creating a business ecosystem around it) and couldn’t get LLMs any other way, such as local hosting...? This explains that: the theft of money is only half the story. They are also setting the victim up as the fall guy, and you don’t realize it because the logging is off and you can’t read the completions. Quite alarming.

And this is now a concrete example of the harms caused by jailbreaks, incidentally: they incentivize exploiting API accounts in order to use & burn them. If the jailbreaks didn’t work, they wouldn’t bother.
- jimrandomh Oct 3, 2024, 10:00 PM
  5 points
  0
  Parent
  Worth noting explicitly: while there weren’t any logs left of prompts or completions, there were logs of API invocations and errors, which contained indications that whatever this was, it was still under development and not an already-scaled setup. E.g., we saw API calls fail with invalid-arguments, then get retried successfully after a delay.
  The indicators-of-compromise aren’t a good match between the Permiso blog post and what we see in logs; in particular we see the user agent string "Boto3/1.29.7 md/Botocore#1.32.7 ua/2.0 os/windows#10 md/arch#amd64 lang/python#3.12.4 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.32.7", which is not mentioned. While I haven’t checked all the IPs, I checked a sampling and they didn’t overlap. (The IPs are a very weak signal, however, since they were definitely botnet IPs and botnets can be large.)
  - gwern Oct 4, 2024, 12:34 AM
    7 points
    1
    Parent
    Permiso seems to think there may be multiple attacker groups, as they always refer to plural attackers and discuss a variety of indicators and clusters. And I don’t see any reason to think there is a single attacker: there’s no reason to think Chub is the only LLM sexting service, and even if it were, the logical way for Chub to operate would be to buy API access on a black market from all comers without asking any questions, and focus on their own business. So that may just mean that you guys got hit by another hacker who was still setting up their own workflow and exploitation infrastructure.
    
    (It’s a big Internet. Like how all those Facebook DALL-E AI slop images are not from a single person or group, or even a single network of influencers; it’s more like several different communities across various third world languages coordinating to churn out AI slop for Facebook ‘engagement’ payments, all sharing tutorials and get-rich-quick schemes.)
    - gwern Oct 4, 2024, 7:23 PM
      3 points
      1
      Parent
      An additional reason to think there are many attackers: I submitted this to Reddit and a Redditor says they use these sorts of ‘reverse proxy’ services regularly and that they’ve been discussed overtly on 4chan for at least a year: https://www.reddit.com/r/MediaSynthesis/comments/1fvd6tn/recent_wave_of_llm_cloud_api_hacks_motivated_by/lqb7lzf/ Obviously, if your attacker really was a newbie, as their slipups suggest, while the hackers providing these reverse proxies have been operating for at least a year at sufficient scale to generate regular discussions on 4chan alone, then there are probably quite a lot of them competing.
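
For anyone wanting to repeat the indicator comparison jimrandomh describes above against their own logs, here is a minimal sketch in Python. It splits a botocore-style user agent string into fields and checks a logged user agent and source IPs against a published indicator list. The function names, the field-naming scheme, and the sample IPs (taken from reserved documentation ranges) are all assumptions made for illustration; only the user agent string comes from the thread, and the real indicator lists would have to be copied from the Permiso post.

```python
"""Sketch: compare a logged user agent and source IPs against published IoCs."""


def parse_user_agent(ua: str) -> dict[str, str]:
    """Split a botocore-style user agent into fields.

    "Boto3/1.29.7"   -> {"Boto3": "1.29.7"}
    "md/arch#amd64"  -> {"md/arch": "amd64"}
    The key naming is this sketch's own convention, not an official format.
    """
    fields: dict[str, str] = {}
    for token in ua.split():
        prefix, _, rest = token.partition("/")
        if "#" in rest:
            name, _, value = rest.partition("#")
            fields[f"{prefix}/{name}"] = value
        else:
            fields[prefix] = rest
    return fields


def compare_to_iocs(ua, source_ips, ioc_user_agents, ioc_ips):
    """Report exact user-agent matches and any overlapping source IPs."""
    return {
        "user_agent_listed": ua in ioc_user_agents,
        "overlapping_ips": sorted(set(source_ips) & set(ioc_ips)),
    }


# The user agent string quoted in the thread.
OBSERVED_UA = (
    "Boto3/1.29.7 md/Botocore#1.32.7 ua/2.0 os/windows#10 md/arch#amd64 "
    "lang/python#3.12.4 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.32.7"
)

# Hypothetical placeholders: fill these from your own logs and the published list.
OBSERVED_IPS = ["192.0.2.10", "192.0.2.11"]
IOC_USER_AGENTS: set[str] = set()
IOC_IPS = {"198.51.100.7"}

if __name__ == "__main__":
    print(parse_user_agent(OBSERVED_UA))
    print(compare_to_iocs(OBSERVED_UA, OBSERVED_IPS, IOC_USER_AGENTS, IOC_IPS))
```

As noted in the thread, an empty IP overlap is only weak evidence either way, since botnet IP pools can be large; the user agent fields (client version, OS, retry mode) are more useful for telling tooling apart.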