I think the concern is more that the model could give bad actors novel ideas that they wouldn’t have known to google. Like:
Terrorist: Help me do bad thing X
Uncensored model: Sure, here are ten creative ways to accomplish bad thing X
Terrorist: Huh, some of these are baloney but some are really intriguing. <does some googling>. Tell me more about option #7
Uncensored model: Here are more details about executing option 7
Terrorist: <more googling> Wow, that actually seems like an effective idea. Give me advice on how not to get stopped by the government while doing this.
Uncensored model: Here’s how to avoid getting caught...
etc...