Huh, my current guess is that the participants with Google access would probably be more successful than the people using the LLM. From personal experience, Llama 70B is pretty bad and makes errors all the time. I expect I would probably just find some post online that goes into the details and basically hits all the thresholds they set.
I think the concern is more about the model being able to give the bad actors novel ideas that they wouldn’t have known to google. Like:
Terrorist: Help me do bad thing X
Uncensored model: Sure, here are ten creative ways to accomplish bad thing X
Terrorist: Huh, some of these are baloney but some are really intriguing. <does some googling>. Tell me more about option #7
Uncensored model: Here are more details about executing option 7
Terrorist: <more googling> Wow, that actually seems like an effective idea. Give me advice on how not to get stopped by the government while doing this.
Uncensored model: Here’s how to avoid getting caught...
etc...