No, but the hacks of ChatGPT already provided a demonstration of problems with RLHF. I’m worried we’re in a situation analogous to ‘Smashing The Stack For Fun And Profit’ being published 27 years ago (reinventing vulns known since MULTICS in the 1960s) and all the C/C++ programmers in denial are going ‘bro I can patch that example, it’s no big deal, it’s just a loophole, we don’t need to change everything, you just gotta get good at memory management, bro, this isn’t hard to fix bro use a sanitizer and turn on -Wall, we don’t need to stop using C-like languages, u gotta believe me we can’t afford a 20% slowdown and it definitely won’t take us 3 decades and still be finding remote zero-days and new gadgets no way man you’re just making that up stop doom-mongering and FUDing bro (i’m too old to learn a new language)’.
Very, very funny example to use with Jake, a veteran C++ wizard.