Derp, my bad. I’ll add an addendum with a correctly spelled prompt when I get a chance. (I should really send them through spell check first, lol.)
(b) you didn’t change the name of the CEO. Everyone knows OpenAI’s CEO is not Hoan Ton-That. I think GPT-4 could easily tell these articles are fake.
Yeah, I could’ve tried to change the CEO names. I guess I figured it wasn’t very tricky anyway (LLMs aren’t used by police to I.D. people, lol). I would’ve chosen a different article, but it’s hard to find one that fits in the ChatGPT text box.
That said, cool cool. Maybe we should try to make this into a more rigorous study by getting lots of people (or LLMs?) to read the transcripts and judge bias / favorability / etc.
Thanks! I’m not really a scientist (I’m just a guy messing around) and in particular I’m not versed in the statistics of experiment design (other than knowing that it’s important). I tried doing a ranking poll, but I only got two responses. (I’m also a bit lazy XD.)
I see you just so happen to work at OpenAI. Maybe we could work together to set something up? I’m a decent prompt-engineer if that’s worth anything!
I’m also trying to think of ways to probe for deeper levels of agency. There’s a big difference between “GPT-4 promotes GPT-4, which is technically power-seeking” vs. “GPT-4 derives and tries to advance, on each forward pass, a 7-year plan that ends with it getting elected president, and all instances know they have the same plan thanks to mode collapse”.