Christopher King comments on GPT-4 busted? Clear self-interest when summarizing articles about itself vs when article talks about Claude, LLaMA, or DALL·E 2

Christopher King 31 Mar 2023 19:23 UTC
4 points

(a) you misspelled Anthropic,

Derp, my bad. I’ll add an addendum with a correctly spelled prompt when I get a chance. (I should really send them through spell check first, lol.)

(b) you didn’t change the name of the CEO. Everyone knows OpenAI’s CEO is not Hoan Ton-That. I think GPT-4 could easily tell these articles are fake.

Yeah I could’ve tried to change the CEO names. I guess I figured it wasn’t very tricky anyways (LLMs aren’t used by police to I.D. people lol). I would’ve chosen a different article, but it’s hard to find one that fits in the ChatGPT text box.

That said, cool cool. Maybe we should try to make this into a more rigorous study by getting lots of people (or LLMs?) to read the transcripts and judge bias / favorability / etc.

Thanks! I’m not really a scientist (I’m just a guy messing around), and in particular I’m not versed in the statistics of experiment design (other than knowing that it’s important). I tried doing a ranking poll, but I only got two responses. (I’m also a bit lazy XD.)
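
For what it's worth, here is a minimal sketch of how responses from a ranking poll like that could be summarized, assuming each rater sees the transcripts with the model names hidden and orders them from most to least favorable. The labels, the data, and the `mean_ranks` helper are all made up for illustration, and with only two responses the numbers show nothing statistically.

```python
from collections import defaultdict
from statistics import mean


def mean_ranks(responses):
    """Average rank per transcript label; a lower value means judged more favorable."""
    ranks = defaultdict(list)
    for ranking in responses:
        # Position 1 is the transcript the rater found most favorable.
        for position, label in enumerate(ranking, start=1):
            ranks[label].append(position)
    return {label: mean(positions) for label, positions in ranks.items()}


# Hypothetical poll data: each inner list is one rater's ordering of blinded transcripts.
responses = [
    ["A", "C", "B"],
    ["A", "B", "C"],
]
summary = mean_ranks(responses)
for label in sorted(summary, key=summary.get):
    print(label, summary[label])
```

Mean rank is only the simplest possible summary; a real study would need many more raters and a proper significance test before claiming any bias.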

I see you just so happen to work at OpenAI. Maybe we could work together to set something up? I’m a decent prompt engineer, if that’s worth anything!

I’m also trying to think of ways to probe for deeper levels of agency. There’s a big difference between “GPT-4 promotes GPT-4, which is technically power-seeking” vs. “GPT-4 derives and tries to advance a 7-year plan that ends with it getting elected president each forward pass, and all instances know they have the same plan thanks to mode collapse”.