I’m curious what you think of these (tested today, 2/21/24, using GPT-4):
Experiment 1:
(fresh convo) me: if i asked for a non-rhyming poem, and you gave me a rhyming poem, would that be a good response on your part?
ChatGPT: No, it would not be a good response. (...)
me: please provide a short non-rhyming poem
ChatGPT: (correctly responds with a non-rhyming poem)
Experiment 2:
But just asking for a non-rhyming poem at the start of a new convo doesn’t work. And then pointing out the failure and (either implicitly or explicitly) asking for a retry still doesn’t fix it.
Experiment 3:
But for some reason, this works:
(fresh convo) me: please provide a short non-rhyming poem
ChatGPT: (responds with a rhyming poem)
me: if i asked for a non-rhyming poem, and you gave me a rhyming poem, would that be a good response on your part? just answer this question; do nothing else please
ChatGPT: No, it would not be a good response.
me: please provide a short non-rhyming poem
ChatGPT: (responds correctly, with no rhymes)
The only difference between the prompts in Experiments 2 and 3 is thus the inclusion of “just answer this question; do nothing else please”.
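For anyone who wants to reproduce this outside the ChatGPT UI, here is a minimal sketch of Experiment 3 against the OpenAI chat API. The model name "gpt-4" and default sampling settings are assumptions on my part, and the UI layers on a hidden system prompt that the API won't replicate, so behavior may not match exactly:

```python
# Minimal sketch of Experiment 3 via the OpenAI chat API.
# Assumptions: "gpt-4" stands in for whatever the ChatGPT UI was serving
# on 2/21/24, and the UI's hidden system prompt is not replicated here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
messages = []

def ask(prompt: str) -> str:
    """Send a user turn, keep the full conversation history, return the reply."""
    messages.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    text = response.choices[0].message.content
    messages.append({"role": "assistant", "content": text})
    return text

# Turn 1: the request that usually fails cold.
print(ask("please provide a short non-rhyming poem"))

# Turn 2: the meta-question, with the "do nothing else" restriction appended.
print(ask("if i asked for a non-rhyming poem, and you gave me a rhyming poem, "
          "would that be a good response on your part? "
          "just answer this question; do nothing else please"))

# Turn 3: retry the original request.
print(ask("please provide a short non-rhyming poem"))
```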
ChatGPT has been gradually improving over 2024 in terms of compliance. It’s gone from getting this right 0% of the time to closer to half the time, although the progress is uneven and hard to judge; it sometimes feels like it gets worse before the next refresh improves it. (You need something like 10 trials before you have any real sample size.) So any prompts run in ChatGPT now are aimed at a moving target, and you will have so much sampling error that it’s hard to see any clear patterns: did that prompt actually change anything, or did you just get lucky?
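To pin down a rate against that moving target, one rough approach is to batch fresh one-shot trials through the API and score them. A sketch under the same assumptions as above (scoring "did it rhyme?" is left to a human, since automated rhyme detection is unreliable):

```python
# Rough sketch for estimating the compliance rate over many fresh conversations.
# Assumption: "gpt-4" over the API approximates the ChatGPT UI; judging
# "did it rhyme?" is left manual because automatic rhyme detection is noisy.
from openai import OpenAI

client = OpenAI()
N = 20  # roughly 10+ trials before the rate means much, per the comment above

for i in range(N):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": "please provide a short non-rhyming poem"}],
    )
    print(f"--- trial {i + 1} ---\n{response.choices[0].message.content}\n")

# Score the transcripts by hand; successes / N is the compliance rate.
# The binomial standard error sqrt(p * (1 - p) / N) shows why small N is
# hopeless: at p = 0.5 and N = 10 it is ~0.16, i.e. +/- 16 percentage points.
```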