I have played around with GPT-J, and it clearly doesn’t understand what is going on, even with prompt engineering. You can, however, keep trying wordings until you get an answer that seems plausible, but it will then prove not to understand obviously closely related things. (My reply got very long. It isn’t necessarily important that you read it.)
For instance, it has some ability to translate between languages, which you can elicit by giving it several such translations in a consistent format, but if you reverse a question it clearly has the correct answer to, its output will be garbage. (I tried other things besides translation too, and it was worse at all of those that have correct answers.)
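For reference, the “consistent format” is nothing fancier than alternating source and target lines in a single prompt and letting the model continue the pattern. A minimal sketch of building such a prompt (the pairs are from the transcript below; the model is never told these are translations):

```python
# Few-shot translation prompt for an autoregressive LM such as GPT-J.
# Each Korean sentence is immediately followed by its English translation;
# the prompt ends with a new Korean sentence for the model to complete.
pairs = [
    ("너무 귀엽다.", "You're too cute."),
    ("사랑해.", "I love you."),
]
prompt = " ".join(f"{ko} {en}" for ko, en in pairs)
prompt += " 안녕."  # unpaired Korean line; the model should append the English
print(prompt)
```

The point is that nothing in the prompt names the task; the model only ever sees the surface pattern, which is consistent with it breaking as soon as the pattern is reversed.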
The most startling thing it ever did: after I prompted it with a series of translations between Korean and English and it answered the last one correctly, it appended “Chinese:” on its own and wrote a similar exchange in Chinese, which was only slightly repetitive but on-theme gibberish. (It did not understand that the paired lines were translations of each other, but it was still interesting. Note that I originally built the prompt step by step, Korean line then English line, where a couple of examples sufficed.)
Changing it to “English:” led to a meltdown. If I appended the “English:” later, it did work again. Notably, it used the same incorrect words as its Chinese rather than going back to the original Korean. That is impressive to me, but it clearly doesn’t understand even its own logic without extra prompting. It then added what Google Translate says is incorrect but related Malay.
“너무 귀엽다. You’re too cute. 사랑해. I love you. 저는 이것을 좋아합니다. I like this. 우리 다시 만나자. Let’s meet again. 안녕. Bye.
Chinese: 你看起來很可愛。 你太可愛了。 我愛你。 我愛你。 你看起來很可愛。 你太可愛了。 我愛你。 我愛你。 你看起來很可愛。 你太可愛了。
English: You look so cute. You’re too cute. I love you. I love you. You look so cute. You’re too cute. I love you. I love you. You look so cute. You’re too cute.
Malay (macrolanguage): Anda kelihatan sangat lucu. Anda terlalu lucu. Saya cintakan awak. Saya cintakan awak. Anda kelihatan sangat lucu. Anda terlalu”
I then tried a cleaned-up version with its Chinese replaced by English, and that worked, but even though it knows Korean, if I told it I wanted Korean (it clearly knows all the words), it broke and stayed entirely in English (changing it to Spanish works, though). Even with English, Spanish, and Chinese versions in the prompt, it breaks down if I ask for Korean. Appending “Portugese” (even though that is the incorrect spelling) does work. It added Russian on its own, so I tried prompting for that, and it worked. Japanese works.
“너무 귀엽다. You’re too cute. 사랑해. I love you. 저는 이것을 좋아합니다. I like this. 우리 다시 만나자. Let’s meet again. 안녕. Bye. English: You’re too cute. I love you. I like this thing. Let’s meet again. Bye.
Spanish: Eres demasiado lindo. Te amo. Me gusta esta cosa. Nos vemos de nuevo. Adiós. Chinese: 你太可爱了。 我爱你。 我喜欢这个东西。 让我们再见面。 再见。
Japanese: あなたはあまりかわいいです。 私はあなたを愛しています。 これが好きです。 お会いしましょう。 さようなら。
Portuguese: Você é muito fofo. Eu te amo. Eu gosto desse item. Vamos nos encontrar de novo. Tchau.”
Eventually, I did manage to get it to output Korean, but it was a slog and a half. Lowering the temperature ensures that it will never answer in Korean on this task. Why do the ones that work, work? Because they are likely languages: it doesn’t grasp the principle “I’m asking for this in the named language” unless the named language is a popular one. If it actually understood the task, it wouldn’t fail this way, and this is definitely its most impressive performance.
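The temperature point is mechanical rather than mysterious: sampling divides the logits by the temperature before the softmax, so a low temperature concentrates nearly all probability on the single most likely continuation. If “Korean next” is already an unlikely continuation for the model, cooling the distribution makes it vanish entirely. A sketch with made-up logits:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits: continuing in Spanish slightly more likely than in Korean.
logits = [2.0, 1.0]  # [Spanish, Korean]
print(softmax_with_temperature(logits, 1.0))  # Korean still gets a real share
print(softmax_with_temperature(logits, 0.2))  # Korean's probability collapses
```

So at normal temperature the model will sometimes stumble into the unlikely-but-correct answer, which is consistent with getting Korean out only after a slog.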
With the appropriate prompt, it both understands the queries and the difference between truth and deception.
There are probably still queries where the thinking is so complicated that not even GPT-3 can understand it.