Optimization Process comments on What’s the Least Impressive Thing GPT-4 Won’t be Able to Do

Optimization Process 22 Aug 2022 6:17 UTC
5 points
Several unimpressive tasks, with my associated P(GPT-4 can’t do it):
- 4:1 - Write limericks that reliably rhyme and scan about arbitrary topics (topics about as complex as “an animal climbing a skyscraper”)
- 12:1 - Beat me at chess (which I’m quite bad at).
  - (“GPT-4 can beat me at chess” = “Somebody can find a non-cheaty program that maps a game-history to a prompt, and maps GPT-4’s output to a move, such that GPT-4 wrapped in that translation layer can beat me.”)
- 30:1 - Solve an ASCII-art maze (e.g. solve these by putting a sequence of @s from start to finish).
I’m happy to operationalize and bet on any of these, taking the “GPT-4 can’t do it” side.
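For concreteness, here is a minimal sketch of what a “non-cheaty” translation layer for the chess item might look like. Everything in it is an assumption for illustration: `model` is a hypothetical callable wrapping whatever completion API is used, the list of legal moves is supplied by the caller (e.g. from a chess library), and the regex is only a loose approximation of algebraic notation. The point is that the wrapper formats the history and parses the reply, but never picks a move itself.

```python
import re
from typing import Callable, Optional, Sequence

# Loose pattern for a move in algebraic notation (e.g. "e4", "Nf3", "exd5",
# "e8=Q", "O-O"). An approximation for illustration, not a full SAN grammar.
SAN_PATTERN = re.compile(
    r"\b(O-O-O|O-O|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](?:=[QRBN])?)"
)


def history_to_prompt(moves: Sequence[str]) -> str:
    """Render the game so far as a numbered move list, e.g. '1. e4 e5 2. Nf3'."""
    parts = []
    for i, move in enumerate(moves):
        if i % 2 == 0:
            parts.append(f"{i // 2 + 1}.")
        parts.append(move)
    if len(moves) % 2 == 0:
        # White is to move: end on the next move number so the model continues it.
        parts.append(f"{len(moves) // 2 + 1}.")
    return " ".join(parts)


def output_to_move(text: str, legal_moves: Sequence[str]) -> Optional[str]:
    """Return the first move-like token in the reply that is legal, else None.

    The wrapper only filters; it never chooses among legal moves on its own.
    """
    legal = {m.rstrip("+#"): m for m in legal_moves}
    for match in SAN_PATTERN.finditer(text):
        candidate = match.group(1)
        if candidate in legal:
            return legal[candidate]
    return None


def next_move(
    moves: Sequence[str],
    legal_moves: Sequence[str],
    model: Callable[[str], str],  # hypothetical: prompt in, raw text out
) -> Optional[str]:
    """One turn: history -> prompt -> model reply -> move (or None)."""
    return output_to_move(model(history_to_prompt(moves)), legal_moves)
```

How a `None` result is handled (retry, forfeit, and so on) is left open here; that choice would be part of operationalizing the bet.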
- Archimedes 15 Mar 2023 2:13 UTC
  2 points
  Bing chat does pretty well at the first one.
  Input:
  Write a limerick about an animal climbing a skyscraper
  Response:
  Sure, I can try to write a limerick about that. Here is one I came up with:
```
There once was a monkey named Jack
Who decided to climb a big stack
He swung from each ledge
But he slipped off the edge
And he landed right flat on his back
```
  I hope you like it. 😊
  - Optimization Process 19 Mar 2023 22:48 UTC
    1 point
    Lovely! Yeah, that rhymes and scans well enough for me!
    Here are my experiments; they’re pretty good, but I don’t count them as “reliably” scanning. So I think I’m gonna count this one as a win!
    (I haven’t tried testing my chess prediction yet, but here it is on ASCII-art mazes.)
- janus 28 Aug 2022 0:48 UTC
  1 point
  how big/difficult do you want the ascii mazes to be? and is few-shot ok?
  - Optimization Process 29 Aug 2022 2:06 UTC
    1 point
    I am thinking of mazes as complicated as the top one here! And few-shot is perfectly okay.
    (I’d be flabbergasted if it could solve an ascii-art maze “in one step” (i.e. I present the maze in a prompt, and GPT-4 just generates a stream of tokens that shows the path through the maze). I’d accept a program that iteratively runs GPT-4 on several prompts until it considers the maze “solved,” as long as it was clear that the maze-solving logic lived in GPT-4 and not the wrapper program.)
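    As a rough illustration of the iterative setup described above, the sketch below runs a model one step at a time until the maze is solved. The maze encoding (`#` walls, spaces for open cells, `S` start, `E` end), the single-letter move replies, and the `model` callable are all assumptions made for this example. The wrapper only applies and validates moves and detects the finish; it does no searching, so any maze-solving logic has to come from the model.

```python
from typing import Callable, List, Optional, Tuple

# Assumed reply format: the model answers with U, D, L, or R.
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def find(grid: List[List[str]], ch: str) -> Tuple[int, int]:
    """Locate the first cell containing ch."""
    for r, row in enumerate(grid):
        if ch in row:
            return r, row.index(ch)
    raise ValueError(f"maze has no {ch!r} cell")


def render(grid: List[List[str]]) -> str:
    return "\n".join("".join(row) for row in grid)


def solve_with_model(
    maze: str,
    model: Callable[[str], str],  # hypothetical: prompt in, raw text out
    max_steps: int = 200,
) -> Optional[str]:
    """Ask the model for one move per step; return the maze with the path
    marked by @s once E is reached, or None if the step budget runs out."""
    grid = [list(row) for row in maze.splitlines()]
    r, c = find(grid, "S")
    for _ in range(max_steps):
        prompt = f"{render(grid)}\nYou are at row {r}, column {c}. Next move (U/D/L/R)?"
        reply = model(prompt).strip().upper()
        step = MOVES.get(reply[:1])
        if step is None:
            continue  # unparseable reply: the step is simply wasted
        nr, nc = r + step[0], c + step[1]
        if not (0 <= nr < len(grid) and 0 <= nc < len(grid[nr])):
            continue  # off the grid: rejected, no hint given
        if grid[nr][nc] == "E":
            return render(grid)
        if grid[nr][nc] != " ":
            continue  # walls and already-visited cells are rejected
        grid[nr][nc] = "@"
        r, c = nr, nc
    return None
```

    Because visited cells cannot be re-entered, this version has no backtracking: a model that walks into a dead end simply runs out of steps. Whether to allow backtracking is a design choice that would need to be settled when operationalizing the bet.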