GPT-4 Multiplication Competition
GPT-4 perfectly understands e.g. Karatsuba multiplication and can consistently multiply two-digit numbers, but hasn’t connected those dots; it needs some hand-holding to correctly use a multiplication algorithm. What’s the shortest prompt that gets GPT-4 to multiply 6-digit integers with over 75% accuracy?
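For reference, here is a minimal sketch of the kind of algorithm the post has in mind: Karatsuba multiplication that bottoms out in two-digit products, the size GPT-4 reportedly handles reliably. The function name `karatsuba` and the two-digit cutoff are illustrative assumptions, not something taken from any prompt in this thread.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers using Karatsuba's three-product recursion."""
    # Base case (assumed cutoff): if either operand has at most two digits,
    # multiply directly.
    if x < 100 or y < 100:
        return x * y

    # Split both operands at half the digit count of the longer one.
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)

    hi = karatsuba(x_hi, y_hi)
    lo = karatsuba(x_lo, y_lo)
    # (x_hi + x_lo) * (y_hi + y_lo) - hi - lo equals x_hi*y_lo + x_lo*y_hi,
    # so one extra recursive product replaces two.
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo

    return hi * base * base + mid * base + lo
```

Calling `karatsuba(123456, 654321)` should agree with Python's built-in `123456 * 654321`; the point is only that every step is either a split, an addition, or a small product.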
I predict this will pass (haven’t had a chance to test on GPT-4, but the results using GPT-3 were promising):
Seems to get it right:
Haha yes!
It didn’t have any trouble with the individual steps in your example, so assuming that is typical, I think it’s fairly likely just this will work:
Unfortunately, this shortened version of the prompt failed.
Having GPT-3/4 multiply numbers is a bit like eating soup with a fork. You can do it, and the larger you make the fork, the more soup you’ll get, but it’s not designed for it and it’s hugely impractical. GPT-4 does not have an internal algorithm for multiplication because the training objective (text completion) does not incentivize developing one. No iteration of GPT (5, 6, 7) will ever be a 100% accurate calculator (unless they change the paradigm away from LLM+RLHF); it will just asymptotically approach 100%. Why don’t we just make a spoon?
Agreed. However, humans also don’t have an internal multiplication algorithm, but can nonetheless use a scratchpad to multiply accurately (in extreme circumstances :P). I’ve chosen multiplication as an example here because it’s maybe the “simplest” thing GPT-4 can’t consistently do.
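To make the scratchpad idea concrete, here is a rough sketch of schoolbook multiplication that writes out every intermediate line, roughly what a person (or a chain-of-thought prompt) would put on paper. The name `scratchpad_multiply` and the format of the step strings are hypothetical choices for illustration only.

```python
def scratchpad_multiply(x: int, y: int) -> tuple[int, list[str]]:
    """Multiply x by y one digit of y at a time, recording each written step."""
    steps = []
    total = 0
    # Walk the digits of y from least to most significant.
    for position, digit_char in enumerate(reversed(str(y))):
        digit = int(digit_char)
        # Partial product: x times one digit, shifted by that digit's place value.
        partial = x * digit * 10 ** position
        total += partial
        steps.append(
            f"{x} x {digit} x 10^{position} = {partial}; running total = {total}"
        )
    return total, steps


product, steps = scratchpad_multiply(123456, 789012)
for line in steps:
    print(line)
```

Each line only needs a multi-digit-by-one-digit product and one addition, which is the sort of small step the comment suggests a model can do reliably once it is written down.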
What I’m finding interesting here is that GPT-4 knows how to break down multiplications (it can write perfect recursive code for multiplying large numbers). It also knows about chain of thought prompting. How close is it to being able to just… directly use the algorithms it knows? What’s the minimum amount of prodding needed? The following prompt is an upper bound on how hard it is to teach GPT-4 accurate multiplication:
(It correctly follows the procedure to obtain 182110560472.) Is there a shorter prompt that accomplishes the same thing?
I couldn’t get yours to ever work with Bing Chat, but I did eventually find something that did work for me (most of the time) and is about half the characters of yours without any real code golfing. My prompt was the following: