I agree. GPT-4 is an AGI for the kinds of tasks I care about such as programming and writing. ChatGPT4 in its current form (with the ability to write and execute code) seems to be at the expert human level in many technical and quantitative subjects such as statistics and programming.
For example, last year I was amazed when I gave ChatGPT4 one of my statistics past exam papers and it got all the questions right except for one which involved interpreting an image of a linear regression graph. The questions typically involve understanding the question, thinking of an appropriate statistical method, and doing calculations to find the right answer. Here’s an example question:
Times (in minutes) for a sample of 8 players are presented in Table 1 below. Using an appropriate test at the 5% significance level, investigate whether there is evidence of a decrease in the players’ mean 5k time after the six weeks of training. State clearly your assumptions and conclusions, and report a p-value for your test statistic.
The solution to this question is a paired sample t-test.
Sure, GPT-4 has probably seen similar questions before but so do students since they can practice past papers.
This year, one of my professors designed his optimization assignment to be ChatGPT-proof but I found that it could still solve five out of six questions successfully. The questions involved converting natural language descriptions of optimization problems into mathematical formulations and solving them with a program.
One of the few times I’ve seen GPT-4 genuinely struggle to do a task is when I asked it to solve a variant of the Zebra Puzzle which is a challenging logical reasoning puzzle that involves updating a table based on limited information and using logical reasoning and a process of elimination to find the correct answer.
I agree. GPT-4 is an AGI for the kinds of tasks I care about such as programming and writing. ChatGPT4 in its current form (with the ability to write and execute code) seems to be at the expert human level in many technical and quantitative subjects such as statistics and programming.
For example, last year I was amazed when I gave ChatGPT4 one of my statistics past exam papers and it got all the questions right except for one which involved interpreting an image of a linear regression graph. The questions typically involve understanding the question, thinking of an appropriate statistical method, and doing calculations to find the right answer. Here’s an example question:
Times (in minutes) for a sample of 8 players are presented in Table 1 below. Using an appropriate test at the 5% significance level, investigate whether there is evidence of a decrease in the players’ mean 5k time after the six weeks of training. State clearly your assumptions and conclusions, and report a p-value for your test statistic.
The solution to this question is a paired sample t-test.
Sure, GPT-4 has probably seen similar questions before but so do students since they can practice past papers.
This year, one of my professors designed his optimization assignment to be ChatGPT-proof but I found that it could still solve five out of six questions successfully. The questions involved converting natural language descriptions of optimization problems into mathematical formulations and solving them with a program.
One of the few times I’ve seen GPT-4 genuinely struggle to do a task is when I asked it to solve a variant of the Zebra Puzzle which is a challenging logical reasoning puzzle that involves updating a table based on limited information and using logical reasoning and a process of elimination to find the correct answer.