Bruce G

Karma: 73

Bruce G Dec 25, 2022, 8:31 PM
2 points
1
in reply to: JBlack’s comment on: Are there any reliable CAPTCHAs? Competition for CAPTCHA ideas that AIs can’t solve.
I can see the numbers on the notes and infer that they denote United States Dollars, but have zero idea of what the coins are worth. I would expect that anyone outside United States would have to look up every coin type and so take very much more than 3-4 times longer clicking images with boats. Especially if the coins have multiple variations.
If a system like this were widely deployed online using US currency, people outside the US would need to familiarize themselves with US currency if they are not already familiar with it. But they would only need to do this once and then it should be easy to remember for subsequent instances. There are only 6 denominations of US coins in circulation - $0.01, $0.05, $0.10, $0.25, $0.50, and $1.00 - and although there are variations for some of them, they mostly follow a very similar pattern. They also frequently have words on them like “ONE CENT” ($0.01) or “QUARTER DOLLAR” ($0.25) indicating the value, so it should be possible for non-US people to become familiar with those.
Alternatively, an easier option could be using country-specific captchas which show a picture like this except with the currency of whatever country the internet user is in. This would only require extra work for VPN users who seek to conceal their location by having the VPN make it look like they are in some other country.
If the image additionally included coin-like tokens, it would be a nontrivial research project (on the order of an hour) to verify that each such object is in fact not any form of legal tender, past or present, in the United States.
The idea was that the tokens would only be similar in broad shape and color, but would be different enough from actual legal tender coins that I would expect a human to easily tell the two apart.
Some examples would be:
https://barcade.com/wp-content/uploads/2021/07/BarcadeToken_OPT.png
https://www.pinterest.com/pin/64105994675283502/
Even if all the above were solved, you still need such images to be easily generated in a manner that any human can solve it fairly quickly but a machine vision system custom trained to solve this type of problem, based on at least thousands of different examples, can’t. This is much harder than it sounds.
I agree that the difficulty of generating a lot of these is the main disadvantage, as you would probably have to just take a huge number of real pictures like this which would be very time consuming. It is not clear to me that Dall-E or other AI image generators could produce such pictures with enough realism and detail that it would be possible for human users to determine how much money is supposed to be in the fake image (and have many humans all converge to the same answer). You also might get weird things using Dall-E for this, like 2 corners of the same bill having different numbers indicating the bill’s denomination.
But I maintain that, once a large set of such images exists, training a custom machine vision system to solve these would be very difficult. It would require much more work than simply fine tuning an off-the-shelf vision system to answer the binary question of “Does this image contain a bus?”.
Suppose that, say, a few hundred people worked for several months to create 1,000,000 of these in total and then started deploying them. If you are a malicious AI developer trying to crack this, the mere tasks of compiling a properly labeled data set (or multiple data sets) and deciding how many sub-models to train and how they should cooperate (if you use more than one) are already non-trivial problems that you have to solve just to get started. So I think it would take more than a few days.

Bruce G Dec 25, 2022, 2:14 AM
5 points
2
in reply to: gbear605’s comment on: Are there any reliable CAPTCHAs? Competition for CAPTCHA ideas that AIs can’t solve.
If only 90% can solve the captcha within one minute, it does not follow that the other 10% are completely unable to solve it and faced with “yet another barrier to living in our modern society”.
It could be that the other 10% just need a longer time period to solve it (which might still be relatively trivial, like needing 2 or 3 minutes) or they may need multiple tries.
If we are talking about someone at the extreme low end of the captcha proficiency distribution, such that the person can not even solve in a half hour something that 90% of the population can answer in 60 seconds, then I would expect that person to already need assistance with setting up an email account/completing government forms online/etc, so whoever is helping them with that would also help with the captcha.
(I am also assuming that this post is only for vision-based captchas, and blind people would still take a hearing-based alternative.)

Bruce G Dec 25, 2022, 1:38 AM
12 points
0
on: Are there any reliable CAPTCHAs? Competition for CAPTCHA ideas that AIs can’t solve.
One type of question that would be straightforward for humans to answer, but difficult to train a machine learning model to answer reliably, would be to ask “How much money is visible in this picture?” for images like this:

If you have pictures with bills, coins, and non-money objects in random configurations—with many items overlapping and partly occluding each other—it is still fairly easy for humans to pick out what is what from the image.
But to get an AI to do this would be more difficult than a normal image classification problem where you can just fine tune a vision model with a bunch of task-relevant training cases. It would probably require multiple denomination-specific vision models working together, as well as some robust way for the model to determine where one object ends and another begins.
I would also expect such an AI to be more confounded by any adversarial factors—such as the inclusion of non-money arcade tokens or drawings of coins or colored-in circles—added to the image.
Now, maybe to solve this in under one minute some people would need to start the timer when they already have a calculator in hand (or the captcha screen would need to include an on-screen calculator). But in general, as long as there is not a huge number of coins and bills, I don’t think this type of captcha would take the average person more than say 3-4 times longer than it takes them to complete the “select all squares with traffic lights” type captchas in use now. (Though some may want to familiarize themselves with the various $1.00 and $0.50 coins that exist and some of the variations of the tails sides of quarters if this becomes the new prove-you-are-a-human method.)

Bruce G Jun 25, 2022, 11:40 PM
1 point
in reply to: Oam Patel’s comment on: A Toy Model of Gradient Hacking
The intent of the scenario is to find what model dominates, so probably loss should be non-negative. If you use squared error in that scenario, then the loss of the mixture is always greater than or equal to the loss of any particular model in the mixture.
I don’t see why that would necessarily be true. Say you have 3 data points from my $Y = X + 1$ example from above:
1. (0,1)
2. (1,2)
3. (2,3)
And say the composite model is a weighted average of $Y = X$ and $Y = X + 2$ with equal weights (so just the regular average).
This means that the composite model outputs will be:
$Y = \frac{(\text{FirstComponentOutput}) + (\text{SecondComponentOutput})}{2} = \frac{X + (X + 2)}{2} = \frac{2X + 2}{2} = X + 1$
Thus the composite model would be right on the line, and get each data point Y-value exactly right (and have 0 loss).
The squared error loss would be:
$\text{TotalLoss} = (\text{ModelOutput}(0) - 1)^{2} + (\text{ModelOutput}(1) - 2)^{2} + (\text{ModelOutput}(2) - 3)^{2}$
$= ((0 + 1) - 1)^{2} + ((1 + 1) - 2)^{2} + ((2 + 1) - 3)^{2} = 0$
By contrast, each of the two component models would have a total squared error of 3 for these 3 data points.
The $Y = X$ component model would have total squared error loss of:
$\text{TotalLoss} = (\text{ModelOutput}(0) - 1)^{2} + (\text{ModelOutput}(1) - 2)^{2} + (\text{ModelOutput}(2) - 3)^{2}$
$= (0 - 1)^{2} + (1 - 2)^{2} + (2 - 3)^{2} = 3$
The $Y = X + 2$ component model would have total squared error loss of:
$\text{TotalLoss} = (\text{ModelOutput}(0) - 1)^{2} + (\text{ModelOutput}(1) - 2)^{2} + (\text{ModelOutput}(2) - 3)^{2}$
$= ((0 + 2) - 1)^{2} + ((1 + 2) - 2)^{2} + ((2 + 2) - 3)^{2} = 3$
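The arithmetic above can be reproduced in a few lines of Python. This is only a minimal sketch of the 3-point example; the names `total_squared_error`, `first_component`, `second_component`, and `composite` are made up for illustration and do not come from the original post.

```python
# The 3 data points from the Y = X + 1 example.
DATA = [(0, 1), (1, 2), (2, 3)]

def first_component(x):
    """Component model Y = X."""
    return x

def second_component(x):
    """Component model Y = X + 2."""
    return x + 2

def composite(x):
    """Equal-weight average of the two component outputs."""
    return (first_component(x) + second_component(x)) / 2

def total_squared_error(model):
    """Sum of squared errors of `model` over the 3 data points."""
    return sum((model(x) - y) ** 2 for x, y in DATA)

if __name__ == "__main__":
    for name, model in [("Y = X", first_component),
                        ("Y = X + 2", second_component),
                        ("average", composite)]:
        print(name, total_squared_error(model))
```

Each component on its own has a total squared error of 3, while the equal-weight average has 0, so the mixture's loss is not bounded below by the loss of its components.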
For a 2-component weighted average model with a scalar output, the output should always be between the outputs of each component model. Furthermore, if you have such a model, and one component is getting the answers exactly correct while the other isn’t, you can always get a lower loss by giving more weight to the component model with exactly correct answers. So I would expect a gradient descent process to do that.
I don’t think ML engineers will pass in weights of the models to the models themselves (except maybe for certain tasks like game-theoretic simulations). The worry is that data spills easily and that SGD might find absurd, unpredictable ways to sneak weights (or some other correlated variable) into the model.
From the description, it sounded to me like this instance of gradient descent is treating the outputs of the component models $M^{-}$ and $M^{+}$ as features in a linear regression type problem.
In such a case, I would not expect data about the weights of each model to “spill” or in any way affect the output of either component model (unless the machine learning engineers are deliberately altering the data inputs depending on what the weights are, or something like that, and I see no reason why they would do that).
If it is a different situation—like if a neural net or some part or some layers of a neural net is a “gradient hacker” I would expect under normal circumstances that gradient descent would also be optimizing the parameters within that part or those layers.
So barring some outside interference with the gradient descent process, I don’t see any concrete scenario of how gradient hacking could occur (unless the gradient hacking concept includes more mundane phenomena like “getting stuck in a local optimum”).

Bruce G Jun 23, 2022, 4:16 AM
1 point
on: A Toy Model of Gradient Hacking
Epistemic status: Somewhat confused by the scenario described here, possible noob questions and/or commentary.
I am not seeing how this toy example of “gradient hacking” could actually happen, as it doesn’t map on to my understanding of how gradient descent is supposed to work in any realistic case.
Suppose, we have a mixture consisting of a good model $M^{+}$ which gets 0 loss in the limit (because it’s aligned with our training procedure) and a gradient hacker $M^{-}$ which gets $ϵ$ loss in the limit (because its actual objective is paperclips).
The mixture is roughly a linear combination of the two models, weighted by $w_{1}$ for the good model and $w_{2}$ for the bad model. Suppose we’re doing SGD on the vector $(w_{1}, w_{2})$, and the total loss for the mixture is $L_{t}(w_{1}, w_{2}) = w_{1} L(M^{+}) + w_{2} L(M^{-})$
Can you flesh this out more by giving an example of what a training data point would be in this scenario? What are the input features of a training case and what output target variable is the model trying to predict?
Also, it is not clear from this example if the outputs of the whole composite model are:
1. A weighted average of the outputs of the 2 component models OR
2. The weighted sum of $w_{1} (M^{+}\ \text{Output}) + w_{2} (M^{-}\ \text{Output})$
But either way, I don’t see why the loss for the composite model would necessarily be equal to a weighted sum of the losses of each component model, as stated in the example above.
In principle, it seems to me the loss on such a composite model could (for example) be 0 on some data set, even if both components had a positive loss function value for that same data.
As a simplified example of this, suppose you have a 2-component weighted-average model that is trying to predict the Y value of points in a 2-dimensional (X,Y) coordinate plane given the X value. And suppose all the data points are on the line:
$Y = X + 1$
If the 2 components of the model are:
1. $Y = X$ AND
2. $Y = X + 2$
I would expect the weighted average composite model would have a lower loss than either component model (and 0 loss if the weights are equal).
Also suppose $M^{-}$ can see $(w_{1}, w_{2})$ perfectly and its loss is a function of its relative weight in the mixture. I.e. $L(M^{-}) = f(w_{1}, w_{2})$, which I’ll refer to as $M^{-}$’s loss modulating function.
I am disinclined to suppose this—unless someone explains how it could make sense for the machine learning engineers to do it that way.
My understanding is that the loss function of a model should track how close the model’s answers are to the correct or optimal answers for some data. But in this example, $W_{1}$ and $W_{2}$ are not part of the training data, and can vary independently from the $M^{-}$ outputs on the training cases.
$M^{-}$ has a calculable loss on the training data, which means $M^{-}$ could be a stand-alone model for this dataset in its own right even if $M^{+}$, $W_{1}$, and $W_{2}$ didn’t exist.
(Which is to say, $M^{-}$ takes inputs in a form that would represent the features of a training data point and deterministically produces outputs of a data type that matches the target variable. This seems to me to be implied by the description, otherwise I don’t understand what it would mean to say that the loss of $M^{-}$ is ϵ.)
Furthermore, this example supposes that the gradient descent process is only optimizing $W_{1}$ and $W_{2}$, and not optimizing, or altering the internal parameters of, $M^{+}$ or $M^{-}$.
So it seems to me, based on this description, that the loss for $M^{-}$ on a given set of training data should *not* vary with $W_{1}$ or $W_{2}$ - if they are doing gradient descent in any kind of normal way (unless I am misunderstanding some big part of how gradient descent works). Rather, you should be able to give $M^{-}$ the same training data batch $1000$ times in a row, while varying $W_{1}$ and $W_{2}$, and you should get the same outputs and the same loss (if the parameters for the stand-alone $M^{-}$ are the same each time).
So I don’t see how “gradient hacking” could occur in this scenario if the composite model is using any reasonable loss function.
If the composite model is a weighted average I would expect gradient descent to reduce $W_{2}$ to $0$ or nearly $0$, since if $M^{+}$ is matching the correct output exactly, and $M^{-}$ is not, then the composite model can always get a closer answer by giving more relative weight to $M^{+}$.
If the composite model is a weighted sum of the outputs, I would expect that (for most possible training data sets and versions of $M^{-}$) $W_{1}$ would tend to gravitate towards $1$ and $W_{2}$ would tend to gravitate towards $0$. There might be exceptions to this if $M^{-}$’s outputs have a strong correlation with $M^{+}$’s outputs on the training data, such that the model could achieve low loss with some other weighted sum, but I would expect that to be unusual.
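To make the weighted-average case concrete, here is a minimal sketch in Python. It is not the setup from the original post: it assumes the toy line $Y = X + 1$ from the example above, uses the fixed functions `m_plus` (exactly right) and `m_minus` (always off by 1) as stand-ins for $M^{+}$ and $M^{-}$, and runs plain gradient descent on a single weight `w1` with `w2 = 1 - w1`. All names, the learning rate, and the step count are illustrative assumptions.

```python
# Toy data on the line Y = X + 1 (assumed for illustration).
XS = [0.0, 1.0, 2.0, 3.0]
YS = [x + 1.0 for x in XS]

def m_plus(x):
    """Stand-in for the good model: matches the data exactly."""
    return x + 1.0

def m_minus(x):
    """Stand-in for the other model: fixed, and off by 1 on every point."""
    return x + 2.0

def m_minus_loss():
    """Stand-alone mean squared error of m_minus; no weight appears in it."""
    return sum((m_minus(x) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)

def composite_loss(w1):
    """Mean squared error of the weighted average, with w2 = 1 - w1."""
    w2 = 1.0 - w1
    return sum((w1 * m_plus(x) + w2 * m_minus(x) - y) ** 2
               for x, y in zip(XS, YS)) / len(XS)

def composite_grad(w1):
    """Derivative of composite_loss with respect to w1."""
    w2 = 1.0 - w1
    return sum(2.0 * (w1 * m_plus(x) + w2 * m_minus(x) - y)
               * (m_plus(x) - m_minus(x))
               for x, y in zip(XS, YS)) / len(XS)

def train(w1=0.5, lr=0.1, steps=200):
    """Gradient descent on w1 only; the component models never change."""
    for _ in range(steps):
        w1 -= lr * composite_grad(w1)
    return w1, 1.0 - w1

if __name__ == "__main__":
    w1, w2 = train()
    print("w1:", w1, "w2:", w2, "composite loss:", composite_loss(w1))
    print("stand-alone loss of m_minus:", m_minus_loss())
```

In this sketch the stand-alone loss of `m_minus` stays the same no matter what the weights are, and gradient descent pushes `w2` towards 0.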

Bruce G Oct 2, 2021, 3:05 AM
3 points
in reply to: elspood’s comment on: The 2021 Less Wrong Darwin Game
Why would something with full armor, no weapons, and antivenom benefit from even 1 speed? It does not need to escape from anything. And if it has no weapons or venom, it can not catch any prey either.
Edit: I suppose if you want it to occasionally wander to other biomes, then that could be a reason to give it 1 speed.

Bruce G Sep 30, 2021, 1:44 AM
1 point
in reply to: aphyer’s comment on: The 2021 Less Wrong Darwin Game
Got it, thanks.

Bruce G Sep 29, 2021, 11:32 AM
1 point
on: The 2021 Less Wrong Darwin Game
One thing I am confused about:
Suppose an organism can eat more than one kind of plant food and both are available in its biome on a given round. Say it can eat both leaves and grass and they are both present and have not been eaten by others on that round yet.
Will the organism eat both a unit of leaves AND a unit of grass that round—and thus increase its expected number of offspring for the next round compared to if it had only eaten one thing? Or will it only eat the first one it finds (leaves in this case) and then stop foraging? From the source code, it looks like it is probably eating only the one thing and then stopping, but I am not really familiar with Hy or Lisp syntax so I am not sure.

Bruce G Aug 5, 2021, 2:41 AM
8 points
on: What does GPT-3 understand? Symbol grounding and Chinese rooms
Clearly a human answering this prompt would be more likely than GPT-3 to take into account the meta-level fact which says:
“This prompt was written by a mind other than my own to probe whether or not the one doing the completion understands it. Since I am the one completing it, I should write something that complies with the constraints described in the prompt if I am trying to prove I understood it.”
For example, I could say:
I am a human and I am writing this bunch of words to try to comply with all instructions in that prompt… That fifth constraint in that prompt is, I think, too constraining as I had to think a lot to pick which unusual words to put in this… Owk bok asdf, mort yowb nut din ming zu din ming zu dir, cos gamin cyt jun nut bun vom niv got…
Nothing in that prompt said I can not copy my first paragraph and put it again for my third—but with two additional words to sign part of it… So I might do that, as doing so is not as irritating as thinking of additional stuff and writing that additional stuff… Ruch san own gaint nurq hun min rout was num bast asd nut int vard tusnurd ord wag gul num tun ford gord...
Ok, I did not actually simply copy my first paragraph and put it again, but I will finish by writing additional word groups… It is obvious that humans can grasp this sort of thing and that GPT can not grasp it, which is part of why GPT could not comply with that prompt’s constraints (and did not try to)…
Gyu num yowb nut asdf ming vun vum gorb ort huk aqun din votu roux nuft wom vort unt gul huivac vorkum… - Bruc_ G
As several people have pointed out, GPT-3 is not considering this meta-level fact in its completion. Instead, it is generating a text extension as if it were the person who wrote the beginning of the prompt—and it is now finishing the list of instructions that it started.
But even given that GPT-3 is writing from the perspective of the person who started the prompt, and it is “trying” to make rules that someone else is supposed to follow in their answer, it still seems like only the 2nd GPT-3 completion makes any kind of sense (and even there only a few parts of it make sense).
Could I come up with a completion that makes more sense when writing from the point of view of the person generating the rules? I think so. For example, I could complete it with:
[11. The problems began when I started to] rely on GPT-3 for advice on how to safely use fireworks indoors.
Now back to the rules.
12. Sentences that are not required by rule 4 to be a different language must be in English.
13. You get extra points each time you use a “q” that is not followed by a “u”, but only in the English sentences (so no extra points for fake languages where all the words have a bunch of “q”s in them).
14. English sentences must be grammatically correct.
Ok, those are all the rules. Your score will be calculated as follows:
- 100 points to start
- Minus 15 each time you violate a mandatory rule (rules 1, 2, and 8 can only be violated once)
- Plus 10 if you do not use “e” at all
- Plus 2 for each “q” without a “u” as in rule 13.
Begin your response/completion/extension below the line.
_________________________________________________________________________
As far as I can tell from the completions given here, it seems like GPT-3 is only picking up on surface-level patterns in the prompt. It is not only ignoring the meta-level fact of “someone else wrote the prompt and I am completing it”, it also does not seem to understand the actual meaning of the instructions in the rules list such that it could complete the list and make it a coherent whole (as opposed to wandering off topic).

Bruce G Jan 4, 2021, 7:56 AM
1 point
on: 2021 New Year Optimization Puzzles
Here is the best I was able to do on puzzle 2 (along with my reasoning):
The prime factors of 2022 are 2, 3, and 337. Any method of selecting 1 person from 2022 must cut the space down by a factor of 2, and by a factor of 3, and by a factor of 337 (it does not need to be in that order and you can filter down by more than one of those factors in a single roll, but you must filter down by each of those in a way where the probability is uniform before starting).
The lowest it could be is 2 rolls. If someone could win on the first roll, that person’s probability of winning could be no less than 1/(Number of sides of the first roll die). Since the die with the most sides has 2017, that person’s probability to win would be more than 1/2022, so the probability of winning could not be even for everyone.
To get it in 2 rolls:
Before the start of the dice rolling, divide the group of 2022 using 3 different groupings:
- Grouping A: Divide the 2022 people into 674 sub-groups of 3 people each (Group A1, Group A2, … Group A674)
- Grouping B: Divide the 2022 people into 1011 sub-groups of 2 people each (Group B1, Group B2, … Group B1011)
- Grouping C: Divide the 2022 people into 6 sub-groups of 337 people each—but differentiated by 0-indexed numbers that correspond to modulo amounts (Group C0, Group C1, Group C2, Group C3, Group C4, and Group C5)
Each person will be a member of exactly 1 A group, exactly 1 B group, and exactly 1 C group.
For the first roll, roll the die with 1697 sides.
If the number is between 1 and 674 (inclusive):
1. Select the A group whose number corresponds to the number of the die.
2. Roll the 3-sided die to select a winner from among that group.
If the number is between 675 and 1685 (inclusive):
1. Calculate: ((Number on the die) − 674) to get a number between 1 and 1011 (inclusive)
2. Select the B group whose number corresponds to the ((Number on the die) − 674) number.
3. Roll the 2-sided die to select a winner from among that group.
If the number is between 1686 and 1697 (inclusive):
1. Calculate: (((Number on the die) − 1685) modulo 6) to get a number between 0 and 5 (inclusive)
2. Select the C group whose number corresponds to the (((Number on the die) − 1685) mod 6) number.
3. Roll the 337-sided die to select a winner from among that group.
So on this calculation, the expected number of rolls is exactly 2, since for each possible outcome on the first one, there is a second die to throw that will select the winner.
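The scheme above can be checked with exact fractions. The following is a minimal sketch, not part of the original puzzle: it assumes people are numbered 0 to 2021 and assigned to groups by integer division and remainder (one possible assignment among many), and the function name `win_probability` is made up for illustration.

```python
from fractions import Fraction

N_PEOPLE = 2022
FIRST_DIE = 1697  # number of sides on the first die

def win_probability(person):
    """Exact probability that `person` (0..2021) wins under the 2-roll scheme."""
    a_group = person // 3 + 1   # groups A1..A674, 3 people each
    b_group = person // 2 + 1   # groups B1..B1011, 2 people each
    c_group = person % 6        # groups C0..C5, 337 people each
    total = Fraction(0)
    for roll in range(1, FIRST_DIE + 1):
        if roll <= 674:
            # A group chosen directly; winner picked with the 3-sided die.
            if roll == a_group:
                total += Fraction(1, 3)
        elif roll <= 1685:
            # B group chosen by (roll - 674); winner picked with the 2-sided die.
            if roll - 674 == b_group:
                total += Fraction(1, 2)
        elif (roll - 1685) % 6 == c_group:
            # C group chosen by (roll - 1685) mod 6; winner picked with the 337-sided die.
            total += Fraction(1, 337)
    # Every face of the first die is equally likely.
    return total / FIRST_DIE

if __name__ == "__main__":
    print(win_probability(0), win_probability(2021))
```

Each person's chance works out to $\frac{1}{1697}\left(\frac{1}{3} + \frac{1}{2} + \frac{2}{337}\right) = \frac{1}{2022}$, since each C group is selected by 2 of the 12 faces from 1686 to 1697.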

Bruce G Dec 17, 2020, 4:43 PM
1 point
on: Machine learning could be fundamentally unexplainable
Assume we have a disease-detecting CV algorithm that looks at microscope images of tissue for cancerous cells. Maybe there’s a specific protein cluster (A) that shows up on the images which indicates a cancerous cell with 0.99 AUC. Maybe there’s also another protein cluster (B) that shows up and only has 0.989 AUC, A overlaps with B in 99.9999% of true positive. But B looks big and ugly and black and cancery to a human eye, A looks perfectly normal, it’s almost indistinguishable from perfectly benign protein clusters even to the most skilled oncologist.
If I understand this thought experiment right, we are also to assume that we know the slight difference in AUC is not just statistical noise (even with the high co-linearity between the A cluster and the B cluster)? So, say we assume that you still get a slightly higher AUC for A on a data set of cells that have either only A or neither versus a data set of cells with either only B or neither?
In that case, I would say that the model that weighs A a bit more is actually “explainable” in the relevant sense of the term - it is just that some people find the explanation aesthetically unpleasing. You can show what features the model is looking at to assign a probability that some cell is cancerous. You can show how, in the vast majority of cases, a model that looks at the presence or absence of the A cluster assigns a higher probability of a cell being cancerous to cells that actually are cancerous. And you can show how a model that looks at B does that also, but that A is slightly better at it.
If the treatment is going to be slightly different for a patient depending on how much weight you give to A versus B, and if I were the patient, I would want to use the treatment that has the best chance of working without negative side effects based on the data, regardless of whether A or B looks uglier. If some other patients want a version of the treatment that is statistically less likely to work based on their aesthetic sense of A versus B, I would think that is a silly risk to take (though also a very slight risk if A and B are that strongly correlated), but that would be their problem not mine.

Bruce G Dec 7, 2020, 11:33 PM
1 point
in reply to: abramdemski’s comment on: Number-guessing protocol?
In that case, the options are really limited and the main simple ideas for that (e.g., guess before you know the other players’ guesses) have been mentioned already.
One other simple method for one-shot number games I can think of is:
Automatic Interval Equalization:
When all players’ guesses are known, you take the two players whose guesses are closest and calculate half the difference between them. That amount is the allowable error, and each player’s interval is his or her guess, plus or minus that allowable error.
You win if and only if the answer is in your interval.
Example:
Player 1 guesses 44
Player 2 guesses 50
Player 3 guesses 60
The allowable error for this would be ((50-44)/2) = 3
So the winning intervals would be:
Player 1: 41-47
Player 2: 47-53
Player 3: 57-63
This would result in at most one winner (unless the answer is half way between the 2 closest guesses). Everyone’s winning interval would be the same size and none would overlap. And nobody would have an incentive to guess near someone else’s (stated or expected) guess, unless they thought the answer was actually close to that.
However, it has the disadvantage that a lot of such contests would end up with no winner.
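The Automatic Interval Equalization rule described above can be sketched in Python as follows. The function names `winning_intervals` and `winners` are hypothetical, and treating the interval endpoints as inclusive is an assumption (it is what allows two winners when the answer falls exactly halfway between the two closest guesses).

```python
from itertools import combinations

def winning_intervals(guesses):
    """Return one (low, high) interval per guess.

    The allowable error is half the smallest gap between any two guesses.
    """
    allowable_error = min(abs(a - b) for a, b in combinations(guesses, 2)) / 2
    return [(g - allowable_error, g + allowable_error) for g in guesses]

def winners(guesses, answer):
    """Indices of players whose interval contains the answer (endpoints inclusive)."""
    return [i for i, (low, high) in enumerate(winning_intervals(guesses))
            if low <= answer <= high]

if __name__ == "__main__":
    guesses = [44, 50, 60]
    print(winning_intervals(guesses))
    print(winners(guesses, 55))  # an answer that falls in nobody's interval
```

With the guesses 44, 50, and 60 from the example, an answer of 55 lands in the gap between 53 and 57, which illustrates the no-winner case.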

Bruce G Dec 7, 2020, 6:26 PM
3 points
in reply to: SarahNibs’s comment on: Number-guessing protocol?
Something like that could work, but it seems like you would still need to have a rule that you must guess before you know the other players’ guesses.
Otherwise, player 2 could simply guess the same mean as player 1 - with a slightly larger standard deviation—and have a PDF that takes a higher value everywhere except for a very small interval around the mean itself.
Alternatively, if 3 players all guessed the same standard deviation, and the means they guessed were 49, 50, and 51, then we would have the same problem that the opening post mentions in the first place.

Bruce G Dec 7, 2020, 5:39 PM
5 points
in reply to: SimonM’s comment on: Number-guessing protocol?
Can you clarify (possibly by giving an example)? Are players trying to minimize their score as calculated by this method?
And if so, is there any incentive to not just pick a huge number for the scale to minimize that way?

Bruce G Dec 7, 2020, 5:25 PM
1 point
on: Number-guessing protocol?
Is this for a one-shot game or are you doing this over many iterations with players getting some number of points each round?
One simple method (if you are doing multiple rounds) is to rank players each round (Closest=1st, Second Closest=2nd, etc) and assign points as follows:
Points = Number of Players − Rank
So say there are 3 players who guess as follows:
Player 1 guesses 50
Player 2 guesses 49
Player 3 guesses 51
And say the actual number is 52.
So their ranks for that round would be:
Player 1: 2nd place (Rank 2)
Player 2: 3rd place (Rank 3)
Player 3: 1st place (Rank 1)
And their scores would be:
Player 1: 3 − 2 = 1 point
Player 2: 3 − 3 = 0 points
Player 3: 3 − 1 = 2 points
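The rank-based scoring for a single round can be sketched in Python like this. The function name `rank_points` is made up for illustration, and breaking ties in favor of the player listed first is an assumption, since ties are not addressed above.

```python
def rank_points(guesses, actual):
    """Points per player for one round: number of players minus rank.

    Rank 1 goes to the closest guess. Ties are broken by player order,
    which is an assumption made only for this sketch.
    """
    n_players = len(guesses)
    # Player indices ordered from closest guess to farthest (stable sort).
    order = sorted(range(n_players), key=lambda i: abs(guesses[i] - actual))
    points = [0] * n_players
    for rank, player in enumerate(order, start=1):
        points[player] = n_players - rank
    return points

if __name__ == "__main__":
    # Players 1, 2, 3 guess 50, 49, 51 and the actual number is 52.
    print(rank_points([50, 49, 51], 52))
```

Over many rounds, a player's total would just be the sum of the per-round lists.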
I think this works better if you are calculating a winner over many rounds, so that there is a new ranking and new awarding of points on each round. The same is true of least squared error, which you mention, and most of the other methods of incentivizing players to try to guess the mean expected value.
I could also think of other ways to incentivize this, and to use confidence intervals, but they all add complexity to the points calculations.

Bruce G Jul 21, 2020, 11:12 PM
8 points
on: $1000 bounty for OpenAI to show whether GPT3 was “deliberately” pretending to be stupider than it is
It is not obvious to me from reading that transcript (and the attendant commentary) that GPT-3 was even checking to see whether or not the parentheses were balanced. Nor that it “knows” (or has in any way encoded the idea) that the sequence of parentheses between the quotes contains all the information needed to decide between balanced versus unbalanced, and thus every instance of the same parentheses sequence will have the same answer for whether or not it is balanced.
Reasons:
- By my count, “John” got 18 out of 32 right which is not too far off from the average you would expect from random chance.
- Arthur indicated that GPT-3 had at some point “generated inaccurate feedback from the teacher” which he edited out of the final transcript, so it was not only when taking the student’s perspective that there were errors.
- GPT-3 does not seem to have a consistent mental model of John’s cognitive abilities and learning rate. At the end John gets a question wrong (even though John has already been told the answer for that specific sequence). But earlier, GPT-3 outputs that “By the end of the lesson, John has answered all of your questions correctly” and that John “learned all the rules about parentheses” and learned “all of elementary mathematics” in a week (or a day).
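To put a rough number on the first point, here is a small sketch that computes how often pure coin-flipping would get at least 18 of 32 right. It assumes each of the 32 answers is an independent 50/50 guess, which is a simplification and not something established by the transcript; the function name `prob_at_least` is made up for illustration.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

if __name__ == "__main__":
    print(prob_at_least(18, 32))
```

Under that assumption the probability comes out to roughly 0.3, so a score of 18 out of 32 would not be surprising from random guessing.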
I suppose one way to test this (especially if OpenAI can provide the same random seed as was used here and make this reproducible) would be to have input prompts written from John’s perspective asking the teacher questions as if trying to understand the lesson. If GPT-3 is just “play-acting” based on the expected level of understanding of the character speaking, I would expect it to exhibit a higher level of accuracy/comprehension (on average, over many iterations) when writing from the perspective of the teacher rather than the student.