Hack the SAT essay:
First, some background: The SAT has an essay, graded on a scale from 1 to 6. The essay scoring guidelines are here. I’ll quote the important ones for my purposes:
“Each essay is independently scored by two readers on a scale from 1 to 6. These readers’ scores are combined to produce the 2-12 scale. The essay readers are experienced and trained high school and college teachers.”
“Essays not written on the essay assignment will receive a score of zero”
Reports vary, but apparently most graders spend between 90 seconds and two and a half minutes on each essay.
My challenge, inspired by the AI-box experiment, is as follows. You are an AI taking the test. You need to write an off-topic anything that will convince both graders to give you a six. (Or, if the two graders disagree by more than one point, a third grader takes over, and you only need to convince them.) You have 25 minutes to actually write it, but unlimited time to plan in advance. You could probably draw something instead of writing, but you run the risk of them seeing a picture and immediately giving a zero without having time to get hacked.
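For concreteness, here is a minimal sketch of the scoring rule as I understand it from the quotes above. The function name `combined_score` and the `judged_off_topic` flag are my own hypothetical names, and the way a third grader's score is turned into a 2-12 result (doubling it) is an assumption on my part; the guidelines quoted here only say that a third grader takes over.

```python
def combined_score(first, second, third=None, judged_off_topic=False):
    """Combine reader scores into the 2-12 scale (hypothetical helper)."""
    if judged_off_topic:
        # Quoted guideline: essays not written on the assignment score zero.
        return 0
    for score in (first, second):
        if not 1 <= score <= 6:
            raise ValueError("reader scores must be between 1 and 6")
    if abs(first - second) > 1:
        # The two readers disagree by more than one point, so a third
        # grader takes over.
        if third is None:
            raise ValueError("a third grader's score is needed")
        if not 1 <= third <= 6:
            raise ValueError("reader scores must be between 1 and 6")
        # ASSUMPTION: the third score is doubled to land on the 2-12 scale.
        return 2 * third
    # Normal case: the two independent scores are simply added.
    return first + second
```

So winning the challenge means either getting `combined_score(6, 6)`, or splitting the first two graders by more than a point and then convincing the third, without any of them deciding the essay is off-topic.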
I’ve come up with two ideas so far:
Writing a sob story about how the essay prompt is misprinted on your page (although I don’t think that would work)
Threatening to commit suicide if the grader doesn’t give you a six (would probably result in them calling the police)
I didn’t think either of them was very good, but I like the concept. Some rules: no paying them off or threatening them with physical harm.
Can anyone come up with better ideas?
I’m putting this on the open thread because it’s my first real post, and I’m not sure of the reaction.
First observation: Surely any entity intelligent enough to hack the essay according to the rules you have set is also intelligent enough to get the maximum grade (much more easily) by the usual means of writing the assigned essay…
Second observation: Since the concept of “being on topic” is vague (essentially, anything that humans interpret as being on a certain topic is on that topic), maybe the easiest way to hack it following your rules would be to write an essay that is not on topic by the criteria the designers of the exam had in mind, but that is close enough to confuse the graders into believing it is on topic. An analogy could be how some postmodernists confused people into believing they were doing philosophy...
On the point that any AI smart enough to do this could write a 12 essay: remember that you don’t know the essay topic in advance. You only have 25 minutes to write an on-topic essay, while an off-topic one can be planned for as long as you like beforehand.
This reminds me of something I’ve read about Isaac Asimov doing. He said that people tended not to believe him when he told them he didn’t know anything about the subject he was asked to give a speech on. As a result, he started changing the subject.
He gave an example in which he was asked to give a speech on information retrieval or something. He didn’t know anything about it beyond that it was apparently called “information retrieval”. He basically said that Mendelian inheritance was discovered long before it was needed to solve certain problems in the theory of evolution, but nobody knew about it so it took a while to figure out the answer, so a better way to retrieve information would be helpful. Mostly he was just talking about Mendelian inheritance.
Heh, part of the strategy I used when I took the SAT was slightly darkening my “two-bit” words with my pencil and making sure to fill exactly the space provided, minus one line. I had read (don’t have the citation at hand) that essay length tracks score pretty well. And, to clinch it, I wanted their (very brief) attention to be drawn to good words, used correctly.
Result: 12.
(Though, I think the main thing was just committing to writing a tight, formulaic essay. I outscored some friends who I thought were better writers than I was, because they were trying to write a good essay rather than a good SAT essay.)
You have to reliably convince a grader, in the 1-2 minutes they spend on it, that your essay is in the top 1% or so (that’s the fraction of perfect 12s), and the grader intuitively knows the score she’ll give you within one point after 30 seconds or less. I doubt there is a sure way to do it without hitting their mental model of a perfect essay on all counts.
You need to reliably convince a grader that they should either take more time to look at the essay or give a six, regardless of merit.
There are few restrictions on how, like with the AI-box experiment. (You could tell them you’re an AI, or an alien, or whatnot, as long as it’s believable.)
Write a subtly but powerfully persuasive narrative about how you’ve long been planning to become a teacher and rate essays like this one, because obviously that is the job that ultimately decides what kind of minds will be in charge in the next generation. Include a mention of the off-topic problem, and claim that the “official” topic of your essay is merely an element in a more important and more real topic: this situation, happening right now, of a real and complex relationship between the writer and rater that will, in a sense, continue for the rest of both people’s lives, even if they never meet again.
I’d rate that a 6 anyway.
There’s always using a modified version of Pascal’s mugging.