Yeah, skipped a step, sorry. The SAT tends to be touted as a “measure [of] literacy, numeracy and writing skills that are needed for academic success in college. They state that the SAT assesses how well the test-takers analyze and solve problems.” If so, then an AI that can do well on this test could be expected to learn to “analyze and solve problems” across a rather general range of domains. At that point the argument about whether the AI can “keep its beliefs consistent with one another”, at least as much as a human can (which is not very much), would become moot. The test is also standardized and genuinely not easy for a human to game, even with intensive preparation, so it’s not nearly as subjective as the various Turing tests. Hope this makes sense.
First off, standardized tests are incredibly easy for humans to game, and there is an entire industry built around doing so. My roommate in college tutored people on this and routinely sat for and scored perfect marks on all the standardized entrance examinations (SAT, ACT, LSAT, GRE, etc.) as an advertising ploy. This is despite the fact that he scored mediocre results the first time he took it for college, and that he is not intrinsically super bright or anything. The notion that these are a real test of anything other than teachable test-taking skills is propaganda from the testing industry. Most prestigious schools are in the process of dropping standardized tests from entrance consideration, since they have been demonstrated to be a poor heuristic for student performance.
But there is an even more fundamental issue, I think, which is that GPT-2 more resembles a compressed GLUT (giant lookup table) or a giant Markov chain than it does a thinking program that computes intelligent solutions for itself.
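(To make the comparison concrete: below is a minimal sketch of a word-level Markov chain text generator, with a made-up toy corpus. It is only an illustration of the “remix transitions you have already seen” intuition, not a claim about how GPT-2 is actually implemented; GPT-2 is a transformer network, not a literal lookup table.)

```python
# Toy word-level Markov chain: it can only emit word transitions it has
# literally observed in its training text. Corpus and function names are
# made up for illustration.
import random
from collections import defaultdict

def train(text):
    """Map each word to the list of words observed immediately after it."""
    table = defaultdict(list)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        table[current_word].append(next_word)
    return table

def generate(table, start, length=12):
    """Walk the table from `start`, sampling an observed successor at each step."""
    word, output = start, [start]
    for _ in range(length):
        successors = table.get(word)
        if not successors:  # dead end: this word was never followed by anything
            break
        word = random.choice(successors)
        output.append(word)
    return " ".join(output)

corpus = "the cat sat on the mat and the dog sat on the rug"
print(generate(train(corpus), "the"))
```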
On the standardized-test point, Scott seems to disagree:
Despite popular misconceptions, the SAT is basically an IQ test, and doesn’t really reward obsessive freaking out and throwing money at the problem.
I am not inclined to argue about this particular point, though. Scott tends to know what he writes about, and whenever his mistakes are pointed out he earnestly adds them to his list of mistakes. So I go with his take on it.
But there is an even more fundamental issue, I think, which is that GPT-2 more resembles a compressed GLUT (giant lookup table) or a giant Markov chain than it does a thinking program that computes intelligent solutions for itself.
Maybe. I don’t know enough about either the brain architecture (which looks like a hodgepodge of whatever evolution managed to cobble together) or the ML architecture (which is probably not much closer to intelligent design), and I do not really care. As long as AI behaves like an IQ 120+ human, I would happily accept a mix of GLUTs and Markov chains as a reasonable facsimile of intelligence and empathy.
As long as AI behaves like an IQ 120+ human, I would happily accept a mix of GLUTs and Markov chains as a reasonable facsimile of intelligence and empathy.
It doesn’t, though; that’s the point! It cannot form plans. It cannot work towards coherent, long-term goals, or really operate as an agent at all. It is unable to form new concepts and ideas. It is a very narrow AI, only really able to remix its training data in a way that appears on the surface to approximate human writing style. That’s all it can do.
I don’t care who disagrees. If he’s got statistics, then I defy the data. This is something you can go out and test in the real world. Get a practice test book, test yourself on one timed test, learn some techniques, test yourself on the next test to see what difference it makes, and repeat. I’ve done this, and the effect is very real. Training centers have demonstrated the effect with large group sizes.
I see your stance, and it looks like further discussion is no longer productive. We’ll see how things turn out.