I think that making the effort for something like this is just marvelous! This kind of intensity is really critical for meaningfully changing how someone thinks in the long run, and ten weeks of intense socializing with others concerned about existential risk, followed by intermittent reinforcement, is more than enough to create long-term loyalty to the cause, so to speak.
I echo Vaniver’s concerns, though I imagine many of these issues will get hammered out in the course of just doing it once. I’m commenting because I want to raise awareness of a problem that is hard to notice in most educational contexts (and one I notice only because I’ve been explicitly trained to notice it):
We’re going to run A/B tests on you, and track the results to find out which training activities work best, and begin the tradition of evidence-based rationality training.
I think this is wonderful! I also think this is extremely dangerous.
I’ll preface my explanation with a warning that I’m going to touch on some politically charged topics. In my experience, most people hold many of the same strongly held opinions about education, like the belief that teachers will teach better if they know more about the subject they’re teaching. If I say something you strongly disagree with, do let me know; I like discovering when I’m wrong and appreciate the opportunity. But let me know after pausing a moment to reflect on how you know that I’m wrong. I find it genuinely hard not to simply rehearse arguments on topics I feel emotionally charged about, and I imagine others often feel the same way.
The reason I say evidence-based education is dangerous is that it’s one of those things that has a rational interpretation but is usually an applause light in disguise. The most blatant example I know of in the USA is No Child Left Behind (NCLB). NCLB was based on the idea that we need objective measures for education, and that once we have those objective measures we can enforce a kind of accountability. The model came from corporations, in which the bottom line (profit) could be used as a hard-and-fast measure of success, and individual units (e.g. stores) could be rewarded or punished based on profit in order to motivate them to generate more of it.
There have been scores of arguments about whether this analogy is appropriate for education, but for the present case I don’t think that matters in the slightest. The point I’d like to draw your attention to is the “objective measures” part. Under NCLB, this is done through standardized testing in reading and mathematics. The huge, overwhelmingly crippling problem with this approach is that these tests are never, ever calibrated against what we actually care about.
I think one of the main reasons for this is that it’s actually quite difficult to get past one’s initial impression of what’s important and actually define what one cares about. For instance, problem-solving ability is usually assumed to be the standard measure of mathematical skill. However, there are other tremendously important aspects of math proficiency that problem-solving never touches, like problem-identification (i.e. recognizing how to convert an actual real-world problem you’re facing into a mathematical model that looks more like a word problem) and complex-pattern-recognition (e.g. realizing that all that knowledge about the Law of Large Numbers and variance can generate very helpful investing insights). And that’s ignoring the fact that multiple-choice tests aren’t necessarily even testing students’ ability to solve previously-well-defined-and-tidy problems.
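To make that investing example concrete, here’s a toy simulation one could run. (A sketch only: the 5% mean return and 20% standard deviation per asset are numbers I made up, and the assets are assumed independent.)

```python
import random
from statistics import mean, stdev

random.seed(0)

def portfolio_returns(n_assets, n_trials=10_000):
    """Average return across n_assets independent risky bets, per trial."""
    return [mean(random.gauss(0.05, 0.20) for _ in range(n_assets))
            for _ in range(n_trials)]

for n in (1, 10, 100):
    r = portfolio_returns(n)
    print(f"{n:>3} assets: mean {mean(r):+.3f}, spread {stdev(r):.3f}")

# The mean return stays near 0.05 while the spread shrinks roughly as
# 1/sqrt(n): the Law of Large Numbers, restated as the case for
# diversification.
```

Recognizing that this variance-reduction argument applies to an actual portfolio is exactly the kind of pattern-recognition a standardized problem set never asks for.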
I think rationality is, in many ways, a lot more complicated than math. I also suspect that it modularizes in some ways, so that skill with question-dissolving might not bear at all on the ability to change one’s behavior, or perhaps even on the ability to notice confusion in real time. This makes me worried about how SIAI is defining the goals of rationality training. Giving someone a confusing problem with tempting easy-but-wrong answers and seeing how they do would be a great test for potential FAI researchers, but that probably wouldn’t test what you’re looking for if you’re trying to train people to be excellent at swaying public opinion through social skills. And furthermore, do the tests actually test rationality, even in modularized form?
Bear in mind that A/B-style testing is based on a hard-scientific model of comparing options, as when determining to what degree new drug X affects physiological property Y. But notice that “being and staying healthy” is a vastly more complicated challenge than, say, reducing mucus production from a cold, because the former is understood much better in terms of messy human experience than in terms of specific, measurable parameters; hence the existence of people who are healthy by current medical measures but who just don’t feel well. In the same way, I suspect that rationality is complicated enough that someone who’s able to pass a battery of rationality tests might still turn out to be missing something you care about.
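For what it’s worth, here’s a minimal sketch of what an A/B comparison actually measures, with entirely made-up data; the “scores” stand in for whatever single well-defined metric gets picked.

```python
import random
from statistics import mean, stdev

# Hypothetical outcomes on one well-defined metric for two training
# variants. Every name and number here is invented for illustration.
random.seed(0)
variant_a = [random.gauss(70, 10) for _ in range(30)]
variant_b = [random.gauss(74, 10) for _ in range(30)]

def welch_t(x, y):
    """Welch's t-statistic: difference in means over its standard error."""
    vx, vy = stdev(x) ** 2, stdev(y) ** 2
    return (mean(y) - mean(x)) / (vx / len(x) + vy / len(y)) ** 0.5

print(f"t = {welch_t(variant_a, variant_b):.2f}")
# A large t says variant B moved this metric. It says nothing about
# whether the metric captures the rationality you actually care about.
```

The statistics here are perfectly sound; the danger lives entirely in the choice of metric.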
Of course, I don’t advocate skipping tests altogether just because no test is certain to capture what you’re looking for. But I do think there’s plenty of precedent for worrying that one’s tests might have very little, if anything, to do with what one cares about, even when the tests seem like they really should measure it. Failing to check against this possibility is one of the reasons that math education is so abysmally horrid in the United States.
Has SIAI guarded against this concern? If so, might I ask how?