Thinking about EY’s 2-4-6 problem (the following assumes you’ve read the link, slight spoilers for the Sequences ahead), and I’m noticing some confusion. I’m going to walk through my thought process as it happens (starting with trying to solve the problem as if I don’t know the solution), so this is gonna get messy.
Let’s say you start with the “default” hypothesis (we’ll call it Hyp1) that people seem to jump to first (by this I mean both me and Yudkowsky; I have no idea about others (why did we jump to this one first?)): that only sequences of numbers increasing by 2 are true. How should a rationalist try to prove or disprove that Hyp1 is the correct algorithm with as few sequences as possible? Well, my naive thought would be to test with a random high-number sequence that follows Hyp1, to ensure there isn’t some sort of cap on the size allowed (it wouldn’t be proven, of course, but if I’m dealing with a human I can probably assume that). Now what? Knowing nothing else, should we continue with similar sequences, to add evidence through “positive” induction, or aim for something against the rules, to make sure we aren’t being too restrictive? The fact of the matter is, while we can (and have to) make sure our hypothesis returns True and False, respectively, for all past “graded” sequences, we can’t ensure the ruleset isn’t more complicated than it seems so far, such that some future sequence will get the incorrect answer from our hypothesis. A possible exception would be a ruleset that only allows a finite number of True sequences, but as I’m typing this I realize that’s not an exception: you can go through the full finite set of presumably-True sequences and watch them all return True, but you still can’t be sure the presumably-False sequences will all return False. So there is no way to prove you’ve landed on the correct hypothesis, whatever hypothesis you give.
Okay, so at what point do you decide you’ve gathered enough evidence to determine whether Hyp1, or any given Hyp, is true? As fun as it would be to have Yudkowsky waste his time grading infinite slips of paper (this is a joke—I do not endorse wasting Eliezer’s time irl, he’s got better stuff to do), I’m gonna get bored way before infinity. So let’s say I only need to reach 50% confidence. How would I know once I’ve reached that? I’m not sure right now, and would appreciate any comments giving insight on this. If we weren’t dealing with a human, but rather with a God who chose an algorithm “randomly from among all finite computable algorithms” (let’s assume for now the quoted statement has any meaning), then I think it would be impossible to gain any nonzero confidence in a given algorithm in finite time, since there are infinitely many other algorithms consistent with the evidence, no matter the sample size.
The good news is we aren’t dealing with a God; we’re dealing with a human (presumably) who built this problem, so we can restrict the phase space of possible valid algorithms to those a human would reasonably come up with. We can also assume with good probability that the algorithm will be both fairly simple and at least somewhat counter-intuitive in nature, considering the classroom setting Yudkowsky put us in, and the fact that it’s EY who’s giving us this problem, come on guys.
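To make the 50%-confidence question a bit more concrete: once the hypothesis space is finite, one crude sketch is a uniform prior over candidate rules, renormalized as candidates get falsified. Everything below is my own toy stand-in (the three candidate rules, the graded verdicts), not anything from the original problem.

```python
# Toy sketch: confidence as posterior mass over a finite hypothesis set.
# The candidate rules and the grader's verdicts are hypothetical.

def hyp_plus2(seq):
    """Each number is the previous number plus 2."""
    return all(b == a + 2 for a, b in zip(seq, seq[1:]))

def hyp_increasing(seq):
    """Strictly increasing."""
    return all(b > a for a, b in zip(seq, seq[1:]))

def hyp_all_even(seq):
    """Every number is even."""
    return all(n % 2 == 0 for n in seq)

hypotheses = {"plus2": hyp_plus2, "increasing": hyp_increasing, "even": hyp_all_even}
posterior = {name: 1 / len(hypotheses) for name in hypotheses}  # uniform prior

def update(posterior, seq, verdict):
    """Zero out hypotheses that disagree with the grader, then renormalize."""
    mass = {n: (p if hypotheses[n](seq) == verdict else 0.0)
            for n, p in posterior.items()}
    total = sum(mass.values())
    return {n: p / total for n, p in mass.items()}

posterior = update(posterior, (2, 4, 6), True)  # all three rules survive
posterior = update(posterior, (1, 3, 5), True)  # falsifies the "even" rule
print(posterior["plus2"])  # 0.5 -- the 50% threshold, for this toy set
```

In this framing, "reaching 50% confidence" just means half the surviving probability mass sits on your hypothesis; the hard part the post is pointing at is that the real prior over human-chosen rules is nothing like uniform.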
For good measure, we can probably also assume only high-school level math or below will be used, since otherwise people who just don’t know advanced math will feel cheated. I’m noticing this is turning into a psychological analysis, which I think is what’s going on “under the hood” when people consider math problems irl. (I remember being in grade school, like 10 years old or so, and getting frustrated that a math question about the speed of an airplane didn’t take acceleration into account. I was convinced that the teacher had the answer wrong, since she assumed a flat speed. Fun times, grade school...) This is something I don’t think Yudkowsky brought up, but even seemingly simple questions like this have a lot of real-world assumptions baked into them, and it’s often assumed by researchers and teachers that people will make the same assumptions they did.
Okay, so we’ve narrowed down the space of possible valid algorithms considerably (in fact from an infinite to a finite space, which is pretty impressive), and all this without writing down a thing. What now? Hyp1 was what intuitively came into my mind, and considering the similarity of intuitive mental patterns across humans, it’s likely the test designer at least considered it as well. So let’s start there. My first impulse is to see if we can extract individual assumptions, or variables, from Hyp1, so we can play around with them. Writing it down explicitly,
Hyp1 = for (each adjacent pair of numbers in the sequence), return True iff (PreviousNumber + 2 = CurrentNumber)
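As code, that hypothesis is a one-liner. A minimal sketch (the graded sequences and their verdicts below are hypothetical stand-ins for the grader’s answers, and the function name is mine):

```python
def hyp1(seq):
    """Hyp1: True iff every number is the previous number plus 2."""
    return all(prev + 2 == cur for prev, cur in zip(seq, seq[1:]))

# A hypothesis stays viable only while it matches every graded sequence.
graded = [((2, 4, 6), True), ((8, 10, 12), True), ((2, 4, 7), False)]

assert all(hyp1(seq) == verdict for seq, verdict in graded)
print("Hyp1 consistent with all graded sequences so far")
```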
Not sure where to put this, but I also just noticed that all of Yudkowsky’s examples are whole positive numbers, and without testing we can’t actually be sure if that’s a requirement or not. I would test that in a minimal amount of space with one sequence which starts with a negative fractional number, but which otherwise follows Hyp1, such as (-2.07, -0.07, 1.93). If we’re optimizing for space, we’re probably going to want to pack as many falsifiable assumptions as we can into each sequence, and if it returns False, we test each assumption with separate sequences on the next set of three, so we can narrow down what went wrong.
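One wrinkle if you actually code this probe up: with fractional values like -2.07, exact equality against "previous + 2" can fail from floating-point rounding alone, so a tolerance comparison is safer. A sketch (the tolerance value is my own arbitrary choice):

```python
import math

def hyp1_float(seq, tol=1e-9):
    """Hyp1 with a tolerance, so fractional probe triplets aren't
    rejected purely by floating-point rounding."""
    return all(math.isclose(prev + 2, cur, abs_tol=tol)
               for prev, cur in zip(seq, seq[1:]))

# The probe triplet from above: negative and fractional, but still Hyp1.
print(hyp1_float((-2.07, -0.07, 1.93)))  # True
```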
What are some other assumptions we’re making with Hyp1?
[submitting to save, this is still a work in progress; idk how to draftify a shortform lol]
The triplet (6, 4, 2) also seems worth testing.
1. Write a program that knows the rule.
2. Go faster by allowing triplet rules, like (a, a+2, a+4).
This isn’t guessing the rule itself: if all instances of such a triplet rule would return True, then the whole family gets True back.
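The notes above can be sketched as code: grade a whole parameterized family of triplets at once by sampling instances, where the family "gets True back" only if every sampled instance does. The oracle here is a stand-in for the human grader (I’ve filled it in with the spoiler rule from EY’s post, any strictly increasing sequence; the sampling range is my own arbitrary choice):

```python
def oracle(seq):
    # Stand-in for the human grader. (Spoiler: in EY's original
    # problem the hidden rule is "any increasing sequence".)
    return all(b > a for a, b in zip(seq, seq[1:]))

def family_plus2(a):
    """The parameterized triplet rule (a, a+2, a+4)."""
    return (a, a + 2, a + 4)

def grade_family(family, samples=range(-50, 51)):
    """True only if every sampled instance of the family passes."""
    return all(oracle(family(a)) for a in samples)

print(grade_family(family_plus2))  # True: every sampled instance passes
```

Note this only samples the family, it doesn’t prove anything about unsampled instances, which is the same induction problem from the top of the post in miniature.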