My current toy thinking along these lines is imagining a program that will write a program to solve the towers of hanoi, given only some description of the problem, and do nothing else, using only fixed computational resources for the whole thing.
I think that’s safe, and would illustrate useful principles for FAI.
The object of this famous puzzle is to move N disks from the left peg to the right peg using the center peg as an auxiliary holding peg. At no time can a larger disk be placed upon a smaller disk.
What Blueberry said. The page you linked just gives the standard program for solving Towers of Hanoi. What JamesAndrix was imagining was a program that comes up with that solution, given just the description of the problem—i.e., what the human coder did.
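For concreteness, the standard program being referred to is roughly the following (a sketch in SWI-Prolog; the predicate and peg names are my own, not necessarily those on the linked page):

    % The classic recursive solution: move N-1 disks out of the way,
    % move the largest disk, then move the N-1 disks back on top of it.
    hanoi(N) :-
        move(N, left, right, center).

    move(0, _, _, _) :- !.
    move(N, From, To, Via) :-
        M is N - 1,
        move(M, From, Via, To),
        format("move disk ~w from ~w to ~w~n", [N, From, To]),
        move(M, Via, To, From).

Calling hanoi(3) prints the seven moves; the question in this thread is what it would take for a program to arrive at those few recursive lines on its own, rather than having a human write them.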
Well, this can actually be done (yes, in Prolog with a few metaprogramming tricks), and it’s not really that hard—only very inefficient, i.e. feasible only for relatively small problems. See: Inductive logic programming.
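As a toy illustration of why this is possible but scales badly, here is a crude generate-and-test sketch in that spirit (not a real ILP system, and every predicate name here is my own invention): it enumerates the 36 ways of ordering the pegs in the two recursive calls and keeps any scheme whose output is a legal, goal-reaching move sequence on a 3-disk instance.

    :- use_module(library(lists)).

    % A candidate "program" c(P1, P2): each Pi is a permutation of [1,2,3]
    % giving the order in which [From, To, Via] are passed to a recursive call.
    run(_, 0, _, _, _, []).
    run(c(P1, P2), N, From, To, Via, Moves) :-
        N > 0,
        M is N - 1,
        pick(P1, From, To, Via, A1, A2, A3),
        pick(P2, From, To, Via, B1, B2, B3),
        run(c(P1, P2), M, A1, A2, A3, Ms1),
        run(c(P1, P2), M, B1, B2, B3, Ms2),
        append(Ms1, [move(N, From, To)|Ms2], Moves).

    pick([I, J, K], F, T, V, A, B, C) :-
        nth1(I, [F, T, V], A),
        nth1(J, [F, T, V], B),
        nth1(K, [F, T, V], C).

    % The rules of the puzzle: only the top disk of a peg may move, and a
    % larger disk may never be placed on a smaller one.  Stacks are lists
    % with the top disk at the head; disk 1 is the smallest.
    initial(N, state(Disks, [], [])) :- numlist(1, N, Disks).
    final(N, state([], [], Disks))   :- numlist(1, N, Disks).

    legal_and_solves(Moves, N) :-
        initial(N, S0),
        apply_moves(Moves, S0, S),
        final(N, S).

    apply_moves([], S, S).
    apply_moves([M|Ms], S0, S) :-
        apply_move(M, S0, S1),
        apply_moves(Ms, S1, S).

    apply_move(move(D, From, To), S0, S) :-
        peg(From, S0, [D|Rest]),
        peg(To, S0, Dest),
        ( Dest = [] ; Dest = [Top|_], Top > D ),
        set_peg(From, Rest, S0, S1),
        set_peg(To, [D|Dest], S1, S).

    peg(left,   state(L, _, _), L).
    peg(center, state(_, C, _), C).
    peg(right,  state(_, _, R), R).

    set_peg(left,   L, state(_, C, R), state(L, C, R)).
    set_peg(center, C, state(L, _, R), state(L, C, R)).
    set_peg(right,  R, state(L, C, _), state(L, C, R)).

    % Generate and test: 36 candidate schemes, checked on a 3-disk instance.
    synthesise(c(P1, P2)) :-
        permutation([1, 2, 3], P1),
        permutation([1, 2, 3], P2),
        run(c(P1, P2), 3, left, right, center, Moves),
        legal_and_solves(Moves, 3).

Querying synthesise(C) should leave only C = c([1,3,2],[3,2,1]), which is exactly the standard recursive scheme; but the search here is over a hand-picked space of 36 candidates, which is why anything less constrained becomes infeasible quickly.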
No, not learning. And the ‘do nothing else’ parts can’t be left out.
This shouldn’t be a general automatic programming method, just something that goes through the motions of solving this one problem. It should already ‘know’ whatever principles lead to that solution. The outcome should be obvious to the programmer, and I suspect realistically hand-traceable. My goal is a solid understanding of a toy program exactly one meta-level above hanoi.
This does seem like something Prolog could do well. If there is already a static program that does this, I’d love to see it.
Until you specify the format of a description of the problem, and how the program figures out how to write a program to solve the problem, it is hard to tell if this would be safe.
And if you don’t know that it is safe, it isn’t. Using some barrier like “fixed computational resources” to contain a non-understood process is a red flag.
The format of the description is something I’m struggling with, but I’m not clear how it impacts safety.
How the AI figures things out is up to the human programmer. Part of my intent in this exercise is to constrain the human to solutions they fully understand. In my mind my original description would have ruled out evolving neural nets, but now I see I definitely didn’t make that clear.
By ‘fixed computational resources’ I mean that you’ve got to write the program such that, if it discovers some flaw that gives it access to the internet, it will patch around that access, because what it is trying to do is solve the puzzle of solving the puzzle using only these instructions, these rules, and this memory.
What I’m looking for is a way to work on friendliness using goals that are much simpler than human morality, implemented by minds that are at least comprehensible in their operation, if not outright step-able.
An earlier comment of mine on the Towers of Hanoi. (ETA: I mean earlier relative to the point in time when this thread was resurrected.)
Are you familiar with Hofstadter’s work in “microdomains”, such as Copycat et al.?
So… you want to independently re-invent a Prolog compiler?
More like a program that takes the problem description quoted above as input and returns the Prolog code as output.
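A toy sketch of what that could look like is below. The desc/3 term is a made-up stand-in for whatever the description format turns out to be (that open question is discussed above), and the recursive insight is hard-coded in the template, so only the peg names actually come from the input:

    % Given a (hypothetical) structured description of the puzzle, return the
    % solver as a list of Prolog clauses.  Only the peg names come from the
    % description; the shape of the solution is baked into the template.
    solver_for(desc(From, To, Via), Clauses) :-
        Clauses = [
            (hanoi(N0) :- move(N0, From, To, Via)),
            move(0, _, _, _),
            (move(N, F, T, V) :-
                 N > 0,
                 M is N - 1,
                 move(M, F, V, T),
                 format("move disk ~w from ~w to ~w~n", [N, F, T]),
                 move(M, V, T, F))
        ].

    % Loading and running the generated program (this clashes with any
    % hanoi/1 and move/4 already defined, so use a fresh session or names):
    %   ?- solver_for(desc(left, right, center), Cs),
    %      maplist(assertz, Cs),
    %      hanoi(3).

This is obviously cheating relative to the stated goal, since the generator already ‘knows’ the shape of the solution, but it is the kind of fully hand-traceable, exactly-one-meta-level-above-hanoi object being asked for.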