This story reminds me of His Master's Voice by Stanisław Lem, in which humanity's attempt to decode a message from the stars has a completely different outcome.
Some form of proof of concept would be nice. Alter OOPS to use Ockham's razor, or implement AIXItl, then give it a picture of a bent piece of grass or three frames of a falling ball, and see what you get. As long as GR is in the hypothesis space, it should by your reasoning be the most probable hypothesis after seeing these images. The unbounded, uncomputable versions shouldn't have any advantage in this case.
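To gesture at the shape of the test, here's a minimal sketch in Python, nothing like an actual OOPS or AIXI-tl implementation: a few hand-written candidate hypotheses are weighted by an Occam prior of 2^(-description length) and by how well they fit three frames of a falling ball. The hypothesis list, the bit counts, the data, and the noise model are all made up for illustration; the only point is that under a Solomonoff-style prior, the simplest hypothesis that fits the frames dominates.

```python
# Toy sketch of the proposed test (not OOPS or AIXI-tl): score a few hand-picked
# "physics" hypotheses against three frames of a falling ball, weighting each by
# an Occam prior of 2^(-description length) and a Gaussian likelihood of its fit.
# The hypothesis list, bit counts, data, and noise scale are all illustrative.

import math

# Three observed ball heights (metres) at t = 0, 1, 2 seconds; roughly free fall from 20 m.
frames = [(0.0, 20.0), (1.0, 15.1), (2.0, 0.4)]

# Candidates: (name, description length in bits, predicted height h(t)).
hypotheses = [
    ("constant height",       4, lambda t: 20.0),
    ("constant velocity",     8, lambda t: 20.0 - 10.0 * t),
    ("constant acceleration", 12, lambda t: 20.0 - 0.5 * 9.8 * t ** 2),
]

def log_posterior(bits, predict, data, sigma=0.5):
    """Occam prior (log P = -bits * ln 2) plus a Gaussian log-likelihood of the fit."""
    log_prior = -bits * math.log(2)
    log_lik = sum(-((h - predict(t)) ** 2) / (2 * sigma ** 2) for t, h in data)
    return log_prior + log_lik

# Rank hypotheses by posterior; the simplest one that fits the frames comes out on top.
for name, bits, predict in sorted(
        hypotheses, key=lambda hyp: -log_posterior(hyp[1], hyp[2], frames)):
    print(f"{name:22s} log-posterior ~ {log_posterior(bits, predict, frames):9.1f}")
```

Of course this enumerates three hypotheses by hand rather than searching program space, so it says nothing about whether anything as rich as GR would actually surface; it only illustrates the complexity-penalised scoring the proposal relies on.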
I'd be surprised if you got anything like modern physics popping out. I'll run this test on any AI I create: if any of them produce hypotheses like GR, I'll stop working on them until the friendliness problem has been solved. This should be safe, unless you think it could deduce my psychology from this as well.
What if GR is wrong, and it doesn't output GR because it spots a flaw that we haven't?
Well, good for it?
GR is almost certainly wrong, given how badly it fits with QM. I'm no expert, but QM seems to work better than GR does, so it's more likely that the latter will have to change, which is what you'd expect from reductionism, I suppose: GR is operating at entirely the wrong level of abstraction.
The point is that if GR is wrong and the AI doesn't output GR because it's wrong, then your test will say the AI isn't that smart. And then you do something like letting it out of the box, and everyone probably dies.
And if the AI is that smart it will lie anyway....
Presumably it's outputting the thing that's right where GR is wrong, in which case you should be able to tell, at least insofar as it's consistent with GR in all the places where GR has been tested.
Maybe it outputs something that's just too hard to understand, so you can't actually tell what its predictions are, in which case you haven't learned anything from your test.