Don’t worry, Eliezer, there’s almost certainly a configuration of particles where something that remembers being you also remembers surviving the singularity. And in that universe your heroic attempts were very likely a principal reason why the catastrophe didn’t happen. All that’s left is arguing about the relative measure.
Even if things don’t work that way, and there really is a single timeline for some incomprehensible reason, you had a proper pop, and you and MIRI have done some fascinating bits of maths, and produced some really inspired philosophical essays, and through them attracted a large number of followers, many of whom have done excellent stuff.
I’ve enjoyed everything you’ve ever written, and I’ve really missed your voice over the last few years.
It’s not your fault that the universe set you an insurmountable challenge, or that you’re surrounded by the sorts of people who are clever enough to build a God and stupid enough to do it in spite of fairly clear warnings.
Honestly, even if you were in some sort of Groundhog Day setup, what on earth were you supposed to do? The ancients tell us that it takes many thousands of years just to seduce Andie MacDowell, and that doesn’t even look hard.
Yeah, the thought that I’m not really seeing how to win even with a single restart has been something of a comforting one. I was, in fact, handed something of an overly hard problem even by “shut up and do the impossible” standards.
In a Groundhog Day loop I’d obviously be able to do it, as would Nate Soares or Paul Christiano, if AGI failures couldn’t destroy my soul inside the loop. Possibly even Yann LeCun could do it; the first five loops or so would probably be enough to break his optimism, and with his optimism broken and the ability to test and falsify his own mistakes nonfatally, he’d be able to make progress.
It’s not that hard.
A Groundhog Day-style struggle against AI could be the setting for a great piece of fanfiction.
Certainly I would love to read it, if someone were to write it well. Permission to nick the idea is generally given.
I like the idea of starting off from Punxsutawney, with Bill Murray’s weatherman slowly getting the idea that he should try to fix whatever keeps suddenly causing the mysterious reset events.
I suppose every eight billion deaths (+ whatever else is out there) you get a bug report, and my friend Ramana did apparently manage to create a formally verified version of grep, so more is possible than my intuition tells me.
But I do wonder if that just (rather expensively) gets you to the point where the AI keeps you alive so you don’t reset the loop. That’s not necessarily a win.
--
Even if you can reset at will, and you can prevent the AI from stopping you pressing the reset, it only gets you as far as a universe where you personally think you’ve won. The rest of everything is probably paperclips.
--
If you give everyone in the entire world death-or-reset-at-will powers, then the bug reports stop being informative, but maybe after many, many loops, just by trial and error, we get to a point where everyone has a life they prefer over reset, and maybe there’s a non-resetting path from there to heaven despite the amount of wireheading that will have gone on?
--
But actually I have two worries even in this case:
One is that human desires are just fundamentally incompatible in the sense that someone will always press the button and the whole thing just keeps looping. There’s probably some sort of finiteness argument here, but I worry about Arrow’s Theorem.
The second is that the complexity of the solution may exceed the capacity of the human mind. Even if solutions exist, are you sure that you can carry enough information back to the start of the loop to have a chance of doing it right?
--
Also what if the AI gets hold of the repeat button?
Also, I am lazy and selfish, so I’m probably going to spend a while in my personal AI paradise before I finally, reluctantly guilt myself into pressing the reset. When I come back (as a four-year-old with plans) am I going to be able to stand this world? Or am I just going to rebuild my paradise as fast as possible and stay there longer this time?
I think there are lots of ways to fail.
--
Obviously I am just free-associating out of my arse here (can’t sleep), but maybe it’s worth trying to work out some sort of intuition for this problem?
Why do you think it’s possible *at all*, and how many repeats do you think you’d need?
I’m pretty sure I couldn’t beat God at chess, even given infinite repeats. What makes this easier?
(OK, actually I could, given infinite repeats, potentially construct a whole-game tablebase and get a draw (and maybe a win, if chess can be won). That looks kind of tractable, but I probably need to take over the world as part of that… I’m going to need something pretty large to keep the data in…)
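For intuition on what “construct a whole-game tablebase” amounts to, here’s a toy sketch (purely illustrative, nothing like chess’s actual state space): a memoised solver that labels every position of a tiny subtraction game as a forced win or loss for the player to move. A real tablebase is the same exhaustive labelling, just over astronomically more positions.

```python
from functools import lru_cache

# Toy illustration of solving a game exhaustively, in the spirit of a
# "whole-game tablebase": every reachable position gets a cached
# win/loss label. The game is the subtraction game (take 1-3 stones,
# whoever takes the last stone wins) -- chosen only because its state
# space is tiny, unlike chess.

@lru_cache(maxsize=None)
def wins(stones: int) -> bool:
    """True if the player to move can force a win from this position."""
    if stones == 0:
        return False  # no stones left: the player to move has already lost
    return any(not wins(stones - take) for take in (1, 2, 3) if take <= stones)

if __name__ == "__main__":
    for n in range(12):
        print(n, "win" if wins(n) else "loss")  # losses land on multiples of 4
```

The “something pretty large to keep the data in” worry is exactly the cache line above: the whole approach is a lookup table over every position, which for chess is far beyond any physically available memory.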
> I was, in fact, handed something of an overly hard problem
We were all handed the problem. You were the one who decided to man up and do something about it.
It’s too bad we couldn’t just have the proverbial box be an isolated simulation and have your brain interface into it. The A.I. keeps winning in the Matrix, and afterwards we just reset it until we see improvements in alignment.