Does that follow? The time machine doesn’t do any planning. So I would expect that in one timeline, something happens that accidentally drops an anvil on the time machine, breaking the reset mechanism, and there are no more time loops after that.
Indeed, in practice, I expect this time machine to optimize to destroy itself, not to fill the universe with paperclips.
The “anvil dropped on the time machine” scenario seems like a much more probable outcome that technically satisfies the optimization criterion, which was not “the universe is filled with paperclips” but “the time machine stops running, either because the paperclip classifier evaluates this timeline to have maxed out the paperclips or for any other reason.” (In exactly the same way that the outcome pump in this post has the true criterion “the Emergency Regret button was not pushed”, and not “the user is satisfied with the outcome.”)
In order for this optimizer to actually be fearsome, without doing any learning or steering, the timeline resetting mechanism would need to be supernaturally immune to harm.
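To make the probability argument concrete, here is a toy sketch of a reset loop that stops either when the classifier fires or when the reset mechanism breaks. Everything in it is an assumption made for illustration: the function names, the per-timeline probabilities, and the simplification that each timeline is an independent draw.

```python
import random

def run_pump(p_paperclips, p_breakage, rng):
    """Replay timelines until the loop stops; return the reason it stopped.

    p_paperclips and p_breakage are hypothetical per-timeline probabilities.
    At least one must be positive, or the loop never ends.
    """
    while True:
        roll = rng.random()
        if roll < p_paperclips:
            return "paperclips"  # classifier says this timeline is maxed out
        if roll < p_paperclips + p_breakage:
            return "broken"      # e.g. an anvil lands on the reset mechanism
        # Neither stopping condition held, so the timeline is reset.

def breakage_fraction(p_paperclips, p_breakage, trials=10_000, seed=0):
    """Fraction of runs that end with a broken machine, not paperclips."""
    rng = random.Random(seed)
    broken = sum(
        run_pump(p_paperclips, p_breakage, rng) == "broken"
        for _ in range(trials)
    )
    return broken / trials

if __name__ == "__main__":
    # Illustrative numbers only: breakage is 100x likelier per timeline.
    print(breakage_fraction(p_paperclips=0.001, p_breakage=0.1))
```

In this toy model the final timeline is a broken machine with probability p_breakage / (p_paperclips + p_breakage), so whichever stopping condition is more likely per timeline dominates. Setting p_breakage to zero is the "supernaturally immune" case, where every run ends in paperclips.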
i agree that the reset mechanism has to be ~invulnerable for the pump to work. the thing i was imagining the machine defending is stuff like its output channel (for so long as its outputs are an important part of steering the future).
Sounds right to me!