Being immortal means you will one day be a Jupiter brain (if you think memories are part of one’s identity, which I think they are)
x-post: https://twitter.com/matiroy9/status/1451816147909808131
Yes, this follows from the fact that loops can happen only once subjectively, as Peter Hensen recently mentioned to me.
Learning distills memories into models that can be more reasonably bounded, even for experience on astronomical timescales. It's not absolutely necessary to keep an exact record of everything. What it takes to avoid value drift is another issue, though; that might incur serious overhead.
Value drift in people is not necessarily important and might even be desirable; it's only for the agent in charge of the world that it's clear there should be no value drift. But even that doesn't necessarily make sense if there are no values to work with as a natural ingredient of this framing.
Sure! My point still stands though :)
Here it’s more about identity deterioration than value drift (you could maintain the same values while forgetting your entire life).
But also, to address your claim in a vacuum:
Value preservation is an instrumentally convergent goal; i.e., you're generally more likely to achieve your goals if you continue to want to achieve them.
Plus, I think most humans would value preserving their (fundamental) values intrinsically as well.
I'm not sure I understand.
The arguments for instrumental convergence don’t apply to the smaller processes that take place within a world fully controlled by an all-powerful agent, because the agent can break Moloch’s back. If the agent doesn’t want undue resource acquisition to be useful for you, it won’t be, and so on.
The expectation that humans would value preservation of values is shaky; it's mostly based on the instrumental convergence argument, which doesn't apply in this setting. So it might actually turn out that human preference says value preservation is not good for individual people, and that value drift in people is desirable. Absence of value drift is still an instrumental goal for the agent in charge of the world, which works for the human preference that doesn't drift. This agent can then ensure that the overall shape of value drift in the people who live in the world is as it should be, and that it doesn't descend into madness.
Value drift only makes sense where the abstraction of values makes sense. Does my apartment building have a data integrity problem? Does it fail some hash checks? That framing doesn't make sense: the apartment building is not a digital data structure. I think it's plausible that some AGIs of the non-world-eating variety lack anything that counts as their preference; they are not agents. In a world dominated by such AGIs, some people would still set up smaller agents merely for the purpose of their own preference management (this is the overhead I alluded to in the previous comment). But for those who don't, and who end up undergoing unchecked value drift (with no agents to keep it in line with what values-on-reflection approve of), the concept of values is not necessarily important either. This too might be the superior alternative: more emphasis on living the long reflection than on being manipulated into following its conclusions.