In most cases my thought is “well, what’s the alternative?”
I’m either doing what I would have done after thinking for N years, or I’m committing to a course of action after thinking less than N years. The former risks value drift; the latter risks… well, not having had as much time to think, which isn’t obviously better than value drift.
I do think there are a few variations that seem like improvements, like:
run X copies of myself, slightly randomizing their starting conditions and running them for a range of times (maybe as wide as “1 week” to “10,000 years”). Before revealing their results to me, reveal how convergent they were. If there’s high convergence, I’m probably less worried about the answer (a rough sketch of this check is below).
make sure simulated me can only think about certain classes of things (“solve this problem with these constraints”). I’m more worried about value drift from “10,000-year me who lived life generally” than from “10,000-year me who just thought about this one problem.” Unless the problem was meta-ethics, in which case I probably want some kind of value drift.
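To make the first variation a bit more concrete, here’s a rough Python sketch of just the convergence check. Everything in it (run_copy, the agreement threshold, the spread of thinking times) is a made-up stand-in rather than a real procedure for simulating anyone; the only point is the shape of the protocol: score agreement across the runs before looking at what any of them actually concluded.

```python
import itertools
import random


def run_copy(seed: int, thinking_time_years: float) -> str:
    """Hypothetical stand-in for one simulated run: slightly randomized
    starting conditions, a fixed thinking budget, and some comparable
    'answer' at the end. The budget is ignored here; it's a placeholder."""
    random.seed(seed)
    return random.choice(["plan A", "plan B"])


def convergence(answers: list[str]) -> float:
    """Fraction of pairs of runs that reached the same answer."""
    pairs = list(itertools.combinations(answers, 2))
    if not pairs:
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)


# Run copies across a wide range of thinking times (one week to 10,000
# years), then look at convergence *before* looking at the answers.
times = [1 / 52, 1, 100, 10_000]
answers = [run_copy(seed, t) for seed, t in enumerate(times)]

if convergence(answers) > 0.9:  # threshold is arbitrary
    print("High convergence: less worried; go ahead and reveal the answer.")
else:
    print("Low convergence: treat whatever the copies concluded with more suspicion.")
```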
Perhaps we humans should do the thinking ourselves for 10,000 years (passing the task from one generation to the next until aging is solved), instead of deferring to some “idealized” digital versions of ourselves.
This would require preventing existential catastrophes during those 10,000 years via “conventional means” (e.g. stabilizing the world to some extent).