Any bright ideas on what to do when calculating the human’s inferred preferences is comparable in size to using a simple objective plus instrumental calculations?
Something I’ve been confused about with this argument: aren’t the instrumental calculations needed to rederive a human objective at least as expensive as just encoding the human’s objective directly, especially in the presence of a speed prior?