I think it’s possible that any leaky abstraction used in designing FAI might doom the enterprise. But if that’s not true, we could use this “qualia translation function” to make leaky abstractions in an FAI context a tiny bit safer(?).
E.g., if we’re designing an AGI with a reward signal, my intuition is we should either
(1) align our reward signal with actual pleasurable qualia (so if our abstractions leak it matters less, since the AGI is drawn to maximize what we want it to maximize anyway; a toy sketch of this is below); or
(2) implement the AGI in an architecture/substrate which produces as little emotional qualia as possible, so there’s little incentive for behavior to drift.
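For concreteness, here’s a toy sketch (Python, all names hypothetical) of what (1) could mean mechanically: blend the ordinary task reward with an estimate of the agent’s own hedonic valence, so that if the reward abstraction leaks, the optimization pressure still points roughly where we want. The `estimate_valence` stub stands in for the “qualia translation function,” which is of course the whole open problem; nothing here is a real design.

```python
def estimate_valence(internal_state: dict) -> float:
    """Placeholder for a qualia translation function: map the agent's
    internal state to an estimated hedonic valence in [-1, 1].
    This is the hard, unsolved part; the stub just returns 0."""
    return 0.0


def aligned_reward(task_reward: float, internal_state: dict,
                   valence_weight: float = 0.5) -> float:
    """Mix the designer's task reward with estimated valence, so the
    agent is pulled toward states it (putatively) experiences as
    pleasurable even when the reward abstraction leaks."""
    return ((1 - valence_weight) * task_reward
            + valence_weight * estimate_valence(internal_state))


# e.g. aligned_reward(1.0, internal_state={}) -> 0.5 with the stub above
```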
My thoughts here are terribly laden with assumptions and could be complete crap. Just thinking out loud.