FAI needs to be backed by solid theory, explaining why exactly its answers are superior to moral intuition.
If by “solid” you mean “internally consistent”, there’s no need to wait—you should adopt expected utilitarianism now and choose torture. If by “solid” you mean “agrees with our intuitions about real life”, we’re back to square one. If by “solid” you mean something else, please explain what exactly. It looks to me like you’re running circles around the is-ought problem without recognizing it.
If by “solid” you mean “internally consistent”, there’s no need to wait—you should adopt expected utilitarianism now and choose torture.
How could I possibly mean “internally consistent”? Being consistent conveys no information about a concept, aside from its non-triviality, and so can’t be a useful characteristic. And choosing specks is also “internally consistent”. Maybe I like specks in others’ eyes.
FAI theory should be reliably convincing and verifiable, preferably on the level of mathematical proofs. FAI theory describes how to formally define the correct answers to moral questions, but doesn’t necessarily help at all with an intuitive understanding of what these answers are. It could be a formalization of “what we’d choose if we were smarter, knew more, had more time to think”, for example, which doesn’t exactly show what the answers look like.
Then the FAI risks putting us all in a situation we hate, which we’d love if only we were a bit smarter.
FAI doesn’t work with “us”, it works with world-states, which include every detail, including whatever distinguishes present humans from hypothetical smarter people. A given situation that includes a smarter person is distinct from an otherwise identical situation that includes a human person, and so these situations should be optimized differently.
I see your point, but my question still stands. You seem to take it on faith that an extrapolated smarter version of humanity would be friendly to present-day humanity and wouldn’t want to put it in unpleasant situations, or that they would and it’s “okay”. This is not quite as bad as believing that a paperclipper AI will “discover” morality on its own, but it’s close.
I don’t “take it on faith”, and the example with “if we were smarter” wasn’t supposed to be an actual stab at FAI theory.
On the other hand, if we define “smarter” as also keeping preference fixed (the alternative would be wrong, as a Smiley is also “smarter”, but clearly not what I meant), then the smarter versions’ advice is by definition better. This, again, gives no technical guidance on how to get there, hence the word “formalization” was essential in my comment. The “smarter” modifier is about as opaque as the whole of FAI.
You define “smarter” as keeping “preference” fixed, but you also define “preference” as the extrapolation of our moral intuitions as we become “smarter”. It’s circular. You’re right, this stuff is opaque.
It’s a description, a connection between the terms, but not a definition (pretty useless, but not circular).