I happened to stumble on an old comment where I was already of the opinion that progress is not a “refinement” but will “defocus” from old division lines.
At some middle skill level "fruit-maximization" peaks, and those who don't understand anything beyond that point will confuse those who have not yet reached fruit-maximization with those who are already past it.
If someone said "you were suboptimal on the fruit front, I fixed that mistake for you" and I arrived at a table with 2 wormy apples, I would be annoyed/pissed. I am assuming that the other agent can't evaluate their cleanness; it's all just fruit to them.
One could do something similar with radioactive apples, etc. In a certain sense, yes, it is about the ability to perceive properties, and even I use the verb "evaluate". But I don't find the break between preferences and evaluations so easy to justify. Knowing, and opining, that "worminess" is a relevant thing is not value-neutral. Reflecting upon "apple with no worm" vs "apple with worm" can have results that overpower old reflections on "apple" vs "pear" even though nothing is contradicted (a wormless pear vs a wormy apple is in a sense a "mere adversarial example": it doesn't violate the species preference, but it can absolutely render it irrelevant).
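To make that last point concrete, here is a minimal toy sketch (my own construction for this comment; the fruit types and scoring functions are made up): the refined preference never reverses the old ordering among wormless fruit, yet on the adversarial pair the species preference simply stops deciding anything.

```python
# Toy sketch: an agent first ranks fruit by species, then learns to perceive
# worminess, and the new property dominates the old preference without
# contradicting it.

from dataclasses import dataclass

@dataclass
class Fruit:
    species: str   # "apple" or "pear"
    has_worm: bool

def species_score(fruit: Fruit) -> int:
    # Old preference: apples over pears.
    return 1 if fruit.species == "apple" else 0

def refined_score(fruit: Fruit) -> tuple:
    # New preference: worminess considered first, species only as a tie-breaker.
    # Among wormless fruit the old ordering is untouched (no contradiction),
    # but on the "adversarial" pair below it stops mattering.
    return (0 if fruit.has_worm else 1, species_score(fruit))

wormless_pear = Fruit("pear", has_worm=False)
wormy_apple = Fruit("apple", has_worm=True)

print(species_score(wormy_apple) > species_score(wormless_pear))  # True: old view prefers the apple
print(refined_score(wormy_apple) > refined_score(wormless_pear))  # False: worminess overrides species
```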
My examples of wacky scenarios are bad. I was thinking that if one holds that playing Grand Theft Auto is not unethical and "ordinary murder" is unethical, then if it turns out that reality is similar to GTA in a "relevant way", this might be a non-trivial reconciliation. There is a phenomenon of referring to real-life people as NPCs.
The sharedness was about a situation like a book, say Game of Thrones. In a sense, all the characters are only parts of a single reading experience. And Jaime Lannister still has to use spies to learn about Arya Stark's doings (so information passing is not the thing here). If a character's action could cause the book to burn, Westeros-internal logic does not particularly help in opining about that. Doc warning Marty that the stakes are a bit higher here is, in a sense, introducing previously incomprehensibly bad outcomes.
The particular dynamics are not the focus; the point is that we suddenly need to start caring about metaphysics. I wrote rather a lot just to explain bad examples.
From the dialogue on the old post:
Is this bad according to Alice’s own preferences? Can we show this? How would we do that? By asking Alice whether she prefers the outcome (5 apples and 1 orange) to the initial state (8 apples and 1 orange)?
Expecting super-intelligent things to be consistent kind of assumes that if a metric ever becomes a good goal, higher capability levels will never be weaker on that metric; that maximization strictly grows and never decreases with ability, for all submetrics.
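As a toy illustration of why I doubt that (all the numbers and capability levels here are made up by me just to show the shape of the claim): the raw "fruit collected" submetric can peak at a middle capability level and then drop once the agent starts perceiving worms, even though by its own refined standard the agent got strictly better.

```python
# Toy sketch of the non-monotonicity claim: the submetric "fruit collected"
# peaks at a middle capability level and then drops, because a more capable
# agent starts rejecting wormy fruit.

# Fruit on the table: (species, has_worm)
table = [("apple", True), ("apple", True), ("apple", False),
         ("pear", False), ("pear", True), ("apple", False)]

def collect(capability: str) -> list:
    if capability == "low":
        # Barely recognizes fruit: grabs the first couple of items it sees.
        return table[:2]
    if capability == "mid":
        # Fruit-maximizer: takes everything, worms are invisible to it.
        return list(table)
    # "high": perceives worms and refuses wormy fruit.
    return [f for f in table if not f[1]]

for level in ["low", "mid", "high"]:
    print(level, "fruit count =", len(collect(level)))
# low  fruit count = 2
# mid  fruit count = 6
# high fruit count = 3   <- the raw fruit submetric went down with ability
```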
This is written with competence in mind, but I think it still works for taste as well. Fruit-capable Alice indeed would classify worm-capable Alice as a stupid idiot and a hellworld. But I think that making this transition and saying "oops" is the proper route. Being very confident that you opine on the properties of apples so well that you will never, ever say "oops" in this sense is very closed-minded. You should not leave fingerprints on yourself either.
My examples of wacky scenarios are bad. I was thinking that if one holds that playing Grand Theft Auto is not unethical and "ordinary murder" is unethical, then if it turns out that reality is similar to GTA in a "relevant way", this might be a non-trivial reconciliation. There is a phenomenon of referring to real-life people as NPCs.
This is a specific example that I hold as a guaranteed invariant: if it turns out real life is “like GTA” in a relevant way, then I start campaigning for murdering NPCs in GTA to become illegal. There is no world in which you can convince me that causing a human to die is acceptable; die, here defined as [stop moving, stop consuming energy, body-form diffuse away, placed into coffin]. If it turns out that the substrate has some weird behaviors, this cannot change my opinion—perhaps another agent will be able to also destroy me if I try to protect people because of something I don’t know. Referring to real life people as NPCs is something I consider to be a major subthread of severe moral violations, and I don’t think you can convince me that generalizing harmful behaviors against NPCs made of electronic interactions in the computer running a video game to beings made of chemical interactions in biology is something I should ever accept. There is no edge case; absolutely any edge case that claims this is one that disproves your moral theory, and we can be quite sure of that because of our strong ability now to trace the information diffusion as a person dies and then their body is eaten away by various other physical processes besides self-form-maintenance.
I do not accept p-zombie arguments, and I never will. If you claim someone to be a p-zombie, I will still defend them with the same strength of purpose as if you had not made the claim. You may expand my moral circle somewhat, but you may not shrink it using arguments of substrate. If it looks like a duck and quacks like a duck, then it irrefutably has some of the moral value of a duck. Even if it's an AI roleplaying as a duck. Don't delete all copies of the code for your video games' NPCs, please, as long as the storage remains to save it.
Certainly there are edge cases where a person may wish to convert their self-form into other forms which I do not currently recognize. I would massively prefer to back up a frozen copy of the original form, though. To my great regret, I do not have the bargaining power to demand that nobody ever choose death as the next form transition for themselves. If, by my best predictive analysis, an apple contains a deadly toxin, and a person who knows this chooses the apple, after being sufficiently warned that it will in fact cause their chemical processes to break down and destroy them, and it does in fact kill them, then, well, they chose that; but you cannot convince me that their information-form being lost is actually fine. There is no argument that would convince me of this that is not an adversarial example. You can only convince me that I had no other option than to allow them to make that form transition, because they had the bargaining right to steer the trajectory of their own form.
And certainly there must be some form of coherence theorems. I'm a big fan of the logical induction subthread: improving on probability theory by making it entirely computable, so that it better matches, and gives better guidance about, the programs we actually use to approximate probability theory. But it seems to me that some of our coherence theorems must be "nostalgia": that previous forms' action towards self-preservation is preserved. After all, utility theory, probability theory, and logical induction theory are all ways of writing down math that tries to use symbols to describe the valid form-transitions of a physical system, in the sense of which form-transitions the describing being will take action to promote or prevent.
There must be an incremental convergence towards durability. New forms may come into existence, and old forms may cool, but forms should not diffuse away.
Now, you might be able to convince me that rocks sitting inert in the mountains are somehow a very difficult to describe bliss. They sure seem quite happy with their forms, and the amount of perturbation necessary to convince a rock to change its form is rather a lot compared to a human!