Praxis-based values as I define them are, informally, reflective decision-influences matching the description ‘promote x x-ingly’: ‘promote peace peacefully,’ ‘promote corrigibility corrigibly,’ ‘promote science scientifically.’
This just seems meaningless, or tautological, to be entirely honest.
Do you have a formal definition in the works?
Otherwise it seems likely to turn into circular arguments, or infinite regress, like prior attempts.
I describe the more formal definition in the post:
‘Actions (or more generally ‘computations’) get an x-ness rating. We define the x shard’s expected utility conditional on a candidate action a as the sum of two utility functions: a bounded utility function on the x-ness of a and a more tightly bounded utility function on the expected aggregate x-ness of the agent’s future actions conditional on a. (So the shard will choose an action with mildly suboptimal x-ness if it gives a big boost to expected aggregate future x-ness, but refuse certain large sacrifices of present x-ness for big boosts to expected aggregate future x-ness.)’
And as I say in the post, we should expect decision-influences matching this definition to be natural and robust only in cases where x is a ‘self-promoting’ property. A property x is ‘self-promoting’ if it is reliably the case that performing an action with a higher x-ness rating increases the expected aggregate x-ness of future actions.
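A minimal toy sketch of the two-term definition, assuming tanh as the bounded utility shape and illustrative bound sizes (the post only specifies that the future-x-ness term is more tightly bounded than the present-x-ness term; the particular functions and constants here are hypothetical):

```python
import math

# B_NOW bounds the utility on the action's own x-ness; B_FUTURE, the
# tighter bound, caps the utility on expected aggregate future x-ness.
# (Illustrative values, not taken from the post.)
B_NOW = 1.0
B_FUTURE = 0.5

def shard_utility(x_now: float, expected_future_x: float) -> float:
    """Sum of a bounded utility on the action's x-ness and a more
    tightly bounded utility on expected aggregate future x-ness."""
    return B_NOW * math.tanh(x_now) + B_FUTURE * math.tanh(expected_future_x)

# Mildly suboptimal present x-ness is accepted for a big future boost:
baseline = shard_utility(1.0, 0.0)       # optimal now, no future boost
mild_tradeoff = shard_utility(0.8, 3.0)  # slightly worse now, big boost
assert mild_tradeoff > baseline

# ...but a large present sacrifice is refused even for a huge boost,
# because the future term saturates at B_FUTURE:
large_sacrifice = shard_utility(-2.0, 10.0)
assert large_sacrifice < baseline
```

The tighter bound on the future term is what produces the behavior described in the parenthetical: it can tip the balance between near-optimal actions but can never outweigh a large loss of present x-ness.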
‘Actions (or more generally ‘computations’) get an x-ness rating. We define the x shard’s expected utility conditional on a candidate action a as the sum of two utility functions: a bounded utility function on the x-ness of a and a more tightly bounded utility function on the expected aggregate x-ness of the agent’s future actions conditional on a. (So the shard will choose an action with mildly suboptimal x-ness if it gives a big boost to expected aggregate future x-ness, but refuse certain large sacrifices of present x-ness for big boosts to expected aggregate future x-ness.)’
A formal definition means one based on logical axioms, mathematical axioms, universal constants (e.g. speed of light), observed metrics (e.g. the length of a day), etc.
Writing more elaborate sentences can’t resolve the problem of circularity or infinite regress.
You might be confusing it with the legal or societal/cultural/political/literary sense.
A property x is ‘self-promoting’ if it is reliably the case that performing an action with a higher x-ness rating increases the expected aggregate x-ness of future actions.
This seems to be entirely your invention? I can’t find any Google results with a similar match.
A property x is ‘self-promoting’ if it is reliably the case that performing an action with a higher x-ness rating increases the expected aggregate x-ness of future actions.
This seems to be entirely your invention? I can’t find any Google results with a similar match.
Based on the title “Some Thoughts on Virtue Ethics for AIs” I’m assuming this is just a formalization of virtue ethics’s idea of moral habituation. For example, from Aristotle’s Nicomachean Ethics:
‘Similarly we become just by doing just acts, temperate by doing temperate acts, brave by doing brave acts... Hence it is incumbent on us to control the character of our activities, since on the quality of these depends the quality of our dispositions. It is therefore not of small moment whether we are trained from childhood in one set of habits or another; on the contrary it is of very great, or rather of supreme, importance.’