I suspect doing a good job of this is going to be extremely challenging. My loose order-of-magnitude estimate of the Kolmogorov complexity of a decent Ethics/human values calculator is somewhere in the terabytes (something of the order of the size of our genome, i.e. a few gigabytes, is a plausible lower bound, but there’s no good reason for it to be an upper bound). However, a sufficiently rough approximation might be a lot smaller, and even that could be quite useful (if prone to running into Goodhart’s Law under optimization pressure). I think it’s quite likely that doing something like this will be useful in AI-Assisted Alignment, in which case having sample attempts made entirely by humans to start from is likely to be valuable.
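As a rough sanity check on that “few gigabytes” lower bound, here’s a minimal back-of-envelope sketch (Python; it assumes only the standard ~3.1 billion base-pair figure for the human genome, and the numbers it prints are just arithmetic from that):

```python
# Back-of-envelope arithmetic for the genome-size lower bound mentioned above.
# Assumption: the human genome is roughly 3.1e9 base pairs, and each base pair
# carries at most 2 bits of information (log2 of the 4 bases A, C, G, T).

BASE_PAIRS = 3.1e9      # approximate human genome length
BITS_PER_BASE = 2       # log2(4)

raw_info_bytes = BASE_PAIRS * BITS_PER_BASE / 8   # information-theoretic content
fasta_bytes = BASE_PAIRS * 1                      # ~1 byte per base as plain text

print(f"Raw information content: {raw_info_bytes / 1e9:.2f} GB")    # ~0.78 GB
print(f"Plain-text (FASTA-like) size: {fasta_bytes / 1e9:.1f} GB")  # ~3.1 GB
```

Depending on whether you count raw information content or an uncompressed plain-text encoding, the genome comes out somewhere under one to a few gigabytes, consistent with the lower bound above; the terabyte estimate for the full calculator is a separate judgment call.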
Did you look at the order of magnitude of standard civil damages for various kinds of harm? That seems like the sort of thing your model should be able to predict successfully.
Also, these sorts of “pleasure” involve not taking responsibility for one’s emotions, and thus act to reduce self-esteem, which in turn reduces one’s tendency to experience life overall as “positive.” That’s why these pleasures were counted as value destructions.
I know a number of intelligent, apparently sane, yet kinky people who would disagree with you. If you’re interested in the topic, you might want to read some more on it, e.g. Safe, Sane, and Consensual—Consent and the Ethics of BDSM. If nothing else, your model should be able to account for the fact that at least a few percent of people do this.
Thank you for the comment. Yes, I agree that “doing a good job of this is going to be extremely challenging.” I know it’s been challenging for me just to get to the point that I’ve gotten to so far (which is somewhat past my original post). I like to joke that I’m just smart enough to give this a decent try and just stupid enough to actually try it. And yes, I’m trying to find a rough approximation as a good starting point, in hopes that it’ll be useful.
Thanks for the suggestion about civil damages—I haven’t looked into that, only criminal “damages” (in terms of criminal sentences) thus far. I actually don’t expect that the first version of my calculations, based on my own ethics/values, will particularly agree with civil damages, but it may be interesting to see if the calculations can be modified to follow an alternate ethical framework (one less focused on self-esteem) that does give reasonable agreement.
Regarding masochistic and sadistic pleasure, it depends on how we define them. One might regard people who enjoy exercise as being into “masochistic pleasure.” That’s not what I mean by it. By masochistic pleasure I basically mean pleasure that comes from one’s own pain, plus self-loathing. Sadistic pleasure would be pleasure that comes from the thought of others’ pain, plus self-loathing (even if it may appear as loathing of the other, the way I see it, it’s ultimately self-loathing). Self-loathing involves not taking responsibility for one’s emotions about oneself and is part of having low self-esteem. I appreciate you pointing to the need for clarification on this, and hope it’s now clarified a bit. Thanks again for the comment!