Okay, I’ll try to summarize your main points. Please let me know if this is right
You think «membranes» will not be able to be formalized in a consistent way, especially in a way that is consistent across different levels of modeling
“It seems easy to find counterexamples where intruding into someone’s boundaries is an ethical thing to do and abstaining from that would be highly unethical.”
Have I missed anything? I’ll respond after you confirm.
Also, would you please share any key example(s) of #2?
You think «membranes» will not be able to be formalized in a consistent way, especially in a way that is consistent across different levels of modeling
No, I think membranes could be formalised (Markov blankets, objective “joints” of the environment as in https://arxiv.org/abs/2303.01514, etc.; though theory-laden, I think that the “diff” between the boundaries identifiable from the perspective of different theories is usually negligible).
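The Markov-blanket formalisation mentioned above can be made concrete. A minimal sketch (my own illustration; the toy graph and node names are hypothetical, not from the linked paper): for a node in a directed graphical model, its Markov blanket is the set of its parents, its children, and the other parents of its children.

```python
def markov_blanket(parents, node):
    """parents maps each node name to the set of its parent nodes."""
    children = {c for c, ps in parents.items() if node in ps}
    # other parents of the node's children ("co-parents")
    co_parents = set().union(*(parents[c] for c in children))
    blanket = set(parents[node]) | children | co_parents
    blanket.discard(node)  # a node is not part of its own blanket
    return blanket

# Hypothetical agent-environment chain: env -> sense -> internal -> act,
# with the outcome depending on both the action and the environment.
parents = {
    "env": set(),
    "sense": {"env"},
    "internal": {"sense"},
    "act": {"internal"},
    "outcome": {"act", "env"},
}
print(sorted(markov_blanket(parents, "internal")))  # ['act', 'sense']
```

Conditioned on its blanket, the "internal" node is independent of the rest of the graph, which is the sense in which a Markov blanket draws a statistical boundary around an agent.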
We, humans, intrude into each other’s boundaries, and into the boundaries of animals, organisations, communities, etc., all the time. A surgeon intruding into the boundaries of a patient is an ethical thing to do. If an AI automated the entire economy, waited until humanity completely lost the ability to run civilisation on its own, and then suddenly stopped all maintenance of the automated systems that support human life, watching humans die out because they cannot support themselves, it would be “respecting humans’ boundaries”, but it would also be an evil treacherous turn. Messing with Hitler’s boundaries (i.e., killing him) in 1940 would have been an ethical action from the perspective of most systems that may care about it (individual humans, organisations, countries, communities).
I think that boundaries (including consciousness boundaries: what is the locus of animal consciousness? Just the brain, the whole body, or does it even extend beyond the body? What is the locus of an AI’s consciousness?) are an undeniably important concept that is usable for inferring ethical behaviour. But I don’t think a simple “winning” deontology is derivable from this concept. I’m currently preparing an article where I describe that, from the AI engineering perspective, deontology, virtue ethics, and consequentialism could be seen as engineering techniques (approaches) that could help to produce and continuously infer an ethical style of behaviour. None of these “classical” approaches to normative ethics is either necessary or sufficient, but they could all help to improve the ethics of some cognitive architectures.
I think that boundaries […] are an undeniably important concept that is usable for inferring ethical behaviour. But I don’t think a simple “winning” deontology is derivable from this concept.
I see
I’m currently preparing an article where I describe that, from the AI engineering perspective, deontology, virtue ethics, and consequentialism
please lmk when you post this. i’ve subscribed to your lw posts too
FWIW, I don’t think the examples given necessarily break «membranes» as a “winning” deontological theory.
A surgeon intruding into the boundaries of a patient is an ethical thing to do.
If the patient has consented, there is no conflict.
(Important note: consent does not always nullify membrane violations. In this case it does, but there are many cases where it doesn’t.)
If an AI automated the entire economy, waited until humanity completely lost the ability to run civilisation on its own, and then suddenly stopped all maintenance of the automated systems that support human life, watching humans die out because they cannot support themselves, it would be “respecting humans’ boundaries”, but it would also be an evil treacherous turn.
I think a way to properly understand this might be: if Alice makes a promise to Bob, she is essentially giving Bob a piece of herself, and that changes how he plans for the future and whatnot. If she then revokes it on terms not part of the original agreement, she has stolen something from Bob, and that is a violation of membranes?
If the AI promises to support humans under an agreement, then breaks that agreement, that is theft.
Messing with Hitler’s boundaries (i.e., killing him) in 1940 would have been an ethical action from the perspective of most systems that may care about it (individual humans, organisations, countries, communities).
In a case like this I wonder if the theory would also need something like “minimize net boundary violations”, kind of like how some deontologies make murder okay sometimes.
But then this gets really close to utilitarianism and that’s gross imo. So I’m not sure. Maybe there’s another way to address this? Maybe I see what you mean
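For what it’s worth, the “minimize net boundary violations” rule is easy to state as a toy computation, which also makes its utilitarian flavour visible: it forces you to aggregate violations across different agents into one number. A hypothetical sketch (the action names and violation weights are made up for illustration):

```python
def least_violating(actions):
    """actions maps an action name to a list of per-agent violation weights.

    Picks the action whose summed violations are smallest -- the
    aggregation step that makes this rule resemble utilitarianism.
    """
    return min(actions, key=lambda a: sum(actions[a]))

options = {
    "do_nothing": [5, 5, 5],  # inaction lets three agents be violated
    "intervene": [8],         # one larger violation of a single agent
}
print(least_violating(options))  # intervene (total 8 < total 15)
```

The uncomfortable part is exactly the one noted above: once violations against different agents are summed and compared, the rule is doing consequentialist bookkeeping rather than pure deontology.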