“If your conclusion is that we don’t know how to do value alignment, I and I think most alignment thinkers would agree with you. If the conclusion is that AGI is useless, I don’t think it is at all.”
Sort of- I worry that it may be practically impossible for current humans to align AGI to the point of usefulness.
“If we had external help that allowed us to focus more on what we truly want—like eliminating premature death from cancer or accidents, or accelerating technological progress for creative and meaningful projects—we’d arrive at a very different future. But I don’t think that future would be worse; in fact, I suspect it would be significantly better.”
That’s my intuition and hope- but I worry that these things are causally entangled with things we don’t anticipate. To use your example- what if we only ask an aligned and trusted AGI to cure premature death by disease and accident, which wouldn’t greatly conflict with most people’s values the way radical life extension would, but the sudden loss of an entire healthcare and insurance industry triggers an economic collapse so total that vast swaths of people starve? (I don’t think this would actually happen, but it’s an example of the kind of unforeseen consequence that may follow from having a wish suddenly granted, when you ask an instruction-following AGI to grant it without counting on a greater intelligence to project and weigh all of the consequences.)
I also worry about the phrase “a human you trust.”
Again- this feels like cynicism, if not the product of a catastrophizing mind (which I know I have). I think you make a very good argument- I’m probably indulging too much in black-and-white thinking- that there’s a way to fulfill these desires quickly enough that we relieve more suffering than we would have if left to our own devices, but still slowly enough to monitor for unforeseen consequences. Maybe the bigger question is just whether we will.
I agree with everything you’ve said there.
The bigger question is whether we will achieve usefully aligned AGI. And the biggest question is what we can do.
Ease your mind! Worries will not help. Enjoy the sunshine and the civilization while we have it, don’t take it all on your shoulders, and just do something to help!
As Sarah Connor said:
NO FATE
We are not in her unfortunately singular shoes. It does not rest on our shoulders alone. As most heroes in history have done, we can gather allies and enjoy the camaraderie and each day.
On a different topic, I wish you wouldn’t call yourself a failed scifi author or a failed anything. I hope it’s in jest or excessive modesty. Failure is only when you give up on everything or are dead. I think there is much to accomplish in writing good fiction. It doesn’t have to be perfect. Changing directions isn’t failing either; it’s changing strategy, hopefully as a result of learning.
Yeah- calling myself a failed scifi writer really was half in jest. I had some very limited success as an indie writer for a good number of years, and recently necessity has made me shift direction. Thank you for the encouragement, though!