Thanks for all the detail, and for looking past my clumsy questions!
It sounds like one disagreement you’re pointing at is about the shape of possible futures. You value “humanity colonizes the universe” far less than some other people do. (Maybe Rob in particular?) That seems sane to me.
The near-term decision questions that brought us here were about how hard to fight to “solve the alignment problem,” whatever that means. For that, the real question is about the difference in total value of the future conditioned on “solving” it and conditioned on “not solving” it. You think there are plausible distributions on future outcomes such that one one-millionth of the expected value of those futures is worth more to you than personally receiving 1 billion dollars.
Putting these bits together, I would guess the amount of value at stake is not really the thing driving the disagreement here, but rather the level of futility? Say you think humanity overall has about a 1% chance of succeeding with a current team of 1000 full-time-equivalents working on the problem. Do you want to join the team in that case? What if we have a one-in-one-thousand chance and a current team of 1 million? Do these seem like the right units to talk about the disagreement in?
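To make the units concrete, here is a crude back-of-the-envelope sketch using just the illustrative numbers above; treating every full-time-equivalent as an equal share of the overall success probability is obviously far too simple, and is only meant to make the two scenarios comparable:

```python
# Crude back-of-the-envelope: treat each full-time-equivalent (FTE) as an
# equal share of the overall chance of success. Real marginal contributions
# are nothing like uniform; this only puts the two scenarios in the same units.

def naive_per_person_share(p_success: float, team_size: int) -> float:
    """Naive per-person share of the overall success probability."""
    return p_success / team_size

# Scenario A: ~1% overall chance, 1,000 FTEs already working on it.
print(naive_per_person_share(0.01, 1_000))        # 1e-05

# Scenario B: one-in-one-thousand chance, 1,000,000 FTEs.
print(naive_per_person_share(0.001, 1_000_000))   # 1e-09
```

Whether a 10^-5 versus a 10^-9 per-person share changes your answer is roughly what I mean by asking whether these are the right units.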
(Another place that I thought there might be a disagreement: do you think solving the alignment problem increases or decreases s-risk? Here “solving the alignment problem” is the thing that we’re discussing giving up on because it’s too futile.)
In some philosophical sense, you have to multiply the value at stake by the estimated chance of success. They both count. But I’m not sitting there actually doing the multiplication, because I don’t think you can put good enough estimates on either one to make the result meaningful.
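To illustrate the kind of thing I mean, with entirely made-up numbers: if my guess at the value at stake is only good to within a few orders of magnitude, and my guess at the chance of success likewise, then the product is spread over even more orders of magnitude, which is why I don’t treat the multiplied number as telling me much.

```python
import random

# Toy illustration with made-up numbers: if both factors are only known to
# within a few orders of magnitude, their product tells you very little.
random.seed(0)

products = []
for _ in range(100_000):
    # Log-uniform guesses: value-at-stake between 1e9 and 1e15 (arbitrary
    # units), chance of success between 1e-4 and 1e-1.
    value_at_stake = 10 ** random.uniform(9, 15)
    chance_of_success = 10 ** random.uniform(-4, -1)
    products.append(value_at_stake * chance_of_success)

products.sort()
print(f"10th percentile: {products[len(products) // 10]:.3g}")
print(f"90th percentile: {products[9 * len(products) // 10]:.3g}")
# The spread covers several orders of magnitude, so the point estimate of
# "value x probability" is mostly an artifact of which guesses you fed in.
```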
In fact, I guess that there’s a better than 1 percent chance of avoiding AI catastrophe in real life, although I’m not sure I’d want to (a) put a number on it, (b) guess how much of the hope is in “solving alignment” versus the problem just not being what people think it will be, (c) guess how much influence my or anybody else’s actions would have on moving the probability, or even (d) necessarily commit to very many guesses about which actions would move the probability in which directions. I’m just generally not convinced that the whole thing is predictable down to 1 percent at all.
In any case, I am not in fact working on it.
I don’t actually know what values I would put on a lot of futures, even the 1000-year one. Don’t get hung up on the billion dollars, because I also wouldn’t take a billion dollars to single-mindedly dedicate the remainder of my life, or even my “working time”, to anything in particular unless I enjoyed it. Enjoying life is something you can do with relative certainty, and it can be enough even if you then die. That can be a big enough “work of art”. Everybody up to this point has in fact died, and they did OK.
For that matter, I’m about 60 years old, so I’m personally likely to die before any of this stuff happens… although I do have a child and would very much prefer she didn’t have to deal with anything too awful.
I guess I’d probably work on it if I thought I had a large, clear contribution to make to it, but in fact I have absolutely no idea at all how to do it, and no reason to expect I’m unusually talented at anything that would actually advance it.
“do you think solving the alignment problem increases or decreases s-risk?”
If you ended up enacting a serious s-risk, I don’t understand how you could say you’d solved the alignment problem. At least not unless the values you were aligning with were pretty ugly ones.
I will admit that sometimes I think other people’s ideas of good outcomes sound closer to s-risks than I would like, though. If you solved the problem of aligning with those people, I might see it as an increase.
Have you considered local movement building? Perhaps something simple like organising dinners or a reading group to discuss these issues? Maybe no-one would come, but it’s hard to say unless you give it a go. In any case, a small group of two or three thoughtful people is more valuable than a much larger group of people who are just there to pontificate without really thinking anything through deeply.