I suspect I was interpreting this as “technological progress will be faster than (meta)philosophical progress” instead of the actually-relevant “the gap between technological progress and (meta)philosophical progress will grow faster than it would have without AI”. Do you have arguments for this latter operationalization?
I thought from a previous comment that you already agree with the latter, but sure I can give an argument. It’s basically that the most obvious way of using ML to accelerate philosophical progress seems risky (compared to just having humans do philosophical work) and no one has proposed a better method, so unless this problem is solved in a better way, it looks like we’d have to either accept a faster growing gap between philosophical progress and technological progress, or incur extra risk from using ML to accelerate philosophical progress. See the section Replicate the trajectory with ML? of Some Thoughts on Metaphilosophy for more details.
Background: I generally think humans are pretty “good” at technological progress and pretty “bad” at (meta)philosophical progress, and I think AI will be similar.
Aside from the above argument, I think we could end up creating AIs whose ratio between philosophical ability and technical ability is worse than human, if AI designers simply spent more resources on improving technical ability and neglected philosophical ability in comparison (e.g., because there is higher market demand for technical ability). Considering how much money is currently being invested into making technological progress vs philosophical progress in the overall economy, wouldn’t you expect something similar when it comes to AI? (I guess this is more of an argument for overall pessimism rather than for favoring one approach over another, but I still wanted to point out that I don’t agree with your relative optimism here.)
I thought from a previous comment that you already agree with the latter
Yeah, that’s why I said “I probably agreed with this in the past”. I’m not sure whether my underlying models changed or whether I didn’t notice the contradiction in my beliefs at the time.
It’s basically that the most obvious way of using ML to accelerate philosophical progress seems risky
It feels like this is true for the vast majority of plausible technological progress as well? E.g. most scientific experiments / designed technologies require real-world experimentation, which means you get very little data, making it very hard to naively automate with ML. I could make a just-so story where philosophy has much more data (philosophy writing), that is relatively easy to access (a lot of it is on the Internet), and so will be easier to automate.
My actual reason for not seeing much of a difference is that (conditional on short timelines) I expect that the systems we develop will be very similar to humans in the profile of abilities they have, because it looks like we will develop them in a manner similar to how humans were “developed” (I’m imagining development paths that look like e.g. OpenAI Five, AlphaStar, GPT-2 as described at SlateStarCodex). So the zeroth-order prediction is that there won’t be a relative difference between technological and philosophical progress. A very sketchy first-order prediction based on “there is lots of easily accessible philosophy data” suggests that philosophical progress will be differentially advanced.
Yeah, I agree that that particular method of making philosophical progress is not going to work.
I guess this is more of an argument for overall pessimism rather than for favoring one approach over another, but I still wanted to point out that I don’t agree with your relative optimism here.
Yeah, that’s basically my response.
I don’t have good arguments for my optimism (and I did remove it from the newsletter opinion for that reason). Nonetheless, I am optimistic. One argument is that over the past few centuries it seems like philosophical progress has been making the world better faster than technological progress has been causing bad distributional shifts—but of course even if our ancestors’ values had been corrupted we would not see it that way, so this isn’t a very good argument.
It feels like this is true for the vast majority of plausible technological progress as well? E.g. most scientific experiments / designed technologies require real-world experimentation, which means you get very little data, making it very hard to naively automate with ML. I could make a just-so story where philosophy has much more data (philosophy writing), that is relatively easy to access (a lot of it is on the Internet), and so will be easier to automate.
On the scientific/technological side, you can also use scientific/engineering papers (which I’m guessing has to be at least an order of magnitude greater in volume than philosophy writing), plus you have access to ground truths in the form of experiments and real-world outcomes (as well as near-ground truths like simulation results), which have no counterpart in philosophy. My main point is that it seems a lot harder for technological progress to go “off the rails” due to having access to ground truths (even if that data is sparse) so we can push it much harder with ML.
My actual reason for not seeing much of a difference is that (conditional on short timelines) I expect that the systems we develop will be very similar to humans in the profile of abilities they have, because it looks like we will develop them in a manner similar to how humans were “developed”
I agree this could be a reason that things turn out well even if we don’t explicitly solve metaphilosophy or do something like my hybrid approach ahead of time. The way I would put it is that humans developed philosophical abilities for some mysterious reason that we don’t understand, so we can’t rule out AI developing philosophical abilities for the same reason. It feels pretty risky to rely on this though. If by the time we get human-level AI, this turns out not to be true, what are we going to do then? And even if we end up with AIs that appear to be able to help us with philosophy, without having solved metaphilosophy how would we know whether it’s actually helping or pushing us “off the rails”?
On the scientific/technological side, you can also use scientific/engineering papers (which I’m guessing has to be at least an order of magnitude greater in volume than philosophy writing)
This still seems like it is continuing the status quo (where we put more effort into technology relative to philosophy) rather than differentially benefitting technology.
My main point is that it seems a lot harder for technological progress to go “off the rails” due to having access to ground truths (even if that data is sparse) so we can push it much harder with ML.
Yeah, that seems right, to the extent that we want to use ML to “directly” work on technological / philosophical progress. To the extent that it has to factor through some more indirect method (e.g. through human reasoning as in iterated amplification) I think this becomes an argument to be pessimistic about solving metaphilosophy, but not that it will differentially benefit technological progress (or at least this depends on hard-to-agree-on intuitions).
I think there’s a strong argument to be made that you will have to go through some indirect method because there isn’t enough data to attack the problem directly.
(Fwiw, I’m also worried about the semi-supervised RL part of iterated amplification for the same reason.)
The way I would put it is that humans developed philosophical abilities for some mysterious reason that we don’t understand, so we can’t rule out AI developing philosophical abilities for the same reason. It feels pretty risky to rely on this though.
Yeah, I agree that this is a strong argument for your position.