As we’ve discussed and in short, I think aligned AI permits dialing up many of the processes that make science or prediction markets imperfectly self-correcting: tremendously cheaper, in parallel, on the full panoply of questions (including philosophy and the social sciences), with robust consistency, cross-examination, test sets, and forecasting. These sorts of things are an important part of scalable supervision for alignment, but if they can be made to work I expect them to drive strong epistemic convergence.
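As a toy illustration of the “robust consistency” part of this (not the actual proposal being referred to above), one could imagine something like the following sketch, where ask, paraphrase, and agree are hypothetical stand-ins for an aligned-AI question-answering interface and a judgment procedure:

```python
# Illustrative toy sketch only: flag inconsistent answers across paraphrases
# of the same question, so they can be cross-examined or adjudicated later.
from itertools import combinations
from typing import Callable

def consistency_check(
    question: str,
    ask: Callable[[str], str],                    # hypothetical: answers a question
    paraphrase: Callable[[str, int], list[str]],  # hypothetical: produces n rephrasings
    agree: Callable[[str, str], bool],            # hypothetical: judges mutual consistency
    n_variants: int = 5,
) -> list[tuple[str, str]]:
    """Ask the same question under several paraphrases and return pairs of
    mutually inconsistent answers."""
    variants = [question] + paraphrase(question, n_variants)
    answers = {v: ask(v) for v in variants}
    return [
        (answers[a], answers[b])
        for a, b in combinations(variants, 2)
        if not agree(answers[a], answers[b])
    ]
```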
You’ve described some of these ideas to me before, but not in enough detail for me to form a judgement on the actual merits of the ideas and arguments. So I’m having to rely on my usual skeptical prior for new untested ideas in the philosophical or AI safety areas (because a lot of new ideas end up not working out, and people tend to be overconfident about their own original ideas), along with:
We seem to understand the philosophy/epistemology of science much better than that of philosophy (i.e. metaphilosophy), and at least superficially the methods humans use to make progress in them don’t look very similar, so it seems suspicious that the same AI-based methods would happen to work equally well for science and for philosophy. (I tried to understand/explain the difference between the two and why philosophy might be especially difficult or problematic for ML in Some Thoughts on Metaphilosophy. I’m not necessarily putting a lot of stock in my specific explanations but it would be a pretty big surprise to learn that it turns out they’re really the same.)
So to the extent that your overall optimism is based on optimism about these new ideas, I think I have to discount much of that, at least until I see a reasonably detailed write-up.
Per Feynman, understanding is the ability to practice. Or is it the ability to construct a practicing AI?
Maybe we understand the philosophy of science/epistemology reasonably well, but we don’t yet know how to construct a universal AI scientist.
Philosophy is a nebulous concept with multiple overlapping meanings, and as a practice, doesn’t have a single crisp functional conceptualisation, like those you enumerated in your post on metaphilosophy. It’s rather some superposition of these conceptualisations. But the most “important” conceptualisation, it seems to me, is that philosophy is coherent stories that help people craft shared meaning and motivation. This is pretty evident in religion-as-philosophy, moral philosophy, and most “applied” philosophies, from philosophy of physics to philosophy of art, and metaphilosophy itself. We discussed this point here: https://www.lesswrong.com/posts/k93NEoXZq6CdXegdx/philosophical-cyborg-part-1. So, I actually disagree that we are as much “in the dark” about the nature of philosophy as you present. I feel that I understand philosophy not much worse than I understand epistemology/science (albeit still somewhat worse).
AI is already very good at this functional role of philosophy in many cases, although it will probably still struggle if we task it with coming up with a completely novel interpretation of quantum mechanics.
I didn’t understand why the question of whether AI can practice good philosophy is very relevant to p(doom), though.
But the most “important” conceptualisation, it seems to me, is that philosophy is coherent stories that help people craft shared meaning and motivation. This is pretty evident in religion-as-philosophy, moral philosophy, and most “applied” philosophies, from philosophy of physics to philosophy of art, and metaphilosophy itself.
I would agree with “philosophy is coherent stories that help people craft shared meaning and motivation” but I think it’s not merely this. The position that it is merely this (and therefore philosophical questions do not have answers that can be true or false) would be self-undermining, because if there is no true or false in metaphilosophy, then your own metaphilosophical position can’t be true (providing me with no reason to accept it on that basis), and thinking that all or most philosophical questions that I care about have no truth value would be rather demotivating.
Consider instead the following alternative metaphilosophical position (that I tentatively hold): Philosophy may be many things (including “coherent stories that help people craft shared meaning and motivation”) but it is also a method of answering confusing questions, that (at least some) humans seem to possess but can’t yet understand or explain, and many philosophical questions do seem to have answers that can be true or false. What is incoherent or not motivating about this? What is wrong with this in general?
Consider instead the following alternative metaphilosophical position (that I tentatively hold): Philosophy may be many things (including “coherent stories that help people craft shared meaning and motivation”)
This is exactly what I meant as well, in “Philosophy is a nebulous concept with multiple overlapping meanings, and as a practice, doesn’t have a single crisp functional conceptualisation, like those you enumerated in your post on metaphilosophy. It’s rather some superposition of these conceptualisations.”
Let’s first disentangle two questions: “Why do people practice philosophy?” and “What is the (truth) status of philosophical statements?”.
The first one is a question for anthropology and (social) psychology, which, I could argue, are themselves kinds of “philosophies” rather than “sciences” at this moment: they attempt to explain existing evidence (with “coherent stories”), but don’t produce mathematical models with good predictive power. Nevertheless, I agree that there are multiple explanations we can provide, from “finding answers to difficult questions” and “by practicing philosophy, people try to increase the coherence of their world models, which is a deep ‘motivation’ of conscious biological agents, and which was imported into the domain of words 80k years ago or whenever people acquired compositional language”, to “showing off”.
These are retrospective explanations, though, which don’t tell us much about why an AI would be motivated to practice philosophy. As an algorithm for finding more coherent and parsimonious sets of symbols/concepts (ontologies), philosophy may perhaps be implemented in LM-like AI, or there may be other, more efficient algorithms for this.
To the second question, about the status of philosophical statements, I think the answer is the following: while the essence of a scientific theory is a mathematical model, which is judged by the quality of its match with evidence and its coherence with adjacent scientific (mathematical) models, philosophy is text, which is judged by its coherence with scientific models, its internal (linguistic) coherence, and its coherence with other philosophical texts (arbitrarily selected by the philosopher, just as a scientist arbitrarily selects the theories they want their own theory to cohere with, thus advancing this or that scientific paradigm).
The internal coherence of a text and its coherence with other texts are questions of (neuro)semiotics and linguistics/philosophy of language, both of which, in my mind, are branches of cognitive science. If there is something else that makes texts convincing to people apart from their coherence, and apart from external factors such as the likability and authority of the author of the text or the orator, then the “quality” of philosophical texts also becomes a question of neuropsychology more generally.
The above story about coherence applies to most kinds of philosophy except the “foundational” kind, such as the foundations of physics, or “philosophical paradigms” such as pragmatism, which perhaps serve as capstones for large sets of other scientific and philosophical theories; the merit of these “foundational” philosophies is then judged by the overall coherence of the sets of theories they capstone.
I don’t like using the word “truth” outside of logic, but if I’m forced to, the above implies that I go with some version of the coherence theory of truth.
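To caricature this coherence view in code (purely as a toy of my own, not a worked-out proposal): a philosophical text’s “quality” could be scored against whatever reference corpus its author selects, where internal_coherence and pairwise_coherence are hypothetical stand-ins for whatever semiotic or model-based judgment one has in mind, and the 50/50 weighting is an arbitrary assumption:

```python
# Toy caricature of the coherence view of philosophical "quality" (illustrative only).
from statistics import mean
from typing import Callable

def coherence_score(
    text: str,
    reference_corpus: list[str],                      # scientific models / texts selected by the author
    internal_coherence: Callable[[str], float],       # hypothetical judgment in [0, 1]
    pairwise_coherence: Callable[[str, str], float],  # hypothetical judgment in [0, 1]
) -> float:
    """Score a text by its internal coherence and its mean coherence with the
    selected reference corpus, combined with an arbitrary 50/50 weighting."""
    external = mean(pairwise_coherence(text, ref) for ref in reference_corpus)
    return 0.5 * internal_coherence(text) + 0.5 * external
```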
The internal coherence of a text and its coherence with other texts are questions of (neuro)semiotics and linguistics/philosophy of language, both of which, in my mind, are branches of cognitive science. If there is something else that makes texts convincing to people apart from their coherence, and apart from external factors such as the likability and authority of the author of the text or the orator, then the “quality” of philosophical texts also becomes a question of neuropsychology more generally.
Before the invention of logic, someone might have said the same thing about math: that nothing determines the “quality” of a proof aside from how convincing human neuropsychology happens to find it. I’m not saying that philosophy is surely the same or analogous, i.e., that we’ll definitely find deeper reasons than neuropsychology for why a philosophical text is correct or convincing, but neither do I know how to rule that out, which makes me uncertain.
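For a sense of what changed once logic was formalized, here is a trivial machine-checkable example, written in Lean 4 syntax purely as an illustration: whether it counts as a valid proof is settled by the proof checker, independently of how persuasive a reader happens to find it.

```lean
-- A trivial machine-checked proof: its validity is determined by the type
-- checker, not by how convincing the surrounding prose is.
theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
  fun ⟨hp, hq⟩ => ⟨hq, hp⟩
```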
Plus, intuitively it seems like when trying to answer philosophical questions, I’m often aiming for some truth more “real” or “objective” than merely coherence with scientific models and arbitrarily selected other texts. For example, it seems either objectively true or objectively false that nothing determines the quality of a philosophical text aside from coherence and neuropsychology. The truth value of this statement doesn’t seem to depend on what other texts I happen to select to try to make it cohere with, or other subjective factors.
I’m not necessarily putting a lot of stock in my specific explanations but it would be a pretty big surprise to learn that it turns out they’re really the same.
Does it seem to you that the kinds of people who are good at science vs good at philosophy (or the kinds of reasoning processes they use) are especially different?
In your own case, it seems to me like you’re someone who’s good at philosophy, but you’re also good at more “mundane” technical tasks like programming and cryptography. Do you think this is a coincidence?
I would guess that there’s a common factor of intelligence + being a careful thinker. Would you guess that we can mechanize the intelligence part but not the careful thinking part?
A lot of people are way better than me at technical tasks (at some point I wanted to go into cryptography research as a career but had to stick with applied cryptography, which is less technically demanding), but way worse at philosophy (or at least they have shown little interest in philosophy, which itself seems like a major philosophical error). I don’t know how to explain this if science and philosophy are really the same thing or use the same methods.
I would guess that there’s a common factor of intelligence + being a careful thinker. Would you guess that we can mechanize the intelligence part but not the careful thinking part?
It probably has to be more than that, because lots of people in cryptography (and security in general) are highly intelligent and careful thinkers (how else can they survive in those fields), but again AFAICT most people in those fields are not particularly good philosophers. Maybe at least one necessary additional ingredient is “good philosophical intuitions” (for example you have to have an intuition that philosophy is important before you would even start thinking about it) but I have little idea how to break that down further.