(Warning, long comment; it stays mostly on track but is embarrassingly self-centered for the most part.) I think I was being imprecise when I said you were “less curious” about the things I yammer about, and honestly I don’t remember what I was thinking at the time and won’t try to rationalize it. (I wasn’t on Adderall then but am on Adderall now; there may be state-dependent memory effects.) I thus unendorse at least that part of the grandparent.
I think that everything you’re saying is correct, and I note the interesting similarities between my case and Nesov’s two years ago—except that Nesov had actual formal technical understanding and results and still couldn’t easily communicate his insights, whereas my intuition is not at all grounded in formality. I definitely won’t be contributing to decision theory progress any time soon, and probably never will—I can get excited about certain philosophical themes or aesthetics and stamp things with my mark of intuitive approval, but there is very little value in that unless I’m for some reason in a situation where people with actual skills can bounce ideas off of me. (I am currently trying to set up that situation, but I’m trying not to put too much weight on it.)
I am still very confused about how actual progress gets made in decision-theory-like fields, though, insofar as the things I see on the mailing list, e.g. the discussion of Löbian blindspots, look like attempts to resolve side issues while the foundations are still weak. I don’t see how getting the proofs right helps much, whereas I was very excited by Nesov’s focus on e.g. reversibility and semantics; much of this comes from being happy that Nesov has certain conceptual aesthetics which I endorse. You could perhaps characterize this as my not understanding Slepnev’s style of research. I see your style as somewhere between Nesov’s and Slepnev’s. Perhaps research styles or methodology would make for a useful LW discussion post, or a decision theory list email? Or is my notion of “style” just off? I have never been involved in anything like mathematical research, nor have I read much about how it works beyond Polya’s How to Solve It and brief accounts of e.g. quantum mechanics research.
Anyway. Currently there is only one actually-decision-relevant scenario where I see the sort of thinking I do being useful, and in that sense I sort of think of it as my scenario of comparative advantage. Unfortunately, I’ve yet to talk to the people who have either thought very deeply about very similar issues or have the relevant technical knowledge, namely Shulman and Nesov. The scenario I’m thinking of is where we have a non-provably-Friendly AI or a uFAI but there are other existential risks to worry about. (I think this scenario may be the default, though—it seems somewhat likely to me that AGI is within reach of this generation of humans, whereas it is unclear if something-like-provably Friendly AI is possible, or what value there is in somewhat-more-stable-than-hacked-together AI.) It would be useful to understand what sorts of attractors there are for a self-modifying AI to fall into for either its decision theory or utility function, what the implications of our decision to run a uFAI would be in terms of either causal or acausal game theory, and generally what the heck we’d be knowingly inflicting on the multiverse if we decided to hit the big red button.
These questions, and questions like them, lend themselves to thorough models and rely on precise technical knowledge, but they aren’t obviously questions that can be formalized. They sit in the grey area between the answerable-technical and the unanswerable-philosophical, with a focus on the nature of intelligence: precisely where Less Wrong-style rationality skills are most necessary and most useful. Likewise for questions about “morality”, which are nestled between formal utility theory and decision theory on one side, highly qualitative “naturalistic meta-ethics” on another, and informal but technical and foundational questions about computation on a third, under-explored side. Better understanding these questions has a low but non-negligible chance of affecting either singularity-focused game theory or the design choices guiding the development of FAI or somewhat-Friendly AI.
I think about things at roughly that level of technicality, since I have an automatic disposition to obsess over such questions and may or may not have a knack for doing so in a useful manner. My ability to excel at such thinking is hard to analyze; I think playing with models of complex systems, like multilevel selection, and seeing to what extent the systems bear out my intuitions, would be one way to both check and train the relevant intuitions. Another relevant field is probably psychology, where I have a few ideas which I think could be tested. Computational cognitive science is a relevant intuition-testing and intuition-building field, and I’ve managed to nab myself a girlfriend who is going into it. Rayhawk wants to build a suite of games that train low-level probabilistic reasoning, which I think would also help. He’s written up one very small one thus far, and it would be excellent if Less Wrong could start a project to bring the idea to life. But that’s a story for another day.
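For concreteness, here is a minimal, hypothetical sketch of what one such drill might look like—my own toy example, not Rayhawk’s actual game: show the player evidence generated from one of two known hypotheses, have them estimate the posterior, and then show them the exact Bayesian answer.

```haskell
-- Toy calibration drill (hypothetical sketch; requires the `random` package).
-- A coin has bias 0.3 or 0.7 with equal prior probability. The player sees
-- ten flips, estimates P(bias = 0.7), and is shown the exact posterior.
import Control.Monad (replicateM)
import System.Random (randomRIO)
import Text.Read (readMaybe)

-- One flip of a coin that lands heads with probability p.
flipCoin :: Double -> IO Bool
flipCoin p = (< p) <$> randomRIO (0, 1)

-- Exact posterior P(bias = 0.7 | k heads in n flips) under a 50/50 prior.
posterior :: Int -> Int -> Double
posterior k n = like 0.7 / (like 0.7 + like 0.3)
  where like p = p ^ k * (1 - p) ^ (n - k)

main :: IO ()
main = do
  coinPick <- randomRIO (0, 1 :: Double)
  let bias = if coinPick < 0.5 then 0.7 else 0.3
      n    = 10
  flips <- replicateM n (flipCoin bias)
  let k = length (filter id flips)
  putStrLn ("Heads in " ++ show n ++ " flips: " ++ show k)
  putStrLn "Your estimate of P(bias = 0.7)?"
  guess <- maybe 0.5 id . readMaybe <$> getLine
  let truth = posterior k n
  putStrLn ("Exact posterior: " ++ show truth)
  putStrLn ("Your error: " ++ show (abs (guess - truth)))
```

A real version would presumably vary the generating process and score answers with a proper scoring rule, but this is roughly the shape of the thing.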
I consider it somewhat likely that in six months I will look back and think myself an utter fool for expecting to make any useful progress on thinking about such things. In the meantime I don’t expect LW folk to bother trying to understand my cryptic thoughts, especially not when everyone has so many of their own to worry about.
I would be interested to know whether any of your intuitive leaps have led any of those people to make progress beyond “a new idea that’s almost certain to be wrong even if we’re not sure why” to “something that seems likely to be an improvement over the previous state of the art”.
I think the intuitive leaps I’m most proud of are in just-maybe-sort-of-almost understanding some of Rayhawk’s ideas and maybe provoking him to develop them slightly further, or to recall them after a few months or years of rust. I don’t have a very good idea of how useful all of my philosophical-ish conversation with him has been; his ideas are uniformly a lot better than mine. If for some reason I can convince both him and SingInst that he should be doing FAI work, then perhaps I’ll have a much better model of how useful my philosophical aesthetics are, or how useful they might be if I supplemented them with deep technical and formal understanding. I currently model myself as being somewhat useful to bounce ideas off of, but not yet a, ya know, real FAI researcher, not by a long shot. My aim is to become a truly useful research assistant in the next few years while realizing my apparent cognitive comparative advantage.
Do you have any examples?
The combination of social awkwardness and the non-trivial difficulty of tracking down examples makes me rather averse to doing so; on the other hand, I think Nesov would probably like to see such examples, and I have something of a moral obligation to substantiate the claim. A realistic model of my behavior says I won’t end up providing the examples, but it also says that if I come across such examples in the future I will PM Nesov. In any case I’d rather not list such gripes in public; I feel like it sets a bad precedent or something. (Interestingly, Yudkowsky is a celebrity, and thus such moral qualms have never applied to him in my head. I do regret being harsher on Eliezer than was called for; it’s way too easy to forget he’s a person as well as a meme and meme-generator.)
My aim is to become a truly useful research assistant in the next few years while realizing my apparent cognitive comparative advantage.
Are you working on training yourself to understand graduate-level logic, set theory and category theory? That’s my current best guess at an actionable thing an aspiring FAI researcher should do, no matter what else is on your plate (and it’s been a stable conclusion for over a year).
Not yet, but very soon now. (The plan for category theory is to get proficient with Haskell and maybe kill two birds with one stone by playing with functional inductive programming (which uses category theory). I do not yet have plans for set theory or logic; I don’t understand very well what they’re trying to do. Or rather, my brain hasn’t categorized them as “cool”, whereas it has categorized category theory as “cool”, and I think that if I better understood what was cool about them I’d have a better idea of where to start. I was sort of hoping I could somehow learn all my math in terms of categories, which is still technically a possibility, I guess, but not at all something I can do on my own.)
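For concreteness, the correspondence I’m hoping to exploit is the standard one where Haskell types are objects, functions are morphisms, and a Functor instance is a functor from that category to itself. A minimal, textbook-style sketch (nothing original here):

```haskell
-- Standard example: Maybe as a functor from the category of Haskell types
-- and functions to itself. The class is redefined locally just to make the
-- structure explicit.
import Prelude hiding (Functor, fmap)

-- A functor sends each type a to f a and each function g :: a -> b
-- to fmap g :: f a -> f b, preserving identity and composition.
class Functor f where
  fmap :: (a -> b) -> f a -> f b

instance Functor Maybe where
  fmap _ Nothing  = Nothing
  fmap g (Just x) = Just (g x)

-- Spot-checking the functor laws (a sanity check, not a proof):
--   fmap id      == id
--   fmap (g . h) == fmap g . fmap h
main :: IO ()
main = do
  print (fmap id (Just 3) == Just 3)
  print (fmap ((+ 1) . (* 2)) (Just 3) == (fmap (+ 1) . fmap (* 2)) (Just 3))
```

The category-theoretic content is exactly the two laws in the comment; Haskell just makes them executable.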
I don’t recommend studying category theory in any depth before at least some logic, abstract algebra and topology. It can feel overly empty of substance without a wealth of examples already in place; it’s not called “abstract nonsense” for nothing. Follow my reading list if you don’t have any better ideas or background, and maybe ask someone else for advice. I don’t like some of this stuff either; I just study it because I must.
(The plan for category theory is to get proficient with Haskell and maybe kill two birds with one stone by playing with functional inductive programming (which uses category theory)
I’ve known many people who have tried to walk down this path and failed. The successful ones I know knew one before the other.
The scenario I’m thinking of is where we have a non-provably-Friendly AI or a uFAI but there are other existential risks to worry about. (I think this scenario may be the default, though—it seems somewhat likely to me that AGI is within reach of this generation of humans, whereas it is unclear if something-like-provably Friendly AI is possible, or what value there is in somewhat-more-stable-than-hacked-together AI.) It would be useful to understand what sorts of attractors there are for a self-modifying AI to fall into for either its decision theory or utility function, what the implications of our decision to run a uFAI would be in terms of either causal or acausal game theory, and generally what the heck we’d be knowingly inflicting on the multiverse if we decided to hit the big red button.
This.