Actually: maybe this is a decent chance to have Less Wrong folk help refine common LW concepts, or at least a few of the ones featured in this article. Most folk won’t share my confusions, misgivings, or ignorance about most of Less Wrong’s recurring concepts, but surely a few people share a few of them, and could benefit from such a list! Consider this comment a test of the idea. I’ll list some concepts from this post that I’m dissatisfied with, along with at least one reason why, and maybe others can point out a better way of thinking about each concept, or a way to make it more precise or less confused. Here we go:
Mathematical structure: I don’t know what this means, and I don’t think Wikipedia’s definition is the relevant one. Can we give minimal examples of mathematical structures? What’s simpler than the empty set? Is a single axiom a mathematical structure? (What if that axiom is ridiculously (infinitely?) long—how short does something have to be to be an axiom? Where are we getting the language we use to write out the axiom, and where did it get its axioms? (“Memetic evolution?! That’s not even a… what is this i dont even”))
Ensemble: There are a whole bunch of different ensemble formulations—see Standish’s “Theory of Nothing” for a few. Besides being ensembles of different ‘things’—mathematical structures, computations, whatever—are there other ways of differentiating or classifying them? Is the smallest ensemble that deserves its name just “anything computable”, or is there a way to use e.g. (infinite and/or infinitely nested?) speed priors to get an interesting “smaller” ensemble? If they’re not equivalent, is that an accident? Is any of this relevant to choosing what things some part of your decision theory is allowed to reason about, at least at first? (I’m thinking about dangerous Pascalian problems if infinities are allowed anywhere.)
Weightings: Why does a universe-wide universal-prior weighting match the same universal prior probability distribution that is optimal for predicting inputs from environments of arbitrary size/complexity? (Presumably I’d understand this better if I actually understood the proofs, but in general I’m annoyed at that “within a constant!” trick, even though it makes up something like half of algorithmic probability theory; I’ve written out the definition I mean just after this list.) And what does it mean for reality to have a chosen Turing language? What if it disagrees with your chosen Turing language; presumably Solomonoff induction would learn that too? (What are the answers to the more pointed questions I should be asking here?)
Computation: I sort of get what a computation is in its abstract form. But then I look at my desk. What computations are being executed? How do we analyze the physical world as being made up of infinitely many, endlessly overlapping computations? What is approximate computation? What does it mean to execute a computation? Inquiry about computation could go on for a long time, though, and deserves its own bundle of posts.
Sweeping confusion into UDT: (I have been told that) in many decision scenarios a bookie can work out what the UDT agent’s implicit preferences and probabilities are even after they have been baked into a decision policy. They might not be explicitly separated, and in some cases that gets around tricky problems, but it seems misleading to say it is all just preference, considering that the processes which determined those preferences happened to care about this weird “probability”-ish thing called (relative) existence. Are any probabilistic-ish problems being obscured by not explicitly distinguishing between beliefs and preferences even though implicitly they’re there? (Is this line of inquiry fundamentally confused, and if so, how?) A toy version of the bookie trick is also sketched right after this list.
Anthropics: Is there any reason to bother pondering anthropics directly, rather than thinking decision-theoretically while taking into account that one’s preferences seem like they may refer to something like probabilities? This is related to the previous bullet point.
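To pin down the thing I keep calling the “within a constant!” trick: the following is just my sketch of the textbook setup from algorithmic probability, so read it as a statement of what I’m confused about rather than anything authoritative. Fix a universal monotone machine U, and write U(p) = x* to mean that U’s output on program p begins with x. Then the universal prior, and the invariance theorem that licenses ignoring the choice of U, are

\[
  M_U(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\lvert p \rvert},
  \qquad
  M_U(x) \;\le\; c_{U,V}\, M_V(x) \quad \text{for all } x,
\]

where the constant c_{U,V} depends on the two universal machines U and V but not on x. My “chosen Turing language” worry is basically: who picks U, and why should I be reassured by a constant that can be astronomically large?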
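And here is the kind of bookie trick I have in mind in the UDT item, as a toy sketch only: the hidden credence and the policy below are entirely made up and have nothing to do with a real UDT agent. The point is just that a fixed policy’s betting behavior can be probed to recover an implied probability that was never written down anywhere explicitly.

# Toy illustration of "a bookie can read implicit probabilities off a fixed policy".
# Everything here is hypothetical: hidden_credence stands in for whatever
# probability-ish quantity got baked into the agent's decision policy.

def make_policy(hidden_credence: float):
    """A fixed policy: accept a bet that pays 1 if the event happens,
    offered at price `price`, iff its expected value is nonnegative."""
    def accepts(price: float) -> bool:
        return hidden_credence * 1.0 - price >= 0.0
    return accepts

def bookie_extracts_credence(accepts, tol: float = 1e-9) -> float:
    """Binary-search for the highest price the policy will pay for the bet;
    that threshold is the policy's implied probability of the event."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if accepts(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

if __name__ == "__main__":
    policy = make_policy(hidden_credence=0.37)  # the bookie never sees this number
    print(bookie_extracts_credence(policy))     # ~0.37, recovered from behavior alone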
If this is useful (if some answers to the above questions or confusions turn out to be useful), then I’ll consider writing a much longer post in the same vein. Getting annoyed or disturbed by imperfect concepts and conceptual breakdowns is something I’m automatically predisposed to do, and if this works I can see myself becoming more valuable to the LW community, especially the part of the community that is afraid to ask stupid questions out loud.
Research on numerical cognition seems relevant here. Interesting links here, here and here.
Anyway. Typically, a mathematical structure is something with some sort of attached rule set (operations, relations, or axioms its parts satisfy), whereas a mathematical object is a primitive, something that can be manipulated according to that rule set. So the empty set or a single axiom might be a mathematical object, but not a structure.
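A minimal concrete case, using the usual model-theoretic convention (which may or may not be exactly the sense the original post intends): a structure is a domain together with interpretations of the symbols in some signature, so even something as small as

\[
  \mathcal{M} \;=\; \bigl(\{0,1\},\ \oplus\bigr), \qquad 0 \oplus 0 = 1 \oplus 1 = 0, \quad 0 \oplus 1 = 1 \oplus 0 = 1
\]

counts: the two-element set equipped with XOR is the smallest nontrivial group. The bare set {0, 1} on its own is just an object; pairing it with the operation is what makes it a structure.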
Infinitely long axioms are the objects of study of infinitary logics, and there is a whole branch of model theory devoted to them (I think most of the groundwork for that area was laid by Jon Barwise). You can learn about this and several other areas of model theory here.
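To give a flavor of what an infinitely long axiom buys you, here is the standard example from the logic L_{\omega_1,\omega} (quoted from memory, so double-check me): the single countable disjunction

\[
  \forall x \;\; \bigvee_{n \in \omega} \; x = S^{n}(0)
\]

says “every element is 0, or S(0), or S(S(0)), or …”. Together with ordinary first-order axioms saying that S is injective and that 0 is not a successor, it pins down \langle \mathbb{N}, 0, S \rangle up to isomorphism, which no first-order axiomatization can do (compactness always yields nonstandard models).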
There is a pervasive sense that mathematics is not an anthropocentric activity, and that it is in some way universal, but this is not very well specified. I tend to think that in order to tackle this issue it might be necessary to understand how to implement a program that can ‘understand’ and generate new mathematics at least as generally as a human with peak mathematical ability, but that is just my intuition.
(Warning: I be hittin’ the comment button without reviewing this carefully, views expressed may be inaccurately expressed and shit, ya dig? Aight yo.)
Thanks for the pointers. I wish there were a place for me to just bring up things I’ve been thinking about, and quickly get pointers or even conversation. Is Less Wrong IRC the best place for that? I’ve never used it.
One FAI-relevant question I’m very interested in is: what, if anything, happens when a Goedel machine becomes intelligent enough to “understand” the semantics of its self-description, especially its utility function and proof-search axioms? Many smart people emphasize the important difference between syntax and semantics, but this emphasis is not nearly as common in Less Wrong’s standard philosophy.[1] If we could show that there’s no way a Goedel machine can “re-interpret” the semantics of its axioms or utility function to mean something intuitively rather different from how humans were interpreting them, then we would have two interesting arguments: that it is indeed theoretically possible to build a general intelligence that is “stable” if the axioms are sound[2], and also that superintelligences with non-Friendly initial utility functions probably won’t converge on whatever a Friendly AI would also have converged on. Though this still probably wouldn’t convince the vast majority of AGI researchers who weren’t already convinced, it would convince smart, technically minded objectors like me or perhaps Goertzel (though I’m not sure what his position is).
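For anyone who hasn’t looked at the formalism: the one rule of a Goedel machine that matters for everything below is “rewrite yourself only when your proof searcher has proven, from your axioms describing your hardware, environment, and utility function, that the rewrite is an improvement”. Here is a toy stand-in for that rule, purely illustrative and entirely made up; exhaustive checking over a tiny, fully known environment plays the role of the real proof search.

# Toy stand-in for the Goedel machine's self-rewrite rule (after Schmidhuber,
# but drastically simplified): the machine switches to a new policy only when
# the switch has been *verified* to yield higher utility than the incumbent.
# The environment, the policies, and the utility function are all invented
# for this sketch.

from typing import Callable, List

Policy = Callable[[int], int]                        # maps an observation to an action
ENVIRONMENT: List[int] = [3, 1, 4, 1, 5, 9, 2, 6]    # a hypothetical, fully known world

def utility(policy: Policy) -> float:
    """Total reward: one point whenever the action matches the observation."""
    return sum(1.0 for obs in ENVIRONMENT if policy(obs) == obs)

def incumbent_policy(obs: int) -> int:
    return 0                                         # a deliberately bad initial policy

def candidate_policy(obs: int) -> int:
    return obs                                       # a proposed self-rewrite

def provably_better(candidate: Policy, incumbent: Policy) -> bool:
    """Stand-in for the proof searcher: here, 'proof' is an exhaustive check
    against the (axiomatized, i.e. fully known) environment."""
    return utility(candidate) > utility(incumbent)

def goedel_machine_step(incumbent: Policy, candidate: Policy) -> Policy:
    # The defining feature: no rewrite without a verified utility improvement.
    return candidate if provably_better(candidate, incumbent) else incumbent

if __name__ == "__main__":
    adopted = goedel_machine_step(incumbent_policy, candidate_policy)
    print("switched" if adopted is candidate_policy else "kept incumbent")

The point of the toy is only the shape of the rule: no self-modification without a verified gain according to the axiomatized utility function, which is exactly why everything hinges on what those axioms and that utility function actually say.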
One interesting way to look at Goedel machines, for all kinds of investigations, is to imagine that they create new agents to do things for them. (A variation on this theme is to trap the Goedel machine in a box and make it choose which of two totally different agents to let out of their boxes: a situation where it outputs the best strategy according to its goals, but that strategy has a huge number of side effects beyond just optimizing its goals.) For ideas related to those in the previous paragraph, it might be useful to imagine that the Goedel machine’s proof search tells it that a very good idea would be to create an agent that monitors the Goedel machine and intervenes if the Goedel machine stops optimizing according to its utility function. (After all, what if its hardware gets corrupted, or it gets coerced into modifying its utility function and deleting its memory of the coercion?) How does this second agent determine the “actual” or “intended” semantics of the original machine’s utility function, assuming it’s not too worried about its own utility function, which references the original machine’s? These are just tools one can use to look at such things; the details I’m adding could be better optimized. Though it’s not my reason for bringing up these ideas, you can see how such considerations indicate that having a thorough understanding of the Goedel machine’s architecture and utility function doesn’t obviously tell us everything we need to know, because superintelligences are liable to get creative. No pun intended.
To further show why this might be interesting for LW folk: many of SingInst’s standard arguments about the probable unFriendliness of AIs not explicitly coded to be Friendly are either contradictory or technically weak, and it’d be nice to demonstrate technically whether or not they are compelling. To substantiate that claim a little: despite SingInst’s standard arguments (which I’ve thoroughly understood for two years now, and I was a Visiting Fellow for over a freakin’ year, so please, Less Wrong, for once don’t parrot them back to me; ahem, anyway), it’s difficult to imagine an agent that doesn’t automatically and instantly fail, for example by simple wireheading or just general self-destruction, yet somehow becomes superintelligent and lands in the sweet spot where it (mis-)interprets its utility function as referring to something completely alien, and again not because it’s wireheading. Most AI designs simply don’t go anywhere; thus formal ones like Goedel machines are by far the best to inspect closely. If we look at less technical designs, it becomes a game where anyone can assert their intuition or play reference class tennis. For example, suppose some AI with a hacky implicit goal system becomes smart enough to FOOM: as it reflects on its goal system in order to piece it together, how much reflection does it do? What kind of reflection does it do? The hard truth is that it’s hard to argue for any particular amount less than “a whole bunch of reflection”, and if you think about it for a while it’s easy to see how in theory such reflection could lead to the AI becoming Friendly. Thus Goedel machines with very precise axioms and utility functions are by far the best to look at.
(BTW, combining two ideas above: it’s important to remember that wireheading agents can technically create non-wireheading agents that humans would have to worry about. It’s just hard to imagine an AI that stayed non-wireheading long enough and became competent enough to write a non-wireheading seed AI, and then suddenly started wireheading.)
[1] Maybe because it leads to people like Searle saying questionable things? Though ironically Searle is generally incredibly misunderstood and caricatured. At any rate, I am afraid that some kind of reversed stupidity might be occurring, whether that stupidity was ever really there in the first place or was just incorrect pattern-matching from skepticism about computationalism to substance dualism, or something.
[2] Does anyone talk about how it can be shown that two axioms aren’t contradictory, or that an axiom isn’t self-contradictory? (Obviously you need to at least implicitly use axioms to show this, so it’s an infinite regress, but just as obviously at some point we’re going to have to trust in induction, even if we’re coding an FAI.)
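(My partial understanding of the textbook answer, which could be off: for first-order theories, Goedel’s completeness theorem says that consistency is the same as having a model, while his second incompleteness theorem is precisely the regress I’m gesturing at:

\[
  T \cup \{\varphi\}\ \text{is consistent} \;\iff\; \exists\, \mathcal{M}\ \bigl(\mathcal{M} \models T \cup \{\varphi\}\bigr),
  \qquad
  T \supseteq \mathrm{PA},\ T\ \text{consistent and recursively axiomatized} \;\Rightarrow\; T \nvdash \mathrm{Con}(T).
\]

So in practice one proves relative consistency, by exhibiting a model of the new axiom inside a theory one already trusts, and at the bottom of the stack something has to be taken on trust.)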
Ash Ketchum is strolling around Kanto when he happens upon a MissingNo. “GOEDEL MACHINE, I choose you!” GOEDEL MACHINE used RECURSION. Game Boy instantly explodes.
MISTY: “Ash, we have to do something! Kooky Psychic Gym Leader Sabrina is leveling up her Abra and she’s not even trying to develop a formal theory of Friendly Artificial Alakazam!”
ASH: “Don’t panic! She doesn’t realize that in order to get her Kadabra to evolve she’ll have to trade with carbon copies of us in other Game Boys, then trade back. Nintendo’s clever ploy to sell link cables ensures we have a...” Ash dons sunglasses. “...game-theoretic advantage.”
BROCK: “Dah dah dah dah, Po-ké-MON!”
Game Boy explodes.
I think this would be useful. On some of these topics we may not even realize how confused we are. I thought I knew where I was at with “computation”, for example. Now I realize I cannot answer your questions about it without begging more questions, though.