So, “God does everything”, plus a definition of “everything” which makes predictions about all events, would rate very highly with you? It’s very low on theory and very high on prediction.
No, it has tons of theory. God is a very complex concept. Note that ‘God did everything’ is more complex and therefore less likely than ‘everything happened’. Did you read http://lesswrong.com/lw/jp/occams_razor/ ?
How do you figure God is complex? God as I mean it simply can do anything, no reason given. That is its only attribute: that it arbitrarily does anything the theory it’s attached to cares to predict. We can even stop calling it “God”. We could even not mention it at all so there is no theory and merely give a list of predictions. Would that be good, in your view?
If ‘God’ is meaningless and can merely be attached to any theory, then the theory is the same with and without God. There is nothing to refute, since there is no difference. If you defined ‘God’ to mean a being who created all species or who commanded a system of morality, I would have both reason to care about and means to refute God. If you defined ‘God’ to mean ‘quantum physics’, there would be applications and means of proving that ‘God’ is a good approximation, but this definition is nonsensical, since it is not what is usually meant by ‘God’. If the theory of ‘God’ has no content, there is nothing to discuss, but this is again a very unusual definition.
If there is no simpler description, then a list of predictions is better but, if an explanation simpler than merely a list of predictions is at all possible, then that would be more likely.
How do you decide if an explanation is simpler than a list of predictions? Are you thinking in terms of data compression?
Do you understand that the content of an explanation is not equivalent to the predictions it makes? It offers a different kind of thing than just predictions.
How do you decide if an explanation is simpler than a list of predictions? Are you thinking in terms of data compression?
Essentially. It is simpler if it has a higher Solomonoff prior.
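For reference, the prior I mean is (roughly) the usual one, with U a universal prefix machine and |p| the length of program p in bits:

$$M(x) \;=\; \sum_{p \,:\, U(p)\ \text{outputs a string beginning with}\ x} 2^{-|p|},$$

so a hypothesis counts as simpler, and gets more prior weight, when it can be generated by a shorter program.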
Do you understand that the content of an explanation is not equivalent to the predictions it makes? It offers a different kind of thing than just predictions.
Yes, there is more than just predictions. However, predictions are the only things that tell us how to update our probability distributions.
So, your epistemology is 100% instrumentalist and does not deal with non-predictive knowledge at all?
Can you give an example of non-predictive knowledge and what role it should play?
Quoting from The Fabric of Reality, chapter 1, by David Deutsch.
Yet some philosophers — and even some scientists — disparage the role of explanation in science. To them, the basic purpose of a scientific theory is not to explain anything, but to predict the outcomes of experiments: its entire content lies in its predictive formulae. They consider that any consistent explanation that a theory may give for its predictions is as good as any other — or as good as no explanation at all — so long as the predictions are true. This view is called instrumentalism (because it says that a theory is no more than an ‘instrument’ for making predictions). To instrumentalists, the idea that science can enable us to understand the underlying reality that accounts for our observations is a fallacy and a conceit. They do not see how anything a scientific theory may say beyond predicting the outcomes of experiments can be more than empty words.
[cut a quote of Steven Weinberg clearly advocating instrumentalism. the particular explanation he says doesn’t matter is that space time is curved. space time curvature is an example of a non-predictive explanation.]
imagine that an extraterrestrial scientist has visited the Earth and given us an ultra-high-technology ‘oracle’ which can predict the outcome of any possible experiment, but provides no explanations. According to instrumentalists, once we had that oracle we should have no further use for scientific theories, except as a means of entertaining ourselves. But is that true? How would the oracle be used in practice? In some sense it would contain the knowledge necessary to build, say, an interstellar spaceship. But how exactly would that help us to build one, or to build another oracle of the same kind — or even a better mousetrap? The oracle only predicts the outcomes of experiments. Therefore, in order to use it at all we must first know what experiments to ask it about. If we gave it the design of a spaceship, and the details of a proposed test flight, it could tell us how the spaceship would perform on such a flight. But it could not design the spaceship for us in the first place. And even if it predicted that the spaceship we had designed would explode on take-off, it could not tell us how to prevent such an explosion. That would still be for us to work out. And before we could work it out, before we could even begin to improve the design in any way, we should have to understand, among other things, how the spaceship was supposed to work. Only then would we have any chance of discovering what might cause an explosion on take-off. Prediction — even perfect, universal prediction — is simply no substitute for explanation.
Similarly, in scientific research the oracle would not provide us with any new theory. Not until we already had a theory, and had thought of an experiment that would test it, could we possibly ask the oracle what would happen if the theory were subjected to that test. Thus, the oracle would not be replacing theories at all: it would be replacing experiments. It would spare us the expense of running laboratories and particle accelerators.
[cut elaboration]
The oracle would be very useful in many situations, but its usefulness would always depend on people’s ability to solve scientific problems in just the way they have to now, namely by devising explanatory theories. It would not even replace all experimentation, because its ability to predict the outcome of a particular experiment would in practice depend on how easy it was to describe the experiment accurately enough for the oracle to give a useful answer, compared with doing the experiment in reality. After all, the oracle would have to have some sort of ‘user interface’. Perhaps a description of the experiment would have to be entered into it, in some standard language. In that language, some experiments would be harder to specify than others. In practice, for many experiments the specification would be too complex to be entered. Thus the oracle would have the same general advantages and disadvantages as any other source of experimental data, and it would be useful only in cases where consulting it happened to be more convenient than using other sources. To put that another way: there already is one such oracle out there, namely the physical world. It tells us the result of any possible experiment if we ask it in the right language (i.e. if we do the experiment), though in some cases it is impractical for us to ‘enter a description of the experiment’ in the required form (i.e. to build and operate the apparatus). But it provides no explanations.
In a few applications, for instance weather forecasting, we may be almost as satisfied with a purely predictive oracle as with an explanatory theory. But even then, that would be strictly so only if the oracle’s weather forecast were complete and perfect. In practice, weather forecasts are incomplete and imperfect, and to make up for that they include explanations of how the forecasters arrived at their predictions. The explanations allow us to judge the reliability of a forecast and to deduce further predictions relevant to our own location and needs. For instance, it makes a difference to me whether today’s forecast that it will be windy tomorrow is based on an expectation of a nearby high-pressure area, or of a more distant hurricane. I would take more precautions in the latter case.
[“wind due to hurricane” and “wind due to high-pressure area” are different explanations for a particular prediction.]
So knowledge is more than just predictive because it also lets us design things?
Here’s a solution to the problem with the oracle—design a computer that inputs every possible design to the oracle and picks the best. You may object that this would be extremely time-consuming and therefore impractical. However, you don’t need to build the computer; just ask the oracle what would happen if you did.
What can we learn from this? This kind of knowledge can be seen as predictive, but only incidentally, because the computer happens to be implemented in the physical world. If it were implemented mathematically, as an abstract algorithm, we would recognize this as deductive, mathematical knowledge. But math is all about tautologies; nothing new is learned. Okay, I apologize for that. I think I’ve been changing my definition of knowledge repeatedly to include or exclude such things. I don’t really care as much about consistent definitions as I should. Hopefully it is clear from context. I’ll go back to your original question.
Would a list of predictions with no theory/explanation be good or bad, in your view?
The difference between the two cases is not the same as the crucial difference here. Having a theory as opposed to a list of predictions for every possible experiment does not necessarily make the theorems easier to prove. When it does, which is almost always, this is simply because that theory is more concise, so it is easier to deduce things from. This seems more like a matter of computing power than one of epistemology.
How does it pick the best?
According to some predetermined criteria. “How well does this spaceship fly?” “How often does it crash?” Making a computer evaluate machines is not hard in principle, and is beside the point.
And wouldn’t the oracle predict that the computer program would never halt, since it would attempt to enter infinitely many questions into the oracle?
I was assuming a finite maximum size with only finitely many distinguishable configurations in that size, but, again, this is irrelevant; whatever trick you use to make this work, you will not change the conclusions.
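Putting the proposal together, roughly this kind of procedure, as a sketch; the oracle and the scoring criteria are stand-ins I made up, and the finite candidate list reflects the finite-size assumption:

```python
# Sketch of "feed every candidate design to the oracle, keep the best".
# oracle() and score() are placeholders, not anything real.

def oracle(design):
    # Stand-in for the alien oracle: returns the predicted test outcome.
    return {"flies": len(design) % 2 == 0, "crashes_per_flight": len(design) % 7}

def score(outcome):
    # Predetermined criteria: "does it fly?", "how often does it crash?"
    return (1 if outcome["flies"] else 0, -outcome["crashes_per_flight"])

def best_design(candidates):
    return max(candidates, key=lambda d: score(oracle(d)))

print(best_design(["design-A", "design-AB", "design-ABCD"]))
```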
According to some predetermined criteria. “How well does this spaceship fly?” “How often does it crash?” Making a computer evaluate machines is not hard in principle, and is beside the point.
I think figuring out what criteria you want is an example of a non-predictive issue. That makes it not beside the point. And if the computer picks the best according to criteria we give it, they will contain mistakes. We won’t actually get the best answer. We’ll have to learn stuff and improve our knowledge all in order to set up your predictive thing. So there is this whole realm of non-predictive learning.
I was assuming a finite maximum size with only finitely many distinguishable configurations in that size,
So you make assumptions like a spaceship is a thing made out of atoms. If your understanding of physics (and therefore your assumptions) is incorrect then your use of the oracle won’t work out very well. So your ability to get useful predictions out of the oracle depends on your understanding, not just on predicting anything.
I think figuring out what criteria you want is an example of a non-predictive issue.
So I just give it my brain and tell it to do what it wants. Of course, there are missing steps, but they should be purely deductive. I believe that is what Eliezer is working on now :)
So you make assumptions like a spaceship is a thing made out of atoms. If your understanding of physics (and therefore your assumptions) is incorrect then your use of the oracle won’t work out very well.
Good point. I guess you can’t bootstrap an oracle like this; some things possible mathematically, like calculating a function over an infinity of points, just can’t be done physically. My point still stands, but this illustration definitely dies.
So I just give it my brain and tell it to do what it wants. Of course, there are missing steps, but they should be purely deductive. I believe that is what Eliezer is working on now :)
That’s it? That’s just not very impressive by my standards. Popper’s epistemology is far more advanced, already. Why do you guys reject and largely ignore it? Is it merely because Eliezer published a few sentences of nasty anti-Popper myths in an old essay?
By ‘what Eliezer is working on now’ I meant AI, which would probably be necessary to extract my desires from my brain in practice. In principle, we could just use Bayes’ theorem a lot, assuming we had precise definitions of these concepts.
Why do you guys reject and largely ignore it? Is it merely because Eliezer published a few sentences of nasty anti-Popper myths in an old essay?
Popperian epistemology is incompatible with Bayesian epistemology, which I accept from its own justification, not from a lack of any other theory. I disliked what I had heard about Popper before I started reading LessWrong, but I forget my exact argument, so I do not know if it was valid. From what I do remember, I suspect it was not.
So, you reject Popper’s ideas without having any criticism of them that you can remember?
That’s it?
You don’t care that Popper’s ideas have criticisms of Bayesian epistemology which you haven’t answered. You feel you don’t need to answer criticisms because Bayesian epistemology is self-justifying and thus all criticisms of it must be wrong. Is that it?
So, you reject Popper’s ideas without having any criticism of them that you can remember?
No, I brought up my past experience with Popper because you asked if my opinions on him came from Eliezer.
You feel you don’t need to answer criticisms because Bayesian epistemology is self-justifying and thus all criticisms of it must be wrong. Is that it?
No, I think Bayesian epistemology has been mathematically proven. I don’t spend a lot of time investigating alternatives for the same reason I don’t spend time investigating alternatives to calculus.
If you have a valid criticism, “this is wrong” or “you haven’t actually proved this” as opposed to “this has a limited domain of applicability” (actually, that could be valid if Popperian epistemology can answer a question that Bayesianism can’t), I would love to know. You did bring up some things of this type, like that paper by Popper, but none of them have logically stood up, unless I am missing something.
If Bayesian epistemology is mathematically proven, why have I been told in my discussions here various things such as: there is a regress problem which isn’t fully solved (Yudkowsky says so), that circular arguments for induction are correct, that foundationalism is correct, been linked to articles to make Bayesian points and told they have good arguments with only a little hand waving, and so on? And I’ve been told further research is being done.
It seems to me that saying it’s proven, the end, is incompatible with it having any flaws or unsolved problems or need for further research. So, which is it?
If you have a valid criticism, “this is wrong” or “you haven’t actually proved this” as opposed to “this has a limited domain of applicability” (actually, that could be valid if Popperian epistemology can answer a question that Bayesianism can’t), I would love to know.
All of the above. It is wrong b/c, e.g., it is instrumentalist (has not understood the value of explanatory knowledge) and inductivist (induction is refuted). It is incomplete b/c, e.g. it cannot deal with non-observational knowledge such as moral knowledge. You haven’t proved much to me; however, I’ve been directed to two books, so judgment there is pending.
I don’t know how you concluded that none of my arguments stood up logically. Did you really think you’d logically refuted every point? I don’t agree, I think most of your arguments were not pure logic, and I thought that various issues were pending further discussion of sub-points. As I recall, some points I raised have not been answered. I’m having several conversations in parallel so I don’t recall which in particular you didn’t address, or which were replies to you personally, but for example I quoted an argument by David Deutsch about an oracle. The replies I got about how to try to cheat the oracle did not address the substantive point of the thought experiment, and did not address the issue (discussed in the quote) that oracles have user interfaces and entering questions isn’t just free and trivial, and did not address the issue that physical reality is a predictive oracle meeting all the specified characteristics of the alien oracle (we already have an oracle and none of the replies I got about how to use the oracle would actually work with the oracle we have). As I saw it, my (quoted) points on that issue stood. The replies were some combination of incomplete and missing the point. They were also clever which is a bit of fun. I thought of what I think is a better way to try to cheat the rules, which is to ask the oracle to predict the contents of philosophy books that would be written if philosophy was studied for trillions of years by the best people. However, again, the assumption that any question which is easily described in English can be easily entered into the oracle and get a prediction was not part of the thought experiment. And the reason I hadn’t explained all this yet is that there were various other points pending anyway, so shrug, it’s hard to decide where to start when you have many different things to say.
If Bayesian epistemology is mathematically proven, why have I been told in my discussions here various things such as: there is a regress problem which isn’t fully solved (Yudkowsky says so), that circular arguments for induction are correct, that foundationalism is correct, been linked to articles to make Bayesian points and told they have good arguments with only a little hand waving, and so on? And I’ve been told further research is being done.
It is proven that the correct epistemology, meaning one that is necessary to achieve general goals, is isomorphic to Bayesianism with some prior (for beings that know all math). What that prior is requires more work. While the constraint of knowing all math is extremely unrealistic, do you agree that the theory of what knowledge would be had in such situations is a useful guide to action until we have a more general theory? Popperian epistemology cannot tell me how much money to bet at what odds for or against P = NP any more than Bayesian epistemology can, but at least Bayesian epistemology sets this as a goal.
it is instrumentalist (has not understood the value of explanatory knowledge)
oracles have user interfaces and entering questions isn’t just free and trivial, and did not address the issue that physical reality is a predictive oracle meeting all the specified characteristics of the alien oracle
This is all based on our limited mathematical ability. A theory does have an advantage over an oracle or the reality-oracle: we can read it. Would you agree that all the benefits of a theory come from this plus knowing all math? The difference is one of mathematical knowledge, not of physical knowledge. How does Popper help with this? Are there guidelines for which ‘equivalent’ formulations of a theory are mathematically better? If so, this is something that Bayesianism does not try to cover, so this may have value. However, this is unrelated to the question of the validity of “don’t assign probabilities to theories”.
inductivist (induction is refuted)
I thought I addressed this but, to recap:
p(h, eb) > p(h, b) [bottom left of page 1]
That (well and how much bigger) is all I need to make decisions.
All this means: that factor that contains all of h that does not follow deductively from e is strongly countersupported by e.
So what? I already have my new probabilities.
[T]he calculus of probability reveals that probabilistic support cannot be inductive support.
What is induction if not the calculation of new probabilities for hypotheses? Should I care about these ‘inductive truths’ that Popper disproves the existence of? I already have an algorithm to calculate the best action to take. It seems like Bayesianism isn’t inductivist by Popper’s definition.
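For reference, here is my rough reconstruction of the decomposition those quoted lines rely on, reading the paper’s p(a, b) as the conditional probability p(a|b); treat it as a sketch of the argument, not the paper itself. Any h is logically equivalent to (h ∨ e) ∧ (h ∨ ¬e); the first factor is entailed by e, and the second is the part of h that goes beyond e. For that second factor:

$$p(h \lor \lnot e \mid e) - p(h \lor \lnot e) \;=\; -\bigl(1 - p(h \mid e)\bigr)\bigl(1 - p(e)\bigr) \;\le\; 0,$$

with equality only when p(e) = 1 or p(h|e) = 1. So e never raises the probability of the part of h that goes beyond e, which is what the “countersupported” line is saying.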
moral knowledge
I’d like to be sure that we are using the same definitions of our terms, so please give an example.
You mean proven given some assumptions about what an epistemology should be, right?
Would you agree that all the benefits of a theory come from this [can read it] plus knowing all math?
No. We need explanations to understand the world. In real life, it is only when we have explanations that we can make good predictions at all. For example, suppose you have a predictive theory about dice and you want to make bets. I chose that example intentionally to engage with areas of your strength. OK, now you face the issue: does a particular real world situation have the correct attributes for my predictive theory to apply? You have to address that to know if your predictions will be correct or not. We always face this kind of problem to do much of anything. How do we figure out when our theories apply? We come up with explanations about what kinds of situations they apply to, and what situation we are in, and we then come up with explanations about why we think we are/aren’t in the right kind of situation, and we use critical argument to improve these explanations. Bayesian Epistemology does not address all this.
p(h, eb) > p(h, b) [bottom left of page 1]
I replied to that. Repeating: if you increase the probability of infinitely many theories, the problem of figuring out a good theory is not solved. So that is not all you need.
Further, I’m still waiting on an adequate answer about what support is (inductive or otherwise) and how it differs from consistency.
I gave examples of moral knowledge in another comment to you. Morality is knowledge about how to live, what is a good life. e.g. murder is immoral.
You mean proven given some assumptions about what an epistemology should be, right?
Yes, I stated my assumptions in the sentence, though I may have missed some.
We always face this kind of problem to do much of anything. How do we figure out when our theories apply?
This comes back to the distinction between one complete theory that fully specifies the universe and a set of theories that are considered to be one because we are only looking at a certain domain. In the former case, the domain of applicability is everywhere. In the latter, we have a probability distribution that tells us how likely it is to fail in every domain. So, this kind of thing is all there in the math.
I replied to that. Repeating: if you increase the probability of infinitely many theories, the problem of figuring out a good theory is not solved. So that is not all you need.
What do you mean by ‘a good theory’? Bayesians never select one theory as ‘good’ and follow that; we always consider the possibility of being wrong. When theories have higher probability than others, I guess you could call them good. I don’t see why this is hard; just calculate P(H | E) for all the theories and give more weight to the more likely ones when making decisions.
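A minimal sketch of what I mean, with made-up numbers for the priors, likelihoods, and payoffs:

```python
# Weight every theory by P(H | E) and pick the action with the highest
# posterior-weighted payoff. All numbers here are made-up placeholders.
priors = {"H1": 0.5, "H2": 0.5}
likelihoods = {"H1": 0.9, "H2": 0.3}  # P(E | H) for the evidence actually observed

p_e = sum(priors[h] * likelihoods[h] for h in priors)
posteriors = {h: priors[h] * likelihoods[h] / p_e for h in priors}  # Bayes' theorem

payoffs = {  # payoff of each action if each theory is true
    "act_a": {"H1": 1.0, "H2": -2.0},
    "act_b": {"H1": 0.2, "H2": 0.3},
}
expected = {a: sum(posteriors[h] * u for h, u in by_theory.items())
            for a, by_theory in payoffs.items()}
print(posteriors, max(expected, key=expected.get))
```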
Further, I’m still waiting on an adequate answer about what support is (inductive or otherwise) and how it differs from consistency.
Evidence supports a hypothesis if P(H | E) > P(H). Two statements, A, B, are consistent if ¬(A&B → ⊥). I think I’m missing something.
Evidence supports a hypothesis if P(H | E) > P(H). Two statements, A, B, are consistent if ¬(A&B → ⊥). I think I’m missing something.
Let’s consider only theories which make all their predictions with 100% probability for now. And theories which cover everything.
Then:
If H and E are consistent, then it follows that P(H | E) > P(H).
For any given E, consider how much greater the probability of H is, for all consistent H. That amount is identical for all H considered.
We can put all the Hs in two categories: the consistent ones which gain equal probability, and the inconsistent ones for which P(H|E) = 0. (Assumption warning: we’re relying on getting it right which H are consistent with which E.)
This means:
1) consistency and support coincide.
2) there are infinitely many equally supported theories. There are only and exactly two amounts of support that any theory has given all current evidence, one of which is 0.
3) The support concept plays no role in helping us distinguish between the theories with more than 0 support.
4) The support concept can be dropped entirely because it has no use at all. The consistency concept does everything.
5) All mention of probability can be dropped too, since it wasn’t doing anything.
6) And we still have the main problem of epistemology left over, which is dealing with the theories that aren’t refuted by evidence.
Similar arguments can be made without my initial assumptions/restrictions. For example introducing theories that make predictions with less than 100% probability will not help you because they are going to have lower probability than theories which make the same predictions with 100% probability.
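Spelling out that last step: suppose H₁ and H_q make the same prediction E, H₁ with certainty and H_q with probability q < 1. Then once E is observed,

$$\frac{P(H_q \mid E)}{P(H_1 \mid E)} \;=\; \frac{P(H_q)\, q}{P(H_1) \cdot 1},$$

so with equal priors the less-than-certain theory always ends up with the lower posterior.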
For any given E, consider how much greater the probability of H is, for all consistent H. That amount is identical for all H considered.
Well the ratio is the same, but that’s probably what you meant.
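To spell that out: under the stated assumptions any unrefuted H entails E, so

$$P(H \mid E) \;=\; \frac{P(H)\,P(E \mid H)}{P(E)} \;=\;
\begin{cases}
P(H)/P(E) & \text{if } H \text{ entails } E,\\
0 & \text{if } H \text{ contradicts } E,
\end{cases}$$

i.e. every surviving theory is rescaled by the same factor 1/P(E).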
5) All mention of probability can be dropped too, since it wasn’t doing anything.
6) And we still have the main problem of epistemology left over, which is dealing with the theories that aren’t refuted by evidence.
Have a prior. This reintroduces probabilities and deals with the remaining theories. You will converge on the right theory eventually no matter what your prior is. Of course, that does not mean that all priors are equally rational.
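A toy illustration of the convergence claim (a made-up coin example; the one caveat is that every hypothesis needs a nonzero prior):

```python
# Two hypotheses about a coin's heads-rate; data generated from the "biased"
# one. Very different priors end up at essentially the same posterior once
# there is enough evidence. The hypotheses and numbers are made up.
import random

random.seed(0)
RATES = {"fair": 0.5, "biased": 0.8}
flips = [random.random() < 0.8 for _ in range(1000)]  # True = heads

def posterior_of_biased(prior_biased):
    p = {"fair": 1.0 - prior_biased, "biased": prior_biased}
    for heads in flips:
        likelihood = {h: (r if heads else 1.0 - r) for h, r in RATES.items()}
        z = sum(p[h] * likelihood[h] for h in p)
        p = {h: p[h] * likelihood[h] / z for h in p}
    return p["biased"]

# A prior of 1% and a prior of 99% both converge on the biased hypothesis.
print(posterior_of_biased(0.01), posterior_of_biased(0.99))
```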
If they all have the same prior probability, then their probabilities are the same and stay that way. If you use a prior which arbitrarily (in my view) gives some things higher prior probabilities in a 100% non-evidence-based way, I object to that, and it’s a separate issue from support.
How does having a prior save the concept of support? Can you give an example? Maybe the one here, currently near the bottom:
If they all have the same prior probability, then their probabilities are the same and stay that way.
Well shouldn’t they? If you look at it from the perspective of making decisions rather than finding one right theory, it’s obvious that they are equiprobable and this should be recognized.
If you use a prior which arbitrarily (in my view) gives some things higher prior probabilities in a 100% non-evidence-based way, I object to that, and it’s a separate issue from support.
Solomonoff does not give “some things higher prior probabilities in a 100% non-evidence-based way”. All hypotheses have the same probability, many just make similar predictions.
It seems someone has downvoted you for not being familiar with Eliezer’s work on AI. Basically, this is overly anthropomorphic. It is one of our goals to ensure that an AI can progress from a ‘seed AI’ to a superintelligent AI without anything going wrong, but, in practice, we’ve observed that using metaphors like ‘parenting’ confuses people too much to make progress, so we avoid it.
I wasn’t using parenting as a metaphor. I meant it quite literally (only the educational part, not the diaper changing).
One of the fundamental attributes of an AI is that it’s a program which can learn new things.
Humans are also entities that learn new things.
But humans, left alone, don’t fare so well. Helping people learn is important, especially children. This avoids having everyone reinvent the wheel.
The parenting issue therefore must be addressed for AI. I am familiar with the main ideas of the kind of AI work you guys do, but I have not found the answer to this.
One possible way to address it is to say the AI will reinvent the wheel. It will have no help but just figure everything out from scratch.
Another approach would be to program some ideas into the AI (changeable, or not, or some of each), and then leave it alone with that starting point.
Another approach would be to talk with the AI, answer its questions, lecture it, etc… This is the approach humans use with their children.
Each of these approaches has various problems with it which are non-trivial to solve.
I wasn’t using parenting as a metaphor. I meant it quite literally (only the educational part, not the diaper changing).
When humans hear parenting, they think of the human parenting process. Describe the AI as ‘learning’ and the humans as ‘helping it learn’. This gets us closer to the idea of humans learning about the universe around them, rather than being raised as generic members of society.
Don’t worry about downvotes, they do not matter.
Well, the point of down votes is to discourage certain behaviour, and I agree that you should use terminology that we have found less likely to cause confusion.
This is definitely an important problem, but we’re not really at the stage where it is necessary yet. I don’t see how we could make much progress on how to get an AI to learn without knowing the algorithms that it will use to learn.
When humans hear parenting, they think of the human parenting process.
Not all humans. Not me. Is that not a bias?
Well, the point of down votes is to discourage certain behaviour
I don’t discourage without any argument being given, just on the basis of someone’s judgement without knowing the reason. I don’t think I should. I think that would be irrational. I’m surprised that this community wants to encourage people to conform to the collective opinion of others as expressed by votes.
I don’t see how we could make much progress on how to get an AI to learn without knowing the algorithms that it will use to learn.
OK, I think I see where you are coming from. However, there is only one known algorithm that learns (creates knowledge). It is, in short, evolution. We should expect an AI to use it, we shouldn’t expect a brand new solution to this hard problem (historically there have been very few candidate solutions proposed, most not at all promising).
The implementation details are not very important because the result will be universal, just like people are. This is similar to how the implementation details of universal computers are not important for many purposes.
Are you guys familiar with these concepts? There is important knowledge relevant to creating AIs which your statement seems to me to overlook.
I don’t discourage without any argument being given, just on the basis of someone’s judgement without knowing the reason. I don’t think I should. I think that would be irrational. I’m surprised that this community wants to encourage people to conform to the collective opinion of others as expressed by votes.
As a general rule, if I downvote, I either reply to the post, or it is something that should be obvious to someone who has read the main sequences.
OK, I think I see where you are coming from. However, there is only one known algorithm that learns (creates knowledge). It is, in short, evolution.
No, there is another: the brain. It is also much faster than evolution, an advantage I would want a FAI to have.
You’re conflating two things. Biological evolution is a very specific algorithm, with well-studied mathematical properties. ‘Evolution’ in general just means any change over time. You seem to be using it in an intermediate sense, as any change that proceeds through reproduction, variation, and selection, which is also a common meaning. This, however, is still very broad, so there’s very little that you can learn about an AI just from knowing “it will come up with many ideas, mostly based on previous ones, and reject most of them”. This seems less informative than “it will look at evidence and then rationally adjust its understanding”.
Why is it that you guys want to make AI but don’t study relevant topics like this?
Eliezer has studied cognitive science. Those of us not working directly with him have very little to do with AI design. Even Eliezer’s current work is slightly more background theory than AI itself.
I’m not conflating them. I did not mean “change over time”.
There are many things we can learn from evolutionary epistemology. It seeming broad to you does not prevent that. You would do better to ask what good it is instead of guessing it is no good.
For one thing it connects with meme theory.
A different example is that it explains misunderstandings when people communicate. Misunderstandings are extremely common because communication involves 1) guessing what the other person is trying to say 2) selecting between those guesses with criticism 3) making more guesses which are variants of previous guesses 4) more selection 5) etc
This explanation helps us see how easily communication can go wrong. It raises interesting issues like why so much communication doesn’t go wrong. It refutes various myths like that people absorb their teacher’s lectures a little like sponges.
It matters. And other explanations of miscommunication are worse.
Eliezer has studied cognitive science.
But that isn’t the topic I was speaking of. I meant evolutionary epistemology. Which btw I know that Eliezer has not studied much because he isn’t familiar with one of its major figures (Popper).
Evolution is a largely philosophical theory (distinct from the scientific theory about the history of life on earth). It is a theory of epistemology. Some parts of epistemology technically depend on the laws of physics, but it is generally researched separately from physics. There has not been any science experiment to test it which I consider important, but I could conceive of some, because if you specified different and perverse laws of physics you could break evolution. In a different sense, evolution is tested constantly in that the laws of physics and evidence we see around us, every day, are not that perverse, unlike conceivable physics that would break evolution.
The reason I accept evolution (again I refer to the epistemological theory about how knowledge is created) is that it is a good explanation, and it solves an important philosophical problem, and I don’t know anything wrong with it, and I also don’t know any rivals which solve the problem.
The problem has a long history. Where does “apparent design” come from? Paley gave an example of finding a watch in nature, which he said you know can’t have gotten there by chance. That’s correct—the watch has knowledge (aka apparent design, or purposeful complexity, or many other terms). The watch is adapted “to a purpose” as some people put it (I’m not really a fan of the purpose terminology. But it’s adapted! And I think it gets the point across ok.)
Paley then guessed as follows: there is no possible solution to the origins of knowledge other than “A designer (God) created it”. This is a very bad solution even pre-Darwin because it does not actually solve the problem. The designer itself has knowledge, adaptation to a purpose, whatever. So where did it come from? The origin is not answered.
Since then, the problem has been solved by the theory of evolution and nothing else. And it applies to more than just watches found in nature, and to plants and animals. It also applies to human knowledge. The answer “intelligence did it” is no better than “God did it”. How does intelligence do it? The only known answer is: by evolution.
The best thing to read on this topic is The Beginning of Infinity by David Deutsch which discusses Popperian epistemology, evolution, Paley’s problem and its solution, and also has two chapters about meme theory which give important applications.
Also here: http://fallibleideas.com/tradition (Deutsch discusses static and dynamic memes and societies. I discuss “traditions” rather than “memes”. It’s quite similar stuff.)
Evolution is a largely philosophical theory (distinct from the scientific theory about the history of life of earth). It is a theory of epistemology. Some parts of epistemology technically depend on the laws of physics, but it is general researched separately from physics.
What? Epistemological evolution seems to be about how the mind works, independent of what philosophical status is accorded to the thoughts. Surely it could be tested just by checking if the mind actually develops ideas in accordance with the way it is predicted to.
If you want to check how minds work, you could do that. But that’s very hard. We’re not there yet. We don’t know how.
How minds work is a separate issue from evolutionary epistemology. Epistemology is about how knowledge is created (in abstract, not in human minds specifically). If it turns out there is another way, it wouldn’t upset the claim that evolution would create knowledge if done in minds.
There’s no reason to think there is another way. No argument that there is. No explanation of why to expect there to be. No promising research on the verge of working one out. Shrug.
Epistemology is about how knowledge is created (in abstract, not in human minds specifically).
I see. I thought that evolutionary epistemology was a theory of human minds, though I know that that technically isn’t epistemology. Does evolutionary epistemology describe knowledge about the world, mathematical knowledge, or both (I suspect you will say both)?
So, you’re saying that in order to create knowledge, there has to be copying, variation, and selection. I would agree with the first two, but not necessarily the third. Consider a formal axiomatic system. It produces an ever-growing list of theorems, but none are ever selected any more than others. Would you still consider this system to be learning?
With deduction, all the consequences are already contained in the premises and axioms. Abstractly, that’s not learning.
When human mathematicians do deduction, they do learn stuff, because they also think about stuff while doing it, they don’t just mechanically and thoughtlessly follow the rules of math.
So induction (or probabilistic updating, since you said that Popper proved it not to be the same as whatever philosophers call ‘induction’) isn’t learning either because the conclusions are contained in the priors and observations?
If the axiomatic system was physically implemented in a(n ever-growing) computer, would you consider this learning?
the idea of induction is that the conclusions are NOT logically contained in the observations (that’s why it is not deduction).
if you make up a prior from which everything deductively follows, and everything else is mere deduction from there, then all of your problems and mistakes are in the prior.
If the axiomatic system was physically implemented in a(n ever-growing) computer, would you consider this learning?
no. learning is creating new knowledge. that would simply be human programmers putting their own knowledge into a prior, and then the machine not creating any new knowledge that wasn’t in the prior.
The correct method of updating one’s probability distributions is contained in the observations. P(H|E) = P(H)P(E|H)/P(E).
If the axiomatic system was physically implemented in a(n ever-growing) computer, would you consider this learning?
no. learning is creating new knowledge. that would simply be human programmers putting their own knowledge into a prior, and then the machine not creating any new knowledge that wasn’t in the prior.
So how could evolutionary epistemology be relevant to AI design?
AIs are programs that create knowledge. That means they need to do evolution. That means they need, roughly, a conjecture generator, a criticism generator, and a criticism evaluator. The conjecture generator might double as the criticism generator since a criticism is a type of conjecture, but it might not.
The conjectures need to be based on the previous conjectures (not necessarily all of them, but some). That makes it replication with variation. The criticism is the selection.
Any AI design that completely ignores this is, imo, hopeless. I think that’s why the AI field hasn’t really gotten anywhere. They don’t understand what they are trying to make, because they have the wrong philosophy (in particular the wrong explanations. i don’t mean math or logic).
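To make the shape of that concrete, a deliberately trivial sketch; the placeholder variation operator and the length-based “criticism” are obviously not serious, the point is just the loop structure of conjecture, criticism, and selection:

```python
# Minimal sketch of the loop described above: conjectures generated by varying
# previous conjectures, criticism acting as selection. The three components
# here are toy placeholders; building real ones is the open problem.
import random

random.seed(0)

def generate_variant(conjecture):           # conjecture generator: replication with variation
    return conjecture + random.choice(["-a", "-b", "-c"])

def criticize(conjecture):                  # criticism generator (toy criterion)
    return ["too long"] if len(conjecture) > 12 else []

def survives(conjecture, criticisms):       # criticism evaluator
    return not criticisms

def evolve(seed, generations=10):
    pool = [seed]
    for _ in range(generations):
        candidates = pool + [generate_variant(c) for c in pool]
        survivors = [c for c in candidates if survives(c, criticize(c))]
        pool = sorted(set(survivors)) or pool   # keep unrefuted conjectures, deduplicated
    return pool

print(evolve("idea"))
```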
AIs are programs that create knowledge. That means they need to do evolution. That means they need, roughly, a conjecture generator, a criticism generator, and a criticism evaluator. The conjecture generator might double as the criticism generator since a criticism is a type of conjecture, but it might not.
Note that there are AI approaches which do do something close to what you think an AI “needs”. For example, some of Simon Colton’s work can be thought of in a way roughly like what you want. But it is a mistake to think that such an entity needs to do that. (Some of the hardcore Bayesians make the same mistake in assuming that an AI must use a Bayesian framework. That something works well as a philosophical approach is not the same claim as that it should work well in a specific setting where we want an artificial entity to produce certain classes of systematic reliable results.)
Those aren’t AIs. They do not create new knowledge. They do not “learn” in my sense—of doing more than they were programmed to. All the knowledge is provided by the human programmer—they are designed by an intelligent person and to the extent they “act intelligent” it’s all due to the person providing the thinking for it.
Those aren’t AIs. They do not create new knowledge. They do not “learn” in my sense—of doing more than they were programmed to.
I’m not sure this is at all well-defined. I’m curious, what would make you change your mind? If for example, Colton’s systems constructed new definitions, proofs, conjectures, and counter-examples in math would that be enough to decide they were learning?
Could you explain how this is connected to the issue of making new knowledge?
Or: show me the code, and explain to me how it works, and how the code doesn’t contain all the knowledge the AI creates.
This seems a bit like showing a negative. I will suggest you look for a start at Simon Colton’s paper in the Journal of Integer Sequences which uses a program that operates in a way very close to the way you think an AI would need to operate in terms of making conjectures and trying to refute them. I don’t know if the source code is easily available. It used to be on Colton’s website but I don’t see it there anymore; if his work seems at all interesting to you you can presumably email him requesting a copy. I don’t know how to show that the AI “doesn’t contain all the knowledge the AI creates” aside from the fact that the system constructed concepts and conjectures in number theory which had not previously been constructed. Moreover, Colton’s own background in number theory is not very heavy, so it is difficult to claim that he’s importing his own knowledge into the code. If you define more precisely what you mean by the code containing the knowledge I might be able to answer that further. Without a more precise notion it isn’t clear to me how to respond.
Holding a conversation requires creating knowledge of what the other guy is saying.
In deduction, you agree that the conclusions are logically contained in the premises and axioms, right? They aren’t something new.
In a spam filter, a programmer figures out how he wants spam filtered (he has the idea), then he tells the computer to do it. The computer doesn’t figure out the idea or any new idea.
With biological evolution, for example, we see something different. You get stuff out, like cats, which weren’t specified in advance. And they aren’t a trivial extension; they contain important knowledge such as the knowledge of optics that makes their eyes work. This is why “Where can cats come from?” has been considered an important question (people want an explanation of the knowledge, which is sometimes called “apparent design”), while “Where can rocks come from?” is not in the same category of question (it does have some interest for other reasons).
With people, people create ideas that aren’t in their genes, and weren’t told to them by their parents or anyone else. That includes abstract ideas that aren’t the summation of observation. They sometimes create ideas no one ever thought of before. They create new ideas.
An AI (AGI you call it?) should be like a person: it should create new ideas which are not in its “genes” (programming). If someone actually writes an AI they will understand how it works and they can explain it, and we can use their explanation to judge whether they “cheated” or not (whether they, e.g., hard coded some ideas into the program and then said the AI invented them).
In deduction, you agree that the conclusions are logically contained in the premises and axioms, right? They aren’t something new.
Ok. So to make sure I understand this claim. You are asserting that mathematicians are not constructing anything “new” when they discover proofs or theorems in set axiomatic systems?
With biological evolution, for example, we see something different. You get stuff out, like cats, which weren’t specified in advance. And they aren’t a trivial extension;
Are genetic algorithm systems then creating something new by your definition?
In an AI (AGI you call it?)
Different concepts. An artificial intelligence is not (necessarily) a well-defined notion. An AGI is an artificial general intelligence, essentially something that passes the Turing test. Not the same concept.
If someone actually writes an AI they will understand how it works and they can explain it, and we can use their explanation to judge whether they “cheated” or not (whether they, e.g., hard coded some ideas into the program and then said the AI invented them).
I see no reason to assume that a person will necessarily understand how an AGI they constructed works. To use the most obvious hypothetical, someone might make a neural net modeled very closely after the human brain that functions as an AGI without any understanding of how it works.
Ok. So to make sure I understand this claim. You are asserting that mathematicians are not constructing anything “new” when they discover proofs or theorems in set axiomatic systems?
When you “discover” that 2+1 = 3, given premises and axioms, you aren’t discovering something new.
But working mathematicians do more than that. They create new knowledge. It includes:
1) they learn new ways to think about the premises and axioms
2) they do not publish deductively implied facts unselectively or randomly. they choose the ones that they consider important. by making these choices they are adding content not found in the premises and axioms
3) they make choices between different possible proofs of the same thing. again where they make choices they are adding stuff, based on their own non-deductive understanding
4) when mathematicians work on proofs, they also think about stuff as they go. just like when experimental scientists do fairly mundane tasks in a lab, at the same time they will think and make it interesting with their thoughts.
Are genetic algorithm systems then creating something new by your definition?
They could be. I don’t think any exist yet that do. For example I read a Dawkins paper about one. In the paper he basically explained how he tweaked the code in order to get the results he wanted. He didn’t, apparently, realize that it was him, not the program, creating the output.
By “AI” I mean AGI. An intelligence (like a person) which is artificial. Please read all my prior statements in light of that.
I see no reason to assume that a person will necessarily understand how an AGI they constructed works. To use the most obvious hypothetical, someone might make a neural net modeled very closely after the human brain that functions as an AGI without any understanding of how it works.
Well, OK, but they’d understand how it was created, and could explain that. They could explain what they know about why it works (it copies what humans do). And they could also make the code public and discuss what it doesn’t include (e.g. hard coded special cases. except for the 3 he included on purpose, and he explains why they are there). That’d be pretty convincing!
I don’t think this is true. While he probably wouldn’t announce it if he was working on AI, he has indicated that he’s working on two books (HPMoR and a rationality book), and has another book queued. He’s also indicated that he doesn’t think anyone should work on AI until the goal system stability problem is solved, which he’s talked about thinking about but hasn’t published anything on, which probably means he’s stuck.
I more meant “he’s probably thinking about this in the back of his mind fairly often”, as well as trying to be humourous.
He’s also indicated that he doesn’t think anyone should work on AI until the goal system stability problem is solved, which he’s talked about thinking about but hasn’t published anything on, which probably means he’s stuck.
Do you know what he would think of work that has a small chance of solving goal stability and a slightly larger chance of helping with AI in general? This seems like a net plus to me, but you seem to have heard what he thinks should be studied from a slightly clearer source than I did.
No, it has tons of theory. God is a very complex concept. Note that ‘God did everything’ is more complex and therefore less likely than ‘everything happened’. Did you read http://lesswrong.com/lw/jp/occams_razor/ ?
How do you figure God is complex? God as I mean it simply can do anything, no reason given. That is its only attribute: that it arbitrarily does anything the theory its attached to cares to predict. We can even stop calling it “God”. We could even not mention it at all so there is no theory and merely give a list of predictions. Would that be good, in your view?
If ‘God’ is meaningless and can merely be attached to any theory, then the theory is the same with and without God. There is nothing to refute, since there is no difference. If you defined ‘God’ to mean a being who created all species or who commanded a system of morality, I would have both reason to care about and means to refute God. If you defined ‘God’ to mean ‘quantum physics’, there would be applications and means of proving that ‘God’ is a good approximation, but this definition is nonsensical, since it is not what is usually meant by ‘God. If the theory of ‘God’ has no content, there is nothing to discuss, but the is again a very unusual definition.
Would a list of predictions with no theory/explanation be good or bad, in your view?
If there is no simpler description, then a list of predictions is better but, if an explanation simpler then merely a list of prediction is at all possible, then that would be more likely.
How do you decide if an explanation is simpler than a list of predictions? Are you thinking in terms of data compression?
Do you understand that the content of an explanation is not equivalent to the predictions it makes? It offers a different kind of thing than just predictions.
Essentially. It is simpler if it has a higher Solomonoff prior.
Yes, there is more than just predictions. However, prediction are the only things that tell us how to update our probability distributions.
So, your epistemology is 100% instrumentalist and does not deal with non-predictive knowledge at all?
Can you give an example of non-predictive knowledge and what role it should play?
Quoting from The Fabric of Reality, chapter 1, by David Deutsch.
Yet some philosophers — and even some scientists — disparage the role of explanation in science. To them, the basic purpose of a scientific theory is not to explain anything, but to predict the outcomes of experiments: its entire content lies in its predictive formulae. They consider that any consistent explanation that a theory may give for its predictions is as good as any other — or as good as no explanation at all — so long as the predictions are true. This view is called instrumentalism (because it says that a theory is no more than an ‘instrument’ for making predictions). To instrumentalists, the idea that science can enable us to understand the underlying reality that accounts for our observations is a fallacy and a conceit. They do not see how anything a scientific theory may say beyond predicting the outcomes of experiments can be more than empty words.
[cut a quote of Steven Weinberg clearly advocating instrumentalism. the particular explanation he says doesn’t matter is that space time is curved. space time curvature is an example of a non-predictive explanation.]
imagine that an extraterrestrial scientist has visited the Earth and given us an ultra-high-technology ‘oracle’ which can predict the outcome of any possible experiment, but provides no explanations. According to instrumentalists, once we had that oracle we should have no further use for scientific theories, except as a means of entertaining ourselves. But is that true? How would the oracle be used in practice? In some sense it would contain the knowledge necessary to build, say, an interstellar spaceship. But how exactly would that help us to build one, or to build another oracle of the same kind — or even a better mousetrap? The oracle only predicts the outcomes of experiments. Therefore, in order to use it at all we must first know what experiments to ask it about. If we gave it the design of a spaceship, and the details of a proposed test flight, it could tell us how the spaceship would perform on such a flight. But it could not design the spaceship for us in the first place. And even if it predicted that the spaceship we had designed would explode on take-off, it could not tell us how to prevent such an explosion. That would still be for us to work out. And before we could work it out, before we could even begin to improve the design in any way, we should have to understand, among other things, how the spaceship was supposed to work. Only then would we have any chance of discovering what might cause an explosion on take-off. Prediction — even perfect, universal prediction — is simply no substitute for explanation.
Similarly, in scientific research the oracle would not provide us with any new theory. Not until we already had a theory, and had thought of an experiment that would test it, could we possibly ask the oracle what would happen if the theory were subjected to that test. Thus, the oracle would not be replacing theories at all: it would be replacing experiments. It would spare us the expense of running laboratories and particle accelerators.
[cut elaboration]
The oracle would be very useful in many situations, but its usefulness would always depend on people’s ability to solve scientific problems in just the way they have to now, namely by devising explanatory theories. It would not even replace all experimentation, because its ability to predict the outcome of a particular experiment would in practice depend on how easy it was to describe the experiment accurately enough for the oracle to give a useful answer, compared with doing the experiment in reality. After all, the oracle would have to have some sort of ‘user interface’. Perhaps a description of the experiment would have to be entered into it, in some standard language. In that language, some experiments would be harder to specify than others. In practice, for many experiments the specification would be too complex to be entered. Thus the oracle would have the same general advantages and disadvantages as any other source of experimental data, and it would be useful only in cases where consulting it happened to be more convenient than using other sources. To put that another way: there already is one such oracle out there, namely the physical world. It tells us the result of any possible experiment if we ask it in the right language (i.e. if we do the experiment), though in some cases it is impractical for us to ‘enter a description of the experiment’ in the required form (i.e. to build and operate the apparatus). But it provides no explanations.
In a few applications, for instance weather forecasting, we may be almost as satisfied with a purely predictive oracle as with an explanatory theory. But even then, that would be strictly so only if the oracle’s weather forecast were complete and perfect. In practice, weather forecasts are incomplete and imperfect, and to make up for that they include explanations of how the forecasters arrived at their predictions. The explanations allow us to judge the reliability of a forecast and to deduce further predictions relevant to our own location and needs. For instance, it makes a difference to me whether today’s forecast that it will be windy tomorrow is based on an expectation of a nearby high-pressure area, or of a more distant hurricane. I would take more precautions in the latter case.
[“wind due to hurricane” and “wind due to high-pressure area” are different explanations for a particular prediction.]
So knowledge is more than just predictive because it also lets us design things?
Here’s a solution to the problem with the oracle—design a computer that inputs every possible design to the oracle and picks the best. You may object that this would be extremely time-consuming and therefore impractical. However, you don’t need to build the computer; just ask the oracle what would happen if you did.
What can we learn from this? This kind of knowledge can be seen as predictive, but only incidentally, because the computer happens to be implemented in the physical world. If it were implemented mathematically, as an abstract algorithm, we would recognize this as deductive, mathematical knowledge. But math is all about tautologies; nothing new is learned. Okay, I apologize for that. I think I’ve been changing my definition of knowledge repeatedly to include or exclude such things. I don’t really care as much about consistent definitions as I should. Hopefully it is clear from context. I’ll go back to your original question.
The difference between the two cases is not the crucial one here. Having a theory, as opposed to a list of predictions for every possible experiment, does not necessarily make the theorems easier to prove. When it does, which is almost always, it is simply because the theory is more concise, so it is easier to deduce things from. This seems more like a matter of computing power than of epistemology.
How does it pick the best?
And wouldn’t the oracle predict that the computer program would never halt, since it would attempt to enter infinitely many questions into the oracle?
According to some predetermined criteria. “How well does this spaceship fly?” “How often does it crash?” Making a computer evaluate machines is not hard in principle, and is beside the point.
I was assuming a finite maximum size with only finitely many distinguishable configurations in that size, but, again, this is irrelevant; whatever trick you use to make this work, you will not change the conclusions.
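For concreteness, here is a schematic sketch of the brute-force scheme being discussed, under the finite-size assumption above. The oracle_predict and score functions are invented stand-ins (the real oracle is hypothetical, and the scoring rule is a toy), so this only makes the structure of the proposal explicit, not any workable design.

```python
from itertools import product

DESIGN_BITS = 12  # assumed finite maximum size, as discussed above

def oracle_predict(design):
    """Stand-in for the alien oracle: given a fully specified design and test,
    report the outcome. Here it's a toy rule (count of 1-bits), purely so the
    loop below runs; the real oracle is hypothetical."""
    return sum(design)

def score(outcome):
    """The predetermined criteria, e.g. 'how well does it fly?'.
    Choosing these criteria is itself non-predictive work."""
    return outcome

def best_design():
    best, best_score = None, float("-inf")
    # Enumerate every distinguishable configuration up to the size limit
    # and ask the oracle how each one would perform on the proposed test.
    for design in product([0, 1], repeat=DESIGN_BITS):
        s = score(oracle_predict(design))
        if s > best_score:
            best, best_score = design, s
    return best

print(best_design())  # with this toy oracle: the all-ones design
```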
I think figuring out what criteria you want is an example of a non-predictive issue. That makes it not beside the point. And if the computer picks the best according to criteria we give it, those criteria will contain mistakes. We won’t actually get the best answer. We’ll have to learn things and improve our knowledge just to set up your predictive scheme. So there is this whole realm of non-predictive learning.
So you make assumptions, like that a spaceship is a thing made out of atoms. If your understanding of physics (and therefore your assumptions) is incorrect, then your use of the oracle won’t work out very well. So your ability to get useful predictions out of the oracle depends on your understanding, not just on prediction.
So I just give it my brain and tell it to do what it wants. Of course, there are missing steps, but they should be purely deductive. I believe that is what Eliezer is working on now :)
Good point. I guess you can’t bootstrap an oracle like this; some things that are possible mathematically, like calculating a function over an infinity of points, just can’t be done physically. My point still stands, but this illustration definitely dies.
That’s it? That’s just not very impressive by my standards. Popper’s epistemology is far more advanced, already. Why do you guys reject and largely ignore it? Is it merely because Eliezer published a few sentences of nasty anti-Popper myths in an old essay?
By ‘what Eliezer is working on now’ I meant AI, which would probably be necessary to extract my desires from my brain in practice. In principle, we could just use Bayes’ theorem a lot, assuming we had precise definitions of these concepts.
Popperian epistemology is incompatible with Bayesian epistemology, which I accept from its own justification, not from a lack of any other theory. I disliked what I had heard about Popper before I started reading LessWrong, but I forget my exact argument, so I do not know if it was valid. From what I do remember, I suspect it was not.
So, you reject Popper’s ideas without having any criticism of them that you can remember?
That’s it?
You don’t care that Popper’s ideas have criticisms of Bayesian epistemology which you haven’t answered. You feel you don’t need to answer criticisms because Bayesian epistemology is self-justifying and thus all criticisms of it must be wrong. Is that it?
No, I brought up my past experience with Popper because you asked if my opinions on him came from Eliezer.
No, I think Bayesian epistemology has been mathematically proven. I don’t spend a lot of time investigating alternatives for the same reason I don’t spend time investigating alternatives to calculus.
If you have a valid criticism, “this is wrong” or “you haven’t actually proved this” as opposed to “this has a limited domain of applicability” (actually, that could be valid if Popperian epistemology can answer a question that Bayesianism can’t), I would love to know. You did bring up some things of this type, like that paper by Popper, but none of them have logically stood up, unless I am missing something.
If Bayesian epistemology is mathematically proven, why have I been told various things in my discussions here, such as: that there is a regress problem which isn’t fully solved (Yudkowsky says so), that circular arguments for induction are correct, and that foundationalism is correct? Why have I been linked to articles making Bayesian points and told they have good arguments with only a little hand waving, and so on? And I’ve been told further research is being done.
It seems to me that saying it’s proven, the end, is incompatible with it having any flaws or unsolved problems or need for further research. So, which is it?
All of the above. It is wrong b/c, e.g., it is instrumentalist (it has not understood the value of explanatory knowledge) and inductivist (induction is refuted). It is incomplete b/c, e.g., it cannot deal with non-observational knowledge such as moral knowledge. You haven’t proved much to me; however, I’ve been directed to two books, so judgment there is pending.
I don’t know how you concluded that none of my arguments stood up logically. Did you really think you’d logically refuted every point? I don’t agree: I think most of your arguments were not pure logic, and I thought that various issues were pending further discussion of sub-points. As I recall, some points I raised have not been answered. I’m having several conversations in parallel, so I don’t recall which in particular you didn’t address and which were replies to you personally, but for example I quoted an argument by David Deutsch about an oracle.

The replies I got about how to try to cheat the oracle did not address the substantive point of the thought experiment. They did not address the issue (discussed in the quote) that oracles have user interfaces and that entering questions isn’t free and trivial. And they did not address the issue that physical reality is a predictive oracle meeting all the specified characteristics of the alien oracle: we already have an oracle, and none of the replies I got about using the oracle would actually work with the oracle we have. As I saw it, my (quoted) points on that issue stood. The replies were some combination of incomplete and missing the point. They were also clever, which is a bit of fun.

I thought of what I think is a better way to try to cheat the rules, which is to ask the oracle to predict the contents of philosophy books that would be written if philosophy were studied for trillions of years by the best people. However, again, the assumption that any question which is easily described in English can be easily entered into the oracle and get a prediction was not part of the thought experiment. And the reason I hadn’t explained all this yet is that there were various other points pending anyway, so, shrug, it’s hard to decide where to start when you have many different things to say.
It is proven that the correct epistemology, meaning one that is necessary to achieve general goals, is isomorphic to Bayesianism with some prior (for beings that know all math). What that prior is requires more work. While the constraint of knowing all math is extremely unrealistic, do you agree that the theory of what knowledge would be had in such situations is a useful guide to action until we have a more general theory? Popperian epistemology cannot tell me how much money to bet at what odds for or against P = NP any more than Bayesian epistemology can, but at least Bayesian epistemology sets this as a goal.
This is all based on our limited mathematical ability. A theory does have an advantage over an oracle or the reality-oracle: we can read it. Would you agree that all the benefits of a theory come from this plus knowing all math? The difference is one of mathematical knowledge, not of physical knowledge. How does Popper help with this? Are there guidelines for which ‘equivalent’ formulations of a theory are mathematically better? If so, this is something that Bayesianism does not try to cover, so it may have value. However, it is unrelated to the question of the validity of “don’t assign probabilities to theories”.
I thought I addressed this but, to recap:
That (well and how much bigger) is all I need to make decisions.
So what? I already have my new probabilities.
What is induction if not the calculation of new probabilities for hypotheses? Should I care about these ‘inductive truths’ that Popper disproves the existence of? I already have an algorithm to calculate the best action to take. It seems like Bayesianism isn’t inductivist by Popper’s definition.
I’d like to be sure that we are using the same definitions of our terms, so please give an example.
You mean proven given some assumptions about what an epistemology should be, right?
No. We need explanations to understand the world. In real life, it is only when we have explanations that we can make good predictions at all. For example, suppose you have a predictive theory about dice and you want to make bets. I chose that example intentionally to engage with areas of your strength. OK, now you face the issue: does a particular real-world situation have the correct attributes for my predictive theory to apply? You have to address that to know if your predictions will be correct or not. We always face this kind of problem to do much of anything. How do we figure out when our theories apply? We come up with explanations about what kinds of situations they apply to, and what situation we are in, and then we come up with explanations about why we think we are or aren’t in the right kind of situation, and we use critical argument to improve these explanations. Bayesian epistemology does not address all this.
I replied to that. Repeating: if you increase the probability of infinitely many theories, the problem of figuring out a good theory is not solved. So that is not all you need.
Further, I’m still waiting on an adequate answer about what support is (inductive or otherwise) and how it differs from consistency.
I gave examples of moral knowledge in another comment to you. Morality is knowledge about how to live, what is a good life. e.g. murder is immoral.
Yes, I stated my assumptions in the sentence, though I may have missed some.
This comes back to the distinction between one complete theory that fully specifies the universe and a set of theories that are considered to be one because we are only looking at a certain domain. In the former case, the domain of applicability is everywhere. In the latter, we have a probability distribution that tells us how likely it is to fail in every domain. So, this kind of thing is all there in the math.
What do you mean by ‘a good theory’? Bayesians never select one theory as ‘good’ and follow it; we always consider the possibility of being wrong. When theories have higher probability than others, I guess you could call them good. I don’t see why this is hard; just calculate P(H | E) for all the theories and give more weight to the more likely ones when making decisions.
Evidence supports a hypothesis if P(H | E) > P(H). Two statements, A, B, are consistent if ¬(A&B → ⊥). I think I’m missing something.
Let’s consider only theories which make all their predictions with 100% probability for now. And theories which cover everything.
Then:
If H and E are consistent, then it follows that P(H | E) > P(H).
For any given E, consider how much greater the probability of H is, for all consistent H. That amount is identical for all H considered.
We can put all the Hs in two categories: the consistent ones which gain equal probability, and the inconsistent ones for which P(H|E) = 0. (Assumption warning: we’re relying on getting it right which H are consistent with which E.)
This means:
1) consistency and support coincide.
2) there are infinitely many equally supported theories. Given all current evidence, there are exactly two amounts of support that any theory can have, one of which is 0.
3) The support concept plays no role in helping us distinguish between the theories with more than 0 support.
4) The support concept can be dropped entirely because it has no use at all. The consistency concept does everything.
5) All mention of probability can be dropped too, since it wasn’t doing anything.
6) And we still have the main problem of epistemology left over, which is dealing with the theories that aren’t refuted by evidence.
Similar arguments can be made without my initial assumptions/restrictions. For example, introducing theories that make predictions with less than 100% probability will not help, because they will have lower probability than theories which make the same predictions with 100% probability.
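A small numerical sketch of the argument above, assuming a toy set of mutually exclusive, deterministic hypotheses (the names H1–H4 and the numbers are made up): every hypothesis consistent with the evidence has its probability multiplied by the same factor, 1/P(E), and every inconsistent one drops to zero.

```python
# Priors over mutually exclusive, deterministic hypotheses (toy numbers).
priors = {"H1": 0.4, "H2": 0.3, "H3": 0.2, "H4": 0.1}
# Which hypotheses predict (are consistent with) the observed evidence E.
consistent_with_E = {"H1", "H2", "H3"}

# P(E) = sum of priors of hypotheses that predict E, since P(E|H) is 1 or 0.
p_E = sum(p for h, p in priors.items() if h in consistent_with_E)

posteriors = {
    h: (p / p_E if h in consistent_with_E else 0.0)
    for h, p in priors.items()
}

for h in priors:
    ratio = posteriors[h] / priors[h]
    print(h, round(posteriors[h], 3), round(ratio, 3))  # ratio is 1/P(E) for every surviving H
```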
Well the ratio is the same, but that’s probably what you meant.
Have a prior. This reintroduces probabilities and deals with the remaining theories. You will converge on the right theory eventually no matter what your prior is. Of course, that does not mean that all priors are equally rational.
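A small simulation of the convergence claim, under the usual caveats (the hypotheses make different predictions and the true one has nonzero prior weight); the candidate biases, the lopsided prior, and the number of flips are all made up for illustration.

```python
import random

random.seed(0)

# Three candidate biases for a coin; suppose the true bias is 0.5.
hypotheses = [0.2, 0.5, 0.8]
true_bias = 0.5

# A deliberately lopsided prior that puts almost no weight on the truth.
posterior = {0.2: 0.98, 0.5: 0.01, 0.8: 0.01}

for _ in range(1000):
    flip = random.random() < true_bias  # observe one coin flip (True = heads)
    # Bayes: multiply each hypothesis by the likelihood of the observation.
    for h in hypotheses:
        posterior[h] *= h if flip else (1 - h)
    total = sum(posterior.values())
    posterior = {h: p / total for h, p in posterior.items()}

print(posterior)  # the weight piles up on the 0.5 hypothesis despite the prior
```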
If they all have the same prior probability, then their probabilities are the same and stay that way. If you use a prior which arbitrarily (in my view) gives some things higher prior probabilities in a 100% non-evidence-based way, I object to that, and it’s a separate issue from support.
How does having a prior save the concept of support? Can you give an example? Maybe the one here, currently near the bottom:
http://lesswrong.com/lw/54u/bayesian_epistemology_vs_popper/3urr?context=3
Well shouldn’t they? If you look at it from the perspective of making decisions rather than finding one right theory, it’s obvious that they are equiprobable and this should be recognized.
Solomonoff does not give “some things higher prior probabilities in a 100% non-evidence-based way”. All hypotheses have the same probability, many just make similar predictions.
Is anyone here working on the problem of parenting/educating AIs?
It seems someone has downvoted you for not being familiar with Eliezer’s work on AI. Basically, this is overly anthropomorphic. It is one of our goals to ensure that an AI can progress from a ‘seed AI’ to a superintelligent AI without anything going wrong, but, in practice, we’ve observed that using metaphors like ‘parenting’ confuses people too much to make progress, so we avoid it.
Don’t worry about downvotes, they do not matter.
I wasn’t using parenting as a metaphor. I meant it quite literally (only the educational part, not the diaper changing).
One of the fundamental attributes of an AI is that it’s a program which can learn new things.
Humans are also entities that learn new things.
But humans, left alone, don’t fare so well. Helping people learn is important, especially children. This avoids having everyone reinvent the wheel.
The parenting issue therefore must be addressed for AI. I am familiar with the main ideas of the kind of AI work you guys do, but I have not found the answer to this.
One possible way to address it is to say the AI will reinvent the wheel. It will have no help but just figure everything out from scratch.
Another approach would be to program some ideas into the AI (changeable, or not, or some of each), and then leave it alone with that starting point.
Another approach would be to talk with the AI, answer its questions, lecture it, etc… This is the approach humans use with their children.
Each of these approaches has various problems with it which are non-trivial to solve.
Make sense so far?
When humans hear ‘parenting’, they think of the human parenting process. Describe the AI as ‘learning’ and the humans as ‘helping it learn’. This gets us closer to the idea of humans learning about the universe around them, rather than being raised as generic members of society.
Well, the point of downvotes is to discourage certain behaviour, and I agree that you should use terminology that we have found less likely to cause confusion.
AIs don’t necessarily have so much of a problem with this. They learn very differently than humans: http://lesswrong.com/lw/jo/einsteins_arrogance/ , http://lesswrong.com/lw/qj/einsteins_speed/ , http://lesswrong.com/lw/qk/that_alien_message/
This is definitely an important problem, but we’re not really at the stage where it is necessary yet. I don’t see how we could make much progress on how to get an AI to learn without knowing the algorithms that it will use to learn.
Not all humans. Not me. Is that not a bias?
I’m not discouraged when no argument is given, just on the basis of someone’s judgement, without knowing the reason. I don’t think I should be. I think that would be irrational. I’m surprised that this community wants to encourage people to conform to the collective opinion of others as expressed by votes.
OK, I think I see where you are coming from. However, there is only one known algorithm that learns (creates knowledge). It is, in short, evolution. We should expect an AI to use it; we shouldn’t expect a brand new solution to this hard problem (historically there have been very few candidate solutions proposed, most not at all promising).
The implementation details are not very important because the result will be universal, just like people are. This is similar to how the implementation details of universal computers are not important for many purposes.
Are you guys familiar with these concepts? There is important knowledge relevant to creating AIs which your statement seems to me to overlook.
Yes, that would be a bias. Note that this kind of bias is not always explicitly noticed.
As a general rule, if I downvote, I either reply to the post, or it is something that should be obvious to someone who has read the main sequences.
No, there is another: the brain. It is also much faster than evolution, an advantage I would want an FAI to have.
You are unfamiliar with the basic concepts of evolutionary epistemology. The brain internally does evolution of ideas.
Why is it that you guys want to make AI but don’t study relevant topics like this?
You’re conflating two things. Biological evolution is a very specific algorithm, with well-studied mathematical properties. ‘Evolution’ in general just means any change over time. You seem to be using it in an intermediate sense, as any change that proceeds through reproduction, variation, and selection, which is also a common meaning. This, however, is still very broad, so there’s very little that you can learn about an AI just from knowing “it will come up with many ideas, mostly based on previous ones, and reject most of them”. This seems less informative than “it will look at evidence and then rationally adjust its understanding”.
There’s an article related to this: http://lesswrong.com/lw/l6/no_evolutions_for_corporations_or_nanodevices/
Eliezer has studied cognitive science. Those of us not working directly with him have very little to do with AI design. Even Eliezer’s current work is slightly more background theory than AI itself.
I’m not conflating them. I did not mean “change over time”.
There are many things we can learn from evolutionary epistemology. It seeming broad to you does not prevent that. You would do better to ask what good it is instead of guess it is no good.
For one thing it connects with meme theory.
A different example is that it explains misunderstandings when people communicate. Misunderstandings are extremely common because communication involves 1) guessing what the other person is trying to say 2) selecting between those guesses with criticism 3) making more guesses which are variants of previous guesses 4) more selection 5) etc
This explanation helps us see how easily communication can go wrong. It raises interesting issues like why so much communication doesn’t go wrong. It refutes various myths like that people absorb their teacher’s lectures a little like sponges.
It matters. And other explanations of miscommunication are worse.
But that isn’t the topic I was speaking of. I meant evolutionary epistemology. Which btw I know Eliezer has not studied much, because he isn’t familiar with one of its major figures (Popper).
I don’t know enough about evolutionary epistemology to evaluate the usefulness and applicability of its ideas.
How was evolutionary epistemology tested? Are there experiments or just introspection?
Evolution is a largely philosophical theory (distinct from the scientific theory about the history of life on earth). It is a theory of epistemology. Some parts of epistemology technically depend on the laws of physics, but it is generally researched separately from physics. There has not been any scientific experiment to test it which I consider important, but I could conceive of some, because if you specified different and perverse laws of physics you could break evolution. In a different sense, evolution is tested constantly, in that the laws of physics and the evidence we see around us every day are not the perverse-but-conceivable physics that would break evolution.
The reason I accept evolution (again I refer to the epistemological theory about how knowledge is created) is that it is a good explanation, and it solves an important philosophical problem, and I don’t know anything wrong with it, and I also don’t know any rivals which solve the problem.
The problem has a long history. Where does “apparent design” come from? Paley gave an example of finding a watch in nature, which he said you know can’t have gotten there by chance. That’s correct—the watch has knowledge (aka apparent design, or purposeful complexity, or many other terms). The watch is adapted “to a purpose”, as some people put it (I’m not really a fan of the purpose terminology. But it’s adapted! And I think it gets the point across ok.)
Paley then guessed as follows: there is no possible solution to the origins of knowledge other than “A designer (God) created it”. This is a very bad solution even pre-Darwin because it does not actually solve the problem. The designer itself has knowledge, adaptation to a purpose, whatever. So where did it come from? The origin is not answered.
Since then, the problem has been solved by the theory of evolution and nothing else. And it applies to more than just watches found in nature, or plants and animals: it also applies to human knowledge. The answer “intelligence did it” is no better than “God did it”. How does intelligence do it? The only known answer is: by evolution.
The best thing to read on this topic is The Beginning of Infinity by David Deutsch which discusses Popperian epistemology, evolution, Paley’s problem and its solution, and also has two chapters about meme theory which give important applications.
You can also find some, e.g. here: http://fallibleideas.com/evolution-and-knowledge
Also here: http://fallibleideas.com/tradition (Deutsch discusses static and dynamic memes and societies. I discuss “traditions” rather than “memes”. It’s quite similar stuff.)
What? Epistemological evolution seems to be about how the mind works, independent of what philosophical status is accorded to the thoughts. Surely it could be tested just by checking if the mind actually develops ideas in accordance with the way it is predicted to.
If you want to check how minds work, you could do that. But that’s very hard. We’re not there yet. We don’t know how.
How minds work is a separate issue from evolutionary epistemology. Epistemology is about how knowledge is created (in the abstract, not in human minds specifically). If it turns out there is another way, it wouldn’t change the fact that evolution would create knowledge if done in minds.
There’s no reason to think there is another way. No argument that there is. No explanation of why to expect there to be. No promising research on the verge of working one out. Shrug.
I see. I thought that evolutionary epistemology was a theory of human minds, though I know that that technically isn’t epistemology. Does evolutionary epistemology describe knowledge about the world, mathematical knowledge, or both (I suspect you will say both)?
It describes the creation of any type of knowledge. It doesn’t tell you the specifics of any field itself, but doing it helps you learn them.
So, you’re saying that in order to create knowledge, there has to be copying, variation, and selection. I would agree with the first two, but not necessarily the third. Consider a formal axiomatic system. It produces an ever-growing list of theorems, but none are ever selected any more than others. Would you still consider this system to be learning?
With deduction, all the consequences are already contained in the premises and axioms. Abstractly, that’s not learning.
When human mathematicians do deduction, they do learn stuff, because they also think about stuff while doing it, they don’t just mechanically and thoughtlessly follow the rules of math.
So induction (or probabilistic updating, since you said that Popper proved it not to be the same as whatever philosophers call ‘induction’) isn’t learning either, because the conclusions are contained in the priors and observations?
If the axiomatic system was physically implemented in a(n ever-growing) computer, would you consider this learning?
the idea of induction is that the conclusions are NOT logically contained in the observations (that’s why it is not deduction).
if you make up a prior from which everything deductively follows, and everything else is mere deduction from there, then all of your problems and mistakes are in the prior.
no. learning is creating new knowledge. that would simply be human programmers putting their own knowledge into a prior, and then the machine not creating any new knowledge that wasn’t in the prior.
The correct method of updating one’s probability distributions is contained in the observations. P(H|E) = P(H)P(E|H)/P(E) .
So how could evolutionary epistemology be relevant to AI design?
AIs are programs that create knowledge. That means they need to do evolution. That means they need, roughly, a conjecture generator, a criticism generator, and a criticism evaluator. The conjecture generator might double as the criticism generator since a criticism is a type of conjecture, but it might not.
The conjectures need to be based on the previous conjectures (not necessarily all of them, but some). That makes it replication with variation. The criticism is the selection.
Any AI design that completely ignores this is, imo, hopeless. I think that’s why the AI field hasn’t really gotten anywhere. They don’t understand what they are trying to make, because they have the wrong philosophy (in particular the wrong explanations. i don’t mean math or logic).
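A schematic skeleton of the loop just described. Every function here is a placeholder for the hard, unsolved part, so this is only meant to make the proposed structure (replication with variation, plus selection by criticism) explicit, not to suggest any of it is easy to fill in.

```python
def generate_conjectures(existing_ideas, problem):
    """Placeholder: produce new guesses, mostly as variations on
    previous ones (replication with variation)."""
    raise NotImplementedError

def generate_criticisms(conjecture, existing_ideas):
    """Placeholder: criticisms are themselves a type of conjecture,
    so this might share machinery with the generator above."""
    raise NotImplementedError

def survives(conjecture, criticisms):
    """Placeholder: evaluate whether any criticism refutes the guess
    (the selection step)."""
    raise NotImplementedError

def evolve_knowledge(problem, ideas, rounds=100):
    for _ in range(rounds):
        for guess in generate_conjectures(ideas, problem):
            criticisms = generate_criticisms(guess, ideas)
            if survives(guess, criticisms):
                ideas.append(guess)  # retained ideas seed later variation
    return ideas
```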
Could you explain where AIXI does any of that?
Or could you explain where a Bayesian spam filter does that?
Note that there are AI approaches which do do something close to what you think an AI “needs”. For example, some of Simon Colton’s work can be thought of in a way roughly like what you want. But it is a mistake to think that such an entity needs to do that. (Some of the hardcore Bayesians make the same mistake in assuming that an AI must use a Bayesian framework. That something works well as a philosophical approach is not the same claim as that it should work well in a specific setting where we want an artificial entity to produce certain classes of systematic reliable results.)
Those aren’t AIs. They do not create new knowledge. They do not “learn” in my sense—of doing more than they were programmed to. All the knowledge is provided by the human programmer—they are designed by an intelligent person and to the extent they “act intelligent” it’s all due to the person providing the thinking for it.
I’m not sure this is at all well-defined. I’m curious, what would make you change your mind? If, for example, Colton’s systems constructed new definitions, proofs, conjectures, and counter-examples in math, would that be enough to decide they were learning?
How about it starts by passing the Turing test?
Or: show me the code, and explain to me how it works, and how the code doesn’t contain all the knowledge the AI creates.
Could you explain how this is connected to the issue of making new knowledge?
This seems a bit like showing a negative. I will suggest you look for a start at Simon Colton’s paper in the Journal of Integer Sequences, which uses a program that operates in a way very close to the way you think an AI would need to operate in terms of making conjectures and trying to refute them. I don’t know if the source code is easily available. It used to be on Colton’s website but I don’t see it there anymore; if his work seems at all interesting to you, you can presumably email him requesting a copy. I don’t know how to show that the AI “doesn’t contain all the knowledge the AI creates” aside from the fact that the system constructed concepts and conjectures in number theory which had not previously been constructed. Moreover, Colton’s own background in number theory is not very heavy, so it is difficult to claim that he’s importing his own knowledge into the code. If you define more precisely what you mean by the code containing the knowledge, I might be able to answer that further. Without a more precise notion it isn’t clear to me how to respond.
Holding a conversation requires creating knowledge of what the other guy is saying.
In deduction, you agree that the conclusions are logically contained in the premises and axioms, right? They aren’t something new.
In a spam filter, a programmer figures out how he wants spam filtered (he has the idea), then he tells the computer to do it. The computer doesn’t figure out the idea or any new idea.
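For instance, here is a minimal sketch of that kind of filter (the word counts, the prior, and the smoothing rule are all invented for illustration): every decision about what counts as spam is encoded by the programmer, and the program only executes it.

```python
import math

# The programmer's idea of spam, encoded by hand: per-word spam/ham counts.
word_counts = {
    "viagra": (50, 1),   # (times seen in spam, times seen in non-spam)
    "meeting": (2, 40),
    "free": (30, 10),
}
PRIOR_SPAM = 0.5  # also chosen by the programmer

def spam_probability(message):
    log_odds = math.log(PRIOR_SPAM / (1 - PRIOR_SPAM))
    for word in message.lower().split():
        if word in word_counts:
            spam_n, ham_n = word_counts[word]
            # Add-one smoothing: another decision made by the programmer.
            log_odds += math.log((spam_n + 1) / (ham_n + 1))
    return 1 / (1 + math.exp(-log_odds))

print(spam_probability("free viagra"))   # high
print(spam_probability("team meeting"))  # low
```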
With biological evolution, for example, we see something different. You get stuff out, like cats, which weren’t specified in advance. And they aren’t a trivial extension; they contain important knowledge, such as the knowledge of optics that makes their eyes work. This is why “Where can cats come from?” has been considered an important question (people want an explanation of the knowledge, which is sometimes called “apparent design”), while “Where can rocks come from?” is not in the same category of question (it does have some interest for other reasons).
With people, people create ideas that aren’t in their genes, and weren’t told to them by their parents or anyone else. That includes abstract ideas that aren’t the summation of observation. They sometimes create ideas no one ever thought of before. They create new ideas.
An AI (an AGI, as you call it?) should be like a person: it should create new ideas which are not in its “genes” (programming). If someone actually writes an AI they will understand how it works and can explain it, and we can use their explanation to judge whether they “cheated” or not (whether they, e.g., hard-coded some ideas into the program and then said the AI invented them).
Ok. So to make sure I understand this claim: you are asserting that mathematicians are not constructing anything “new” when they discover proofs or theorems in axiomatic systems?
Are genetic algorithm systems then creating something new by your definition?
Different concepts. An artificial intelligence is not (necessarily) a well-defined notion. An AGI is an artificial general intelligence, essentially something that passes the Turing test. Not the same concept.
I see no reason to assume that a person will necessarily understand how an AGI they constructed works. To use the most obvious hypothetical, someone might make a neural net modeled very closely after the human brain that functions as an AGI without any understanding of how it works.
When you “discover” that 2+1 = 3, given premises and axioms, you aren’t discovering something new.
But working mathematicians do more than that. They create new knowledge. It includes:
1) they learn new ways to think about the premises and axioms
2) they do not publish deductively implied facts unselectively or randomly. they choose the ones that they consider important. by making these choices they are adding content not found in the premises and axioms
3) they make choices between different possible proofs of the same thing. again where they make choices they are adding stuff, based on their own non-deductive understanding
4) when mathematicians work on proofs, they also think about stuff as they go. just like when experimental scientists do fairly mundane tasks in a lab, at the same time they will think and make it interesting with their thoughts.
They could be. I don’t think any exist yet that do. For example I read a Dawkins paper about one. In the paper he basically explained how he tweaked the code in order to get the results he wanted. He didn’t, apparently, realize that it was him, not the program, creating the output.
By “AI” I mean AGI. An intelligence (like a person) which is artificial. Please read all my prior statements in light of that.
Well, OK, but they’d understand how it was created, and could explain that. They could explain what they know about why it works (it copies what humans do). And they could also make the code public and discuss what it doesn’t include (e.g. hard-coded special cases, except for the 3 he included on purpose, and he explains why they are there). That’d be pretty convincing!
I don’t think this is true. While he probably wouldn’t announce it if he were working on AI, he has indicated that he’s working on two books (HPMoR and a rationality book), and has another book queued. He’s also indicated that he doesn’t think anyone should work on AI until the goal system stability problem is solved, which he’s talked about thinking about but hasn’t published anything on, which probably means he’s stuck.
I meant more that “he’s probably thinking about this in the back of his mind fairly often”, as well as trying to be humorous.
Do you know what he would think of work that has a small chance of solving goal stability and a slightly larger chance of helping with AI in general? This seems like a net plus to me, but you seem to have heard what he thinks should be studied from a slightly clearer source than I did.