Can one detect intelligence in retrospect?
Let me explain. Let’s take the definition of an intelligent agent as an optimizer over possible futures, steering the world toward the preferred one. Now, suppose we look at the world after the optimizer is done. Only one of the many possible worlds, the one steered by the optimizer, is accessible to retrospection. Let’s further assume that we have no access to the internals of the optimizer, only to the recorded history. In particular, we cannot rely on it having human-like goals and use pattern-matching to whatever a human would do.
Is there still enough data left to tell with high probability that an intelligent optimizer is at work, and not just a random process? If so, how would one determine that? If not, what hope do we have of detecting an alien intelligence?
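To make the question concrete, here is a toy sketch of what a detector would be up against (entirely my own illustration; the world model, the fitness function, and the process names are all invented for the example). It contrasts a null process with a hill-climbing optimizer over bit strings, then asks how surprising the final state is under the hypothesis that no optimization took place:

```python
import random

# Toy world: bit strings of length N. "Fitness" counts bits matching a hidden
# target; the hill climber steers toward the target, the random walk does not.
N = 100
STEPS = 500
TARGET = [random.randrange(2) for _ in range(N)]

def fitness(state):
    return sum(s == t for s, t in zip(state, TARGET))

def random_walk(state, steps):
    # Null process: flip random bits, keeping every change.
    for _ in range(steps):
        state[random.randrange(N)] ^= 1
    return state

def hill_climb(state, steps):
    # Optimizer: flip a random bit, keep the change only if fitness does not drop.
    for _ in range(steps):
        candidate = state.copy()
        candidate[random.randrange(N)] ^= 1
        if fitness(candidate) >= fitness(state):
            state = candidate
    return state

def null_tail_probability(observed_fitness, samples=10_000):
    # Monte Carlo estimate of P(fitness >= observed) for a uniformly random string.
    hits = sum(fitness([random.randrange(2) for _ in range(N)]) >= observed_fitness
               for _ in range(samples))
    return hits / samples

start = [random.randrange(2) for _ in range(N)]
for name, process in [("random walk", random_walk), ("hill climber", hill_climb)]:
    f = fitness(process(start.copy(), STEPS))
    print(f"{name}: fitness {f}/{N}, null tail probability ~{null_tail_probability(f):.4f}")
```

The catch is that null_tail_probability only works because the detector was handed the optimizer’s own fitness function; give it the wrong preference ordering and the optimized string looks exactly as random as the unoptimized one. That is precisely the difficulty I am asking about.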
Omohundro has a paper on instrumental goals that many or most intelligences would converge on. For instance, they would strive to model themselves, to expand their capabilities, to represent their goals in terms of utility functions, to protect their utility functions against change, and so on. None of these are universally true, though, because we can posit a pathological intelligence whose terminal goal is not to do these things. (And to some extent humans do, in fact, behave pathologically like that.)
We can say very little about “optimizers over possible futures” in full generality, because the concept becomes arbitrarily broad if you define “optimizer” loosely enough. Is a thermostat an intelligence, with the goal of achieving some temperature? Or consider a rock: we can see it as a mind with the goal of continuing its existence, which therefore “decides” to be hard.
It seems that the paper discusses the inside view of intelligence, not the ways to detect one by its non-human-like artifacts.
I agree that it is hard to identify an intelligence without pattern-matching it to humans; that’s why I asked the question in the first place. But there should hopefully be at least some way to be convinced that a rock is not very intelligent, even if you can’t put yourself in its crystalline shoes.
This is isomorphic to the problem (edit: not impossibility) of coming up with a fully mind-neutral definition of information entropy, is it not?
I am not familiar with it; feel free to link or explain...
I just edited the above comment, because I had forgotten about Kolmogorov complexity, and in particular how K-complexity varies only by a constant between Turing-complete machines. That link should explain it pretty well; now that I’ve remembered this, I’m significantly less convinced that the problem is isomorphic.
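For reference, the invariance theorem in question: for any two universal (Turing-complete) reference machines $U$ and $V$ there is a constant $c_{U,V}$, depending on the machines but not on the string $x$, such that

\[ \lvert K_U(x) - K_V(x) \rvert \le c_{U,V} \]

so the choice of reference machine shifts $K(x)$ by at most a constant, which is what makes it a candidate for a “mind-neutral” complexity measure.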
Yes, at least some of the time. Evolution fits your definition and we know about that. So if you want examples of how to deduce the existence of an intelligence without knowing its goals ahead of time, you could look at the history of the discovery of evolution.
Also, Eliezer has written an essay which answers your question; you may want to look at that.
I don’t see how Eliezer’s criterion of stable negentropic artifacts can tell people (alive) apart from stars (not alive). This is my go-to counterexample to the standard definitions of life.
I think the idea is that some things are very specific specifications, while others aren’t. For example, a star isn’t a particularly unlikely configuration: take a large cloud of hydrogen and you’ll get a star. A human, however, is a very narrow target in design space: taking a pile of carbon, nitrogen, oxygen, and hydrogen is very unlikely to get you a human.
Hence, to explain stars, we don’t need to posit the existence of a process with a lot of optimization power. But since humans are a very unlikely configuration, their existence suggests something with a lot of optimization power behind them (that thing being evolution).
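One way to make “very unlikely configuration” quantitative is Yudkowsky’s notion of optimization power: take the fraction of outcomes, under the null (unoptimized) process, that are at least as preferred as the one observed, and count the bits. A rough sketch, with toy numbers and a preference ordering I made up for the example:

```python
import math
import random

def optimization_power(observed, null_samples, better_or_equal):
    # -log2 of the fraction of null-process outcomes at least as good as
    # `observed`. Both the null samples and the preference ordering have to
    # be supplied by the detector -- which is the crux of this whole thread.
    hits = sum(better_or_equal(o, observed) for o in null_samples)
    hits = max(hits, 1)  # avoid log(0); the result is then only a lower bound
    return -math.log2(hits / len(null_samples))

# Toy example: an "outcome" is the number of heads in 100 coin flips, and the
# (assumed) preference ordering is simply "more heads is better".
null_samples = [sum(random.randrange(2) for _ in range(100)) for _ in range(100_000)]
print(optimization_power(55, null_samples, lambda a, b: a >= b))  # ~2 bits: unremarkable
print(optimization_power(90, null_samples, lambda a, b: a >= b))  # ~17 bits: off the null's scale
```

By this measure a star scores close to zero bits against a “collapse a hydrogen cloud” null, while a human scores astronomically high against any abiotic null, which is the argument above in numbers. The weakness is the same as before: you have to supply the preference ordering yourself.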
I see what you are saying, certainly humans are very unlikely to spontaneously form in space. On the other hand, humans are not at all rare on Earth and stars are very unlikely to spontaneously form there.
That’s a very low bar for intelligence; it looks more like a definition of life. Most or all living creatures do this. Some pretty simple software would fit the bill, too.
Yes, the bar is set low intentionally. I would be pretty happy if we could tell whether black-box life is detectable. Again, without relying on pattern-matching to life on Earth, such as DNA, oxygenation for energy, methane release, the presence of water, or whatever else NASA uses to detect life on Mars. Unless, of course, one can prove that some of these are necessary for any life.
There’s been quite a bit of speculation regarding alternative biochemistries; however, most of the popular ones seem to have various problems (most often low elemental abundance and inconvenient chemical properties). It’s of course difficult to prove that all of them are impossible in a search space the size of the universe, though.
I found a watch upon the heath.
Anyway, I think there are enough instrumental goals that even without human-like goals we should be able to recognize crafted tools like watches, hammers, and whatnot.
That’s pattern-matching to humanity, something I explicitly asked not to rely upon. Unless you can show that instrumental goal convergence is inevitable and independent of terminal goal or value convergence. Can you?
I’m having trouble imagining what these terminal goals are that can be optimized toward without having at least some familiar instrumental goals such as timekeeping, attaching things to other things, or murdering entities. Can you give me some examples?
How do you know the hammer is crafted while the hammer fish isn’t?
Well, one of them is alive and moves around on its own. A hammer is a technological artifact with no visible or even implied means of existing without being crafted. You can’t observe baby hammers crafting each other out of raw materials in nature.