On your first point, we all draw inferences, humans and computers alike. However, you could produce a set of results from PageRank and have a set of humans evaluate them, and the humans would more or less agree on what is a false positive and what is a true positive. Similarly (though with a bit more effort), you could find some false negatives. This means that the inference of PageRank is somehow inferior to the human inference (given that humans are the standard here).
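To make that concrete, here is a minimal sketch of the comparison I have in mind, in Python. The toy link graph and the human labels are invented for illustration: a small link farm inflates the score of a worthless page (a false positive by the human standard), while a genuinely good but poorly linked page drops out of the top results (a false negative).

```
def pagerank(links, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages  # a dangling page spreads its rank evenly
            for t in targets:
                new[t] += damping * rank[page] / len(targets)
        rank = new
    return rank

links = {
    "A": ["B", "C"], "B": ["C"], "C": ["A"],            # genuinely good pages
    "farm1": ["spam"], "farm2": ["spam"], "farm3": ["spam"],
    "spam": ["farm1", "farm2", "farm3"],                # link farm inflating "spam"
}
human_says_good = {"A", "B", "C"}                       # invented human judgments

rank = pagerank(links)
k = len(human_says_good)
algorithm_says_good = set(sorted(rank, key=rank.get, reverse=True)[:k])

print("false positives:", algorithm_says_good - human_says_good)  # {'spam'}
print("false negatives:", human_says_good - algorithm_says_good)  # {'B'}
```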
This does open up a very interesting line of thought, however. Could there be a situation where a machine holds a definition that humans just couldn’t match? That is, humans keep picking items out of a set that don’t match what the algorithm defines, even after being trained against that algorithm or being able to read its code, and even when not limited by perception constraints or time? If not, why? It’s commonly accepted here that humans aren’t the optimally intelligent design that could possibly exist, so perhaps this thing that we have, which is not intelligence but some aspect of it, could be replicated to avoid OBP (and perhaps this would be a breakthrough for FAI). If yes, then there could be a situation where the machine holds the actual definition, and the human is the one doing optimization by proxy. In fact, there’s nothing to prohibit that from occurring now, if it is possible at all. I am not sure where this line of thought leads; I think I may be conflating value systems and intelligence, but I need to think more about it.
On your second point, yes, Google cannot be a full uFAI example. What I meant to show is that even when it is constantly supervised, it can still cause measurable and unintended outcomes, and what is more, we are not even sure how to prevent them given the incentive structure that exists.
Thanks for the reply :)
I guess I was reacting to your suggestion later in the article that we should try to remove the proxy and instead just look right at the underlying “page quality” characteristic (or perhaps I misinterpreted that?). My objection is that, yes, one can (implicitly or explicitly) define an underlying quality characteristic of pages, and no, Google is not perfect at estimating that value, but however you go about estimating that parameter, the process will always involve making some observations about the page in question (e.g. its local hyperlink structure) and then doing inference. Although the underlying characteristic might be well defined, there is no way, even in principle (even given unlimited data and computing power far beyond that of a human), to just “look right at the underlying characteristic”; I don’t even think that phrase is meaningful.
If humans were given the same task then they would also follow the evidence-inference procedure, albeit with more sophisticated forms of inference than the Google programmers can work out how to program into their computers.
So I think that “get rid of the proxy” really means “gather more and different types of evidence and improve the page-quality inference procedure”.
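As a toy illustration of that (every feature name and weight below is invented, not anything Google actually uses): whatever evidence you feed in, the quality estimate is always the output of an inference step over observed features, and “getting rid of the proxy” just changes which observations go into that step.

```
import math

def quality_estimate(features, weights, bias=-1.0):
    """Logistic combination of observed evidence into P(page is good)."""
    score = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

weights = {
    "log_inbound_links": 0.8,     # the PageRank-style hyperlink signal
    "dwell_time_minutes": 0.5,    # behavioural evidence
    "spelling_error_rate": -3.0,  # content evidence
}

# Same page, two evidence sets: the richer one shifts the estimate,
# but both are still "observe, then infer", never a direct reading of quality.
sparse = {"log_inbound_links": 2.0}
rich = {"log_inbound_links": 2.0, "dwell_time_minutes": 3.0,
        "spelling_error_rate": 0.1}

print(quality_estimate(sparse, weights))  # inference from link evidence alone
print(quality_estimate(rich, weights))    # inference from richer evidence
```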
I think at this point we’re hitting on the question of whether truly correct definitions exist. I believe Eliezer had some articles in the past about thingspace and ‘carving reality at its joints’, and while that assumption underlies my article, I do see your point. In that case you would say that the algorithm has approached the ‘true characteristic’ when humans cannot discern any better than it can.
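One rough way to operationalize that criterion (the labels below are made up): compare how often the algorithm agrees with each human rater against how often the raters agree with each other. If the algorithm-to-human agreement is at least as high as the human-to-human agreement, humans can no longer systematically tell where the algorithm goes wrong.

```
def agreement(labels_a, labels_b):
    """Fraction of items on which two raters give the same label."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)

# Invented binary quality labels for eight pages.
human1    = [1, 1, 0, 1, 0, 0, 1, 0]
human2    = [1, 0, 0, 1, 0, 1, 1, 0]
algorithm = [1, 1, 0, 1, 0, 1, 1, 0]

human_human = agreement(human1, human2)
algo_human = (agreement(algorithm, human1) + agreement(algorithm, human2)) / 2

# Here algo_human (0.875) exceeds human_human (0.75), so by this test the
# algorithm has matched the human standard on this (invented) sample.
print(human_human, algo_human)
```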
I guess one example is a guy who wrote a few articles for Hacker News; they got reasonably upvoted, and then he came out and said that he was producing them based on some underlying assumptions about what would do well on HN. Some people were negative and felt manipulated, but not everyone agreed that this was spam. The argument was “if you upvoted it, then you enjoyed it, and the process that produced the article shouldn’t matter”.
If you consider that spam, then it’s certainly spam beyond the human capability to detect. But what you really need there is a process to infer intent, which may well be impossible.