I agree the code snippet is relevant, but it looks like pseudocode for the “optimization algorithm of choice” part- the question is what dataset and sets of alternatives you’re calling it over. Is it a narrow environment where we can be reasonably confident that the model of reality is close to reality, and the model of our objective is close to our objective? Or is it a broad environment where we can’t be confident about the fidelity of our models of reality or our objectives without calling in FAI experts to evaluate the approach and find obvious holes?
Similarly, is it an environment where the optimization algorithm needs to take into account other agents and model them, or one in which the algorithm can just come up with a plan without worrying about how that plan will alter the wider world?
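For concreteness, here is a minimal sketch of that distinction, with every name (`Problem`, `optimize`, the field names) invented for illustration rather than taken from the snippet under discussion. The optimizer itself is generic; everything that matters is in what gets passed to it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

@dataclass
class Problem:
    """Hypothetical container for everything the optimizer is 'called over'."""
    dataset: Any                              # observations the world model was fit to
    world_model: Callable[[Any], Any]         # predicts the outcome of a candidate plan
    objective_model: Callable[[Any], float]   # scores predicted outcomes (a proxy for what we want)
    candidates: Sequence[Any]                 # the set of alternatives being searched over

def optimize(problem: Problem) -> Any:
    """Generic 'optimization algorithm of choice': return the candidate whose
    predicted outcome scores highest under the objective model."""
    return max(problem.candidates,
               key=lambda c: problem.objective_model(problem.world_model(c)))

# Narrow case: a small, well-understood candidate set, and a world model and
# objective model that can be checked against reality fairly directly.
#
# Broad case: open-ended plans as candidates, a world model that has to include
# other agents reacting to the plan, and an objective model that is only a rough
# proxy -- which is exactly where outside review of the approach earns its keep.
```

The code is the same in both cases; whether the call is safe depends on the dataset, the models, and the candidate set, not on the optimizer.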
It seems like explaining the difference between narrow AI and AGI and giving a clearer sense of what subcomponents make a decision support system dangerous might work well for SI. Right now, the dominant feature of UFAI as SI describes it is that it’s an agent with a utility function- and so the natural response to SI’s description is “well, get rid of the agency.” That’s a useful response only if it constricts the space of possible AIs we could build- and I think it does, by limiting us to narrow AIs. Spelling out the benefits and costs to various AI designs and components will both help bring other people to SI’s level of understanding and point out holes in SI’s assumptions and arguments.
That’s a useful response only if it constricts the space of possible AIs we could build- and I think it does, by limiting us to narrow AIs
I agree with you that that is a position one might take in response to the UFAI risks, but it seems from reading Karnofsky that he thinks some Oracle/”Tool” AI (quite general) is safe if you get rid of that darned explicit utility function. Eliezer is trying to disabuse him of the notion. If your understanding of Karnofsky is different, mine is more like Eliezer’s. In any case this is probably moot, since Karnofsky is very likely to respond one way or another, given that this has turned into a public debate.
it seems from reading Karnofsky that he thinks some Oracle/”Tool” AI (quite general) is safe if you get rid of that darned explicit utility function.
I think agency and utility functions are separate, here, and it looks like agency is the part that should be worrisome. I haven’t thought about that long enough to state that definitively, though.
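As a toy illustration of that separation (hypothetical functions, not anyone's proposed design): both sketches below use an explicit utility function, but only the second acts on the world rather than handing back a ranking.

```python
from typing import Any, Callable, Sequence

def rank_options(options: Sequence[Any], utility: Callable[[Any], float]) -> list:
    """'Tool' mode: has a utility function, but only returns a ranked report.
    A human decides what, if anything, to do with it."""
    return sorted(options, key=utility, reverse=True)

def act_on_best(options: Sequence[Any], utility: Callable[[Any], float],
                execute: Callable[[Any], None]) -> None:
    """'Agent' mode: the same utility function, but the system itself
    selects and executes the top option."""
    execute(max(options, key=utility))
```

On this framing the utility function is common to both; what changes is whether the output is a recommendation or an action.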
Eliezer is trying to disabuse him of the notion.
Right, but it looks like he’s doing that by moving from where Eliezer is towards where Holden is, whereas I would rather see him move from where Holden is to where Eliezer is. Much of point 2, for example, is discussing how hard AGI is- which, to me, suggests we should worry less about it, because it is unlikely to be implemented successfully, and any AIs we will see will be narrow- in which case AGI thinking isn’t that relevant.
My approach would have been along the lines of: start off with a safe AI, add wrinkles until its safety is no longer clear, and then discuss the value of FAI researchers.
For example, we might imagine a narrow AI that takes in labor stats data, econ models, psych models, and psych data and advises schoolchildren on what subjects to study and what careers to pursue. Providing a GoogleLifeMap to one person doesn’t seem very dangerous- but what about when it’s ubiquitous? Then there will be a number of tradeoffs that need to be weighed against each other and it’s not at all clear that the AI will get them right. (If the AI tells too many people to become doctors, the economic value of being a doctor will decrease- and so the AI has to decide who of a set of potential doctors to guide towards being a doctor. How will it select between people?)
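As a rough sketch of why ubiquity changes things (all names and numbers below are made up for illustration): a GoogleLifeMap-style recommender that treats each user in isolation keeps saying “doctor” even after its own earlier advice has eroded the return, while a crowding-aware version has to pick which of the qualified users still get that advice- and that selection rule is exactly the tradeoff that is easy to get wrong.

```python
# Hypothetical illustration: naive vs. crowding-aware career advice.
base_return = {"doctor": 10.0, "teacher": 6.0, "plumber": 7.0}
crowding_penalty = {"doctor": 0.001, "teacher": 0.0005, "plumber": 0.0005}  # per prior recommendation

def advise_naive(user_aptitude: dict) -> str:
    # Treats each user in isolation; fine for one user, misleading once the advice is ubiquitous.
    return max(base_return, key=lambda c: base_return[c] * user_aptitude.get(c, 1.0))

def advise_aware(user_aptitude: dict, already_recommended: dict) -> str:
    # Discounts each career by how many people the system has already steered toward it.
    def adjusted(career: str) -> float:
        crowding = 1.0 - crowding_penalty[career] * already_recommended.get(career, 0)
        return base_return[career] * max(crowding, 0.0) * user_aptitude.get(career, 1.0)
    return max(base_return, key=adjusted)

# After enough prior "doctor" recommendations, the adjusted return for medicine drops
# below other careers, and the system needs some principled rule for deciding which of
# the still-qualified users get told "doctor" at all.
```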
In addition to providing advice to individual people, GoogleLifeMap can aggregate the advice it has provided, translate it into economic terms, and hand it off to some independent economy-modeling service which is (from GoogleLifeMap’s perspective) a black box. Economic predictions about the costs and benefits of various careers are compiled, and eventually become GoogleLifeMap’s new dataset. Possibly it has more than one dataset, and presents career recommendations from each of them in parallel: “According to dataset A, you should spend nine hours a week all through high school sculpting with clay, but never show the results to anyone outside your immediate family, and study toward becoming a doctor of dental surgery; according to dataset B, you should work in foodservice for five years and two months, take out a thirty million dollar life insurance policy, and then move to a bunker in southern Arizona.”
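A sketch of that loop, with the record shapes and function names invented for illustration: the aggregation and hand-off to the black-box service is one cycle, and the parallel-dataset behavior is just running the same recommender once per dataset without merging the answers.

```python
from typing import Callable, Dict, List

def update_cycle(advice_log: List[dict],
                 economy_model: Callable[[List[dict]], dict]) -> dict:
    """One pass of the loop: aggregate the advice already given, hand it to the
    black-box economy-modeling service, and treat its predictions as the new dataset."""
    aggregated: Dict[str, int] = {}
    for record in advice_log:
        aggregated[record["career"]] = aggregated.get(record["career"], 0) + 1
    return economy_model([{"career": c, "count": n} for c, n in aggregated.items()])

def recommend_parallel(user: dict, datasets: Dict[str, dict],
                       recommend: Callable[[dict, dict], str]) -> Dict[str, str]:
    """Run the same recommender against each dataset and report the answers side by
    side, rather than merging them into a single verdict."""
    return {name: recommend(user, data) for name, data in datasets.items()}
```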