it seems from reading Karnovsky that he thinks some Oracle/”Tool” AI (quite general) is safe if you get rid of that darned explicit utility function.
I think agency and utility functions are separate, here, and it looks like agency is the part that should be worrisome. I haven’t thought about that long enough to state that definitively, though.
Eliezer is trying to disabuse him of the notion.
Right, but it looks like by moving from where Eliezer is towards where Holden is, where I would rather see him move from where Holden is to where Eliezer is. Much of point 2, for example, is discussing how hard AGI is- which, to me, suggests we should worry less about it, because it is unlikely to be implemented successfully, and any AIs we will see will be narrow- in which case AGI thinking isn’t that relevant.
My approach would have been along the lines of: start off with a safe AI, add wrinkles until its safety is no longer clear, and then discuss the value of FAI researchers.
For example, we might imagine a narrow AI that takes in labor stats data, econ models, psych models, and psych data and advises schoolchildren on what subjects to study and what careers to pursue. Providing a GoogleLifeMap to one person doesn’t seem very dangerous- but what about when it’s ubiquitous? Then there will be a number of tradeoffs that need to be weighed against each other and it’s not at all clear that the AI will get them right. (If the AI tells too many people to become doctors, the economic value of being a doctor will decrease- and so the AI has to decide who of a set of potential doctors to guide towards being a doctor. How will it select between people?)
In addition to providing advice to people, it can aggregate the advice it has provided, translate it into economic terms, and hand it off to some independent economy-modeling service which is (from GoogleLifeMap’s perspective) a black box. Economic predictions about the costs and benefits of various careers are compiled, and eventually become GoogleLifeMap’s new dataset. Possibly it has more than one dataset, and presents career recommendations from each of them in parallel: “According to dataset A, you should spend nine hours a week all through high school sculpting with clay, but never show the results to anyone outside your immediate family, and study toward becoming a doctor of dental surgery; according to dataset B, you should work in foodservice for five years and two months, take out a thirty million dollar life insurance policy, and then move to a bunker in southern Arizona.”
I think agency and utility functions are separate, here, and it looks like agency is the part that should be worrisome. I haven’t thought about that long enough to state that definitively, though.
Right, but it looks like by moving from where Eliezer is towards where Holden is, where I would rather see him move from where Holden is to where Eliezer is. Much of point 2, for example, is discussing how hard AGI is- which, to me, suggests we should worry less about it, because it is unlikely to be implemented successfully, and any AIs we will see will be narrow- in which case AGI thinking isn’t that relevant.
My approach would have been along the lines of: start off with a safe AI, add wrinkles until its safety is no longer clear, and then discuss the value of FAI researchers.
For example, we might imagine a narrow AI that takes in labor stats data, econ models, psych models, and psych data and advises schoolchildren on what subjects to study and what careers to pursue. Providing a GoogleLifeMap to one person doesn’t seem very dangerous- but what about when it’s ubiquitous? Then there will be a number of tradeoffs that need to be weighed against each other and it’s not at all clear that the AI will get them right. (If the AI tells too many people to become doctors, the economic value of being a doctor will decrease- and so the AI has to decide who of a set of potential doctors to guide towards being a doctor. How will it select between people?)
In addition to providing advice to people, it can aggregate the advice it has provided, translate it into economic terms, and hand it off to some independent economy-modeling service which is (from GoogleLifeMap’s perspective) a black box. Economic predictions about the costs and benefits of various careers are compiled, and eventually become GoogleLifeMap’s new dataset. Possibly it has more than one dataset, and presents career recommendations from each of them in parallel: “According to dataset A, you should spend nine hours a week all through high school sculpting with clay, but never show the results to anyone outside your immediate family, and study toward becoming a doctor of dental surgery; according to dataset B, you should work in foodservice for five years and two months, take out a thirty million dollar life insurance policy, and then move to a bunker in southern Arizona.”