It’s not clear that people should be agents. Agents are a means of setting up the content of the world to accord with values; they are not optimized for being the valuable content of the world. So a holy madman has a work-life balance problem: they are an instrument of their values rather than an incarnation of them.
This is a very striking statement, and I want to flag it as excellent.
It seems to me that the agents you are considering don’t have as complex a utility function as people, who seem to at least in part consider their own well-being as part of their utility function. Additionally, people usually don’t have a clear idea of what their actual utility function is, so if they want to go all-in on it, they let some values fall by the wayside. AFAIK this limitation is not a requirement for an agent.
If you had your utility function fully specified, I don’t think you could be considered both rational and also not a “holy madman”. (This borders on my answer to the question of free will, which, so far as I can tell, is a question that should not be answered explicitly, so as not to spoil it for anyone who wants to figure it out for themselves.)
Suffice it to say that optimized/optimal function should be a convergent instrumental goal, similar to self-preservation, and a rational agent should thereby have it as a goal. If I am not mistaken, this means that a problem in work-life balance, as you put it, is not something that an actual rational agent would tolerate, provided there are options to choose from that don’t include this problem and have a similar return otherwise.
Or did I misinterpret what you wrote? I can be dense sometimes...^^
No, sounds right to me, at least approximately. It would be interesting to have theorems.
My position on free will is pretty developed, so I don’t think you’d be spoiling anything if you DMed me with that part of the thought.
I think there are a couple of responses the holy-madman type can give:
The holy-madman aesthetic is actually pretty nice. Human values include truth, which requires coherent thought. And in fiction, we especially enjoy heroes who go after coherent goals. So in practice in our current world, the tails don’t come apart much. At worst, people who manage to be more agentic aren’t making too big of a sacrifice in the incarnation department. And perhaps they’re actually better-off in that respect.
A coherent agent is basically what happens when you can split up the problem of deciding what to do and doing it, because most of the expected utility is in the rest of the world. An effective altruist who cares about cosmic waste probably thinks your argument is referring to something pretty negligible in comparison. Even if you argue functional decision theory means you’re controlling all similar agents, not just yourself, that could still be pretty negligible.
The nice things are skills and virtues, parts of designs that might get washed away by stronger optimization. If people or truths or playing chess are not useful/valuable, agents get rid of them, while people might have a different attitude.
(Part of the motivation here is in making sense of corrigibility. Also, I guess simulacrum level 4 is agency, but humans can’t function without a design, so attempts to take advantage of the absence of a design devolve into incoherence.)