Here’s what I imagine a solution looks like: you have some giant finite element model. There may or may not be some agenty subsystems embedded in it. You pass it into solution.py, and out pops a list of subsystems with approximately-agenty behavior, along with an approximate utility function and range of validity for each subsystem.
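To make that picture a little more concrete, here is a minimal sketch of the kind of output I have in mind. Every name here (AgentySubsystem, find_agenty_subsystems, the individual fields) is hypothetical, just pinning down what “a list of subsystems with approximate utility functions and ranges of validity” could mean as a data structure:

```python
# Hypothetical interface sketch -- illustrative names only, not an actual implementation.
from dataclasses import dataclass
from typing import Callable, List

import numpy as np


@dataclass
class AgentySubsystem:
    """One approximately-agenty subsystem found in the model."""
    variable_indices: List[int]             # which state variables form the subsystem
    utility: Callable[[np.ndarray], float]  # approximate utility function over those variables
    validity_region: np.ndarray             # region of state space where the approximation holds
    fit_error: float                        # how far behavior deviates from optimizing `utility`


def find_agenty_subsystems(model_states: np.ndarray) -> List[AgentySubsystem]:
    """Placeholder: scan trajectories of the finite element model and return any
    subsystems whose behavior looks approximately like utility maximization."""
    raise NotImplementedError  # the hard part, per the discussion below
```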
Does it need a method for combining contradictory preferences, etc.? Only if it’s actually constructing preferences as a distinct step to begin with. It could, for instance, just directly formulate “subsystem with utility function” as a functional equation, and then look for subsystems which approximately satisfy that equation within some region. That would implicitly handle contradictory preferences, underdefined preferences, and so forth, but “combining preferences” wouldn’t necessarily be a very useful way to think about it.
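As a toy illustration of that framing (under simplifying assumptions of my own: read “subsystem with utility u” as “its velocities approximately follow the gradient of u”, and restrict attention to a quadratic utility family), scoring a candidate box might look like this:

```python
# Toy sketch of the "functional equation" framing: treat "subsystem with utility u"
# as the condition  dx/dt ≈ grad u(x)  on some region, and score a candidate
# subsystem by how well that equation can be satisfied. The quadratic utility
# family and gradient-ascent condition are simplifying assumptions, not anything
# from the original discussion.
import numpy as np


def score_candidate(states: np.ndarray, velocities: np.ndarray):
    """Fit u(x) = -0.5 * ||x - c||^2, whose gradient is (c - x), by least squares
    against the observed velocities; return the fitted target c and the residual.
    A small residual means the candidate subsystem approximately satisfies the
    functional equation on the sampled region."""
    # velocities ≈ c - states  =>  least-squares c is mean(velocities + states)
    c = (velocities + states).mean(axis=0)
    residual = np.mean(np.sum((velocities - (c - states)) ** 2, axis=1))
    return c, residual


# Example: a 2-variable subsystem relaxing toward (1, -2) looks "agenty" under
# this criterion; a subsystem driven by white noise does not.
rng = np.random.default_rng(0)
xs = rng.normal(size=(200, 2))
agenty_vel = (np.array([1.0, -2.0]) - xs) + 0.05 * rng.normal(size=xs.shape)
noise_vel = rng.normal(size=xs.shape)
print(score_candidate(xs, agenty_vel)[1])  # small residual
print(score_candidate(xs, noise_vel)[1])   # large residual
```

The quadratic family isn’t the point; the point is that the residual of the functional equation does all the work, with no separate “construct preferences, then combine them” step.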
The hard part isn’t synthesizing a utility function from preferences; the hard part is figuring out which part of the system to draw a box around, and what it means for that subsystem to have “preferences”. Which part of the system even has preferences to begin with, and what’s the physical manifestation of those preferences? By the time all that is worked out, it’s entirely plausible that “preferences” won’t even be a useful intermediate abstraction to think about.
This is exactly the issue I’ve been concerning myself with lately: I think preferences as we typically model them are not a natural category and are instead better thought of as a complex illusion over some more primitive operation. I suspect it’s something like error minimization and homeostasis, but that’s just a working guess and I endeavor to be more confused before I become less confused.
Nonetheless, I also appreciate Stuart’s work here formalizing this model in enough detail that maybe we can use it as a well-known starting point to build from, much as other theories that ultimately weren’t quite right were still right enough to get people working in the right part of the problem/solution space.