one of my basic background assumptions about agency:
there is no ontologically fundamental caring/goal-directedness; there is only the structure of an action being chosen (by some process, for example a search process) and then taken.
this makes me conceptualize the ‘ideal agent structure’ as being “search, plus a few extra parts”. in my model of it, optimal search is queried for which action fulfills some criterion (‘maximizes some goal’) given a pointer (~ a world model) to a mathematical universe sufficiently similar to the actual universe; the search’s output is then taken as the action, and because of that similarity we see, behaviorally, an agent that looks to us like it values the world it’s in.
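(a minimal toy sketch of that picture, in python, assuming a brute-force argmax over a finite action set stands in for ‘optimal search’; the names in it — argmax_search, world_model, criterion, take_action — are placeholders i’m introducing for illustration, not a real proposal.)

```python
# toy sketch of "search, plus a few extra parts". all names are hypothetical
# placeholders; a brute-force argmax stands in for 'optimal search'.
from typing import Callable, Iterable, TypeVar

Action = TypeVar("Action")
State = TypeVar("State")

def argmax_search(actions: Iterable[Action],
                  world_model: Callable[[Action], State],  # pointer to a mathematical universe
                  criterion: Callable[[State], float],     # the 'goal': scores resulting states
                  ) -> Action:
    # pure search: which action's modelled outcome scores highest?
    return max(actions, key=lambda a: criterion(world_model(a)))

def act(actions, world_model, criterion, take_action: Callable[[Action], None]) -> None:
    # the 'few extra parts': query the search, then take its output as the action.
    # nothing in here 'cares'; goal-directedness only shows up behaviorally.
    take_action(argmax_search(actions, world_model, criterion))
```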
i’ve been told that {it’s common to believe that search and goal-directedness are fundamentally intertwined or meshed together or something}, whereas i view goal-directedness as almost not even a real thing, just something we observe behaviorally when search is scaffolded in that way.
if anyone wants to explain the mentioned view to me, or link a text about it, i’d be interested.
(maybe a difference is in the kind of system being imagined: in selected-for systems, i can understand expecting search and goal-directedness to be approximately done at once (i.e. within the same or overlapping strands of computation); i’d especially expect that if there’s a selection incentive for efficiency. i’m imagining neat, ideal systems (think intentionally designed rather than selected for) in this context.)
edit: another implication of this view is that decision theory is its own component (which could be complex or not) of said ‘ideal agent structure’, i.e. that a superintelligence with an ineffective decision theory is possible (edit: and nontrivially likely for a hypothetical AI designer to unintentionally program, i.e. something that needs to be avoided). that is, in the above model, the search is being asked the wrong questions (i.e. the questions of the wrong decision theory).
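(a continuation of the toy sketch above, still with hypothetical names: the decision-theory component is just the part that turns the world model and criterion into the question posed to the search, so an arbitrarily strong search attached to a badly chosen decision theory is still answering the wrong question.)

```python
# continuing the toy sketch (all names still illustrative): the decision theory
# is factored out as the component that formulates the query handed to the
# search. swapping `decision_theory` changes which question gets asked, even
# if the search itself is arbitrarily powerful.
def agent_step(actions, world_model, criterion, decision_theory, take_action):
    score = decision_theory(world_model, criterion)  # the question to pose
    chosen = max(actions, key=score)                 # the search answers it
    take_action(chosen)

# e.g. a placeholder 'decision theory' that just evaluates the modelled outcome
# of each action; a different decision theory would build a different `score`.
def outcome_evaluation(world_model, criterion):
    return lambda action: criterion(world_model(action))
```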