I think this is wrong. Replace “have/has” with “can be modeled as a”. We don’t know if that structure is causal and singular to that output. I think recognizing that inferred structures are not exclusive or complete is useful here—many possible structures can generate very similar outputs. If your goal is to predict future outputs, you probably need to find the actual generator, not pick a structure that could be it.
No actual agent will be exactly any of those structures; it'll be a mix of them, plus other less-legible ones. I'd probably also argue that ANY of those structures can be scary, and in fact the utility-optimizing agent can use ANY of the decision types (table, rule, or model), because they're all equivalent given sufficient complexity of the table, ruleset, or world-model.
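To make that equivalence point concrete, here's a toy sketch (entirely my own construction, not from the post; all names and the two-observation domain are made up): a lookup-table agent, a rule-based agent, and a utility-maximizing agent that are behaviorally indistinguishable.

```python
from typing import Dict

Observation = str
Action = str

# 1. Lookup table: behaviour stored explicitly.
table_agent: Dict[Observation, Action] = {"hungry": "eat", "tired": "sleep"}

# 2. Rule-based agent: behaviour encoded as a condition -> action rule.
def rule_agent(obs: Observation) -> Action:
    return "eat" if obs == "hungry" else "sleep"

# 3. Utility maximizer: behaviour derived from a utility over (obs, action).
utility = {("hungry", "eat"): 1.0, ("hungry", "sleep"): 0.0,
           ("tired", "eat"): 0.0, ("tired", "sleep"): 1.0}

def utility_agent(obs: Observation) -> Action:
    return max(["eat", "sleep"], key=lambda a: utility[(obs, a)])

# All three pick the same action on every observation in this toy domain,
# so output behaviour alone can't tell the representations apart.
for obs in ["hungry", "tired"]:
    assert table_agent[obs] == rule_agent(obs) == utility_agent(obs)
```

Scaling up the table, ruleset, or utility function is what buys the same equivalence in less trivial domains; the point is just that outputs alone don't pin down which representation is inside.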
When we observe the programs’ outputs, we can form a hypothesis class about what structures we think the programs have. My claim is that only a couple of structures pop out after testing a utility-maximizing agent in a sufficient number of environments.
You are correct in pointing out that an agent could be employing multiple structures at a time. Future work on this problem would include ways of quantifying how much of a certain structure a given program has. This might look like coming up with a distance measure on programs under which, say, an object-oriented implementation of quicksort and a functional implementation of quicksort have distance 0.
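As a rough illustration of one way such a measure could go (this is only a sketch of a purely behavioural option; the names, such as behavioural_distance, are illustrative, not anything from the post): compare two programs by how often their outputs differ on sampled inputs, which gives distance 0 for an object-oriented and a functional quicksort.

```python
import random
from typing import Callable, List

def quicksort_functional(xs: List[int]) -> List[int]:
    # Functional-style quicksort: pure recursion, no mutation.
    if len(xs) <= 1:
        return list(xs)
    pivot, rest = xs[0], xs[1:]
    return (quicksort_functional([x for x in rest if x < pivot])
            + [pivot]
            + quicksort_functional([x for x in rest if x >= pivot]))

class QuickSorter:
    """Object-oriented wrapper around the same algorithm."""
    def sort(self, xs: List[int]) -> List[int]:
        if len(xs) <= 1:
            return list(xs)
        pivot, rest = xs[0], xs[1:]
        return (self.sort([x for x in rest if x < pivot])
                + [pivot]
                + self.sort([x for x in rest if x >= pivot]))

def behavioural_distance(f: Callable[[List[int]], List[int]],
                         g: Callable[[List[int]], List[int]],
                         trials: int = 200) -> float:
    """Fraction of randomly sampled inputs on which the two programs disagree."""
    disagreements = 0
    for _ in range(trials):
        xs = [random.randint(-50, 50) for _ in range(random.randint(0, 20))]
        if f(list(xs)) != g(list(xs)):
            disagreements += 1
    return disagreements / trials

# Prints 0.0: the two quicksort versions are behaviourally identical.
print(behavioural_distance(quicksort_functional, QuickSorter().sort))
```

A purely behavioural measure like this deliberately ignores how the program is written; quantifying how much of a given structure a program has would need something beyond input/output sampling, and that's exactly the open part.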
My claim is that only a couple of structures pop out after testing a utility-maximizing agent in a sufficient number of environments.
I think there are a lot of assumptions here about the kinds of feasible/available structures, and about what a “sufficient number” of environments entails. My expectation is that this is true neither for toy problems (often optimized for a gotcha) nor for real-world problems (often with massive complexity and near-chaotic margins).
All models are wrong, some models are useful.