Thanks for the explanation and links! I agree that linear program solvers are intrinsically optimizing. SAT solvers are not intrinsically so, but they could be used to optimize (as in Pattern’s example in his comment).
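To make the SAT point concrete, here is a rough sketch of my own (not necessarily what Pattern had in mind). A yes/no satisfiability check becomes an optimizer if you keep asking it the same question under a tighter bound. The brute-force `satisfiable` function is only a stand-in for a real SAT solver, and the function names and the "fewest true variables" objective are assumptions I picked for illustration.

```python
from itertools import product

def satisfiable(clauses, num_vars, max_true):
    """Decision oracle: return an assignment satisfying every clause with at
    most max_true variables set to True, or None if there is none.
    Clauses use signed integers: 2 means x2, -2 means (not x2).
    Brute force stands in for a real SAT solver here."""
    for bits in product([False, True], repeat=num_vars):
        if sum(bits) > max_true:
            continue
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None

def minimize_true_vars(clauses, num_vars):
    """Turn the yes/no oracle into an optimizer by tightening the bound
    until the oracle answers 'no'."""
    best = satisfiable(clauses, num_vars, num_vars)
    if best is None:
        return None  # the formula has no satisfying assignment at all
    while True:
        tighter = satisfiable(clauses, num_vars, sum(best) - 1)
        if tighter is None:
            return best  # no assignment with fewer true variables exists
        best = tighter

# (x1 or x2) and (not x1 or x3) and (x2 or x3)
result = minimize_true_vars([[1, 2], [-1, 3], [2, 3]], 3)
```

The oracle itself never optimizes anything; the optimization lives entirely in the loop wrapped around it.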
Moving on, I think it’s hard to define “environment” in a way that isn’t arbitrary, but it’s easier to think about “everything this optimizer affects”.
For every optimizer, I could (theoretically) list everything that it affects at a given moment, but that list changes based on what’s around the optimizer to be affected. I could rig a setup such that, as soon as I run the optimizer code, it causes a power surge and blows a fuse. In that setup, every optimizer changes the environment while optimizing.
But you may argue that an opt2 would purposely choose an action that would affect the environment (especially if choosing that action maximizes reward) while an opt1 wouldn’t even have that action available.
But while the available actions may be constrained by the type of optimizer, and you could try to draw a distinction between different sets of available actions, the effects of those limited actions change with the programming language, the hardware, etc.
I’m still confused about this, and you may still have made a good distinction between different optimizers.