My reply is focused on this specific statement:[1]
sometimes trying to [over] optimize can lead to worse outcomes
There is something known as the performance/robustness tradeoff in control theory. Control theory[2] is the study of dynamic (e.g. autonomous) systems, and I have no idea why it is not more commonly cited on this forum.
The mathematical description of this gets a little bit unwieldy, so I’m going to simplify. Note that everything I’m about to say is discussing ideal systems and real systems are actually worse.
Higher performance systems are less stable than lower performance systems. For an intuitive idea of why this might be the case, consider the example of a system where you want to keep some variable at some setpoint, like the temperature in a room. If you apply a correction proportional to the error as it occurs, you’ll end up with what is called a proportional (P) error response.
Consider the following picture of a step response.[3]
You might want to be faster, so you might try to do something clever and add a term to the controller for how quickly the temperature is changing. Now you have a proportional-derivative (PD) controller. There is a tradeoff: by making the system more responsive, we’ve made it less stable. It is now possible for our controller to oscillate out of control.
Here is a picture of various step responses.[4] Take note of the unstable and marginally stable cases.
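To make that concrete, here is a tiny simulation of my own (not from the original post; the plant and numbers are invented for illustration): a toy thermal model under proportional control, where each command takes effect one step late.

```python
# Toy thermal plant under proportional control; the command takes effect one
# step late, which is what lets an aggressive gain overshoot and oscillate.
def simulate(kp, steps=60):
    temp, setpoint = 0.0, 1.0
    command = 0.0                      # heater power computed last step
    history = []
    for _ in range(steps):
        temp += command - 0.1 * temp   # heating minus passive losses
        command = kp * (setpoint - temp)
        history.append(temp)
    return history

for kp in (0.1, 0.5, 2.5):
    tail = [round(x, 2) for x in simulate(kp)[-3:]]
    print(f"kp={kp}: final temperatures {tail}")
```

With kp=0.1 the loop creeps up to about 0.5, with kp=0.5 it settles near 0.83 (note that a pure proportional controller always leaves some steady-state error), and with kp=2.5 the one-step lag is enough to make it oscillate with growing amplitude, i.e. the unstable case.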
The phrases to know are “gain margin” and “phase margin”.
Gain margin is about how robust your system is in magnitude: if the error is larger or smaller, how well does the system correct it? You can think of gain margin like this: you’re trying to keep a bouncing spring in place by hitting it with a hammer, and if you hit it too hard, it’ll oscillate in a way you don’t like.
Phase margin is about how robust your system is in time. To continue the previous example, phase margin captures the reality that you’re controlling some external actuator, i.e. the hammer, and there’s some delay between when you need to swing and when the swing actually occurs. If that delay is too large, the system will respond differently; in fact, if the delay lines up with just the right frequency, it’ll actually add energy into the system and drive it unstable.[5]
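The delay effect can be demonstrated with the same toy plant (again my own construction, not the author’s): the identical proportional gain that was stable with a one-step lag diverges once the command spends long enough in transit.

```python
from collections import deque

# Same toy plant, but each command now spends `delay_steps` extra steps in
# transit before the actuator sees it (think: latency between controller
# and hammer). Eating up the phase margin destabilizes an otherwise-fine gain.
def simulate_delayed(kp, delay_steps, steps=400):
    temp, setpoint = 0.0, 1.0
    in_flight = deque([0.0] * (delay_steps + 1))  # commands still in transit
    for _ in range(steps):
        temp += in_flight.popleft() - 0.1 * temp  # oldest command arrives
        in_flight.append(kp * (setpoint - temp))  # newest command departs
    return temp

print(simulate_delayed(0.5, delay_steps=0))  # stable: settles near 0.83
print(simulate_delayed(0.5, delay_steps=8))  # same gain, more lag: diverges
```

Nothing about the controller changed between the two runs; only the delay did. This is also the intuition behind the networked-robot point further down.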
The controllers I described above belong to the PID (proportional, integral, derivative) family; I gave examples of a P controller and a PD controller. Normally you use a PI controller, because the integral term drives the error to zero over time, which is necessary when your system has friction or some dead-zone or other bias that prevents a pure proportional controller from working.
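A minimal sketch of that point, using a made-up plant whose losses bias it below the setpoint (numbers are mine, purely illustrative): the P controller stalls short of the target, and a small integral term removes the offset.

```python
# P vs PI on a toy plant whose losses bias it below the setpoint.
def run(kp, ki, steps=400):
    temp, setpoint, integral = 0.0, 1.0, 0.0
    for _ in range(steps):
        error = setpoint - temp
        integral += error                 # accumulated error over time
        power = kp * error + ki * integral
        temp += 0.2 * power - 0.1 * temp  # weak heater, constant losses
    return temp

print(round(run(kp=0.5, ki=0.0), 3))   # P only: stalls at 0.5, short of 1.0
print(round(run(kp=0.5, ki=0.05), 3))  # PI: integral drives the error to 0
```

The P controller settles exactly where the residual error times kp balances the losses; the integral term keeps accumulating until no residual error is left.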
There are various fancy controllers you’ll hear about, like feed-forward, or MPC (“model predictive control”). The performance/robustness tradeoff applies to all of them. It is an iron law. It does not matter how fancy your controller gets: the fancier you make it, the more susceptible it is to going unstable. Basically, increasingly complicated controllers get performance by baking assumptions about the physical world into the control loop. These assumptions are things like: how much bias is in the system, how quickly an error can change, or what the largest step response we might need to achieve is. If those assumptions match reality, the controller will have very high performance and seem very stable. But if any of those assumptions are violated, that fancy controller might immediately go unstable. That’s the price you pay for performance.
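The “baked-in assumptions” point can be shown with a deliberately crude sketch (my own construction; the gains and names are invented): a feed-forward term computed from an assumed plant gain is exact while the assumption holds and systematically wrong when it doesn’t.

```python
# Feed-forward computed from an *assumed* plant gain, plus a weak feedback
# term. When assumed_gain == true_gain the loop lands exactly on the
# setpoint; when the plant changes under us, the output settles off-target.
def run_ff(true_gain, assumed_gain, steps=40):
    temp, setpoint = 0.0, 1.0
    for _ in range(steps):
        feedforward = setpoint / assumed_gain     # the baked-in assumption
        feedback = 0.2 * (setpoint - temp)        # weak corrective loop
        power = feedforward + feedback
        temp += 0.5 * (true_gain * power - temp)  # first-order plant
    return round(temp, 3)

print(run_ff(true_gain=1.0, assumed_gain=1.0))  # assumption holds
print(run_ff(true_gain=2.0, assumed_gain=1.0))  # plant changed under us
```

In a real design the feedback term would be strong enough to clean up model error, but making it strong reopens the stability questions above; the assumption buys performance only for as long as it is true.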
One way to think about this is the following thought experiment. You have a tradeoff between how well you can track a setpoint and how well you can reject disturbances. If you make it very difficult to knock a system off a setpoint, it’ll reject disturbances well. However, a change in that setpoint might also look like a disturbance, and the system will be similarly sluggish to respond.
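Here is a loose, deliberately simplified analogy for that tradeoff (an exponential smoother rather than a full control loop; entirely my construction): one knob sets both how fast you follow a genuine setpoint change and how much fast chatter shakes you.

```python
# One smoothing knob, two consequences: small alpha ignores chatter but lags
# a genuine setpoint change; large alpha tracks the change quickly but
# passes the chatter straight through.
def track(alpha, signal):
    est, out = 0.0, []
    for s in signal:
        est += alpha * (s - est)   # move a fraction alpha toward the input
        out.append(est)
    return out

step = [0.0] * 50 + [1.0] * 50             # real setpoint change at t=50
chatter = [(-1) ** t for t in range(100)]  # pure disturbance, no real change

for alpha in (0.05, 0.8):
    lag = track(alpha, step)[59]            # progress 10 steps after change
    shake = abs(track(alpha, chatter)[-1])  # how much chatter leaks through
    print(f"alpha={alpha}: tracked {lag:.2f} of the change, shake {shake:.2f}")
```

The sluggish smoother barely notices the disturbance and barely notices the real change; the responsive one follows both. You cannot tell, from inside the loop, which movements are disturbances and which are news.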
For real systems, a lot of the design considerations are going to be around giving yourself enough gain and phase margin so that you’ve got an envelope of safety around the testing you’re able to do. Think of it like the factor of safety used in construction. The bridge is built to be say, 5x stronger than it needs to be. For this reason, and contrary to claims made on this forum, real systems are not engineered to the theoretical limit of performance or “efficiency”.
Bob is over-optimizing towards higher performance (“faster arrival”) solutions that have increasingly higher risks of catastrophic failure (“death due to crashes from violating speed limits”).
Lack of phase margin is also what stops “I will simply control the robot over the network” ideas from working—if the phase margin is insufficient, the delay incurred over the network will make it impossible for the remote actuators to be controlled in response to disturbances with any degree of accuracy.
This was such a good read, I made an account to say that it should be a post in and of itself.
This example gave me a big aha about left/right political divides.
“One way to think about this is the following thought experiment. You have a tradeoff between how well you can track a setpoint and how well you can reject disturbances. If you make it very difficult to knock a system off a setpoint, it’ll reject disturbances well. However, a change in that setpoint might also look like a disturbance, and the system will be similarly sluggish to respond.”
Spitballing / overgeneralizing:
Maybe the right could be seen as the part of society better at rejecting disturbances, and the left the side that’s better at tracking changes in the set point.
Makes sense of why conservative areas often seem to be more stable (and why most cultures have all these weird, unnecessary taboos—they’re over-rejecting), and why the left tends to be better at art, and most high performance cities are left leaning (they’re tracking the set point), but also generally less stable (they’re overly responsive).
Productivity and akrasia are neighbouring valleys in a bistable system. If you’re productive, you can keep up behaviour which lets you continue to be productive (e.g. get your tasks done, sleep well, exercise). If you seem to be behind on your tasks one day, it stresses you out a little, so you put some extra effort in to return to equilibrium. But if you’re too far behind one day, your stress level shoots through the roof, so you put in a lot of extra effort, so you sleep less, so you have less effort to put in, so your stress level increases, and either you persevere gloriously because you tried really hard, or you fall apart. Make an ill-advised bet and you end up in the akratic equilibrium, and climbing back up will be rough.
But putting in extra effort is not the only response you have in order to decrease stress (sometimes). You can also give up on some of your plans and prioritise within what you can manage. Throwing your plans overboard gives you no chance of success, but it could make your productivity loop more robust. This has to be managed against the risk of degrading the strength of your habits, however. You’re a finely-tuned multidimensional control system, and there are pitfalls in every direction.
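The two-valley picture can be made literal with a toy one-variable system (entirely my invention, obviously not a model of anyone’s psychology): two stable equilibria, “akratic” at 0 and “productive” at 1, separated by an unstable tipping point at 0.5.

```python
# Cubic toy dynamics: stable points at 0 and 1, a tipping point at 0.5.
def settle(x0, steps=500, dt=0.1):
    x = x0
    for _ in range(steps):
        x += dt * (-x * (x - 0.5) * (x - 1.0))  # pulls toward nearest valley
    return x

print(round(settle(0.6), 2))  # small slip, still above threshold: back to 1.0
print(round(settle(0.4), 2))  # slip below the threshold: slides down to 0.0
```

Small perturbations decay back to whichever valley you started in; only a push past the threshold flips the system, which is what makes the ill-advised bet so costly.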
The Pygmalion effect is a psychological phenomenon in which high expectations lead to improved performance in a given area.
It always takes longer than you expect, even when you take into account Hofstadter’s Law.
Work expands so as to fill the time available for its completion.
The demand upon a resource tends to expand to match the supply of the resource.
When do you throw out luggage? When do you let out steam? If propositional attitudes are part of your control loop, how do you consciously manage it so conscious management doesn’t interfere with the loop? Without resorting to model-dissolving outside-view perspectives, I mean.
Modest epistemology and hubris are bistable as well. You need hubris in order to have the self-confidence required to produce anything worthwhile. Grr, need a better word for hubris.
https://en.wikipedia.org/wiki/Control_theory
https://www.goddardconsulting.ca/pid-control.html
https://www.sciencedirect.com/science/article/pii/B9780750646376500137