Perhaps it would be helpful to provide some examples of how closed-loop AI optimization systems are used today; this may illuminate the negative consequences of a generalized policy restricting their implementation.
The majority of advanced process manufacturing systems use some form of closed-loop AI control (Model Predictive Control) that incorporates neural networks for state estimation, and often neural nets for inferring the dynamics of the process (how a change in a manipulated variable leads to a change in a target control variable, and how those changes evolve over time). The ones that don't use neural nets use some form of symbolic regression that can handle high dimensionality, non-linearity, and multiple competing objective functions.
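For concreteness, here is a minimal sketch of that pattern in Python. Everything in it is hypothetical: the dynamics model is a stand-in for a trained neural net, the parameters are made up, and the brute-force search stands in for a real optimizer.

```python
# Minimal sketch of neural-net-assisted MPC (hypothetical model and parameters).

def dynamics_model(state, move):
    """Stand-in for a learned dynamics model: predicts the next value of the
    controlled variable given the current state and a change (move) applied to
    the manipulated variable. In practice this would be a trained neural net
    or a symbolic-regression model."""
    gain, decay = 0.8, 0.95  # hypothetical process gain and inertia
    return decay * state + gain * move

def mpc_step(state, setpoint, horizon=10):
    """Pick the move whose predicted trajectory tracks the setpoint best over
    the horizon. (Holding the move constant over the horizon and searching a
    grid of candidates keeps the sketch short; a real MPC solves a constrained
    optimization here.)"""
    candidate_moves = [i / 10.0 - 1.0 for i in range(21)]  # -1.0 .. +1.0
    best_move, best_cost = 0.0, float("inf")
    for move in candidate_moves:
        s, cost = state, 0.0
        for _ in range(horizon):
            s = dynamics_model(s, move)
            cost += (s - setpoint) ** 2  # objective: squared tracking error
        if cost < best_cost:
            best_move, best_cost = move, cost
    return best_move

# The closed loop: measure, optimize, actuate, repeat.
state, setpoint = 0.0, 5.0
for _ in range(20):
    move = mpc_step(state, setpoint)
    state = dynamics_model(state, move)  # the plant responds (simulated here)
```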
These systems have been in place since the mid-90s (and are in fact one of the earliest commercial applications of neural nets; check the patent history).
Self-driving cars, autonomous mobile robots, unmanned aircraft, and so on: all of these are closed-loop AI optimization systems. Even advanced HVAC systems, ovens, and temperature control systems adopt these techniques.
These systems are already constrained, in the sense that limitations are imposed on the degree and magnitude of adaptation that is allowed to take place. For example, the rate, direction, and magnitude of a change to a manipulated variable are constrained by upper and lower control limits, and by other factors that account for safety and robustness.
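A rough sketch of how such limits are typically enforced on a single manipulated variable; the limit values and names here are hypothetical, not taken from any particular system.

```python
# Sketch of the constraint handling described above: the optimizer's requested
# change is clipped by rate-of-change and absolute limits before it is ever
# written to the actuator. All limit values are hypothetical.

MV_LOW, MV_HIGH = 0.0, 100.0  # absolute bounds on the manipulated variable
MAX_STEP_PER_CYCLE = 2.5      # rate limit: largest allowed change per control cycle

def constrain_move(current_mv: float, requested_mv: float) -> float:
    """Return the value the controller is actually allowed to apply."""
    # 1. Limit the direction and magnitude of the change per cycle.
    step = max(-MAX_STEP_PER_CYCLE, min(MAX_STEP_PER_CYCLE, requested_mv - current_mv))
    # 2. Keep the result within the upper and lower control limits.
    return max(MV_LOW, min(MV_HIGH, current_mv + step))

# The optimizer asks for a large jump, but only a bounded move is applied.
applied = constrain_move(current_mv=40.0, requested_mv=90.0)  # -> 42.5
```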
To determine whether (or how) rules around ‘human in the loop’ should be enforced, we should start by acknowledging how control engineers have solved similar problems in applications that are already ubiquitous in industry.
I will propose a slight modification to the definition of closed-loop that was offered, not to be pedantic, but to help align the definition with the risks proposed.
A closed-loop system generally incorporates inputs; an arbitrary function that translates inputs into outputs (such as a model or agent); the outputs themselves; and some evaluation of the outputs' efficacy against defined objectives. This evaluation might be referred to as a loss function, cost function, utility function, reward function, or objective function; let's just call it the evaluation.
The defining characteristic of a closed-loop system is that this evaluation is fed back into the input channel, not merely the raw output of the function.
An LLM that produces outputs which are ultimately fed back into the context window as input is merely an autoregressive system, not necessarily a closed-loop one. In the case of chatbot LLMs and similar systems, there isn't necessarily an evaluation of the outputs' efficacy being fed back into the context window to control the system's behavior against a defined objective; these systems are simply autoregressive.
For a closed-loop AI to modify its behavior without a human-in-the-loop training process, its model/function will need to operate directly on the evaluation of its prior performance, and will require an inference-time objective function of some sort to guide that evaluation.
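The distinction can be made concrete with a small sketch; generate() and evaluate() below are hypothetical stand-ins, not real library calls. The first loop only feeds outputs back as inputs; the second also feeds back an evaluation of prior performance against an inference-time objective and lets that evaluation modify the system's behavior, which is what makes it closed-loop in the control sense.

```python
# Sketch of autoregressive feedback versus closed-loop feedback.
# generate() and evaluate() are hypothetical stand-ins for a model and an
# inference-time objective function.

def generate(context, behavior):
    """Stand-in for the model/function: produces an output from its inputs."""
    return f"output given {len(context)} prior outputs at setting {behavior:.2f}"

def evaluate(output, objective):
    """Inference-time objective function: scores the output's efficacy
    (here, a made-up score based on output length)."""
    return -abs(len(output) - objective)

# Autoregressive loop: outputs are fed back as inputs, but nothing scores them.
context = []
for _ in range(3):
    out = generate(context, behavior=1.0)
    context.append(out)  # output -> input, no evaluation in the loop

# Closed loop: the evaluation of prior performance is fed back, and the system
# adjusts its own behavior against the objective, with no human in the loop.
context, behavior, objective = [], 1.0, 40
for _ in range(3):
    out = generate(context, behavior)
    score = evaluate(out, objective)  # evaluation of the output's efficacy
    behavior += 0.1 * score           # behavior modified directly by the evaluation
    context.append(out)
```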
A classic example of closed-loop AI is the 'system 2' functionality that LeCun describes in his Autonomous Machine Intelligence paper (effectively Model Predictive Control):
https://openreview.net/pdf?id=BZ5a1r-kVsf