In a sense, yes.
Although in control theory open- vs. closed-loop is a binary property of a system, there’s a sense in which some systems are more closed than others: more information is fed back as input, and that information is used more extensively. Memoryless LLMs have less capacity to respond to feedback, which I think makes them safer because it reduces their opportunities to behave in unexpected ways outside the training distribution.
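To make the distinction concrete, here’s a minimal Python sketch. The `llm` function is a stand-in, not a real model call; the point is only what goes into its context. The open-loop version is memoryless, while the closed-loop version feeds its own outputs and observed feedback back in each turn:

```python
def llm(context: str) -> str:
    """Stand-in for a real model call; only the contents of `context` matter here."""
    return f"response to: {context[-40:]}"

# Open loop: memoryless. Each call sees only the current prompt, so the
# model cannot condition on the results of its earlier outputs.
def open_loop(prompts: list[str]) -> list[str]:
    return [llm(p) for p in prompts]

# Closed loop: the transcript (outputs plus observed feedback) is appended
# to the context each turn, so every call can respond to what came before.
def closed_loop(prompts: list[str], observe) -> list[str]:
    context, outputs = "", []
    for p in prompts:
        context += f"\nuser: {p}"
        out = llm(context)
        context += f"\nmodel: {out}\nfeedback: {observe(out)}"
        outputs.append(out)
    return outputs

if __name__ == "__main__":
    prompts = ["step 1", "step 2"]
    print(open_loop(prompts))
    print(closed_loop(prompts, observe=lambda out: f"env saw {len(out)} chars"))
```

In this framing, “more closed” just means more of the transcript and environment feedback ends up in the context, and the model leans on it more heavily.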
This is a place where the simple open- vs. closed-loop distinction becomes less useful: we have to get into implementation details to understand what an AI actually does. Nevertheless, I’d suggest that a policy of minimizing the amount of feedback an AI is allowed to respond to would make AI marginally safer for us to build.