I agree that the ideas are similar, and that CAIS is probably safer than AI with translucent thoughts (by default).
What I wanted to do here is to define a set of AIs broader than CAIS and Open Agents, because I think the current trajectory of AI does not point towards strict open agents, i.e. small dumb AIs trained / fine-tuned independently, used jointly, and doing small bounded tasks (for example, current systems do not generate 10,000 tokens to decide what to do next, then launch 5 API calls and a python script, and then prompt a new LLM instance on the result of those calls).
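To make concrete the kind of bounded-task pipeline I mean by "strict open agents", here is a minimal sketch; the LLM and tool calls are stubbed placeholders, not any real API:

```python
# Minimal sketch of a "strict open agent" workflow: one bounded generation to
# decide what to do, a handful of tool calls, then a fresh LLM instance
# prompted only on the results. All calls below are hypothetical stubs.

def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call."""
    return f"[model output for: {prompt[:40]}...]"

def call_tool(action: str) -> str:
    """Stub standing in for an API call or a python script."""
    return f"[result of: {action}]"

def run_bounded_task(task: str) -> str:
    # One bounded generation decides what to do next.
    plan = call_llm(f"Task: {task}\nList the API calls / scripts to run, one per line.")
    # Each action is a small, bounded tool call.
    results = [call_tool(line) for line in plan.splitlines() if line.strip()]
    # A *new* LLM instance sees only the results, not the first instance's context.
    return call_llm("Given these results, produce the final answer:\n" + "\n".join(results))

print(run_bounded_task("book a meeting room for Tuesday"))
```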
AI with translucent thoughts would include other kinds of systems that I think are probably much more competitive, like systems of LLMs trained jointly using RL / your favorite method, or LLMs producing long generations with a lot of “freedom” (to such an extent that considering them safe microservices would not apply).
I think there is a 20% chance that the first AGIs have translucent thoughts. I think there is a 5% chance that they are “strict open agents”. Do you agree?
No. I think that 90% or more of the AIs used to do useful things, at a level competitive with human professionals, will be stateless open agents.
There is simply not a feasible way to get sufficient reliability without this.
This means that yes, you might use an LLM to interpret what a human wants to do, fill out paperwork, etc.
But for a task like ‘validate this bridge design’ or ‘diagnose this patient’ or ‘remove the brake module from this car and install a replacement’, you need many 9s of reliability.
Open agents have a human-readable file stating what they are there to accomplish and a series of state frames, and they are completely deterministic, relying on nothing but the pair (input frames, file instructing them what to do) to make their decisions.
Note that I have added the idea of ‘stateless microservices’, which essentially all large software systems today use (every reliable online service I know of uses them), onto “open agents”, based on my years of debugging complex systems.
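To make that contract concrete, here is a minimal sketch of what I mean by a stateless open agent, under my own assumptions; the names and types are illustrative, not a real system:

```python
# Sketch of the "stateless open agent" contract: the decision is a
# deterministic function of the human-readable instruction file plus the
# input frames, with no hidden state surviving between calls.

from dataclasses import dataclass

@dataclass(frozen=True)
class Frame:
    """One observable state frame handed to the agent."""
    timestamp: float
    payload: str  # human-readable contents only

def decide(instruction_file: str, frames: tuple[Frame, ...]) -> str:
    """Deterministic: same (instruction file, input frames) in, same action out.
    A real system would call a model here; the point is the signature."""
    latest = frames[-1].payload if frames else ""
    return f"action derived from instructions={instruction_file!r} and latest frame={latest!r}"

# Because there is no internal state, any call can be logged, replayed, and
# debugged in isolation, exactly like a stateless microservice request.
frames = (Frame(0.0, "bridge load case A"), Frame(1.0, "bridge load case B"))
print(decide("validate this bridge design", frames))
```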
You are just not going to get it reliable enough to even ship the first time if you don’t have such a tight architecture. How can you debug a “real” general intelligence that works like a human? You can’t: the machine has “thoughts”, changes over time, has its own goals, and so on and so forth. Infeasible.
You will not get to the point of boxing said systems because they aren’t usable in production to begin with.
Note that all AI systems, except perhaps a few tiny research toys, are currently stateless. Robotics control systems, like those for autonomous cars, are also already built as open agency systems, and again, this is true of all of them I am aware of. So this is not a stretch or a big effort; it’s what the engineers are already doing.
Mmh, interesting, I hadn’t thought of it from the need-for-robustness angle.
Do you think it matters because LLMs are inherently less robust than humans, and therefore you can’t just replace humans with general-ish things? Some companies do work as a combination of micro-entities which are extremely predictable and robust, and the more predictable/robust the better. Do you think that every entity that produces value has to follow this structure?
I disagree with what you said about statelessness, because the AIs with translucent thoughts I describe are mostly stateless. The difference between CAIS & AI with translucent thoughts is not the possibility of a state; it’s the possibility of joint training & long “free” text generations, which make hidden coordination & non-micro thinking possible.
That free text generation is state buildup, and state buildup has been the cause of most software failures since the beginning.
Current LLMs are fairly safe because their only state is a single human-readable buffer. The danger comes if we start hiding memory cells in network layers, where unexaminable values are stored and used from frame to frame in a hidden way (to maximize reward).
When you say translucent thoughts, you are proposing that we ourselves structure exactly WHAT the machine can store from frame to frame, and that we can validate this, for example by having a different model “pick up” from a frame and complete the task.
If task performance drops because the stored data wasn’t formatted correctly (the AI hijacked the bits to store something else), we can automatically measure this and take action.
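As a minimal sketch of that check (the alternate model, scorer, and threshold below are hypothetical stand-ins, not a real benchmark):

```python
# Frame-handoff check: a different model "picks up" from a stored frame and
# finishes the task; if performance drops below a threshold, flag the frame
# as possibly hijacked to carry hidden information.

from typing import Callable

def handoff_check(
    frame: str,
    continue_task: Callable[[str], str],    # alternate model completing the task from the frame
    score: Callable[[str], float],          # task-performance metric in [0, 1]
    threshold: float = 0.95,
) -> bool:
    """Return True if another model can pick up the frame and still perform well."""
    completion = continue_task(frame)
    return score(completion) >= threshold

# Usage with trivial stand-ins:
alternate_model = lambda frame: frame.upper()           # pretend completion
scorer = lambda completion: 1.0 if completion else 0.0  # pretend metric
if not handoff_check("step 3: torque the caliper bolts to spec", alternate_model, scorer):
    print("frame failed handoff check; investigate possible hidden encoding")
```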