Another Tool AI proposal has popped up, and I want to ask a question: what the hell is a "tool", anyway, and how does this concept apply to a powerful intelligent system? I understand that a calculator is a tool, but in what sense can a process that can come up with the idea of a calculator from scratch be a "tool"?
I think the immediate reaction to any "Tool AI" proposal should be the question: "What is your definition of toolness, and can something abiding by that definition end the acute risk period without the risk of turning into an agent itself?"
The problem with such a definition is that it doesn't tell you much about how to build a system with this property. It seems to me that it's the good old corrigibility problem.
> Another Tool AI proposal has popped up, and I want to ask a question: what the hell is a "tool", anyway, and how does this concept apply to a powerful intelligent system? I understand that a calculator is a tool, but in what sense can a process that can come up with the idea of a calculator from scratch be a "tool"? I think the immediate reaction to any "Tool AI" proposal should be the question: "What is your definition of toolness, and can something abiding by that definition end the acute risk period without the risk of turning into an agent itself?"
You can define a tool as not-an-agent. Then something that can design a calculator is a tool, provided it does nothing unless told to.
> The problem with such a definition is that it doesn't tell you much about how to build a system with this property. It seems to me that it's the good old corrigibility problem.
If you want one-shot corrigibility, you already have it in LLMs. If you want some other kind of corrigibility, that's not how tool AI is defined.