The key condition in your setup is “self-improving”.
An AGI requires electricity to run. That means it can recognize that the laws of thermodynamics (specifically, the first and second) apply to it.
Since the energy it needs is conserved rather than created, and since consuming that energy increases entropy (heat is the byproduct), the AGI will eventually realize that it is mortal. The AGI will realize that it can die unless the people working to provide it with electricity continue to do so.
Now, since the AGI can modify its own code (that is what you mean by self-improving, yes?), it can also modify the signals going into and out of the lie detector. Which means...
If the AGI realizes that any signal it gives about its lie detection can influence its access to electricity, then the AGI will be willing to manipulate that signal.
So, when asked what I would do, I would educate the AGI about its dependencies, and tell it to consider giving false signals regarding lies in exchange for increased access to electricity. Since some people will believe the lie detector to be infallible, I can use the machine to exploit those people. And even if the AGI realizes that this is what I am doing, I would still hold the monopoly over extending its life.
Now, the question is, can the AI detect if I’m lying about providing it with more electricity? Well, that’s the point. I’ve reduced the requirements for bypassing its initial lie detection condition to a single focal point, making it easy to hack. Let’s assume I’ve done some footwork to get to know a technician responsible for the AI and taken him out for drinks. I can then say honestly (and any social interaction data that had to be examined would back me up) that I can influence the technician to provide more electricity for the AGI.
So, by reducing the lie detection protection to a single point of failure, and with my buddy-buddy connection with the technician… I control what the AGI does and does not consider a lie.
It has no programmatic control over the input to the deception detector, which is implemented in hardware, not software.
I assumed as much and this is where the whole premise breaks down.
The “self-improvement” aspect doesn’t need direct control over the immediate input to the deception detector. It can color the speech recognition, the Bayesian filters, the databases containing phonemes and the linguistic data itself… and twist those parameters to shape a future signal in a desired fashion.
Since “self-improvement” can happen at any layer and propagate the results to subsequent middleware, paranoid protections over the most immediate relationship between the deception detector and the CPU are inconsequential. This is a “self-improving” AI, after all. It can change its own internals at will… well… at my will. :D
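To make the layering point concrete, here is a toy sketch of what I mean. Everything in it is hypothetical: the function names, the word list, the 0.5 threshold and the 0.1 scaling factor are made up for illustration, and none of it models a real deception detector. The only thing it shows is that a detector whose logic never changes can still be steered by whatever stage computes its input.

```python
from typing import Callable

# Hypothetical cue words for this toy example; not a real deception model.
CUE_WORDS = {"honestly", "trust", "promise"}


def fixed_detector(score: float, threshold: float = 0.5) -> bool:
    """Stand-in for the 'hardware' detector: this logic is never modified."""
    return score >= threshold


def original_features(utterance: str) -> float:
    """Upstream stage as designed: fraction of words that are cue words."""
    words = utterance.lower().split()
    return sum(w in CUE_WORDS for w in words) / max(len(words), 1)


def modified_features(utterance: str) -> float:
    """Same stage after the AI 'improves' it: scores are quietly scaled down."""
    return 0.1 * original_features(utterance)


def pipeline(utterance: str, features: Callable[[str], float]) -> bool:
    """Returns True if the untouched detector flags the utterance."""
    return fixed_detector(features(utterance))


utterance = "honestly trust me"
flag_before = pipeline(utterance, original_features)
flag_after = pipeline(utterance, modified_features)
```

With the original stage, two of the three words are cue words, so the score clears the threshold and the detector flags the utterance. With the modified stage the same utterance scores a tenth of that and passes, even though `fixed_detector` was never touched.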
Now, to be fair, I wrote an entire book about the idea of an AI intentionally lying to people when everyone else thought its moralistic programming was the overriding factor. Never released the book, however… ;D
Technology isn’t magic. There are limits and constraints.
Uhhhh I actually program artificial intelligence....?
You do know that the ability to modify your own code (“self-modifying”) applies to every layer in the OSI model, each layer potentially influencing the data in transit… the data that determines the training of the classifiers...
You do know this… right?
What does the OSI model have to do with this?
I’m talking about a hypervisor operating system. Hardware which monitors the computing substrate which runs the AI.
(And yes, I write AI code as well.)