[Public Draft v0.0] AGI: The Depth of Our Uncertainty
[The intent is for this to become a post making a solid case for why our ignorance about AGI implies near-certain doom, given our current level of capability:alignment efforts.]
[I tend to write lots of posts which never end up being published, so I’m trying a new thing where I will write a public draft which people can comment on, either to poke holes or contribute arguments/ideas. I’m hoping that having any engagement on it will strongly increase my motivation to follow through with this, so please comment even if just to say this seems cool!]
[Nothing I have planned so far is original; this will mostly be exposition of things that EY and others have said already. But it would be cool if thinking about this a lot gives me some new insights too!]
Entropy is Uncertainty
Given a model of the world, there are many possible states of the world consistent with that model, and the model implies a distribution over them.
There is a mathematically inevitable way to quantify the uncertainty latent in such a model, called entropy.
A model is subjective in the sense that it is held by a particular observer, and thus entropy is subjective in this sense too. [Obvious to Bayesians, but worth spending time on as it seems to be a common sticking point]
This is in fact the same entropy that shows up in physics!
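To make "uncertainty" concrete, here is a minimal sketch (in Python, with a made-up belief distribution) of the entropy of a model's distribution over possible states; switching to the natural log and multiplying by Boltzmann's constant recovers the thermodynamic convention.

```python
import math

def entropy(dist, base=2):
    """Shannon entropy of a probability distribution, in bits by default."""
    return -sum(p * math.log(p, base) for p in dist.values() if p > 0)

# A toy model: four possible microstates the observer hasn't ruled out.
beliefs = {"state_a": 0.5, "state_b": 0.25, "state_c": 0.125, "state_d": 0.125}
print(entropy(beliefs))  # 1.75 bits of uncertainty

# Thermodynamic entropy is the same quantity in different units:
# use the natural log and multiply by Boltzmann's constant.
k_B = 1.380649e-23  # J/K
print(k_B * entropy(beliefs, base=math.e))  # the same uncertainty, in J/K
```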
Engine Efficiency
But wait, that implies that temperature (defined from entropy) is subjective, which is crazy! After all, we can measure temperature with a thermometer. Or define it as the average kinetic energy of the particles (in a monatomic gas; in other cases you also need the potential energy stored in the bonds)! Both of those are objective in the sense of not depending on the observer.
That is true, but those are slightly different notions of temperature. The objective measurement is the one that matters for determining whether something will burn your hand, and thus is the one which the colloquial sense of temperature tracks. But the definition via entropy is actually more useful, and it's more useful precisely because we can wring some extra advantage from the fact that it is subjective.
And that's because it is this notion of temperature which governs the use of an engine. Without the subjective definition, we merely get the limit on heat engines. As a simple intuition, suppose you happen to know that the molecules in your heat source aren't just moving randomly: they are predominantly oscillating back and forth along a particular axis at a specific frequency. A thermometer attached to this source may read the same as one attached to an ordinary heat source with the same amount of energy (mediated by phonon dissipation), and yet it would be simple to build an engine on this "heat source" that exceeds the Carnot limit, simply by using a non-heat engine which takes advantage of the vibrational mode!
Say that this vibrational mode were hidden or hard to notice. Then someone who knew about it would be able to build a more effective engine, and therefore extract more work, than someone who hadn't noticed it.
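To put a number on the intuition, here is a sketch (with made-up temperatures) of why the Carnot bound only constrains engines that treat the source as generic heat. If the "hot" reservoir reads the same on a thermometer as the cold one, Carnot efficiency is zero, yet energy stored in a single ordered vibrational mode is, in principle, extractable by a mechanism tuned to that mode.

```python
def carnot_efficiency(t_hot, t_cold):
    """Maximum fraction of heat convertible to work by any heat engine."""
    return 1.0 - t_cold / t_hot

# Treating the vibrating source as generic heat at room temperature:
print(carnot_efficiency(t_hot=300.0, t_cold=300.0))  # 0.0 -- no work available

# Treating it as a slightly warmer reservoir helps only marginally:
print(carnot_efficiency(t_hot=310.0, t_cold=300.0))  # ~0.03

# But ordered motion along one axis is not heat at all: a device coupled to
# that mode (a piston, a resonator) is not a heat engine, and the Carnot
# bound above simply does not apply to it.
```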
Another example is Maxwell’s demon. In this case, the demon has less uncertainty over the state of the gas than someone at the macro-level, and is thereby able to extract more work from the same gas.
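The demon's advantage can be made quantitative. In the Szilard-engine picture, each bit of information about the gas (e.g. which half of the box a molecule occupies) is worth at most k_B·T·ln 2 of extractable work. A quick illustrative calculation, with the temperature chosen arbitrarily:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def work_per_bits(temperature, bits):
    """Upper bound on work extractable from `bits` of information about
    a system in contact with a bath at `temperature` (Szilard/Landauer)."""
    return k_B * temperature * math.log(2) * bits

# Knowing which half of the box one molecule occupies, at room temperature:
print(work_per_bits(300.0, 1))  # ~2.9e-21 J

# A demon tracking a mole's worth of molecules, one bit each:
print(work_per_bits(300.0, 6.022e23))  # ~1.7 kJ -- macroscopically relevant
```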
But perhaps the real power of this subjective notion of temperature comes from the fact that the Carnot limit still applies with it, but now generalized to any kind of engine! This means that there is a physical limit on how much work can be extracted from a system which directly depends on your uncertainty about the system!! [This argument needs to actually be fleshed out for this post to be convincing, I think...]
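One way the fleshed-out argument might go (a sketch of the standard nonequilibrium free-energy bound, stated informally): for a system coupled to a bath at temperature T, the work an agent can extract as the system relaxes to equilibrium is bounded by the drop in free energy computed with the agent's own distribution p over microstates.

```latex
W_{\text{max}} \;=\; F(p) - F_{\text{eq}},
\qquad
F(p) \;=\; \langle E \rangle_p - T\,S(p),
\qquad
S(p) \;=\; -k_B \sum_x p(x)\,\ln p(x).
```

Since F_eq is a property of the system and bath alone, the only observer-dependent term is S(p): shrink your uncertainty, and the ceiling on extractable work rises.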
The Work of Optimization
[Currently MUCH rougher than the above...]
Hopefully now, you can start to see the outlines of how it is knowable that
Try to let go of any intuitions about “minds” or “agents”, and think about optimizers in a very mechanical way.
Physical work is about the energy necessary to change the configuration of matter.
Roughly, you can factor an optimizer into three parts: The Modeler, the Engine, and the Actuator. Additionally, there is the Environment the optimizer exists within and optimizes over. The Modeler models the optimizer’s environment—decreasing uncertainty. The Engine uses this decreased uncertainty to extract more work from the environment. The Actuator focuses this work into certain kinds of configuration changes.
[There seems to be a duality between the Modeler and the Actuator which feels very important.]
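To keep the framing mechanical, here is a minimal sketch of the factoring (all names hypothetical, purely to fix the shape of the claim): the Modeler shrinks the distribution over environment states, the Engine converts that reduced uncertainty into work, and the Actuator spends the work on particular configuration changes.

```python
class Optimizer:
    """Toy decomposition of an optimizer into Modeler -> Engine -> Actuator.
    Purely illustrative; none of these are real interfaces."""

    def __init__(self, modeler, engine, actuator):
        self.modeler = modeler
        self.engine = engine
        self.actuator = actuator

    def step(self, environment):
        # 1. The Modeler observes and narrows the distribution over states.
        beliefs = self.modeler.update(environment)
        # 2. The Engine exploits the reduced uncertainty to extract work.
        work = self.engine.extract_work(environment, beliefs)
        # 3. The Actuator spends that work on specific configuration changes.
        self.actuator.apply(environment, work)
        return environment
```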
Examples:
Gas Heater
It is the implicit knowledge of the location, concentration, and chemical structure of the natural gas line that allows the conversion of the natural gas and the air in the room from a state where both are at the same low temperature to a state where the air is at a higher temperature and the gas has been burned.
-- How much work does it take to heat up a room?
-- How much uncertainty is there in the configuration state before and after combustion?
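For the first question, a rough back-of-the-envelope estimate of the energy involved (room size, temperature rise, and air properties are all assumed for illustration):

```python
# Rough energy needed to warm the air in a smallish room by 10 K.
# All numbers are illustrative assumptions.
room_volume = 4.0 * 5.0 * 2.5   # m^3
air_density = 1.2               # kg/m^3 at roughly room conditions
specific_heat_air = 1005.0      # J/(kg*K), at constant pressure
delta_T = 10.0                  # K

mass_of_air = air_density * room_volume
energy = mass_of_air * specific_heat_air * delta_T
print(f"{energy / 1000:.0f} kJ")  # ~600 kJ, ignoring walls and losses
```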
This brings us to an important point. A gas heater still works with no one around to be modeling it. So how is any of the subjective entropy stuff relevant? Well, from the perspective of no one, the room is simply in one of a plethora of possible states before, and it is in another of those possible states after, just like any other physical process anywhere. It is only because we find it somehow relevant that the room is hotter after than before that thermodynamics comes into play. The universe doesn't need thermodynamics to make atoms bounce around; we need it to understand the change, and even to recognize it as an interesting difference.
Thermostat
Bacterium
Natural Selection
Chess Engine
Human
AI
Why Orthogonality?
[More high level sections to come]