Alright, it’s been more than a “little” bit (new baby, haven’t had a ton of time), and this is not as complete a reply as I was hoping to write, but here goes:
> (A) Physical reality is probably hyper-computational
My impression is almost the opposite—physical reality seems not only to contain a finite amount of information and have a finite capacity for processing that information, but on top of that the “finite” in question seems surprisingly small. Specifically, the entropy of the observable universe seems to be in the ballpark of 10^124 bits (with c = 3×10^8 m/s and r_horizon = 4.4×10^26 m, the horizon area is A = 4πr² ≈ 2.4×10^54 m²; the Planck length is l_p = 1.6×10^−35 m, so the Bekenstein–Hawking entropy is S_BH = A/(4·l_p²) ≈ 2.4×10^123 nats ≈ 3.4×10^123 bits). For context, the best estimate I’ve seen is that the total amount of data stored by humanity is in the ballpark of 10^23 bits. If data storage keeps increasing exponentially, we’re already about a fifth of the way (in the exponent) to “no more data storage capacity in the universe”. And similarly, Landauer’s principle gives pretty tight bounds on computational capacity (something on the order of 10^229 bit erasures as an upper bound, if my math checks out).
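As a sanity check, the horizon-entropy arithmetic can be reproduced in a few lines (a rough back-of-the-envelope sketch; the radius and Planck length are the rounded values quoted above, and the formula includes the standard 1/(4·l_p²) factor):

```python
import math

# Back-of-the-envelope check of the horizon-entropy numbers quoted above.
r_horizon = 4.4e26   # radius of the observable universe, meters (rounded)
l_planck = 1.6e-35   # Planck length, meters (rounded)

area = 4 * math.pi * r_horizon**2      # horizon area, m^2
s_nats = area / (4 * l_planck**2)      # Bekenstein-Hawking entropy, nats
s_bits = s_nats / math.log(2)          # nats -> bits

print(f"A    = {area:.2e} m^2")
print(f"S_BH = {s_nats:.2e} nats = {s_bits:.2e} bits")
```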
So the numbers are large, but not “you couldn’t fit the number on a page if you tried to write it out” large.
> but also probably amenable to pulling a nearly infinite stack of “big salient features” from a reductively analyzable real world situation
If we’re limiting it to “big salient features”, I expect the number of features one actually cares about in a normal situation to be pretty small. Deciding which features are relevant, though, can be nontrivial.
> What I’m saying is: I think maybe NORMAL human values (amongst people with default mental patterns rather than weirdo autists who try to actually be philosophically coherent and ended up with utility functions that have coherently and intentionally unbounded upsides) might well be finite, and a rule for granting normal humans a perceptually indistinguishable version of “heaven” might be quite OK to approximate with “a mere few billion well chosen if/then statements”.
I don’t disagree that a few billion if/then statements are likely sufficient. There’s a sense in which e.g. the MLP layers of an LLM are just doing a few tens of millions of “if residual > value in this direction then write value in that direction, else noop” operations, which might be using clever high-dimensional-space tricks to emulate doing a few billion such operations, and they are able to answer questions about human values in a sane way, so it’s not like those values are intractable to express.
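To make that framing concrete, here is a toy numpy sketch (purely illustrative: random directions and a single ReLU layer, nothing like a trained model) of “each MLP neuron is an if/then on the residual stream”:

```python
import numpy as np

# Caricature of one transformer MLP layer as a pile of if/then statements:
# each hidden neuron reads the residual stream along one direction and, if
# the projection clears zero, writes a second direction back in (else noop).
rng = np.random.default_rng(0)
d_model, n_neurons = 16, 64

read_dirs = rng.standard_normal((n_neurons, d_model))   # "if residual > value in this direction"
write_dirs = rng.standard_normal((n_neurons, d_model))  # "then write value in that direction"
residual = rng.standard_normal(d_model)

activations = np.maximum(read_dirs @ residual, 0.0)     # ReLU is the "if ... else noop"
residual = residual + activations @ write_dirs          # firing neurons write their directions

print(residual.shape)
```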
The rules for a policy that achieves arbitrarily good outcomes according to that relatively-compactly-specifiable value system, however, might not admit “a few billion well-chosen if/then statements”. For example, it’s pretty easy to specify “earning the block reward for the next Bitcoin block would be good”, but no few billion if/then statements will tell you which nonce to choose so that the hash of the resulting block has enough leading zeros to earn that reward.
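A toy version of the asymmetry, with a made-up header and an artificially low difficulty (16 leading zero bits rather than anything like Bitcoin’s real target): checking that an outcome is good takes one line, but producing the outcome is brute-force search.

```python
import hashlib

# "Which outcome is good" is compact; the policy that achieves it is not.
def is_good_outcome(header: bytes, nonce: int, zero_bits: int = 16) -> bool:
    """Check whether this nonce gives a hash with enough leading zero bits."""
    h = hashlib.sha256(header + nonce.to_bytes(8, "little")).digest()
    return int.from_bytes(h, "big") >> (256 - zero_bits) == 0

header = b"toy block header"
# No compact rule outputs the winning nonce; we just have to search.
nonce = next(n for n in range(1 << 32) if is_good_outcome(header, n))
print(nonce, is_good_outcome(header, nonce))
```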
If I had to condense it down to one sentence, it would be “the hard part of consequentialism is figuring out tractable rules which result in good outcomes when applied, not figuring out which outcomes are even good”.
> (B) A completely OTHER response here is that you should probably take care to NOT aim for something that is literally mathematically impossible...
Agreed. I would go further than “not literally mathematically impossible” and specify “and also not trying to find a fully optimal solution to an exponentially hard problem at large n”.
> Past a certain point, one can simply never be adversarially robust in a programmatic and symbolically expressible way.
Beautifully put. I suspect that “certain point” is one of those “any points we actually care about are way past this point” things.
> (C) Chaos is a thing! Even (and especially) in big equations, including the equations of mind that big stacks of adversarially optimized matrices represent!
“The boundary of neural network trainability is fractal” was a significant part of my inspiration for writing the above. Honestly, if I ever get some nice contiguous more-than-1-hour chunks of time, I’d like to write a post along the lines of “sometimes there are no neat joints separating clusters in thing-space” which emphasizes that point (my favorite concrete example: “predict which complex root of a cubic or higher polynomial Newton’s method will converge to from a given starting value” generates a fractal, rather than the Voronoi-diagram-looking thing you would naively expect).
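For the curious, the basins-of-attraction point can be sketched in a few lines for p(z) = z³ − 1 (the specific starting points below are arbitrary):

```python
import numpy as np

# Which of the three complex roots of z^3 - 1 does Newton's method reach
# from a given starting point? The basin boundaries are fractal, not the
# Voronoi cells around the roots that you might naively guess.
roots = np.exp(2j * np.pi * np.arange(3) / 3)  # cube roots of unity

def basin(z0: complex, steps: int = 50) -> int:
    z = z0
    for _ in range(steps):
        z = z - (z**3 - 1) / (3 * z**2)        # Newton step for z^3 - 1
    return int(np.argmin(np.abs(roots - z)))   # index of the nearest root

# Nearby starting points on a basin boundary can land on different roots:
print(basin(0.5 + 0.5j), basin(-0.5 + 0.5j))
```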
> It would be… kinda funny, maybe, to end up believing “we can secure a Win Condition for the Normies (because take A is basically true), but True Mathematicians are doomed-and-blessed-at-the-same-time to eternal recursive yearning and Real Risk (because take B is also basically true)” <3
I think this is likely true. Though I think there are probably not that many True Mathematicians once the people who have sought and found comfort in math find that they can also seek and find comfort outside of math.
> Quality over speed is probably maybe still sorta correct
Don’t worry, the speed was low but the quality was on par with it :)