When I saw the picture, I assumed she was the woman you described in one of your Bayesian conspiracy stories that you post here. But then, she was in a pink jumpsuit, and had, I think, blond hair.
Silas
@Daniel_Franke: I was just describing a sufficient, not a necessary condition. I’m sure you can ethically get away with less. My point was just that, once you can make models that detailed, you needn’t be prevented from using them altogether, because you wouldn’t necessarily have to kill them (i.e. give them information-theoretic death) at any point.
@Tim_Tyler:
The main problem with death is that valuable things get lost. Once people are digital, this problem tends to go away—since you can relatively easily scan their brains—and preserve anything of genuine value. In summary, I don’t see why this issue would be much of a problem.
I was going to say something similar, myself. All you have to do is constrain the FAI so that it’s free to create any person-level models it wants, as long as it also reserves enough computational resources to preserve a copy so that the model citizen can later be re-instantiated in their virtual world, without any subjective feeling of discontinuity.
However, that still doesn’t obviate the question. Since the FAI has limited resources, it would still have to know which models it must reserve preservation space for, in order to judge whether a model’s greater utility justifies the additional resources it requires. Then again, it could just accelerate the model so that the person lives out a full, normal life in their simulated universe, in which case they are irreversibly dead in their own world anyway.
Khyre: Setting or clearing a bit register regardless of what was there before is a one-bit irreversible operation (the other two one-bit input, one-bit output functions are constant 1 and constant 0).
face-palm I can’t believe I missed that. Thanks for the correction :-)
Anyway, with that in mind, Landauer’s principle has the strange implication that resetting anything to a known state, in such a way that the previous state can’t be retrieved, necessarily releases heat, and the more information that state conveyed to the observer, the more heat is released. Okay, end threadjack...
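(For anyone following along, here’s a minimal sketch of Khyre’s point, my own Python, not anything from the thread: of the four one-bit-in, one-bit-out functions, only two are invertible; the other two are the irreversible ones the Landauer discussion is about.)

```python
# The four functions from one bit to one bit, and whether each is reversible
# (i.e. invertible, so the input can be recovered from the output).
gates = {
    "identity":   lambda b: b,
    "NOT":        lambda b: 1 - b,
    "constant 0": lambda b: 0,
    "constant 1": lambda b: 1,
}

for name, f in gates.items():
    outputs = {f(0), f(1)}
    reversible = len(outputs) == 2  # invertible iff the two inputs map to distinct outputs
    print(f"{name:10s} -> outputs {sorted(outputs)}, reversible: {reversible}")
```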
I’m going to nitpick (mainly because of how much reading I’ve been doing about thermodynamics and information theory since your engines of cognition post):
Human neurons … dissipate around a million times the heat per synaptic operation as the thermodynamic minimum for a one-bit operation at room temperature. … it ought to be possible to run a brain at a million times the speed without … invoking reversible computing or quantum computing.
I think you mean neurons dissipate a million times the thermodynamic minimum for an irreversible one-bit operation at room temperature, though perhaps it was clear you were talking about irreversible operations from the next sentence. A reversible operation can be made arbitrarily close to dissipating zero heat.
Even then, a million might be a low estimate. By Landauer’s principle, a one-bit irreversible operation requires only kT ln 2 ≈ 2.9e-21 J at 25 degrees C. Does the brain use more than 2.9e-15 J per synaptic operation?
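A quick back-of-the-envelope check of those figures (my own sketch; it assumes only the standard Boltzmann constant and T = 298 K):

```python
import math

k = 1.380649e-23        # Boltzmann constant, J/K
T = 298.15              # 25 degrees C in kelvin

landauer_limit = k * T * math.log(2)   # minimum energy to erase one bit
print(f"kT ln 2 at 25 C:        {landauer_limit:.2e} J")      # ~2.9e-21 J

# If neurons really dissipate a million times this per synaptic operation:
print(f"a million times that:   {landauer_limit * 1e6:.2e} J")  # ~2.9e-15 J
```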
Also, how can a truly one-bit digital operation be irreversible? The only such operations that both input and output one bit are the identity and inversion gates, both of which are reversible.
I know, I know, tangential to your point...
Nick_Tarleton: I think you’re going a bit too far there. The theory of stability and control had by that time been studied rigorously and scientifically, dating back to Watt’s flyball governor in the 18th century (which regulated shaft rotation speed by letting weighted balls swing outward as the shaft sped up, throttling the power source back down), and probably even earlier with the incubator (which used heat to move a valve that let in just the right amount of cooling air). Then, all throughout the 19th century, engineers attacked the problem of “hunting” on trains, where locomotives would unsettlingly lurch faster and slower. Bicycles, a fairly recent invention then, had to tackle a rotational stability problem somewhat similar (as many bicycle design constraints are) to what aircraft deal with.
Certainly, many inventors grasped at straws in an attempt to replicate birds’ functionality, but the idea that they considered the stability implications of the beak isn’t too outlandish.
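To make the governor idea concrete, here’s a minimal toy sketch of proportional feedback holding a speed near a setpoint (my own illustration, not anything from the thread; all numbers are arbitrary):

```python
# Toy governor: each step, the throttle correction is proportional to how far
# the speed is from the setpoint. Too high a gain and the system overshoots
# and oscillates -- the "hunting" problem mentioned above.
setpoint = 100.0      # target speed (arbitrary units)
speed = 60.0          # starting speed
gain = 0.1            # feedback gain

for step in range(30):
    error = setpoint - speed
    throttle = gain * error          # proportional correction
    speed += throttle                # apply correction (crude plant model)

print(f"speed after 30 steps: {speed:.2f}")  # settles near the setpoint
```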
@Scott_Aaronson: Previously, you said the problem is solved with certainty after O(1) queries (which you had to claim in order to meet the objection). Now you’re saying that after O(1) queries it’s merely a “high probability”. Haven’t you changed which claim you’re defending?
Second, how can the required number of queries not depend on the problem size?
Finally, isn’t your example a special case of exactly the situation Eliezer_Yudkowsky describes in this post? In it, he pointed out that the “worst case” corresponds to an adversary who knows your algorithm. But if you specifically exclude that possibility, then a deterministic algorithm is just as good as the random one, because it would have the same correlation with a randomly chosen string. (It’s just like the lockpicking problem: guessing all the sequences in order has no advantage over picking them randomly and crossing them off your list.) The apparent success of randomness is again due to “acting so crazy that a superintelligent opponent can’t predict you”.
Which is why I summarize Eliezer_Yudkowsky’s position as: “Randomness is like poison. Yes, it can benefit you, but only if you use it on others.”
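(As a sanity check on the lockpicking analogy above, here’s a toy simulation of my own, not something from the thread: against a uniformly random combination, guessing in order and guessing in a random order without repeats take the same number of tries on average.)

```python
import random

N = 1000          # number of possible combinations
TRIALS = 20000    # Monte Carlo trials

def tries(order, secret):
    return order.index(secret) + 1   # guesses needed to hit the secret

seq_total = rand_total = 0
for _ in range(TRIALS):
    secret = random.randrange(N)
    seq_total += tries(list(range(N)), secret)   # guess 0, 1, 2, ... in order
    shuffled = random.sample(range(N), N)        # random order, no repeats
    rand_total += tries(shuffled, secret)

print(f"sequential: {seq_total / TRIALS:.1f} tries on average")
print(f"random:     {rand_total / TRIALS:.1f} tries on average")
# Both come out near (N + 1) / 2 = 500.5
```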
Could Scott_Aaronson, or anyone who knows what he’s talking about, please tell me the name of the n/4 left/right bits problem he’s referring to, or otherwise give me a reference for it? His explanation doesn’t seem to make sense: the deterministic algorithm needs to examine 1+n/4 bits only in the worst case, so you can’t compare that to the average-case query count of the random algorithm. (The average case for the deterministic algorithm would, it seems, be n/8 + 1.) Furthermore, I don’t understand how the random method could average out to a size-independent constant.
Is the randomized algorithm one that uses a quantum computer or something?
Someone please tell me if I understand this post correctly. Here is my attempt to summarize it:
“The two textbook results are results specifically about the worst case. But you only encounter the worst case when the environment can extract the maximum amount of knowledge it can about your ‘experts’, and exploits this knowledge to worsen your results. For this case (and nearby similar ones) only, randomizing your algorithm helps, but only because it destroys the ability of this ‘adversary’ to learn about your experts. If you instead average over all cases, the non-random algorithm works better.”
Is that the argument?
@Caledonian and Tiiba: If we knew where the image was, we wouldn’t need the dots.
Okay, let’s take a step back: the scenario, as Caledonian originally stated it, was that the museum people could make a patron see the image better by putting random dots on it. (Pronouns avoided for clarity.) So the problem is framed as whether you can make someone else see an image that you already know is there, by somehow exploiting randomness. My response is that, if you already know the image is there, you can improve on randomness by placing the dots in a way that highlights the hidden image’s lines. In any case, from that position, Eliezer_Yudkowsky is correct in that you can improve the patron’s ability to detect that image only by exploiting your non-random knowledge about it.
Now, if you want to reframe that scenario, you have to adjust the baselines appropriately. (Apples to apples and all.) Let’s look at a different version:
I don’t know if there are subtle, barely-visible images that will come up in my daily life, but if there are, I want to see them. Can I make myself better off by adding random gray dots to my vision? By scattering physical dots wherever I go?
I can’t see how it would help, but feel free to prove me wrong.
@Joshua_Simmons: I got to thinking about that idea as I read today’s post, but I think Eliezer_Yudkowsky answered it therein: yes, it’s important to experiment, but why must your selection of what to try out be random? You should be able to do better by exploiting all of your knowledge about the structure of the space, so as to pick better ways to experiment. To the extent that your non-random choices of what to test do worse than random, it is because your understanding of the problem is so poor as to be worse than random.
(And of course, the only time searching the small space around known-useful points is a good idea is when you already have knowledge of the structure of the space...)
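A toy illustration of that point (my own sketch, not anything from the post): when you know the space has structure, say a fitness function that rises and then falls, exploiting that structure beats sampling at random.

```python
import random

def f(x):
    return -(x - 700) ** 2   # unimodal "fitness" over 0..999, peak at x = 700

# Random search: sample points blindly.
random_best = max(f(random.randrange(1000)) for _ in range(30))

# Structured search: ternary search exploits the known unimodal shape.
lo, hi, evals = 0, 999, 0
while hi - lo > 2:
    m1, m2 = lo + (hi - lo) // 3, hi - (hi - lo) // 3
    evals += 2
    if f(m1) < f(m2):
        lo = m1 + 1   # the peak lies to the right of m1
    else:
        hi = m2 - 1   # the peak lies to the left of m2
structured_best = max(f(x) for x in range(lo, hi + 1))

print(f"random search best (30 evals):       {random_best}")
print(f"ternary search best (~{evals} evals): {structured_best}")  # finds the exact peak
```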
@Caledonian: That’s an interesting point. But are you sure the effect you describe (at science museums) isn’t merely due to the brain now seeing a new color gradient in the image, rather than randomness as such? Don’t you get the same effect from adding an orderly grid of dots? What about from aligning the dots along the lines of the image?
Remember, Eliezer_Yudkowsky’s point was not that randomness can never be an improvement, but that it’s always possible to improve beyond what randomness would yield.
So, in short: “Randomness is like poison: Yes, it can benefit you, but only if you feed it to people you don’t like.”
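(Tangentially, the “orderly grid of dots” question is easy to poke at in code. Here is a toy sketch of my own, not anything from the thread: a faint gradient that a hard threshold wipes out entirely becomes detectable once dots are added, and a fixed, ordered dot pattern works about as well as random noise.)

```python
import random

# A faint gradient, everywhere below the 0.5 threshold: hard thresholding
# turns all of it into 0, so the gradient is invisible without dots.
signal = [0.30 + 0.15 * i / 99 for i in range(100)]   # rises from 0.30 to ~0.45

def fraction_above(values, noise):
    cells = [1 if v + n > 0.5 else 0 for v, n in zip(values, noise)]
    return sum(cells[:50]) / 50, sum(cells[50:]) / 50   # dark half, bright half

no_dots      = [0.0] * 100
random_dots  = [random.uniform(-0.3, 0.3) for _ in range(100)]
ordered_dots = [-0.3 + 0.6 * (i % 8) / 7 for i in range(100)]   # fixed repeating ramp

for name, noise in [("no dots", no_dots), ("random dots", random_dots), ("ordered dots", ordered_dots)]:
    lo, hi = fraction_above(signal, noise)
    print(f"{name:12s} dark half: {lo:.2f}  bright half: {hi:.2f}")
# With no dots both halves read 0.00; with either kind of dots the brighter
# half crosses the threshold more often, so the gradient becomes detectable.
```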
Will_Pearson: Is it literally? Are you saying I couldn’t send a message to someone that enabled them to print out a list of the first hundred integers without referencing a human’s cognitive structure?
Yes, that’s what I’m saying. It’s counterintuitive because you so effortlessly reference others’ cognitive structures. In communicating, you assume a certain amount of common understanding, which allows you to know whether your message will be understood. In sending such a message, you rely on that information. You would have to think, “Will they understand what this sentence means?”, “Can they read this font?”, etc.
Tim_Tyler: The whole idea looks like it needs major surgery to me—at least I can’t see much of interest in it as it stands. Think you can reformulate it so it makes sense? Be my guest.
Certainly. All you have to do is read it so you can tell me what about it doesn’t make sense.
Anyway, such a criticism cuts against the original claim as well—since that contained “know” as well as “don’t know”.
Which contests the point how?
Okay, fair challenge.
I agree about your metal example, but it differs significantly from my discussion of the list-output program for the non-trivial reason I gave: specifically, the output is defined by its impact on people’s cognitive structure.
Look at it this way: Tim_Tyler claims that I know everything there is to know about the output of a program that spits out the integers from 1 to 100. But, when I get the output, what makes me agree that I am in fact looking at those integers? Let’s say that when printing it out (my argument can be converted to one about monitor output), I see blank pages. Well, then I know something messed up: the printer ran out of ink, was disabled, etc.
Now, here’s where it gets tricky: what if instead it only sorta messes up: the ink is low and so it’s applied unevenly so that only parts of the numbers are missing? Well, depending on how badly it messes up, I may or may not still recognize the numbers as being the integers 1-100. It depends on whether it retains enough of the critical characteristics of those numbers for me to so recognize them.
To tie it back to my original point, what this all means is that the output is only defined with respect to a certain cognitive system: that determines whether the numbers are in fact recognizable as 9’s, etc. If it’s not yet clear what the difference is between this and metal’s melting point, keep in mind that we can write a program to find a metal’s melting point, but we can’t write a program that will look at a printout and know if it retains enough of its form that a human recognizes it as any specific letter—not yet, anyway.
Further analysis, you say, Tim_Tyler? Could you please redirect effort away from putdowns and into finding what was wrong with the reasoning in my previous comment?
Very worthwhile points, Tim_Tyler.
First of all, the reason for my spirited defense of MH’s statement is that it looked like a good theory, both because of how concise it was and how consistent it was with my knowledge of programs. So I upped my prior on it and tended to see apparent failures of it as a sign that I wasn’t applying it correctly, and that further analysis could yield a useful insight.
And I think that belief is turning out to be true:
It seems to specify that the output is what is unknown—not the sensations that output generates in any particular observer.
But the sensations are a property of the output. In a trivial sense: it is a fact about the output, that a human will perceive it in a certain way.
And in a deeper sense, the numeral “9” means “that which someone will perceive as a symbol representing the number nine in the standard number system”. I’m reminded of Douglas Hofstadter’s claim that the definition of individual letters is an AI-complete problem, because you must know a wealth of information about the cognitive system to be able to identify the full set of symbols someone will recognize as e.g. an “A”.
This yields the counterintuitive result that, for certain programs, you must reference the human cognitive system (or some concept isomorphic thereto) in listing all the facts about the output. That result must hold for any program whose output will eventually establish mutual information with your brain.
Am I way off the deep end here? :-/
@Eliezer_Yudkowsky: It wouldn’t be an exact sequence repeating, since the program would have to handle contingencies, like cows being uncooperative because of insufficiently stimulating conversation.
Nick_Tarleton: Actually, Tim_Tyler’s claim would still be true there, because you may want to print out that list, even if you knew some exact arrangement of atoms with that property.
However, I think Marcello’s Rule is still valid there and survives Tim_Tyler’s objection: in that case, what you don’t know is “the sensation arising from looking at the numbers 1 through 100 prettily printed”. Even if you had seen such a list before, you would probably still want to print it out unless your memory were perfect.
My claim generalizes nicely. For example, even if you ran a program for the purpose of automating a farm, and knew exactly how the farm would work, then what you don’t know in that case is “the sensation of subsisting for x more days”. Although Marcello’s Rule starts to sound vacuous at that point.
Hey, make a squirrely objection, get a counterobjection twice as squirrely ;-)
Quick question: How would you build something smarter, in a general sense, than yourself? I’m not doubting that it’s possible, I’m just interested in knowing the specific process one would use.
Keep it brief, please. ;-)
Well, now you f*in’ tell me.