But for your reasoning upthread to work, Harry has to be so sure that the outpouring of magic carried all the information about Hermione that it’s not worth it to him to even try to protect her brain. Given (a) his protestations that brain damage means the information must be in the brain, and (b) there not being a shred of evidence that I can remember off the top of my head that Muggles require any magic to run (in which case witches’ and wizards’ brains/souls would presumably have to work completely differently from Muggles’), I don’t think Harry is at a point where he can conclude that the chance of reviving her by saving the body is so low that he should concentrate his efforts on chasing her souly-looking emanation of magic.
Actually, you win if you are able to choose a princess other than Random—you do not need to know which of the two remaining ones is Random. Otherwise, this would clearly be impossible since the answer provides only one bit and there are three possibilities. (And that’s not even considering that under sensible interpretations of the rules, you don’t get any information if you happen to ask Random—i.e., you’re not allowed to ask e.g., “Is it true that (you are False) OR (you are Random and you’ve decided to answer truthfully this time)”, which, if allowed, would be answered in the affirmative iff the one you asked is Random.)
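To make that parenthetical concrete, here is a quick brute-force check (my own sketch in Python; the labels and function name are mine, not from the original puzzle) that the disallowed question really would be answered “yes” exactly when the princess asked is Random:

```python
# Checks the claim that the question "Is it true that (you are False) OR
# (you are Random and you've decided to answer truthfully this time)?"
# gets a "yes" answer if and only if the princess asked is Random.

def answer(identity, random_tells_truth=None):
    """Return the yes/no answer the asked princess gives to the question."""
    statement = (identity == "False") or (identity == "Random" and bool(random_tells_truth))
    if identity == "True":
        return statement          # the truthful princess reports the statement's truth value
    if identity == "False":
        return not statement      # the lying princess reports its negation
    # Random answers truthfully or lies depending on this round's coin flip
    return statement if random_tells_truth else not statement

for identity in ["True", "False", "Random"]:
    flips = [True, False] if identity == "Random" else [None]
    for flip in flips:
        says_yes = answer(identity, flip)
        assert says_yes == (identity == "Random")
        print(f"{identity:6} (truthful this time: {flip}) -> {'yes' if says_yes else 'no'}")
```

Running it confirms the claim in every case: both non-Random princesses answer “no”, and Random answers “yes” whichever way her coin falls.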
(Okay, I’ll unpack the implication:) Assigning a probability of 10^(-3) would mean being really, really, really, really sure that the hypothesis is wrong. To be well-calibrated, you would have to be able to make ten thousand similar judgments with similar strengths of evidence and only be wrong about ten times, and if you can do that, you’re very good at this sort of thing.
Assigning 10^(-13), i.e., suggesting that you’re so good that you can do this and be wrong only one time in ten million million, is just obviously wrong.
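(Spelling out the arithmetic behind these two claims, as a small illustration of my own rather than anything from the book:)

```latex
% Expected number of wrong judgments among N independent judgments,
% each assigned probability p of being mistaken:
%   E[errors] = N * p
% p = 10^{-3}:  N = 10^{4}  judgments -> about 10 expected errors.
% p = 10^{-13}: N = 10^{13} judgments -> about 1 expected error,
%               i.e., wrong only once in ten million million tries.
\[
  \mathbb{E}[\text{errors}] = N \cdot p, \qquad
  10^{4} \cdot 10^{-3} = 10, \qquad
  10^{13} \cdot 10^{-13} = 1 .
\]
```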
So I was implying that the fact that the book suggests that this kind of number can be a plausible outcome means that it isn’t a very good place to learn the art of making Bayesian probability estimates. To learn to make well-calibrated estimates, I should try to learn from people who stand a snowball’s chance in hell of making such estimates themselves.
For an example from someone who has a claim to actually being good at this sort of thing, see Gwern’s “Who wrote the Death Note script?”.
Beatrice and Claudia end up agreeing that the leading candidate is de Vere, with Ignotus second and Stratford a very distant third. Beatrice’s entries lead to a final probability of 10^(-13) (one chance in ten million million) that Shakespeare was the gentleman from Stratford-upon-Avon. Claudia’s entries lead to an even smaller probability.
I don’t think I want to use this book to try to learn how to produce well-calibrated probability estimates.
FWIW, as far as I can remember I’ve always understood this the same way as Wei and cousin_it. (cousin_it was talking about the later logic-based work rather than Wei’s original post, but that part of the idea is common between the two systems.) If the universe is a Game of Life automaton initialized with some simple configuration which, when run with unlimited resources and for a very long time, eventually by evolution and natural selection produces a structure that is logically equivalent to the agent’s source code, that’s sufficient for falling under the purview of the logic-based versions of UDT, and Wei’s informal (underspecified) probabilistic version would not even require equivalence. There’s nothing Cartesian about UDT.
I’ve got a BSc in mathematics from the University of Vienna, and the degree I’m working on in Bristol is a PhD, also in mathematics.
Sorry for being confusing, and thanks for giving me a chance to try again! (I did write that comment too quickly due to lack of time.)
So, my point is, I think that there is very little reason to think that evolution somehow had to solve the Löbstacle in order to produce humans. We run into the Löbstacle when we try to use the standard foundations of mathematics (first-order logic + PA or ZFC) in the obvious way to make a self-modifying agent that will continue to follow a given goal after having gone through a very large number of self-modifications. We don’t currently have any framework not subject to this problem, and we need one if we want to build a Friendly seed AI. Evolution didn’t have to solve this problem. It’s true that evolution did have to solve the planning/self-prediction problem, but it didn’t have to solve it with extremely high reliability. I see very little reason to think that if we understood how evolution solved the problem it solved, we would then be really close to having a satisfactory Löbstacle-free decision theory to use in a Friendly seed AI—and thus, conversely, I see little reason to think that an AGI project must solve the Löbstacle in order to solve the planning/prediction problem as well as evolution did.
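(For readers who haven’t seen it, and as background I’m adding rather than part of the original exchange: the Löbstacle is named after Löb’s theorem, which for any theory T extending PA, with provability predicate □, can be stated as follows.)

```latex
% Löb's theorem: if T proves "provability of P implies P", then T proves P.
%   T |- Box(P) -> P   implies   T |- P
% Internalized form:
%   T |- Box(Box(P) -> P) -> Box(P)
\[
  T \vdash \Box P \rightarrow P \;\Longrightarrow\; T \vdash P,
  \qquad\text{equivalently}\qquad
  T \vdash \Box(\Box P \rightarrow P) \rightarrow \Box P .
\]
```

Roughly, a consistent theory cannot endorse its own proofs wholesale (“whatever I can prove is true”) without thereby proving everything, which is what makes the naive scheme of an agent trusting its successors’ proofs break down.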
I can more easily conceive of the possibility (but I think it rather unlikely, too) that solving the Löbstacle is fundamentally necessary to build an agent that can go through millions of rewrites without running out of steam: perhaps without solving the Löbstacle, each rewrite step will have an independent probability of making the machine wirehead (for example), so an AGI doing no better than evolution will almost certainly wirehead during an intelligence explosion. But in this scenario, since evolution built us, an AGI project might build an AI that solves the planning/self-prediction problem as well as we do, and that AI might then go and solve the Löbstacle and go through a billion self-modifications and take over the world. (The human operators might intervene and un-wirehead it every 50,000 rewrites or so until it’s figured out a solution to the Löbstacle, for example.) So even in this scenario, the Löbstacle doesn’t seem a barrier to AI capability to me; but it is a barrier to FAI, because if it’s the AI that eventually solves the Löbstacle, the superintelligence down the line will have the values of the AI at the time it’s solved the problem. This was what I intended to say by saying that the AGI would “successfully navigate an intelligence explosion—and then paperclip the universe”.
(On the other hand, while I only think of the above as an outside possibility, I think there’s more than an outside possibility that a clean reflective decision theory could be helpful for an AGI project, even if I don’t think it’s a necessary prerequisite. So I’m not entirely unsympathetic to your concerns.)
Does the above help to clarify the argument I had in mind?
Not sure where I’ll be then, so I’m a ‘Maybe’.
Ah, drats. Hopefully having the list will actually help and we’ll be able to do better in the future!
Thanks for explaining the reasoning!
I do agree that it seems quite likely that even in the long run, we may not want to modify ourselves so that we are perfectly dependable, because it seems like that would mean getting rid of traits we want to keep around. That said, I agree with Eliezer’s reply about why this doesn’t mean we need to keep an FAI around forever; see also my comment here.
I don’t think Löb’s theorem enters into it. For example, though I agree that it’s unlikely that we’d want to do so, I don’t believe Löb’s theorem would be an obstacle to modifying humans in a way that makes them super-dependable.
Benja’s hope seems reasonable to me.
I’d hope so, since I think I got the idea from you :-)
This is tangential to what this thread is about, but I’d add that I think it’s reasonable to have hope that humanity will grow up enough that we can collectively make reasonable decisions about things affecting our then-still-far-distant future. To put it bluntly, if we had an FAI right now I don’t think it should be putting a question like “how high is the priority of sending out seed ships to other galaxies ASAP” to a popular vote, but I do think there’s reasonable hope that humanity will be able to make that sort of decision for itself eventually. I suppose this is down to definitions, but I tend to visualize FAI as something that is trying to steer the future of humanity; if humanity eventually takes on the responsibility for this itself, then even if for whatever reason it decides to use a powerful optimization process for the special purpose of preventing people from building uFAI, it seems unhelpful to me to gloss this without more qualification as “the friendly AI [… will always …] stop unsafe AIs from being a big risk”, because the latter just sounds to me like we’re keeping around the part where it steers the fate of humanity as well.
One argument in favor of this being relevant specifically to FAI is that evolution kludged up us, so there is no strong reason to think that AGI projects with an incomplete understanding of the problem space won’t eventually kludge up an AGI that is able to solve these problems itself and successfully navigate an intelligence explosion—and then paperclip the universe, since the incomplete understanding of the human researchers creating the seed AI wouldn’t suffice for giving the seed AI stable goals. I.e., solving this in some way looks probably necessary for reaching AI safety at all, but only possibly helpful for AI capability.
I’m not entirely unworried about that concern, but I’m less worried about it than about making AGI more interesting by doing interesting in-principle work on it, and I currently feel that even the latter danger is outweighed by the danger of not tackling the object-level problems early enough to actually make progress before it’s too late.
Why the downvotes? Do people feel that “the FAI should at some point fold up and vanish out of existence” is so obvious that it’s not worth pointing out? Or disagree that the FAI should in fact do that? Or feel that it’s wrong to point this out in the context of Manfred’s comment? (I didn’t mean to suggest that Manfred disagrees with this, but felt that his comment was giving the wrong impression.)
Great, hope you do! :)
I think that in addition to this being true, it is also how it looks from the outside—at least, it’s looked that way to me, and I imagine many others who have been concerned about SI focusing on rationality and fanfiction are coming from a similar perspective. It may be the case that without the object-level benefits, the boost to MIRI’s credibility from being seen to work on the actual technical problem wouldn’t justify the expense of doing so, but whether or not it would be enough to justify the investment by itself, I think it’s a really significant consideration.
[ETA: Of course, in the counterfactual where working on the object-level problem actually isn’t that important, you could try to explain this to people and maybe that would work. But since I think that it is actually important, I don’t particularly expect that option to be available.]
Meetup : Second Bristol meetup & mailing list for future meetups
Will keep an eye out for the next citation.
Thanks!
[...] motorcycles. [...]
Point. Need to think.
Many people believe that about climate change (due to global political disruption, economic collapse etcetera, praising the size of the disaster seems virtuous).
Hm! I cannot recall a single instance of this. (Hm, well; I can recall one instance of a TV interview with a politician from a non-first-world island nation taking seriously projections which would put his nation under water, so it would not be much of a stretch to think that he’s taking seriously the possibility that people close to him may die from this.) If you have, it’s probably because I haven’t read that much about what people say about climate change. Could you give me an indication of the extent of your evidence, to help me decide how much to update?
Many others do not believe it about AI.
Ok, agreed, and this still seems likely even if you imagine sensible AI risk analyses being similarly well-known as climate change analyses are today. I can see how it could lead to an outcome similar to today’s situation with climate change if that happened… Still, if the analysis says “you will die of this”, and the brain of the person considering the analysis is willing to assign it some credence, that seems to align personal selfishness with global interests more than (climate change as it has looked to me so far).
Personal and tribal selfishness align with AI risk-reduction in a way they may not align on climate change.
This seems obviously false. Local expenditures—of money, pride, the possibility of not being the first to publish, etc.—are still local; global penalties are still global. Incentives are misaligned in exactly the same way as for climate change.
Climate change doesn’t have the aspect that “if this ends up being a problem at all, then chances are that I (or my family/...) will die of it”.
(Agree with the rest of the comment.)
Upvoted, but “always” is a big word. I think the hope is more for “as long as it takes until humanity starts being capable of handling its shit itself”...
I’m not claiming that recoverable info-containing souls are a completely crazy outlandish hypothesis; I merely claim that Harry is nowhere near having enough support for this hypothesis to stake Hermione’s life on it being true.