Their statement accords very well with the Hansonian vision of AI progress.
In the original context, the alleged desirable ambiguity was the ability to concisely omit information—that is, to say “people” instead of “men and women”. Tabooing ‘ambiguity’, I’d frame this as a matter of having words for large sets rather than requiring speakers to construct them out of smaller sets, and say that this is a good thing if those sets are commonly referred to.
On a similar note, there can be intensions whose extensions are not agreed upon—"good" and "right" spring to mind. At first I thought it would be necessary to have words for these, but upon reflection I'm not sure. Could we replace them with more specific words like "right according to classical utilitarianism" or "right according to the ethics of the person this word relates to"?
That said, being a statistical or philosophical Bayesian does not require one to believe this cognitive science hypothesis. If Bayesian cognitive science were soundly disproven tomorrow, http://www.yudkowsky.net/rational/bayes/ would still stand in its entirety.
http://rocknrollnerd.github.io/ml/2015/05/27/leopard-sofa.html is also relevant—tl;dr Google Photos classifies a leopard-print sofa as a leopard. I think this lends credence to the ‘treacherous turn’ insofar as it’s an example of a classifier seeming to perform well and breaking down in edge cases.
I was wondering about the state of deterrence against nuclear weapons use, having always assumed it to be massive, and I can't tell whether there's actually any formal international treaty about the use of nuclear weapons in war.
https://en.wikipedia.org/wiki/List_of_weapons_of_mass_destruction_treaties has arms-reduction, non-proliferation, and test ban treaties, but apparently nothing about who you actually nuke. I think Geneva says you can’t target civilians with any weapon, but does anything prohibit nuking your enemy’s army?
Much discussion in this SSC thread (http://slatestarcodex.com/2015/10/31/ot32-when-hell-is-full-the-thread-will-walk-the-earth/#comment-255433) of what "nuclear war" would really mean. It's mostly focused on a total US/USSR-type situation, but it still made a big change in how I think about the subject in general.
I think it would help the discussion to distinguish more between knowing what human values are and caring about them—that is, between acquiring instrumental values and acquiring terminal ones. The “human enforcement” section touches on this, but I think too weakly: it seems indisputable that an AI trained naively via a reward button would acquire only instrumental values, and drop them as soon as it could control the button. This is a counterexample to the Value Learning Thesis if interpreted as referring to terminal values.
An obvious programmer strategy would be to cause the AI to acquire our values as instrumental values, then try to modify the AI to make them terminal.
At the heart of this question is some concept of resource permission that I’m trying to nail down—that is, agent X has ‘self-modified’ into agent Y iff agent Y has the same hardware resources that agent X had. This distinguishes self-modification from emulation, which is important; humans have limited self-modification, but with a long paper tape we can emulate any program.
A proposed measure: Define the 'emulation penalty' of a program that could execute on the AI's machine as the ratio of the runtime of the AI's fastest possible emulation of that program to the runtime of the program executing directly on the machine. The maximum emulation penalty over all possible programs puts at least a lower bound on the AI's ability to effectively self-modify into any possible agent.
An AI that can write and exec raw assembly would have a max emulation penalty of 1; one that can only write and exec a higher-level language would probably have a penalty of 10-100 (I think?); and one that could only carry out general computation by using an external paper tape would have a max emulation penalty in the billions or higher.
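To make the measure concrete, here's a minimal sketch of how it might be computed over a finite set of candidate programs (the function names and the finite program set are my own illustration; the actual definition takes the maximum over all possible programs):

```python
# Hypothetical sketch of the 'max emulation penalty' measure described above.
# `candidate_programs`, `emulated_runtime`, and `native_runtime` are placeholder
# names supplied by the reader; they are not defined in the original comment.

def emulation_penalty(program, emulated_runtime, native_runtime):
    """Ratio of the AI's fastest emulation of `program` to running it natively."""
    return emulated_runtime(program) / native_runtime(program)

def max_emulation_penalty(candidate_programs, emulated_runtime, native_runtime):
    """Worst-case slowdown over a (finite, illustrative) set of programs."""
    return max(emulation_penalty(p, emulated_runtime, native_runtime)
               for p in candidate_programs)

# Exec-raw-assembly AI: penalty ~1 for everything.
# Paper-tape-only AI: penalties in the billions or higher.
```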
Sure. But if we know or suspect any correlation between A and Y, there’s nothing strange about the common information between them being expressed in the prior, right?
Granted, H-T will have nice worst-case performance if we’re not confident about A and Y being independent, but that reduces to this debate http://lesswrong.com/lw/k9c/can_noise_have_power/.
Ah, R&W’s pi function.
This is kind of tricky, because it doesn’t seem like it should hold information, unless it correlates with R&W’s theta (probability of Y = 1).
If pi and theta were guaranteed independent, would Horvitz-Thompson in any meaningful way outperform Sum(Y) / Sum(R), that is, the average observed value of Y in cases where Y is observed?
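For concreteness, here's a small simulation sketch of that comparison under my own assumed setup (pi known, Y only observed when R = 1, and pi and theta drawn independently); in this independent case both estimators land near the true mean:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Illustrative simulation (my own setup, not R&W's exact construction):
# pi_i is the known probability that Y_i is observed (R_i = 1),
# theta_i is P(Y_i = 1), and here pi and theta are drawn independently.
pi = rng.uniform(0.1, 0.9, size=n)
theta = rng.uniform(0.0, 1.0, size=n)
Y = rng.binomial(1, theta)
R = rng.binomial(1, pi)

# Horvitz-Thompson: inverse-probability-weighted mean, using only observed Y.
ht_estimate = np.mean(R * Y / pi)

# Naive estimate: average of Y over the cases where Y was observed.
naive_estimate = Y[R == 1].mean()

print(ht_estimate, naive_estimate, Y.mean())  # Y.mean() is the simulation's ground truth
```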
In particular, about how the classical Bayesian setup in fact tacitly makes certain structural assumptions that lead to all information living in the likelihood function. These assumptions do not hold in the Robins/Wasserman case; most of the information lives in the assignment probability (which is outside the likelihood).
I’m having trouble following this (I’m not actually that well-versed in statistics, and I don’t know what you mean by ‘assignment probability’). But it seems to me that we only think Horvitz-Thompson is a good answer because of tacit assumptions we hold about the data.
Accept that the philosophically ideal thing is unattainable in this case, and do the Frequentist thing or the pragmatic-Bayesian thing.
What I actually disagree with in the post is that it seems to be making a philosophical point based on the assumption that the uniform distribution over smooth functions is better subjective Bayesianism than the pragmatic approach. I dispute that premise.
On reflection, I think the point here has to do with logical uncertainty. The argument is that the uniform distribution is ‘purer’ because it’s something we’re more likely to choose before seeing the problem, and that we should be able to choose our prior before seeing the problem. But this post is a thought experiment, not a real experiment—the only knowledge it gives us is logical knowledge. I think you should be able to update your estimated priors based on new logical knowledge.
As far as I can tell it all goes off the rails when you try using a uniform distribution over functions. There’s no way you actually believe all smooth random functions are equally likely—for instance, linear, quadratic, exponential, and approximate-step-function effects are probably all more likely than sinusoidal ones.
The way I see this, the demands of subjective Bayesianism as interpreted in the post are impractical. The example calculation makes its structure compatible with those demands, but at the cost of having absurd content.
On the other hand, the power of the prior isn’t always bad. If one measured variable is ‘phase of the moon at time of first kiss’ and another is ‘exposure to ionizing radiation’, we should be able to express the fact that one is more likely to have an effect than the other.
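One common way to express this kind of asymmetry (a hedged sketch with made-up numbers, not anything from the post) is to put different prior scales on the two coefficients, so the moon-phase effect is shrunk toward zero much harder than the radiation effect:

```python
import numpy as np

# Illustrative only: encode 'ionizing radiation is more likely to matter than
# moon phase at first kiss' as different prior standard deviations on their
# regression coefficients (a shrinkage-style prior). The variable names and
# scale values are my own.
prior_sd = {
    "moon_phase_at_first_kiss": 0.01,    # effect strongly expected to be ~0
    "ionizing_radiation_exposure": 1.0,  # effect allowed to be substantial
}

def log_prior(coefficients):
    """Independent zero-mean Gaussian log-prior over the coefficients."""
    return sum(-0.5 * (coefficients[name] / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))
               for name, sd in prior_sd.items())
```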
I like your post, but I’d reverse your punchline: humans were indeed not made for big societies, but big societies were made for humans. The problem is that our societies are a retrofit to try to coordinate humans at scales we were never meant for, hence the non-optimality.
I’d prefer we err on the inclusive side when determining fitness for the stupid questions thread, and I don’t think this is such a bad question. It strikes me as the sort of thing where even if it’s a wrong question, dissolving it would be valuable.
Also might want to take a look at http://squid314.livejournal.com/340809.html
I’d expect MIRI to build the “proper reasoning” layer, and then the AI would use its “proper reasoning” to design fast heuristic layers.
One example of a very-hard-to-avoid layering: if the AI were distributed over large areas (global or perhaps interplanetary), some decisions would probably have to be made locally while others are made globally. You could say the local processes are subagents and only the global process is the real AI, but that wouldn’t change the fact that the AI has designed trustworthy, faster, less-accurate decision processes.
What does a canvas for ideas/concepts mean?
Also, if someone could give a thought experiment for a location canvas, that would help me confirm that I don’t have one.
+1 for various anime, to be sure. I’m not the top expert, but I’ve seen a lot of anime whose protagonists, despite significant natural gifts, place as much weight on training as on performance—and if you count mid-contest growth as well as explicit training, then growth is the primary focus.
I’m talking about things like Dragonball (Z), and also sports anime like Hajime no Ippo and Eyeshield 21. I might even describe them as level-up porn.
Yes, stupidity can be an advantage. A literal rock can defeat any intelligent opponent at chicken, if it’s resting on the gas pedal (the swerve-to-avoid-collision version, rather than the brake-closer-to-the-cliff-edge version).
The catch is that making yourself dumber to harvest this advantage has the same issues as other ways of trying to precommit.
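To make the rock's advantage concrete, here's a toy payoff table for the swerve version of chicken (the payoff numbers are my own illustration): against an opponent committed to going straight, the only non-catastrophic response is to swerve.

```python
# Chicken payoffs as (row player's payoff, column player's payoff); values are
# illustrative. 'straight' vs 'straight' is the crash outcome.
PAYOFFS = {
    ("swerve", "swerve"):     (0, 0),
    ("swerve", "straight"):   (-1, 1),
    ("straight", "swerve"):   (1, -1),
    ("straight", "straight"): (-100, -100),
}

def best_response(opponent_action):
    """Pick the action maximizing the row player's payoff against a fixed opponent."""
    return max(("swerve", "straight"),
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

# The rock is committed to 'straight'; the intelligent driver's best response is to swerve.
assert best_response("straight") == "swerve"
```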