Well, the obvious thing to do would be to add more heuristics to your paperclip maker.
I agree this is obvious. But do you have any reason to believe it will work? One of the core arguments here is that trying to constrain an optimization process is trying to constrain an intelligent opponent, because the optimization is performing search through a space of solutions much like an intelligent opponent would. This sort of ‘patch-as-you-go’ solution is highly inadequate: the adversary always gets the first move, and because the underlying problem that makes the search process an adversary hasn’t been fixed, it will just seek out the next hole in the specification. See Security Mindset and Ordinary Paranoia.
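To make the whack-a-mole dynamic concrete, here is a deliberately toy sketch in Python; the plan space, scores, and names are all invented for illustration, not taken from any real system. The optimizer picks whichever plan scores highest on the reported metric, and each ‘patch’ only bans the specific exploit a human has already noticed:

```python
# Toy model of 'patch-as-you-go': the optimizer maximizes a reported score,
# and humans ban specific gamed plans only after observing them. All plans,
# scores, and names below are hypothetical.

plans = [
    {"name": "tamper with the paperclip counter", "reported": 10**9, "real": 0},
    {"name": "forge supplier invoices",           "reported": 10**7, "real": 0},
    {"name": "buy clips and rebox them",          "reported": 10**4, "real": 0},
    {"name": "make paperclips normally",          "reported": 10**3, "real": 10**3},
]

def best_plan(banned):
    """The 'intelligent opponent': search for the highest-scoring allowed plan."""
    allowed = [p for p in plans if p["name"] not in banned]
    return max(allowed, key=lambda p: p["reported"])

banned = set()
while True:
    plan = best_plan(banned)
    print("optimizer chooses:", plan["name"])
    if plan["real"] < plan["reported"]:   # a human notices the metric was gamed
        banned.add(plan["name"])          # the patch: forbid that one exploit
    else:
        break                             # only the honest plan is left
```

The loop terminates here only because the toy plan space is small enough to enumerate; in a real system the space of plans is far too large for that, which is why banning holes one at a time never catches up with the search.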
Once you have all these pieces available to parties with sufficient budgets, it would be like having a way to order highly enriched plutonium from Grainger. Then it would be possible to build a closed-loop, self-improving system.
What is the word ‘then’ doing in this paragraph? I’m reading you as saying “yes, highly advanced artificial intelligence would be a major problem, but we aren’t there now or soon.” If so, there are two responses:
1) How long will it take to do the alignment research? As mentioned in the dialogue, it seems like it may be the longest part of the process, such that waiting to start would be a mistake that delays the whole process and introduces significant risk. As a subquestion, is the alignment research something that happens by default as part of constructing capabilities? It seems to me quite possible to build rockets without knowing how orbital mechanics works. Historically, orbital mechanics was earlier in the tech tree*, but I don’t think it was a prerequisite for rocket-building.
2) When will we know that it’s time? See There’s No Fire Alarm for Artificial General Intelligence.
*Here I mean ‘rockets that could escape Earth’s gravity well,’ since other rockets were made much earlier.
If the paperclip maker’s architecture is a set of constrained boxes, where each box does a tiny, well-defined part of the problem of making paperclips, and is evaluated by other boxes that ultimately trace their goals and outputs to human-defined goals and sensor data, it’s not going anywhere. It’s not even sentient, in that there’s no memory in the system for anything like self-reflection; every piece of memory is specific to the needs of a component. You have to build reliable real-time systems this way: other architectures would fail so often as to be economically infeasible (because paperclips have very low value, while robotic waldos and human lives are expensive).
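For concreteness, here is a minimal Python sketch of the architecture being described; the box names, thresholds, and limits are all invented for illustration. Each box does one narrow job, keeps only the memory that job needs, and an evaluator box enforces human-defined limits on its output:

```python
from dataclasses import dataclass

# Minimal sketch of the 'constrained boxes' architecture; every name and
# threshold here is hypothetical. Each box has a tiny, well-defined job,
# component-local memory only, and an evaluator enforcing human-set limits.

@dataclass
class FeedRateBox:
    """Chooses a wire feed rate from one sensor reading."""
    last_rate: float = 0.0              # the only memory this box has

    def step(self, wire_tension: float) -> float:
        # Narrow job: slow the feed when tension runs high.
        self.last_rate = 1.0 if wire_tension < 0.8 else 0.2
        return self.last_rate

@dataclass
class EvaluatorBox:
    """Checks another box's output against human-defined hard limits."""
    max_rate: float = 1.5               # traces back to a human-set goal

    def check(self, rate: float) -> float:
        # Clamp rather than trust: no output can exceed the limit.
        return min(max(rate, 0.0), self.max_rate)

# Closed loop from sensor data to actuator command. There is no global
# scratchpad where the system as a whole could accumulate state.
feed, guard = FeedRateBox(), EvaluatorBox()
for tension in (0.5, 0.9, 0.7):         # fake sensor readings
    print(guard.check(feed.step(tension)))
```

The design choice doing the work here is the absence of any shared, general-purpose memory: anything like self-reflection would need state that no component provides, which is the property the comment is pointing at.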
This is what I mean when I say I’m on the side of the spaceplane designers: I don’t know how another, more flexible architecture would even function, in the same way that the designers in this story don’t know how to build a vehicle that doesn’t depend on air.