Charlie Steiner comments on Conversation with Eliezer: What do you want the system to do?

Charlie Steiner 25 Jun 2022 19:43 UTC
4 points
I typically want the system to interact with the world digitally, gain humans’ trust, take over manufacturing capability to radically raise the standard of living on earth, and simultaneously spin up orbital infrastructure to launch replicating probes. For starters.

One could probably aim for more modest actions—I’m not super convinced that that would be easier, but it could be!

This affects what kinds of alignment strategies I consider. As I’ve said elsewhere, you have to solve a lot of philosophical problems when doing AI alignment, but the advantage we have over philosophers is that when a problem seems impossible, we get to throw it out and pick an easier problem. You design the AI, you get to pick what problem it’s solving—if you ask it to do something impossible, that’s bad, and it’s your job to ask it to do something possible instead.

Put that way, though, “what do I want the AI to do” is really more a question of “what currently unsolved problems do I want the AI to solve?” This is a finer grain of detail than the “oh, it will make people’s lives better” of my first paragraph.