If you already know what you’re going to tell it when it asks for new goals, couldn’t you just program that in from the beginning? So the script would be, “work on your models for X years, then try to parse this statement …”
Also, re: Eliezer’s HTTP GET objection, you could just give it a giant archive of the internet and no actual connection to the outside world. If it’s just supposed to be learning and not affecting anything external, that should be sufficient for the learning part, though it wouldn’t necessarily preclude all effects on the outside world.
At this point, I think we’ve just reinvented the concept of CEV.