I’d bet we’re going to figure out how to make an Omohundro optimiser (a fitness-maximising AGI) before we figure out how to make AGI that can rescue the utility function, preserve a goal, or significantly optimise any metric other than its own survival, such as paperclip production or Good.
(Arguing for that is a bit beyond the scope of the question, but I know this position has a lot of support already. I’ve heard Eliezer say, if not this exactly, something very similar. Nick Land especially believes that only the Omohundro drives could animate self-improving AGI. I don’t think Nick Land understands how agency needs to intercede in prediction: an agent needs to consider all of the competing self-fulfilling prophecies and profess only the prophecy it really wants to live in, instead of immediately siding with the one that seems the most hellish and the easiest to stumble into. The prophecies he tends to choose do seem like the easiest ones to stumble into, so he provides a useful service as a hazard alarm for those of us who are trying to learn not to stumble.)
What would you advise we do when one of us finds ourselves in the position of knowing how to build an Omohundro optimiser? Delete the code and forget it?
If we had a fitness-optimising program, is there anything good it could be used for?
Don’t run the code! Do the following alone or with friends:
Understand why the code works. Learn about the nature of agency. What are the parts of an agent? Which parts compute what using what information and what algorithms? How do agents coordinate with each other?
Understand psychology. Read some books (psychology, sociology, history). Talk to people who know about this. Meditate and take drugs that help with introspection (e.g. psychedelics). Generate new hypotheses. Talk to people and test your hypotheses.
Relate agency theories to psychology theories using functionalism. Perhaps agency theory has “value of information”, and psychology has “curiosity”, and these are functionally similar. Perhaps agency theory has “terminal values”, and psychology has “intrinsic wants”, and these are functionally similar. And so on. (You’ll develop better psychology theories along the way, and more human-applicable versions of agency theory.)
Use your understanding of agency and psychology to develop epistemology and ethics. How do I get better beliefs? What is right action? How does a healthy mind work? How do I tell what I really want? How do I ethically coordinate with others? How do these translate into practicable virtues?
Attain excellence. Practice virtue. Act ethically and effectively towards what you want. Get better introspection into deeper levels of your mind. Let your understanding of agency and psychology affect these parts as you become aware of them.
At this point, the rest is up to you.