To explain the relation, you said: “I can’t imagine having any return [...from this idea...] even in a perfect world, AI can still produce value, e.g. earn money online.”
I was trying to suggest that there might in fact be a path to Friendliness: install sufficient safeguards that the primary way a software entity could replicate or spread would be by providing value to humans.
In the comment above, I explained why what the AI does is irrelevant so long as it is not guaranteed to actually have the right values: once it goes unchecked, it simply reverts to whatever it actually prefers, whether in a flurry of hard takeoff or after a thousand years of close collaboration. In every context I have seen, “safeguards” refers to things that enforce behavior, not values, and that is not enough. Even the ideas for enforcing behavior look infeasible, but the more important point is that even if we win this round, with such an approach we still lose eventually.
My symbiotic-ecology-of-software-tools scenario was not a serious proposal for the best strategy toward Friendliness. I was trying to increase the plausibility of SOME return at SOME cost, even granting that AIs could produce value.
I seem to have stepped onto a cached thought.
I’m afraid I see the issue as clear-cut: you can’t get “some” return; you can only win or lose (the probability of getting there is, of course, more amenable to small nudges).
Making such a statement significantly raises the standard of reasoning I expect from a post. That is, I expect you to be either right or at least a step ahead of the person you are communicating with.