Yeah, this agrees with my thinking so far. However, I think if you could research how to align AIs specifically to human flourishing (as opposed to things like obedience/interpretability/truthfulness, which defer to the user’s values), that kind of work could be more helpful than most.
I very much agree with human flourishing as the main value I most want AI technologies to pursue and be used to pursue.
In that framing, my key claim is that in practice no area of purely technical AI research — including “safety” and/or “alignment” research — can be adequately checked for whether it will help or hinder human flourishing without a social model of how the resulting technologies will be used by individuals / businesses / governments / etc.
And we don’t have good social models of technology for really any technology, even retrospectively. So AI is certainly one we are not going to align with human flourishing in advance. When it comes to human flourishing, the humanizing of technologies takes a lot of time. Eventually we will get there, but it’s a process that requires a lot of individual actors making choices and “feature requests” from the world, features that promote human flourishing.
Are you referring to a Science of Technological Progress à la https://www.theatlantic.com/science/archive/2019/07/we-need-new-science-progress/594946 ?
What is your sense of the processes for humanizing technologies, and what sources/research are available on such phenomena?
I would not be surprised if lurking in the background of my thought is Tyler Cowen. He’s a huge influence on me. But I was thinking of specific examples. I don’t know of a good general history of “humanizing”.
What I had explicitly in mind was the historical development of automobile safety: seatbelts and airbags. The history of their invention, innovation, deployment, and legal mandating is long and varied.
How long did it take between the discovery that chlorofluorocarbons were damaging and their demise? Or for asbestos and its abatement—how much does society pay for this process? What’s the delta between climate change research and renewables investment?
Essentially, many an externality can be internalized once it is named, attention is drawn to it, and the costs are realized.
Before we can even start to try to align AIs to human flourishing, we first need a clear definition of what that means. This has been a topic accessible to philosophical thought for millennia and yet it still has no universally accepted definition, so how can you consider AI alignment to it helpful? Even if we could all agree on what “human flourishing” meant, you would still have the problem of lock-in, i.e. our AI overlords will never allow that definition to evolve once they have assumed control. Would you want to be trapped in the Utopia of someone born 3000 years ago? Better than being exterminated, but still not what we want.
I think the key to approaches like this is to eschew pre-existing, complex concepts like “human flourishing” and look for a definition of Good Things that is actually amenable to constructing an agent that Does Good Things. There’s no guarantee that this would lead anywhere; it relies on some weak form of moral realism. But an AGI that follows some morality-you-largely-agree-with by its very structure is a lot more appealing to me than an AGI that dutifully maximizes the morality-you-punched-into-its-utility-function-at-bootup, appealing enough that I think it’s worth wading into moral philosophy to see if the idea pans out.
What would be some concrete examples/areas to work on for human flourishing? (Just saw a similar question on the definition; I wonder what some concrete areas or examples could be.)