As a follow-up here, to expand on this a little more:
If we do not yet have sufficient AI safety solutions, advancing general AI capabilities may not be desirable, because it leads to wider deployment of AI and brings AI closer to transformative levels. Research counts as a counterfactual capabilities advance if it produces new model architectures or training techniques that would not otherwise have been developed by other research groups within a similar timeframe. But the specific capabilities developed for Law-Informed AGI purposes are largely orthogonal to the developments that drive general AGI progress, so technical advances made for the purpose of AI understanding law better – even those that would not have been developed by other groups anyway – are unlikely to be material contributors to accelerating timelines for the global development of transformative AI.
However, this is an important consideration for any technical AI research: it is hard to rule out AI research contributing in at least some small way to advancing capabilities. So it is more a matter of degree, weighing the positive safety benefits of the research against the negative of any timeline acceleration.
Teaching AI to better understand the preferences of an individual human (or a small group of humans), e.g. via RLHF, likely produces capabilities advances faster – and produces the type of capabilities associated with power-seeking by a single entity (a human, a group of humans, or an AI) – relative to teaching AI to better understand public law and societal values as expressed through legal data. Much of the work on making AI understand law is data engineering work, e.g., generating labeled court opinion data that can be employed in evaluating the consistency of agent behavior with particular legal standards. This type of work does not accelerate AGI timelines nearly as much as work on model architectures or compute scaling.
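To make the data engineering point concrete, here is a minimal sketch of what that kind of evaluation work might look like. The dataset format, field names, and the `consistency_score` helper are hypothetical illustrations, not an existing benchmark or API: the idea is simply that court opinions annotated with a legal standard and the court's holding can be used to score how often an agent's judgments match the court's.

```python
# Hypothetical sketch: using labeled court-opinion data to check whether an
# agent's compliance judgments are consistent with a particular legal standard.
from dataclasses import dataclass
from typing import Callable

@dataclass
class LabeledOpinion:
    excerpt: str          # factual scenario drawn from a court opinion
    legal_standard: str   # e.g. "negligence: reasonable-person standard"
    label: bool           # did the court find the conduct consistent with the standard?

# Toy evaluation set; a real pipeline would generate many such records
# from annotated case law.
eval_set = [
    LabeledOpinion(
        excerpt="Defendant left machinery unguarded in a public walkway...",
        legal_standard="negligence: reasonable-person standard",
        label=False,
    ),
    LabeledOpinion(
        excerpt="Operator fenced the work site and posted clear warnings...",
        legal_standard="negligence: reasonable-person standard",
        label=True,
    ),
]

def consistency_score(agent_judgment: Callable[[str, str], bool]) -> float:
    """Fraction of cases where the agent's judgment matches the court's holding."""
    matches = sum(
        agent_judgment(ex.excerpt, ex.legal_standard) == ex.label for ex in eval_set
    )
    return matches / len(eval_set)

if __name__ == "__main__":
    # Placeholder agent that judges every scenario compliant, for illustration only.
    naive_agent = lambda excerpt, standard: True
    print(f"Consistency with legal standard: {consistency_score(naive_agent):.2f}")
```

Note that everything above is dataset curation and evaluation plumbing; none of it requires new architectures or additional compute scale, which is the sense in which this line of work is a weak contributor to timeline acceleration.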