I think we can already see the early innings of this, with large API providers figuring out how to calibrate post-training techniques (RLHF, constitutional AI) between economic usefulness and the "mean" of Western morals. It's tough to go against economic incentives.
Yes, we do see such “values” now, but that’s a separate issue IMO.
There's an interesting thing happening in which we're mixing discussions of AI safety and AGI x-risk. There's no sharp line, but I think they are two importantly different things. This post was intended to be about AGI, as distinct from AI. Most of the economic and other concerns around the "alignment" of AI are not relevant to the alignment of AGI.
This thesis could be right or wrong, but let's keep it distinct from theories about AI in the present and near future. My thesis here (and a common thesis) is that we should be most concerned about AGI that is an entity with agency and goals, like humans have. AI as a tool is a separate thing. It's very real and we should be concerned with it, but we shouldn't let it blur into the categorically distinct case of goal-directed, self-aware AGI.
Whether or not we actually get such AGI is an open question that should be debated, not assumed. I think the answer is very clearly that we will, and soon; as soon as tool AI is smart enough, someone will make it agentic, because agents can do useful work, and they're interesting. So I think we'll get AGI with real goals, distinct from the pseudo-goals implicit in current LLMs' behavior.
The post addresses such "real" AGI: one that is self-aware and agentic, yet whose sole goal is doing what people want. That's pretty much a third category, and a somewhat counterintuitive one.