Language models develop really good world models… primarily of humans writing text on the internet, who are consequentialist agents and are not fully aligned (in the absence of effective law enforcement) with other humans.