Eric Drexler comments on Role Architectures: Applying LLMs to consequential tasks

Eric Drexler 30 Mar 2023 23:37 UTC
LW: 2 AF: 1
1
AF
I agree that using the forms of human language does not ensure interpretability by humans, and I also see strong advantages to communication modalities that would discard words in favor of more expressive embeddings. It is reasonable to expect that systems with strong learning capacity could to interprete and explain messages between other systems, whether those messages are encoded in words or in vectors. However, although this kind of interpretability seems worth pursuing, it seems unwise to rely on it.
The open-agency perspective suggests that while interpretability is important for proposals, it is less important in understanding the processes that develop those proposals. There is a strong case for accepting potentially uninterpretable communications among models involved in generating proposals and testing them against predictive models — natural language is insufficient for design and analysis even among humans and their conventional software tools.
Plans of action, by contrast, call for concrete actions by agents, ensuring a basic form of interpretability. Evaluation processes can and should favor proposals that are accompanied by clear explanations that stand up under scrutiny.