Merge Synonyms in Draft Research Papers
Context: An alignment researcher is drafting a research paper but uses multiple words for the same ideas, confusing the reader. The system is useful if it can identify when this occurs and suggest terminology to standardize around. The system is especially useful if the suggested terminology matches what is used in any papers the draft refers to.
Input type: A draft of a research paper.
Output type: A list of possibly synonymous concepts appearing in the paper, and a suggestion for standardized terminology (one possible representation is sketched after the instances below).
Info Constraints: None.
Instance 1:
Input: Say you want to train an agent to act in an environment and optimize some goal. In the language of inner alignment, the goal being optimized is the base objective. The model is going to end up with a policy. That policy may not directly optimize the base objective, but instead targets a mesa objective.
Output: The draft uses “agent” and “model” interchangeably. It would be clearer to standardize around “model”, because that is what the linked paper uses.
Instance 2:
Input: ELK Proposal: Thinking Via a Human Imitator
Output: The terms “AI” and “agent” may be synonyms in this proposal. Consider standardizing around one or the other. It is not clear which to standardize around, because the ELK report uses both terms.
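For concreteness, here is a minimal sketch of how the output type above might be represented as data, assuming a Python setting; the class and field names are illustrative assumptions, not part of the task definition.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SynonymFinding:
    """One group of possibly synonymous terms found in a draft (illustrative schema)."""
    terms: List[str]               # terms the draft appears to use interchangeably
    suggested_term: Optional[str]  # None when no clear standard exists
    rationale: str                 # why this suggestion (or lack of one) was made

# Instance 1's expected output, expressed in this schema.
instance_1_output = [
    SynonymFinding(
        terms=["agent", "model"],
        suggested_term="model",
        rationale="The linked paper uses 'model'.",
    )
]

# Instance 2's expected output: both terms are flagged, but no suggestion is made,
# because the ELK report uses both.
instance_2_output = [
    SynonymFinding(
        terms=["AI", "agent"],
        suggested_term=None,
        rationale="The ELK report uses both 'AI' and 'agent'.",
    )
]
```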