Agree, I do mostly discuss LLMs, but I think there’s significant overlap in aligning LLMs and LMAs.
Also agree that LMAs could scale all the way, but I also think that once you get ~human-level automated alignment research, its likely applicability to other types of systems (beyond LMAs and LLMs) should still be a nice bonus.