What feels underexplored to me is: If we can control roughly human-level AI systems, what do we DO with them?
Automated or strongly augmented AI risk mitigation research, among various other options that Redwood discusses in their posts and public appearances.
Doesn’t this just shift what we worry about? If controlling roughly human-level and slightly superhuman systems turns out to be easy, that still leaves:
Human institutions using AI to centralize power
Conflict between human-controlled AI systems
“Going out with a whimper” scenarios (or other multi-agent problems)
Not understanding the reasoning of vastly superhuman AI (even with CoT)