Also, an agent with safer goals than humans have (which is a high bar, though not nearly as high as some alternatives) is safer than humans with equivalently powerful tools.
How is this helpful? It is true by definition of the word “safer”. The problem is knowing whether an agent actually has safer goals, and indeed what “safer” means in the first place.