Thanks for your thoughts! I think I’m having a bit of trouble parsing this. Can you help me unpack this sentence:
“But I our success rides on overcoming these arguments and designing AI where more is better.”
What is “more”? And what are “these arguments”? And how does this sentence relate to the question of whether brain data makes us place more or less weight on similar-to-introspection hypotheses?
Whoops, I accidentally dropped a word there. I’ve edited that sentence to “But I think our success rides on overcoming these arguments and designing AI where more is better.”
Where “more” means more data about humans, or more ability to process the information it already has. And “these arguments” means the arguments for why too much data might lead the AI to do things we don’t want (maybe the most mathematically clear example is how CIRL stops being corrigible if it can accurately predict you).
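To spell out the CIRL example, here’s a sketch of the off-switch-game argument (following Hadfield-Menell et al.; the variable $U$ and the rational-human assumption are my framing, not anything from the original comment). Let $U$ be the robot’s uncertain estimate of how much the human values its planned action. If it defers to a human who rationally shuts it off exactly when $U < 0$, deferring is worth $\mathbb{E}[\max(U, 0)]$, acting directly is worth $\mathbb{E}[U]$, and shutting itself down is worth $0$, so its incentive to stay corrigible is

$$\Delta \;=\; \mathbb{E}[\max(U, 0)] \;-\; \max(\mathbb{E}[U],\, 0) \;\ge\; 0.$$

$\Delta$ is strictly positive only while the robot is uncertain about the sign of $U$. Once it can predict the human accurately, $U$ is deterministic from its perspective, so $\Delta = 0$: deference is no longer strictly preferred, and any modeled human irrationality tips it toward overriding the off switch. That’s the “more data makes things worse” failure mode.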
So to rephrase: there are some reasons why adding brain activity data might cause current AI designs to do things we don’t want. That’s bad; we want value learning schemes that come with principled arguments that more data will lead to better outcomes.