For the record: The kind of internal experience you describe matches how things currently feel to me (when I look at alignment discourse).
My internal monologue is sort of like:
Here are various ideas/concepts/principles that seem to me like low-hanging fruit and potentially very important for AI alignment. It seems kind of weird that ideas along these lines aren't already being discussed extensively. Weird enough that it gives me significant cognitive dissonance.
Are these ideas maybe less new than they seem to me? Or are they misguided in ways I don't realize? Maybe. But it really does seem like groups of smart people are overlooking ideas/concepts/considerations in ways that I find somewhat baffling/surprising/weird.
I’m working on posts with more well-developed versions of these ideas, where I also try to explain things better and more succinctly than I have previously. In the meantime, the best summaries I can currently point people to are these tweet threads:
https://twitter.com/Tor_Barstad/status/1615565447486898176
https://twitter.com/Tor_Barstad/status/1615963754738421764
https://twitter.com/Tor_Barstad/status/1615953089990729728
https://twitter.com/Tor_Barstad/status/1615838471326769152