Note that this is outside of Shard Theory’s scope, and I wasn’t appealing to Shard Theory here.
The links I personally read that led to these updates are:
This summary of Matthew Barnett’s post:
https://www.lesswrong.com/posts/i5kijcjFJD6bn7dwq/evaluating-the-historical-value-misspecification-argument#N9ManBfJ7ahhnqmu7
And two posts from Beren about alignment:
https://www.beren.io/2024-05-11-Alignment-in-the-Age-of-Synthetic-Data/
https://www.beren.io/2024-05-15-Alignment-Likely-Generalizes-Further-Than-Capabilities/