Unfortunately, they are only sporadically updated and difficult to consume using automated tools. We encourage organizations to start releasing machine-readable bibliographies to make our lives easier.
Oh interesting. Would it be helpful to have something on the AI Alignment Forum, such as a more machine-readable citation system, or did you find the current setup sufficient?
Also, thank you for doing this!
If we decide to expand the database in 2021 to attempt comprehensive coverage of blog posts, then a machine-readable citation system would be extremely helpful. However, to do that we would need to decide on a method for sorting and filtering the posts, which will depend on what the community finds most interesting. For example: do we want to compare blog posts to journal articles, or should those analyses remain mostly separate? Will we crowd-source the filtering by category and organization, or use some sort of automated guessing based on the authorship tags on each post? How expansive should the database's topic coverage be?
Currently, of the 358 web items in our database, almost half (161) are blog posts from the AI Alignment Forum (106), LessWrong (38), or the Effective Altruism Forum (17). (I emphasize that, as mentioned in the post, our inclusion procedure for web content was fairly arbitrary.) Since these don't collect citations on Google Scholar, data on them such as comment counts and upvotes would be very useful for surfacing the most notable posts.
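To illustrate what such data could enable, here is a minimal sketch in Python: given per-post engagement metadata, a composite score can rank posts by notability. The post titles, counts, field names, and the weighting are all hypothetical, not real forum data.

```python
# Sketch: rank blog posts by engagement, assuming we had per-post
# metadata. All titles and numbers below are made up for illustration.
posts = [
    {"title": "Post A", "upvotes": 120, "comments": 35},
    {"title": "Post B", "upvotes": 45, "comments": 80},
    {"title": "Post C", "upvotes": 200, "comments": 10},
]

def notability(post, comment_weight=2):
    # Simple composite score; the weighting is an arbitrary choice.
    return post["upvotes"] + comment_weight * post["comments"]

top = sorted(posts, key=notability, reverse=True)
print([p["title"] for p in top])  # → ['Post C', 'Post B', 'Post A']
```

Any real version would of course need the forums to expose these counts in machine-readable form, which is exactly the gap discussed above.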
Note that individual researchers will sometimes put up BibTeX files of all their publications, but I think it's rarer for organizations to do this.
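For a sense of why such files are easy to consume automatically, here is a sketch of pulling fields out of a BibTeX entry. The entry itself is invented, and the regex handles only the simplest `key = {value}` form, not full BibTeX syntax.

```python
import re

# A minimal, hypothetical BibTeX entry of the kind a researcher
# or organization might publish.
BIBTEX = """@article{doe2020example,
  author = {Doe, Jane},
  title = {An Example Paper},
  journal = {Journal of Examples},
  year = {2020},
}"""

def parse_fields(entry):
    # Extract "key = {value}" pairs; a toy parser, not a real
    # BibTeX implementation (no nested braces, strings, or @string).
    return dict(re.findall(r"(\w+)\s*=\s*\{([^{}]*)\}", entry))

fields = parse_fields(BIBTEX)
print(fields["title"])  # → An Example Paper
```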
The automated tools in Zotero are good enough now that having the complete BibTeX information up front wouldn't make things much easier. I can convert a DOI or arXiv ID into a complete listing with one click, and I can do the same with a paper title in two or three clicks. The laborious parts are (1) interacting with each author and (2) classifying and categorizing the paper.