Olah: hrrm, no bounty, I think: it argues that a particular sort of AI research is good, but seems to concede the point that pure capabilities research is bad. (“Doesn’t [interpretability improvement] speed up capabilities? Yes, it probably does—and Chris agrees that there’s a negative component to that—but he’s willing to bet that the positives outweigh the negatives.”)
Kaj Sotala: solid. Bounty!
Drexler: Bounty!
Olah: hrrm, no bounty, I think: it argues that a particular sort of AI research is good, but seems to concede the point that pure capabilities research is bad. (“Doesn’t [interpretability improvement] speed up capabilities? Yes, it probably does—and Chris agrees that there’s a negative component to that—but he’s willing to bet that the positives outweigh the negatives.”)