If AGI labs truly bet on AI-assisted (or fully AI-automated) science across scientific domains (the second group in your list), then research done in the remaining three groups will be subsumed by that AI-assisted research.
It’s still important to do some research in these areas, for two reasons:
(1) Hedging against some unexpected turn of events, such as AIs failing to improve the speed and depth of scientific insight in at least some areas (governance & strategy are perhaps iffier areas than pure math or science, since it's harder to become sure that strategies suggested by AIs are free of 'deep deceptiveness'-style bias).
(2) Even if AIs do generate all that awesome science, humanity still needs people capable of understanding, evaluating, and finding weaknesses in it.
This, however, suggests a different focus in the latter three groups: growing excellent science evaluators rather than generators (GAN-style). More Yudkowskys able to shut down and poke holes in various plans. Less focus on producing sheer volume of research and more focus on the ability to criticise others' and one's own research. There is overlap, of course, but there are also differences in how researchers should develop if we keep this in mind. Credit assignment systems and community authority-inferring mechanisms should also recognise this focus.
MATS’ framing is that we are supporting a “diverse portfolio” of research agendas that might “pay off” in different worlds (i.e., your “hedging bets” analogy is accurate). We also think the listed research agendas have some synergy you might have missed. For example, interpretability research might build into better AI-assisted white-box auditing, white/gray-box steering (e.g., via ELK), or safe architecture design (e.g., “retargeting the search”).
The distinction between “evaluator” and “generator” seems fuzzier to me than you portray. For instance, two “generator” AIs might be able to red-team each other for the purposes of evaluating an alignment strategy.