Re: your arguments #1, #2, and #4, we might well make the decision to pursue modular implementations of transformative artificial intelligence, such as Drexler’s Open Agency architecture or Comprehensive AI Services, over autonomous sovereigns, and accept the inefficiency from humans in the loop and from modularity, because:
Modular architectures are much easier to oversee/govern (i.e. “scalable oversight” is more tractable)
Correctness/robustness of particular components/services can be locally verified; modular architectures may be more reliable/trustworthy for this reason and thus more economically competitive
Such implementations are less vulnerable/prone to (or at least offer fewer affordances for) “power seeking”/“influence seeking” behaviour; the risk of takeover and disempowerment is lower
Misaligned AI is likely to cause small local failures before globally catastrophic ones, and hostile sociocultural/political/regulatory reactions to such failures (see nuclear power) could well incentivise the big AI labs to play it (very) safe lest they strangle their golden goose
Re: #3, many of the biggest/main labs have safety teams and seem to take existential risk from advanced artificial intelligence seriously:
Anthropic
DeepMind
OpenAI
I guess Google Brain and Meta AI stand out as big, well-funded teams that aren’t (yet) safety-pilled.
This all seems basically sensible and checks out.