Have you read Eric Drexler’s work on open agencies, and on applying open agencies to present-day LLMs? Open agencies seem like progress toward a safer design for current and future cognitive architectures. Drexler’s design touches on some of the aspects you mention in the post, like:
The system can be coded to both check itself against its goals and invite human inspection if it judges that it is considering plans or actions that may violate its ethical goals, change its goals, or remove it from human control.
I had read the first but hadn’t seen the second. I just read it. It’s well written and highly relevant, and I’ll be citing it in my future work. Thanks much for the reference!