Thomas Kwa comments on Decomposing Agency — capabilities without desires

Thomas Kwa 31 Jul 2024 17:27 UTC
LW: 6 AF: 3
0
I’m glad to see this post curated. It seems increasingly likely that ~~we need~~ it will be useful to carefully construct agents that have only what agency is required to accomplish a task, and the ideas here seem like the first steps.
- Jeremy Gillen 31 Jul 2024 21:38 UTC
  LW: 2 AF: 1
  −2
  What task? All the tasks I know of that are sufficient to reduce x-risk are really hard.
  - Thomas Kwa 1 Aug 2024 1:04 UTC
    LW: 2 AF: 1
    0
    I’m not thinking of a specific task here, but I think there are two sources of hope. One is that humans are agentic above and beyond what is required to do novel science, e.g. we have biological drives, goals other than doing the science, often a desire to achieve our goals by any means rather than only whitelisted ones, and the ability and desire to stop people from interrupting us. Another is that learning how to safely operate agents at a slightly superhuman level will be progress towards safely operating nanotech-capable agents, which could also require control, oversight, steering, or some other technique. I don’t think limiting agency will be sufficient unless the problem is easy, and in that case it would have other possible solutions too.