I understand your point of view and think it is reasonable.
However, I don’t think “don’t build bigger models” and “don’t train models to do complicated things” need to be at odds with each other. I see the argument you are making, but I think success on these asks is likely highly correlated via the underlying causal factor of humanity being concerned enough about AI x-risk and coordinated enough to ensure responsible AI development.
I also think the training procedure matters a lot (and you seem to be suggesting otherwise?), since if you don’t do RL or other training schemes that seem designed to induce agentyness and you don’t do tasks that use an agentic supervision signal, then you probably don’t get agents for a long time (if ever).
if you don’t do RL or other training schemes that seem designed to induce agentyness and you don’t do tasks that use an agentic supervision signal, then you probably don’t get agents for a long time
Is this really the case? If you imagine a perfect Oracle AI, which is certainly not agenty, it seems to me that with some simple scaffolding, one could construct a highly agentic system. It would go something along the lines of (a rough code sketch follows the list):
Set up API access to ‘things’ which can interact with the real world.
Ask the oracle ‘What would be the optimal action if you want to do <insert-goal> via <insert-api-functions>?’
Execute the actions that are output.
Add some kind of looping mechanism to gather feedback from the world and account for it.
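For concreteness, here is a minimal Python sketch of that loop; the oracle callable, the api_functions dict, and the prompt format are all hypothetical stand-ins, and real scaffolding would need error handling and a safer way to parse actions. The point is only that the agency lives in the loop, not in the oracle:

```python
# Minimal sketch: an oracle that only answers questions, wrapped in a loop
# that executes its suggested actions through real-world API functions.
def run_oracle_agent(oracle, api_functions, goal, max_steps=10):
    observation = "No actions taken yet."
    for _ in range(max_steps):
        # Ask the oracle what the optimal next action would be.
        prompt = (
            f"Goal: {goal}\n"
            f"Available API functions: {sorted(api_functions)}\n"
            f"Latest observation: {observation}\n"
            "What single call (as 'name: argument') would be optimal next? "
            "Answer 'DONE' if the goal is already achieved."
        )
        answer = oracle(prompt).strip()
        if answer == "DONE":
            break
        # Execute the action the oracle suggested via the real-world API.
        name, _, argument = answer.partition(":")
        result = api_functions[name.strip()](argument.strip())
        # Feed the outcome back to the oracle on the next iteration.
        observation = f"Called {name.strip()} and observed: {result}"
```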
This is my line of reasoning for why AIS matters for language models in general.
I meant “other training schemes” to encompass things like scaffolding that deliberately engineers agents using LLMs as components, although I acknowledge this is not literally “training” and is more like “engineering”.
The thing that we care about is how long it takes to get to agents. If we put lots of effort into making powerful Oracle systems or other non-agentic systems, we must assume that agentic systems will follow shortly. Someone will make them, even if you do not.
I don’t disagree… in this case you don’t get agents for a long time; someone else does though.