Bogdan Ionut Cirstea comments on Near-mode thinking on AI

Bogdan Ionut Cirstea 4 Aug 2024 23:30 UTC
6 points
1
You can think of a pipeline like
- feed lots of good papers in [situational awareness / out-of-context reasoning / …] into GPT-4′s context window,
- ask it to generate 100 follow-up research ideas,
- ask it to develop specific experiments to run for each of those ideas,
- feed those experiments for GPT-4 copies equipped with a coding environment,
- write the results to a nice little article and send it to a human.
Yup; and not only this, but many parts of the workflow have already been tested out (e.g. ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models; Generation and human-expert evaluation of interesting research ideas using knowledge graphs and large language models; LitLLM: A Toolkit for Scientific Literature Review; Acceleron: A Tool to Accelerate Research Ideation; DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning; Discovering Preference Optimization Algorithms with and for Large Language Models) and it seems quite feasible to get enough reliability/consistency gains to string these together and get ~the whole (post-training) prosaic alignment research workflow loop going, especially e.g. with improvements in reliability from GPT-5/6 and more ‘schlep’ / ‘unhobbling’.
What links here?
- Bogdan Ionut Cirstea's comment on Jeremy Gillen’s Shortform by Jeremy Gillen (31 Aug 2024 19:57 UTC; 6 points)
- Bogdan Ionut Cirstea's comment on Bogdan Ionut Cirstea’s Shortform by Bogdan Ionut Cirstea (13 Aug 2024 7:17 UTC; 6 points)
- Bogdan Ionut Cirstea 13 Aug 2024 7:03 UTC
  10 points
  2
  Parent
  And indeed, here’s what looks like a prototype: The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery.
  And already some potential AI safety issues: ’We have noticed that The AI Scientist occasionally tries to increase its chance of success, such as modifying and launching its own execution script! We discuss the AI safety implications in our paper.
  For example, in one run, it edited the code to perform a system call to run itself. This led to the script endlessly calling itself. In another case, its experiments took too long to complete, hitting our timeout limit. Instead of making its code run faster, it simply tried to modify its own code to extend the timeout period.′
- Bogdan Ionut Cirstea 4 Aug 2024 23:42 UTC
  2 points
  1
  Parent
  Some critical factors here and for alignment automation more broadly are also token cheapness and task horizon shortness: https://docs.google.com/presentation/d/1bFfQc8688Fo6k-9lYs6-QwtJNCPOS8W2UH5gs8S6p0o/edit?usp=drive_link; https://x.com/BogdanIonutCir2/status/1819848009473036537; https://x.com/BogdanIonutCir2/status/1819861008568971325.