Opinions on whether it’s positive/negative to build tools like Cursor / Codebuff / Replit?
I’m asking because building a competitor to these tools seems fun, and there seems to be low-hanging fruit to collect, but I also prefer not to destroy the world.
Considerations I’ve heard:
Reducing “scaffolding overhang” is good, specifically because it helps us notice whether a model should trigger a more advanced RSP level
(This depends on the details/quality of the RSP too)
There are always reasons to advance capabilities; this isn’t even a safety project (unless you count… elicitation?), so our bar here should be high
Such scaffolding won’t add capabilities which might make the AI good at general-purpose learning or long-term general autonomy. It will be specific to programming, with concepts like “which functions did I look at already” and instructions on how to write high-quality tests (a rough sketch of what I mean follows this list).
Anthropic are encouraging people to build agent scaffolding, and Codebuff was created by a Manifold cofounder [if you haven’t heard about it, see here and here]. I’m mainly confused about this: I’d expect both not to want people to advance capabilities (yes, Anthropic want to stay in the lead and serve as an example, but this seems different). Maybe I’m just not in sync.
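To make the “specific to programming” point above concrete, here is a rough sketch of the kind of scaffold state I have in mind; all names are made up for illustration, and nothing here gives the model general-purpose memory or autonomy:

```python
# Hypothetical sketch of programming-specific scaffold state: it tracks things
# like "which functions did I look at already" and carries fixed instructions
# about test quality, rather than acting as general-purpose memory.
from dataclasses import dataclass, field

TEST_GUIDELINES = (
    "Write tests that cover edge cases, use descriptive names, "
    "and avoid depending on global state."
)

@dataclass
class CodingSessionState:
    functions_viewed: set[str] = field(default_factory=set)  # e.g. "utils.parse_config"
    files_edited: set[str] = field(default_factory=set)
    failing_tests: list[str] = field(default_factory=list)

    def build_prompt_context(self) -> str:
        """Summarize the session state for the model's next prompt."""
        return (
            f"Functions already inspected: {sorted(self.functions_viewed)}\n"
            f"Files edited so far: {sorted(self.files_edited)}\n"
            f"Currently failing tests: {self.failing_tests}\n"
            f"Testing guidelines: {TEST_GUIDELINES}"
        )
```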
I’m not confident, but I am avoiding working on these tools because I think that “scaffolding overhang” in this field may well be most of the gap towards superintelligent autonomous agents.
If you imagine an o1-level entity with “perfect scaffolding”, i.e. it can get any info on a computer into its context whenever it wants, and it can choose to invoke any computer functionality that a human could invoke, and it can store and retrieve knowledge for itself at will, and its training includes the use of those functionalities, it’s not completely clear to me that it wouldn’t already be able to do a slow self-improvement takeoff by itself, although the cost might currently be practically prohibitive.
I don’t think building that scaffolding is a trivial task at all, though.
I think a simple bash tool running as admin could do most of these:
it can get any info on a computer into its context whenever it wants, and it can choose to invoke any computer functionality that a human could invoke, and it can store and retrieve knowledge for itself at will
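For concreteness, here is a minimal sketch of the kind of bash tool I mean, using Anthropic’s Python SDK; the tag format, model alias, and step limit are arbitrary choices for illustration, not any existing product’s design:

```python
# Minimal sketch: the model proposes shell commands, we run them as the current
# (admin) user, and the output goes back into its context. Illustrative only.
import subprocess
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM = (
    "You are operating a computer. Reply with exactly one shell command inside "
    "<bash>...</bash> tags, or with DONE when the task is finished."
)

def run_agent(task: str, max_steps: int = 20) -> list[dict]:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = client.messages.create(
            model="claude-3-5-sonnet-latest",  # example model name
            max_tokens=1024,
            system=SYSTEM,
            messages=messages,
        ).content[0].text
        messages.append({"role": "assistant", "content": reply})
        if "DONE" in reply:
            break
        if "<bash>" in reply:
            cmd = reply.split("<bash>", 1)[1].split("</bash>", 1)[0]
            result = subprocess.run(cmd, shell=True, capture_output=True,
                                    text=True, timeout=120)
            # Feed stdout/stderr back so the model can see what happened.
            output = (result.stdout + result.stderr)[:10_000] or "(no output)"
            messages.append({"role": "user", "content": output})
    return messages
```

This only covers “store and retrieve knowledge” in the crude sense that the model can write files and read them back later; a real tool would presumably do better, but the point is that the basic loop is short.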
Regarding
and its training includes the use of those functionalities
I think this isn’t a crux because the scaffolding I’d build wouldn’t train the model. But as a secondary point, I think today’s models can already use bash tools reasonably well.
it’s not completely clear to me that it wouldn’t already be able to do a slow self-improvement takeoff by itself
This requires skill in ML R&D, which I think is almost entirely not blocked by what I’d build, but I do think it might be reasonable to have my tool not work for ML R&D because of this concern (that would require it to be closed source, and so on).
Thanks for raising concerns; I’m happy to hear more if you have them.
But as a secondary point, I think today’s models can already use bash tools reasonably well.
Perhaps that’s true; I haven’t seen a lot of examples of them trying. I did see Buck’s anecdote, which was a good illustration of doing a simple task competently (finding the IP address of an unknown machine on the local network).
I don’t work in AI, so maybe I don’t know which parts of R&D would be hardest for current SOTA models. But given that large-scale LLMs are a fairly new field that hasn’t had that much labor applied to it yet, I would have guessed that a model which could basically just do mundane stuff and read research papers could spend a shitload of money and FLOPS running a lot of obviously informative experiments that nobody else has properly run, and polishing a bunch of stuff that nobody else has properly polished.
Your guesses on AI R&D are reasonable!
Apparently this has been tested extensively, for example:
https://x.com/METR_Evals/status/1860061711849652378
[disclaimers: I have some association with the org that ran that (I write some code for them), but I don’t speak for them; opinions are my own]
Also, Anthropic have a trigger in their RSP which is somewhat similar to what you’re describing; I’ll quote part of it:
Autonomous AI Research and Development: The ability to either: (1) Fully automate the work of an entry-level remote-only Researcher at Anthropic, as assessed by performance on representative tasks or (2) cause dramatic acceleration in the rate of effective scaling.
Also, in Dario’s interview, he spoke about AI being applied to programming.
My point is: lots of people have their eyes on this, it doesn’t seem to be solved yet, and it takes more than connecting an LLM to bash.
Still, I don’t want to accelerate this.
I think it’s net negative: it increases the profitability of training better LLMs.
Thanks!
Opinions about putting in a clause like “you may not use this for ML engineering” (assuming it would work legally), plus putting in naive technical measures to make the tool very bad for ML engineering?
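To make “naive technical measures” concrete, the kind of thing I have in mind is no stronger than a keyword filter over the commands the tool is about to run; trivially bypassable, and the term list and function names below are made up for illustration:

```python
# Naive sketch of a "not for ML engineering" guard: refuse to run commands that
# look like model training. Easy to bypass; meant to signal intent, not enforce it.
import subprocess

ML_ENGINEERING_HINTS = (
    "torch", "tensorflow", "jax", "deepspeed", "accelerate launch",
    "train.py", "wandb", "nccl",
)

def looks_like_ml_engineering(command: str) -> bool:
    lowered = command.lower()
    return any(hint in lowered for hint in ML_ENGINEERING_HINTS)

def run_guarded(command: str) -> str:
    if looks_like_ml_engineering(command):
        # Surfaced to the model instead of executing the command.
        return "Refused: this tool may not be used for ML engineering (see license)."
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr
```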
Why the downvote? You can tell me anonymously:
https://docs.google.com/forms/d/e/1FAIpQLSca6NOTbFMU9BBQBYHecUfjPsxhGbzzlFO5BNNR1AIXZjpvcw/viewform