Re: “Let’s think step by step”
So let me get this straight: a simple prompt is able to elicit an entire style of thinking, which is able to solve harder problems, and ultimately ends up motivating new classes of foundation model? Is that what happened last year? Are there any other simple prompts like that? Did we check? Sorry, I’m trying to catch up.
The s1 paper introduces a trick of replacing the end-of-thinking token with the string “Wait”, which lets you keep generating a reasoning trace for as long as you need, even when the model itself can’t control this well (“budget forcing”, see Figure 3 in section 3.1).
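For concreteness, here is a minimal sketch of what budget forcing could look like with an HF-style model. The model name, end-of-thinking delimiter, and budget numbers are all placeholders, not the s1 authors' actual setup:

```python
# Rough sketch of budget forcing, not the s1 authors' code: if the model emits
# its end-of-thinking delimiter too early, strip it, append "Wait", and let it
# keep reasoning. Model name and delimiter below are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "some-reasoning-model"   # placeholder
END_OF_THINKING = "</think>"     # placeholder delimiter

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def generate_with_forced_budget(prompt: str, min_continuations: int = 2) -> str:
    text, forced = prompt, 0
    while True:
        ids = tok(text, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=512)
        # Keep special tokens so the delimiter is visible in the decoded text.
        text = tok.decode(out[0], skip_special_tokens=False)
        if END_OF_THINKING in text and forced < min_continuations:
            # Force the trace to continue instead of letting it end.
            text = text.split(END_OF_THINKING)[0] + " Wait"
            forced += 1
        else:
            return text
```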
I just bumped across “atomic thinking”, which asks the model to break the problem into component parts, attack each separately, and only produce an answer once that’s done and the pieces can all be brought together.
This is how smart humans attack some problems, and it’s notably different from chain of thought.
I expect this approach could also be used to train models, by training on component problems. That is, if other techniques don’t keep progressing so fast as to make it irrelevant.
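If it helps, here’s roughly how I picture the scaffold. The prompts and the `ask` callable are my own invention for illustration, not from wherever “atomic thinking” originates:

```python
# Sketch of an "atomic thinking" scaffold: decompose, solve each part in
# isolation, then synthesize. `ask` is any prompt -> completion callable.
def atomic_solve(problem: str, ask) -> str:
    parts = [
        p for p in ask(
            "Break this problem into independent sub-problems, "
            "one per line, nothing else:\n" + problem
        ).splitlines()
        if p.strip()
    ]
    solutions = [
        ask(f"Solve only this sub-problem, ignoring everything else:\n{p}")
        for p in parts
    ]
    return ask(
        "Combine these partial solutions into a final answer.\n"
        f"Original problem:\n{problem}\n\n"
        "Partial solutions:\n" + "\n\n".join(solutions)
    )
```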
slightly related https://arxiv.org/abs/2503.00735
More or less, yes. But I don’t think it suggests there might be other prompts around that unlock similar improvements—chain-of-thought works because it allows the model to spend more serial compute on a problem, rather than because of something really important about the words.
I don’t have time to write any of this down properly, so it’s going to come out in the wrong order, but here goes:
agentic AI is the means of production for codegen
model access limits and closedness are therefore a threat to Workers
I use and maintain software. I survive by staying 5 feet in front of the steamroller
I am not wealthy, I can’t afford to be tripped and squished.
OSS is traditionally the way of protecting myself in this situation
I need to write tons of good code and enable my company to do the same, and I need to do it while washing the dishes (Covid happened).
The industry wants to give me a Commodore 64, but I need a PDP-10
This might be interesting to LessWrong as a personal take, because the Alignment folks are effectively on the side of Capital here. Without fast and parallel access to foundation models, I can’t learn my new job, which is auto-codegen-pipeline-maintainer. If some 3rd party brings that bird home to my boss instead of me, I’m going to be unwealthy and unemployed. It’s possible I’m too late already… but at any rate some people will be too late. If I were them (and maybe I am), I’d be angry.
I think a lot of people realize this already, and some have already ascended. I think the ascended people are being quiet right now because they realize the stuff I put above, and don’t mind less competition. What LessWrong thinks about that, I don’t care; I’m actually fine with it as long as I get to join the ascended. I suspect that’s a common attitude. If you don’t hear from me again, it’s because I figured it out. If this sounds crazy, I’m interested in hearing why.
‘If some 3rd party brings that bird home to my boss instead of me, I’m going to be unwealthy and unemployed.’
Have you talked to your boss about this? I have; for me the answer was some combination of:
“Oh but using AI would leak our code”
“AI is a net loss to productivity because it errors too much / has context length limitations / doesn’t care for our standards”
And that is not solvable by a third party, so my job is safe. What about you?
mm.. I gave the wrong impression there; my actual boss doesn’t have a huge opinion on AI; in fact he’ll take some convincing.
I should state my assumptions:
software engineering will be completely automated in the next 3 years
in the beginning and maybe for a while, it will require advanced models and workflows
the workflows will be different enough between companies that it’s worthwhile to employ some well-paid engineers at each company to maintain them.
these engineers will have a much easier time finding a well-paying job than ‘regular’ software engineers
while this is going on, consulting and SaaS companies will be (successfully) booting up efforts to replace software engineers with paid products.
So at some point, my employer (whoever they are at the time) will have to choose between retaining me, and paying an AI-pipeline-maintenance vendor.
Or maybe whoever I work for at the time gets outcompeted by companies that use advanced AI workflows to generate software; then I get laid off, and I also don’t have the kind of experience necessary to work for the competitor.
If you don’t think my assumptions hold, then you should think your career is safe. If they do hold, there’s still the possibility of noticing later and reacting by retooling to remain employable. But if you don’t notice in time, there’s nothing your boss (or the CTO, for that matter) can do to help you. Which is why I need to build this knowledge into my career by applying it: get it on the resume, prove the value IRL.
Nice. So something like grabbing a copy of the swebench dataset, writing a pipeline that would solve those issues, then putting that on your CV?
I will say though that your value as an employee is not ‘producing software’ so much as solving business problems. How much conviction do you have that producing software marginally faster using AI will improve your value to your firm?
I think an important part is building your own (company’s) collection of examples to train against, since the foundation models are trained against swebench already. And if it works, the advantage would show up on my CV in the worst case, and as equity appreciation in the best case. So, just like any skill, right?
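To make that concrete: SWE-bench itself is on the HF hub, so a private counterpart is mostly a matter of matching its record shape. The dataset ID and field name below are from the public release as far as I know; the internal example is entirely hypothetical:

```python
# Load the public SWE-bench to see the schema, then keep your own company
# issues in the same shape so the same pipeline can run against both.
from datasets import load_dataset

swebench = load_dataset("princeton-nlp/SWE-bench", split="test")
print(swebench[0]["problem_statement"])  # the GitHub issue text to be solved

# Hypothetical internal example, mirroring the fields your pipeline cares about.
company_example = {
    "repo": "acme/billing",
    "problem_statement": "Invoices with zero line items crash the PDF export.",
    "patch": "...",        # the known-good fix, for scoring
    "test_patch": "...",   # the test that the fix must make pass
}
```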
You’re right that the whole thing only works if the business can generate returns to high quality code, and can write specifications faster than its complement of engineers can implement them. But I’ve been in that position several times, it does happen. Mainly when the core functionality of the product is designed and led by domain experts who are not software engineers. Like if you make software for accountants for instance.
The reasons you give, btw, don’t give me much consolation. The code-leaking thing is very temporary; if you could host cutting-edge models on AWS or Azure it wouldn’t be an issue for most companies. If you could self-host them it wouldn’t be an issue for almost /any/ companies. The errors thing is a crux. The basic solution to that, I think, is scaling: multishot the problem, rank the solutions, test in every way imaginable, and then for each solved problem optimize your prompts until they can one-shot it, keeping a backlog of examples to perform workflow regression testing against.
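A rough sketch of that loop, where `generate_patch` and `apply_and_test` stand in for whatever codegen step and test harness you actually have:

```python
# Multishot-and-rank, sketched: sample N candidate patches, score each by how
# much of the test suite it passes, keep the winner, and log every fully
# solved problem into a backlog used for workflow regression testing later.
def best_of_n(issue: str, generate_patch, apply_and_test, backlog: list, n: int = 8):
    scored = []
    for _ in range(n):
        patch = generate_patch(issue)              # one LLM attempt
        passed, total = apply_and_test(patch)      # e.g. pytest pass counts
        scored.append((passed / max(total, 1), patch))
    score, best = max(scored, key=lambda s: s[0])
    if score == 1.0:
        backlog.append({"issue": issue, "patch": best})  # regression example
    return score, best
```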
The style thing is very tractable, AIs love following style instructions.
The big moment for me was realizing that while each AI’s context window is limited, within that window you can ask LOTS of different questions and expect a pretty good answer. So you ask questions that compress the information in the window for the purposes of your problem (LLMs are pretty darn good at summarizing), and keep doing that until you have enough context to solve the problem.
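In code, the idea is just iterated, purpose-directed summarization. `ask` is again a placeholder for your LLM call, and the chunking strategy is up to you:

```python
# Sketch of purpose-directed context compression: summarize each chunk only
# with respect to the problem at hand, then solve against the compressed notes.
def solve_with_compressed_context(chunks, problem, ask):
    notes = [
        ask(
            "Summarize only what in this text matters for solving the "
            f"problem below, as tersely as possible.\nProblem: {problem}\n\n"
            f"Text:\n{chunk}"
        )
        for chunk in chunks
    ]
    return ask(
        f"Problem: {problem}\n\nRelevant notes:\n" + "\n".join(notes) +
        "\n\nSolve the problem using only these notes."
    )
```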