This sounds a bit like davidad’s agenda in ARIA, except you also limit the AI to only writing provable mathematical solutions to mathematical questions to begin with.
In general, I would say that you probably need better feedback loops than that, for example by writing more on LW, consulting with more people, or joining a fellowship or other program.
This seems like a misunderstanding / not my intent. (Could you maybe quote the part that gave you this impression?)
I believe Dusan was trying to say that davidad’s agenda limits the planner AI to only writing provable mathematical solutions. To expand: compared to what you briefly describe, I believe the idea in davidad’s agenda is that you don’t try to build a planner that’s definitely inner aligned; you simply have a formal verification system that ~guarantees what effects a plan will and won’t have if implemented.
To answer the things Raymond did not: it is hard for me to say who has an agenda that you would think has a good chance of solving alignment. I’d encourage you to reach out to people who pass your bar, perhaps more frequently than you do now, and establish those connections. Your no-audio, no-video limits do make it hard to participate in something like the PIBBSS Fellowship, but it is perhaps worth taking a shot at it or at others. See if people whose ideas you like are mentoring in some programs; getting to work with them in structured ways may be easier than otherwise.