jacquesthibs

Karma: 2,571

I work primarily on AI Alignment. Scroll down to my pinned Shortform for an idea of my current work and who I’d like to collaborate with.

Website: https://jacquesthibodeau.com

Twitter: https://twitter.com/JacquesThibs

GitHub: https://github.com/JayThibs

LinkedIn: https://www.linkedin.com/in/jacques-thibodeau/

jacquesthibs Apr 24, 2025, 3:38 AM
4 points
0
in reply to: Wei Dai’s comment on: o3 Is a Lying Liar
I can’t think of anyone making a call worded like that. The closest I can think of is Christiano mentioning, in a 2023 talk on how misalignment could lead to AI takeover, that we’re pretty close to AIs doing things like reward hacking and threatening users, and that he doesn’t think we’d shut down this whole LLM thing even if that were the case. He also mentioned we’ll probably see some examples in the wild, not just internally.
Paul Christiano: I think a lot depends on both. (27:45) What kind of evidence we’re able to get in the lab. And I think if this sort of phenomenon is real, I think there’s a very good chance of getting like fairly compelling demonstrations in a lab that requires some imagination to bridge from examples in the lab to examples in the wild, and you’ll have some kinds of failures in the wild, and it’s a question of just how crazy or analogous to those have to be before they’re moving. (28:03) Like, we already have some slightly weird stuff. I think that’s pretty underwhelming. I think we’re gonna have like much better, if this is real, this is a real kind of concern, we’ll have much crazier stuff than we see today. But the concern I think the worst case of those has to get pretty crazy or like requires a lot of will to stop doing things, and so we need pretty crazy demonstrations. (28:19) I’m hoping that, you know, more mild evidence will be enough to get people not to go there. Yeah.
Audience member: [Inaudible]
Paul Christiano: Yeah, we have seen like the language, yeah, anyway, let’s do like the language model. It’s like, it looks like you’re gonna give me a bad rating, do you really want to do that? I know where your family lives, I can kill them. (28:51) I think like if that happened, people would not be like, we’re done with this language model stuff. Like I think that’s just not that far anymore from where we’re at. I mean, this is maybe an empirical prediction. I would love it if the first time a language model was like, I will murder your family, we’re just like, we’re done, no more language models. (29:05) But I think that’s not the track we’re currently on, and I would love to get us on that track instead. But I’m not [confident we will].

jacquesthibs Apr 19, 2025, 8:42 PM
3 points
0
in reply to: Raemon’s comment on: What Makes an AI Startup “Net Positive” for Safety?
Yeah, thanks! I agree with @habryka’s comment, though I’m a little worried it may shut down conversation since it might make people think the conversation is about AI startups in general and less about AI startups in service of AI safety. This is because people might consider the debate/question answered after agreeing with the top comment.
That said, I do hear the “any AI startup is bad because it increases AI investment and therefore reduces timelines” argument, so I think it’s worth at least getting more clarity on this.

jacquesthibs Apr 17, 2025, 4:57 PM
78 points
2
on: jacquesthibs’s Shortform
Three Epoch AI employees* are leaving to co-found an AI startup focused on automating work:
“Mechanize will produce the data and evals necessary for comprehensively automating work.”
They also just released a podcast with Dwarkesh.
*Matthew Barnett, Tamay Besiroglu, Ege Erdil
What links here?
- Epoch AI alumni launch Mechanize to “automate the whole economy” by Henry Stanley 🔸 (EA Forum; Apr 18, 2025, 10:12 AM; 101 points)
- What Makes an AI Startup “Net Positive” for Safety? by jacquesthibs (Apr 18, 2025, 8:33 PM; 80 points)

jacquesthibs Apr 15, 2025, 4:17 PM
2 points
0
on: jacquesthibs’s Shortform
In case this is useful to anyone in the future: LTFF does not provide funding to for-profit organizations. I wasn’t able to find mentions of this online, so I figured I should share.
I was made aware of this after being rejected today for applying to LTFF as a for-profit. We updated them 2 weeks ago on our transition into a non-profit, but it was unfortunately too late, and we’ll need to send a new non-profit application in the next funding round.

jacquesthibs Apr 8, 2025, 8:27 PM
4 points
0
in reply to: Alexander Gietelink Oldenziel’s comment on: Alexander Gietelink Oldenziel’s Shortform
FWIW, I was always concerned about people trying to make long-horizon forecast predictions because they assumed superforecasting would extrapolate beyond the sub-1-year predictions that were tested.
As an alternative, that’s why I wrote about strategic foresight to focus on robust plans rather than trying to accurately predict the actual scenario.

jacquesthibs Apr 8, 2025, 4:03 PM
2 points
0
in reply to: jacquesthibs’s comment on: jacquesthibs’s Shortform
We got our first 10k! Woo!

jacquesthibs Apr 3, 2025, 9:52 PM
4 points
0
on: jacquesthibs’s Shortform
Coordinal Research: Accelerating the research of safely deploying AI systems.
We just put out a Manifund proposal to take short timelines and automating AI safety seriously. I want to make a more detailed post later, but here it is: https://manifund.org/projects/coordinal-research-accelerating-the-research-of-safely-deploying-ai-systems

jacquesthibs Feb 14, 2025, 4:39 PM
3 points
0
on: Announcing the Q1 2025 Long-Term Future Fund grant round
When is the exact deadline? Is it EOD AOE on February 15th or February 14th? “By February 15th” can sound like the deadline hits as soon as it’s the 15th.
Have seen a few people ask this question in some Slacks.

jacquesthibs Feb 5, 2025, 2:01 PM
2 points
0
on: jacquesthibs’s Shortform
I keep hearing about dual-use risk concerns when I mention automated AI safety research. Here’s a simple solution that could even work in a startup setting:
Keep all of the infrastructure internally and only share with vetted partners/researchers.
You can kill two birds with one stone:
- Does not turn into a mass-market product that leads to dual-use risks.
- Builds a moat: the complex internal infrastructure is not shared, only the product of that system is. Investors love moats; you just have to convince them that this is the way to go for a product like this these days.
You don’t market the product to the mass market; you just find partners and use the system to spin out products and businesses that have nothing to do with frontier models. So, you can repurpose the system for specific application areas without releasing the platform and process, which would be copied in a day in the age of AI anyways.

jacquesthibs Feb 5, 2025, 1:24 AM
3 points
0
on: Anti-Slop Interventions?
I’m currently working on de-slopifying and building an AI safety startup with this as a central pillar.* Happy to talk privately with anyone working on AI safety who is interested in this.
*almost included John and Gwern’s posts on AI slop as part of a recent VC pitch deck.

jacquesthibs Feb 5, 2025, 1:20 AM
LW: 6 AF: 3
−1
AF
in reply to: evhub’s comment on: Anti-Slop Interventions?
I’m working on this. I’m unsure if I should be sharing exactly what I’m working on with a frontier AGI lab though. How can we be sure this just leads to differentially accelerating alignment?
Edit: my main consideration is when I should start mentioning details. As in, should I wait until I’ve made progress on alignment internally before sharing with an AGI lab. Not sure what people are disagreeing with since I didn’t make a statement.

jacquesthibs Jan 24, 2025, 6:12 PM
27 points
1
on: jacquesthibs’s Shortform
I’m currently in the Catalyze Impact AI safety incubator program. I’m working on creating infrastructure for automating AI safety research. This startup is attempting to fill a gap in the alignment ecosystem and looking to build with the expectation of under 3 years left to automated AI R&D. This is my short timelines plan.
I’m looking to talk (for feedback) to anyone interested in the following:
- AI control
- Automating math to tackle problems as described in Davidad’s Safeguarded AI programme.
- High-assurance safety cases
- How to robustify society in a post-AGI world
- Leveraging large amounts of inference-time compute to make progress on alignment research
- Short timelines
- Profitability while still reducing overall x-risk
- Spinning out traditional businesses within the org to fund the rest of the work (thereby reducing investor pressure), if you have an entrepreneurial spirit
If you’re interested in chatting or giving feedback, please DM me!

jacquesthibs Jan 23, 2025, 5:18 PM
9 points
0
on: jacquesthibs’s Shortform
Looks like Meta is panicking over DeepSeek R1

jacquesthibs Jan 21, 2025, 10:45 PM
2 points
0
on: jacquesthibs’s Shortform
Are you or someone you know:
1) great at building (software) companies
2) care deeply about AI safety
3) open to talk about an opportunity to work together on something
If so, please DM me with your background. If someone comes to mind, also DM me. I am thinking of a way to build companies that fund AI safety work.

jacquesthibs Jan 18, 2025, 1:15 PM
4 points
0
in reply to: Charlie Steiner’s comment on: Charlie Steiner’s Shortform
In case you didn’t read Paul’s reasoning.

jacquesthibs Jan 14, 2025, 3:57 PM
2 points
0
in reply to: Bogdan Ionut Cirstea’s comment on: Building AI Research Fleets
Agreed, but I will find a way.

jacquesthibs 14 Jan 2025 14:12 UTC
11 points
0
on: Building AI Research Fleets
Hey Ben and Jesse!
This comment is more of a PSA:

I am building a startup focused on making this kind of thing exceptionally easy for AI safety researchers. I’ve been working as an AI safety researcher for a few years. I’ve been building an initial prototype and I am in the process of integrating it easily into AI research workflows. So, with respect to this post, I’ve been actively working towards building a prototype for the “AI research fleets”.
I am actively looking for a CTO I can build with to +10x alignment research in the next 2 years. I’m looking for someone absolutely cracked and it’s fine if they already have a job (I’ll give my pitch and let them decide).

If that’s you or you know anyone who could fill that role (or who I could talk to that might know), then please let me know!
For alignment researchers or people in AI safety research orgs: hit me up if you want to be pinged for beta testing when things are ready.
For orgs, I’d be happy to work with you to set up automations or give a masterclass on the latest AI tools/automation workflows and maybe provide a custom report (with a video overview) each month so that you can focus on research rather than trying new tools that might not be relevant to your org.
Additional context:
“When we say “automating alignment research,” we mean a mix of Sakana AI’s AI scientist (specialized for alignment), Transluce’s work on using AI agents for alignment research, test-time compute scaling, and research into using LLMs for coming up with novel AI safety ideas. This kind of work includes empirical alignment (interpretability, unlearning, evals) and conceptual alignment research (agent foundations).
We believe that it is now the right time to take on this project and build this startup because we are nearing the point where AIs could automate parts of research and may be able to do so sooner with the right infrastructure, data, etc.
We intend to study how our organization’s work can integrate with the Safeguarded AI thesis by Davidad.”
I’m currently in London for the month as part of the Catalyze Impact programme.
If interested, send me a message on LessWrong or X or email (thibo.jacques @ gmail dot com).

jacquesthibs 9 Jan 2025 17:59 UTC
2 points
0
in reply to: aysajan’s comment on: How much I’m paying for AI productivity software (and the future of AI use)
It has significantly accelerated my ability to build fully functional websites. To the point where it was basically a phase transition between me building my org’s website and not building it (waiting for someone with web dev experience to do it for me).
I started my website by leveraging the free codebase template he provides on his GitHub and covers in the course.

jacquesthibs 7 Jan 2025 19:29 UTC
4 points
0
in reply to: Noosphere89’s comment on: Alexander Gietelink Oldenziel’s Shortform
I mean that it’s a trade secret for what I’m personally building, and I would also rather people don’t just use it freely for advancing frontier capabilities research.

jacquesthibs 7 Jan 2025 19:18 UTC
4 points
0
in reply to: Noosphere89’s comment on: Alexander Gietelink Oldenziel’s Shortform
Is this because it would reveal private/trade-secret information, or is this for another reason?
Yes (all of the above)
