One of the primary blockers to using LLMs in production is that they can exhibit all sorts of unexpected behavior. This will only get worse in worlds where agents are widespread, which will be a big problem for companies looking to deploy LM agents. The idea here is to build a developer framework that puts LM agents on heavy guardrails, strictly defining the set of actions an LM agent can take given its state and environment. This set of actions would be deterministic, well understood by the developer, easy to use, and human-legible. If this becomes the de facto standard for building agents in enterprise use cases, the future will be safer.
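To make this concrete, here is a minimal sketch of what such "heavy guardrails" could look like: the developer enumerates the legal actions for each agent state up front, and anything the LM proposes outside that set is rejected deterministically. All names here (`GuardrailedAgent`, `Action`, the example state) are hypothetical illustrations, not an existing framework:

```python
# Minimal sketch (hypothetical, not an existing library): the developer
# enumerates the legal actions per agent state; anything the LM proposes
# outside that set fails deterministically before execution.
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Action:
    name: str
    handler: Callable[[dict], dict]  # takes the environment, returns the new environment

class GuardrailedAgent:
    def __init__(self, policy: dict[str, list[Action]]):
        # policy maps a state label to the exhaustive list of allowed actions
        self.policy = policy

    def step(self, state: str, proposed: str, env: dict) -> dict:
        allowed = {a.name: a for a in self.policy.get(state, [])}
        if proposed not in allowed:
            # Deterministic, human-legible failure instead of arbitrary behavior
            raise PermissionError(
                f"Action {proposed!r} is not permitted in state {state!r}; "
                f"allowed: {sorted(allowed)}"
            )
        return allowed[proposed].handler(env)

# Usage: the LM suggests 'refund_order'; the framework executes it only
# because the developer whitelisted it for the 'support_ticket_open' state.
agent = GuardrailedAgent({
    "support_ticket_open": [Action("refund_order", lambda env: {**env, "refunded": True})],
})
env = agent.step("support_ticket_open", "refund_order", {"order_id": 42})
```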
OpenAI’s GPT framework is a potential competitor here. Building great developer frameworks for LM agents is part of their vision of the future. I think they may be optimizing first for out-of-the-box ease of use and for consumer use cases, which could lead them to neglect some key features of a framework like this, but they pose a substantial threat to this business.
Speaking of restricting LM agents/LM behaviour, the Towards Realistic ODDs For Foundation Model Based AI Offerings project is precisely about that, though I don’t know how to turn it into a business that is independent of a leading LLM vendor (OpenAI, Anthropic, Google).
But I think we shouldn’t necessarily constrain ourselves to “LM agents”. Even though this is the most capable paradigm at the moment, it’s not clear whether it will remain so in the future. E.g., Yann LeCun keeps repeating that LLMs as a paradigm are “doomed” (as a path to superior, human+ levels of competence and reasoning robustness) and bets on his hierarchical representation-learning and prediction agent architecture (H-JEPA). There are also other approaches, e.g., OpenCog Hyperon, or the approach of www.liquid.ai (from their name, as well as some earlier interviews with Joscha Bach, who is apparently part of their team, I can deduce that their approach involves artificial neurons that can reassign their connectivity to other neurons, somewhat in the spirit of Cooperative GNNs).
And even if LeCun is wrong in his prediction, from the AI safety perspective it might not make sense to “join the race” towards LLM-based AGI if we consider that paradigm (at least practically) irreparably uncontrollable or uninterpretable. There is still a big scientific question mark over this, though, as well as over LeCun’s prediction.
If you actually believe that the LM paradigm for ubiquitous agency in the economy and society is flawed (as I do), pursuing alternative AI paradigms, even if you think your chances of global success are small, would save you some “dignity points”. And this is the stance that Verses.ai, Digital Gaia, Gaia Consortium, and Bioform Labs are taking, advocating for and developing the paradigm of Bayesian agents. Though the key argument for this paradigm (vs. language modelling) is not interpretability or “local controllability/robustness”, but rather that “mixing” Bayesian reference frames into a single bundle (an LLM) loses information necessary for reliable cooperation, credit assignment, and “global” controllability/robustness[1]. This perhaps sounds cryptic, sorry; it deserves a much longer discussion, and hopefully we will publish something about it soon.
In this context, I want to draw an analogy between competition among programming languages/frameworks/IDEs/tooling and competition among AI/agent platforms.
Programming languages and frameworks are most often discussed and compared in terms of:
Convenience and efficiency for solving tasks in various fields, or in general: for example, comparing the performance of general-purpose programming languages;
Providing certain guarantees at the level of semantics and the computational model: typing, numerical computation, memory safety, and concurrency;
Maturity and convenience of tooling (IDEs, prototyping tools, debuggers, DevOps, package managers) and of the ecosystem of libraries and plugins.
This aligns quite accurately with the lines of comparison for agent and AI platforms:
Convenience and efficiency in solving problems in robotics, medicine, politics, and other domains, as well as the scalability of intelligence that can be built on these platforms in general;
The ability to specify, track, and verify (‘statically’, i.e., mathematically, or ‘dynamically’, i.e., empirically) various constraints on and guarantees about the operation of agents and/or the entire system as a whole (see the toy sketch after this list);
Composability of components (and the scalability of this composability), platforms for their publication: see OpenAI Store, SingularityNET, ‘Model/Skill universe’ in Digital Gaia, etc.
Tools for development and debugging, and ‘IDEs’ for AI and agents, are not much discussed yet, but they soon will be.
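As a toy illustration of the ‘dynamic’ verification item above (my own construction, not any existing platform’s API): constraints are declared once as plain predicates over the agent’s observable state and enforced at every step, much as assertions or property-based tests are in a conventional codebase:

```python
# Toy sketch (hypothetical): declare constraints once, enforce them
# dynamically on every agent step.
from typing import Callable

Constraint = Callable[[dict], bool]  # predicate over the agent's observable state

def monitored_run(step: Callable[[dict], dict], constraints: list[Constraint],
                  state: dict, n_steps: int) -> dict:
    """Run an agent loop, verifying every declared constraint after each step."""
    for _ in range(n_steps):
        state = step(state)
        for check in constraints:
            if not check(state):
                raise RuntimeError(f"Constraint {check.__name__} violated: {state}")
    return state

# Example: spend never exceeds budget, and the agent never leaves its sandbox.
def within_budget(s: dict) -> bool:
    return s["spend"] <= s["budget"]

def in_sandbox(s: dict) -> bool:
    return s["cwd"].startswith("/sandbox")

final = monitored_run(
    step=lambda s: {**s, "spend": s["spend"] + 1},  # stand-in for one agent action
    constraints=[within_budget, in_sandbox],
    state={"spend": 0, "budget": 10, "cwd": "/sandbox/run1"},
    n_steps=5,
)
```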
[1] Through modern control theory and the theory of feedback; see the work by John Doyle’s group.
Just to develop this in the context of this post (how can we make something for-profit to advance AI safety?), I want to highlight a direction of thought that I didn’t notice in your post: creating economic value by developing mechanisms for multi-agent coordination and cooperation. This falls under the “understanding cooperation” and “understanding agency” categories in this agenda list, although I’d replace “understanding” with “building” there.
Solving practical problems is a great way to keep the research grounded in reality, and also to battle-test it.
There are plenty of economically valuable and neglected opportunities for improving coordination:
Energy and transportation: Enterprises plan their production and logistics to optimise for energy use (and grid stability, given variable generation) and for efficient use of logistics systems. (Verses is tackling logistics.)
More generally, how could independent businesses coordinate? Cf. Bioform Labs’ vision.
Agriculture: Farmers coordinate on who plants what (and which fertilisers and chemicals they apply, how much water they use, etc.) to optimise food production at the regional, national, and international levels, from the perspectives of food security (robustness to extreme weather events and pests) and demand. Digital Gaia is tackling this.
Finance: Collectives of people (maybe extended families) pool their resources to unlock certain investment and financial instruments (like private debt) and to reduce the associated management overhead. Coordination is required to negotiate among their diverse investment goals, risk appetites, and ethical and other investment constraints, and to create the best “balanced” investment strategy.
Networking: An app that recommends people to meet, optimising cumulative outcomes without overloading certain people “in high demand”, like billionaires (a toy sketch of such load-capped matching appears after this list).
Attention economy: People optimise attention to comments, i.e., effectively a collaborative rating system that keeps the amount of “work” on everyone manageable: this is what I’m alluding to here.
Team/org learning: The “info agents” that I mentioned in the other comment could be coordinated within teams to optimise team learning: to avoid both “everyone reads the same stuff” and “everyone reads their own stuff, with no intersection”. My understanding is that something like this was implemented at Google/DeepMind.
Medical treatment research: It’s well known that the current paradigm for testing drugs and treatments is pseudo-scientific: measuring the average effect of a drug on a population doesn’t predict whether the drug will help a particular patient. Judea Pearl is a famous champion of this claim, which I agree with. Clinical trials could instead be coordinated between participants so that they actually infer causal models of the treatment’s effect on individuals rather than a “frequentist population average”, requiring the minimum number of people and the minimum trial duration.
Nutrition: Assuming that personal causal models of people’s responses to particular foods are built, coordinate family menus to optimise everyone’s health while not requiring too much cooking and maximising meal sharing for social reasons.
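For the networking bullet above, here is a toy sketch (my own construction, not any of the named companies’ algorithms) of load-capped matching: pick the highest-value meetings greedily while capping how many meetings any one in-demand person takes:

```python
# Toy sketch (hypothetical): greedy matching that maximises total match
# value while respecting a per-person meeting capacity.
def recommend_meetings(value: dict[tuple[str, str], float],
                       capacity: dict[str, int]) -> list[tuple[str, str]]:
    load = {p: 0 for p in capacity}
    picked = []
    # Consider the highest-value pairs first; skip any that would overload someone.
    for (a, b), v in sorted(value.items(), key=lambda kv: -kv[1]):
        if load[a] < capacity[a] and load[b] < capacity[b]:
            picked.append((a, b))
            load[a] += 1
            load[b] += 1
    return picked

pairs = {("alice", "billionaire"): 9.0, ("bob", "billionaire"): 8.5,
         ("alice", "bob"): 4.0}
# The in-demand person gets a hard cap of one meeting.
print(recommend_meetings(pairs, {"alice": 2, "bob": 2, "billionaire": 1}))
# -> [('alice', 'billionaire'), ('alice', 'bob')]
```

A greedy pass is just the simplest stand-in here; the same capacity constraints could be fed into a proper assignment or matching solver.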
Apart from creating better mechanisms and algorithms for coordination by building businesses in all these diverse verticals (and hoping that these coordination algorithms will transfer to some abstract “AI coordination” or “human-AI coordination”), there is a macro-strategy of sharing information between all these domain-specific models, thus creating a loosely coupled, multi-way mega-model of the world as a whole. Our bet at Gaia Consortium is that this “world model merge” is very important for ameliorating the multi-polar risks that @Andrew_Critch has written about here and that Dan Hendrycks generally refers to as “AI Race” risks.
The strategy I described above is also highly aligned with the Earth Systems Predictability vision (“a roadmap for a planetary nervous system”) from Trillium Tech, which is also a quasi-for-profit org.