Maybe it helps if I start by giving some different applications one might want to use artificial agency for:
As a map: We might want to use the LLM as a map of the world, for instance by prompting it with data from the world and having it assist us in navigating that data. Now, the purpose of a map is to carry as little information about the world as possible while still providing the minimal backbone needed to navigate it.
This doesn’t work well with LLMs because they are instead trained to model information, so they will carry as much information as possible, and any map-making they do is accidental, driven by mimicking the output they have seen from mapmakers, rather than primarily by an attempt to eliminate information about the world.
As a controller: We might want to use the LLM to apply small pushes to a chaotic system at the moments when it reaches bifurcations where its state is extremely sensitive, so that the system moves in a desirable direction. But again, I think LLMs are so busy copying information around that they don’t notice such sensitivities except by accident (a toy sketch of this kind of sensitivity follows these examples).
As a coder: Since LLMs are so busy outputting information rather than manipulating “energy”, maybe we could hope that they could assemble a big pile of information that we could then “energize” in a relevant way, e.g. they could write a large codebase that we could execute on a CPU, giving us a program that does something interesting in the world. For this to work, though, the program must not contain obstacles that stop the “energy” dead in its tracks (e.g. bugs that cause it to crash). But again, the LLM isn’t optimizing for that; it’s just trying to copy around information that looks like software, and it only makes space for the energy of the CPU and the program’s functionality as a side-effect. (Or as the old saying goes, it’s maximizing lines of code written, not minimizing lines of code used.)
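To make the sensitivity point in the “controller” example concrete, here is a minimal toy sketch (my own illustration, not part of the original discussion): a double-well system whose long-run state is determined entirely by the direction of a tiny push applied at its unstable equilibrium. A controller that notices such sensitive moments gets enormous leverage for almost no “energy”; the claim above is that LLMs don’t look for them.

```python
# Toy illustration (an assumption for exposition, not from the discussion above):
# a bistable system whose final state is decided entirely by the direction of a
# vanishingly small push applied while the state sits at its most sensitive point.
# Dynamics: dx/dt = x - x**3, so x = 0 is unstable and x = +/-1 are stable attractors.

def settle(push: float, steps: int = 2000, dt: float = 0.01) -> float:
    """Start at the unstable equilibrium, apply one tiny push, and let the system settle."""
    x = 0.0 + push                # the only intervention: a tiny nudge at the sensitive state
    for _ in range(steps):
        x += (x - x**3) * dt      # forward-Euler step of the double-well dynamics
    return x

print(settle(+1e-6))   # settles near +1
print(settle(-1e-6))   # settles near -1
```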
So, that gives us the thesis: To interpret the LLMs, we’d want to build a map of how they connect to the energy in the world, but they really don’t connect very well, so there’s not much to build a map of. The only thing you could really point out is the (input, output) relationships, but once you’ve characterized concrete (input, output) pairs, there’s not really much more of interest to say.
I’m sorry, but I still don’t really understand what you mean here. The phrase “the use of LLMs is so brief” is ambiguous to me. Do you mean to say:
a new, better LLM will come out soon anyway, making your work on current LLMs obsolete?
LLM context windows are really small, so you “use” them only for a brief time?
the entire LLM paradigm will be replaced by something else soon?
something totally different from all of the above?
Perhaps both the first and the second, but especially the second: As described above, we might hope to use them extensively and recursively to build up a big thing, because then for interpretability you could study how to manipulate the contours of that big thing. But that doesn’t really work. So people only use them briefly, rather than extensively.
I thought the idea behind the methods I linked was to serve as building blocks for future work on ontology identification, and ultimately for getting a clearer picture of what is going on internally, which is a crucial part of approaches like Wentworth’s “Retarget the Search” and other research directions like it.
Retargeting the search is only interesting if the search is able to do big things in the world, which, according to the thesis above, LLMs are not.