I was interested in your post and noticed it didn’t have a summary, so I generated one using a summarizer script I’ve been working on and iteratively improving:
Scaffolded large language models (LLMs) have emerged as a new type of general-purpose natural language computer. With the advent of GPT-4, these systems have become viable at scale, wrapping a programmatic scaffold around an LLM core to achieve complex tasks. Scaffolded LLMs resemble the von Neumann architecture, but operate on natural language text rather than bits.
The LLM serves as the CPU, while the prompt and context function as RAM. The memory in digital computers is analogous to the vector database memory of scaffolded LLMs. The scaffolding code surrounding the LLM core implements protocols for chaining individual LLM calls, acting as the “programs” that run on the natural language computer.
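To make the analogy concrete, here is a minimal sketch of such a scaffold. Everything in it is a made-up illustration rather than anything from the post: `fake_llm` is a stub standing in for a real model call, `Scaffold` is a hypothetical class name, and retrieval uses plain word overlap where a real system would presumably query a vector database with embeddings.

```python
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call: upper-cases the last prompt line."""
    return prompt.splitlines()[-1].upper()


class Scaffold:
    """Toy scaffold: the LLM is the 'CPU', `context` the 'RAM', `memory` the long-term store."""

    def __init__(self, llm: Callable[[str], str], context_limit: int = 3):
        self.llm = llm
        self.context: List[str] = []  # bounded window of text the LLM actually sees
        self.memory: List[str] = []   # unbounded store (assumption: a plain list, not a vector DB)
        self.context_limit = context_limit

    def retrieve(self, query: str) -> List[str]:
        # Toy retrieval: return stored items sharing at least one word with the query.
        words = set(query.lower().split())
        return [m for m in self.memory if words & set(m.lower().split())]

    def step(self, instruction: str) -> str:
        # One "program" step: load memory into context, call the LLM, write the result back.
        recalled = self.retrieve(instruction)
        window = (recalled + self.context + [instruction])[-self.context_limit:]
        output = self.llm("\n".join(window))
        self.context = (window + [output])[-self.context_limit:]
        self.memory.append(output)
        return output
```

Calling `Scaffold(fake_llm).step("hello world")` runs one such step; chaining several `step` calls is the toy equivalent of a program on the natural language computer.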
Performance metrics for natural language computers include context length (RAM) and Natural Language OPerations (NLOPs) per second. Exponential improvements in these metrics are expected to continue for the next few years, driven by the increasing scale and cost of LLMs and their training runs.
Programming languages for natural language computers are in their early stages, with primitives like Chain of Thought, Selection-Inference, and Reflection serving as assembly languages. As LLMs improve and become more reliable, better abstractions and programming languages will emerge.
The execution model of natural language computers is an expanding Directed Acyclic Graph (DAG) of parallel NLOPs, resembling a dataflow architecture. The memory hierarchy in scaffolded LLMs currently has two levels (the context and the vector database), but as designs mature, additional levels may be developed.
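The DAG execution model can be sketched roughly as follows, using Python's standard `graphlib`. The function name `run_dag`, the node names, and the lambda "NLOPs" are all hypothetical placeholders; in a real scaffold each op would be an LLM call, and the nodes returned together by `get_ready()` could be dispatched in parallel.

```python
from graphlib import TopologicalSorter


def run_dag(ops, deps, inputs):
    """Run ops in dependency order.

    ops:    node name -> function taking a list of parent outputs
    deps:   node name -> set of parent node names
    inputs: node name -> precomputed value for source nodes
    """
    results = dict(inputs)
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    while sorter.is_active():
        # Nodes returned together have no dependencies on each other.
        for name in sorter.get_ready():
            if name not in results:
                parents = sorted(deps.get(name, ()))
                results[name] = ops[name]([results[p] for p in parents])
            sorter.done(name)
    return results


# Two independent ops over one document, then a merge step.
deps = {"summary_a": {"doc"}, "summary_b": {"doc"}, "merge": {"summary_a", "summary_b"}}
ops = {
    "summary_a": lambda xs: "A(" + xs[0] + ")",
    "summary_b": lambda xs: "B(" + xs[0] + ")",
    "merge": lambda xs: " + ".join(xs),
}
out = run_dag(ops, deps, {"doc": "text"})
```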
Unlike digital computers, scaffolded LLMs face challenges in reliability, underspecifiability, and non-determinism. Improving the reliability of individual NLOPs is crucial for building powerful abstractions and abstract languages. Error correction mechanisms may be necessary to create coherent and consistent sequences of NLOPs.
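One simple form such error correction could take is repeating an unreliable op and taking a majority vote over the answers. This is my own illustrative choice, not a mechanism the post specifies, and `majority_vote` and `op` are hypothetical names; `op` stands in for any zero-argument call that returns an LLM's answer.

```python
from collections import Counter
from typing import Callable, Tuple


def majority_vote(op: Callable[[], str], n: int = 5) -> Tuple[str, float]:
    """Run a non-deterministic op n times; return the most common answer and its share."""
    answers = [op() for _ in range(n)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / n
```

The returned share gives a crude agreement score, which a scaffold could use to decide whether to retry or escalate.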
Despite these challenges, the flexibility of LLMs offers great opportunities. The set of op-codes is not fixed but ever-growing, allowing for the creation of entire languages based on prompt templating schemes. As natural language programs become more sophisticated, they will likely delegate specific ops to the smallest and cheapest language models capable of reliably performing them.
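As a rough sketch of both ideas, op-codes can be modeled as named prompt templates, with a router that picks the cheapest model believed to handle each op reliably. The op names, model names, costs, and reliability sets below are invented for illustration and do not describe any real model.

```python
# Op-codes as named prompt templates; new ones can be registered at any time.
OPCODES = {
    "summarize": "Summarize the following text in one sentence:\n{text}",
    "classify": "Answer yes or no: does this text mention {topic}?\n{text}",
}

# Hypothetical model table: relative cost and the ops each model handles reliably.
MODELS = [
    {"name": "small", "cost": 1, "reliable_ops": {"classify"}},
    {"name": "medium", "cost": 10, "reliable_ops": {"classify", "summarize"}},
    {"name": "large", "cost": 100, "reliable_ops": {"classify", "summarize", "critique"}},
]


def define_op(name: str, template: str) -> None:
    """Extend the instruction set with a new op-code."""
    OPCODES[name] = template


def render(name: str, **fields: str) -> str:
    """Fill in an op-code's template to produce the prompt sent to a model."""
    return OPCODES[name].format(**fields)


def cheapest_model_for(op: str) -> str:
    """Pick the lowest-cost model listed as reliable for this op."""
    candidates = [m for m in MODELS if op in m["reliable_ops"]]
    if not candidates:
        raise LookupError(f"no model is marked reliable for op {op!r}")
    return min(candidates, key=lambda m: m["cost"])["name"]


define_op("critique", "List one weakness of this argument:\n{text}")
```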
If you have feedback on the quality of this summary, you can easily indicate that using LessWrong’s agree/disagree voting.