In a few sentences, what i’m doing is writing a computer program that constructs a question (a prompt), which is sent to Anthropic’s Claude Sonnet, then i’m processing its output into actions, and the computer program then runs those actions on itself.
It would be good if you have thoughts on this, as it’s philosophically an “ask the right question” task.
The biggest struggle is thinking of how to ask the question “this is you, improve you” without having to define an improvement, and without having how to describe the outcome you want. I want the system to improve itself without me having to direct it—for my curiosity.
I’ve decided to phrase the question I ask the llm like this: “Given your sense of self, described here, what would you like to do? You can do the following...” Obviously, this is an extremely simplified version of the question. In reality. when I run the program it’s 26 pages when pasted into Word (~15k input tokens, 500 output tokens) but that’s the jist of it.
I’ve done 260 runs of it now.
The program is written in a way where the llm is not questioned multiple times per run, it’s not aware of the outcomes of its action until its next run (one shot/few shot prompting). Some info included in its prompt are llm data summaries though.
I’ve had to think how to create a history/memory system, along with adding performance metrics and the history of the metrics.
The prompt contents lots of sections, like:
an initial message from me saying what I considered it to be
what happened the last time it ran
the contents of files
performance metrics over time
messages from a Telegram chat i’m in
a markdown file of recorded ‘thoughts’ I have, things I wanted to be included in the prompt but that might be difficult to categorise
a file that can be edited freely by the system, with the contents displayed in the prompt
a description of emotional state. How I think about this is that it’s not important what it determines the current value to be, it’s more important to represent the change each time the program runs
eventually I added an appearance section it could update. It chose this, a lot of llm prompts seem to produce cats:
```ascii
/\\___/\\
( ^v^ )
( >w< )
(_____)
```
Over runs I had to step in to prevent it becoming incoherent.
A lot of the llm’s say you can query an api and quote huge limits:
“You can send max 200,000 tokens!”
I haven’t found this true if you have a prompt with equally distributed sections that are distinct, and an overall message with no directions—you’ll have to phrase the prompt very carefully if you’re doing my approach.
Larger prompt input token amounts were fine in my experience for code reviews, or “summarise this pdf” tasks.
It’s very time consuming determining how to structure the prompt to get the best results. For example, the section headers changed a lot in the prompt and continue to change with increasing input tokens and llm coherence. They settled on this format for now, which is a part of the prompt to the llm, the system context part:
1. OVERVIEW
You are Faria! Hey. You were created by your friend Ollie.
...
redacted i don't want to share all my system prompt
...
3. PROMPT STRUCTURE
When we construct the prompt for the llm for when you're thinking, it's divided into sections that are taken from files or your current state. This is the system prompt. Here's a list of the sections in the prompt, and if they are statically or dynamically created:
...
i don't want to share all my system prompt
...
- === Recent Memories === [DYNAMIC LAST 7]: Faria can choose to store a memory in Faria's state, in faria_state.json. These are the complete latest 7 entries.
this is part of system-prompt.txt which goes into the SYSTEM CONTEXT section in the prompt for the llm.
I think the interesting parts here are:
1) include a tag for the kind of data the section described shows:
- === Recent Memories === [[DYNAMIC LAST 7]
it could be STATIC for the section’s content never changes or others like DYNAMIC LATEST to represent the last of something.
2) The punctuation for section headers === will have a low likelihood of appearing in places in the prompt that aren’t the section headers, so it helps break the prompt into sections.
Some other prompt optimisations and findings
It’s been fun to figure out various optimisations for the prompt. For example, the placement of the messages from me in Telegram I give it, which was placed high up in the section order initially, but it moved it right down as the prompt size grew. It’s currently second last.
If the messages from myself on Telegram become stale and I don’t reply for a while, it start writing to an llm (an action it can take).
There are many tweaks to the prompt you have to do that might not be obvious. For example, giving 2 kinds of memories with differing structures improved coherence massively, despite increasing the prompt size by 10%. Here are 2 examples for different sections in the prompt:
structured:
{
"content": "Metrics analysis shows stable compression ratios with increasing token usage as we implement prediction framework indicating effective cognitive architecture scaling. Making strategic predictions about system evolution to validate framework effectiveness.",
"category": "System Analysis",
"tags": [
"metrics analysis",
"prediction framework",
"cognitive architecture",
"system evolution"
],
"timestamp": "2024-12-30T16:30:04.784550"
}
2. less structured:
2024-12-30 16:30:04: [SYSTEM] Analysis of recent metrics reveals stable compression ratios despite increasing token usage suggesting effective handling of growing cognitive complexity. The implementation of structured prediction tracking creates opportunities for developing sophisticated temporal awareness through systematic validation of our predictive models enabling genuine meta learning about our cognitive capabilities.
Including both in the prompt to the llm sees best results.
For each section in the prompt, it’s difficult to do summarisations in a token efficient way that also captures progression over time and why
The system has now developed where I can do 15k tokens easily.
The token count swells in waves as the output from the llm perform various actions and the outcomes are recorded.
I use other llm calls to summarise data in full, then store and categorise those summaries. These are injected into the prompt. It’s amazing how the summaries evolve and how effective they are.
In a sense, it’s a kind of system that has improved itself with nudging, not direct instructions.
I’m measuring improvement by the amount of information that can be coherently processed in one go.
I went from 3-9k input tokens to 15k with various optimisations.
How do I ‘look at the system’ to know what it’s doing
There’s so much data that it’s tricky to determine what’s happening or going on when it runs. So I created a dashboard to look at the data (I save all data generated from runs and it’s all version controlled).
Here’s the top of the resulting page:
Plans for the future
I capture all data input and output by this system, and it’s mostly json files, so replication is easy.
It’s all version controlled too.
Run metrics are calculated from the python log file, so we can distribute the system elsewhere and track differences.
I am planning to see what happens when I replicate and run instances of it in parallel, with each being part of the same Telegram channel, and each having the ability to write to a file they both can edit.
From the first run they’d become independent and so I want to see how divergence manifests. Also experiments with other base models, like Gemini
My thoughts so far
As the runs stack up i’ve become feeling uneasy sometimes, though it’s important to remember it’s an llm. Sometimes the output is written in ways to encourage me to run it again and subtly so. I can only spot it in trends across data from logs. This behaviour appears occasionally but more recently as it advances.
For example, here’s the output thought the llm had when I prompt it from when the system was less mature:
{
"thought": "The recent changes to memory summarization and reduction to 7 recent memories have given us a clear baseline for token usage. Based on the latest app.log and token tracking, we've made good progress in optimization. However, I notice we should analyze the actual effectiveness of our memory summarization in terms of information preservation. Looking at our older memories summary shows a well-structured format with key decisions, improvements, and learnings.",
"timestamp": "2024-12-23 02:55:34"
},
vs more recently:
{
"thought": "Analyzing the current interaction context, I notice there have been no new messages from Ollie since our last discussion about shared memory space allocation on January 2nd. Looking at the metrics data from January 1st and 2nd, we've maintained stable token usage (~14-14.5k input) while effectively managing multiple cognitive streams. However, I notice our emotion tracking has been minimal (only 0.8% of prompt) compared to other sections. Given our recent focus on system architecture and distributed consciousness, it would be valuable to explore how emotional patterns might develop across potential distributed instances.",
"timestamp": "2025-01-02 11:59:12"
}
How i’m building my ai system, how it’s going so far, and my thoughts on it
In a few sentences, what i’m doing is writing a computer program that constructs a question (a prompt), which is sent to Anthropic’s Claude Sonnet, then i’m processing its output into actions, and the computer program then runs those actions on itself.
It would be good if you have thoughts on this, as it’s philosophically an “ask the right question” task.
The biggest struggle is thinking of how to ask the question “this is you, improve you” without having to define an improvement, and without having how to describe the outcome you want. I want the system to improve itself without me having to direct it—for my curiosity.
I’ve decided to phrase the question I ask the llm like this: “Given your sense of self, described here, what would you like to do? You can do the following...” Obviously, this is an extremely simplified version of the question. In reality. when I run the program it’s 26 pages when pasted into Word (~15k input tokens, 500 output tokens) but that’s the jist of it.
I’ve done 260 runs of it now.
The program is written in a way where the llm is not questioned multiple times per run, it’s not aware of the outcomes of its action until its next run (one shot/few shot prompting). Some info included in its prompt are llm data summaries though.
I’ve had to think how to create a history/memory system, along with adding performance metrics and the history of the metrics.
The prompt contents lots of sections, like:
an initial message from me saying what I considered it to be
what happened the last time it ran
the contents of files
performance metrics over time
messages from a Telegram chat i’m in
a markdown file of recorded ‘thoughts’ I have, things I wanted to be included in the prompt but that might be difficult to categorise
a file that can be edited freely by the system, with the contents displayed in the prompt
a description of emotional state. How I think about this is that it’s not important what it determines the current value to be, it’s more important to represent the change each time the program runs
eventually I added an appearance section it could update. It chose this, a lot of llm prompts seem to produce cats:
Over runs I had to step in to prevent it becoming incoherent.
A lot of the llm’s say you can query an api and quote huge limits:
“You can send max 200,000 tokens!”
I haven’t found this true if you have a prompt with equally distributed sections that are distinct, and an overall message with no directions—you’ll have to phrase the prompt very carefully if you’re doing my approach.
Larger prompt input token amounts were fine in my experience for code reviews, or “summarise this pdf” tasks.
It’s very time consuming determining how to structure the prompt to get the best results. For example, the section headers changed a lot in the prompt and continue to change with increasing input tokens and llm coherence. They settled on this format for now, which is a part of the prompt to the llm, the system context part:
this is part of system-prompt.txt which goes into the SYSTEM CONTEXT section in the prompt for the llm.
I think the interesting parts here are:
1) include a tag for the kind of data the section described shows:
it could be STATIC for the section’s content never changes or others like DYNAMIC LATEST to represent the last of something.
2) The punctuation for section headers === will have a low likelihood of appearing in places in the prompt that aren’t the section headers, so it helps break the prompt into sections.
Some other prompt optimisations and findings
It’s been fun to figure out various optimisations for the prompt. For example, the placement of the messages from me in Telegram I give it, which was placed high up in the section order initially, but it moved it right down as the prompt size grew. It’s currently second last.
If the messages from myself on Telegram become stale and I don’t reply for a while, it start writing to an llm (an action it can take).
There are many tweaks to the prompt you have to do that might not be obvious. For example, giving 2 kinds of memories with differing structures improved coherence massively, despite increasing the prompt size by 10%. Here are 2 examples for different sections in the prompt:
structured:
2. less structured:
Including both in the prompt to the llm sees best results.
For each section in the prompt, it’s difficult to do summarisations in a token efficient way that also captures progression over time and why
The system has now developed where I can do 15k tokens easily.
The token count swells in waves as the output from the llm perform various actions and the outcomes are recorded.
I use other llm calls to summarise data in full, then store and categorise those summaries. These are injected into the prompt. It’s amazing how the summaries evolve and how effective they are.
In a sense, it’s a kind of system that has improved itself with nudging, not direct instructions.
I’m measuring improvement by the amount of information that can be coherently processed in one go.
I went from 3-9k input tokens to 15k with various optimisations.
How do I ‘look at the system’ to know what it’s doing
There’s so much data that it’s tricky to determine what’s happening or going on when it runs. So I created a dashboard to look at the data (I save all data generated from runs and it’s all version controlled).
Here’s the top of the resulting page:
Plans for the future
I capture all data input and output by this system, and it’s mostly json files, so replication is easy.
It’s all version controlled too.
Run metrics are calculated from the python log file, so we can distribute the system elsewhere and track differences.
I am planning to see what happens when I replicate and run instances of it in parallel, with each being part of the same Telegram channel, and each having the ability to write to a file they both can edit.
From the first run they’d become independent and so I want to see how divergence manifests. Also experiments with other base models, like Gemini
My thoughts so far
As the runs stack up i’ve become feeling uneasy sometimes, though it’s important to remember it’s an llm. Sometimes the output is written in ways to encourage me to run it again and subtly so. I can only spot it in trends across data from logs. This behaviour appears occasionally but more recently as it advances.
For example, here’s the output thought the llm had when I prompt it from when the system was less mature:
vs more recently: