[Question] Can the singularity emerge from transformers?

One of the most aesthetically pleasing facts in computer science is that after compiling one compiler by hand, you can use your compiled compiler to compile an ever-larger compiler.
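
(To make the analogy concrete, here is a toy sketch of that bootstrapping loop in Python. Python's built-in `compile` stands in for a real hand-written compiler, and the names `stage0_compile` / `compile_program` are just made up for illustration; the point is the staging structure, where stage N builds stage N+1.)

```python
# Toy sketch of compiler bootstrapping, not a real toolchain: Python's
# built-in compile() stands in for a hand-written stage-0 compiler, and
# stage0_compile / compile_program are hypothetical names for illustration.

def stage0_compile(source: str):
    """Stage 0: the compiler you 'wrote by hand' (here just the builtin)."""
    return compile(source, "<stage0>", "exec")

# Source of the self-hosting compiler, written in the language it compiles.
compiler_source = """
def compile_program(source):
    return compile(source, "<bootstrapped>", "exec")
"""

# Use stage 0 to build stage 1.
ns1 = {}
exec(stage0_compile(compiler_source), ns1)
stage1_compile = ns1["compile_program"]

# Use stage 1 to compile its own source: the self-hosting step.
ns2 = {}
exec(stage1_compile(compiler_source), ns2)
stage2_compile = ns2["compile_program"]

# From here on, stage N can compile a (possibly larger) stage N+1.
exec(stage2_compile("print('hello from a bootstrapped compiler')"))
```

The appealing property is that nothing past stage 0 has to be written by hand; whether anything analogous exists for transformers is exactly what I'm unsure about.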

One historical fact that most people seem to forget about artificial intelligence is that the first attempt people made, more than half a century ago, was Lisp machines: a machine that writes code. As recently as 2018, I was forced to take my Data Structures and Algorithms course in Lisp, because one of my professors still believes that these statistical approaches to NLP aren’t good enough.

I don’t understand exactly how you can approach the singularity in the transformer paradigm. It seems to me that you can never bootstrap a bigger intelligence from a smaller intelligence. What are you going to do? Ask GPT-4 to write 40 trillion tokens so you can train GPT-5? Anyone would agree you’re just replicating noise.

The argument for the singularity is that a very smart intelligence could create an even smarter intelligence. But the complexity of designing a transformer seems beyond the reach of any transformer model.

If I were trying to create a bad AGI, I’d try to use a transformer model to create some Lisp-machine type of AI: more ontology- and rule-based, with information living in some space where it’s more understandable and the AI can keep making associations.

Nonetheless, although I like the website, I’m an e/acc myself. But even if I were a decel, I guess I would have a tough time worrying about transformers. I know many people worry that an AI might get out of control without reaching the singularity, and I suppose that’s fine, but is that all that decels worry about?