Attempts at concise answers to some of these in case you wanted them:
Simulacra. I spent some time going through the posts and it’s one of those things that just never manages to click with me.
The big misconception that the simulacra post tries to correct is that, while GPT-N can simulate processes that are agentic, GPT-N is not agentic itself. Calling GPT-Ns “optimizers” in any sense of the word is the sort of basic category mistake that seems extremely silly in retrospect.
Blockchain. I guess the thing that I don’t understand here is the hype. I get that it’s basically a database that can’t be edited, and I’ve read through articles talking about the use cases, but it’s been around for a while now and doesn’t seem to have been that game-changing. Yet there are smart people who are super excited about it, and I suspect there are things I am failing to appreciate, regardless of whether their excitement is justified.
The hype is that you can send currency (and do a bunch of other cool things) to people over the internet without the consent of one or more state governments or corporations. All other hype is derived from this extremely neat property.
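If it helps to see the “database that can’t be edited” part concretely, here’s a toy sketch (my own illustration, ignoring all the consensus/mining machinery that actually makes the permissionless part work): each block commits to the hash of the previous one, so editing anything earlier breaks every later link.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents (which include the previous block's hash)."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain: list, data: str) -> None:
    """Append a new block that commits to the hash of the previous one."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "data": data})

def verify(chain: list) -> bool:
    """Re-derive every link; any edit to an earlier block breaks all later links."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
append_block(chain, "alice pays bob 5")
append_block(chain, "bob pays carol 2")
print(verify(chain))                     # True
chain[0]["data"] = "alice pays bob 500"  # tamper with history
print(verify(chain))                     # False
```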
Occam’s razor. Is it saying anything other than P(A) >= P(A & B)?
Yes, but I wouldn’t attempt to mathematize it myself. “A” and “B” are propositions. Occam’s razor says to favor hypotheses about the underlying processes producing the data you see that are less “complex”, for some definition of “complexity” that probably lives somewhere in conceptspace near Solomonoff induction, or something.
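On the P(A) >= P(A & B) reading specifically: that inequality is just the conjunction rule, and you can sanity-check it on any toy sample space. A throwaway illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space: three fair coin flips, every sequence equally likely.
outcomes = list(product("HT", repeat=3))

def prob(event):
    """Probability of an event (a predicate over outcomes) under the uniform measure."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def A(o):          # "the first flip is heads"
    return o[0] == "H"

def A_and_B(o):    # same claim plus an extra conjunct: "the last flip is heads too"
    return o[0] == "H" and o[2] == "H"

print(prob(A))        # 1/2
print(prob(A_and_B))  # 1/4 -- adding detail to a hypothesis can only keep or lower its probability
```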
Evolution. I get that at a micro-level, if something makes an organism more likely to reproduce it will in fact, err, spread the genes. And then that happens again and again and again. And since mutations are a thing, organisms basically get to try new stuff out, and the stuff that works sticks. I guess that’s probably the big idea, but I don’t know much beyond it and remember being confused when I initially skimmed through the Simple Math of Evolution sequence.
The big ideas I remember from that sequence are, beyond those (true) concepts:
Evolution primarily selects on the gene level.
Evolution works at a specific pace that can be calculated with some very simple math (toy sketch below).
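For the second point, here’s roughly the kind of “very simple math” I mean, as a toy sketch of my own rather than anything lifted from the sequence: a haploid allele with a small relative fitness advantage s, iterated one generation at a time.

```python
def next_freq(p: float, s: float) -> float:
    """Allele frequency after one generation of selection in a haploid model,
    where carriers have relative fitness (1 + s) versus 1 for everyone else."""
    return p * (1 + s) / (p * (1 + s) + (1 - p))

p, s = 0.01, 0.03   # a rare allele with a 3% fitness edge (made-up numbers)
generations = 0
while p < 0.99:
    p = next_freq(p, s)
    generations += 1

print(generations)  # roughly 300 generations for the allele to sweep to near-fixation
```

The takeaway is that even a decent fitness advantage takes on the order of hundreds of generations to spread.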
Turing machines. Off the top of my head I don’t really know what they are. Something about a roll of tape with numbers and skipping from one place to the next, and how that is somehow at the core of all computing? I wish I understood this. After all, I am a programmer. I spent a few weeks skimming through a Udacity course on the theory of computation a while ago but none of it really stuck.
It’s a simplified model of a computer that can be used (in principle) to run every computer program we have thus far identified. Systems are “Turing complete” if they can be used to simulate Turing machines, and therefore to run all of our computer programs (so far).
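Since you mentioned being a programmer: here’s a throwaway simulator for the “roll of tape” picture (my own sketch, not a formal definition). The whole machine is a tape, a read/write head, a current state, and a lookup table saying what to write, which way to move, and which state to enter next.

```python
from collections import defaultdict

def run(tape_str, transitions, state="start", blank="_", max_steps=10_000):
    """Run a Turing machine: transitions maps (state, symbol) -> (new_symbol, move, new_state)."""
    tape = defaultdict(lambda: blank, enumerate(tape_str))
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape[head]
        new_symbol, move, state = transitions[(state, symbol)]
        tape[head] = new_symbol
        head += 1 if move == "R" else -1
    lo, hi = min(tape), max(tape)
    return "".join(tape[i] for i in range(lo, hi + 1)).strip(blank)

# Example machine: walk right, flipping 0 <-> 1, and halt at the first blank.
flip_bits = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run("01011", flip_bits))  # -> 10100
```

Everything a modern computer does can, in principle, be compiled down to a (much, much larger) table like flip_bits, which is what the “core of all computing” claim is about.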
“A” and “B” are propositions. Occam’s razor says to favor hypotheses about the underlying processes producing the data you see that are less “complex”, for some definition of “complexity” that probably lives somewhere in conceptspace near Solomonoff induction, or something.
There is more than one justification of Occam’s razor. The justification in terms of P(A) >= P(A & B) has the almost unique advantage of relating directly to truth/probability. Solomonoff induction doesn’t have that property (in any obvious way), because SIs contain programs, not propositions.
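For reference, the “programs, not propositions” point is visible in the textbook way Solomonoff’s prior gets written (this is just the standard formulation, not anything specific to the discussion above): the weight on a string x is a sum over programs p whose output starts with x, penalized by program length.

```latex
% Solomonoff's universal prior, for a fixed universal prefix machine U.
% The sum ranges over programs p whose output begins with the string x.
M(x) = \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}
```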
The big misconception that the simulacra post tries to correct is that, while GPT-N can simulate processes that are agentic, GPT-N is not agentic itself. Calling GPT-Ns “optimizers” in any sense of the word is the sort of basic category mistake that seems extremely silly in retrospect.
This is a misconception itself. GPT is an agent that learns to be a good simulator, an “actor”. That doesn’t mean the actor itself is not there; that would be physically incoherent. The strength of that agency is a different question, as are the agent’s goals, which might not actually go beyond “being a good actor” (and presumably that is just what we want, as of now). But that goal probably should be there if we want the system to be robust. (How much progress RLHF makes towards that goal is a different question still.) Compare: human actors do have goals beyond “being a good actor”, e.g., being a good citizen, a good child, etc.
Fortunately, a “pure simulator” is neither physically coherent nor would it be desirable if it were physically possible. We don’t want actors in the theatre to actually just “become Nero” when playing Nero; we want them to remain “actors” in the back of their minds. This is part of acting as a profession and what good actors are paid for: being able to go in and out of roles.
I think Adam is asking about the “simulacra levels” concept, not anything about GPT language models.
Yes, thanks for clarifying. https://www.lesswrong.com/tag/simulacrum-levels