I think the current definition does not exclude this. I am talking about the study of agentic entities and their behaviors, and making a mistake is part of that. Something interesting would be to understand whether all simulacra make the same mistake, or whether only some specific simulacra make it, and what in the context is influencing it.
It seems weird to think of it as “the simulacrum making a mistake” in many cases where the model makes a prediction error.
Like suppose I prompt the model with:
[user@computer ~]$ python
Python 3.11.5 (main, Sep 2 2023, 14:16:33) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import random
>>> x = random.random()
>>> y = random.random()
>>> x
0.9818489460280343
>>> y
0.7500874791464012
>>> x + y
And suppose the model gets the wrong answer. Is this the Python simulacrum making a mistake?
(edit: this would presumably work better with a base model, but even non-base models can be prompted to act much more like base models in many cases.)
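For concreteness, here is a minimal sketch of how you might run this test, assuming the openai v1 Python client and a completion-style endpoint; the model name below is a placeholder, and any base or completion model would do:

# Minimal sketch: feed the REPL transcript above to a completion-style model
# and check whether its continuation matches the true sum.
# Assumes the openai v1 Python client; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

prompt = """[user@computer ~]$ python
Python 3.11.5 (main, Sep 2 2023, 14:16:33) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import random
>>> x = random.random()
>>> y = random.random()
>>> x
0.9818489460280343
>>> y
0.7500874791464012
>>> x + y
"""

completion = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # placeholder: any base/completion model
    prompt=prompt,
    max_tokens=25,
    temperature=0,
)

# Compare the model's continuation against the actual arithmetic.
predicted = completion.choices[0].text.strip().split("\n")[0]
true_sum = 0.9818489460280343 + 0.7500874791464012
print(f"model continuation: {predicted}")
print(f"true sum:           {true_sum}")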
After your edit, I think I see the confusion now. I agree that studying the predictions of Oracles and Tools is interesting, but it is out of the scope of LLM Psychology. I chose to narrow my approach to studying the behaviors of agentic entities, as I think that is where the most interesting questions arise. Maybe I should clarify this in the post.
Note that I’ve chosen to narrow down my approach to LLM psychology to agentic entities, mainly because the scary or interesting things to study with a psychological approach are either the behaviors of those entities or the capabilities they are able to use.
I added this to the Definition. Does it resolve your concerns on this point?
The thing is that, when you ask this to ChatGPT, it is still the simulacrum ChatGPT that is going to answer, not an oracle prediction (like you can see in base models). If you want to know the capability of the underlying simulator with chat models, you need to sample enough simulacra to be sure that the mistakes come from the simulator's lack of capability and not from the simulacra's preferences (or modes, as Janus calls them).

For math, it is often not important to check different simulacra, because each simulacrum tends to share the same math ability (unless you prompt in some weirder languages; @Ethan Edwards might be able to jump in here). But for other capabilities (like biology or cooking), changing the simulacrum you interact with does have a big impact on the model's performance. You can see in GPT-4's technical report that language impacts performance a lot, and using another language is one of the ways to modulate the simulacrum you are interacting with. I'll showcase in the next batch of posts how you can take control a bit more precisely. Tell me if you need more details.
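To make the sampling idea concrete, here is a minimal sketch, assuming the openai v1 chat API; the model name and the persona framings are placeholders introduced for illustration, not something from the post:

# Minimal sketch: ask the same question through several different simulacra
# (here, crude persona framings via the system prompt) and compare answers.
# If every simulacrum fails, the limitation more plausibly lies in the simulator;
# if only some fail, it looks more like a simulacrum-level preference or mode.
# Model name and personas are placeholders.
from openai import OpenAI

client = OpenAI()

question = "Roughly how long should a 2 kg chicken roast at 200 °C?"

personas = [
    "You are ChatGPT, a helpful assistant.",
    "You are a grumpy but experienced French chef.",
    "Tu es un chef cuisinier français ; réponds en français.",
    "You are a food-science textbook, answering tersely and technically.",
]

for system_prompt in personas:
    reply = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        temperature=0,
    )
    print(f"--- {system_prompt}\n{reply.choices[0].message.content}\n")

Comparing the answers across framings (including the other-language one) is the simplest version of what I mean by sampling simulacra before blaming the simulator.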