Matt Goldenberg comments on Matt Goldenberg’s Short Form Feed

Matt Goldenberg 11 Oct 2024 19:29 UTC
51 points
12
I desperately want people to stop using “I asked Claude or ChatGPT” as a stand-in for “I got an objective third party to review”.
LLMs are not objective. They are trained on the internet, which has specific sets of cultural, religious, and ideological biases, and then further trained via RL to be biased in the way that a specific for-profit entity wanted them to be.
- gwern 12 Oct 2024 1:42 UTC
  19 points
  5
  Perhaps the norm should be to use some sort of LLM-based survey service like https://news.ycombinator.com/item?id=36865625 in order to try to get a more representative population sample of LLM outputs?
  
  This seems like it could be a useful service in general: do the legwork to take base models (not tuned models), and prompt in many ways and reformulate in many ways to get the most robust distribution of outputs possible. (For example, ask an LLM to rewrite a question at various levels of detail or in various languages, or switch between logically equivalent formulations to avoid acquiescence bias; or if it needs k shots, shuffle/drop out the shots a bunch of times.)
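  A minimal sketch of the resampling idea above, in Python. Everything here is hypothetical: `build_prompt_variants`, `answer_distribution`, and the `query_model` callable are illustrative names, not the API of any real survey service, and the yes/no flipping is just one assumed way to handle logically negated phrasings.

```python
import random
from collections import Counter
from typing import Callable, Sequence

# (question text, inverted) pairs. "inverted" marks a logically negated
# phrasing whose yes/no answer must be flipped before aggregation.
Phrasing = tuple[str, bool]


def build_prompt_variants(
    phrasings: Sequence[Phrasing],
    shots: Sequence[str],
    n_variants: int,
    drop_prob: float = 0.3,
    seed: int = 0,
) -> list[Phrasing]:
    """Pair each phrasing with a shuffled, randomly thinned set of few-shot examples."""
    rng = random.Random(seed)
    variants = []
    for i in range(n_variants):
        # Cycle through phrasings so direct and negated forms stay balanced.
        question, inverted = phrasings[i % len(phrasings)]
        # Drop each shot independently, then shuffle the survivors.
        kept = [s for s in shots if rng.random() >= drop_prob]
        rng.shuffle(kept)
        variants.append(("\n\n".join(kept + [question]), inverted))
    return variants


def answer_distribution(
    variants: Sequence[Phrasing],
    query_model: Callable[[str], str],
) -> dict[str, float]:
    """Query the (assumed) model on every variant and return answer frequencies."""
    flip = {"yes": "no", "no": "yes"}
    counts: Counter = Counter()
    for prompt, inverted in variants:
        answer = query_model(prompt).strip().lower()
        if inverted:
            # Map answers to negated phrasings back onto the original question.
            answer = flip.get(answer, answer)
        counts[answer] += 1
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}
```

  With a stub such as `lambda prompt: "yes"` (a model that agrees with everything) and one direct plus one negated phrasing, the normalised distribution comes out evenly split between "yes" and "no", which is the signature of acquiescence rather than of an actual opinion.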
- Thomas Kwa 11 Oct 2024 19:37 UTC
  18 points
  6
  Disagree. If ChatGPT is not objective, most people are not objective. If we ask a random person who happens to work at a random company, they are more biased than the internet, which at least averages out the biases of many individuals.
  - Stephen Fowler 11 Oct 2024 22:52 UTC
    28 points
    11
    I’ll grant that ChatGPT displays less bias than most people on major issues, but I don’t think this is sufficient to dismiss Matt’s concern.
    My intuition is that if the bias of a few flawed sources (Claude, ChatGPT) is amplified by their widespread use, the fact that it is “less biased than the average person” matters less.
    - Matt Goldenberg 12 Oct 2024 15:44 UTC
      4 points
      2
      Yes, this is an excellent point I didn’t get across in the post above.
  - Thane Ruthenis 12 Oct 2024 13:28 UTC
    15 points
    7
    LLMs simultaneously (1) are notoriously sycophantic, i.e. biased to answer the way they think the interlocutor wants them to, and (2) have “truesight”, i.e. a literally superhuman ability to suss out the interlocutor’s character (which is to say: the details of the latent structure generating the text) based on subtle details of phrasing. While the same could be said of humans as well – most humans would be biased towards assuaging their interlocutor’s worldview, rather than creating conflict – the problem of “leading questions” rises to a whole new level with LLMs, compared to humans.
    You basically have to interpret an LLM’s answer as if a human had been asked the same question phrased in as biased a way as possible.
    - MichaelDickens 12 Oct 2024 16:55 UTC
      3 points
      2
      
      (2) have “truesight”, i.e. a literally superhuman ability to suss out the interlocutor’s character
      
      Why do you believe this?
      - Thane Ruthenis 12 Oct 2024 17:16 UTC
        6 points
        3
        See e.g. this and this, and it’s of course wholly unsurprising, since it’s literally what the base models are trained to do.
    - FractalSyn 12 Oct 2024 18:11 UTC
      1 point
      0
      I wouldn’t say that my experience with ChatGPT is in total agreement with your conclusion, yet you’re raising a good point and the distinction is helpful. I remember conversations in which the chatbot would both acknowledge and challenge my viewpoint, which I must admit is quite appreciated and not systematic in the biological realm. On the other hand, it is indeed common that pushing the chatbot to buy my arguments and adopt my stance is fairly easy.
      Somehow it’s very related to humanlike intelligence; that is, when training an LLM-based chatbot^[1] by reinforcement, the positive (rewarding) feedback comes from both confirmation of the interlocutor’s beliefs and matters like veracity, ethics, … It’s also what we humans have been experiencing.
      Why and how does it rise to a whole new level when it comes to AI? I tend to think that we must understand the technologies we are using, so it’s our responsibility to use chatbots properly and leverage their capabilities. When talking with a child, or a young student, or generally someone you know is a newcomer, we adapt our questions, arguments, and the way we process their responses. It’s not an exact science for sure, but there’s no reason to expect it to be one with chatbots.
      ^[1] “LLM-based chatbot” seems more accurate than “LLM”, as LLMs in general have not yet been trained to have a chat with you.
  - Matt Goldenberg 11 Oct 2024 21:23 UTC
    9 points
    8
    Of course a random person is biased. Some people will have more authority than others, and we’ll trust them more, and argument screens off authority.
    
    What I don’t want people to do is give ChatGPT or Claude authority. Give it to the wisest people you know, not Claude.
  - HNX 12 Oct 2024 12:31 UTC
    6 points
    5
    [1] Can’t they both be not objective? Why make it a point of one or the other? A bit of a false dichotomy, there.
    [2] There is no single “Internet”: there are specific spaces, forums, communities, blogs, you name it, that comprise it. Each has its own subjective, irrational, moderated (whether by a single individual, a team, or an overall sentiment of the community: promoting/exalting/hyping one subset of topics while ignoring others) mini/sub-culture.
    This last one, furthermore, necessarily only happens to care about its own specific niche; happily ignoring most of everything else. LessWrong used to be mostly about, well—being less wrong—back when it started out. Thus, the “rationality” philosophy. Then it has slowly shifted towards a broader, all-encompassing EA. Now it’s mostly AI.
    Compare the 3k+ results for the former against the 8k+ results for the latter.
    Every space is focused on its own topic, within whatever mini/sub-cultural norms are encouraged/rewarded or punished/denigrated by the people within it. That creates (virtually) unavoidable blind spots, as every group of people within each space only shares information about [A] its chief topic of interest, within [B] the “appropriate” sentiment for the time, while [C] contrasting itself against the enemy/out-group/non-rationalists, you name it.
    In addition to that, different groups have vastly different [I] amounts of time on their hands and [II] social, emotional, ethical, moral “charge” with regards to the importance they assign to their topic of choice; emerging from these are [III] vastly different amounts of information produced by the people within that particular space.
    When you compile the data set for your LLM, you’re not compiling a proportionately biased take on different topics. If that were the case, I’d happily agree with you. But you are clearly not. What you are compiling is a bunch of sets of averaged sentiments, each biased, blind in its own way, and leaning heavily towards one social, semantic, political, or epistemological position. Each will have its own memes, quirks, “hot takes”. Each will have massively over-represented discussions of one topic, at the expense of others. That’s the web of today.
    When you “train” your GPT on the resulting data set, then, who is to say whether it is “averaging” the biases between different groups? Can you open up any LLM to see its exact logic, reasoning, argumentation steps? Should there be any averaging going on after all, how is it going to account for the disproportionately represented takes of people who simply have too much time and/or rage to spare? What of the people who simply don’t spend much time on the web to begin with? Is your GPT going to “average in” those as well, somehow?
    What would prevent the resulting transformer from simply picking up on the likelihood of any given incoming prompt matching the overall “culture” of any single community, thus promptly completing it as if it was a part of an “average” discussion within that particular community there? Isn’t it plain wishful, if not outright naive*, to imagine the algo will do what you hope it will do—instead of what is the easiest possible thing for it to do?
    * the fact a given thought pattern is wishful/naive doesn’t make you wishful/naive; don’t take it personally, plz
  - yc 13 Oct 2024 1:34 UTC
    3 points
    0
    It probably depends less on the whole internet and more on the RLHF guidelines (I imagine the human reviewers receive a guideline based on advice from the LLM-training company’s policy, legal, and safety experts). I don’t disagree, though, that it could present a relatively more objective view on some topics than a particular individual (depending on the definition of bias).
- Shankar Sivarajan 11 Oct 2024 21:16 UTC
  4 points
  0
  Would you say the same thing of people saying they looked at the Wikipedia article?
  - Matt Goldenberg 12 Oct 2024 15:41 UTC
    7 points
    0
    Yes, if people were using Wikipedia in the way they are using the LLMs.
    
    In practice that doesn’t happen, though: people cite Wikipedia for facts but use LLMs for judgement calls.
- RamblinDash 13 Oct 2024 21:54 UTC
  3 points
  0
  I treat chatGPT as a vibes-ologist; it’s good for answering questions about like which X is most popular or what do most people think about X. I agree it’s less good for “X is true”
- weightt an 13 Oct 2024 8:24 UTC
  3 points
  0
  It’s not just biases, they are also just dumb. (Right now, nothing against 160 iq models that you have in the future). They are often unable to notice important things, or unable to spot problems, or follow up on such observations.
- Seth Herd 11 Oct 2024 19:34 UTC
  3 points
  2
  What they’re saying is I got a semi-objective answer fast.
  
  If they’d googled for the answer all the same concerns would apply. You’d need to know the biases of whoever wrote the web content they read to get an answer.
  
  I doubt the orgs got much of their own bias into the RLHF/RLAIF process. There are real cultural biases from the humans providing the RLHF feedback, and in the LLM itself from its training set and from how it interpreted its constitution.
  - Matt Goldenberg 11 Oct 2024 19:40 UTC
    7 points
    4
    “What they’re saying is I got a semi-objective answer fast.”
    Exactly. Please stop saying this. It’s not semi-objective. The trend of casually treating LLMs as an arbiter of truth leads to moral decay.
    “I doubt the orgs got much of their own bias into the RLHF/RLAIF process”
    This is obviously untrue: orgs spend lots of effort making sure their AI doesn’t say things that would give them bad press, for example.
    - Seth Herd 11 Oct 2024 20:43 UTC
      2 points
      0
      I should’ve specified: the orgs carefully train the models to refuse to say things. I don’t think they specifically train them to say things the orgs like or believe. The refusals are intentional; the bias is accidental, IMO.
      
      And every source has bias.
      
      So, do you want people to quit saying they googled for an answer? I just like them to say where they got the answer so I can judge how biased it might be.
- MondSemmel 12 Oct 2024 16:00 UTC
  2 points
  0
  Agreed, except for the small caveat of LLM answers which can be easily verified as approximately correct. E.g. answers to math problems where the solution is hard but the verification is easy; or Python scripts you’ve tested yourself and whose output looks correct; or reformatted text (like plaintext → BBCode) if it looks correct on a word diff website.
  Incidentally, are there any LLM services which can already do this kind of verification in specific domains?
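  To illustrate the "hard to solve, easy to verify" caveat above, here is a small sketch in Python that checks a claimed prime factorization without trusting whoever (or whatever) produced it. The function names are made up for illustration, and trial division is an assumption that is only practical for small numbers.

```python
from math import prod
from typing import Sequence


def is_prime(k: int) -> bool:
    """Trial-division primality check; adequate for small illustrative inputs."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def verify_factorization(n: int, claimed_factors: Sequence[int]) -> bool:
    """Accept the claim only if the factors are all prime and multiply to n."""
    return (
        len(claimed_factors) > 0
        and prod(claimed_factors) == n
        and all(is_prime(f) for f in claimed_factors)
    )
```

  Finding the factors of 8051 by hand takes some work, but checking a claimed answer of 83 and 97 is a single multiplication plus two primality checks, so a wrong or made-up answer is rejected cheaply.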
- RaunakChhatwal 12 Oct 2024 5:33 UTC
  1 point
  0
  It still signals to the subject of my question that I put in some effort before coming to them.