While of course this is easy to rationalize post hoc, I don’t think falling user count of ChatGPT is a particularly useful signal. There is a possible world where it is useful; something like “all of the value from LLMs will come from people entering text into ChatGPT”. In that world, users giving up shows that there isn’t much value.
In this world, I believe most of the value is (currently) gated behind non-trivial amounts of software scaffolding, which will take man-years of development time to build. Things like UI paradigms for coding assistants, experimental frameworks and research for medical or legal AI, and integrations with existing systems.
There are supposedly north of 100 AI startups in the current Y Combinator batch; the fraction of those that turn into unicorns would be my proposal for a robust metric to pay attention to. Even if their hit rate is merely par for startups, that’s still a big deal, since the number of startups founded just spiked. But if the AI hype is real, more of these than normal will be huge.
Another similar proxy would be VC investment dollars; if that falls off a cliff you could tell a story that even the dumb money isn’t convinced anymore.
I agree with that. Perhaps those who’ve dropped off were casual users and have become bored. But there are other complaints. The continued existence of confabulation seems more troublesome. OTOH, I can imagine that coding assistance will prove viable. As I said, the situation is quite volatile.
Some other possible explanations for why ChatGPT usage has decreased:
The quality of the product has declined over time
People are using its competitors instead
There’s a lot one could say about this claim:
Recall that the numbers here are substantially fake. The best they can tell you is roughly “this is very big” or “this is very small”; if you want to go much beyond that, you are reading sheep entrails. They come not from OpenAI but from web-traffic measurements. Such measurements are notoriously both noisy and highly biased, and the biases change over time; so, unsurprisingly, at the time, OAers were saying the estimates had overstated actual users by something like 100%.
The numbers have lots of ways to be misleading. For example, because Chinese DL was, and still is, so inferior, there was a whole cottage industry of Chinese companies pirating accounts, and black markets in credentials. The same applied to all the Third World or embargoed or otherwise difficult countries OA has denied access to, whether for paying accounts or just outright. Then you have people abusing it, or sexting with it, getting banned, and figuring out how to create new accounts, or moving on.
Just a huge amount of whac-a-mole going on. This obviously causes issues for interpreting any user metrics: a huge decrease in user count might actually reflect a huge increase in users, and vice-versa, depending on how the security arms race is going.
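The measurement-bias point can be illustrated with a toy calculation: if third-party traffic estimates start out heavily inflated and that inflation decays, the measured series can fall even while true usage grows. All numbers below are invented purely for illustration, not real traffic data.

```python
# Toy illustration: a shrinking overestimation bias makes growing usage
# look like a decline. Every number here is made up.
true_users = [100 + 10 * m for m in range(6)]   # true usage, steadily growing
bias = [2.0 - 0.2 * m for m in range(6)]        # overcount factor, shrinking
measured = [u * b for u, b in zip(true_users, bias)]

# Measured "traffic" trends down even though true_users trends up.
print([round(x, 1) for x in measured])  # [200.0, 198.0, 192.0, 182.0, 168.0, 150.0]
```

The point of the sketch is only that the sign of a measured trend need not match the sign of the real one when the measurement bias itself is drifting.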
Summer vacation. If you look at the graph, you’ll notice a remarkable correlation with the Western academic calendar… Someone confidently proclaiming that ChatGPT use is crashing before seeing the September–November 2023 numbers is giving a hostage to fortune.
EDIT: as of 1 October, SimilarWeb and other sources are reporting increasing traffic.
Many, many alternatives spinning up, many of which are using OA as the backend, particularly after the large price drops. These would count in a naive web-traffic approach as OA ‘losing users’, rather than gaining them.
More broadly, if they are going to true competitors like Claude-2 and do not count as OA users by any definition, that’s still damaging to Gioia’s thesis that ‘generative AI is useless’: maybe it’s not great for OA, but it shows that users are getting value out of generative AI and have simply found a better way to get it.
Personally, I would say that a simple test of OA users/activity would be to look at how they act. OA is presumably getting regular large shipments of GPUs installed into datacenters as fast as MS money can buy them; if OA usage is flat for several months (never mind crashing!), then they should be ‘enjoying’ a glut of GPUs by now and acting accordingly. Does OA look like it has an embarrassment of GPUs? Or does it look like it is struggling to add capacity to keep up with constant user growth, holding back major improvements because it can’t afford them, and optimizing models (even to the detriment of quality) to get more out of its existing GPUs?
Thanks for this. Very useful.
Confabulation is a dealbreaker for some use-cases (e.g. customer support), and potentially tolerable for others (e.g. generating code when tests / ground-truth is available). I think it’s essentially down to whether you care about best-case performance (discarding bad responses) or worst-case performance.
But agreed, a lot of value is dependent on solving that problem.
As something of an aside, I think confabulation is in some ways the default mode of human language. We make stuff up all the time. But we have to coordinate with others too, and that places constraints on what we say. Those constraints can be so binding that we’ve come to think of this socially constrained discourse as ‘ground truth’, free of the confabulation impulse. But that’s not quite so.