Thinking about this a bit more, do you have any insight on Tesla? I can believe that it’s outside DM and GB’s culture to run with the scaling hypothesis, but watching Karpathy’s presentations (which I think is the only public information on their AI program?) I get the sense they’re well beyond $10m/run by now. Considering that self-driving is still not there—and once upon a time I’d have expected driving to be easier than Harry Potter parodies—it suggests that language is special in some way. Information density? Rich, diff’able reward signal?
Self-driving is very unforgiving of mistakes. Text generation, on the other hand, doesn’t have similar failure conditions, and bad content can easily be fixed.
Tesla publishes nothing and I only know a little from Karpathy’s occasional talks, which are as much about PR (to keep Tesla owners happy and investing in FSD, presumably) & recruiting as anything else. But their approach seems heavily focused on supervised learning in CNNs and active learning using their fleet to collect new images, and to have nothing to do with AGI plans. They don’t seem to even be using DRL much. It is extremely unlikely that Tesla is going to be relevant to AGI or progress in the field in general given their secrecy and domain-specific work. (I’m not sure how well they’re doing even at self-driving cars—I keep reading about people dying when their Tesla runs into a stationary object on a highway in the middle of the day, which you’d think they’d’ve solved by now...)
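(For anyone unfamiliar with the jargon: “active learning using their fleet” means roughly the loop sketched below. This is a generic illustration of fleet active learning, not anything Tesla has published; every name in it is invented.)

```python
# Generic fleet active-learning loop (illustrative sketch only).

def onboard_trigger(model, frame, threshold=0.5):
    # Run the deployed CNN on a camera frame; if the model is unsure,
    # flag the frame for upload back to the training cluster.
    prediction = model.predict(frame)
    return prediction.confidence < threshold

def fleet_active_learning(fleet_frames, model, label_fn, train_fn):
    # 1. Cars in the fleet flag rare/hard cases the current model fumbles.
    hard_cases = [f for f in fleet_frames if onboard_trigger(model, f)]
    # 2. Humans label the flagged frames.
    labeled = [(f, label_fn(f)) for f in hard_cases]
    # 3. Retrain on the enlarged dataset and ship the update to the fleet.
    return train_fn(model, labeled)
```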
I’m pretty sure I remember hearing they use unsupervised learning to form their 3D model of their local environment, and that’s the most important part, no?
Curious if you have updated on this at all, given AI Day announcements?
They still running into stationary objects? The hardware is cool, sure, but unclear how much good it’s doing them...
I believe that is referring to the baseline driver assistance system, and not the advanced “full self driving” one (that has to be paid for separately). Though it’s hard to tell that level of detail from a mainstream media report.
hey man wanna watch this language model drive my car
I just realized with a start that this is _absolutely_ going to happen. We are going to see, in the not-too-distant future, a GPT-x (or similar) ported to a Tesla and driving it.
It frustrates me that there are not enough people IRL I can excitedly talk about how big of a deal this is.
Can you explain why GPT-x would be well-suited to that modality?
Presumably, because with a big-enough X, we can generate text descriptions of scenes from cameras and feed them in to get driving output more easily than the seemingly slow process of directly training a self-driving system that is safe. And if GPT-X is effectively magic, that’s enough. (A toy sketch of the imagined loop follows below.)
I’m not sure I buy it, though. I think that once people agree that scaling just works, we’ll end up scaling the NNs used for self driving instead, and just feed them much more training data.
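To make the hypothetical above concrete, here is a minimal sketch of the loop being imagined. Every component name (captioner, gpt_x, controller) is invented for illustration; nothing like this exists as a real API.

```python
# Hypothetical GPT-X-drives-a-car loop (pure speculation, toy code).

def drive_step(camera_frame, captioner, gpt_x, controller):
    # 1. Turn the camera frame into a natural-language scene description.
    scene = captioner.describe(camera_frame)
    # e.g. "Two-lane highway, car 40 m ahead braking, exit ramp on the right."

    # 2. Ask the language model what a good driver would do, as a completion.
    prompt = f"Scene: {scene}\nA safe driver would now:"
    action_text = gpt_x.complete(prompt, max_tokens=20)

    # 3. Map the free-text answer back onto actual controls.
    controller.apply(parse_action(action_text))

def parse_action(text):
    # Toy parser; a real system would need something vastly more robust.
    if "brake" in text or "slow" in text:
        return {"throttle": 0.0, "brake": 0.3, "steer": 0.0}
    return {"throttle": 0.1, "brake": 0.0, "steer": 0.0}
```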
There might be some architectures that are more scalable than others. As far as I understand, the present models for self-driving mostly contain a lot of hardcoded elements. That might make them more complicated to scale.
Agreed, but I suspect that replacing those hard-coded elements will get easier over time as well.
Andrej Karpathy talks about exactly that in a recent presentation: https://youtu.be/hx7BXih7zx8?t=1118
My hypothesis: Language models work by being huge. Tesla can’t use huge models because they are limited by the size of the computers on their cars. They could make bigger computers, but then that would cost too much per car and drain the battery too much (e.g. a 10x bigger computer would cut dozens of miles off the range and also add $9,000 to the car price, at least.)
[EDIT: oops, I thought you were talking about the direct power consumption of the computation, not the extra hardware weight. My bad.]
It’s not about the power consumption.
The air conditioner in your car uses 3 kW, and GPT-3 takes 0.4 kWh for 100 pages of output—thus a dedicated computer on AC power could produce about 750 pages per hour, going substantially faster than AI Dungeon (literally and metaphorically). So a model as large as GPT-3 could run on the electricity of a car.
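As a sanity check on that arithmetic (the 3 kW and 0.4 kWh figures are the ones from the comment above):

```python
# Back-of-the-envelope: pages of GPT-3 output per hour on AC-level power.
ac_power_kw = 3.0        # air-conditioner draw, per the comment above
kwh_per_100_pages = 0.4  # claimed GPT-3 inference energy for 100 pages

pages_per_kwh = 100 / kwh_per_100_pages       # 250 pages per kWh
pages_per_hour = ac_power_kw * pages_per_kwh  # 3 * 250 = 750
print(f"{pages_per_hour:.0f} pages/hour")     # -> 750 pages/hour
```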
The hardware would be more expensive, of course. But that’s different.
Huh, thanks—I hadn’t run the numbers myself, so this is a good wake-up call for me. I was going off what Elon said. (He said multiple times that power efficiency was an important design constraint on their hardware because otherwise it would reduce the range of the car too much.) So now I’m just confused. Maybe Elon had the hardware weight in mind, but still...
Maybe the real problem is just that it would add too much to the price of the car?
Yes. GPUs/ASICs in a car will have to sit idle almost all the time, so the cost of running a big model on them will be much higher than in the cloud.
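A rough illustration of the utilization point; the dollar figures and hours below are invented for the example, but the shape of the argument holds for any values:

```python
# Toy amortization: cost per hour of actual use, in-car vs. cloud.
# All numbers are made up for illustration.
hardware_cost = 5000.0                # hypothetical in-car accelerator ($)
lifetime_years = 5
hours_per_year = 365 * 24

car_hours = 400 * lifetime_years                     # ~400 driving h/year
cloud_hours = 0.8 * hours_per_year * lifetime_years  # ~80% utilization

print(f"car:   ${hardware_cost / car_hours:.2f}/hour of use")    # ~$2.50
print(f"cloud: ${hardware_cost / cloud_hours:.2f}/hour of use")  # ~$0.14
# The same chip is ~17x more expensive per useful hour sitting in a car.
```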
Re hardware limit: flagging the implicit assumption here that network speeds are spotty/unreliable enough that you can’t or are unwilling to safely do hybrid on-device/cloud processing for the important parts of self-driving cars.
(FWIW I think the assumption is probably correct).
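(To spell out what “hybrid” would look like: something like the fallback pattern below, which is only acceptable if the on-device path alone is already safe, hence why the assumption matters. All components here are hypothetical.)

```python
import concurrent.futures

DEADLINE_S = 0.05  # e.g. a 50 ms budget per control step

def plan_step(frame, cloud_model, onboard_model, executor):
    # Illustrative hybrid inference: prefer the big cloud model, but fall
    # back to the small on-device model if the network misses the deadline.
    future = executor.submit(cloud_model.infer, frame)
    try:
        return future.result(timeout=DEADLINE_S)  # cloud answer, if in time
    except concurrent.futures.TimeoutError:
        # Spotty network: the car must still act, so decide locally.
        return onboard_model.infer(frame)
```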