It would be quite easy to automatically generate all of the math and logic you could ever want for these models. Far more than you could possibly ever want to train it on (you wouldn’t want to make it a math-only bot, probably). I could easily program a computer to come up with effectively infinite correct math problems. There are quintillions of 64-bit addition problems alone… (actually an immense underestimate: there are 18.4 quintillion 64-bit numbers, so the number of addition pairs is vastly larger). Subtraction, multiplication, division, algebra, trig, calculus, statistics, etc.; AND, OR, NOT, XOR, NAND, NOR, syllogisms, first-order logic, second-order, etc.
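To make that concrete, here is a minimal sketch of the kind of generator I mean (the function names and output formats are illustrative, not a proposed training format):

```python
import random

def addition_problem(bits=64):
    """One correct addition problem over random `bits`-bit integers."""
    a = random.getrandbits(bits)
    b = random.getrandbits(bits)
    return f"{a} + {b} = {a + b}"

def xor_problem(bits=8):
    """One correct bitwise XOR example, shown in binary."""
    a = random.getrandbits(bits)
    b = random.getrandbits(bits)
    return f"{a:0{bits}b} XOR {b:0{bits}b} = {a ^ b:0{bits}b}"

# Effectively unlimited correct training text, generated on demand.
for _ in range(3):
    print(addition_problem())
    print(xor_problem())
```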
Multi-modal data would be helpful too, such as music videos: movements, sounds, sung words, subtitles, etc., and then also other performances of the same song by the same and different people, marked in some way as being the same song. This is the sort of thing that would make it start to understand how ideas work.
Want it to understand what the word ‘falling’ really means? Mark a bunch of videos of things falling with ‘falling’, expound on ‘falling’ in math, show its relations in logic, and use it in many sentences. Even a small number of items will allow it to start bootstrapping meaning (humans need fewer than 10,000 words for things relating to the world, and can largely start the process in a new language with 1,000). What colors look like, what vowels, consonants, etc. sound like, what objects people use in everyday life, basic physics, etc., all easy and with a ton of data available.
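As a sketch, a single labeled item could bundle those modalities together; every field name and path here is hypothetical:

```python
# A hypothetical multi-modal training record for the concept ‘falling’:
# the same idea expressed as video, audio, text, math, and logic.
falling_record = {
    "label": "falling",
    "video": "clips/apple_drop_0001.mp4",  # illustrative path, not a real dataset
    "audio": "clips/apple_drop_0001.wav",
    "caption": "An apple falls from the table to the floor.",
    "math": "y(t) = y0 - (1/2) * 9.8 * t**2",  # free fall near Earth's surface
    "logic": "forall x: unsupported(x) & has_mass(x) -> falls(x)",
}
```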
This might be a very large endeavor... but if it isn’t already, it will soon be much cheaper than the training cost. (And by these scaling laws, it would make the overall cost much less for much higher performance.) And yes, I think this will help it even if all it ever does is generate text completions. I am a firm believer that grounding is at least one of the absolutely necessary things for AI to become ‘Intelligence’.
Side note: I don’t really believe in the AI hype machine. Definitely not for the near future, at least. We haven’t even reached what people were claiming about GPT-3 yet.
The Pile includes 7GB of math problems generated by DeepMind, basically as you describe. I don’t believe the models trained on it can do any of them, but my testing wasn’t properly done.
I am unsurprised it includes them, since it is an obvious thing to do. 7GB sounds like a crazy amount of math problems... but it is only a tiny amount compared to what could be generated. Chinchilla was all about how these models need more data, and this would be an easy way to increase it (correctly).
That they don’t understand math after 7GB of examples is obviously related to the current extremely primitive state of logic in all such models. The big question is whether they would still fail to understand math and logic at 100x that amount. If they could learn basic abstract reasoning, that would massively improve their performance at all tasks. If they still failed, then since math and logic are literally just languages that express an understanding of the easiest (context-independent) relations between things, that failure would suggest modern techniques are wholly unsuited to real AI. I suspect that with 700GB of math it wouldn’t fail so hard, but who knows?
(GPT-J even fails at things like ‘2 + 2 =’ on half the prompts I try, often giving strange results like ‘0’ or ‘5’ even with a temperature of 0. Often that is because it doesn’t even realize it is doing math, assuming that ‘2 + 2 =’ is somehow a programming thing even though the similarity is entirely superficial. Even when it knows it is doing math, it will often get the answer right at first, and then switch to ‘2 + 2 = 0’.)
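For anyone who wants to try this, here is a sketch of that kind of test using the public GPT-J checkpoint via HuggingFace transformers (one way to run it, not necessarily how the test above was done); ‘temperature of 0’ corresponds to greedy decoding here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The public GPT-J checkpoint (a large download; a GPU helps).
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

inputs = tokenizer("2 + 2 =", return_tensors="pt")

# do_sample=False is greedy decoding, the equivalent of temperature 0.
outputs = model.generate(**inputs, max_new_tokens=4, do_sample=False)
print(tokenizer.decode(outputs[0]))
```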
Human beings cannot do most math without pencil and paper and a lot of pondering, whereas there are a number of papers showing specialized transformers can do math and code at a more sophisticated level than I would have expected before seeing the results.
I literally noted that GPT-J, which was trained on said 7GB of math (assuming that number is right), usually fails at ‘2 + 2 =’. People can do several-digit addition without pencil and paper: ‘763 + 119 =’ probably doesn’t require pencil and paper to get ‘882’. We do require them for many-step algorithms, but this is not that. ‘Dumb’ computers do 64-bit addition trivially (along with algebra, calculus, etc.). I haven’t seen the specialized math models, but I’m dumbfounded that general models don’t do math far better.
I haven’t tried coding using ‘AI’ tools, so have no real opinion on how well it compares to basic autocomplete.
The basic problem of arithmetic is this: you can’t be informal in math, and every single step needs to be checked. Language, while complicated, can allow a degree of informality, as long as you can communicate well. Math does not allow this.
You kind of can be informal though?
Suppose 5x − 2 = 3b + 9; thus
x = (3b + 11)/5 or b = (5x − 11)/3.
If b = 2, then
x = 17/5.
If x = 2, then
b = −1/3.
This is obviously correct math, but formally you would do each step separately. The steps don’t necessarily all need to be checked either, because the problem is easy enough that you can just check the result.
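For instance, the whole derivation above can be checked mechanically in a few lines (a quick sketch using sympy, assuming it is installed):

```python
from sympy import symbols, Eq, solve

x, b = symbols("x b")
equation = Eq(5 * x - 2, 3 * b + 9)

# Solve for each variable in terms of the other.
print(solve(equation, x))  # [3*b/5 + 11/5], i.e. x = (3b + 11)/5
print(solve(equation, b))  # [5*x/3 - 11/3], i.e. b = (5x - 11)/3

# Check the two special cases.
print(solve(equation.subs(b, 2), x))  # [17/5]
print(solve(equation.subs(x, 2), b))  # [-1/3]
```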
Math is a language, just a rigorous one, where it is simple to be right or wrong. It is a simple way to abstract away the things that don’t matter and talk about the underlying relations. Math is a subset of language with easier relations. For something with pure general intelligence, math is probably much easier than a normal language.
I hold that we are storytelling intelligences [and consciousness is us telling ourselves our own story as we compose it] that have been generalized through a deep understanding of the patterns in stories, which is why normal languages are easier for us: they were made to tell stories. (I also hold that your story of math is technically incorrect.)