PaLM-540B: a stunningly powerful question-answering language model.
Great Palm: a hypothetical language model that combines the powers of GPT-3 and PaLM-540B.
I would’ve thought that PaLM was better at text generation than GPT-3 by default. They’re both pretrained on internet next-word prediction, and PaLM is bigger with more data. What makes you think GPT-3 is better at text generation?
I’m puzzled by this as well. For a moment I thought maybe PaLM used an encoder-decoder architecture, but no, it uses next-word prediction just like GPT-3. I’m not sure what GPT-3 has that PaLM lacks. A model with the parameter count of PaLM and the training dataset size of Chinchilla would be a better hypothetical for “Great Palm”.
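A rough back-of-the-envelope check on that suggestion, assuming the roughly 20 tokens-per-parameter compute-optimal ratio from the Chinchilla paper (that ratio is an outside assumption, not something stated in this thread):

$$540\text{B params} \times \sim\!20\ \tfrac{\text{tokens}}{\text{param}} \approx 10.8\text{T tokens},$$

which is far more data than the roughly 780B tokens PaLM was actually trained on.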