I’m beginning to think that lack of access to a transformer model will be a serious handicap for anyone engaged in any kind of intellectual activity. Access alone isn’t enough, since one also needs to know how to use it, and the people who own the transformers have the greatest advantage of all, because they can log all the output, just as search engine owners potentially know the search history of all their users.
I confess that I have never interacted directly with GPT-3; I’ve only looked over the shoulder of someone who had access. Is there some kind of guide to accessing transformers (how much it costs, the relative merits of the different models), or is it all still fundamentally about knowing the right people?
I’ve found that the more I use GitHub Copilot, the more time I spend thinking about how to write comments and function names that prompt good code recommendations.
Cost: You basically get 3 months free with GPT-3 Davinci (175B), under a usage limit that is sufficient for personal use, and then you pay as you go. Even if you use it a lot, you’re likely to pay less than $5 or $10 per month. And if you have tasks that need a lot of tokens but are not too hard (e.g. hard reading comprehension), Curie (GPT-3 6B) is often enough and is much cheaper to use!
In few-shot settings (i.e. settings in which you show the model a few examples of a task so that it reproduces the pattern), Curie is often very good, so it’s worth trying!
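To make the few-shot idea concrete, here is a minimal sketch of how such a prompt can be assembled as plain text. The `Input:`/`Output:` labels, the function name, and the translation examples are my own illustrative choices, not anything the API requires; any consistent pattern should work.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: an instruction, solved examples, then the open query."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # Leave the final "Output:" empty so the model fills it in.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical examples, chosen only to illustrate the pattern.
examples = [
    ("cheese", "fromage"),
    ("bread", "pain"),
]
prompt = build_few_shot_prompt("Translate English to French.", examples, "apple")
print(prompt)
```

The resulting string is what you would paste into the playground or send as the prompt; the model is expected to continue after the last `Output:`.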
Merits: It’s mostly a matter of the cost and inference speed you need. The biggest models are almost always better, so taking the biggest one you can afford, in terms of both speed and cost, is a good heuristic.
Use: It’s very easy to use with the new Instruct models. You just enter your prompt and the model completes it. The only parameters you need to care about are the maximum number of tokens (basically the maximum size of the completion you want) and the temperature (which affects how “creative” the answer is; the higher, the more creative).
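For a rough idea of what this looks like outside the playground, here is a sketch that builds (but does not send) a completion request using only the Python standard library. The endpoint URL, the model name `text-davinci-002`, and the `OPENAI_API_KEY` environment variable are assumptions on my part; they may be outdated, so check the current API documentation before relying on them.

```python
import json
import os
import urllib.request

# Assumed endpoint; verify against the current API documentation.
API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(prompt, max_tokens=64, temperature=0.7,
                             model="text-davinci-002", api_key=""):
    """Build an HTTP request carrying the prompt and the two key parameters."""
    payload = {
        "model": model,              # assumed Instruct model name, may no longer exist
        "prompt": prompt,
        "max_tokens": max_tokens,    # upper bound on the completion length
        "temperature": temperature,  # higher values give more varied output
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_completion_request(
    "Explain what a transformer model is in one sentence.",
    max_tokens=60,
    temperature=0.2,
    api_key=os.environ.get("OPENAI_API_KEY", ""),
)
body = json.loads(request.data)
print(body)

# Sending is left commented out so the sketch runs without a key or network access:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["text"])
```

A low temperature such as 0.2 suits factual answers; raising it toward 1.0 is the usual way to get more varied completions.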
You can also play around with open-source models that offer capabilities surprisingly comparable to OpenAI’s models.
Here is GPT-J-6B from EleutherAI, which you can use without any hassle: https://6b.eleuther.ai/
They also released a new 20B model, but I think you need to log in to use it: https://www.goose.ai/playground
To sign up for OpenAI API access: https://beta.openai.com/signup