That sounds ambitious and great, thanks for posting. What's the budget estimate for the fine-tuning part?
Training this model would cost from about 2 times more (on purely 1-to-1 dialogue data) to roughly 10-15 times more (on chat-room and forum data, where messages from the most active users tend to be heavily interleaved) than training the current LLMs.
For scale, the current Llama 2 was pretrained like this, per "Llama 2: Open Foundation and Fine-Tuned Chat Models" (AI at Meta, July 2023, https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/):
"Pretraining utilized a cumulative 3.3M GPU hours of computation on hardware of type A100-80GB."
An A100 costs about $1 per hour (see https://vast.ai/pricing), so 3.3M GPU-hours already comes to roughly $3.3M for a baseline run, and the cost of this model would land somewhere in the $3.3M-$33M range? That seems affordable for Google, Meta, etc., but not for a grant capped at $100K.
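As a sanity check on those numbers, here is a small back-of-envelope calculator. The ~$1 per A100-hour price and the 2-15x data multiplier are just the assumptions quoted above, not measured figures.

```python
# Back-of-envelope training-cost estimate, assuming:
#  - ~3.3M A100-80GB GPU-hours for a Llama-2-scale pretraining run (from the paper)
#  - ~$1 per A100-hour on a spot marketplace (rough vast.ai-style price)
#  - a 2x-15x multiplier for the heavily interleaved chat-room/forum dialogue data

BASELINE_GPU_HOURS = 3.3e6   # cumulative A100-80GB hours reported for Llama 2 pretraining
PRICE_PER_GPU_HOUR = 1.0     # USD, rough spot price

baseline_cost = BASELINE_GPU_HOURS * PRICE_PER_GPU_HOUR

for multiplier in (1, 2, 10, 15):
    print(f"{multiplier:>2}x baseline: ${multiplier * baseline_cost / 1e6:,.1f}M")
# 1x -> $3.3M, 2x -> $6.6M, 10x -> $33.0M, 15x -> $49.5M
```

Even the 1x baseline is more than an order of magnitude above a $100K grant.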
So perhaps update this project to fine-tune existing models instead. For classification alone, some BERT-like model, such as DeBERTa or similar, would probably do.
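To illustrate what that lighter-weight option could look like, here is a minimal sketch of fine-tuning a DeBERTa encoder for classification with Hugging Face Transformers. The dataset ("imdb" as a stand-in), label count, and hyperparameters are placeholder assumptions, not part of the proposal.

```python
# Minimal sketch: fine-tune a DeBERTa-style encoder for text classification.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "microsoft/deberta-v3-base"  # any BERT-like encoder works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Placeholder binary-classification dataset with "text" and "label" columns.
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="deberta-finetune",
    per_device_train_batch_size=16,
    num_train_epochs=1,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,
)
trainer.train()
```

A run like this fits on a single consumer or cloud GPU for a few dollars, which is far more compatible with a $100K budget than training from scratch.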