The thing you’re missing is called instruction tuning. You gather a series of prompt/response pairs and fine tune the model over that data. Do it right and you have a chatty model.
Current theme: default
Less Wrong (text)
Less Wrong (link)
The thing you’re missing is called instruction tuning. You gather a series of prompt/response pairs and fine tune the model over that data. Do it right and you have a chatty model.