AFAIK in-context learning functions pretty similarly to fine-tuning
As per “Transformers learn in-context by gradient descent”, which Gwern also mentions in the comment that @quetzal_rainbow links here.
As per “Transformers learn in-context by gradient descent”, which Gwern also mentions in the comment that @quetzal_rainbow links here.