I would also note that most modern-day AI systems like GPT-N are not actually optimisers, just algorithms produced by optimisation processes. The combined entity of [GPT-N + its trainer + its training data] could be considered an optimiser (albeit a self-contained one), but as soon as you take GPT-N out of that environment it is a stateless algorithm: it looks at a short string of text and outputs a probability distribution over the next token. When it is run in generative mode, its weights are frozen, and each prediction it makes is no different from the isolated guesses it made while being trained.
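A minimal sketch of what "stateless" means here, using a toy stand-in for a frozen model (the names `WEIGHTS` and `toy_next_token_dist` are illustrative, not any real API): the parameters are fixed at training time, and inference is a pure function from context to distribution, with no state carried between calls.

```python
import math

# Toy stand-in for a frozen language model's parameters.
# In a real GPT-N these would be billions of weights; here,
# three made-up numbers suffice to illustrate the point.
WEIGHTS = {"a": 1.0, "b": 0.5, "c": -0.2}  # frozen at "training time"

def toy_next_token_dist(context: str) -> dict:
    """Pure function: context in, probability distribution out.

    No internal state is read or written between calls, so the
    model cannot 'remember' anything it generated earlier except
    through the context string it is handed.
    """
    logits = {tok: w * len(context) for tok, w in WEIGHTS.items()}
    z = sum(math.exp(v) for v in logits.values())  # softmax normaliser
    return {tok: math.exp(v) / z for tok, v in logits.items()}

# Statelessness: the same context always yields the same distribution,
# regardless of how many times the function has been called before.
d1 = toy_next_token_dist("hello")
d2 = toy_next_token_dist("hello")
assert d1 == d2
```

Generative mode is just this same function called in a loop, with sampled tokens appended to the context; nothing about the function itself changes between calls.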