Well, then computing ℓnovel would just take a really long time.
So, it’s not impossible in principle if you trained the loss function as I suggested (train it by reinforcement learning, then use it to train the actual novel-generating model), but it is a totally impractical approach.
If you really wanted to teach an AI to generate good novels, you’d probably start by training an LLM to imitate existing novels through some sort of predictive loss (e.g., categorical cross-entropy on next-token prediction) to give it a good prior. Then train another LLM to predict reader reviews or dissertations written by literature grad students, using the novels they’re based on as inputs, again with a similar predictive loss. (Pretraining both LLMs on a large corpus, as with GPT, would probably help supply the necessary cultural context.) At the same time, use Mechanical Turk to get thousands of people to rate the sentiment of every review/dissertation, then train a third LLM to predict the sentiment scores of all raters (or a low-dimensional projection of all their ratings), taking the reviews/dissertations as input and using something like an MSE loss on the predicted sentiment scores. Then chain these latter two networks together to compute ℓnovel, which acts as the prune to the first network’s babble, and train to convergence.
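Here’s a minimal sketch, in PyTorch-flavored Python, of what that chained scoring step might look like. `review_model`, `sentiment_model`, and the `.generate(...)` interface are hypothetical stand-ins for the two scorer LLMs described above, not real library APIs.

```python
import torch

def l_novel(novel_tokens, review_model, sentiment_model, n_reviews=8):
    """Score a generated novel: predict reviews of it, then predict reader sentiment.

    `review_model` and `sentiment_model` are the two frozen scorer LLMs
    described above (hypothetical interfaces). Higher predicted sentiment
    means a better novel, so we negate it to get a loss.
    """
    with torch.no_grad():  # the scorers stay frozen while the generator trains
        # Sample a handful of plausible reviews/dissertations conditioned on the novel.
        reviews = [review_model.generate(novel_tokens) for _ in range(n_reviews)]
        # Predict the (low-dimensional) sentiment vector for each review.
        sentiments = torch.stack([sentiment_model(review) for review in reviews])
    # Average sentiment across sampled reviews; lower loss = better novel.
    return -sentiments.mean()
```

Note that sampling reviews with `.generate` is non-differentiable, so in practice this score would have to prune the generator’s babble via an RL-style update or a differentiable surrogate rather than plain backpropagation, which is part of why the whole scheme is so impractical.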
Honestly, though, I probably still wouldn’t trust the resulting system to produce good novels (or at least not with internally consistent plots, characterizations, and themes) if the LLMs were based on a Transformer architecture.
Interesting—why is that?
Mostly due to the limited working memory that Transformers typically use (e.g., a context window of only the most recent 512 tokens feeding into the decoder). When humans write novels, they have to keep track of plot points, character sheets, thematic arcs, etc. across tens of thousands of words. You could probably get it to work, though, if you augmented the LLM with content-addressable memory and included a positional encoding that is aware of where in the novel (percentage-wise) each token resides.
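For that last idea, here’s a minimal sketch of a percentage-aware positional encoding, assuming a sinusoidal encoding of each token’s fractional position in the full manuscript (the function name and frequency choice are mine, not anything standard):

```python
import math
import torch

def fractional_position_encoding(token_index, novel_length, dim=16):
    """Encode a token's fractional position in the whole novel as a `dim`-vector.

    This would supplement the usual per-window positional encoding: the model
    gets a signal for "I'm 33% of the way through the book", not just
    "I'm the 207th token in the current window".
    """
    frac = torch.tensor(token_index / novel_length)                 # value in [0, 1]
    freqs = 2.0 ** torch.arange(dim // 2, dtype=torch.float32)      # geometrically spaced frequencies
    angles = 2 * math.pi * freqs * frac
    return torch.cat([torch.sin(angles), torch.cos(angles)])

# e.g. token 40,000 of a 120,000-token novel:
enc = fractional_position_encoding(40_000, 120_000)                 # tensor of shape (16,)
```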