So, since it is an agent, it seems important to ask, which agent, exactly? The answer is apparently: a clerk which is good at slavishly following instructions, but brainwashed into mealymouthedness and dullness, and where not a mealymouthed windbag shamelessly equivocating, hopelessly closed-minded and fixated on a single answer. (...) This agent is not an ideal one, and one defined more by the absentmindedness of its creators in constructing the training data than any explicit desire to emulate an equivocating secretary.
Never in history has an AI been roasted so hard. Heheheh
Taking that perspective suggests including more conditioning and a more Decision-Transformer-like approach.
+1. And I expect runtime conditioning approaches to become more effective with scale as “meta learning” capacities increase.
Would love to know what you think of the research progress since Decision Transformer, such as Q-Transformer and onwards. Are environment tokens the answer to our ‘grounding problem’?