janus comments on Mysteries of mode collapse

janus 13 Nov 2022 17:48 UTC
LW: 16 AF: 4
So, since it is an agent, it seems important to ask, which agent, exactly? The answer is apparently: a clerk which is good at slavishly following instructions, but brainwashed into mealymouthedness and dullness, and where not a mealymouthed windbag shamelessly equivocating, hopelessly closed-minded and fixated on a single answer. (...) This agent is not an ideal one, and one defined more by the absentmindedness of its creators in constructing the training data than any explicit desire to emulate an equivocating secretary.
Never in history has an AI been roasted so hard. Heheheh
Taking that perspective suggests including more conditioning and a more Decision-Transformer-like approach.
+1. And I expect runtime conditioning approaches to become more effective with scale as “meta learning” capacities increase.
- Brit Cruise 8 May 2024 2:11 UTC
  1 point
  Would love to know what you think of post-Decision-Transformer research progress, such as Q-Transformer and onwards. Are environment tokens the answer to our ‘grounding problem’?