Tristan Harris called them “generative large language multimodal models (GLLMMs)”. So “golems”. Gollums?
My first guess is that I would prefer just calling them “Multimodals”. Or perhaps “Image/Text Multimodals”.
Tristan Harris called them “generative large language multimodal models (GLLMMs)”. So “golems”. Gollums?
My first guess is that I would prefer just calling them “Multimodals”. Or perhaps “Image/Text Multimodals”.