Empirical data point: In my experience, talking to Inflection’s Pi on the phone integrates “AI is capable of holding a conversation over text, transcribing speech to text, and synthesizing natural-sounding speech” at low enough latency to pass some bar of “feels authentically human” for me, at least until I try to test its limits. I imagine that impression is more likely to arise if you don’t have background knowledge about LLMs / DL. Its main problems are 1) keeping track of context in a plausibly human-like way (e.g. in a game of guessing the capital cities of European countries, it keeps asking about the same few countries even when asked in various ways to take care not to repeat them) and 2) inconsistently refusing to talk about certain things depending on the preceding text (e.g. retelling dark jokes by real comedians).
I share your expectation that adding photorealistic video generation to it could plausibly lead to another “cultural moment”, though that might depend on whether such avatars see adoption as rapid as ChatGPT’s or are phased in more gradually. (I have no overview of the entire space and stumbled on Inflection’s product by chance after listening to a random podcast. If there are similar ones out there already, I’d love to know.)
edit: Corrected link formatting.