I’m pretty sure that measures of a model’s persuasiveness which focus on text alone will greatly underestimate the true persuasive potential of future powerful AI.
A future powerful AI would need richer inputs and outputs than text to perform at maximum persuasiveness.
Inputs
speech audio from the target
live video of the target’s face (allows for micro-expression detection, pupil-dilation tracking, gaze tracking, and blood-flow and heart-rate tracking; crude sketches of gaze and heart-rate estimation appear below)
an EEG signal would help, but is too much to expect in most cases
sufficiently long interaction to experiment with the individual and build a specific understanding of their responses
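To give a sense of how low the technical bar already is for some of these input signals, here is a crude sketch of gaze tracking from ordinary webcam video using MediaPipe Face Mesh. Treat the landmark indices (468 for an iris center, 33 and 133 for that eye’s corners, per the commonly cited Face Mesh convention), the default-webcam capture, and the single-frame usage as illustrative assumptions, not a serious implementation.

```python
# Crude sketch of webcam gaze tracking via MediaPipe Face Mesh.
# Landmark indices are assumptions based on the commonly cited convention:
# 468 = iris center, 33 / 133 = outer / inner corners of the same eye.
import cv2
import mediapipe as mp

IRIS_CENTER, EYE_OUTER, EYE_INNER = 468, 33, 133  # assumed indices

def gaze_ratio(frame) -> float | None:
    """Horizontal iris position within the eye: ~0.5 when looking straight."""
    with mp.solutions.face_mesh.FaceMesh(
            max_num_faces=1, refine_landmarks=True) as mesh:
        result = mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if not result.multi_face_landmarks:
        return None  # no face found in this frame
    lm = result.multi_face_landmarks[0].landmark
    iris, outer, inner = lm[IRIS_CENTER], lm[EYE_OUTER], lm[EYE_INNER]
    # Normalize the iris x-position between the two eye corners.
    return (iris.x - outer.x) / (inner.x - outer.x)

if __name__ == "__main__":
    cap = cv2.VideoCapture(0)  # assumed: default webcam
    ok, frame = cap.read()
    cap.release()
    if ok:
        print(f"gaze ratio: {gaze_ratio(frame)}")
```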
Outputs
emotionally nuanced voice
an avatar face (which may be cartoonish)
ability to present audiovisual material, real or fabricated (graphs of data, videos, pictures)
For a reference on blood-flow tracking from video, see: https://youtu.be/rEoc0YoALt0?si=r0IKhm5uZncCgr4z
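On blood flow specifically: remote photoplethysmography (rPPG) recovers a pulse from the tiny periodic color changes that each heartbeat produces in facial skin, invisible to humans but recoverable from ordinary video. Below is a minimal sketch assuming OpenCV and NumPy; the Haar-cascade face detector, the fixed 30 fps frame rate, and the file name face.mp4 are illustrative assumptions. A serious implementation would track the face continuously and separate the pulse from motion and lighting noise.

```python
# Minimal sketch of remote photoplethysmography (rPPG): estimate heart rate
# from the periodic color changes blood flow causes in facial skin.
import cv2
import numpy as np

FPS = 30.0  # assumed frame rate of the input video

def estimate_heart_rate(video_path: str) -> float:
    cap = cv2.VideoCapture(video_path)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    samples = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, 1.3, 5)
        if len(faces) == 0:
            continue  # skip frames with no detected face
        x, y, w, h = faces[0]
        # Mean green-channel intensity over the face region; the green
        # channel carries the strongest pulse signal in most rPPG work.
        samples.append(frame[y:y + h, x:x + w, 1].mean())
    cap.release()

    # Dominant frequency of the detrended signal within a plausible
    # human heart-rate band (0.7-4.0 Hz, i.e. 42-240 bpm).
    signal = np.array(samples) - np.mean(samples)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / FPS)
    power = np.abs(np.fft.rfft(signal)) ** 2
    band = (freqs > 0.7) & (freqs < 4.0)
    peak = freqs[band][np.argmax(power[band])]
    return peak * 60.0  # beats per minute

if __name__ == "__main__":
    print(f"Estimated heart rate: {estimate_heart_rate('face.mp4'):.0f} bpm")
```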
This is why I consider it bad informational hygiene to interact with current models in any modality besides text. Why pull the plug now instead of later? To prevent frog-boiling.