I wonder if with the next generations of multimodal models we’ll see a “rubber ducking” phenomenon where, because their self-attention is spread across modalities, things like CoT and using outputs as a scratchpad will perform significantly better in non-text streams.
Will GPT-4o, fed its own auditory outputs, tonal cues and pauses and all, processed as an audio data stream, make connections or leaps it never would if just fed its own text outputs as context?
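Something like the loop below, roughly speaking. To be clear, this is just a sketch of the idea, not code against any real API: `MultimodalClient`, `speak`, and `listen_and_respond` are hypothetical stand-ins for whatever audio-in/audio-out interface a future model actually exposes.

```python
# Sketch of an audio "rubber duck" loop: the model's own spoken draft,
# prosody and all, is fed back to it as audio context for a second pass.
# Everything here is hypothetical scaffolding, not a real client library.

from dataclasses import dataclass


@dataclass
class AudioClip:
    """Raw audio plus sample rate; tonal cues and pauses live in the waveform."""
    data: bytes
    sample_rate: int


class MultimodalClient:
    """Hypothetical audio-in/audio-out model interface."""

    def speak(self, prompt: str) -> AudioClip:
        """Generate a spoken first-pass response."""
        raise NotImplementedError

    def listen_and_respond(self, prompt: str, context_audio: AudioClip) -> str:
        """Respond again while attending to prior audio as context."""
        raise NotImplementedError


def audio_rubber_duck(client: MultimodalClient, question: str) -> str:
    # Pass 1: think out loud -- the scratchpad is an audio stream, not text.
    draft = client.speak(f"Think through this out loud before answering: {question}")
    # Pass 2: re-listen to that draft, tonal cues included, then answer.
    return client.listen_and_respond(
        f"Here is your earlier spoken reasoning; revise it and answer: {question}",
        context_audio=draft,
    )
```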
I think this will be the case, and suspect the various firms dedicating themselves to virtualized human avatars will accidentally stumble into profitable niches—not for providing humans virtual AI clones as an interface, but for providing AIs virtual human clones as an interface. (Which is a bit frustrating, as I really loathe that market segment right now.)
When I think about how Sci-Fi authors projected the future of AI cross- or self-talk, it was towards a super-efficient beeping or binary transmission of pure data betwixt them.
But I increasingly get the sense that, like much of actual AI development over the past few years, a lot of the Sci-Fi thinking was tangential or inverse to the actual vector of progress, particularly in underestimating the inherent value humans bring to bear. The wonders we see developing around us are jumpstarted and continually enabled by the patterns we ourselves have woven, and it seems that, at least in the near future, models will conform to those patterns more and more, not less and less.
Still, it’s going to be bizarre as heck to watch a multimodal model’s avatar debating itself aloud like I do in my kitchen...
When I wrote this, I thought OAI was sort of fudging the audio output, using SSML as an intermediate step.
After seeing details in the system card, such as the model copying the user’s voice, it’s clearly not fudging.
Which makes me even more sure the above is going to end up prophetically correct.