What does it mean for perception to compress a frame of video to 1k tokens? What kind of information gets lost when you do this?