This is not clear from how we wrote the paper, but we actually do the clustering in the full 768-dimensional space! If you look closely at the clustering plot, you can see that the clusters slightly overlap—that would be impossible with k-means in 2D, since in that setting membership is determined by distance to the 2D centroid.
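To make the distinction concrete, here is a minimal sketch (not our actual code) of the pipeline described above: k-means is fit on the full 768-dimensional embeddings, and a 2D projection is used only for plotting. The specific libraries (scikit-learn) and the choice of PCA for the projection are illustrative assumptions, not necessarily what the paper uses.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Placeholder for real 768-dimensional embeddings
embeddings = rng.normal(size=(1000, 768))

# k-means assignment happens in the full 768-d space
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(embeddings)

# The 2D coordinates are computed afterwards, purely for visualization
coords_2d = PCA(n_components=2).fit_transform(embeddings)

# Scattering coords_2d colored by labels can show overlapping clusters,
# whereas k-means fit directly on coords_2d would carve the plane into
# non-overlapping Voronoi cells.
```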
Ahh sorry! Going back to reread it, it was pretty clear from the text. I was tricked by the figure, where the embedding is presented first.
Again, good job! :)
Thank you for the comment and the questions! :)