Ege Erdil comments on Translating between Latent Spaces

Ege Erdil 30 Jul 2022 20:01 UTC
3 points

Some brief attempts were tried by first getting a vector for brightness and then a vector orthogonal to this (using Gram-Schmidt), but this didn’t quite work. Depending on how one increased brightness, one could get a vector that is not orthogonal to shoe height. For the larger VAE, moving along the brightness vector, the shoe gets both brighter and taller than in the shoe height direction—our orthogonalization attempts unfortunately did not end up working.

I’d be interested to hear you elaborate on exactly how these attempts failed. What goes wrong if you look at vectors of the form $v_{b} + \alpha v_{h}$ for $\alpha \in \mathbb{R}$, where $v_{b}$ and $v_{h}$ are the vectors in latent space that you identify for brightness and height respectively, and then do some kind of manual binary search on $\alpha$ to eyeball the value that cancels out the height change? Do you run into a collinearity problem when you try this, for instance?
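
To make the suggestion concrete, here is a minimal sketch of that search, under loud assumptions: the latent space is a toy 3-dimensional one, and `height_change` is a hypothetical stand-in for the eyeballing step (a linear readout here, where in practice it would be a human judging decoded images). None of these names or numbers come from the post.

```python
import numpy as np

def bisect_alpha(height_change, lo=-10.0, hi=10.0, tol=1e-6, max_iter=200):
    """Bisect for the alpha in [lo, hi] where height_change(alpha) crosses zero.

    height_change(alpha) should return a signed estimate of how much the shoe's
    height changes when moving along v_b + alpha * v_h.
    """
    f_lo, f_hi = height_change(lo), height_change(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on [lo, hi]; widen the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = height_change(mid)
        if abs(f_mid) < tol or hi - lo < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi, f_hi = mid, f_mid   # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid   # root lies in the upper half
    return 0.5 * (lo + hi)

# Toy setup (assumed, not from the post): linear readouts for each attribute.
w_height = np.array([1.0, 0.0, 0.0])
w_bright = np.array([0.3, 1.0, 0.0])

v_h = np.array([1.0, 0.0, 0.0])   # height direction
v_b = np.array([0.5, 1.0, 0.0])   # brightness direction that also raises height

def height_change(alpha):
    # Stand-in for "decode and eyeball how much taller the shoe got".
    return float(w_height @ (v_b + alpha * v_h))

alpha = bisect_alpha(height_change)
v_clean = v_b + alpha * v_h       # brightness direction with height cancelled

print("alpha:", alpha)
print("height change along v_clean:", float(w_height @ v_clean))
print("brightness change along v_clean:", float(w_bright @ v_clean))
```

In this toy case the brightness change survives the correction. If $v_{b}$ and $v_{h}$ were nearly collinear, `v_clean` would instead shrink toward zero, which is one way the collinearity problem would show up.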