Yeah, physics tends to be taught as if you’re going to use it. So you don’t just get told what a Christoffel symbol is; it’s assumed you’re going to spend a few hours calculating them.
I found the post itself a bit confusing. The connection of quaternions to rotations wasn’t clear to me (what does the real part do? If nothing, isn’t this a violation of one of the desiderata for representations? How does this relate to spinors? Don’t spinors use all the degrees of freedom? Etc.). I think there’s an interesting comparison to be made between the representation as size-2 vectors of quaternions versus size-4 vectors of complex numbers, both practically (spinor calculations do seem to involve duplicated effort in the size-4 representation) and in interpretation (antimatter!).
In the 2x2 matrix representation, the basis element corresponding to the real part of a quaternion is the identity matrix. So scaling the real part results in scaling the (real part of the) diagonal of the 2x2 matrix, which corresponds to a scaling operation on the spinor. It incidentally plays the same role on 3D objects: it scales them. Plus, it plays a direct role in rotations when it’s −1 (180 degree rotation) or 1 (0 degree rotation). As with i, j, and k, the exact effect of changing the real part of the quaternion isn’t obvious from inspection when it’s summed with other non-zero components. For example, it’s hard to tell by inspection what the 2 or the 3j is doing in the quaternion 2+3j.
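To make the 2+3j example concrete, here is a minimal sketch of one common convention for writing a quaternion as a 2x2 complex matrix. The function names (`quat_to_matrix`, `det2`, `polar_form`) are made up for illustration, and other sign conventions for i, j, k are equally valid. Under this convention the real part multiplies the identity matrix, and the determinant equals the squared norm of the quaternion.

```python
import math

def quat_to_matrix(a, b, c, d):
    """Map a + b*i + c*j + d*k to a 2x2 complex matrix (one common convention)."""
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def polar_form(a, b, c, d):
    """Split a nonzero quaternion into (scale, rotation angle, rotation axis)."""
    norm = math.sqrt(a * a + b * b + c * c + d * d)
    vec = math.sqrt(b * b + c * c + d * d)
    angle = 2.0 * math.atan2(vec, a)  # q = norm * (cos(angle/2) + sin(angle/2) * axis)
    axis = (b / vec, c / vec, d / vec) if vec > 0 else (0.0, 0.0, 0.0)
    return norm, angle, axis

identity = quat_to_matrix(1, 0, 0, 0)       # the real unit maps to the identity matrix
m = quat_to_matrix(2, 0, 3, 0)              # the quaternion 2 + 3j
scale, angle, axis = polar_form(2, 0, 3, 0)

print(identity)
print(det2(m), scale ** 2)
print(math.degrees(angle), axis)
```

Read this way, 2+3j is a scaling by its norm combined with a rotation about the j axis, which is the mix of scaling and rotating that neither coefficient shows on its own.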
In total, quaternions represent scaling, rotation, and any mix of the two. I should have been clearer about that in the post. Spinors for quaternions do include any “state changes” resulting from the real part of the quaternion as well as any changes resulting from i, j, and k components, so the spinor does use all degrees of freedom.
The change in representation between 2-quaternion and 4-complex spinors is purely notational. It doesn’t affect any of the math or underlying representations. Since a quaternion operation can be represented by a 2x2 complex matrix, you can represent a 2-quaternion operation as the tensor product of two 2x2 complex matrices, which would give you a 4x4 complex matrix. That’s where 4x4 gamma matrices come from: each is a tensor product of two 2x2 Pauli matrices. For all calculations and consequences, you get the exact same answers whether you choose to represent the operations and spinors as quaternions or complex numbers.
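As a rough illustration of the tensor-product claim, the sketch below builds four 4x4 matrices as Kronecker products of 2x2 matrices and checks that they satisfy the anticommutation relations expected of gamma matrices. The choice of the Dirac basis, the metric signature (+, −, −, −), and the helper names are assumptions made for this example and are not taken from the post.

```python
def kron(a, b):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(a, b):
    """Plain matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """Compute a*b + b*a."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]

def scaled_identity(c, n=4):
    """Return c times the n x n identity matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

I2 = [[1, 0], [0, 1]]
sx = [[0, 1], [1, 0]]        # Pauli sigma_x
sy = [[0, -1j], [1j, 0]]     # Pauli sigma_y
sz = [[1, 0], [0, -1]]       # Pauli sigma_z
i_sy = [[0, 1], [-1, 0]]     # i * sigma_y

# Dirac basis (an assumed convention): gamma^0 = sz (x) I, gamma^k = (i*sy) (x) sigma_k
gammas = [kron(sz, I2)] + [kron(i_sy, s) for s in (sx, sy, sz)]
metric = [1, -1, -1, -1]

# Clifford relation: {gamma^mu, gamma^nu} = 2 * eta^{mu nu} * identity
clifford_ok = all(
    anticommutator(gammas[mu], gammas[nu])
    == scaled_identity(2 * metric[mu] if mu == nu else 0)
    for mu in range(4) for nu in range(4)
)
print(clifford_ok)
```

Other bases (Weyl, Majorana) use different 2x2 factors but follow the same tensor-product pattern.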
“Plus, it plays a direct role in rotations when it’s −1 (180 degree rotation) or 1 (0 degree rotation).”
Isn’t −1 inversion? Inverting the axis of rotation makes total sense (while “180 degree rotation” with no axis is nonsense), and inverting the scale of the object also makes sense, but is nonphysical. (This is why physicists talk about SO(3), not O(3).)
I think for quaternions, (−1)∗q corresponds to both inversion and a 180 degree rotation.
When using quaternions to describe rotations in 3D space, however, one can still represent rotations with unit quaternions q=cos(α/2)+sin(α/2)∗n, where n is a ‘unit vector’ along the directions i,j,k that indicates the rotation axis, and α is the 3D rotation angle. If one wishes to rotate any orientation x (the same type of object as n) by q, the result is qxq∗, where q∗ is the conjugate of q. Here, q=−1 corresponds to α/2=π and is thus a full 360 degree turn.
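The formula above can be sketched in a few lines. This is only an illustration: quaternions are stored as (w, x, y, z) tuples, and the function names are invented for the example. It shows that q and −q rotate a vector identically, and that a 360 degree angle produces q=−1 rather than q=1.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

def conj(q):
    """Quaternion conjugate: flip the sign of the i, j, k parts."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def from_axis_angle(axis, alpha):
    """Build q = cos(alpha/2) + sin(alpha/2) * n for a unit axis n."""
    half = alpha / 2.0
    s = math.sin(half)
    return (math.cos(half), s * axis[0], s * axis[1], s * axis[2])

def rotate(q, v):
    """Rotate the 3D vector v by the unit quaternion q via q v q*."""
    w, x, y, z = qmul(qmul(q, (0.0, v[0], v[1], v[2])), conj(q))
    return (x, y, z)

z_axis = (0.0, 0.0, 1.0)
q90 = from_axis_angle(z_axis, math.pi / 2)
minus_q90 = tuple(-c for c in q90)
q360 = from_axis_angle(z_axis, 2 * math.pi)

print(rotate(q90, (1.0, 0.0, 0.0)))        # x axis turned toward the y axis
print(rotate(minus_q90, (1.0, 0.0, 0.0)))  # same result from -q
print(q360)                                # approximately (-1, 0, 0, 0)
```

Because the sign of q cancels in qxq∗, the two quaternions q and −q always describe the same 3D rotation.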
I have tried to read up on explanations for this a few times, but unfortunately never with full success. But usually people start talking about q describing a “double cover” of the 3D rotations.
Maybe a bit of intuition about this relation can come from thinking about measured quantities in quantum mechanics as ‘expectation values’ of some operator X written as ψ†Xψ:
Here it becomes more intuitive that replacing X with q∗(qXq∗)q, which equals X because q∗q=1 (rotating the measured quantity back and forth by α around the axis n), results in ψ†q∗(qXq∗)qψ = (qψ)†(qXq∗)(qψ), which is an α-rotated X measured on an α/2-rotated wavefunction.
Thanks for the explanation. I found this post that connects your explanation to an explanation of the “double cover.” I believe this is how it works:
Consider a point on the surface of a 3D sphere. Call it the “origin”.
From the perspective of this origin point, you can map every point of the sphere to a 2D coordinate. The mapping works like this: Imagine a 2D plane going through the middle of the sphere. Draw a straight line (in the full 3D space) from the selected origin to any other point on the sphere. Where the line crosses the plane, that’s your 2D vector representation of the other point. Under this visualization, the origin point should be mapped to a 2D “point at infinity” to make the mapping smooth. This mapping gives you a one-to-one conversion between 2D coordinate systems and points on the sphere.
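A minimal sketch of this mapping, assuming a unit sphere centered at (0, 0, 0), the “origin” fixed at the pole (0, 0, 1), and the plane z = 0 through the middle. These coordinates and the function names are choices made for the example only.

```python
import math

def project(p):
    """Project a unit-sphere point (x, y, z) from the pole (0, 0, 1) onto the plane z = 0."""
    x, y, z = p
    if z >= 1.0:
        # The pole itself has no finite image; this stands in for the "point at infinity".
        return (math.inf, math.inf)
    return (x / (1.0 - z), y / (1.0 - z))

def unproject(c):
    """Inverse map: take a finite plane coordinate (u, v) back to the unit sphere."""
    u, v = c
    r2 = u * u + v * v
    return (2 * u / (r2 + 1), 2 * v / (r2 + 1), (r2 - 1) / (r2 + 1))

opposite = project((0.0, 0.0, -1.0))    # the point opposite the origin lands at the plane's center
near_pole = project((0.6, 0.0, 0.8))    # points closer to the origin land farther out
round_trip = unproject(project((0.6, 0.0, 0.8)))

print(opposite, near_pole, round_trip)
```

Choosing a different origin point amounts to rotating the sphere before projecting, so the same two functions cover every choice of origin.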
You can create a new 2D coordinate system for sphere surface points using any point on the sphere as the origin. All of the resulting coordinate systems can be smoothly deformed into one another. (Points near the origin always have large coordinates, points on the opposite side of the sphere are always close to (0,0), and the changes are smooth as you move the origin smoothly.)
Each choice of origin on the surface of the sphere (and therefore each 2D coordinate system) corresponds to two unit-length quaternions. You can see this as follows. Pick any choice of i,j,k values from a unit quaternion. There are now either 1 or 2 choices for what the real component of that quaternion might have been. If i,j,k alone have unit length, then there’s only one choice for the real component: zero. If i,j,k alone do not have unit length, then there are two choices for the real component since either a positive or a negative value can be used to make the quaternion unit length again.
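The counting argument above can be written down directly. In this sketch the function name `real_components` and the tolerance `tol` are illustrative assumptions: given the i,j,k values, it returns every real component that makes the quaternion unit length.

```python
import math

def real_components(x, y, z, tol=1e-12):
    """Return all real parts w such that w + x*i + y*j + z*k has unit length."""
    s = x * x + y * y + z * z
    if s > 1.0 + tol:
        raise ValueError("i, j, k part is longer than 1; no unit quaternion exists")
    remainder = max(0.0, 1.0 - s)
    if remainder <= tol:
        return [0.0]            # i, j, k already have unit length: only w = 0 works
    w = math.sqrt(remainder)
    return [w, -w]              # otherwise a positive and a negative choice

print(real_components(0.0, 1.0, 0.0))   # one choice
print(real_components(0.6, 0.0, 0.0))   # two choices
```

The single-valued case is exactly the set of quaternions with real component 0, which is where the positive and negative branches meet.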
Take the set of unit quaternions that have a real component close to zero. Consider the set of 2D coordinate systems created from those points. In this region, each coordinate system corresponds to two quaternions EXCEPT at the points where the quaternion’s real component is 0. This exceptional case prevents a one-to-one mapping between coordinate transformations and quaternion transformations.
As a result, there’s no “smooth” way to reduce the two-to-one mapping from quaternions to coordinate systems down to a one-to-one mapping. Any mapping would require either double-counting some quaternions or ignoring some quaternions. Since there’s a one-to-one mapping between coordinate systems and candidate origin points on the surface of the sphere, this means there is also no one-to-one mapping between quaternions and points on the sphere.
No matter what smooth mapping you choose from SU(2) (unit quaternions) to SO(3) (unit spheres), the mapping must do the equivalent of collapsing distinctions between quaternions with positive and negative real components. And so the double cover corresponds to the two sets of covers: one of positive-real-component quaternions over the sphere, and one of negative-real-component quaternions over the sphere. Within each cover, there’s a smooth one-to-one conversion between quaternion-coordinates mappings, but across covers there is not.