Normals and the Inverse Transpose, Part 3: Grassmann On Duals
Welcome back! In the last couple of articles, we learned about different ways to understand normal vectors in 3D space—either as bivectors (part 1), or as dual vectors (part 2). Both can be valid interpretations, but they carry different units, and react differently to transformations.
In this third and final installment, we’re going to leave behind the focus on normal vectors, and explore a couple of other unitful vector quantities. We’ve seen how Grassmann bivectors and trivectors act as oriented areas and volumes, respectively; and we saw how dual vectors act as oriented line densities, with units of inverse length. Now, we’re going to put these two geometric concepts together, and find out what they can accomplish with their combined powers. (Get it? Powers? Like powers of a scale factor? Uh, you know what, never mind.)
I’m going to dive right in, so if you need a refresher on either Grassmann algebra or dual spaces, you may want to re-skim the previous articles.
Wedge Products of Dual Vectors
Grassmann algebra allows us to take wedge products of vectors, producing higher-grade algebraic entities such as bivectors and trivectors. Just as we can do this with base vectors, we can do the same thing on dual vectors, producing dual bivectors and dual trivectors.
A dual bivector is formed by wedging two dual vectors, like: $$ {\bf e_x^*} \wedge {\bf e_y^*} = {\bf e_{xy}^*} $$ and a dual trivector is the product of three: $$ {\bf e_x^*} \wedge {\bf e_y^*} \wedge {\bf e_z^*} = {\bf e_{xy}^*} \wedge {\bf e_z^*} = {\bf e_{xyz}^*} $$ This works exactly the same way that wedge products of ordinary vectors do; in particular, the same anticommutative law applies.
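If you’d like to see this concretely in coordinates: in 3D, wedging two dual vectors follows the same component formula as wedging two ordinary vectors (the familiar cross-product pattern). Here’s a minimal numpy sketch, with names of my own invention, just to illustrate the components and the anticommutativity:

```python
import numpy as np

def wedge(a, b):
    # Wedge product of two (dual) vectors in 3D. Inputs are components
    # in the e*_x, e*_y, e*_z basis; the output is components in the
    # e*_yz, e*_zx, e*_xy basis. Note it's the same component formula
    # as the cross product.
    return np.array([a[1]*b[2] - a[2]*b[1],    # e*_yz component
                     a[2]*b[0] - a[0]*b[2],    # e*_zx component
                     a[0]*b[1] - a[1]*b[0]])   # e*_xy component

ex_star = np.array([1.0, 0.0, 0.0])
ey_star = np.array([0.0, 1.0, 0.0])
print(wedge(ex_star, ey_star))   # [0. 0. 1.]  =  e*_xy
print(wedge(ey_star, ex_star))   # [0. 0. -1.] = -e*_xy (anticommutative)
```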
So what’s the geometric meaning of these dual $k$-vectors? Recall that a dual vector is defined as a linear form—a function from some vector space $V$ to scalars $\Bbb R$. Conveniently, the wedge products of dual vectors turn out to be isomorphic to the duals of wedge products of vectors. (Mathematically, we can say, for finite-dimensional $V$: $$ \textstyle \bigwedge^k \bigl( V^* \bigr) \cong \bigl(\bigwedge^k V \bigr)^* $$ where $\bigwedge^k$ is the operation to construct the set of $k$-vectors over a given base vector space.)
The upshot is that dual $k$-vectors can be understood as linear forms on $k$-vectors: a dual bivector is a linear function from bivectors to scalars, and a dual trivector is a linear function from trivectors to scalars. Let’s see how this works in more detail.
Dual Bivectors
In the previous article, we saw how a dual vector can be visualized as a field of parallel, uniformly spaced planes, representing the level sets of a linear form:
You can think of the discrete planes in this picture as representing intervals of one unit in the output of the linear form. Keep in mind, though, that there are actually a continuous infinity of these planes, filling space—one for every possible output value of the linear form. When you evaluate the linear form—i.e. pair a dual vector with a vector—the result represents how many planes the vector crosses, from its tail to its tip (in a continuous-measure sense of “how many”). This will depend on both the length and orientation of the vector: for example, a vector parallel to the planes will return zero, no matter its length.
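As a quick worked example (with numbers of my own choosing): let $w = 2\,{\bf e_x^*}$, whose level planes are perpendicular to the $x$-axis and spaced $1/2$ unit apart, and let $v = (3, 1, 0)$. Then $$ \langle w, v \rangle = 2 \cdot 3 + 0 \cdot 1 + 0 \cdot 0 = 6 $$ so $v$ crosses six planes from tail to tip, while any vector of the form $(0, s, t)$, running parallel to the planes, crosses zero.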
A dual bivector can be thought of in a similar way—but instead of planes, we now picture a field of parallel lines, uniformly spaced over the plane perpendicular to them.
As suggested by this diagram, when you wedge two dual vectors, the resulting dual bivector consists of all the lines of intersection of the two dual vectors’ respective planes.
What happens when we pair this dual bivector with a base bivector? As before, the result is a scalar—this time representing how many lines the bivector crosses! If you visualize the bivector as a parallelogram, circle, or any other shape, it will have a certain area. It will therefore intersect some quantity of the continuous mass of lines. This quantity won’t depend on the shape of the bivector—remember, bivectors don’t actually have any defined shape—only on its area (magnitude) and orientation. A bivector whose plane runs parallel to the lines will return zero, no matter its area.
Because dual vectors have units of inverse length, and a dual bivector is a product of dual vectors, a dual bivector has units of inverse area. It represents an oriented areal density, such as a probability density over a surface! When you pair the dual bivector with a bivector, the result tells you how much probability (or whatever else) is covered by that bivector’s area. And as implied by their units, dual bivectors scale as $1/a^2$. (If you scale an object up by a factor of $a$, the probability density on its surface goes down by a factor of $a^2$, because the same total probability is now spread over an $a^2$-larger area.)
How about the transformation rule for dual bivectors? Well, we learned in part 1 that bivectors transform as $\text{cofactor}(M)$; and in part 2, we found that dual vectors transform as the inverse transpose, $M^{-T}$. It follows that dual bivectors transform as $\text{cofactor}\bigl(M^{-T}\bigr)$, or equivalently $\bigl(\text{cofactor}(M)\bigr)^{-T}$. Startlingly, for 3×3 matrices these formulas reduce to just $$ \frac{M}{\det M} $$ So, dual bivectors simply transform using $M$ divided by its own determinant.
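If you’d rather check that identity numerically than push the algebra through, here’s a throwaway numpy sketch (the cofactor-via-inverse shortcut assumes $M$ is invertible):

```python
import numpy as np

def cofactor(m):
    # For an invertible matrix, cofactor(M) = det(M) * inverse(M)^T.
    return np.linalg.det(m) * np.linalg.inv(m).T

rng = np.random.default_rng(0)
m = rng.standard_normal((3, 3))   # a random 3x3 matrix

# Dual bivectors transform as cofactor(M^-T), or equivalently
# cofactor(M)^-T; both should reduce to M / det(M).
lhs1 = cofactor(np.linalg.inv(m).T)
lhs2 = np.linalg.inv(cofactor(m)).T
rhs = m / np.linalg.det(m)
print(np.allclose(lhs1, rhs), np.allclose(lhs2, rhs))   # True True
```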
Dual Trivectors
Follow the pattern: if a dual vector in 3D looks like a stack of parallel planes, and a dual bivector looks like a field of parallel lines, then a dual trivector looks like a cloud of parallel points. Well, drop the “parallel”—it doesn’t mean anything. It’s just uniformly spaced points.
As before, the wedge product of three dual vectors—or a dual vector and dual bivector—constructs the continuous point cloud made of all the intersection points of the wedge factors. This quantity scales as $1/a^3$ and represents a volume density. When you pair it with a trivector, the result tells you how much of the point cloud is enclosed in that trivector’s volume.
The transformation rule for this one is easy—dual trivectors in 3D just get multiplied by $1/\det M$.
A Few More Topics
With the introduction of dual bi- and trivectors, our “scaling zoo” is now complete! We’ve got the full ecosystem of vectorial quantities with scaling powers from −3 to +3, each with its proper units and matching transformation formula.
In the rest of this section, I’ll quickly touch on a few more mathematical aspects of this extended Grassmann algebra with dual spaces.
The Interior Product
As we saw in part 2, a vector space and its dual have a “natural pairing” operation, much like an inner product, between vectors and dual vectors. This pairing extends to $k$-vectors and their duals, too. In fact, we can further extend the natural pairing to work between $k$-vectors and duals of different grades. For example, we can define a way to “pair” a dual vector $w$ with a bivector $B = u \wedge v$, yielding a vector: $$ \langle w, B \rangle = \langle w, u \rangle v - u \langle w, v \rangle $$ Geometrically, the resulting vector lies in the plane of $B$, and runs parallel to the level planes of $w$. In some sense, $w$ is “eating” the dimension of $B$ that lies along the direction of $w$’s density, and leaving the leftover dimension behind as a vector.
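Here’s that formula in a few lines of numpy (the names are mine, for illustration), including a check that the result really is annihilated by $w$, i.e. parallel to $w$’s level planes:

```python
import numpy as np

def pair(w, v):
    # Natural pairing of a dual vector with a vector; in coordinates
    # it's just the componentwise dot product.
    return np.dot(w, v)

def interior(w, u, v):
    # Interior product of the dual vector w with the bivector B = u ^ v:
    # <w, B> = <w, u> v - u <w, v>.
    return pair(w, u) * v - pair(w, v) * u

w = np.array([1.0, 2.0, 0.0])
u = np.array([1.0, 0.0, 1.0])
v = np.array([0.0, 1.0, 2.0])
r = interior(w, u, v)
print(r)            # a vector in the plane of u and v
print(pair(w, r))   # 0.0: r runs parallel to w's level planes
```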
This extended pairing operation is known as the interior product or contraction product, although different references define it in slightly different ways; there are several competing conventions in the literature, and I’m not going to go into them too deeply. The key point is that you can combine a $k$-vector with a dual $\ell$-vector, for any grades $k$ and $\ell$; the result will be a $(k-\ell)$-vector, interpreting negative grades as duals.
The Hodge Star
In addition to the vector-space duality we’ve been talking about, Grassmann algebra contains another, distinct notion of duality: Hodge duality, represented by the Hodge star operator, $\star$. (Note that this is a different symbol from the asterisk $*$ used for the dual vector space!)
The vector-space notion of duality relates $k$-vectors to duals of equal grade—vectors to dual vectors, bivectors to dual bivectors, and so on. Hodge duality, however, connects things to duals of a complementary grade. Applying the Hodge star to a $k$-vector produces an element of grade $n - k$, where $n$ is the dimension of space. In 3D, it interchanges vectors (grade 1) with bivectors (grade 2), and scalars (grade 0) with trivectors (grade 3).
The way I’ll define the Hodge star initially is a bit different from the standard way. In fact, there are two Hodge star operations: one that goes from $k$-vectors to dual $(n-k)$-vectors, and another that goes the other way. I’ll denote these by $\star$ and $-\star$ respectively. The two are inverses of each other (in 3D, at least). They’re defined as follows: $$ \begin{aligned} \star&: \textstyle\bigwedge^k V \to \textstyle\bigwedge^{n-k}V^* &&: & v^\star &= \langle {\bf e_{xyz}^*}, v \rangle \\ -\star&: \textstyle\bigwedge^k V^* \to \textstyle\bigwedge^{n-k}V &&: & v^{-\star} &= \langle v, {\bf e_{xyz}} \rangle \end{aligned} $$ The angle brackets on the right here are the interior product. What we’re saying is: to do the Hodge star on a $k$-vector, we take its interior product with ${\bf e_{xyz}^*}$, the standard unit dual trivector (or, in $n$ dimensions, the unit dual $n$-vector). This results in a dual $(n-k)$-vector, which geometrically represents a density over all the dimensions not included in the original $k$-vector.
Conversely, to do the anti-Hodge-star on a dual $k$-vector, we take its interior product with ${\bf e_{xyz}}$, giving an $(n-k)$-vector containing all the dimensions not represented by the original dual $k$-vector, i.e. all the dimensions perpendicular to its level sets.
(These two operations are almost defined on disjoint domains, and could therefore be combined into one “smart” star that automatically knows what to do based on the type of its argument…except for the $k = 0$ case: when you hodge a scalar, does it go to a trivector, or to a dual trivector? Both are possible; that’s why we need two distinct operations here.)
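To make this concrete, here’s what the star does to the 3D basis elements of grades 0, 1, and 3, under my sign conventions for the interior product (other conventions may flip some signs): $$ 1^\star = {\bf e_{xyz}^*}, \qquad {\bf e_x}^\star = {\bf e_{yz}^*}, \qquad {\bf e_y}^\star = {\bf e_{zx}^*}, \qquad {\bf e_z}^\star = {\bf e_{xy}^*}, \qquad {\bf e_{xyz}}^\star = 1 $$ and the anti-star acts the same way with the base and dual elements swapped.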
For 3D geometry, the interesting cases are vectors interchanging with bivectors:
- A vector $v$ hodges to a dual bivector whose “field lines” run parallel to $v$.
- A bivector $B$ hodges to a dual vector whose level planes are parallel to $B$.
- A dual vector $w$ unhodges to a bivector parallel to $w$’s level planes.
- A dual bivector $D$ unhodges to a vector parallel to $D$’s field lines.
Although the formal definition was somewhat involved, you can see that the geometric result of the Hodge operations is actually pretty simple. It’s all about swapping between the geometry of a $k$-vector and the corresponding level-set geometry of a dual $(n-k)$-vector. The Hodge stars are a very useful tool for working with Grassmann and dual-Grassmann quantities in practice.
The Inner Product, or Forgetting About Duals
In most treatments of Grassmann or geometric algebra, dual spaces are hardly mentioned. The more conventional definition of the Hodge star has it mapping directly between $k$-vectors and $(n-k)$-vectors—no duals in sight. How does this work?
It turns out that if we have an inner product defined on our vector space, we can use it to convert back and forth between vectors and dual vectors, or $k$-vectors and their duals.
So far, we haven’t discussed any means of mapping individual vectors back and forth between the base and dual spaces. Although they’re both vector spaces of the same dimension, there’s no natural isomorphism that would enable us to map them in a non-arbitrary way. However, the presence of an inner product does pick out a specific isomorphism with the dual space: that which maps each vector $v$ to a dual vector $v^*$ that implements dotting with $v$, using the inner product.
Symbolically, for all vectors $u \in V$, we have $\langle v^*, u \rangle = v \cdot u$. This can be extended to inner products and isomorphisms for all $k$-vectors as well (see Wikipedia for details).
Note, however, that this map is not preserved by scaling, or by transformations in general, because $v^*$ transforms as $M^{-T}$ while $v$ transforms as $M$.
With this correspondence, it becomes possible to largely ignore the existence of dual spaces and dual elements; we can maintain the fiction that they’re not distinct from the base elements. In an orthonormal basis, even the coordinates of a vector and its corresponding dual will be identical.
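In coordinates, this dualizing map is implemented by the Gram matrix of whatever basis you’re working in. Here’s a quick numpy sketch (the function name and the example bases are mine, purely for illustration):

```python
import numpy as np

def dualize(basis, coords):
    # Map a vector to its dual: v* = G v, where G is the Gram matrix
    # G[i][j] = b_i . b_j of the basis (given as columns of `basis`).
    gram = basis.T @ basis
    return gram @ coords

v = np.array([1.0, 2.0, 3.0])

ortho = np.eye(3)                      # orthonormal basis
print(dualize(ortho, v))               # [1. 2. 3.]: same coordinates

skewed = np.array([[1.0, 1.0, 0.0],    # a sheared basis
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
print(dualize(skewed, v))              # [3. 5. 3.]: no longer identical
```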
For an example of “forgetting” about duals: the Hodge star operations can be defined using the inner product to invisibly dualize their input or output as well as hodging it. Then the two Hodge stars I defined above collapse into one operation, mapping between $\bigwedge^k V$ and $\bigwedge^{n-k} V$.
What’s The Use of All This?
This is kind of a lot. We started with just vectors and normal vectors—two kinds of vector-shaped things with different rules, which was confusing enough. But now we have four: vectors, dual vectors, bivectors, and dual bivectors. And on top of that we have three scalar-shaped things, too: true unitless scalars, trivectors, and dual trivectors.
Evidently, lots of people manage to get along well enough without being totally aware of all these distinctions! Even texts on Grassmann or geometric algebra may not fully delve into the “duals” story, instead treating $k$-vectors and their duals as the same thing (implicitly using the isomorphism defined above). Their differing transformation behavior becomes sort of a curiosity, an unsystematic ornamental detail. And this comes at the cost of making some aspects of the algebra require an inner product or a metric, and only work properly in an orthonormal basis. In contrast, when you’re “cooking with duals”, you can derive formulas that work properly in any basis.
As a quick example of this, let’s look at a concrete problem you might encounter in graphics. Let’s say you have a triangle mesh and you want to select a random point on it, chosen uniformly over the surface area. To do this, we must first select a random triangle, with probability proportional to area. The standard technique is to precompute the areas of all the triangles and build a prefix-sum table; then, to select a triangle, we take a uniform random value and binary-search on it in the table.
Let’s throw in another wrinkle, though. What if the triangle mesh is transformed—possibly by a nonuniform scaling, or a shear? In general, this will alter the areas of all the triangles, in an orientation-dependent way. A uniform distribution over surface area in the mesh’s local space will no longer be uniform in world space. We could address this by pre-transforming the whole mesh into world space and doing the sampling process there—but that’s more expensive than necessary.
We can use bivectors to help. Instead of calculating just a scalar area for each triangle, calculate the bivector representing its orientation and area. (If the triangle’s vertices are $p_1, p_2, p_3$, this is $\tfrac{1}{2}(p_2 - p_1) \wedge (p_3 - p_1)$.) Now we can transform all the bivectors into world space, using their transformation rule, and they will accurately represent the areas of the transformed triangles. Then we can calculate their magnitudes and build the prefix-sum table, as before.
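Here’s a minimal numpy sketch of the whole pipeline (all the names and the data layout are invented for illustration; the cofactor shortcut assumes $M$ is invertible):

```python
import numpy as np

def triangle_bivector(p1, p2, p3):
    # (1/2) (p2 - p1) ^ (p3 - p1), stored in the e_yz, e_zx, e_xy basis;
    # in these coordinates the components match the cross product's.
    return 0.5 * np.cross(p2 - p1, p3 - p1)

def cofactor(m):
    # Bivectors transform by cofactor(M) = det(M) * inverse(M)^T.
    return np.linalg.det(m) * np.linalg.inv(m).T

def build_area_table(tri_verts, m):
    # tri_verts: (n, 3, 3) array of local-space triangle vertices.
    local = np.array([triangle_bivector(*t) for t in tri_verts])
    world = local @ cofactor(m).T          # transform each bivector
    areas = np.linalg.norm(world, axis=1)  # world-space triangle areas
    return np.cumsum(areas)                # prefix-sum table

def pick_triangle(table, rng):
    # Binary-search a uniform random value against the prefix sums.
    return np.searchsorted(table, rng.uniform(0.0, table[-1]))

tris = np.array([[[0, 0, 0], [1, 0, 0], [0, 1, 0]],
                 [[0, 0, 0], [2, 0, 0], [0, 0, 1]]], dtype=float)
m = np.diag([1.0, 3.0, 1.0])   # a nonuniform scale
table = build_area_table(tris, m)
print(pick_triangle(table, np.random.default_rng()))
```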
Conversely, suppose we have an existing, non-uniform areal probability measure defined over our triangle mesh. (Maybe it’s a light source with a texture defining its emissive brightness, and we want to sample with respect to emitted power; or maybe we want to sample with respect to solid angle subtended at some point, or some sort of visual importance, etc.) We can represent these probability densities as dual bivectors, and again we can take them back and forth between local and world space—even in the presence of shear or nonuniform scaling—with confidence that we’re still representing the same distribution.
Some other examples where dual $k$-vectors show up:
- The derivative (gradient) of a scalar field, such as an SDF, is naturally a dual vector.
- Dual vectors represent spatial frequencies (wavevectors) in Fourier analysis.
- The radiance carried by a ray is a density with respect to projected area, and can therefore be represented, at least in part, as a dual bivector.
Like many theoretical math concepts, I think these ideas are mostly useful for enriching your own mental models of geometry, strengthening your thought process, and deriving results that you can then use in code in a more “conventional” way. I’m not necessarily suggesting we should all go off and start implementing $k$-vectors and their duals as classes in our math libraries. (Frankly, our math libraries are enough of a mess already.)
Organizing the Zoo
One more thing to muse on before I leave you. We’ve seen that there is a “scaling zoo” of mathematical elements with different physical, geometric interpretations and behaviors. Different branches of science and math have distinct ways of conceptually organizing this zoo, and thinking about its denizens and their relationships.
In computer science, for example, we would probably understand vectors, bivectors, dual vectors, and so forth as different types. Each might have an internal structure as a composition of more elementary values (real numbers), and a suite of allowed operations that define what you can do with them and how they interact with one another.
Physicists, meanwhile, tend to take a more rough-and-ready approach: geometric elements are thought of as simply matrices of real (or sometimes complex) numbers, together with transformation laws—rules that define what happens to a given matrix under a change of coordinates. Algebraic properties such as anticommutativity are obtained by constructing the matrices in such a way that matrix multiplication implements the desired algebra. For example, a bivector can be represented as an antisymmetric matrix; wedging two vectors $u, v$ to make a bivector corresponds to calculating the matrix $$ uv^T - vu^T $$ which has the same anticommutative property as a wedge product. Multiplying this matrix by a (dual) vector $w$ then represents the interior product of the bivector with $w$. Meanwhile, a dual bivector would be structurally similar, but have a different transformation law (“covariant” versus “contravariant”).
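A tiny numpy illustration of this picture, with vectors of my own choosing:

```python
import numpy as np

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 1.0, 1.0])

# The bivector u ^ v, represented as an antisymmetric matrix.
b = np.outer(u, v) - np.outer(v, u)
print(np.allclose(b, -b.T))                              # True
print(np.allclose(np.outer(v, u) - np.outer(u, v), -b))  # True: v^u = -(u^v)

# Multiplying by a (dual) vector contracts one index, leaving a vector in
# the plane of u and v: the interior product, up to sign convention.
w = np.array([1.0, 1.0, 0.0])
print(b @ w)   # = u (v.w) - v (u.w) = [1. -1. 1.]
```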
Lastly, mathematicians like to formalize things by saying that different geometric quantities are elements of different spaces and/or algebras. Both terms ultimately mean a set (in the mathematical sense), together with some extra structure—such as algebraic operations, a topology, a norm or metric, and so on—defined on top of the bare set. The exact kind of structures you need depends on what you’re doing, and there’s a whole menagerie of such structures that might be invoked in different contexts.
So which structure is behind the scaling zoo? We know we’ve got the vector space structure, and the Grassmann algebraic structure. But neither of these fully accounts for the different scaling and transformation behaviors of dual elements: dual spaces are isomorphic to their base spaces (in finite dimensions), totally identical insofar as the vector and Grassmann structures are concerned.
I don’t have a fully developed answer yet—but I suspect it’s got to do with the representation theory of Lie groups. My guess is that the different types of scaling elements we’ve seen can be codified as vector spaces acted on by different representations of $GL(n)$, the Lie group of all linear maps on $\Bbb R^n$. But I’m not going to get into that here. (If you’d like to read more on this, here are a couple web references: one, two. Also: Peter Woit’s book on the role of representation theory in particle physics.)
Conclusion
I hope this has been an entertaining and enlightening tour through some of the layers beneath the surface of your favorite Euclidean geometry. We started with a seemingly simple question—why do normal vectors transform using the inverse transpose matrix?—and found that there was far richer structure there than meets the eye.
The “scaling zoo” of $k$-vectors and their duals makes a pleasingly complete and symmetrical whole. Even if I’m not going to be employing these things in practical work every day, I feel that studying them has helped me understand some things that were vague and foggy in my mind before. It’s worth appreciating that these subtle distinctions exist. One of my general axioms in life is that everything is more complicated than it first appears, and nowhere is this more consummately borne out than in mathematics!