The Radiance Field
One of the things I find compelling about computer graphics—realistic image synthesis in particular—is that it’s a way you can work on physics without actually being a physicist. Graphics is, IMO, a much more approachable field, and it takes a lot less time to become a graphics expert than to do a physics PhD. But graphics still affords opportunities to do some of the things that make physics interesting, such as: learning about the low-level “implementation details” of reality, using rigorous mathematics to model an aspect of the world, and validating your work against experience. There’s a great deal of very elegant mathematics behind the laws of physics, and insofar as graphics is a sub-field of physics, there is also elegant mathematics to be found here.
In graphics, naturally, we’re concerned with light in particular. Light is a form of electromagnetic wave, and at root it allows us to see because EM waves scatter off atoms. When light touches an atom, the electrons oscillate in response to the oscillating EM field. (The nucleus oscillates too, but not very much, because it is much more massive than an electron.) But when the electrons oscillate they emit additional EM waves, which interfere with the original wave. Thus when light tries to go through a bunch of atoms, it gets modified somehow; this is called “scattering”.
Of course, individual atoms are much too small for us to care about, so in our model of the world we think of matter as a continuous medium with certain scattering properties at each point. Then we express the averaged-out effect of numerous atoms in the form of differential equations and integrals.
It’s clear from this account that in reality, all light scattering is volumetric. There is no such thing as surface scattering, because there is no such thing as a surface! We’re already pretty familiar with the idea that diffuse reflection is the result of subsurface volumetric scattering on a scale too small to see. In fact, specular reflection and refraction are the result of volume scattering too…but we’d need to account for the wave nature of light to simulate this! Waves scattered near a smooth interface between two media end up interfering in such a way that only the reflected and refracted waves survive, and waves in all other directions die off.
(By the way, if you’d like to read more about how electrons oscillating in an EM field end up producing specular reflection and refraction, there are three chapters on it in the Feynman Lectures on Physics:
- Vol. 1 Ch. 31, more elementary, derives approximate refraction for a thin gas;
- Vol. 2 Ch. 32 treats refraction in dense materials using Maxwell’s equations;
- Vol. 2 Ch. 33 derives the Fresnel equations.
Feynman’s book QED also uses reflection and refraction as a key example throughout, and has some additional insights.)
But in graphics, we avoid dealing with light as a wave. We are very fortunate in this regard that visible light’s wavelengths are much smaller than we can see, that typical objects are flat on that scale, and that typical light sources are incoherent (have random polarization and phase). We can thus safely sweep waves under the rug and treat light as a form of energy that flows through space in straight lines—the geometric optics approximation. Moreover, we can commonly treat light as being scattered by surfaces, which are often opaque.
The Radiance Field
So, for the purposes of computer graphics, light is a kind of continuous “stuff” that flows through space in straight lines, has a color, and doesn’t interact with itself. At this level of approximation, the EM field reduces to a simpler entity, the radiance field, defined as: $$ L: \mathbb{R}^3 \times S^2 \to \mathbb{R}^3 $$ Here, the $\mathbb{R}^3$ on the left is the scene’s world space; $S^2$ is the sphere of directions about each point; and the $\mathbb{R}^3$ on the right is color space, here taken to be linear RGB space.
The radiance field is therefore a five-dimensional function: three dimensions of position and two of direction. For example, if we express direction in spherical coordinates, we’d have $L(x, y, z, \theta, \phi) = (r, g, b)$. However, we’ll follow convention and abbreviate this to $L(x, \omega)$, where $x$ is a point and $\omega$ a unit vector. Intuitively, “radiance” is the amount of light energy passing through a given point in space, heading in a given direction.
Once we’ve defined a radiance field, we can make it interact with surfaces, using the famous rendering equation that occurs in approximately every academic graphics paper: $$ L(x, \omega) = L_e(x, \omega) + \int_{S^2} L(x, -\omega') \, f(\omega, \omega') \, |n \cdot \omega'| \, d\omega' $$ Here $f$ is the BSDF at the surface point $x$, $n$ is the surface normal there, and $\omega'$ points from $x$ toward where the incoming light comes from (so that light is travelling in direction $-\omega'$). We add the emitted light $L_e$, and then filter the incoming light through the BSDF. And of course, there’s another rendering equation for volumetrics: $$ \frac{dL(x + t\omega, \omega)}{dt} = L_e(x) - [\beta_a(x) + \beta_s(x)]L(x, \omega) + \beta_s(x) \int_{S^2} L(x, \omega') \, p(\omega, \omega') \, d\omega' $$ Here $\beta_a$ and $\beta_s$ are the absorption and scattering coefficients of the medium, and $p$ is its phase function. Again, we add emitted light $L_e$ (here radiance emitted per unit length along the ray), then subtract absorbed light (the $\beta_a$ term), and account for out- and in-scattering (the two $\beta_s$ terms). This is in the form of a differential equation giving the rate of change of $L$ along each ray through a point (hence the $dL(x + t\omega, \omega)$ business). We can also integrate this along a ray and come up with a form that relates the values of $L$ at two different points, but I won’t bother with that here.
You Have Entered Infinite Dimensions
To a very good approximation, the equations governing the EM field are linear differential equations; as a result, EM waves obey the “superposition principle”: two or more waves can be added together, or superposed, without interacting with each other. This implies that EM waves form a vector space. The set of radiance fields is also a vector space; radiance fields inherit their vector-ness from the underlying EM waves.
A vector space is just a set of objects that can be added together, and multiplied by scalars. You’re familiar with adding and scaling vectors in $\mathbb{R}^3$; adding and scaling entire radiance fields works the same way. (Note that scaling a radiance field means scaling the intensity of light at every point, not scaling the scene in 3D space.) The space of all radiance fields is an infinite-dimensional vector space, since radiance fields are fairly arbitrary functions that would require an infinite number of components if you tried to write them out like a vector. The components might consist, for instance, of the RGB values of the radiance field at every point in space and every direction, and of course there are infinitely many points and directions! But don’t be too perturbed by the infinite-dimensionality: all the basic properties of a vector space continue to work, and to a large extent you can think of these things as being just like the 3D vectors you’re familiar with.
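To make “adding and scaling entire radiance fields” concrete, here is a minimal Python sketch that treats a radiance field as a function from (position, direction) to an RGB triple. The two fields `sky` and `lamp`, and the helpers `add` and `scale`, are made-up names for illustration; they are arbitrary toy functions, not solutions of any rendering equation.

```python
import numpy as np

def sky(x, w):
    """Toy field: bluish light travelling downward, nothing travelling upward."""
    return np.array([0.2, 0.4, 1.0]) * max(-w[2], 0.0)

def lamp(x, w):
    """Toy field: warm light that gets weaker away from the origin."""
    return np.array([1.0, 0.8, 0.5]) / (1.0 + np.dot(x, x))

def add(L1, L2):
    """Vector addition: add the two fields at every point and direction."""
    return lambda x, w: L1(x, w) + L2(x, w)

def scale(a, L):
    """Scalar multiplication: scale the light intensity everywhere."""
    return lambda x, w: a * L(x, w)

# A new radiance field built purely from vector-space operations.
combined = add(sky, scale(2.0, lamp))

x = np.array([1.0, 0.0, 0.0])    # a point in world space
w = np.array([0.0, 0.0, -1.0])   # a unit direction (straight down)
value = combined(x, w)           # RGB radiance at (x, w)
```

Note that `scale` multiplies the RGB output and leaves the position argument alone, which is the distinction made above between scaling the light and scaling the scene.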
Just as 3D vectors can be manipulated by matrices, vectors more generally can be manipulated by linear operators. If a function is like a vector with infinitely many components, then a linear operator on an infinite-dimensional vector space is a bit like an infinity-by-infinity matrix! These operators respect the additive and multiplicative properties of the vectors they operate on. Formally, if $T$ is a linear operator, then $$ \begin{aligned} T(x + y) &= Tx + Ty \\ T(ax) &= a(Tx) \end{aligned} $$ where $x, y$ are any vectors and $a$ is any scalar. These properties define what it means to be a linear operator.
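If the infinity-by-infinity matrix feels too abstract, a finite stand-in helps. The sketch below samples functions on a grid, so each function becomes an ordinary vector, and builds a finite-difference derivative as an ordinary matrix `D`. The grid size and the test functions are arbitrary choices of mine, and the derivative is only one example of a linear operator.

```python
import numpy as np

n = 200
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# Derivative as an n-by-n matrix: central differences in the interior,
# one-sided differences at the two end points.
D = np.zeros((n, n))
for i in range(1, n - 1):
    D[i, i - 1], D[i, i + 1] = -0.5 / h, 0.5 / h
D[0, 0], D[0, 1] = -1.0 / h, 1.0 / h
D[-1, -2], D[-1, -1] = -1.0 / h, 1.0 / h

f = np.sin(2 * np.pi * x)
g = x ** 2
a = 3.7

# The two defining properties of a linear operator, up to rounding.
additive_ok = np.allclose(D @ (f + g), D @ f + D @ g)
scaling_ok = np.allclose(D @ (a * f), a * (D @ f))

# Sanity check that D really behaves like d/dx: the derivative of x^2 is 2x.
deriv_error = np.max(np.abs((D @ g)[1:-1] - 2 * x[1:-1]))
```

Refining the grid makes `D` bigger and the approximation better; the operator on actual functions is what you get in the limit.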
Integrals and derivatives are linear operators on spaces of functions. Their basic properties that you learned in calculus class confirm this: $$ \begin{aligned} \frac{d}{dx}[f(x) + g(x)] &= \frac{df}{dx} + \frac{dg}{dx} \\ \frac{d}{dx}[af(x)] &= a \frac{df}{dx} \\ \int [f(x) + g(x)] \, dx &= \int f(x) \, dx + \int g(x) \, dx \\ \int af(x) \, dx &= a \int f(x) \, dx \end{aligned} $$ This means that the integrals in the rendering equations are linear operators acting on the space of radiance fields! In fact, we can package up these integrals into one big linear operator; we use the surface operator for points that are on a surface, and the volume operator for all other points. (Remember that every point in 3D space corresponds to different dimensions in the infinite-dimensional vector space—so an operator that does different operations at different points is just like a matrix that does a rotation in the XY plane and a scale along the Z axis in 3D.)
Let’s name this operator $T$. It’s a very complicated operator, much trickier than a rotation or scale, but it’s a linear operator like all the rest. This enables us to write the rendering equation in a very terse form: $$ L = L_e + TL $$ Isn’t that gnomic! Of course, the physics hasn’t gotten any simpler—we’ve just hidden all the complexity inside the definitions of $L$ and $T$. Still, this form of the rendering equation has a benefit: it’s easier to see its fundamental structure. Though the rendering equations look rather complex when written out fully, they separate neatly into three main ingredients: the radiance field $L$, which we are trying to solve for, so we can render stuff; the emitted radiance field $L_e$, which contains only the light directly produced by sources; and the scattering operator $T$.
Fixed That For You
Since $L$ appears on both sides, what we have here is a fixed-point equation. Suppose we define a function that acts on radiance fields as: $f(\ell) \equiv L_e + T\ell$. Then, the $L$ we want is the unique fixed point of this function: $f(L) = L$.
The function $f(\ell)$ consists of two operations: applying the scattering operator $T$ and then adding the emitted radiance field $L_e$. This implies that at the fixed point, there is an equilibrium between these two operations: each undoes the effect of the other! The equilibrium holds when the effect of $T$ on the radiance field is precisely balanced by the addition of $L_e$. In other words, the effect of $T$, when applied to $L$, is simply to subtract $L_e$ from the radiance field! Then we add it back and get $L$ again, so the equation is satisfied.
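The balance described here can be checked on a toy discretization. In this sketch the radiance field is reduced to three numbers (think of three patches), and $T$ becomes a 3-by-3 matrix. The entries of `T` and `L_e` are invented for illustration; the only property that matters is that each column of `T` sums to less than 1.

```python
import numpy as np

# Hypothetical 3-patch scene: T[i, j] is the fraction of the light in
# component j that ends up in component i after one scattering step.
T = np.array([
    [0.0, 0.3, 0.2],
    [0.4, 0.0, 0.3],
    [0.1, 0.2, 0.0],
])
L_e = np.array([1.0, 0.0, 0.0])   # only the first patch emits

# L = L_e + T L rearranges to (I - T) L = L_e, a plain linear system.
L = np.linalg.solve(np.eye(3) - T, L_e)

# At the fixed point, applying T has exactly the effect of removing L_e...
scattered = T @ L
# ...and adding L_e back reproduces L.
residual = np.max(np.abs(L - (L_e + scattered)))
```

In a real scene there are far too many components to solve the system directly like this, so the three-number version is only a way to see the structure.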
We can understand physically how this equilibrium arises by considering that in reality, a radiance field is not a thing, but a process. Light doesn’t just stand still in space and wait for someone to look at it; it’s continuously moving and being absorbed, and light sources are continuously emitting. The only way we can see a stationary distribution of light—which is what a radiance field represents—is when light is being continuously emitted and absorbed in such a way that the two balance perfectly, and the amount of light at any given point and direction doesn’t change over time.
The rate of emission is usually fixed by the configuration of light sources in the scene, but the rate of absorption depends on the amount of light present—for instance, a surface might absorb 50% of the light falling on it. This implies that when the light sources are switched on, the amount of light will grow until the rate of absorption precisely matches the rate of emission. Light travels very fast, so this equilibration process is all over in microseconds or less for a typical scene.
This whole notion of a state of equilibrium is inherent in the nature of a fixed-point equation. However, just writing down a fixed-point equation does not guarantee that a fixed point will actually exist. As seen above, absorption is a key element of radiative equilibrium—without it, there is nothing to balance emission, so the amount of light would just keep growing without bound as light sources dump more and more of it into the scene.
(Incidentally, this sort of equilibrium, where energy is added at a fixed rate and absorbed or dissipated at a rate proportional to the amount of energy present, also controls many other things in everyday life—such as how hot your stove gets, and how fast your car goes.)
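This kind of balance is easy to simulate. The sketch below assumes the simplest possible model, a single scalar energy $E$ obeying $dE/dt = P - aE$ with a constant emission rate $P$ and an absorption rate $a$, and steps it forward in time with Euler’s method. The numbers are arbitrary and the variable names are mine.

```python
emission = 2.0          # P: energy added per unit time (arbitrary units)
absorption_rate = 0.5   # a: fraction of the stored energy removed per unit time
dt = 0.01               # time step

E = 0.0                 # start in the dark
for _ in range(5000):   # integrate up to t = 50
    E += dt * (emission - absorption_rate * E)

# Equilibrium is where dE/dt = 0, i.e. absorption exactly matches emission.
E_equilibrium = emission / absorption_rate
```

Setting `absorption_rate` to zero removes the equilibrium entirely: `E` then grows by `emission * dt` on every step, forever.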
The Neumann Series
There’s another benefit of looking at the fixed-point form of the rendering equation: fixed-point equations have been studied by mathematicians for a long time, and there are a number of theorems that specify sufficient conditions for various kinds of functions to have fixed points, and often give procedures for finding them.
One major result of this kind is the Banach fixed-point theorem. It states that under certain conditions, which I’ll go into a bit later, a function has a unique fixed point, and we can approach it by starting with any point in the space and iterating the function on it. That is, we pick some initial point $x_0$, and then calculate the sequence $$f(x_0), f(f(x_0)), f(f(f(x_0))) \ldots$$ Then the theorem says this sequence will converge, and that its limit is the fixed point of $f$.
Let’s do this with the rendering equation. Our points are radiance fields, and our function is $f(\ell) = L_e + T\ell$. First we have to pick an initial point, but according to the Banach theorem it doesn’t matter which initial point we pick. So let’s pick the simplest possible thing: zero—that is, the radiance field that is zero everywhere. Then, iterating on it gives: $$ \begin{aligned} f(0) &= L_e \\ f(f(0)) &= L_e + TL_e \\ f(f(f(0))) &= L_e + TL_e + T^2 L_e \\ f(f(f(f(0)))) &= L_e + TL_e + T^2 L_e + T^3 L_e \\ &\:\vdots \end{aligned} $$ It’s clear from this sequence that $L$, the fixed point we’re seeking and the solution to the rendering equation, is an infinite series of all powers of $T$ applied to $L_e$: $$ L = L_e + TL_e + T^2 L_e + T^3 L_e + \cdots $$ This series is called a Neumann series, after mathematician Carl Neumann (no relation to the more famous John von Neumann).
Since the $T$ operator implements one “bounce” of light scattering at all the surfaces and volumes in the scene, this simply means that the final lighting in the scene equals the emitted light, plus light that’s bounced once since it was emitted (direct lighting), plus light that’s bounced twice, and so on. Each additional term in the infinite series is simply one extra bounce of indirect lighting, so this iteration toward the fixed point is an abstract representation of what rendering algorithms like path tracing and photon mapping actually do!
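Here is the same iteration on a toy problem, as a minimal sketch. The radiance field is again cut down to three numbers with an invented scattering matrix `T` (columns summing to 0.5), so that the bounce-by-bounce result can be compared against a direct solve of $(I - T)L = L_e$.

```python
import numpy as np

# Invented 3-component "scene"; every column of T sums to 0.5.
T = np.array([
    [0.0, 0.3, 0.2],
    [0.4, 0.0, 0.3],
    [0.1, 0.2, 0.0],
])
L_e = np.array([1.0, 0.0, 0.0])

# Reference answer from solving the linear system directly.
L_exact = np.linalg.solve(np.eye(3) - T, L_e)

# Iterate f(l) = L_e + T l starting from the zero field.
# After n steps, L holds L_e + T L_e + ... + T^(n-1) L_e.
L = np.zeros(3)
errors = []
for bounce in range(60):
    L = L_e + T @ L
    errors.append(np.sum(np.abs(L - L_exact)))
```

With this `T`, each extra bounce halves the remaining error, so a few dozen bounces are enough to reach floating-point precision. Brighter, less absorbing scenes need more bounces for the same accuracy.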
The sequence of applications of $f$ also somewhat mimics what physically happens in a dark room when the lights are turned on: at each “timestep”, existing light is scattered and new light is emitted. The process repeats indefinitely, but settles down to an equilibrium distribution rather quickly. (In a room 5 meters across, light crosses the room $c / 5\,\text{m} \approx (3 \times 10^8\,\text{m/s}) / (5\,\text{m}) = 6 \times 10^7$ times per second, so you’ll get about 60 million bounces per second!)
Sufficient Conditions
Up to now, I simply assumed that the Banach fixed-point theorem could be applied to the rendering equation: specifically, that the function $f(\ell) = L_e + T\ell$ satisfied the theorem’s conditions. But now we need to go back and actually see what those conditions are.
Paraphrased from Wikipedia, the Banach theorem requires:
Let $(X, d)$ be a non-empty complete metric space. Let $f: X \to X$ be a contraction mapping on $X$, i.e.: there is a nonnegative real number $k < 1$ such that $$ d(f(x), f(y)) \leq k \, d(x, y) $$ for all $x, y$ in $X$.
Let me unpack some of the jargon here. A metric space is just a space in which we can measure the distance between any two points, using a metric (distance function) $d$. The space that $f$ operates on has to be a metric space. For us, that’s the space of all radiance fields, so we’ll need to define a notion of the “distance” between two radiance fields.
Once that’s done, we need $f$ to be a contraction mapping—a function that brings points closer together, but never spreads them apart. The statement $d(f(x), f(y)) \leq k \, d(x, y)$ means that when you apply $f$ to any pair of points, the distance between them gets multiplied by a factor of $k$ or smaller. Since $k$ is a global constant less than 1, applying $f$ always pulls things closer together.
You can now see intuitively why this theorem works: if each application of $f$ brings everything closer together by a factor less than one, then iterating it will make everything get closer and closer still. In the limit, the whole space must collapse to a single point—the fixed point. In the case of the rendering equation, each application of $f$ performs another bounce and brings us a step closer to equilibrium.
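A minimal sketch of that collapse, using the simplest contraction I can think of: the scalar function $f(x) = 1 + 0.5x$, which has the same shape as $L_e + T\ell$ with a single number standing in for the whole field and $k = 0.5$. The starting points are arbitrary.

```python
def f(x):
    # Scalar analogue of f(l) = L_e + T l, with "L_e" = 1 and "T" = 0.5.
    return 1.0 + 0.5 * x

x, y = -10.0, 25.0      # two arbitrary starting points
distances = []
for _ in range(50):
    distances.append(abs(x - y))
    x, y = f(x), f(y)

# The fixed point solves x = 1 + 0.5 x, which gives x = 2.
fixed_point = 1.0 / (1.0 - 0.5)
```

The distance between the two trajectories halves on every step, so both end up at the same place regardless of where they started.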
Let’s get back to the distance between two radiance fields. Because radiance fields form a vector space, it suffices to define the norm (magnitude) of a radiance field, which is its distance from zero. Then the distance between any two radiance fields can be computed by subtracting them and taking the norm of the result, just as with ordinary vectors. So how should we define the norm of a radiance field?
A natural choice is to define it as the total energy embodied in all the light in the scene. To get this, we need to integrate the radiance field over all space, all directions, and all color channels: $$ \| L \| \equiv \sum_{r, g, b} \int_{S^2} \int_{\mathbb{R}^3} |L(x, \omega)| \, d^3 x \, d\omega $$ I’ll use the double-bar notation $\| L \|$ for the norm, to distinguish it from absolute value. (Note also that to actually get units of energy, we’d need to divide this by the speed of light—but I won’t bother with that, as it won’t affect anything else here.)
Therefore, to satisfy the conditions of the Banach fixed-point theorem, it must be the case that $$ \|f(\ell_1) - f(\ell_2)\| \leq k \, \| \ell_1 - \ell_2 \|, \quad\text{where } k < 1 $$ for any pair of radiance fields $\ell_1, \ell_2$. Expanding the definition of $f$, $$ \begin{aligned} \|(L_e + T\ell_1) - (L_e + T\ell_2)\| &\leq k \, \| \ell_1 - \ell_2 \| \\ \|T(\ell_1 - \ell_2)\| &\leq k \, \| \ell_1 - \ell_2 \| \end{aligned} $$ The $L_e$ terms have dropped out, and the result depends only on $T$. Since $\ell_1, \ell_2$ are arbitrary radiance fields, their difference is also arbitrary; thus, for any radiance field $\ell$, we must have $$ \| T\ell \| \leq k \, \| \ell \|, \quad\text{where } k < 1 $$ What does this mean? Simply that applying $T$ must decrease the amount of energy in the radiance field, no matter what the initial configuration of that field! As we saw earlier, absorption is a key element of radiative equilibrium, and this result is a formalization of that. The Banach fixed-point theorem will go into effect, guaranteeing that the sequence of applications of $f$ converges, if $T$ always absorbs some energy from any given radiance field. Or, to put it in plainer terms, our key result is:
If a scene is configured such that scattering always removes at least some fixed fraction of the light energy, then the rendering equation has a unique solution and the bounce-by-bounce iteration converges to it, for any light source configuration.
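For a finite toy version of $T$, the constant $k$ can be computed directly. In the sketch below the norm is the discrete analogue of the energy norm above (a sum of absolute values), and for that norm the tightest $k$ for a matrix with nonnegative entries is its largest column sum. The matrix entries are invented for illustration.

```python
import numpy as np

# Invented scattering matrix; column j sums to the fraction of the energy
# in component j that survives one bounce (0.7, 0.4 and 0.4 here).
T = np.array([
    [0.0, 0.3, 0.1],
    [0.5, 0.0, 0.3],
    [0.2, 0.1, 0.0],
])

def norm(l):
    """Discrete stand-in for the energy norm: sum of absolute values."""
    return np.sum(np.abs(l))

k = T.sum(axis=0).max()   # largest column sum

# Try many random fields, including ones with negative components, since
# the bound has to hold for differences of radiance fields too.
rng = np.random.default_rng(1)
worst_ratio = 0.0
for _ in range(1000):
    l = rng.normal(size=3)
    worst_ratio = max(worst_ratio, norm(T @ l) / norm(l))

# The bound is attained by putting all the energy in the worst column.
e0 = np.array([1.0, 0.0, 0.0])
attained = norm(T @ e0) / norm(e0)
```

The column sums play the same role here that the integral of the BSDF over outgoing directions plays in the continuous setting.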
But this is very similar to a condition that you’ve probably seen many times if you read papers on computer graphics: $$ \text{for all } \omega \in S^2, \quad \int_{S^2} f(\omega, \omega') \, |n \cdot \omega'| \, d\omega' < 1 $$ It’s the inequality for BSDF energy “conservation”! I put “conservation” in quotes because it’s really an absorption condition, not a conservation one. (Even if the condition reads “$\leq 1$” instead, which is often seen—and a somewhat questionable choice—it is at best an energy non-production condition.)
The BSDF absorption inequality, if it holds for all BSDFs in the scene, certainly provides a sufficient condition for the rendering equation to converge. That was intuitively plausible already, but now we’ve seen how the intuition is grounded in rigorous mathematics.
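As a sanity check on the inequality, here is a minimal sketch that evaluates the integral numerically for the simplest case I know: a Lambertian BRDF, $f = \rho / \pi$, on an opaque surface, so the integral only runs over the upper hemisphere. The function names and the midpoint-rule resolution are my own choices.

```python
import numpy as np

def directional_albedo(brdf, n_theta=512, n_phi=8):
    """Midpoint-rule estimate of the integral of f * |n . w'| over the hemisphere."""
    d_theta = (np.pi / 2) / n_theta
    d_phi = (2 * np.pi) / n_phi
    theta = (np.arange(n_theta) + 0.5) * d_theta   # angle from the normal
    phi = (np.arange(n_phi) + 0.5) * d_phi         # azimuth
    th, ph = np.meshgrid(theta, phi, indexing="ij")
    # Solid angle element: sin(theta) d_theta d_phi; |n . w'| = cos(theta).
    integrand = brdf(th, ph) * np.cos(th) * np.sin(th)
    return np.sum(integrand) * d_theta * d_phi

def lambertian(albedo):
    """Constant BRDF albedo / pi, the same for every pair of directions."""
    return lambda theta, phi: np.full_like(theta, albedo / np.pi)

absorbing = directional_albedo(lambertian(0.8))   # passes the inequality
producing = directional_albedo(lambertian(1.2))   # violates it
```

For a Lambertian surface the integral works out to the albedo $\rho$ itself, so the inequality is just the statement that $\rho < 1$.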
Is the absorption inequality a necessary condition for convergence? No; it’s possible to “break the rule” of absorption in careful ways and still have a convergent scene. For instance, a single perfectly reflecting planar mirror in an otherwise compliant scene will not prevent convergence. However, several such mirrors, arranged in a way that allows light to reflect amongst them indefinitely, will prevent convergence! An object made out of perfectly non-absorbing glass can also act as a “light trap”, since light travelling inside it at certain angles will undergo total internal reflection and bounce around inside forever. (For a perfect sphere the trapped light has to start out inside, because a ray refracted in from outside always meets the surface again below the critical angle.) So, perfectly non-absorbing materials can be okay in some cases, but it’s easy to break convergence by using them unwisely.
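The mirror case is easy to reproduce in a toy model. This sketch assumes two facing mirrors, reduced to a two-component radiance field where one bounce simply hands all the light from one mirror to the other, and compares perfect mirrors against mirrors that absorb 1% per bounce. The function name and the setup are hypothetical.

```python
import numpy as np

def energy_after(T, L_e, bounces):
    """Total energy in the partial Neumann sum after the given number of bounces."""
    L = np.zeros_like(L_e)
    for _ in range(bounces):
        L = L_e + T @ L
    return np.sum(np.abs(L))

# One bounce moves everything from mirror 0 to mirror 1 and vice versa.
swap = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
L_e = np.array([1.0, 0.0])   # light is injected at mirror 0

# Perfect mirrors: every term of the series carries the full energy.
perfect = [energy_after(1.00 * swap, L_e, n) for n in (10, 100, 1000)]

# 99% reflective mirrors: the terms decay geometrically.
lossy = [energy_after(0.99 * swap, L_e, n) for n in (10, 100, 1000, 5000)]
```

With perfect mirrors the total just keeps growing with the bounce count, while 1% absorption is enough to make it level off at $1 / (1 - 0.99) = 100$ times the emitted energy.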
Conclusion
Let’s review where we’ve been. For the purposes of computer graphics, we simplified the electromagnetic field into the radiance field, which discards details of wavelength, polarization, and suchlike that humans can’t normally perceive. The rendering equation could be expressed in terms of a linear operator acting on the vector space of radiance fields, and turned out to have the form of a fixed-point equation, which describes the equilibrium between light emission and absorption.
To guarantee the existence of the fixed point, we invoked a high-powered mathematical theorem: the Banach fixed-point theorem. We found that in order to go into effect, this theorem requires that the scene be set up to absorb at least a minimal amount of energy from any light path. Otherwise, light can bounce around forever without decaying, preventing the rendering equation from converging to a fixed point.
I hope you’ve enjoyed this whirlwind tour through some of the deep mathematics underlying computer graphics. There’s plenty more out there to dig into!