$E = mc^2$ Is Only Half The Story
I’m sure you’ve seen the equation $E = mc^2$ many times. Probably the best-known equation in physics, it represents both a straightforward mathematical relationship and a deep physical principle. But did you know that this equation is incomplete? The full version of Einstein’s equation is: $$ E = \sqrt{m^2 c^4 + p^2 c^2} $$ This states that the relativistic energy, $E$, of a moving object is a function of its mass $m$ and its momentum, $p$ (as well as the speed of light, $c$). If you set $p = 0$ and simplify, you’ll get back to the usual $E = mc^2$. To be sure, the above equation doesn’t roll off the tongue quite as easily as $E = mc^2$…but with the momentum term included, it tells a fuller story about how relativity works. In fact, it looks very much like the Pythagorean theorem! Relativistic energy scales the same way as the hypotenuse of a right triangle whose legs are mass and momentum. This fact is not a coincidence, as we’ll see.
In relativity theory, objects with mass cannot travel at or faster than the speed of light. This presents us with a problem: we have to modify the rules of mechanics. In Newtonian mechanics, if you keep applying force to an object, it keeps accelerating, in principle to arbitrarily high velocities. But relativity says that the speed of light is an absolute limit.
So Newton’s second law, $F = ma$, can no longer be true when dealing with relativistic velocities. However, it turns out that a closely related law does still work. In Newtonian physics, momentum is defined as $p = mv$, so for an object of constant mass its time derivative is $dp/dt = ma$; thus an equivalent way to express Newton’s law is $F = dp/dt$. This version of the law can be made correct all the way to lightspeed, but to do so, we have to trade in our Newtonian definition of momentum for a relativistic one.
How should we define momentum in relativity? Well, it should be a function of velocity, since it’s supposed to measure how fast something’s going. Momentum also ought to be conserved: when a group of objects interact, their total momentum shouldn’t change. In particular, any observer looking at a system (technically: any inertial observer) should see momentum being conserved, no matter what her velocity relative to the system. Different observers may measure different amounts of total momentum, but they should all agree that the total momentum doesn’t change over time.
It turns out that’s enough to pin down the correct relativistic formula for momentum. I won’t go through the derivation here, but before I tell you the answer, I have to explain a bit about the form of it: the relativistic generalization of momentum is a 4-dimensional vector, called the “4-momentum”.
In Newtonian mechanics, most quantities of interest are 3-vectors. Vectors are handy because equations expressed in terms of them automatically work in all coordinate systems, even though the $x, y, z$ components of each vector may be completely different when looked at in different coordinates. Momentum is a vector, so if it’s conserved in my coordinates, it will be conserved in yours too.
The geometric setting for relativity theory is four-dimensional spacetime, though. 3-vectors are no longer appropriate: equations made of them will break when time gets involved, such as when two coordinate systems are moving relative to each other. To fix this, we have to upgrade to 4-vectors, which include a time component along with their $x, y, z$. For momentum to be conserved in relativity, it’s got to be a 4-vector. And it is! The formula comes out to be: $$ p_{xyz} = \frac{mv}{\sqrt{1 - v^2 / c^2}}, \qquad p_t = \frac{mc^2}{\sqrt{1 - v^2 / c^2}} $$ The first equation gives the space components of an object’s 4-momentum, in terms of its regular 3-dimensional velocity, $v$; the second gives the time component. Again, both of these components are required for the full relativistic law of momentum conservation. However, we can still think about the physical interpretations of $p_{xyz}$ and $p_t$ separately, as long as we remember that these quantities aren’t absolute—they depend on what coordinates you’re looking at them in.
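To make the two formulas concrete, here is a minimal Python sketch that builds all four components from a mass and an ordinary 3-velocity. The function name `four_momentum` and the sample values are illustrative assumptions, and the time component follows this article's convention, in which $p_t$ carries units of energy.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def four_momentum(mass, velocity):
    """Return (p_t, p_x, p_y, p_z) for a mass in kg and a 3-velocity in m/s.

    Follows the article's convention: p_t = m c^2 / sqrt(1 - v^2/c^2).
    """
    speed_sq = sum(component ** 2 for component in velocity)
    if speed_sq >= C ** 2:
        raise ValueError("a massive object must move slower than light")
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / C ** 2)
    p_t = gamma * mass * C ** 2
    p_x, p_y, p_z = (gamma * mass * component for component in velocity)
    return (p_t, p_x, p_y, p_z)


# Illustrative values: a 1 kg object at rest, then moving at 0.6 c along x.
at_rest = four_momentum(1.0, (0.0, 0.0, 0.0))
moving = four_momentum(1.0, (0.6 * C, 0.0, 0.0))
print("at rest:", at_rest)
print("moving: ", moving)
```

The same object yields different components in the two cases, which is the coordinate dependence described above.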
One way to see how to interpret these quantities physically is to see what happens in the low-speed limit, since relativistic mechanics should reduce to Newtonian mechanics then. Let’s do this with the space and time components of 4-momentum in turn.
When $v$ is small relative to $c$, $v^2/c^2$ will be quite small indeed, and the denominator will go to 1 and disappear. For the space components, this gives $p_{xyz} \approx mv$, which is just the Newtonian formula for momentum! So the space components of the 4-momentum can be interpreted as the relativistic replacement for the Newtonian momentum vector. This is comforting to know, since it helps justify our calling the whole 4-vector a “momentum”.
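A quick numerical check shows how good the low-speed approximation is. This sketch compares the two momentum formulas at several speeds; the function names and the chosen fractions of $c$ are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def newtonian_momentum(mass, speed):
    """Newtonian momentum, p = m v."""
    return mass * speed


def relativistic_momentum(mass, speed):
    """Space part of the 4-momentum, p = m v / sqrt(1 - v^2/c^2)."""
    return mass * speed / math.sqrt(1.0 - (speed / C) ** 2)


def momentum_ratio(speed):
    """Relativistic over Newtonian momentum; the mass cancels out."""
    return relativistic_momentum(1.0, speed) / newtonian_momentum(1.0, speed)


for fraction in (1e-6, 0.01, 0.1, 0.5, 0.9, 0.99):
    ratio = momentum_ratio(fraction * C)
    print(f"v = {fraction:g} c   p_rel / p_newton = {ratio:.6f}")
```

The ratio is indistinguishable from 1 at everyday speeds, about 1.005 at a tenth of lightspeed, and about 1.155 at half of it.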
Going back to the full equation $p_{xyz} = mv/\sqrt{1 - v^2/c^2}$, not the low-speed approximation, this implies that momentum grows toward infinity as velocity gets closer to lightspeed (as you can see by graphing the equation). Or, to put it another way, there is no upper limit on momentum, as there is on velocity! Your momentum can reach an arbitrarily high level, and as it grows you’ll get closer and closer to lightspeed, but you never quite reach it—just as relativity requires. This is why the modified Newton’s law $F = dp/dt$ can still be true; applying force to a relativistic object changes its momentum just the same way it does for slow-moving objects, but the relationship between momentum and velocity has become more complicated.
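The claim that velocity levels off while momentum keeps growing can be checked by solving the momentum formula for $v$, which gives $v = p / \sqrt{m^2 + p^2/c^2}$. The sketch below pushes an object with a constant force starting from rest, so that $p = Ft$. The mass, the force, and the name `speed_from_momentum` are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def speed_from_momentum(mass, momentum):
    """Invert p = m v / sqrt(1 - v^2/c^2) to get the speed v."""
    return momentum / math.sqrt(mass ** 2 + (momentum / C) ** 2)


mass = 1.0      # kg, illustrative
force = 1.0e7   # N, illustrative constant force

for seconds in (1, 10, 30, 100, 1000):
    momentum = force * seconds        # F = dp/dt with constant F, from rest
    v_newton = momentum / mass        # what p = m v would predict
    v_rel = speed_from_momentum(mass, momentum)
    print(f"t = {seconds:>4} s   Newtonian v/c = {v_newton / C:8.3f}"
          f"   relativistic v/c = {v_rel / C:.6f}")
```

The momentum grows linearly with time in both columns; only the conversion from momentum to velocity differs, and the relativistic one never reaches $c$.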
Now let’s look at the time component of 4-momentum in the low-speed limit. Using the same approximation as before, we get just $p_t \approx mc^2$, a constant. This level of approximation (first order in $v$) is too coarse to see anything interesting, so let’s go to second order in $v$. The result is $$ p_t \approx mc^2 + \frac{mv^2}{2} $$ Now that’s interesting! The second term there is the Newtonian formula for kinetic energy! This implies that the time component of 4-momentum should be interpreted as the relativistic energy, $E$, of an object, as seen from a particular coordinate system.
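For readers who want the intermediate step, the second-order result follows from the standard binomial series for $1/\sqrt{1 - x}$ with $x = v^2/c^2$. The symbol $x$ is introduced here only for this expansion: $$ \begin{aligned} \frac{1}{\sqrt{1 - x}} &= 1 + \frac{x}{2} + \frac{3x^2}{8} + \cdots \\ p_t = \frac{mc^2}{\sqrt{1 - v^2/c^2}} &= mc^2 \left( 1 + \frac{v^2}{2c^2} + \frac{3v^4}{8c^4} + \cdots \right) \\ &= mc^2 + \frac{mv^2}{2} + \frac{3mv^4}{8c^2} + \cdots \end{aligned} $$ The first neglected term is smaller than the kinetic-energy term by a factor of order $v^2/c^2$, which is why it goes unnoticed at everyday speeds.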
This has two deep consequences. First, the laws of conservation of energy and conservation of momentum aren’t really separate—they are one and the same law, the conservation of 4-momentum. In a sense, you can interpret energy as “momentum along the time axis”.
Second, even when an object isn’t moving, it still has a “rest energy” $E = mc^2$. In fact, what we know as “kinetic energy” is just what an object’s rest energy looks like in a moving coordinate system.
Just as relativity alters distances and times when you move, via phenomena like time dilation and length contraction, it also alters energy. It works out that when an object moves, its rest energy appears to grow by a factor of $1 / \sqrt{1 - v^2/c^2}$. For everyday velocities this factor exceeds 1 by only a tiny amount, but $E = mc^2$ is a huge amount of energy for everyday objects, so even that tiny excess has a ponderable effect, and that effect is just the Newtonian kinetic energy!
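Here is a rough numerical illustration of a huge rest energy multiplied by a tiny excess. The car-like mass and highway-like speed are assumed values. Because subtracting 1 from the growth factor directly would lose nearly all floating-point precision at such low speeds, the hypothetical helper `gamma_minus_one` computes the excess through `math.log1p` and `math.expm1` instead.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma_minus_one(speed):
    """Compute 1/sqrt(1 - v^2/c^2) - 1 accurately, even when v << c.

    gamma = exp(-0.5 * log(1 - v^2/c^2)), so gamma - 1 is an expm1 of a
    log1p; this avoids the cancellation in a direct subtraction.
    """
    beta_sq = (speed / C) ** 2
    return math.expm1(-0.5 * math.log1p(-beta_sq))


mass = 1000.0   # kg, roughly a small car (illustrative)
speed = 30.0    # m/s, roughly highway speed (illustrative)

rest_energy = mass * C ** 2
relativistic_ke = rest_energy * gamma_minus_one(speed)
newtonian_ke = 0.5 * mass * speed ** 2

print(f"rest energy     = {rest_energy:.6e} J")
print(f"gamma - 1       = {gamma_minus_one(speed):.6e}")
print(f"relativistic KE = {relativistic_ke:.3f} J")
print(f"Newtonian KE    = {newtonian_ke:.3f} J")
```

The excess over 1 is of order $10^{-15}$, yet multiplying it by a rest energy of order $10^{19}$ joules recovers the familiar $mv^2/2$.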
Finally, if we put the space and time components of the 4-momentum together, we can find one more relationship: the full version of Einstein’s equation I mentioned at the top of the article. Every vector has a length or magnitude; the magnitude of a velocity vector is speed, and so on. 4-vectors have magnitudes too, but the formula involves opposite signs for the space and time components, like $$ |p| = \sqrt{p_t^2 - c^2(p_x^2 + p_y^2 + p_z^2)} $$ (This is how relativity theory mathematically implements the distinction between time and space: the two contribute in opposite ways to the length of a 4-vector.) If we calculate the magnitude of the 4-momentum, writing $E$ for $p_t$ and $p$ for the magnitude of $p_{xyz}$, we get $$ |p| = \sqrt{E^2 - c^2 p^2} = \sqrt{\frac{m^2 c^4}{1 - v^2 / c^2} - \frac{m^2 c^2 v^2}{1 - v^2 / c^2}} = \sqrt{\frac{m^2 c^4 (1 - v^2 / c^2)}{1 - v^2 / c^2}} = mc^2 $$ The magnitude of an object’s 4-momentum is just a constant: the object’s rest energy. This is in contrast to the magnitude of its 3-momentum (what we usually just call “momentum”), which is, as we saw earlier, a function of velocity. Squaring both sides and solving for $E$ gives $$ E = \sqrt{m^2 c^4 + p^2 c^2} $$ which is where we started! Now it’s a bit clearer why this relationship looks like the Pythagorean theorem: it almost is. There’s no right triangle whose legs are mass and momentum, but there is a right triangle whose legs are momentum and energy and whose hypotenuse is mass. It’s the 4-momentum with its space and time components. The equation is the “wrong way around” because of the minus sign in the definition of the magnitude of a 4-vector.
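The invariance of the magnitude is easy to check numerically. This sketch computes $E$ and $p$ for one object at several speeds and confirms that $\sqrt{E^2 - c^2 p^2}$ always comes back to $mc^2$, and that the energy formula reproduces $E$ from $m$ and $p$ alone. The function names and the 2 kg mass are illustrative assumptions, and motion is taken along a single axis for simplicity.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def energy_and_momentum(mass, speed):
    """Return (E, p) for motion along one axis, per the article's formulas."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return gamma * mass * C ** 2, gamma * mass * speed


def magnitude(energy, momentum):
    """4-momentum magnitude sqrt(E^2 - c^2 p^2); note the minus sign."""
    return math.sqrt(energy ** 2 - (C * momentum) ** 2)


def energy_from_momentum(mass, momentum):
    """Full energy relation, E = sqrt(m^2 c^4 + p^2 c^2)."""
    return math.sqrt(mass ** 2 * C ** 4 + momentum ** 2 * C ** 2)


mass = 2.0  # kg, illustrative
rest_energy = mass * C ** 2

for fraction in (0.0, 0.1, 0.5, 0.9):
    energy, momentum = energy_and_momentum(mass, fraction * C)
    print(f"v = {fraction:g} c   E / mc^2 = {energy / rest_energy:.6f}"
          f"   |p| / mc^2 = {magnitude(energy, momentum) / rest_energy:.6f}")
```

The energy column grows with speed while the magnitude column stays at 1, up to floating-point rounding.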
To recap: the relativistic energy of an object, as seen in a particular coordinate system, includes contributions from both its mass (rest energy) and its momentum. Momentum has no upper bound, even though velocity does; the relationship between the two becomes nonlinear at high speeds, so that velocity approaches $c$ asymptotically as momentum grows. The true conservation law is that of 4-momentum, which reduces to separate conservation of momentum and energy in the low-speed limit; and kinetic energy is just the result of an object’s motion causing minute shifts in its massive rest energy!