Circle of Confusion From The Depth Buffer
The first step in a depth-of-field postprocessing filter is to find the radius of the circle of confusion (CoC) for each pixel on screen. This isn’t difficult—the lens equation for a realistic camera is simple and easy to implement in a pixel shader. But there’s a neat simplification of it that I found the other day and wanted to share.
To get the CoC for a pixel, you’d typically write a pixel shader that samples the depth buffer and plugs that value into the lens equation. But the depth buffer usually stores post-projective depth using a reciprocal mapping, so you must convert that to linear depth in world-space units before using it in the lens equation. This conversion involves a reciprocal, and the lens equation involves another reciprocal. It turns out that you can put the two equations together and the reciprocals cancel out! The CoC is therefore just a linear scale and bias of the depth buffer value.
First, let me review converting the depth buffer value to linear depth. Let $n$ and $f$ be the
world-space depths of the near and far frustum planes. Consulting a standard projection-matrix
formula, such as the one implemented by D3DXMatrixPerspectiveLH, we see that the formula for
the depth buffer value $d$ in terms of world-space depth $z$ is:
$$
d = \frac{f}{f - n} - \frac{nf}{f - n} \frac{1}{z}
$$
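As a quick sanity check, this mapping can be evaluated numerically. The following is a small Python sketch (the function name and the sample near/far values are my own, chosen for illustration); it confirms that the near plane maps to $d = 0$ and the far plane to $d = 1$, as a D3D-style depth buffer expects.

```python
def depth_buffer_value(z, n, f):
    """Post-projective depth d for world-space depth z (D3D-style projection)."""
    return f / (f - n) - (n * f / (f - n)) / z

n, f = 0.1, 100.0
print(depth_buffer_value(n, n, f))  # near plane: d is 0 (up to floating-point error)
print(depth_buffer_value(f, n, f))  # far plane:  d is 1 (up to floating-point error)
```

Note also the familiar nonlinearity: most of the $[0, 1]$ range of $d$ is spent on depths close to the near plane.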
Inverting this equation to get $z$ in terms of $d$ gives:
$$
z = \frac{1}{\frac{1}{n} - \frac{f - n}{nf} d}
$$
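The inversion can be checked with a round trip: push a depth through the forward mapping, then recover it. A minimal Python sketch (helper name and sample values are mine):

```python
def linear_depth(d, n, f):
    """Recover world-space depth z from the depth-buffer value d."""
    return 1.0 / (1.0 / n - (f - n) / (n * f) * d)

n, f = 0.1, 100.0
z = 7.0
d = f / (f - n) - (n * f / (f - n)) / z  # forward mapping from above
print(linear_depth(d, n, f))  # recovers z = 7.0 (up to floating-point error)
```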
From Wikipedia, the equation for the CoC radius $c$ of a point at depth $z$ is:
$$
c = s \left|1 - \frac{z_\text{focus}}{z} \right|,
$$
where $z_\text{focus}$ is the depth at which the camera is in perfect focus, and $s$ is a
constant scale factor related to the camera setup (see the Wikipedia page for details). Putting
together the last two equations gives
$$
\begin{aligned}
c &= s \left|1 - z_\text{focus} \left(\frac{1}{n} - \frac{f - n}{nf} d \right) \right| \\
&= \left| s\left(1 - \frac{z_\text{focus}}{n} \right) +
\left(s z_\text{focus} \frac{f - n}{nf} \right) d \right|
\end{aligned}
$$
This is just (the absolute value of) a linear scale-bias of $d$, since all the other quantities are
constants with respect to the image. So it’s very easy and cheap to compute a physical camera CoC
directly from the post-projective depth buffer!
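To make the result concrete, here is a Python sketch of both routes (all names and sample camera values are my own, for illustration): the lens equation applied to linearized depth, and the single scale-bias applied directly to the depth-buffer value. In a shader, the scale and bias would be precomputed on the CPU and the per-pixel work reduces to one multiply-add and an absolute value.

```python
def coc_from_z(z, z_focus, s):
    """Reference: the lens-equation form, c = s * |1 - z_focus / z|."""
    return s * abs(1.0 - z_focus / z)

def coc_from_depth(d, n, f, z_focus, s):
    """CoC radius straight from the depth-buffer value: |scale * d + bias|."""
    scale = s * z_focus * (f - n) / (n * f)
    bias = s * (1.0 - z_focus / n)
    return abs(scale * d + bias)

n, f, z_focus, s = 0.1, 100.0, 10.0, 2.0
z = 25.0
d = f / (f - n) - (n * f / (f - n)) / z  # depth-buffer value for this z
# The two routes agree to within floating-point error:
print(coc_from_z(z, z_focus, s))
print(coc_from_depth(d, n, f, z_focus, s))
```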
Of course, you may not wish to match a real-world camera exactly—we have the freedom to calculate CoC with whatever algorithm we like, and can even do things that real-world cameras can’t, like have two different focal planes in one image! But if you do want to simulate the CoC of an actual camera, this is a convenient formula to have.