PathTracer Part 2

Henry Xu


In this project, I explored additional concepts in ray tracing (previous concepts can be found here), including mirror and glass materials, microfacet materials, environment lights, and depth of field effects. Seeing each new feature come together made for an incredible experience, and further deepened my appreciation of computer graphics' ability to model real world phenomena.

Part 1: Mirror and Glass Materials

To model mirror and glass materials, we employed a bit of physics intuition and a touch of math.

Mirror materials rely purely on reflection, so after implementing a helper function to reflect about the object normal, we simply called it in the sampling function, set the pdf to 1, and returned the reflectance divided by abs_cos_theta(*wi). The division cancels out the multiplication done in at_least_one_bounce_radiance, since perfect mirrors do not exhibit any Lambertian falloff.
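A minimal sketch of that logic, assuming the starter code's Vector3D/Spectrum types and the sample_f signature used throughout the project (member names are illustrative):

    // Reflect wo about the surface normal. In the local shading frame the
    // normal is (0, 0, 1), so reflecting just negates the x and y components.
    void BSDF::reflect(const Vector3D& wo, Vector3D* wi) {
      *wi = Vector3D(-wo.x, -wo.y, wo.z);
    }

    // Perfect mirror: always reflect (pdf = 1) and divide by the cosine term
    // so that the multiplication in at_least_one_bounce_radiance cancels,
    // since a perfect mirror exhibits no Lambertian falloff.
    Spectrum MirrorBSDF::sample_f(const Vector3D& wo, Vector3D* wi, float* pdf) {
      reflect(wo, wi);
      *pdf = 1.0f;
      return reflectance / abs_cos_theta(*wi);
    }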

With glass materials, we also had to take refraction into account. To refract a ray, we took advantage of the object coordinate frame to use Snell's Law in spherical coordinates, giving us the equations

w_i.x = -eta * w_o.x
w_i.y = -eta * w_o.y
w_i.z = ∓ sqrt(1 - eta^2 * (1 - w_o.z^2))    (sign opposite to that of w_o.z)

Note that because we are given only one ior value, eta = 1/ior when entering the material and eta = ior when exiting. We can determine whether we're entering or exiting by the sign of w_o's z-coordinate: if it is positive, we're entering; otherwise, we're exiting. Finally, to model glass, we must know the ratio of the reflection energy to the refraction energy. The Fresnel equations involved in the ratio's calculation are a little involved, so we use Schlick's approximation instead. The approximated ratio can then be used as a coin-flip probability to determine whether we reflect or refract. The resulting glass material logic looks something like the following:
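(A sketch, assuming the starter code's coin_flip helper and BSDF member names; exact details may differ.)

    // Refract wo via Snell's law in the local frame (normal = (0, 0, 1)).
    // Returns false on total internal reflection. Entering (wo.z > 0) uses
    // eta = 1/ior; exiting uses eta = ior.
    bool BSDF::refract(const Vector3D& wo, Vector3D* wi, float ior) {
      float eta = wo.z > 0 ? 1.0f / ior : ior;
      float cos2_theta_i = 1.0f - eta * eta * (1.0f - wo.z * wo.z);
      if (cos2_theta_i < 0) return false;  // total internal reflection
      float z = sqrt(cos2_theta_i);
      *wi = Vector3D(-eta * wo.x, -eta * wo.y, wo.z > 0 ? -z : z);
      return true;
    }

    Spectrum GlassBSDF::sample_f(const Vector3D& wo, Vector3D* wi, float* pdf) {
      if (!refract(wo, wi, ior)) {   // total internal reflection: must reflect
        reflect(wo, wi);
        *pdf = 1.0f;
        return reflectance / abs_cos_theta(*wi);
      }
      // Schlick's approximation of the Fresnel reflection coefficient.
      float r0 = (1.0f - ior) / (1.0f + ior);
      r0 *= r0;
      float R = r0 + (1.0f - r0) * pow(1.0f - fabs(wo.z), 5.0f);
      if (coin_flip(R)) {            // reflect with probability R
        reflect(wo, wi);
        *pdf = R;
        return reflectance * R / abs_cos_theta(*wi);
      } else {                       // refract with probability 1 - R
        float eta = wo.z > 0 ? 1.0f / ior : ior;
        *pdf = 1.0f - R;
        return transmittance * (1.0f - R) / (eta * eta) / abs_cos_theta(*wi);
      }
    }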

Screenshots
Max ray depth of 0. Note that a max ray depth of 0 is equivalent to zero bounce radiance, or light that travels directly from source to camera. Nothing else in the scene is lit at this stage.
Max ray depth of 1. With a max ray depth of 1, we now have direct lighting and can see the walls of the room. Note that we should be able to see a reflection of the light source in the spheres, but due to an implementation detail regarding how we treat delta lights, this reflection is not present.
Max ray depth of 2. With a max ray depth of 2, we now have reflection and lighting of the ceiling. Reflection occurs since we can now track the rays that are one bounce from the mirrored surfaces. Note that refraction is not present since we're unable to track the rays that hit the glass sphere and travel through it just yet.
Max ray depth of 3. With a max ray depth of 3, we now have refraction visible in the glass sphere and reflection of the glass sphere's reflected rays (i.e. what is seen of the glass sphere in the 2 ray depth case) by the mirror sphere. Refraction is now visible since 3 bounces allows us to see the rays that hit the glass sphere and travel through it. The reflection of the glass sphere's reflected rays by the mirror sphere is also possible because of the increased ray depth: source to glass (only reflective rays) to mirror to camera.
Max ray depth of 4. With a max ray depth of 4, we now have a pool of light (caustic) visible underneath the glass sphere and full reflection of the glass sphere in the mirror sphere. The former effect results from the tracking of rays that hit the glass sphere, bounce through it, and hit the ground underneath. The latter is also possible due to the increased ray depth: source to glass (now including both reflective and refractive rays) to mirror to camera.
Max ray depth of 5. With a max ray depth of 5, we now have a caustic visible on the wall. This effect results from the increased ray depth allowing for bounces to reflect off other objects in the scene before reaching the camera.
Max ray depth of 100. With a max ray depth of 100, we have a brighter image and a little more refraction of the light source in the glass sphere, but other than that, it looks similar to the image generated by a max ray depth of 5. From this, we can conclude that convergence more or less occurred with the max ray depth of 5.

Part 2: Microfacet Materials

To model microfacet materials, the devil was in the mathematical details. To start, we implemented the BSDF evaluation function as follows:

f(w_o, w_i) = F(w_i) * G(w_o, w_i) * D(h) / (4 * dot(n, w_o) * dot(n, w_i))

where h is the half vector between w_o and w_i, F is the Fresnel term, G is the shadowing-masking term, and D is the normal distribution function.

To implement D, the Normal Distribution Function, we used the Beckmann distribution:

D(h) = exp(-tan^2(theta_h) / alpha^2) / (pi * alpha^2 * cos^4(theta_h))

where theta_h is the angle between h and the macro surface normal.

To calculate the Fresnel term, we need to take an approach different from that taken in Part 1 since we're dealing with an air-conductor interface (as opposed to an air-dielectric one). To do so, we make the simplification of calculating the Fresnel terms for just the R, G, and B channels as follows:

R_s = ((eta^2 + k^2) - 2 * eta * cos(theta) + cos^2(theta)) / ((eta^2 + k^2) + 2 * eta * cos(theta) + cos^2(theta))
R_p = ((eta^2 + k^2) * cos^2(theta) - 2 * eta * cos(theta) + 1) / ((eta^2 + k^2) * cos^2(theta) + 2 * eta * cos(theta) + 1)
F = (R_s + R_p) / 2

Note that eta and k together form the complex index of refraction of the conductor: eta is its real part, and k (the extinction coefficient) is its imaginary part, each specified per color channel.
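Putting these pieces together, the evaluation code looks roughly like this (a sketch; Lambda, the Beckmann shadowing function used inside G, is an assumed helper):

    // Microfacet BRDF: f = F(wi) * G(wo, wi) * D(h) / (4 * (n.wo) * (n.wi)).
    // In the local frame n = (0, 0, 1), so n.wo = wo.z and n.wi = wi.z.
    Spectrum MicrofacetBSDF::f(const Vector3D& wo, const Vector3D& wi) {
      if (wo.z <= 0 || wi.z <= 0) return Spectrum();       // below the surface
      Vector3D h = (wo + wi).unit();                       // half vector
      double G = 1.0 / (1.0 + Lambda(wi) + Lambda(wo));    // shadowing-masking
      return F(wi) * (G * D(h) / (4.0 * wo.z * wi.z));
    }

    // Beckmann normal distribution function; h.z = cos(theta_h).
    double MicrofacetBSDF::D(const Vector3D& h) {
      double cos2 = h.z * h.z;
      double tan2 = (1.0 - cos2) / cos2;                   // tan^2(theta_h)
      return exp(-tan2 / (alpha * alpha)) / (PI * alpha * alpha * cos2 * cos2);
    }

    // Fresnel term for one color channel of an air-conductor interface.
    static double fresnel_channel(double eta, double k, double c) {
      double e2k2 = eta * eta + k * k;
      double Rs = (e2k2 - 2.0 * eta * c + c * c) / (e2k2 + 2.0 * eta * c + c * c);
      double Rp = (e2k2 * c * c - 2.0 * eta * c + 1.0) /
                  (e2k2 * c * c + 2.0 * eta * c + 1.0);
      return 0.5 * (Rs + Rp);
    }

    // eta and k hold one index-of-refraction pair per color channel.
    Spectrum MicrofacetBSDF::F(const Vector3D& wi) {
      double c = wi.z;                                     // cos(theta_i)
      return Spectrum(fresnel_channel(eta.r, k.r, c),
                      fresnel_channel(eta.g, k.g, c),
                      fresnel_channel(eta.b, k.b, c));
    }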

Finally, the importance sampling function needs to be changed due to the use of the Beckmann distribution. The equations to sample from the Beckmann NDF are as follows:

theta = arctan(sqrt(-alpha^2 * ln(1 - r_1)))
phi = 2 * pi * r_2

Note that r_1 and r_2 are two random numbers between 0 and 1.

Using theta and phi, we can calculate the sampled normal h:

h = (cos(phi) * sin(theta), sin(phi) * sin(theta), cos(theta))

To calculate the pdf, we can use the following:

p_h(h) = p_theta(theta) * p_phi(phi) / sin(theta)
p(w_i) = p_h(h) / (4 * dot(w_i, h))

where:

p_theta(theta) = (2 * sin(theta) / (alpha^2 * cos^3(theta))) * exp(-tan^2(theta) / alpha^2)
p_phi(phi) = 1 / (2 * pi)
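Assembled, the importance sampling routine looks roughly like this (a sketch assuming a uniform 2D sampler and the helpers above):

    // Importance-sample the Beckmann NDF: draw (theta, phi), build the half
    // vector h, reflect wo about h to get wi, then convert the pdf over h
    // into a pdf over wi.
    Spectrum MicrofacetBSDF::sample_f(const Vector3D& wo, Vector3D* wi, float* pdf) {
      Vector2D r = sampler.get_sample();      // r.x, r.y uniform in [0, 1)
      double theta = atan(sqrt(-alpha * alpha * log(1.0 - r.x)));
      double phi = 2.0 * PI * r.y;

      Vector3D h(cos(phi) * sin(theta), sin(phi) * sin(theta), cos(theta));
      *wi = 2.0 * dot(wo, h) * h - wo;        // reflect wo about h
      if (wi->z <= 0) { *pdf = 0.0f; return Spectrum(); }

      double p_theta = (2.0 * sin(theta) / (alpha * alpha * pow(cos(theta), 3)))
                       * exp(-pow(tan(theta), 2) / (alpha * alpha));
      double p_phi = 1.0 / (2.0 * PI);
      double p_h = p_theta * p_phi / sin(theta);
      *pdf = p_h / (4.0 * dot(*wi, h));
      return f(wo, *wi);
    }

Screenshots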
alpha=0.005
alpha=0.05
alpha=0.25
alpha=0.5

Note that alpha can be thought of as the roughness of the macro surface. When alpha is small, we get a glossy surface; when alpha is large, we get a diffuse one. From the equation for the Beckmann distribution, we can see that a larger value of alpha results in a larger variance of the normals, creating a more diffuse surface. Likewise, a smaller value of alpha results in a smaller variance of the normals, producing a glossier surface. We can see this effect in action in the images above (note that the glossier surfaces have more noise by nature of their more specular BSDF).


Cosine hemisphere sampling
Importance sampling

We can see that cosine hemisphere sampling is substantially noisier than importance sampling. The cosine hemisphere render gives only a vague inkling of the copper material it is trying to represent; in contrast, the bunny rendered using importance sampling is unmistakably copper. This difference can be attributed to the additional information importance sampling uses to inform its sampling process. Note that with enough samples cosine hemisphere sampling can achieve a similar look, but, as seen above, importance sampling converges much faster.


Strontium (Sr)
eta (R, G, B): 1.0691, 1.1224, 1.2229
k (R, G, B): 5.4952, 4.8741, 4.1078
alpha: 0.05

Part 3: Environment Light

Environment light furthers scene realism with the use of a light infinitely far away "that supplies incident radiance from all directions on the sphere." In other words, the light is universally present, as one might expect in a naturally lit, real-world environment.

To accomplish this feat, we employ the use of a texture map to characterize lighting intensity, and sample from it accordingly. In this part, we implemented two sampling techniques: uniform sampling and importance sampling. In the former case, we just generate a random direction on the sphere and perform bilinear interpolation on the texture map. In the latter, we use the fact that most of the energy from an environment light comes from the directions toward bright light sources to improve the results from sampling by biasing the selection of sampled directions towards such sources. By incorporating additional information into our sampling methodology, we reduce noise, especially in cases where the environment light has high variation.
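For the uniform case, a minimal sketch might look like the following (dir_to_uv and bilerp are assumed helpers that convert a direction to equirectangular texture coordinates and bilinearly interpolate the map):

    // Uniform sampling: pick a direction uniformly on the sphere, report a
    // pdf of 1/(4*pi), and look up the radiance in the texture map.
    Spectrum EnvironmentLight::sample_L(const Vector3D& p, Vector3D* wi,
                                        float* distToLight, float* pdf) const {
      *wi = sampler_uniform_sphere.get_sample();  // uniform direction on the sphere
      *distToLight = INF_D;                       // the light is infinitely far away
      *pdf = 1.0f / (4.0f * PI);                  // uniform pdf over the sphere
      return bilerp(dir_to_uv(*wi));              // bilinear lookup in the envmap
    }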

A few notes regarding importance sampling are as follows. We first build a probability map over the texels of the environment map, with each texel's probability proportional to its luminance (a sin(theta) weighting accounts for the stretching of the equirectangular projection near the poles). From this joint pdf we precompute the marginal CDF over the rows and the conditional CDFs within each row. To sample, we draw two uniform random numbers, invert the marginal and then the conditional CDF to select a texel, and convert that texel back into a direction and its corresponding pdf. A sketch of the precomputation step follows.
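(Names like envMap, pdf_envmap, marginal_y, and conds_y are illustrative; luminance comes from the texel's Spectrum::illum().)

    // Build the distributions used for importance sampling: a pdf over texels
    // proportional to luminance (weighted by sin(theta)), the marginal CDF
    // over rows, and the conditional CDFs within each row.
    void EnvironmentLight::init_distributions() {
      size_t w = envMap->w, h = envMap->h;
      pdf_envmap.resize(w * h);

      double sum = 0.0;
      for (size_t j = 0; j < h; ++j) {
        double sin_theta = sin(PI * (j + 0.5) / h);
        for (size_t i = 0; i < w; ++i) {
          pdf_envmap[j * w + i] = envMap->data[j * w + i].illum() * sin_theta;
          sum += pdf_envmap[j * w + i];
        }
      }
      for (size_t t = 0; t < w * h; ++t) pdf_envmap[t] /= sum;  // normalize

      // Marginal CDF over rows: P(row <= j).
      marginal_y.resize(h);
      double accum = 0.0;
      for (size_t j = 0; j < h; ++j) {
        for (size_t i = 0; i < w; ++i) accum += pdf_envmap[j * w + i];
        marginal_y[j] = accum;
      }

      // Conditional CDFs within each row: P(col <= i | row = j).
      conds_y.resize(w * h);
      for (size_t j = 0; j < h; ++j) {
        double row_p = marginal_y[j] - (j > 0 ? marginal_y[j - 1] : 0.0);
        double c = 0.0;
        for (size_t i = 0; i < w; ++i) {
          c += pdf_envmap[j * w + i];
          conds_y[j * w + i] = row_p > 0 ? c / row_p : 0.0;
        }
      }
    }

Sampling then inverts first marginal_y and then the appropriate row of conds_y (e.g., with std::upper_bound) to pick a texel.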

Screenshots
field.exr (used in renderings below)

field.exr probability map

Additional probability maps
ennis.exr

uffizi.exr

bunny_unlit.dae
Uniform sampling
Importance sampling

The difference is relatively subtle, but there is a fair bit more noise in the image rendered using uniform sampling that is not present in the importance sampled image. Consequently, some of the finer details of the bunny are slightly obscured in the uniformly sampled image compared to the importance sampled one.


bunny_microfacet_cu_unlit.dae
Uniform sampling
Importance sampling

Again, we have relatively subtle differences, but the uniform sampled bunny has a fair bit of extra noise not present in the importance sampled image. The difference in noise is most prominent in the brighter areas, e.g., the right ear, right cheek, and back of the bunny (above the tail).

Part 4: Depth of Field

In previous parts, we used ideal pin-hole cameras. Ideal pin-hole cameras have infinite depth of field due to their use of an eponymous pin-hole, which ensures that the camera receives only a single light ray from each point in the scene. Consequently, distance from the camera does not impact an object's sharpness. Due to the disadvantages of pin-hole cameras (such as issues with diffraction and the fact that an ideal one would essentially need an infinite exposure time), real world cameras (including human eyes) are often modelled instead as lenses with finite aperture. Such lenses render objects in focus only if they lie within a plane that is a certain focal distance away from the lens, allowing us to achieve nifty depth of field effects.

As opposed to the pin-hole model, where any point on the film receives radiance from only one point in the scene, with a thin lens model, any given point on the film can receive radiance from any point on the thin lens. Therefore, we adjust our camera ray generation to sample the thin lens accordingly.

Thin lens diagram

A few notes regarding thin lens sampling are as follows. We uniformly sample a point on the lens disk, pLens = lensRadius * sqrt(rndR) * (cos(2 * pi * rndTheta), sin(2 * pi * rndTheta), 0), where rndR and rndTheta are uniform random numbers. The ray that would pass through the center of the lens (the pinhole ray) intersects the plane of focus at a point pFocus; since all rays refracted by a thin lens converge at that point, our sampled ray simply travels from pLens toward pFocus. Finally, the ray is transformed from camera space to world space. A sketch follows.
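(A sketch assuming the starter code's camera members: hFov/vFov in degrees, the camera-to-world rotation c2w, position pos, and clip distances nClip/fClip.)

    // Thin lens camera ray: uniformly sample the lens disk, find where the
    // chief (pinhole) ray through sensor point (x, y) would hit the plane of
    // focus, and shoot the ray from the lens sample through that point.
    Ray Camera::generate_ray_for_thin_lens(double x, double y,
                                           double rndR, double rndTheta) const {
      // Camera-space direction of the pinhole ray; the sensor plane sits at
      // z = -1 and spans [-tan(hFov/2), tan(hFov/2)] x [-tan(vFov/2), tan(vFov/2)].
      Vector3D dir(tan(0.5 * hFov * PI / 180.0) * (2.0 * x - 1.0),
                   tan(0.5 * vFov * PI / 180.0) * (2.0 * y - 1.0), -1.0);

      // Uniform sample on the lens disk of radius lensRadius.
      Vector3D pLens(lensRadius * sqrt(rndR) * cos(2.0 * PI * rndTheta),
                     lensRadius * sqrt(rndR) * sin(2.0 * PI * rndTheta), 0.0);

      // Since dir.z == -1, the pinhole ray reaches the plane of focus
      // z = -focalDistance at t = focalDistance; all lens rays converge there.
      Vector3D pFocus = dir * focalDistance;

      Ray ray(pos + c2w * pLens, (c2w * (pFocus - pLens)).unit());
      ray.min_t = nClip;
      ray.max_t = fClip;
      return ray;
    }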

Screenshots
"Focus stack": Fixed aperture, varying focal distance (64 samples per pixel, 4 samples per light, max ray depth 5)
Focal distance 2.5; Aperture 0.057
Focal distance 2.6; Aperture 0.057
Focal distance 2.7; Aperture 0.057
Focal distance 2.9; Aperture 0.057
Focal distance 3.1; Aperture 0.057
Focal distance 3.3; Aperture 0.057

Notice that as the focal distance increases, the section of the scene in focus gets farther and farther away from the camera as a result of the shifting of the focal plane. In the first few images with smaller focal distances, the front of the dragon is sharp and the tail end of the dragon is blurred, whereas in the later images with larger focal distances, the front of the dragon is blurred and the tail end of the dragon is sharp (in fact the back wall is starting to come into focus at this point).


Fixed focal distance, varying aperture (64 samples per pixel, 4 samples per light, max ray depth 5)
Focal distance 2.5; Aperture 0.01
Focal distance 2.5; Aperture 0.02
Focal distance 2.5; Aperture 0.04
Focal distance 2.5; Aperture 0.08

Notice that as aperture size increases, only objects within the focal plane retain their sharpness. In this case, the only object within the focal plane is the very top of the dragon's mouth.


Thank you for reading!