In this project, I implemented a physically-based path tracer that renders 3D .dae scene files. In essence, my renderer sends out computed rays (simply vectors with an origin and direction, plus a number-of-bounces counter) and tracks their intersections with the scene objects. These intersections, along with the surface properties of the objects, determine the final rendered image. I began with a bare-bones approach, naively sending random rays out into the scene, then implemented a series of optimizations (and more realistic light and surface-material modeling) that produced a very functional basic renderer.
At a high level, my implementation takes in a 3D .dae file, reads command-line arguments specifying the various parameters of the render, divides the scene into tiles, and renders each tile with the path tracer. The final light calculation determines the r,g,b value of each pixel in the image.
This project was very time-consuming and difficult to debug, but I thoroughly enjoyed learning more about how renderers work, and actually getting to make one myself!
Ray Generation and Scene Intersection
Rays are simple vectors that store an origin and a direction. They are used in rendering to simulate light rays: if a ray cast from an observer intersects an object, that object is visible to the observer. For part one, I implemented a very basic form of this process.
For my implementation, I first created a function that took in an x,y pixel coordinate and estimated the integral of the irradiance over the pixel, calculated by summing samples and dividing by the number of samples to obtain an average. I then created a function to generate a ray originating at the camera: I transformed the incoming image coordinates to camera space, then converted the point to world space to get the direction of the ray. This resulted in rays beginning at the camera sensor, pointing outward into the world.
Once the rays were created, I had to implement an algorithm to detect whether or not a ray intersected a primitive object. (In this case, we worked with triangles and spheres.) For triangle intersection, I used the Möller–Trumbore intersection algorithm, which determines the distance along a ray at which it intersects the plane of a triangle. It also calculates barycentric coordinates, which define a point relative to a triangle's three vertices. If the t-value (distance along the ray) given by the Möller–Trumbore algorithm was within the ray's valid range, and the barycentric coordinates placed the intersection point inside the triangle, then the primitive was intersected and rendered onto the screen. Sphere intersection used the sphere's center position and radius to geometrically calculate whether the ray intersected the surface of the sphere.
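The core of the triangle test can be sketched as follows (a Python illustration for clarity - the renderer itself is C++, and the helper names and epsilon are my own, not the project's actual code):

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def moller_trumbore(o, d, p0, p1, p2, t_min=0.0, t_max=float("inf")):
    """Return the hit distance t along the ray, or None on a miss."""
    e1, e2 = sub(p1, p0), sub(p2, p0)
    s = sub(o, p0)
    s1, s2 = cross(d, e2), cross(s, e1)
    denom = dot(s1, e1)
    if abs(denom) < 1e-12:            # ray parallel to the triangle's plane
        return None
    t = dot(s2, e2) / denom           # distance along the ray
    b1 = dot(s1, s) / denom           # barycentric coordinate for p1
    b2 = dot(s2, d) / denom           # barycentric coordinate for p2
    # Valid hit: t within range, barycentrics inside the triangle
    if t_min <= t <= t_max and b1 >= 0.0 and b2 >= 0.0 and b1 + b2 <= 1.0:
        return t
    return None
```

Rejecting hits outside the `[t_min, t_max]` range is what lets the same routine handle both primary rays and shadow rays.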
Below are some images that I rendered with my implementation for part 1. Since the implementation was very naive, even these simple renders took significant time to complete. Also note that the shading here is simply each primitive's (x,y,z) surface normal represented as (r,g,b) pixel values.
Bounding Volume Hierarchy
To speed up the results from part 1, I implemented a Bounding Volume Hierarchy (BVH). This structure organizes the primitives in the scene into a binary tree of nested bounding boxes, which lets the renderer quickly narrow a ray down to the few primitives it might actually hit by first testing whether the ray intersects each bounding box. Each bounding volume node contains pointers to left and right children covering smaller subsets of the primitives, eventually arriving at leaves holding just a few primitives each.
I implemented two functions. The first recursively builds the bounding-box structure from the list of primitives in the mesh: it creates a bounding volume hierarchy node for the given bounding box, then creates left and right children, filling each child with the primitives that fall on its side of a split point. The split-point heuristic I used was the average of the primitives' centroids (center positions). The second function calculates whether a ray intersects a bounding box.
My implementation does the following: first, it decides if a ray has intersected a bounding box using the intersection algorithm. If it has not, the ray is discarded. Otherwise, it recursively iterates through the child nodes of the bounding box, terminating if no intersection or rendering if an intersection takes place.
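The construction step can be sketched like this (a Python illustration, not the project's actual C++ code - the centroid/bounds callbacks and the leaf_size cutoff of 4 are assumptions for the sketch):

```python
class BVHNode:
    def __init__(self, lo, hi, prims=None, left=None, right=None):
        self.lo, self.hi = lo, hi          # bounding-box min/max corners
        self.prims = prims                 # non-None only for leaf nodes
        self.left, self.right = left, right

def build_bvh(prims, centroid, bounds, leaf_size=4):
    """Recursively build a BVH over prims.

    centroid(p) -> (x, y, z); bounds(p) -> (lo, hi) corner tuples.
    """
    boxes = [bounds(p) for p in prims]
    lo = tuple(min(b[0][a] for b in boxes) for a in range(3))
    hi = tuple(max(b[1][a] for b in boxes) for a in range(3))
    if len(prims) <= leaf_size:
        return BVHNode(lo, hi, prims=prims)
    cents = [centroid(p) for p in prims]
    # Split along the widest axis at the average centroid
    axis = max(range(3), key=lambda a: hi[a] - lo[a])
    split = sum(c[axis] for c in cents) / len(cents)
    left = [p for p, c in zip(prims, cents) if c[axis] < split]
    right = [p for p, c in zip(prims, cents) if c[axis] >= split]
    if not left or not right:              # degenerate split: make a leaf
        return BVHNode(lo, hi, prims=prims)
    return BVHNode(lo, hi,
                   left=build_bvh(left, centroid, bounds, leaf_size),
                   right=build_bvh(right, centroid, bounds, leaf_size))
```

The degenerate-split guard matters: if every centroid lands on one side of the average, recursing would loop forever, so the node becomes a leaf instead.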
This was an impressive optimization as it increased the speed immensely, and made for a more efficient render.
At one point, I had a bug where certain faces were disappearing in my rendered image, creating a "confetti" look. I was able to fix this error by checking my boundary conditions in the intersect-bvh method.
Below is a speed comparison of the part 1 algorithm with the BVH algorithm. As you can see, the BVH algorithm rendered in a small fraction of the time taken for the part 1 algorithm. This is because of the hierarchical structure of the bounding boxes - it allows the renderer to pare down rays to ones that actually intersect the bounding boxes. This is especially fast since bounding box intersection is much easier to calculate than intersection of complex primitive objects.
Below are some images rendered using the BVH algorithm. I was able to render these images at an astounding speed, thanks to the binary tree structure!
Direct Illumination
Now comes the fun part - actually rendering realistic, physically-based scenes! To do this, we need to simulate actual light-ray behavior. So far I had created a way to visualize 3D objects, but not their real physical properties. Now, for the first time, we can play with some virtual light rays.
For part three, I implemented direct lighting in two ways: hemisphere sampling and importance sampling.
First, a material representation was needed to determine how light rays react to the surfaces of the 3D objects. I implemented a diffuse BSDF (bidirectional scattering distribution function) that takes in an incoming angle and an outgoing angle and returns f(incoming, outgoing), where f is our BSDF evaluation function.
I then implemented uniform hemisphere sampling by taking samples in a uniform hemisphere around the point of intersection between the light ray and the surface. For each ray that bounces off the surface and hits a light source, we record the incoming radiance from that light, and then use the BSDF function to compute outgoing radiance, and average across all these samples to get our per-pixel values.
My hemisphere-sampling function worked, but it could be improved greatly in both accuracy and speed, so I next implemented importance sampling. Rather than sampling in a uniform hemisphere around the hit point on the surface, this directly samples each light: it loops through all the lights in the scene; if a light is a delta light, only one sample needs to be taken, and otherwise we loop num_samples times, accumulating the sampled radiance and converting it to outgoing radiance using the same BSDF function as before.
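A minimal sketch of the hemisphere-sampling estimator (Python for clarity; the bsdf_f and light_radiance callbacks are stand-ins for the real BSDF evaluation and shadow-ray query, not the project's actual interfaces):

```python
import math
import random

def estimate_direct_hemisphere(hit_p, n, bsdf_f, light_radiance, n_samples):
    """Monte Carlo estimate of direct lighting via uniform hemisphere sampling.

    n: unit surface normal at hit_p.
    bsdf_f(wi): BSDF value for an incoming direction (diffuse: albedo / pi).
    light_radiance(p, wi): radiance arriving at p from direction wi
                           (zero when the shadow ray hits no light).
    """
    pdf = 1.0 / (2.0 * math.pi)            # uniform pdf over the hemisphere
    total = 0.0
    for _ in range(n_samples):
        # Draw a uniform direction on the sphere, then flip it into the
        # hemisphere around n
        while True:
            wi = [random.gauss(0.0, 1.0) for _ in range(3)]
            norm = math.sqrt(sum(x * x for x in wi))
            if norm > 1e-9:
                break
        wi = [x / norm for x in wi]
        cos_theta = sum(a * b for a, b in zip(wi, n))
        if cos_theta < 0.0:
            wi = [-x for x in wi]
            cos_theta = -cos_theta
        # Reflection-equation integrand divided by the sampling pdf
        total += bsdf_f(wi) * light_radiance(hit_p, wi) * cos_theta / pdf
    return total / n_samples
```

For a constant unit-radiance environment and an albedo-1 diffuse BSDF, this estimate converges to 1, which is a handy sanity check on the cos/pdf bookkeeping.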
Below you can see the difference between uniform and importance sampling with 256 samples per pixel:
As you can see, the hemisphere sampling is not quite as "smart" of an algorithm as the importance sampling. By intelligently selecting samples based on whether they hit a light, importance sampling renders more quickly and less noisily. While increasing the sample count of hemisphere sampling can eventually make an image that converges to the correct result, importance sampling obtains this same image much more quickly.
Below are renderings of the spheres scene with different numbers of light rays, and one sample per pixel. Clearly, the more light rays there are, the smoother and less noisy the image appears.
An interesting bug I experienced when implementing this function created an almost depth-map effect. When I rendered, all pixels came out either white or black - no in-between. This was due to values getting divided by zero, thanks to some faulty variable multiplication; this pushed values to infinity, causing all light rays that hit the scene to turn white.
Below are some more images showing direct lighting - as you can see, it tends to come out very believably on isolated objects where the scene lighting is not obvious in the final shot.
Global Illumination
To extend the path tracer to allow for indirect illumination, I needed to more closely model how real light behaves. Real light doesn't just bounce once and disappear - it bounces around until it is absorbed. I modeled this by probabilistically simulating the likelihood that a ray "terminates": by querying a coin-flip helper function at each bounce, I could decide whether the ray would continue.
My implementation first called the direct illumination function, then recursively called a function to estimate each subsequent bounce of light. This function took a sample from the BSDF, then obtained the sample's reflectance and determined using the coin-flip method whether or not the ray would terminate. If the ray didn't terminate, and the ray had bounced less than the maximum allowed number of bounces (specified by program argument), then the new bounce ray was created, and its radiance was added by making a recursive call.
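The coin-flip (Russian roulette) termination can be illustrated as follows (a Python sketch; the 0.65 continuation probability and the bounce_light callback are assumptions for illustration, not the project's actual values):

```python
import random

def radiance(depth, max_depth, bounce_light, p_continue=0.65):
    """Radiance gathered at this bounce plus a randomly terminated recursion.

    bounce_light(depth): radiance contribution gathered at this bounce.
    Dividing the recursive term by p_continue keeps the estimator unbiased
    despite the random ("coin flip") termination.
    """
    L = bounce_light(depth)
    if depth + 1 < max_depth and random.random() < p_continue:
        L += radiance(depth + 1, max_depth, bounce_light, p_continue) / p_continue
    return L
```

The division by p_continue is the key step: terminated paths contribute nothing, so surviving paths must be weighted up for the expected value to match the full sum over bounces.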
Here are some images rendered using my global illumination implementation:
Below is a comparison of direct only versus indirect only. I obtained the second image by taking the at-least-one-bounce radiance and subtracting the zero-bounce radiance.
Below are some images comparing different maximum ray depths in the scene. As you can see, the main difference comes from allowing the light to bounce one extra time. With diffuse materials the subsequent bounces don't make much of a difference, but with a translucent material the effect would be very clear.
Below are images comparing different samples-per-pixel rates. As you can see, increasing the samples per pixel drastically improves the clarity of the image and reduces noise.
Adaptive Sampling
In part 5, I gave the path tracer the ability to adjust the number of samples per pixel depending on how quickly each pixel converged to a final r,g,b value. Areas of the image with higher contrast or sharper detail got a higher sampling rate, as they were more difficult to render accurately, while areas with less detail could afford a lower sampling rate because they converged very quickly. This allows for a render that is both less noisy and more efficient.
To accomplish this task, I periodically checked whether each pixel's illuminance had converged: I calculated the mean and variance of the samples so far, and the standard deviation from the variance. If a confidence interval around the pixel's average illuminance fell within a given tolerance of the mean, sampling for that pixel stopped, because its value had converged. This was perhaps the most difficult part to debug - I got a completely solid red sampling-rate image for a while. It turned out I was overwriting the real values, so although they were computed, they never made it into the image.
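A sketch of the convergence test (Python; the 1.96 factor corresponds to a 95% confidence interval, and the 5% tolerance is an assumed default rather than the project's exact setting):

```python
import math

def has_converged(illums, tolerance=0.05):
    """95%-confidence convergence test on a pixel's illuminance samples.

    The pixel has converged when the confidence-interval half-width
    I = 1.96 * sigma / sqrt(n) falls within tolerance * mean.
    """
    n = len(illums)
    if n < 2:
        return False
    s1 = sum(illums)
    s2 = sum(x * x for x in illums)
    mean = s1 / n
    var = max((s2 - s1 * s1 / n) / (n - 1), 0.0)   # guard tiny round-off
    interval = 1.96 * math.sqrt(var / n)
    return interval <= tolerance * mean
```

Keeping the running sums s1 and s2 (rather than the full sample list) is what makes this check cheap enough to run as samples accumulate.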
Mirror and Glass Materials
In this part of the project, I implemented reflective and refractive materials. The mirror material approximates a perfect specular reflection, where light that makes contact with the surface bounces off at an equal angle on the opposite side of the surface normal.
To get the reflection of a surface, I used a function that took an outgoing angle as input (remember in path tracing, we're working backwards from the camera to the light source), and reflected it around the object's surface normal to get the incoming angle.
To get the refraction at a surface, I implemented a refraction function that took an outgoing angle as input and returned an incoming angle based on the object's IOR (index of refraction, a number associated with the optical density of the material). I used Snell's law to detect total internal reflection, which terminates the ray. I then used Schlick's approximation to decide whether a ray would reflect off the surface of the glass or refract into (or out of) the glass material.
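The refraction test and Schlick's approximation can be sketched as follows (Python, written in world-space vector form rather than the renderer's object-space convention; the function names are mine):

```python
import math

def refract(d, n, eta):
    """Refract unit direction d through unit normal n (facing against d).

    eta = n1 / n2 is the ratio of indices of refraction.
    Returns the refracted direction, or None on total internal reflection.
    """
    cos_i = -sum(a * b for a, b in zip(d, n))
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)   # Snell's law, squared
    if sin2_t > 1.0:
        return None                              # total internal reflection
    cos_t = math.sqrt(1.0 - sin2_t)
    k = eta * cos_i - cos_t
    return [eta * di + k * ni for di, ni in zip(d, n)]

def schlick(cos_i, n1, n2):
    """Schlick's approximation to the Fresnel reflectance."""
    r0 = ((n1 - n2) / (n1 + n2)) ** 2
    return r0 + (1.0 - r0) * (1.0 - cos_i) ** 5
```

In the glass BSDF, the value returned by schlick is compared against a uniform random number to choose between the reflected and refracted ray.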
Take a look at the below images to notice what happens as we increase the number of light bounces.
At zero bounces, we just have the light from our light source.
At one bounce, we have direct lighting everywhere in the scene (note the black spheres - this is an artifact of how the BSDFs were implemented).
At two bounces, we finally see light everywhere in the scene - light has reflected off of the reflective sphere, and notice the glass sphere is dark with a slight reflection - this is because refracted light has entered (but not exited) the refractive sphere, while the reflected light has bounced off.
By the third bounce, we see refracted light exiting the glass sphere, and by the fourth bounce, we can notice a bright spot on the ground. This is what is known as a "caustic" - light that is transmitted through a refractive object and concentrated by the curvature of the object, akin to the concentration of light when you hold a magnifying glass under a bright light.
By the fourth, fifth and hundredth bounces, notice that the concentrated light is being reflected and diffused back onto the sphere, and concentrated into a lighter spot on the blue wall. Also notice that as the bounces increase, the reflected image of the glass sphere in the reflective sphere gets lighter and lighter.
Microfacet Materials
For this part, I implemented microfacet materials, which simulate the behavior of real-life conductors. In essence, the surface of these materials is made up of many tiny microfacets, whose distribution determines how rough or smooth the material appears.
My implementation importance-sampled the BRDF, using the inversion method to draw samples from a probability distribution that approximates the Beckmann normal distribution function. Given an outgoing angle wo, we sample a half-angle h from this pdf and reflect wo about h to get our incoming angle wi.
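The inversion-method sampling can be sketched as follows (Python; theta = atan(sqrt(-alpha^2 ln(1 - u))) is the standard Beckmann inversion formula, and the function names are mine, not the project's):

```python
import math
import random

def sample_beckmann_h(alpha):
    """Sample a half vector h from a Beckmann NDF via the inversion method.

    Returned in local shading coordinates, with the surface normal at +z.
    """
    u1, u2 = random.random(), random.random()
    theta = math.atan(math.sqrt(-alpha * alpha * math.log(1.0 - u1)))
    phi = 2.0 * math.pi * u2
    st = math.sin(theta)
    return [st * math.cos(phi), st * math.sin(phi), math.cos(theta)]

def sample_wi(wo, alpha):
    """Importance-sample an incoming direction by reflecting wo about h."""
    h = sample_beckmann_h(alpha)
    d = sum(a * b for a, b in zip(wo, h))
    return [2.0 * d * hi - woi for woi, hi in zip(wo, h)]
```

With a small alpha the sampled half vectors cluster tightly around the normal, so wi stays close to the mirror direction - exactly the behavior visible in the smooth-dragon renders.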
Below are renderings of the golden dragon with alpha = 0.005, 0.05, 0.25 and 0.5. Note that as alpha increases, the material becomes rougher and more diffuse, like brushed metal - the alpha value corresponds to the roughness of the material.
Below are two images of the bunny rendered using cosine hemisphere sampling and my importance sampling, with 64 samples per pixel and max ray depth 5. Notice how much more accurate the second image looks - although they are the same number of samples, the first image is more noisy and much darker than it should be. Although with enough samples it would converge at a correct image, the importance sampling arrives there much more quickly.
Environment Light
In this part, I implemented 360-degree environment maps to simulate real-world lighting on a scene. This method maps a 2D image onto the inside of a sphere and uses the color values at each point on the image to cast light inward onto the scene. It is a very easy way to achieve highly realistic results, and is commonly used in the world of CG rendering.
I first implemented uniform sampling of the environment light: given a light ray, I sampled a random direction on the light sphere and returned the resulting radiance. This was simple, but noisy. (See the images below.) I then implemented importance sampling of the sphere, because most of the light an environment casts onto a scene comes from its areas of high radiance.
To implement importance sampling, I created a probability map based on the local radiance values of the environment map; it determines how likely a light ray is to sample each point in the map. From it I built the marginal distribution - the cumulative distribution function over the rows of the map - and then, using the definition of conditional probability, the conditional distribution, which is the cumulative distribution function for sampling any point within a given row. After sampling these functions to obtain (x,y) coordinates, I converted the coordinates into a direction vector and returned the radiance value at that point on the map.
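The two-level CDF sampling can be sketched like this (Python; assumes every row of the map carries some radiance, and the function names are hypothetical):

```python
import bisect
import random

def build_cdfs(radiance):
    """Build the marginal row CDF and the per-row conditional CDFs.

    radiance: 2D list [rows][cols] of positive weights (every row nonzero).
    """
    row_sums = [sum(row) for row in radiance]
    total = sum(row_sums)
    marginal, acc = [], 0.0
    for s in row_sums:
        acc += s / total
        marginal.append(acc)
    conditionals = []
    for row, s in zip(radiance, row_sums):
        cdf, a = [], 0.0
        for v in row:
            a += v / s
            cdf.append(a)
        conditionals.append(cdf)
    return marginal, conditionals

def sample_pixel(marginal, conditionals):
    """Draw an (x, y) pixel with probability proportional to its radiance:
    first a row from the marginal CDF, then a column from that row's
    conditional CDF (inverse-CDF sampling)."""
    y = bisect.bisect_left(marginal, random.random())
    x = bisect.bisect_left(conditionals[y], random.random())
    return x, y
```

Bright pixels end up owning wide slices of the CDFs, so they are sampled in proportion to the radiance they contribute.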
Below are images of the copper and diffuse bunny rendered using uniform and importance sampling. Notice the sheer amount of noise in the uniform sampling images compared to the importance sampling images, although they are all 4 samples per pixel and 64 samples per light. The importance sampling images achieve a much higher level of realism at the same sample count.
Depth of Field
In this part, I implemented a representation of a real thin lens. This contrasts with previous parts of the project, where we used a pinhole camera model and everything in the scene was in focus at once. To create out-of-focus effects, I needed to model real-life lens behavior in my renderer.
My implementation took a focal point position, which is simply where a perfectly focused ray would have intersected the scene, and calculated a ray from a sampled point on the lens toward this point.
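A sketch of the thin-lens ray generation (Python; the camera-space conventions and the aperture-as-diameter interpretation are assumptions for the sketch, not the project's exact code):

```python
import math
import random

def thin_lens_ray(sensor_dir, aperture, focal_distance):
    """Generate a camera ray through a thin lens.

    The camera sits at the origin looking down -z; sensor_dir is the unit
    direction the equivalent pinhole ray would travel. A perfectly focused
    ray passes through the point where that pinhole ray meets the focal
    plane z = -focal_distance, so every lens sample is aimed there.
    """
    # Uniformly sample a point on the circular lens opening
    r = (aperture / 2.0) * math.sqrt(random.random())
    phi = 2.0 * math.pi * random.random()
    p_lens = [r * math.cos(phi), r * math.sin(phi), 0.0]
    # Focus point: the center (pinhole) ray intersected with the focal plane
    t = -focal_distance / sensor_dir[2]
    p_focus = [t * c for c in sensor_dir]
    d = [pf - pl for pf, pl in zip(p_focus, p_lens)]
    norm = math.sqrt(sum(x * x for x in d))
    return p_lens, [x / norm for x in d]
```

With aperture 0 this degenerates to the pinhole camera; widening the aperture spreads the ray origins across the lens disk, which is what blurs everything off the focal plane.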
Below are some renders of the golden dragon with aperture 0.1 and varying focal lengths (2.4, 2.5, 2.6, and 2.7). The images below all have 128 samples per pixel.
Since it may be difficult to discern the differences between the above images, I've made a gif of them below:
Below are some renders of the golden dragon with focal length 2.5 and varying apertures (0.15, 0.10, 0.07, 0.05 and 0.0). The images below all have 128 samples per pixel.
Below is an animated gif of the above images, so the difference can be seen more clearly.