Transformations, Instancing, Multisampling, and Distribution Ray Tracing

In this homework, I implemented transformations, instancing, multisampling, and distribution ray tracing. This rather involved assignment had many subtleties that made me bite my tongue, as well as shoot myself in the foot.

For transformations, I relied on the matrix facilities of GLM. It is not a fully featured transformation library, but it lets one build translation, rotation, and scaling matrices and provides matrix multiplication, transpose, and inverse operations. There are many other features, of course, yet these are the ones I mainly used.
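Composing an object's transformation boils down to a few GLM calls; the snippet below is a minimal sketch (the function name and the particular order of the calls are only for illustration):

```cpp
// Minimal sketch: composing translation, rotation, and scaling with GLM.
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

glm::mat4 composeTransform(const glm::vec3& translation, float angleRadians,
                           const glm::vec3& axis, const glm::vec3& scale)
{
    glm::mat4 m(1.0f);                       // start from the identity
    m = glm::translate(m, translation);      // T
    m = glm::rotate(m, angleRadians, axis);  // R
    m = glm::scale(m, scale);                // S
    return m;                                // m = T * R * S: a point is scaled, rotated, then translated
}
```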

The simple transformation scene is convenient because one can quickly decrease the resolution, play around with the transformation values, and get a clean debugging session when things go wrong.



However, in this scene there is only one transformation per object, so the matrix multiplication order, which changes with the order in which transformations are applied, is never exercised. This gives the false impression that transformations are done correctly. So I just added some scaling and a translation to the sphere, hoping everything would go fine. Instead, I encountered this:


The scene is quite simple, so I quickly figured out that I was transforming the bounding boxes of objects incorrectly. For rotations it was doing fine, but anything else just failed.
A bounding box, given its minimum and maximum points, has eight corner points, and any of them could end up defining the minimum or maximum after a series of transformations. So I laid out the eight candidates, transformed them, and picked the minimum and maximum among them to come up with the proper bounding box. It is not the tightest bounding box, so it has performance implications, but it does the job for the time being.
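In code, the fix looks roughly like the sketch below (BBox is a stand-in for my bounding box type):

```cpp
// Sketch: transform all eight corners of an AABB and rebuild min/max from them.
#include <limits>
#include <glm/glm.hpp>

struct BBox { glm::vec3 min, max; };

BBox transformBBox(const BBox& box, const glm::mat4& transform)
{
    BBox out{ glm::vec3(std::numeric_limits<float>::max()),
              glm::vec3(std::numeric_limits<float>::lowest()) };
    for (int i = 0; i < 8; ++i) {
        glm::vec3 corner((i & 1) ? box.max.x : box.min.x,
                         (i & 2) ? box.max.y : box.min.y,
                         (i & 4) ? box.max.z : box.min.z);
        // Positions are homogenized with 1 before the transform.
        glm::vec3 p = glm::vec3(transform * glm::vec4(corner, 1.0f));
        out.min = glm::min(out.min, p);
        out.max = glm::max(out.max, p);
    }
    return out;
}
```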

In the beginning, I kept meshes and their instances separate; now I have deep copies of the triangles inside the mesh instance objects because of issues with motion blur. Early on, I implemented motion blur so that, during the intersection tests, the ray time is used to compute the new center of a sphere or point on a triangle. I did not want to pass an extra argument to the intersection functions, nor to put the velocity information into the ray or ray-hit object before handing it to a shape. The resulting implementation is a performance nightmare at best; in the current state of things, I wish I could come up with a better design.

While implementing the motion blur, I figured out that I was not correctly calculating the normals of transformed objects. Non-transformed objects' normals were just fine, but the transformed objects looked slightly off:

The reflection on the mirror object is obviously wrong. While transforming the ray with the inverse matrix, one should be careful that position vectors are homogenized with a 1 and directions with a 0. Also, the upper 3x3 part of the inverse transpose of the transformation matrix is what should be used when transforming the normal back to world coordinates. For the intersection point, multiplying with the forward transformation of the object suffices, again homogenized with 1 as the fourth component. Finally, the distance parameter from the intersection test (I named it t) should be carefully recalculated. The mesh instance object is given an additional material parameter so that, if it is set, the instance material is used in shading. Another subtle detail is that the intersection tests for shadow rays may or may not happen in an object's local space, depending on whether it has a transformation. If there is a transformation, the distance between the object and the light should be converted as if the test happens in object space. It is quite straightforward to take the light position from the shadow ray and use the inverse transformation to compute the local-space point; however, it may take forever to figure that out!
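Condensed into code, the bookkeeping looks roughly like the sketch below; the Ray and Hit structs and the recomputation of t are illustrative rather than my exact implementation:

```cpp
// Sketch: intersecting a transformed object by working in its local space.
#include <glm/glm.hpp>

struct Ray { glm::vec3 origin, direction; };
struct Hit { glm::vec3 point, normal; float t; bool found; };

Hit intersectTransformed(const Ray& worldRay, const glm::mat4& transform,
                         const glm::mat4& inverseTransform)
{
    // Positions are homogenized with 1, directions with 0.
    Ray localRay;
    localRay.origin    = glm::vec3(inverseTransform * glm::vec4(worldRay.origin, 1.0f));
    localRay.direction = glm::vec3(inverseTransform * glm::vec4(worldRay.direction, 0.0f));

    Hit hit{};  // ... intersect the untransformed shape with localRay here ...
    if (!hit.found) return hit;

    // The intersection point goes back with the forward transform (w = 1).
    hit.point = glm::vec3(transform * glm::vec4(hit.point, 1.0f));

    // Normals use the transpose of the inverse, upper 3x3 only.
    hit.normal = glm::normalize(glm::transpose(glm::mat3(inverseTransform)) * hit.normal);

    // The distance parameter t must be recomputed against the world-space ray.
    hit.t = glm::length(hit.point - worldRay.origin) / glm::length(worldRay.direction);

    return hit;
}
```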


For multisampling, I implemented jittered sampling with box filtering, which is the most straightforward approach. Unfortunately, I was sloppy at the beginning and precomputed the primary rays for all samples of all pixels up front. That part of the code desperately requires refactoring. The only positive is that this approach made it convenient to verify that the jittered samples were generated correctly. Given the resulting cache misses, though, this implementation is only acceptable in an educational setting...
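For reference, here is a minimal sketch of the sampling loop, assuming an n x n jittered grid per pixel and a plain average as the box filter; the trace/generateRay calls are placeholders:

```cpp
// Sketch: jittered sampling over one pixel with box filtering.
#include <random>
#include <glm/glm.hpp>

glm::vec3 renderPixel(int x, int y, int n /* samples per pixel = n * n */)
{
    static thread_local std::mt19937 rng(std::random_device{}());
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);

    glm::vec3 color(0.0f);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            // One random sample inside each sub-pixel cell.
            float sx = (i + uniform(rng)) / n;
            float sy = (j + uniform(rng)) / n;
            // color += trace(generateRay(x + sx, y + sy));  // placeholders
        }
    }
    return color / float(n * n);  // box filter: plain average of the samples
}
```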

With multisampling in place, it is easy to come up with glossy materials. One just has to create an orthonormal basis around the reflection direction at the intersection point of a glossy object, generate random numbers between -0.5 and 0.5, and perturb the reflection direction with them. With minimal effort, one can have some eye candy:
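A rough sketch of that perturbation, assuming the basis is built around the reflection direction r and the material exposes a roughness parameter:

```cpp
// Sketch: glossy reflection by jittering the reflection direction.
#include <random>
#include <glm/glm.hpp>

glm::vec3 perturbReflection(const glm::vec3& r, float roughness, std::mt19937& rng)
{
    std::uniform_real_distribution<float> uniform(-0.5f, 0.5f);

    // Copy r and set its smallest-magnitude component to 1 to get a helper
    // vector that is guaranteed not to be parallel to r.
    glm::vec3 rPrime = r;
    glm::vec3 a = glm::abs(r);
    if (a.x <= a.y && a.x <= a.z)      rPrime.x = 1.0f;
    else if (a.y <= a.x && a.y <= a.z) rPrime.y = 1.0f;
    else                               rPrime.z = 1.0f;

    // Orthonormal basis around r.
    glm::vec3 u = glm::normalize(glm::cross(rPrime, r));
    glm::vec3 v = glm::normalize(glm::cross(r, u));

    // Jitter r inside a square oriented by (u, v), scaled by the roughness.
    return glm::normalize(r + roughness * (uniform(rng) * u + uniform(rng) * v));
}
```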


Also, depth of field can be simulated by coming up with a simplified model of an aperture, as discussed in class.
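A sketch of what that looks like, under my simplified model: jitter the ray origin over a square aperture and aim the new ray at the point where the original ray meets the focal plane. The camera's right/up vectors (u, v), apertureSize, and focusDistance are assumed parameters:

```cpp
// Sketch: simple depth of field by sampling a square aperture.
#include <random>
#include <glm/glm.hpp>

struct Ray { glm::vec3 origin, direction; };

Ray depthOfFieldRay(const Ray& primary, const glm::vec3& u, const glm::vec3& v,
                    float apertureSize, float focusDistance, std::mt19937& rng)
{
    std::uniform_real_distribution<float> uniform(-0.5f, 0.5f);

    // Point that the pinhole ray would hit on the focal plane.
    glm::vec3 focusPoint = primary.origin + focusDistance * primary.direction;

    // Random origin on the aperture around the original ray origin.
    glm::vec3 origin = primary.origin
                     + apertureSize * (uniform(rng) * u + uniform(rng) * v);

    return Ray{ origin, glm::normalize(focusPoint - origin) };
}
```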

While implementing depth of field, I found this spheres scene quite useful because it is the first scene that involves multiple transformations. That is how I figured out that I was multiplying the matrices in the reverse order, as shown below:
  
In these two images, there are only the silhouettes of the bounding boxes of the spheres, whose transformations are applied in the reverse order.

In the two images above, the bounding boxes of the spheres and the meshes just clash, causing the shadow rays and intersection tests to behave oddly.

The tap water scene involves smooth shading, which is also straightforward to implement. For each vertex of the triangle, one defines the vector from the intersection point to that vertex. Then, by taking cross products of these vectors for subsequent vertices, one can compute the area of each sub-triangle formed by the intersection point and a pair of the triangle's vertices. These areas are later used as weighting factors when calculating the normal as a weighted sum of the vertex normals. If one does not normalize the final result, the tap will pour blue water; it is one of the miracles of this scene.
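The interpolation itself fits in a few lines; below is a sketch where a, b, c are the triangle vertices, na, nb, nc their normals, and p the intersection point:

```cpp
// Sketch: smooth shading via area-weighted vertex normals.
#include <glm/glm.hpp>

glm::vec3 interpolateNormal(const glm::vec3& p,
                            const glm::vec3& a, const glm::vec3& b, const glm::vec3& c,
                            const glm::vec3& na, const glm::vec3& nb, const glm::vec3& nc)
{
    // Areas of the sub-triangles opposite each vertex (the 1/2 factor cancels out).
    float wa = glm::length(glm::cross(b - p, c - p));
    float wb = glm::length(glm::cross(c - p, a - p));
    float wc = glm::length(glm::cross(a - p, b - p));

    // Forgetting this final normalization is how the tap pours blue water.
    return glm::normalize(wa * na + wb * nb + wc * nc);
}
```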


The dynamic dragon seemed to be taking too much time, but later I figured out there was a weird issue. After spending a while at around 90% CPU, the ray tracer process was just hanging, yet I saw no PNG output. I was too afraid of losing the data, but when I hit the enter key on the console, it wrote the PNG file and shut down. The console application does not take any input from the user, so this is quite absurd. For some reason, the process just hung, and my keystroke got it back to work.


The metal glass plates scene is great for benchmarking and testing purposes. At first, there was no glossiness on the mirror object and there was a greenish shade on the dielectric materials. I figured out that I had applied Beer's Law on the wrong side of the equations. A lot of the time during debugging, I single out a pixel by putting a breakpoint condition on it and stepping line by line to figure out the underlying cause. In the past, this method seemed tedious to me, but now I have come to appreciate that it saves more time than it consumes.
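For completeness, the corrected attenuation is essentially the following sketch, assuming an RGB absorption coefficient and the distance the ray travels inside the dielectric:

```cpp
// Sketch: Beer's Law attenuation of the transmitted radiance.
#include <glm/glm.hpp>

glm::vec3 attenuate(const glm::vec3& color, const glm::vec3& absorption, float distance)
{
    // The transmitted color is multiplied by exp(-c * d), component-wise.
    return color * glm::exp(-absorption * distance);
}
```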

 
The render times for these scenes are as follows:
simple_transform.xml: 0.001898 seconds
cornellbox_boxes_dynamic.xml: 132.183 seconds
metal_glass_plates.xml: 18.6802 seconds
spheres_dof.xml: 3.92529 seconds
cornellbox_brushed_metal.xml: 38.9784 seconds
tap_0200.xml: 47.4469 seconds
dragon_dynamic.xml: 1+ hours

To wrap up, here are some key insights:
1) Depending on the GLM version, the functions may or may not take radians. There is even a macro (GLM_FORCE_RADIANS) for whether or not to force radians in calculations.
2) While creating the transformation matrices, make sure positions are homogenized with 1 as the fourth component.
3) For normals, take the upper 3x3 part by merely casting the inverse transformation matrix to glm::mat3x3, then take the transpose with glm::transpose before multiplying it with the normal in local space.
4) Directions should be homogenized with 0 as their fourth component.
5) GLM stores its matrices in column-major order.
6) For instances, one should plan ahead how to handle motion blur later on. Carelessly diving into the implementation resulted in deep copies of the meshes.
7) The bounding boxes should have an allowance for the motion blur component so that they cover both the starting and ending positions of the object (see the sketch after this list).
8) In my implementation, I considered the motion blur component to be in world space. This requires bringing it back to the object's local space for intersection tests.
9) To apply the transformations in a given order, the matrix multiplication order is reversed.
10) To create an orthonormal basis, one can take a copy of the reflection direction and set its smallest-magnitude component to 1. Crossing the reflection direction with that copy gives the first basis vector; crossing the first basis vector with the reflection direction again yields the second.
11) After creating the orthonormal basis, the perturbed reflection direction is the normalized sum of the reflection direction and the two basis vectors, where each basis vector is multiplied by a random number between -0.5 and 0.5 as well as by the roughness parameter of the material.
12) In smooth shading, the normal at each point on the triangle is a weighted sum of the vertex normals, with weights given by the sub-triangle areas formed around the intersection point.
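As a small illustration of item 7, here is a sketch of how a bounding box can be grown to cover the whole motion blur displacement (given in world space):

```cpp
// Sketch: expand a bounding box so it covers the start and end of the motion.
#include <glm/glm.hpp>

struct BBox { glm::vec3 min, max; };

BBox expandForMotionBlur(const BBox& box, const glm::vec3& motionBlur)
{
    // The displaced box at the end of the shutter interval...
    BBox moved{ box.min + motionBlur, box.max + motionBlur };
    // ...unioned with the original box.
    return BBox{ glm::min(box.min, moved.min), glm::max(box.max, moved.max) };
}
```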
