Overview
This project is a peek into the world of rasterization. Through the implementation of a simple rasterizer, I explored texture mapping,
antialiasing techniques such as supersampling, hierarchical transforms, and more. These forays ultimately culminated in a functional vector graphics renderer.
Over the course of the project, I learned quite a bit about what goes on behind the scenes of a rendered image. Aliasing effects like jaggies and texture pixelation, whose absence I had taken for granted in daily life,
were eye-opening in their ability to completely change how I perceived an image. Consequently, there's something to be said for the unsung heroics of antialiasing: a job few notice when done well, and one that sticks out
almost abrasively when done poorly or not at all. And while the problem of interpolating across a triangle seemed almost intractable at first glance, barycentric coordinates make it seem almost trivial in hindsight. These moments
of enlightenment are what drive me to continue learning--not to mention the fact that I find computer graphics fascinating in general.
Section I: Rasterization
Part 1: Rasterizing single-color triangles
Rasterization is the process of taking a continuous-valued function (in this case, the triangle formed by 3 points) and discretizing it so that it can be displayed by the frame buffer, which stores data for individual pixels. I accomplished this by
determining whether a given sample point was inside or outside the triangle, and coloring the associated pixel accordingly. The sample points were defined to be the center of each pixel (e.g., if the pixel was at location (i, j), the sample
point would be (i + 0.5, j + 0.5)). Inside-triangle test details:
- Before checking any points, I first made sure that the sides were wound in the correct order (counter-clockwise).
- For each sample point p and each side represented by the two points <pi,pk> (with no loss of generality), I dotted the normal of the side vector with the vector created by <pi,p> (computationally equivalent to the cross product of the vector created by <pi,pk> and the vector created by <pi,p>). If the computations for a particular sample point and all sides of the triangle came back nonnegative, I considered the sample point inside the triangle, and colored it accordingly.
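As a concrete sketch (in C++, with illustrative names rather than the project's exact code), the edge test might look like:

    // Returns true if sample point (px, py) lies inside (or on an edge of)
    // the counter-clockwise triangle (x0, y0), (x1, y1), (x2, y2).
    bool inside_triangle(float px, float py,
                         float x0, float y0,
                         float x1, float y1,
                         float x2, float y2) {
      // 2D cross product of edge <pi, pk> with <pi, p>; nonnegative means p
      // lies on the inner side of the edge for a counter-clockwise winding.
      auto edge = [](float ax, float ay, float bx, float by,
                     float qx, float qy) {
        return (bx - ax) * (qy - ay) - (by - ay) * (qx - ax);
      };
      return edge(x0, y0, x1, y1, px, py) >= 0 &&
             edge(x1, y1, x2, y2, px, py) >= 0 &&
             edge(x2, y2, x0, y0, px, py) >= 0;
    }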
Rasterizer Performance:
- Points were only checked within the bounding box of the triangle, whose top-left corner is the minimum x and minimum y over all of the triangle's vertices, and whose bottom-right corner is the maximum x and maximum y.
- Partial Extra Credit: I noted that within each row, you can only enter and subsequently exit a triangle once. We can take advantage of this fact by breaking out of the inner loop the moment we stop passing consecutive inside-triangle tests (note that before we start passing the tests, we don't do anything special); a sketch follows below.
Timing using clock(): ~1.7 ms with the enhancement; ~1.7 ms without -> no discernible effect
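A minimal sketch of the traversal with both optimizations, assuming the inside_triangle helper above, a hypothetical fill_pixel frame-buffer write, and <cmath>/<algorithm> available:

    // Restrict sampling to the triangle's bounding box.
    int min_x = (int)std::floor(std::min({x0, x1, x2}));
    int max_x = (int)std::ceil(std::max({x0, x1, x2}));
    int min_y = (int)std::floor(std::min({y0, y1, y2}));
    int max_y = (int)std::ceil(std::max({y0, y1, y2}));
    for (int j = min_y; j < max_y; j++) {
      bool entered = false;  // has this row passed an inside test yet?
      for (int i = min_x; i < max_x; i++) {
        if (inside_triangle(i + 0.5f, j + 0.5f, x0, y0, x1, y1, x2, y2)) {
          entered = true;
          fill_pixel(i, j, color);
        } else if (entered) {
          break;  // a row can exit the triangle at most once
        }
      }
    }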
Screenshot:
Notice the jaggies, especially prominent along the edge of the red triangle. Because each pixel is treated as binary (either in the triangle or not), we undersample the high-frequency edge and get a very undesirable staircase effect.
Part 2: Antialiasing triangles
To combat these aliasing artifacts, we employ supersampling, which approximates a 1-pixel box filter.
This antialiasing technique smooths out some of the harsh edges we saw in Part 1 by averaging the color of multiple samples within each pixel, reducing the high frequencies that
result in jaggies.
Implementation Details:
- Let sample_rate be the requested supersampling rate, and write sqrt(sample_rate) for its square root.
- I began by splitting each pixel up into sqrt(sample_rate) by sqrt(sample_rate) sub-pixels, each of size 1/sqrt(sample_rate) by 1/sqrt(sample_rate), starting at the top-left corner.
- Because we want to sample at the center of these sub-pixel squares, the final expression for the sample location in sub-pixel (k, l) inside pixel (i, j) was ((i + (k + 0.5f) / sqrt(sample_rate)), (j + (l + 0.5f) / sqrt(sample_rate))).
- Unlike in Part 1, where color was assigned to the entire pixel after a single sample was taken, in Part 2, color assignment took place at the sub-pixel level.
- The final resolution of each pixel's color was done after the entire image was supersampled and each sub-pixel was colored appropriately.
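A minimal sketch of the per-pixel supersampling loop, assuming the inside_triangle helper from Part 1 and a hypothetical sample_buffer sub-pixel store:

    int sqrt_rate = (int)std::sqrt(sample_rate);
    for (int k = 0; k < sqrt_rate; k++) {
      for (int l = 0; l < sqrt_rate; l++) {
        // Sample at the center of sub-pixel (k, l) inside pixel (i, j).
        float sx = i + (k + 0.5f) / sqrt_rate;
        float sy = j + (l + 0.5f) / sqrt_rate;
        if (inside_triangle(sx, sy, x0, y0, x1, y1, x2, y2))
          sample_buffer[(j * width + i) * sample_rate + k * sqrt_rate + l] = color;
      }
    }
    // After all triangles are drawn, each pixel's final color is resolved as
    // the average of its sample_rate sub-pixel values.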
Screenshots:
Sample Rate of 1 | Sample Rate of 4 | Sample Rate of 9 | Sample Rate of 16
By averaging sub-pixel values, supersampling removes the high frequencies associated with the red triangle corner and smooths the jaggies out.
Extra Credit: For personal interest, I implemented jittered sampling. To give some background, jittered sampling is a variant of grid supersampling
where the sample point for each sub-pixel, instead of being at the sub-pixel's center, is randomly chosen within the sub-pixel.
Use of the grid allows jittered sampling to ensure an even distribution across the pixel, which is not guaranteed with purely random sampling.
The results were mixed, to say the least. As seen below, formerly solid colors
began exhibiting issues at the shared edges between triangles, presumably due to the randomness in the sampling process. To elaborate,
suppose a sub-pixel encompassed an edge shared between two triangles. Since sampling is random, there is a possibility that both
sample points within the sub-pixel (one for each triangle) fall in the "outside triangle" category, causing the sub-pixel to remain uncolored (white). Hence
we get traces of edges due to the final supersampling averaging process. The effect is diminished as the number of samples within a pixel increases,
due to the same averaging process (the white edge pixels carry less weight as the total number of sub-pixels increases). To conclude, by fixing the location
of the sample points (even if they're not centered), each sample point is guaranteed to land in one triangle or the other (or on the edge, which is handled by an implementation of
top-left edge rules), a guarantee that cannot be made by selecting the sample points randomly.
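A minimal sketch of the jitter itself: relative to the Part 2 loop, only the sub-pixel offset changes (uniform01 is an illustrative helper, not the project's code).

    #include <random>
    static std::mt19937 rng;
    static std::uniform_real_distribution<float> uniform01(0.0f, 1.0f);

    // Replace the centered offset 0.5f with a random offset in [0, 1),
    // keeping each sample confined to its own sub-pixel.
    float sx = i + (k + uniform01(rng)) / sqrt_rate;
    float sy = j + (l + uniform01(rng)) / sqrt_rate;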
Extra Credit Screenshots:
Jittered Sampling, Sample Rate of 1 | Jittered Sampling, Sample Rate of 4 | Jittered Sampling, Sample Rate of 9 | Jittered Sampling, Sample Rate of 16
Part 3: Transforms
The transform matrices themselves were largely modelled after the transforms discussed in lecture.
Cubeman is inspired by a running man. I wanted to convey motion, so I drew from sources like exit signs and pictures of running people
to produce a result that feels like cubeman is going somewhere.
To accomplish the effect I wanted, I took advantage of hierarchical transforms and various combinations of scaling, rotation, and translation.
Cubeman Screenshot:
Man in Motion
Extra Credit: I added functionality to the 'e' and 'r' keys to rotate the viewport. To do so, upon the relevant keypress, I modified the SVG-to-NDC matrix by multiplying it with a rotation matrix and then a translation matrix.
The translation matrix recenters the window after the rotation is performed.
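A sketch of one way to compose this, assuming hypothetical translate and rotate helpers that build 3x3 homogeneous transforms (Matrix3x3 stands in for the project's matrix type):

    // Rotate the viewport about its center (cx, cy): move the center to the
    // origin, rotate by theta, then translate back to recenter the window.
    Matrix3x3 rotation = translate(cx, cy) * rotate(theta) * translate(-cx, -cy);
    svg_to_ndc = rotation * svg_to_ndc;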
Extra Credit Screenshots:
Section II: Sampling
Part 4: Barycentric coordinates
Barycentric coordinates allow us to perform linear interpolation across a triangle by introducing a coordinate system based on proportional distances from
each of the vertices. The resulting coordinates can be used as weights in a linear combination of the vertices' values (e.g., colors) to get
the desired interpolated value.
Linearly interpolated triangle
Implementation details:
Let P be the point we want to calculate barycentric coordinates for, and let A, B, and C be the vertices of the triangle.
- To calculate the coordinate corresponding to A, take the cross product of the vector from P to B with the vector from B to C (computationally, the same normal-dot trick as in Part 1).
- To calculate the coordinate corresponding to B, take the cross product of the vector from P to C with the vector from C to A.
- To calculate the coordinate corresponding to C, take the cross product of the vector from P to A with the vector from A to B.
Each result is proportional to the signed area of the sub-triangle opposite the corresponding vertex, so dividing each by their sum normalizes the coordinates to sum to 1. Finally, we can weight the colors of each of the vertices by their corresponding barycentric coordinate to get the interpolated value.
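A minimal sketch of the computation, with cross2d as an illustrative helper and a Color type assumed to support scalar multiplication and addition:

    // 2D cross product (the z-component of the 3D cross product).
    float cross2d(float ax, float ay, float bx, float by) {
      return ax * by - ay * bx;
    }

    // Each weight is proportional to twice the signed area of the sub-triangle
    // opposite its vertex; dividing by the sum normalizes them to sum to 1.
    float wa = cross2d(bx - px, by - py, cx - bx, cy - by);  // coordinate for A
    float wb = cross2d(cx - px, cy - py, ax - cx, ay - cy);  // coordinate for B
    float wc = cross2d(ax - px, ay - py, bx - ax, by - ay);  // coordinate for C
    float total = wa + wb + wc;
    Color result = (colorA * wa + colorB * wb + colorC * wc) * (1.0f / total);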
Screenshot:
Color wheel created using barycentric coordinates
Part 5: "Pixel sampling" for texture mapping
Pixel sampling is a way for us to determine how to apply a texture in cases where the pixel-to-texel mapping is not one-to-one. In particular, pixel sampling matters when we're dealing with multiple
pixel samples per texel, i.e., magnification.
The two strategies we could use are nearest-pixel sampling, which eponymously returns the color of the nearest texel to the texture coordinate we want, and bilinear sampling, which linearly
interpolates the color from the four nearest texels.
Implementation details:
Both sampling methods start by using the barycentric coordinates of a point inside the triangle
to find said point in uv coordinates. Since uv space is typically defined by the unit square, we'll need to scale the u and v coordinates
by width and height respectively to get the corresponding location in the texture map.
Nearest-pixel sampling can be done by simply flooring the resulting location in the texture map.
Note that a floor (rather than a round) is taken because nearness is measured with respect to the center of each texel.
Bilinear sampling is slightly more complicated. We begin by first finding the four nearest texels. To get the nearest bottom right texel, we round both x and y values of the scaled texture coordinate.
The top left coordinate can be found by subtracting 1 from both the x and y coordinates of the bottom right. Finally, the remaining two coordinates (top right, bottom left) can be
extrapolated. Next, we calculate the offset of the scaled texture coordinate with respect to the centers of these four pixels by extracting the decimal portion of the scaled point minus 0.5.
Finally, using the offsets and texture values of the four nearest texels,
we can use linear interpolation to compute the value our scaled texture coordinate should take. In my implementation, I interpolated the values horizontally first, getting a value for directly above the
desired coordinate and one for directly below, and then interpolated those vertically.
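A minimal sketch of the bilinear lookup, with texel(x, y) standing in for a clamped texel fetch and lerp(t, a, b) for (1 - t) * a + t * b:

    // (tx, ty) is the scaled texture coordinate. Rounding gives the
    // bottom-right of the four surrounding texel centers; the top-left is
    // one texel back in each direction.
    int x1 = (int)std::round(tx), y1 = (int)std::round(ty);
    int x0 = x1 - 1,              y0 = y1 - 1;
    // Fractional offsets from the top-left texel *center* (at x0 + 0.5).
    float s = (tx - 0.5f) - std::floor(tx - 0.5f);
    float t = (ty - 0.5f) - std::floor(ty - 0.5f);
    // Interpolate horizontally along the top and bottom rows, then vertically.
    Color top    = lerp(s, texel(x0, y0), texel(x1, y0));
    Color bottom = lerp(s, texel(x0, y1), texel(x1, y1));
    Color sample = lerp(t, top, bottom);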
Screenshots:
Nearest-Pixel, Sample Rate of 1 | Nearest-Pixel, Sample Rate of 16 | Bilinear Interpolation, Sample Rate of 1 | Bilinear Interpolation, Sample Rate of 16
Nearest-pixel sampling is far more pixelated than bilinear interpolation at a supersampling rate of 1;
at higher supersampling rates the gap narrows, but only slightly. Supersampling does not appear to benefit
bilinear interpolation all that much; notice that the only difference between the two bilinear interpolation images
appears in the grid lines below the pixel inspector.
When the viewing resolution matches the texture resolution, the differences between the two sampling methods are marginal--
nearest-pixel may even win out over bilinear interpolation's smoothing effect thanks to its computational speed. The difference is marginal
because, if each pixel has one corresponding texel, nearest-pixel will faithfully reconstruct the
texture. However, when magnifying high frequencies (edges), bilinear interpolation is far superior to
nearest-pixel. This superiority stems from nearest-pixel's binary nature--a pixel can only be nearest to one texel, resulting in
sharp transitions across edges (pixelation). The idea is similar to the reason we saw jaggies in Parts 1 and 2 at low sample rates. Bilinear interpolation
resolves the issue by linearly interpolating the texture value for the pixel from the four nearest texels, transforming
harsh edges into smooth linear gradients.
Part 6: "Level sampling" with mipmaps for texture mapping
Like pixel sampling, level sampling allows us to determine how to apply a texture when the pixel-to-texel mapping is not one-to-one. In particular, level sampling
is useful during minification, for which we have precomputed the texture at resolutions decreasing by powers of 2 (a set called a mipmap) by applying a low-pass filter and
downsampling. Each resolution is considered a level of the mipmap, starting at level 0, the base texture. Increasing the level can be thought of as increasing the distance to the
texture--intuitively, as an object gets farther away, the amount of detail (and hence resolution) of the image should decrease. To determine the level for a particular sample point, we look at how the
scaled uv coordinates change as we move to its neighbors and find the resolution that most closely maps these movements one-to-one with texels.
Oftentimes, however, the sample point does not correspond perfectly to a single level. To address such cases, in this project, I implemented 3 level sampling methods: level 0, nearest-level, and linear interpolation.
In level 0 sampling, all samples are taken from the 0th mipmap level. Nearest-level sampling chooses the closest level. Finally, linear interpolation interpolates between the two closest levels.
Implementation Details:
All sampling methods start by determining the sample point's mipmap level (given that the sample point is within the triangle). To do so,
I began by finding the derivatives of the scaled texture coordinates u and v with respect to x and y (effectively the Jacobian). This was accomplished by
looking at the uv coordinates of the sample point immediately below and the sample point immediately to the right, and taking the relevant differences.
Using these derivatives as a heuristic for the sample's footprint in texture space, I computed the Euclidean length of each derivative vector;
taking the log base 2 of the larger length gives the (continuous) mipmap level.
To perform level 0 sampling, I simply ignored the calculated mipmap level and used level 0 for all texture sampling.
To perform nearest-level sampling, I rounded the mipmap level, clamping it between 0 and the maximum mipmap level, max(floor(log2(width)), floor(log2(height))).
I then used the rounded level for all texture sampling.
Finally, to perform linear mipmap sampling, I took texture samples from the two nearest mipmap levels, and then linearly interpolated the results using the
fractional part of the calculated mipmap level as the weight.
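A minimal sketch of the level computation and the linear blend, assuming (du_dx, dv_dx) and (du_dy, dv_dy) hold the scaled-coordinate differences described above and sample_at_level stands in for a per-level texture fetch:

    // Continuous mipmap level: log2 of the larger texture-space step.
    float level = std::log2(std::max(std::sqrt(du_dx * du_dx + dv_dx * dv_dx),
                                     std::sqrt(du_dy * du_dy + dv_dy * dv_dy)));
    level = std::min(std::max(level, 0.0f), (float)max_level);  // clamp
    int lo = (int)std::floor(level);
    int hi = std::min(lo + 1, max_level);
    float w = level - lo;  // fractional part weights fine vs. coarse level
    Color sample = sample_at_level(lo, u, v) * (1.0f - w)
                 + sample_at_level(hi, u, v) * w;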
Tradeoffs between Pixel Sampling, Level Sampling, and Supersampling:
In terms of speed, a low rate of supersampling and nearest-pixel pixel sampling will always be the fastest at all levels of zoom. Depending on the speed of memory access, it may be beneficial to use nearest-level sampling
as opposed to level 0 if speed is your top concern. Speed decreases as you increase the supersampling rate (more sub-pixels to compute per pixel), switch to bilinear pixel sampling (for every sample point, you'll need to
linearly interpolate 4 texels), or use linear level sampling (for every sample point, you have to look at 2 mipmap levels instead of 1). Combining these sampling techniques compounds the cost multiplicatively.
As for memory usage, bilinear pixel sampling, linear level sampling, and high rates of supersampling are the most memory intensive. Supersampling is probably the biggest culprit--as sample rates
go up, so does the memory needed to store all the computed sub-pixels. Bilinear pixel sampling needs to load 4 texels into memory for every sample point, compared to the 1 texel for nearest pixel. Finally,
although mipmaps are precomputed, linear level sampling must evaluate 2 levels at a time versus the 1 for nearest level or level 0. If memory is a constraint, consider using a low rate of supersampling, nearest pixel
sampling and nearest level or level 0 level sampling. These memory usage considerations apply at all levels of zoom.
Antialiasing power is perhaps the most interesting subject. Supersampling is probably the most universal antialiasing tool at one's disposal, although it experiences
diminishing returns during magnification. The antialiasing power of supersampling is directly related to its sample rate (the higher the rate, the higher the power). In cases where the image is close to its source resolution, or during minification, it acts like a low-pass filter, reducing jaggies
and other aliasing effects. It is least effective at high levels of magnification, where the lack of detail in the source image renders high rates of supersampling essentially negligible.
Nearest-pixel sampling is most effective when the texture resolution matches the image resolution. In cases of magnification,
the binary nature of nearest-pixel leads to pixelation, which can be smoothed out using a bilinear interpolation approach. However, both nearest-pixel and bilinear interpolation struggle with minification due to the
need to reconcile many texels with a single pixel (nearest-pixel struggles more than bilinear, since the former uses only 1 texel and the latter uses 4). This is where level sampling comes in. Level 0 sampling is best when the texture resolution matches the image resolution, or during magnification,
where there isn't a problem of too many texels. Nearest-level sampling is best at high levels of minification, where harsh transitions between mipmap levels don't cause noticeable aliasing. Linear level sampling
works best at middling levels of minification, where we might be between mipmap levels and would benefit from a smoothing out of their differences (edges). Note that linear level sampling and nearest-level sampling essentially
collapse into level 0 sampling when there is no minification.
Screenshots: