Project 1 - Jason Yan

Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Jason Yan

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) was a Russian photographer who captured three exposures of each scene onto a glass plate using red, green, and blue filters. The aim of this project is to digitize Prokudin-Gorskii's glass plate images and, through image processing techniques, automatically generate a color image.

Implementation 1: Alignment

The first step is to divide the scanned plate into three equal parts by height to obtain the blue, green, and red plates separately. Simply stacking the plates does not give a clean color image because the three exposures are slightly offset, so we need an algorithm to align them. I chose to keep the blue plate stationary and translate the red and green plates along the x-axis and y-axis until they line up with it, searching a window of 20 pixels (+/- 20 horizontally and +/- 20 vertically). Each candidate shift also needs a score that measures how well the plates match. I tried three metrics: Euclidean Distance, Normalized Cross-Correlation, and Mean Absolute Error. Of these, Normalized Cross-Correlation gave the best alignment on most images.
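The search described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the names split_plates, ncc, and align are hypothetical, the top-to-bottom plate order is assumed to be blue, green, red, shifts are assumed to be reported as (dy, dx), and np.roll is used for translation, so pixels wrap around the edges.

```python
import numpy as np

def split_plates(plate):
    """Split a stacked scan into three equal-height plates (assumed order: B, G, R)."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align(channel, reference, window=20):
    """Return the (dy, dx) shift of `channel` that maximizes NCC with `reference`."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll wraps pixels around the border (a simplification).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the blue plate as the reference, align(red, blue) and align(green, blue) would give the two shifts listed under each image. Swapping ncc for a negated Euclidean distance or mean absolute error changes only the scoring line.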

aligned_cathedral.jpg. R: (12, 3), G: (5, 2)

aligned_monastery.jpg. R: (3, 2), G: (-3, 2)

aligned_tobolsk.jpg. R: (7, 3), G: (3, 3)

Implementation 2: Image Pyramid

The second feature I implemented is the Image Pyramid. In addition to the small JPG images, we also work with large TIF images that contain many more pixels. Running the exhaustive search from the previous section on these images would be computationally expensive, so I introduced the align_pyramid function. It begins by aligning downscaled versions of the images, where the reduced size makes a larger search window affordable. After that alignment, we move back up toward the original size, scale the shifts up accordingly, and refine the alignment with smaller windows. Specifically, I used a three-level pyramid in which each level is downscaled by a factor of two relative to the one before it. The coarsest (most downscaled) level uses a window size of 24, the intermediate level uses 12, and the finest level uses 6. This coarse-to-fine approach makes processing the large images efficient.
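A coarse-to-fine search of this kind might look like the sketch below. Only the name align_pyramid and the window sizes (24, 12, 6) come from the description above; the helpers downscale2 and best_shift are hypothetical, downscaling is assumed to be 2x2 block averaging, and instead of upscaling images the sketch keeps one image per level and doubles the shift when moving to the next finer level.

```python
import numpy as np

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def best_shift(channel, reference, center, window):
    """Exhaustive NCC search within +/- window of `center` = (dy, dx)."""
    ref = reference - reference.mean()
    best, best_score = center, -np.inf
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            cand = np.roll(channel, (dy, dx), axis=(0, 1))
            cand = cand - cand.mean()
            denom = np.linalg.norm(cand) * np.linalg.norm(ref)
            score = (cand * ref).sum() / denom if denom > 0 else -np.inf
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def align_pyramid(channel, reference, windows=(24, 12, 6)):
    """Coarse-to-fine alignment; windows[0] is used at the coarsest level."""
    levels = [(channel, reference)]
    for _ in range(len(windows) - 1):
        c, r = levels[-1]
        levels.append((downscale2(c), downscale2(r)))
    shift = (0, 0)
    for (c, r), window in zip(reversed(levels), windows):
        # A shift found at the coarser level doubles at the next finer level.
        shift = best_shift(c, r, (shift[0] * 2, shift[1] * 2), window)
    return shift
```

Because each level only refines the doubled estimate from the level below it, the expensive full-resolution pass searches a small neighborhood instead of the whole range of possible offsets.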

Aligned Church.jpg. R: (58, -4), G: (25, 4)

Aligned Three Generations.jpg. R: (111, 10), G: (53, 13)

Aligned Melons.jpg. R: (177, 14), G: (81, 11)

Aligned Onion Church.jpg. R: (108, 36), G: (51, 27)

Aligned Train.jpg. R: (86, 33), G: (43, 8)

Aligned Icon.jpg. R: (90, 23), G: (41, 17)

Aligned Self Portrait.jpg. R: (175, 37), G: (78, 29)

Aligned Harvesters.jpg. R: (123, 14), G: (60, 18)

Aligned Sculpture.jpg. R: (140, -27), G: (33, -11)

Aligned Lady.jpg. R: (116, 11), G: (54, 8)

Bells & Whistles (Extra Credit)

Auto Cropping

This feature trims away the plate borders so that processing focuses on the actual scene. It computes the average intensity of each row and column and cuts off the rows and columns whose average falls within the border thresholds (Black: <50, White: >250). Removing the borders keeps them from distorting the similarity score, which improves our alignment algorithm. This is especially true for images with large borders, such as "monastery.jpg".
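A minimal sketch of this kind of border trimming is shown below. The function name auto_crop is hypothetical, intensities are assumed to be on a 0 to 255 scale to match the thresholds quoted above, and only rows and columns at the outer edges are removed.

```python
import numpy as np

def auto_crop(img, black=50, white=250):
    """Trim outer rows/columns whose mean intensity looks like a border."""
    def keep_range(means):
        # Indices whose mean is neither near-black nor near-white.
        ok = np.flatnonzero((means >= black) & (means <= white))
        return (ok[0], ok[-1] + 1) if ok.size else (0, means.size)

    top, bottom = keep_range(img.mean(axis=1))  # per-row averages
    left, right = keep_range(img.mean(axis=0))  # per-column averages
    return img[top:bottom, left:right]
```

Taking the first and last acceptable index means that dark or bright bands inside the scene are left alone; only the outer frame is cut.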

Before Auto Cropping: monastery.jpg

After Auto Cropping: monastery.jpg

Central Region Extraction

Because calculating image similarity can be time-consuming, Central Region Extraction uses only the middle 50% of the image for alignment. If the central part aligns correctly, the rest of the image should also align properly. This approach significantly speeds up our computations and also keeps the plate borders out of the similarity score.
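The crop itself is simple to sketch. The function name central_region is hypothetical, and "middle 50%" is assumed here to mean 50% of each dimension rather than 50% of the area.

```python
import numpy as np

def central_region(img, fraction=0.5):
    """Return the centered window covering `fraction` of each dimension."""
    h, w = img.shape[:2]
    ch, cw = int(h * fraction), int(w * fraction)
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]
```

The similarity score would then be computed on central_region(shifted) and central_region(reference), while the shift that is found is still applied to the full plate.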

Auto Contrasting

Auto contrasting adjusts the brightness and contrast across an image to ensure that the pixel values fully utilize the available intensity spectrum. It scales the pixel values based on the image's lightest and darkest points, thus enhancing the visual details and improving our alignment.
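One common way to do this is a linear min-max stretch, sketched below. The function name auto_contrast and the choice to map the output to the range [0, 1] are assumptions for illustration.

```python
import numpy as np

def auto_contrast(img):
    """Linearly rescale so the darkest pixel maps to 0 and the lightest to 1."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # A flat image has no contrast to stretch.
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)
```

A single very dark or very bright pixel can dominate a min-max stretch, so clipping to percentiles instead of the absolute extremes is a frequent variation.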

Sobel Edge Detection

After completing the previous features, only one image, "emir.tif", still had some issues with alignment. Therefore, I implemented edge detection to refine the alignment process. Sobel edge detection computes the image intensity gradient at each pixel, effectively outlining transitions in color and material. Comparing these edge maps instead of raw pixel intensities is more precise, since outlines tend to line up across the plates even where the same region has very different brightness in each channel.
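The gradient computation can be sketched with plain NumPy slicing, as below. The function name sobel_magnitude is hypothetical, and replicating edge pixels at the image border is an assumption; the kernels are the standard 3x3 Sobel kernels.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude using the standard 3x3 Sobel kernels."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)
```

Each plate would be passed through sobel_magnitude before the similarity search, and the shift found on the edge maps would then be applied to the original plates.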

Before Sobel Edge Detection: emir.jpg

After Sobel Edge Detection: emir.jpg