The Mechanics of Creation

Thoughts, ideas, reflections

Decoding Data Matrices

The summer after my sophomore year, I did an internship at OSRAM Opto Semiconductors GmbH, a leading manufacturer of LEDs. Thousands of wafers go through the manufacturing process each day, and they are tracked by writing down wafer numbers by hand, which is error-prone. Fortunately, high-quality images of each wafer are taken at various stages of the process for quality control of the produced chips. The goal of my project was to apply image processing techniques to perform optical character recognition (OCR) on the numbers and letters that uniquely identify a wafer, so that wafers could be tracked automatically through the production process.

This post describes why OCR failed to solve this problem and how I eventually solved it by taking a radically different approach.

Challenges of the OCR approach

Image of a wafer taken by a microscope

The images were taken by a microscope, which produced very low-contrast images: the foreground and background differed by fewer than 10 gray-scale values. This made it extremely difficult for a program to distinguish the unique identifier of a wafer from the background. Furthermore, the images were taken at different brightness levels, and the location of the identifier was not consistent from wafer to wafer, which made the problem even more challenging.
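To make the contrast and brightness problem concrete, here is a minimal Python sketch. It is not the code from the project. The 8x8 synthetic "wafer" images, the brightness levels (60 and 120), and the threshold of 65 are all made-up values chosen for illustration. The sketch shows that a fixed global threshold tuned for one brightness level breaks on another, while rescaling each image to its own min/max range first makes the same threshold work on both.

```python
def make_wafer_image(background, delta=10, size=8):
    """Hypothetical synthetic image: a 3x3 'mark' that is only `delta`
    gray levels brighter than a flat background."""
    img = [[background] * size for _ in range(size)]
    for r in range(2, 5):
        for c in range(2, 5):
            img[r][c] = background + delta
    return img


def threshold(img, t):
    """Binary mask: 1 where the pixel is brighter than t, else 0."""
    return [[1 if px > t else 0 for px in row] for row in img]


def stretch_contrast(img):
    """Linearly rescale the image so its darkest pixel maps to 0
    and its brightest pixel maps to 255 (min-max stretching)."""
    flat = [px for row in img for px in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image, nothing to stretch
        return [[0] * len(row) for row in img]
    return [[(px - lo) * 255 // (hi - lo) for px in row] for row in img]


def count_foreground(mask):
    """Number of pixels classified as foreground."""
    return sum(sum(row) for row in mask)


# Same mark, two assumed brightness levels.
dark = make_wafer_image(background=60)
bright = make_wafer_image(background=120)

# A threshold hand-tuned for the dark image...
fixed_t = 65
dark_fixed = count_foreground(threshold(dark, fixed_t))
# ...marks every pixel of the bright image as foreground.
bright_fixed = count_foreground(threshold(bright, fixed_t))

# After per-image stretching, one mid-range threshold works for both.
dark_stretched = count_foreground(threshold(stretch_contrast(dark), 127))
bright_stretched = count_foreground(threshold(stretch_contrast(bright), 127))

print(dark_fixed, bright_fixed, dark_stretched, bright_stretched)
```

The synthetic images here are noise-free, which is the easy case. In a real microscope image the noise can be comparable to a 10-gray-level difference, and min-max stretching amplifies that noise along with the signal, so this kind of normalization alone would not be expected to make the identifier readable.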