The summer after my sophomore year, I did an internship at OSRAM Opto Semiconductors GmbH, a leading manufacturer of LEDs. Thousands of wafers go through the manufacturing process each day, and they are tracked by manually writing down wafer numbers, which is prone to many sources of error. Fortunately, high-quality images of each wafer are taken at various stages of the process for quality control of the produced chips. The goal of my project was to apply image processing techniques to perform optical character recognition (OCR) on the numbers and letters that uniquely identify a wafer, so that wafers could be tracked automatically through the production process.
This post describes why OCR failed to solve this problem and how I eventually solved it by taking a radically different approach.
The images were taken by a microscope, which produced very low-contrast images in which the difference between foreground and background was within 10 gray-scale values. This made it extremely difficult for a program to distinguish the unique identifier of a wafer from the background. Furthermore, the images were taken at different brightness levels, and the location of the unique identifier was not consistent from wafer to wafer, which made the problem even more challenging.
I implemented various image processing algorithms to perform OCR:
- cross-correlation: sample images of numbers and letters are compared against the characters in the test set of wafer images;
- differentiating foreground from background by calculating the gradient of the gray-scale values;
- edge detection, and others.
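To make the first of these concrete, here is a minimal sketch of template matching by normalized cross-correlation in Python (the original work was done in MATLAB; the function name and the list-of-lists image representation are illustrative, not from the project):

```python
# Template matching by normalized cross-correlation: slide a character
# template over a gray-scale image and report the best-matching position.
# A sketch only; real wafer images are far larger and noisier.
def cross_correlate(image, template):
    """Return (score, row, col) of the best match; score is in [-1, 1]."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_mean = sum(sum(row) for row in template) / (th * tw)
    best = (-1.0, 0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = [[image[y + i][x + j] for j in range(tw)] for i in range(th)]
            p_mean = sum(sum(row) for row in patch) / (th * tw)
            num = sum((patch[i][j] - p_mean) * (template[i][j] - t_mean)
                      for i in range(th) for j in range(tw))
            den = (sum((patch[i][j] - p_mean) ** 2
                       for i in range(th) for j in range(tw))
                   * sum((template[i][j] - t_mean) ** 2
                         for i in range(th) for j in range(tw))) ** 0.5
            # Skip flat patches (zero variance) to avoid division by zero.
            if den and num / den > best[0]:
                best = (num / den, y, x)
    return best
```

Because the score is normalized, a perfect copy of the template scores 1.0 regardless of brightness offset, which is exactly why this technique is attractive for images taken at varying brightness levels, and why it still struggles when contrast is within a few gray-scale values.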
However, they produced inconsistent results due to the problems with the images mentioned above. I also used machine learning, training the system on 200 images of each letter and number, taken from images of wafers in production. The system achieved a 75–90% success rate on the test set, which did not meet the production requirements: the solution had to be robust and reliable. In addition, the only way to obtain the training set was to manually crop wafer images to produce representations of letters and numbers in the specific font and at the particular image quality. Needless to say, this method was too expensive and unmaintainable.
Once I discovered that none of the standard OCR techniques would produce satisfactory results, I realized that a different approach should be taken. Luckily for me, there was another way. When a wafer is labeled with a unique identifier, a data matrix encoding that identifier is engraved onto the wafer. A Data Matrix is a two-dimensional square or rectangular matrix barcode that consists of black and white dots, which represent binary 0s and 1s. ECC200 data matrices, the standard I worked with, are Reed-Solomon codes, which contain redundant symbols in addition to the symbols of the label. They can detect and correct multiple random errors, up to ⌊r/2⌋ symbols, where r is the number of redundant symbols added to the code.
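As a concrete instance of the ⌊r/2⌋ bound (symbol counts are the ones used on the wafers, described later in the post):

```python
# Error-correcting capacity of a Reed-Solomon code: r redundant symbols
# let the decoder correct up to floor(r/2) symbol errors.
def max_correctable_errors(r):
    return r // 2

# With 11 redundant symbols, up to 5 of the symbols engraved on the
# wafer may be damaged and the label can still be recovered.
capacity = max_correctable_errors(11)
```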
I noticed that although the location of the label varied from wafer to wafer, all wafers had one feature in common: the label was located on the flat edge of the wafer. So the solution to the problem was to:
- find the flat region of the wafer and locate the data matrix;
- decode the ECC200 data matrix.
We will focus on the most interesting part of this project: how to decode the data matrix. During my internship I had a book, written in German, that described the general idea behind ECC200 data matrices and specified how they are encoded and mapped for some of the common sizes. However, the particular size of data matrix used by OSRAM OS was unusual and not described in the book. Based on the information gathered from the book and the official standard specification, I reverse engineered the mapping of the symbols onto the data matrix and its encoding. The standard specifies which irreducible polynomial of a particular Galois field is used to encode the symbols in the data matrix, depending on the number of redundant symbols in the matrix. And now that we know what field we are in, we know where to go.
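All of the arithmetic below happens in the Galois field GF(2⁸). A short Python sketch of how that field is usually implemented (the field polynomial 301, i.e. 0x12D = x⁸+x⁵+x³+x²+1, is the one given in the ECC200 specification as I understand it; the table-based approach is a standard technique, not the project's original MATLAB code):

```python
# GF(2^8) arithmetic via log/antilog tables. Elements are bytes; addition
# is XOR; multiplication is a table lookup on the discrete logarithms.
PRIM = 0x12D                      # field polynomial from the ECC200 spec
EXP, LOG = [0] * 512, [0] * 256   # antilog and log tables
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1                       # multiply by the primitive element alpha=2
    if x & 0x100:                 # reduce modulo the field polynomial
        x ^= PRIM
for i in range(255, 512):
    EXP[i] = EXP[i - 255]         # duplicate so sums of logs need no modulo

def gf_mul(a, b):
    return 0 if 0 in (a, b) else EXP[LOG[a] + LOG[b]]
```

With these tables, every multiplication in the decoder costs two lookups and one addition, which matters when the same syndrome computations run on every wafer image.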
Each symbol in the label maps to an 8-bit binary representation (usually, its ASCII code plus a magic number). In addition, the data matrix contains redundant symbols that enable error correction, so the symbols of the label can be recovered in case part of the wafer’s label is chipped off during production. To further improve recovery in the case of mechanical damage to the edges of wafers, the binary representation is mapped onto the data matrix in an L-shaped configuration, and some of the data and redundant symbols are distributed across the edges of the matrix to protect it from losing too much information. See the diagram below for more information.
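The "magic number" in ECC200 ASCII encodation is simply one: each character becomes its ASCII code plus 1. A sketch of that mapping (simplified; the real standard additionally packs digit pairs into single codewords, which is ignored here):

```python
# Simplified ECC200 ASCII encodation: character -> ASCII code + 1.
def encode_ascii(label):
    return [ord(c) + 1 for c in label]

def decode_ascii(symbols):
    return "".join(chr(s - 1) for s in symbols)
```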
The data matrix consists of a frame on the borders and a data region in the middle. For larger data matrices like this one, the data region is split into two segments. There are 10 symbols in the label that are mapped onto the wafer. In addition, 11 redundant symbols are distributed across the data matrix, resulting in a total of 21 symbols.
Once the symbols are correctly located, a Peterson–Gorenstein–Zierler (PGZ) decoder is used to decode the label. I used the theoretical description of the decoder on Wikipedia to implement it in MATLAB. The steps of the algorithm are outlined below:
- read the codeword and locate all the label symbols and redundant symbols;
- determine whether there are any errors in the codeword by calculating the syndrome;
- calculate the number of errors, if any are detected;
- calculate locations of errors;
- calculate the error values, that is, find the “offset” of each corrupted symbol;
- correct the codeword and display it.
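The steps above can be sketched end to end. The following is a self-contained Python toy version of the pipeline, not the original MATLAB implementation: it encodes a 10-symbol label with 11 check symbols (the sizes from the wafers), damages up to ⌊11/2⌋ = 5 symbols, and recovers the label with a PGZ decoder. The field polynomial 0x12D, the simplified "ASCII + 1" mapping, and the example label `WAFER12345` are my assumptions for illustration.

```python
# Reed-Solomon encode + PGZ decode over GF(2^8), following the textbook
# description of the algorithm. Polynomials store coefficients highest
# degree first.
PRIM = 0x12D                      # assumed ECC200 field polynomial
EXP, LOG = [0] * 512, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gmul(a, b):
    return 0 if 0 in (a, b) else EXP[LOG[a] + LOG[b]]

def gdiv(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]

def poly_eval(p, x):
    y = 0
    for c in p:
        y = gmul(y, x) ^ c
    return y

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= gmul(ai, bj)
    return out

def rs_encode(msg, r):
    """Append r check symbols; generator roots at alpha^1 .. alpha^r."""
    gen = [1]
    for i in range(1, r + 1):
        gen = poly_mul(gen, [1, EXP[i]])
    rem = list(msg) + [0] * r
    for i in range(len(msg)):     # synthetic division by the monic generator
        coef = rem[i]
        if coef:
            for j in range(1, len(gen)):
                rem[i + j] ^= gmul(gen[j], coef)
    return list(msg) + rem[len(msg):]

def solve(A, b):
    """Gauss-Jordan elimination over GF(2^8); None if singular."""
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = next((k for k in range(c, n) if M[k][c]), None)
        if piv is None:
            return None
        M[c], M[piv] = M[piv], M[c]
        M[c] = [gdiv(v, M[c][c]) for v in M[c]]
        for k in range(n):
            if k != c and M[k][c]:
                f = M[k][c]
                M[k] = [M[k][j] ^ gmul(f, M[c][j]) for j in range(n + 1)]
    return [M[i][n] for i in range(n)]

def pgz_decode(code, r):
    n = len(code)
    # Step 1-2: syndromes S_1..S_r; all zero means no detected errors.
    synd = [poly_eval(code, EXP[j]) for j in range(1, r + 1)]
    if not any(synd):
        return list(code)
    # Step 3: find the number of errors v (largest solvable system) and
    # the error-locator coefficients from the syndrome equations.
    for v in range(r // 2, 0, -1):
        A = [[synd[i + v - k - 1] for k in range(1, v + 1)]
             for i in range(1, v + 1)]
        b = [synd[i + v - 1] for i in range(1, v + 1)]
        lam = solve(A, b)
        if lam is not None:
            break
    else:
        return None                            # decoding failure
    # Step 4: error locations = roots of Lambda(x) (Chien search).
    loc = lam[::-1] + [1]
    pos = [p for p in range(n)
           if poly_eval(loc, EXP[(255 - p) % 255]) == 0]
    if len(pos) != v:
        return None
    # Step 5: error values Y_l from S_j = sum_l Y_l * X_l^j.
    V = [[EXP[(j * p) % 255] for p in pos] for j in range(1, v + 1)]
    Y = solve(V, synd[:v])
    if Y is None:
        return None
    # Step 6: correct the codeword.
    fixed = list(code)
    for p, y in zip(pos, Y):
        fixed[n - 1 - p] ^= y
    return fixed
```

A usage example matching the wafer layout, with a hypothetical label:

```python
msg = [ord(c) + 1 for c in "WAFER12345"]   # 10 label symbols
code = rs_encode(msg, 11)                  # 21 symbols total
damaged = list(code)
for idx in (0, 4, 9, 15, 20):              # chip off 5 symbols
    damaged[idx] ^= 0xA5
recovered = pgz_decode(damaged, 11)        # equals the original codeword
```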
And voilà, we read the label of the wafer.