Thresholding Explained: The Vital Tool for Converting Raster Images to Vector

Converting raster images to vector images, a process referred to as vectorization, is especially important when dealing with maps, schematics, and technical drawings.

Although raster graphics (including JPEG, BMP, TIFF, GIF, and PCX files) are still the Web standard, they tend to be less economical, more cumbersome to work with, less versatile, and slower to display or print than vector graphics, and they cannot be used for CAD or CNC.

That is why you need to convert the raster files to vector files since vector files are easily used for CAD and CNC.

Vector images are defined using points on a Cartesian plane that are interlinked using lines and curves to form different shapes. They generally have clean lines and the shapes can be scaled to any desirable size.
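As a small illustration of why such shapes scale cleanly, here is a sketch in Python. The triangle and the scale helper are made up for this example and do not come from any particular vector format: the shape is just a list of points, so resizing it is plain arithmetic on the coordinates.

```python
# A vector shape: a triangle stored as (x, y) points joined by straight lines.
triangle = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]

def scale(points, factor):
    """Scale a shape about the origin by multiplying every coordinate."""
    return [(x * factor, y * factor) for x, y in points]

enlarged = scale(triangle, 10)
print(enlarged)  # [(0.0, 0.0), (40.0, 0.0), (20.0, 30.0)]
```

The enlarged triangle is described just as exactly as the original, which is why vector lines stay clean at any size.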

Although artists can convert raster files to vector files manually, automated computer algorithms are the fastest and most efficient way to perform vectorization.

Before an automated computer algorithm starts performing the actual raster to vector conversion, the raster image is first prepared for the conversion through several techniques that include color reduction, noise reduction, and thresholding.

In this post, we shall take a closer look at thresholding for raster to vector conversion and why it is a vital tool during vectorization.

What is thresholding?

The automated computer algorithm performing the vectorization first decides which parts of the image to set to white and which to set to black, to make the image/file easier to vectorize. That is where thresholding comes in.

Thresholding involves reducing the colors in grayscale or color files/images to just black and white, to create a sharp distinction between the black foreground and the white background.

The resulting images after thresholding are commonly referred to as bitonal or binary images, that is, images that are purely black and white. Black represents the presence of color while white represents the absence of color. The black areas show the edges of specific features, while the white parts could be parts of the body (if they are surrounded by the black edges) or just the background (if outside the black edges).

Thresholding raster to vector conversion explained

Why thresholding is used in raster to vector conversion

We have stated above that thresholding reduces the colors of an image or file to only two colors: black and white, which are the colors used in most technical drawings.

If you are a mechanical engineer, electrical engineer, architect, or any other technical professional, you understand that the technical drawings produced for manufacturing or construction sites are mostly in black and white.

If you are working in a CNC shop, for example, mechanical engineers and engineering designers will design a 3D part in CAD, and the 3D model may be colored differently depending on the selected material. However, after completing the design and testing, a 2D technical drawing is produced with the necessary dimensions to give the machinists the instructions they require to produce the part.

A technical drawing example

Thresholding makes the foreground (the drawing/image) black and the background white, resulting in a file where the object is defined in black lines only.

Regardless of the thresholding method used, the black drawing lines should be as solid as possible (without breaks) while the white background should have as little dirt as possible (as white as possible).

Types of thresholding

Most scanner software used to scan files for CAD and CNC offers one of two types of thresholding: simple thresholding or adaptive thresholding.

Simple thresholding

In simple thresholding, a single value is chosen as the threshold (every color has a numeric value). The threshold is usually set by moving a slider.

Then all colors whose values are lower than the threshold value are set to black while those with a value higher than the threshold are set to white.
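The rule above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the code of any particular scanning package: the function names, the luma weights used to turn a color into a single number, the threshold of 128, and the choice to send a pixel exactly equal to the threshold to white are all assumptions made for the example.

```python
import numpy as np

def to_gray(rgb):
    """Collapse an H x W x 3 RGB array to one brightness value per pixel (0-255)."""
    weights = np.array([0.299, 0.587, 0.114])  # commonly used luma weights (assumption)
    return rgb.astype(float) @ weights

def simple_threshold(gray, threshold):
    """Return a bitonal image: 0 (black) below the threshold, 255 (white) otherwise."""
    return np.where(gray < threshold, 0, 255).astype(np.uint8)

# A 1 x 4 strip: black ink, dark blue ink, light gray smudge, white paper.
strip = np.array([[[0, 0, 0], [0, 0, 128], [200, 200, 200], [255, 255, 255]]],
                 dtype=np.uint8)
bitonal = simple_threshold(to_gray(strip), threshold=128)
print(bitonal)  # [[  0   0 255 255]]
```

Both inks fall below the threshold and become black, while the smudge and the paper become white.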

If the threshold is set too high (too far to the right when a slider is used), characters such as text may bleed: the white spaces in and between them become filled with black. The background could also become filled with lots of dirt, since any color in the background would be converted to black.

On the other hand, if the threshold is set too low (too far to the left when a slider is used), most of the characters would disappear, since they would be converted to white.
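Here is a small sketch of both failure modes in Python with NumPy. The pixel values and the three threshold settings are invented for the illustration, and black_mask is a hypothetical helper rather than part of any scanner software.

```python
import numpy as np

# One scanned row: solid ink (40), faint pencil line (110),
# paper smudge (180), clean paper (240).
row = np.array([40, 110, 180, 240])

def black_mask(gray, threshold):
    """True where a pixel would be set to black (value below the threshold)."""
    return gray < threshold

too_low = black_mask(row, 60).tolist()     # [True, False, False, False]
moderate = black_mask(row, 150).tolist()   # [True, True, False, False]
too_high = black_mask(row, 200).tolist()   # [True, True, True, False]
print(too_low, moderate, too_high)
```

With the low setting the faint line disappears, with the high setting the smudge turns into black dirt, and only the moderate setting keeps both lines while leaving the background clean.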

This calls for a moderate threshold, one that makes thin and broken lines more solid without causing bleeding or the disappearance of some features/characters. That way, the resulting drawing will have solid edges and a clean background.

At the same time, it can be difficult to find a threshold that keeps all the features and characters in your image visible. Depending on their colors, some features and characters may turn black when you want them white, while others may turn white when you want them black. This is the main challenge when it comes to using simple thresholding.

Simple thresholding is best used on less complex files/images that do not have a widely varying contrast or varying illumination intensity.
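To see why uneven illumination is a problem, consider the following sketch in Python with NumPy. The row of pixel values is invented: the paper darkens from left to right as if under a shadow, with one ink pixel in the bright part and one in the shaded part. black_mask is a hypothetical helper.

```python
import numpy as np

# Paper fades from 240 down to 100 under a shadow.
# Ink sits at index 1 (value 110, bright area) and index 6 (value 20, shaded area).
row = np.array([240, 110, 200, 180, 160, 140, 20, 100])

def black_mask(gray, threshold):
    """True where a pixel would be set to black (value below the threshold)."""
    return gray < threshold

high = black_mask(row, 120).tolist()
low = black_mask(row, 60).tolist()
print(high)  # [False, True, False, False, False, False, True, True]
print(low)   # [False, False, False, False, False, False, True, False]
```

At 120 both ink pixels are black, but so is the shaded paper at the end of the row. At 60 the shaded paper stays white, but the ink in the bright area is lost. Because the shaded paper (100) is darker than the brightly lit ink (110), no single threshold separates ink from paper in this row.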

Adaptive thresholding

Adaptive thresholding is also called local thresholding or dynamic thresholding.

In adaptive thresholding, the threshold changes dynamically across the image depending on the lighting conditions in its different parts. It is therefore the best thresholding option for images with uneven lighting, such as those that contain shadows.

The image is first subdivided into small areas over which the threshold shall change dynamically.

A statistical method is then used to calculate a local threshold that the software shall use in each of the small areas. Usually, this is done by examining the intensity values in the neighborhood of each pixel and calculating their mean or median.

Writing T for the local threshold, the common choices are T = mean or T = median of the intensities in the neighborhood. A third option is the midpoint of the extremes, T = (max + min) / 2, where max and min are the highest and lowest pixel intensities in the neighborhood.

This way, the lighting conditions across the entire image are accounted for, so every character or feature is judged against its own surroundings.
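The local statistics described above can be sketched in Python with NumPy. The function local_threshold, its parameters, and the 3 x 3 sample are hypothetical. The "midrange" option reads the max and min values as the midpoint between the brightest and darkest pixel in the neighborhood, which is an assumption about what is meant.

```python
import numpy as np

def local_threshold(gray, row, col, window=3, method="mean"):
    """Threshold T for one pixel, computed from its window x window neighborhood."""
    half = window // 2
    # Slicing clips the neighborhood at the image border.
    patch = gray[max(row - half, 0): row + half + 1,
                 max(col - half, 0): col + half + 1]
    if method == "mean":
        return float(patch.mean())
    if method == "median":
        return float(np.median(patch))
    if method == "midrange":
        return (float(patch.max()) + float(patch.min())) / 2
    raise ValueError(f"unknown method: {method}")

# A dark ink pixel (40) surrounded by paper of varying brightness.
gray = np.array([[200, 210, 220],
                 [190,  40, 230],
                 [180, 170, 240]])
thresholds = {m: local_threshold(gray, 1, 1, method=m)
              for m in ("mean", "median", "midrange")}
print(thresholds)
```

For this neighborhood the mean is about 186.7, the median is 200, and the midrange is 140. The ink pixel at 40 lies below all three, so it would be set to black whichever statistic is used.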

However, this statistical computation of the local threshold takes place behind the scenes. The scanning software only asks the user to select a window threshold and a background threshold by moving two separate pointers. The software’s algorithm then processes the input image using a dynamically calculated local threshold that accommodates both settings.

The window threshold, in some cases referred to as the foreground threshold, stipulates the size of the local areas into which the input image is divided for processing. The background threshold, on the other hand, is usually a constant that is subtracted from the local threshold calculated for each local area.

Within a local area, all pixels with values higher than the local threshold are set to white, while pixels with values lower than the local threshold are set to black. Therefore, the higher the background threshold, the more pixels the software sets to white, since subtracting a larger value from the calculated local threshold lowers it.
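To make the roles of the two settings concrete, here is a minimal sketch of mean-based adaptive thresholding in Python with NumPy. It is not the algorithm of any particular scanning package. The function name adaptive_threshold, the parameter names window and background, the sliding per-pixel window (rather than fixed tiles), the edge padding, and the sample row are all assumptions made for illustration.

```python
import numpy as np

def adaptive_threshold(gray, window=3, background=10):
    """Mean-based adaptive threshold (illustrative sketch).

    window:     side length of the square neighborhood (an odd number)
    background: constant subtracted from each local mean
    Returns 0 (black) where a pixel is below its local threshold, else 255 (white).
    """
    gray = gray.astype(float)
    half = window // 2
    padded = np.pad(gray, half, mode="edge")  # repeat border pixels outward
    out = np.empty(gray.shape, dtype=np.uint8)
    for r in range(gray.shape[0]):
        for c in range(gray.shape[1]):
            patch = padded[r:r + window, c:c + window]
            local_t = patch.mean() - background
            out[r, c] = 0 if gray[r, c] < local_t else 255
    return out

# Paper darkening under a shadow, with ink at index 1 (110) and index 6 (20).
scan = np.array([[240, 110, 200, 180, 160, 140, 20, 100]])

clean = adaptive_threshold(scan, window=3, background=10)
washed_out = adaptive_threshold(scan, window=3, background=80)
print(clean)       # [[255   0 255 255 255 255   0 255]]
print(washed_out)  # [[255 255 255 255 255 255 255 255]]
```

With a small background value, both ink pixels come out black and all the paper comes out white, even in the shaded part. Raising the background value to 80 lowers every local threshold so far that the ink is set to white as well, which matches the behavior described above.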

Conclusion

Thresholding is a vital tool in raster to vector conversion especially when it comes to scanning colored images to convert them into technical drawings.

Typically, technical drawings (such as floorplans) have a black foreground (object edges) and white background. By applying thresholding on images, the images can be properly converted into vector files that can be easily used in CAD programs and even CNC machines.

Having gone through this post, you should now understand which type of thresholding to choose, and why, when converting different raster files to vector files.