A raster to vector conversion (vectorization) aims to fulfill two objectives. The first entails converting paper-based line drawings into CAD-formatted numeric data that can be reused. The second is to enable pre-processing in order to make symbols much more pronounced such that there’s somewhat of a match between the output and input images.
The conversion relies on algorithms, which are in plenty. Notably, when Osamu Hori and David Doermann were preparing their study, in 1995, more were still being published in the research literature.
However, Hori and Doermann observed that although researchers were still coming up with vectorization algorithms, there was a problem because, at that time, no quantitative analysis system existed to evaluate how the raster to vector algorithms performed. Instead, most of the analysis took a qualitative approach. The downside to this approach is that the resultant evaluations were mostly subjective.
Hori and Doermann, therefore, established a research gap, which they explored in their study titled Qualitative Measurement of the Performance of Raster-to-Vector Conversion Algorithms. They probed three approaches to vectorization, namely thinning, medial line finding, and line fitting, using a proposed methodology that delineated existing algorithms’ shortcomings.
Thus, their contribution to the world of research was the methodology used to evaluate the performance of algorithms through quantitative analysis for raster to vector conversions.
This article details what this methodology entailed, how it probed the algorithms’ performance, and how using it to evaluate the algorithms demonstrated their shortcomings.
The researchers anchored their research approach on several key observations. First, they noted that no generic measurement exists that doesn’t depend on the intended application. To this end, they stated that evaluating an algorithm should follow this template by first stipulating the requirements of the application for which the converted image will be used.
The authors’ main aim was to develop a way to evaluate the raster-to-vector conversion’s output. They would do this by ensuring that the input data uses similar parameters as the output system. This would ease the comparison aspect of the evaluation since both the outputs’ and inputs’ characteristics would have similar constraints. In this regard, it would be possible to establish how well the output conformed to the ideal input representation.
Hori and Doermann’s proposed evaluation method involved the following steps:
Specification of the ideal input data.
For the input data to be considered ideal, it had to contain patterns that conformed to the requirements of the application for which it would be used, i.e., whether it would be used for electrical or mechanical purposes. We’ll detail why this matters later.
Perturbation entails using a constraint to alter the normal state of a system. In their case, Hori and Doermann used a perturbation model Haralick had developed in an earlier study. The model described how images are altered and how the altered (perturbed) images are generated from the ideal input data. The alteration would be through blurring, skewing, binarization, etc.
The primary purpose of utilizing the perturbation model was to obtain data that represented the application domain. For their study, the perturbation was meant to mimic/simulate the effects of photocopying binary images. Notably, the model was an algorithm.
The perturbed images were then vectorized, and the output data was compared to the input data. This is where the purpose or application of the vectorized image comes in. For their study, Hori and Doermann concentrated on mechanical and electrical applications.
In electrical diagrams, the exact location and thickness of lines are not required. Instead, what an electrical engineer or draughtsman would look at is the connections between electrical symbols such as transistors and diodes. But this is not the case with regards to mechanical drawings. In mechanical drawings, the midline and thickness of a line image should be as accurate as possible.
Given the differences between mechanical and electrical diagrams, it follows that the deviation of the input and output images would be different as well. For the former, the difference in line thickness and position were considered as the deviation/error. On the other hand, the latter’s deviations were measured through the errors made in recognizing the connections. The summary of these three steps is shown in the image below.
System of Evaluating Performance
As stated earlier, the framework that Hori and Doermann used was based on Haralick’s framework. They instantiated Haralick’s work, thereby tweaking it for the performance evaluation application. The result was a system that operated was made up of 4 processes.
Specification of the application output feature set
The researchers focused their approach on mechanical engineering drawings, which, as you know, hold paramount the accurate representation of a line. Nonetheless, a line is defined by different parameters, which, when combined, make it as accurate as possible, especially during and after the vectorization process.
The primary parameters include:
- 2 end-points
- The line thickness
- Whether the line is solid, dotted, or broken
For their study, Hori and Doermann included additional parameters because they had observed that the three primary ones were not sufficient to evaluate the algorithms’ performance. To this end, they added crossing points, corners, and T-junctions. These additional elements significantly affect the line quality and act as feature points in line drawings.
Generation of Artificial CAD Models
Hori and Doermann used input images containing two types of lattice patterns, namely rectangular and parallelogram (shown in the image below). They contended that indeed lattice patterns did not feature prominently in mechanical engineering drawings. However, lattice patterns were made up of all the requisite elements for evaluation, i.e., corners, T-junctions, end-points, and crossing points. Thus, their inclusion in the study was warranted.
Notably, the eventual evaluation process would take into account each of these elements’ positions in addition to the line thickness.
The rectangular lattice pattern comprised three line thicknesses (3 pixels, 5 pixels, and 7 pixels). On the other hand, the parallelogram pattern, which was meant to test how algorithms handle angles, had three different angles (30°, 38°, and 45°).
The CAD models were generated using algorithms that utilized one of three vectorization approaches: line fitting, medial line finding, and thinning.
Line Fitting Algorithms
A line fitting algorithm takes thin lines and parallel contours into account. It then chooses the resultant vector by finding the best fitting line among all the thin lines and parallel contours.
Medial Line Finding Algorithms
A medial line finding algorithm first retrieves all image contours as pixel chains, which it then approximates using polygons. It then defines the vectors by finding the mid-points of two parallel contours. Notably, these mid-points are essentially the mid-points of perpendicular lines joining the parallel contours.
However, this algorithm is somewhat deficient because it doesn’t extract all the mid-points, given that parallel contour lines are not usually present at corners, crossing points, and T-junctions. Additionally, to again identify these T-junctions, corners, and crossing points, post-processing is required.
The thinning algorithms use an approach that removes pixels until only the pixel chain whose width is one unit remains. Such an algorithm then converts this thin chain into vectors. This approach does not require post-processing because T-junctions and crossing points appear as points where various end-points meet.
Generation of Synthetic Image Data from CAD Models
This process relied on the X-windows library and the numeric CAD data. Notably, X-windows is a system that provides a graphical user interface across multiple computers, thereby making it possible for various users in different networks to link their work.
But that’s not why Hori and Doermann used the X-windows library. Their main reason was that X-windows has customizable graphical capabilities that enable users to create icons and bitmaps, draw lines or circles, fill shapes, etc. Simply put, it helped them create bitmap image data from line images.
But that’s not all. The creation of the bitmap image also relied on the numeric CAD data, which was manipulated using a perturbation model in order to simulate real-world conditions. In their case, the model degraded the quality of the images by introducing noise. The image below shows the degraded image quality because of perturbation.
The combination of these various conditions, models, and sub-processes created synthetic image data from the CAD models generated in the previous step.
Processing and Evaluation
The synthetic images were compared with ideal CAD models. Notably, because the ideal models were perturbed when generating synthetic images, differences arose. Therefore, the evaluation process focused on establishing the magnitude of the differences using feature points – corners, T-junctions, crossing points, and end-points – and the thickness and length of lines as the basis.
The evaluation of the feature points and lines provided insight into the algorithms’ performance, albeit indirectly. Why so? Well, firstly, a matching ratio had to be obtained for the lines and feature points.
The point matching ratios were derived by first matching each input feature point to the nearest output feature point within a circle whose radius was given by the formula:
t=w2+e where t is the radius, w is the thickness, and e is the allowable margin of error. The image below shows the circle, input and output, lines, and a corner.
The distance (Euclidean distance) between the input feature point and the matching output feature point was calculated as the deviation.
On the other hand, the line matching ratio was calculated by first matching the output lines and the input lines. Secondly, the length of both lines was measured. In this regard, the line matching ratio refers to the ratio of the total length of the output lines to the corresponding input lines’ total length. The distance between them is the line deviation.
Hori and Doermann did not observe any difference with regards to the positional deviations among the three algorithms.
However, the medial line finding algorithm was heavily impacted by image quality degradation. At zero noise, the T-junctions and crossing point had a point matching ratio of 100% when the 3-pixel line thickness was analyzed. But the percentage went down to 82% and 62% for the 5- and 7-pixel line thicknesses, respectively, when noise was introduced. The researchers concluded that this algorithm wasn’t ideal for images whose quality is questionable.
The researchers observed a point deviation with the parallelogram lattice pattern, especially when the line fitting algorithm was used to extract T-junctions. The thinning algorithm was even more impacted because of the acute angles and the fact that the thinning process introduces some errors with regards to the location of the T-junctions. Thus, with the thinning algorithm, the T-junctions and crossing points were greatly distorted.
The parallelogram lattice was also problematic for the medial line finding algorithm since the end-point joining algorithm had trouble joining end-points. This resulted in the distorted lines shown below.
Through their study, Hori and Doermann came up with a quantitative method of analyzing the performance of algorithms used for raster to vector conversion. Besides filling an existing research gap, they contributed a methodology that can be extended to analyze algorithms that extract symbols, circles, arcs, and strings. Such algorithms are common in analyzing engineering line drawings.