Raster to vector glossary

The material in this glossary is the copyright of Softcover International Limited. If you want to use this glossary on your web site, please email postmaster@softcover.com. This glossary may not be copied or used in full or in part without our written permission, an acknowledgement of Softcover International Limited's copyright and a link to this website. Thank you!


1-bit - raster image containing 2 colors, usually black and white. This is the norm for raster to vector conversion. A 1-bit image is said to have a color depth of 2.

4-bit - raster image that can contain up to 16 colors. A 4-bit image is said to have a color depth of 16.

8-bit (grayscale) - raster image containing up to 256 shades of gray ranging from pure black (0) to pure white (255) with 254 gray shades in between. An 8-bit image is said to have a color depth of 256.

8-bit (color) - raster image containing up to 256 colors. An 8-bit image is said to have a color depth of 256.

24-bit - raster image that can contain up to around 16.7 million colors. A 24-bit image is said to have a color depth of 16.7 million.

If you want to do raster to vector conversion you should normally avoid 24-bit images.

Many scanners - especially desktop scanners - are set up to create 24-bit images by default. This means that it is easy to accidentally create 24-bit color scans even when scanning black and white drawings.

The disadvantage is that the file is many times larger than it needs to be (it contains "space" for 16.7 million colors even if only a few colors are used) and requires more processing than a black and white raster image.
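
As a rough illustration of the size difference, the uncompressed size of a scan can be estimated from the sheet size, the resolution and the bit depth. The sketch below is an estimate only: the function name, the 34 x 22" sheet and the 300 dpi setting are assumptions chosen for illustration, and saved files are usually smaller once compressed.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of a scan in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Hypothetical example: a 34 x 22 inch drawing scanned at 300 dpi.
one_bit = uncompressed_bytes(34, 22, 300, 1)
color = uncompressed_bytes(34, 22, 300, 24)
print(one_bit, color, color // one_bit)
```

Under these assumptions the 24-bit scan needs 24 times the memory of the 1-bit scan of the same drawing.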

32-bit - raster image that can contain up to around 4 billion colors. If you want to do raster to vector conversion you should avoid 32-bit images.

48-bit - raster image that can contain up to around 281 trillion colors. If you want to do raster to vector conversion you should avoid 48-bit images.

A

acquire - term used for scanning an image from a TWAIN compliant scanner directly into a software application.

adaptive threshold - see threshold.

AEC - architecture, engineering, construction. Raster to vector conversion is frequently used in AEC.

angle optimization - process in raster to vector conversion whereby lines that deviate slightly from specified angles (e.g. 90 degrees, 45 degrees) are re-aligned to exactly those angles.
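
The idea can be sketched as a simple snapping rule. This is a minimal illustration, not the method used by any particular converter: the function name, the list of snap angles and the 3 degree tolerance are all assumptions.

```python
def snap_angle(angle_deg, snap_angles=(0, 45, 90, 135), tolerance_deg=3.0):
    """Return a snap angle if the line is within tolerance of it, else the angle unchanged.

    Line directions are treated modulo 180 degrees, so 179 degrees is
    considered 1 degree away from 0 degrees.
    """
    angle = angle_deg % 180.0
    for target in snap_angles:
        diff = abs(angle - target)
        diff = min(diff, 180.0 - diff)
        if diff <= tolerance_deg:
            return float(target)
    return angle


print(snap_angle(88.5))   # close to 90, so it is re-aligned
print(snap_angle(30.0))   # not close to any snap angle, so it is left alone
```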

ANSI - American National Standards Institute. Defines ANSI paper sizes that are used by many US architects and engineers and are as follows:

A 11 x 8.5"
B 17 x 11"
C 22 x 17"
D 34 x 22"
E 44 x 34"

anti-aliasing - method of smoothing jagged pixel outlines that adds additional fainter pixels around the edges to create the illusion of a smooth outline. Mostly used in graphics software and in low resolution images designed for web display. Not recommended for images destined for raster to vector conversion.

arc - portion of the circumference of a circle.

architectural paper sizes - range of paper sizes often used by US architects:

A 12 x 9"
B 18 x 12"
C 24 x 18"
D 36 x 24"
E 48 x 36"

ASCII - American Standard Code for Information Interchange. A code where text characters are represented by numbers, for example the letter A = 65, B = 66 etc.
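
A short Python sketch of the same mapping, using the built-in ord and chr functions (the variable names are illustrative only):

```python
text = "AB"
codes = [ord(ch) for ch in text]            # characters to code numbers
restored = "".join(chr(c) for c in codes)   # code numbers back to characters
print(codes, restored)
```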

ASCII text file - file where the data is stored as ASCII characters. Sounds ghastly but is actually just a very simple text file of the type you would save from or open up in Windows Notepad. Also often known as a text file, plain text file or just an ASCII file.

aspect ratio - the relative proportion of width to height of an image.

artefact - any unwanted feature in a scanned image, like speckles or noise.

auto deskew - a function whereby a raster image that has been scanned in skew can be straightened out (deskewed) automatically. Note that deskewing a raster image decreases its quality, so it is better to scan the drawing in straight to begin with than to rely on an auto deskew function.

B

batch conversion - conversion of many raster images to vector automatically and consecutively without operator intervention.

Bezier curve - type of spline.

bit - abbreviation for "binary digit", either 0 or 1. Used in the description of color depth. For example:

  • In a 1-bit raster image each pixel is represented by one bit which can either be black (0) or white (1).
  • In an 8-bit image each pixel is represented by eight bits giving 256 combinations of 1s and 0s that represent 256 colors.
  • A 24-bit image uses three 8-bit channels (red, green and blue - RGB) making over 16.7m colors (256 x 256 x 256).

bit depth - the same as color depth.

bitmap - uncompressed raster file format with .BMP extension. Sometimes used as a synonym for raster image.

BMP - file extension for a bitmap.

buffer - an area of computer memory used to temporarily store data.

C

CAD - computer-aided design.

CALS - type of raster image commonly used by the military. CALS files can only be black and white (1-bit). There are two types of CALS file - Type 1 and Type 2. CALS Type 1 files contain one image only and are compressed using lossless fax group 4 compression. CALS Type 2 files can contain multiple images and may or may not be compressed. CALS files can have various extensions including .CAL, .GP4, .CG4 and .MIL.

CAM - computer-aided manufacture.

center line tracking - method of raster to vector conversion where the converter places vectors along the center of lines on the raster image as opposed to around the outsides (outline tracking).

[Figures: a raster line; center line tracking (vector shown in black); outline tracking (vectors shown in black).]

CG4 - one of several possible CALS file extensions.

capture - to record an image digitally either by taking a photograph or by scanning.

CMYK - A color model in which all colors are a mixture of Cyan, Magenta, Yellow and blacK. The standard color model used in full color offset printing or four-color printing.

CNC - computer numerical control. The automated, computer-controlled cutting of shapes and profiles from metal, etc. Raster to vector conversion is frequently used in CNC for converting raster images to vector outlines that can be cut with a CNC machine.

color depth - the number of colors that a device or raster image can hold or process. The greater the color depth, the greater the number of colors the image can potentially contain and the greater the file size. For example:

  • A raster image with a color depth of 2 (also called 1-bit) can contain up to 2 colors.
  • A raster image with a color depth of 16 (also called 4-bit) can contain up to 16 colors.
  • A raster image with a color depth of 256 (also called 8-bit) can contain up to 256 colors.
  • A raster image with a color depth of 16.7 million (also called 24-bit) can contain up to 16.7 million colors.
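
The relationship between bit depth and color depth in the list above is simply 2 raised to the number of bits per pixel. A minimal sketch (the function name is an assumption):

```python
def color_depth(bits_per_pixel):
    """Number of distinct colors that a pixel of the given bit depth can represent."""
    return 2 ** bits_per_pixel


for bits in (1, 4, 8, 24):
    print(bits, color_depth(bits))
```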

color gamut - the range of different colors that a device can capture, print or display. The more colors, the wider the color gamut.

color mode / model - how color and tonal information is represented. Grayscale, Lab, HSL, RGB and CMYK are examples of color models.

color reduction - reducing the number of colors on a raster image. This can be either automatic (you tell the software program how many colors you want and the program reduces the number of colors automatically) or manual (you manually group colors together and assign new colors to them).

Note that reducing colors is not the same as reducing color depth. For example, an 8-bit image with a color depth of 256 can include any number of colors up to a maximum of 256 colors. If you reduce the number of colors in the file it is still an 8-bit file unless you also reduce the color depth.
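
The distinction can be sketched in code. In this illustration a raster image is assumed to be a flat list of palette indexes; the function names and sample data are hypothetical.

```python
def colors_used(pixels):
    """Count the distinct colors actually present in the image."""
    return len(set(pixels))


def minimum_bits(pixels):
    """Smallest bit depth able to index every distinct color in the image."""
    return max(1, (colors_used(pixels) - 1).bit_length())


# Hypothetical 8-bit image that happens to use only three of its 256 colors.
pixels = [0, 17, 17, 200, 0, 200, 17]
print(colors_used(pixels), minimum_bits(pixels))
```

Here the image uses 3 colors and could be stored at a lower bit depth, but it remains an 8-bit file until the color depth itself is reduced.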

color space - subset of colors within a color model that encompasses a particular gamut (range of colors). An example is sRGB which describes a subset of colors within the RGB color model.

compressed file - file that has been reduced in file size by compressing the data it contains. The reduction in file size that can be achieved depends on the level of detail in the file and the compression method used. Note that compression only affects the size of the file when it is saved. As soon as you open it up in a software program such as your raster to vector converter, it expands back to its uncompressed size.

compression - technology that enables data to take up less space - a smaller file size makes data easier to store and faster to transmit.

Raster image files can be compressed in two ways: lossless and lossy. When lossy compression is used, file quality is reduced.

When you save a raster image it is either compressed or not depending on the file type you choose to save it as. For example:

  • If you save a raster image as a BMP file, it is not compressed. This means that its quality is retained but that the file takes up a lot of space on your hard disk.

  • If you save a raster image as a JPEG file, it is compressed using lossy compression. This means that the file loses quality but takes up less space on your hard disk. We do not recommend JPEG files for raster to vector conversion.

  • If you save a raster image as a TIFF (packbits compressed) or TIFF (fax group 4 compressed) file, it is compressed using lossless compression. This means that the file quality is retained and it takes up less space on your hard disk. We recommend TIFF (fax group 4) or TIFF (packbits) as the most suitable file types for raster to vector conversion.
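
To illustrate what lossless means, the sketch below uses a simplified run-length scheme. It is not the actual packbits or fax group 4 algorithm, only an assumption-laden example of the general idea that runs of identical pixels can be stored compactly and restored exactly.

```python
def rle_encode(row):
    """Encode a row of pixel values as [value, run_length] pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    """Rebuild the original row from [value, run_length] pairs."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row


# Hypothetical 40-pixel row from a 1-bit image: white, a short black line, white.
row = [1] * 20 + [0] * 3 + [1] * 17
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

The decoded row is identical to the original, which is what distinguishes lossless from lossy compression.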

control lines - lines that determine the shape of a Bezier curve.

control points (Bezier) - "handles" used to move control lines in order to edit the size and shape of Bezier curves.
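
As an illustration of how control points shape a curve, the sketch below evaluates a cubic Bezier curve, assuming the common four-point form: two end points (p0, p3) and two handles (p1, p2). The function name and sample coordinates are hypothetical.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Return the (x, y) point on a cubic Bezier curve at parameter t (0 to 1)."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t**3 * p3[1]
    return (x, y)


# The curve starts at p0, ends at p3 and is pulled towards the handles p1 and p2.
start, handle1, handle2, end = (0, 0), (0, 1), (1, 1), (1, 0)
print(cubic_bezier(start, handle1, handle2, end, 0.5))
```

Moving handle1 or handle2 changes the curve between the end points without moving the end points themselves.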

control points (warp) - source-destination point pairs that define how to warp a raster image. A source point is a point to be moved during a warp. Each source point has a corresponding destination point, which is the point that the warp process will try to move the source point to.

contrast - the range between the lightest and darkest tones in an image; high contrast means a larger difference in brightness between the lightest and darkest areas as compared to low contrast.

coordinates - a set of numbers used to identify the location of a point.

corner sharpening - process during raster to vector conversion where the converter sharpens corners:

[Figures: unsharpened corner; sharpened corner.]

crop - reduce the size of an image by removing its outer edges.

D

deskew - a function whereby a raster image that has been scanned in skew can be straightened out (deskewed). Note that deskewing a raster image decreases its quality, so it is better to scan the drawing in straight to begin with than to rely on a deskew function.

desktop scanner - small A4 - A3 flatbed scanner common in offices and homes.

despeckle - the automatic removal of unwanted dirt or noise from a raster image's background.
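
A very simple form of despeckling can be sketched as follows. This is an illustration under stated assumptions, not how any particular product works: the image is a list of rows in which 1 marks a black (data) pixel and 0 marks background, and only isolated single pixels are removed. Real despeckle tools typically remove speckles up to a chosen size.

```python
def despeckle(image, max_neighbors=0):
    """Remove black pixels (1) that have at most max_neighbors black neighbors."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += image[ny][nx]
            if neighbors <= max_neighbors:
                result[y][x] = 0  # isolated speckle: set to background
    return result


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],   # a lone speckle
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],   # a real line
]
cleaned = despeckle(image)
```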

destination point - see warp.

device driver - a software module that tells your operating system how to control a given piece of hardware, such as a scanner.

DGN - a type of vector file, the native file format of the CAD program MicroStation. Note that MicroStation can also read DXF files.

dithering - method of simulating gray tones or colors in black and white images by making them up out of dots. Some poorly scanned images or scans of poor quality drawings can look as though they are dithered when lines that ought to be solid are made up of dots. Raster to vector conversion will not work well on dithered images.

document management - the storage, indexing and control of digital documents.

dot - a unit used to represent the smallest element a printer can image. Often used as a synonym for pixel.

dots per inch - DPI. The resolution of a printed page, expressed in the number of printer dots in an inch. Often used as a synonym for pixels per inch.

DPI - Dots Per Inch.

DWG - a type of vector file, the native file format of the CAD program AutoCAD. Note that AutoCAD can also read DXF files.

DXF - Drawing Exchange Format (also known as Drawing Interchange Format), a type of vector file designed by Autodesk for exchanging drawings between different CAD programs. DXF files can be read by most CAD and CNC packages, including all versions of AutoCAD.
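
DXF is a plain text format made up of pairs of lines: a numeric group code followed by a value. The sketch below writes a stripped-down DXF containing only an ENTITIES section with LINE entities. The function name is hypothetical, and it is an assumption that a given CAD program will accept a file without the header and table sections that fuller DXF files contain.

```python
def lines_to_dxf(lines):
    """Return minimal DXF text with one LINE entity per (x1, y1, x2, y2) tuple."""
    out = ["0", "SECTION", "2", "ENTITIES"]
    for x1, y1, x2, y2 in lines:
        out += [
            "0", "LINE",        # entity type
            "8", "0",           # layer name
            "10", str(x1),      # start point x
            "20", str(y1),      # start point y
            "11", str(x2),      # end point x
            "21", str(y2),      # end point y
        ]
    out += ["0", "ENDSEC", "0", "EOF"]
    return "\n".join(out) + "\n"


dxf_text = lines_to_dxf([(0, 0, 100, 0), (100, 0, 100, 50)])
print(dxf_text)
```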

E

E size / E-size - large US paper size, closely equivalent to ISO A0 size.

EDM - Electronic Document Management. A computer based system used for managing electronic and paper-based documents; a means of checking documents and versions into a system and searching for them.

EDRMS - Electronic Document and Records Management System; allows records management (RM) rules to be applied to electronic documents as they are entered into such a system.

EMF - Enhanced Metafile Format, a type of vector file that extends the older Windows Metafile Format (WMF).

F

fax group 3 - type of lossless compression used to compress 1-bit (black and white) TIFF files. Also called CCITT group 3 or G3. It was originally used for encoding fax transmissions.

fax group 4 - type of lossless compression used to compress 1-bit (black and white) TIFF files and CALS files. Also called CCITT group 4 or G4. It was originally used for encoding fax transmissions. Fax group 4 is the type of compression most commonly used for compressing 1-bit TIFF files.

file format - description of the type of file; BMP, TIFF, GIF and JPEG are examples of different file formats.

flatbed scanner - scanner in which the sheet to be scanned is laid flat on a glass surface and the light source moves below the glass.

font training - allows you to train a program that does OCR to recognize a font that it doesn't recognize by default, e.g. a hand-written font. Often uses neural network technology.

G

gap jumping - process during raster to vector conversion where the converter jumps gaps in the raster image in order to create unbroken vectors.
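
The principle can be sketched in one dimension. In this simplified illustration, segments lying along the same line are given as (start, end) positions and are joined whenever the gap between them is no larger than a chosen maximum. The function name and the max_gap parameter are assumptions, not settings from any particular converter.

```python
def jump_gaps(segments, max_gap):
    """Merge collinear (start, end) segments separated by gaps of at most max_gap."""
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = max(merged[-1][1], end)   # bridge the gap
        else:
            merged.append([start, end])
    return [tuple(segment) for segment in merged]


# The 2-unit gap is jumped; the 10-unit gap is kept as a genuine break.
print(jump_gaps([(0, 10), (12, 20), (30, 40)], max_gap=3))
```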

georeferencing - process of matching up points on a raster or vector image with actual geographic coordinates.

geoTIFF - a type of TIFF raster file that contains geographic information such as coordinate and scaling data.

GIF - Graphics Interchange Format. A 256 color raster file format that uses lossless compression, usually used for web graphics. Not normally recommended for raster to vector conversion as GIF files are typically too low resolution to be useful.

GIS - geographic information systems.

GP4 - one of several possible CALS file extensions.

grayscale - see 8-bit (grayscale).

H

hand-held scanner - small scanner moved by hand; briefly popular but now less so due to problems of movement, quality and the stitching of several small images into a larger one.

hidden layer - the part of a neural network that does the learning during font training. It takes example characters from the input layer and learns to match them up with the characters you are training the neural network to recognize, which are listed in the output layer.

HPGL - Hewlett-Packard Graphics Language, a type of vector file created to drive Hewlett-Packard plotters.

I

input layer - the part of a neural network that presents example characters to the hidden layer for learning during font training.

interpolation - a method of increasing or decreasing the resolution of a raster image by adding or removing pixels. When the resolution is increased, new pixels are added between existing pixels and their color is calculated on the basis of the colors of the pixels surrounding them.
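
A minimal sketch of increasing resolution by linear interpolation along a single row of gray values. The function name is hypothetical, and real software usually interpolates in two dimensions and may use more sophisticated methods.

```python
def upsample_row(row, factor):
    """Insert factor - 1 linearly interpolated pixels between each pair of pixels."""
    out = []
    for i in range(len(row) - 1):
        a, b = row[i], row[i + 1]
        for step in range(factor):
            out.append(round(a + (b - a) * step / factor))
    out.append(row[-1])
    return out


print(upsample_row([0, 100, 200], 2))
```

The new pixels are calculated from their neighbors, so they add no detail that was not already in the original image.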

IMG - type of raster file format.

ISO - International Organization for Standardization. Defines ISO paper sizes, which are used in Europe and are as follows:

A4 210 x 297mm
A3 297 x 420mm
A2 420 x 594mm
A1 594 x 841mm
A0 841 x 1189mm
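
Each size in the table above is obtained by halving the longer side of the next size up and rounding down to a whole millimeter. A minimal sketch (the function name is an assumption):

```python
def iso_a_series(count=5, a0=(841, 1189)):
    """Return a dict of ISO A sizes in mm, derived by repeatedly halving A0."""
    sizes = {}
    width, height = a0
    for n in range(count):
        sizes[f"A{n}"] = (width, height)
        # The old width becomes the new height; half the old height becomes the new width.
        width, height = height // 2, width
    return sizes


sizes = iso_a_series()
for name, (w, h) in sizes.items():
    print(name, w, "x", h, "mm")
```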

J

JPEG / JPG - Joint Photographic Experts Group. A compressed raster file format. Compresses images by decreasing image quality. Not recommended for scanned CAD drawings.

K

L

layers (neural network) - neural networks are made up of three parts called layers - the input layer, the hidden layer and the output layer.

large format - any drawing size bigger than A3. Aka wide format.

large format scanner - any scanner larger than A3. Aka wide format scanner.

landscape - orientation of a drawing in which the longest dimension is horizontal.

legacy data - valuable historic data, documents and technical drawings.

legal size - US paper size measuring 8.5 x 14". It is slightly wider and noticeably longer than A4; the closest US equivalent of A4 is letter size (8.5 x 11").

line art - images typically consisting of black and white lines or solid areas.

lossless compression - see compression.

lossy compression - see compression.

M

mark-up - the addition of comments onto a scanned drawing or CAD file, also called annotation or redlining.

MIL - one of several possible CALS file extensions.

monochrome - may mean black and white (two colors) or black and white AND grayscale.

N

negative image - raster image where there are white lines on a black background.
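
A negative image can be turned into a positive image by inverting every pixel. A minimal sketch for a 1-bit image held as a list of rows of 0s and 1s (the function name is hypothetical):

```python
def invert(image):
    """Swap the two colors of a 1-bit image held as rows of 0s and 1s."""
    return [[1 - pixel for pixel in row] for row in image]


negative = [[0, 1, 0], [0, 1, 0]]
positive = invert(negative)
print(positive)
```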

neural network - a technology that works in an analogous way to the human brain to learn to recognize shapes and patterns from a range of examples called a training set. Some raster to vector converters use neural network technology for font training.

nodes - components of a neural network.

noise - undesired interference with the conversion of light during digital capture causing visible disruption to the integrity of the image; seen on a scanned image as dirt or speckles against what should be the drawing's clear white background.

O

OCR - Optical Character Recognition. The automatic process of converting scanned characters into editable ASCII characters or CAD text. Without OCR, the scanned characters would be made up of lots of little lines and arcs and would be much more difficult to edit.

Optical Character Recognition - see OCR.

optical resolution - the number of pixels per inch recorded by a scanner's image sensor. The greater the optical resolution, the greater the detail that can be captured from the image.

outline - see center line tracking.

output layer - the set of characters you are training a neural network to recognize.

P

palette - the range of colors contained in a raster image. Only raster images with color depths of 256 or less are said to have palettes.

PCX - type of raster file format rarely used now.

PDF - Portable Document Format file. An open standard Adobe file format. Allows the representation of scans, CAD drawings, etc., to be shared with other people but not edited. There are two types of PDF file, raster and vector. Raster PDF files are normally created by scanning a paper drawing and saving it as PDF. Vector PDF files are normally created by saving PDF from a CAD program. Raster PDF files can be converted to CAD vector formats such as DXF using raster to vector conversion.

pixel - the smallest element in a raster image; refers to a single square within a scan or digital photograph. Each pixel is colored or white and the different colored pixels make up the picture on the scan or photograph.

[Figure: raster image made up of pixels. The line is represented by black colored pixels.]

pixels per inch - PPI. The number of pixels per inch in a raster image, a more accurate term than dpi (dots per inch) as scanners capture square pixels not round dots.

planetary scanner - essentially a mounted camera that takes pictures of documents placed beneath it. It is particularly suitable for fragile documents as it requires less contact with them than other types of scanner. Planetary scanners are often used for scanning rare books as these can be scanned spine down and do not need to be pressed flat against the scan glass.

PNG - Portable Network Graphics file. A lossless raster file format often used as a modern alternative to Graphics Interchange Format (GIF).

polyline - a single entity formed by joining many shorter lines and arcs into one.

portrait - orientation of a drawing in which the longest dimension is vertical.

positive image - image where there are black lines on a white background.

PPI - pixels per inch.

Q

R

R2V - raster to vector conversion.

raster (raster image, raster file) - an image made up of rows and columns of pixels of two (black and white) or more colors. Scanners save images as raster files. Typical raster file formats are BMP, TIFF and JPEG.

raster to vector conversion - R2V. Process of converting raster files, which cannot be edited in CAD, into CAD-editable vector files such as DXF. Often done automatically by raster to vector conversion software. Aka vectorization.

rasterization - process of converting vector images into raster files. The opposite of raster to vector conversion.
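
As an illustration, a single vector line can be rasterized with Bresenham's line algorithm, a widely used integer method. This is a sketch of one well-known technique and not necessarily what any given program uses; the function name is hypothetical.

```python
def rasterize_line(x0, y0, x1, y1):
    """Return the list of (x, y) pixels covering a line between two integer points."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:      # step in x
            err += dy
            x0 += sx
        if e2 <= dx:      # step in y
            err += dx
            y0 += sy
    return points


print(rasterize_line(0, 0, 3, 3))
```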

reduce color depth - reduce the number of colors an image can contain - see color depth.

reduce colors - see color reduction.

redlining - the addition of comments onto a scanned drawing or CAD file; also called annotation or mark-up.

repro - reprographics or reproduction. The business concerned with large format color scanning, copying, sign-making and printing.

resampling - changing the resolution of an image.

resolution - the number of pixels per inch of a raster image.

When you scan a drawing you can choose what resolution you want to scan it at. 200 to 400 pixels per inch (normally referred to as dots per inch or dpi) is optimal for most architectural and engineering drawings. If you are scanning small artwork or logos, you will probably need to scan at a higher resolution. If you scan at too low a resolution, drawing detail will be lost. If you scan at too high a resolution you will create a file that is larger than it needs to be and will start picking up unwanted detail such as paper weave.

While you can decrease the resolution of an image after scanning, you cannot increase it and regain lost detail.
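
One way to judge a scan resolution is to work out how many pixels wide a drawn line will be. The sketch below does this arithmetic; the function name and the 0.25 mm line thickness are assumptions chosen for illustration.

```python
MM_PER_INCH = 25.4


def pixels_across(length_mm, dpi):
    """Number of pixels that a length in millimeters covers at a given resolution."""
    return length_mm / MM_PER_INCH * dpi


# Hypothetical 0.25 mm pen line at several scan resolutions.
for dpi in (100, 200, 400):
    print(dpi, round(pixels_across(0.25, dpi), 2))
```

Under these assumptions the line is less than one pixel wide at 100 dpi, which is where fine detail starts to be lost, and about two pixels wide at 200 dpi.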

revision control - the management of multiple revisions of a design.

RGB - Red, Green, Blue. A color model where colors are made up of different proportions of red, green and blue. Used by display devices like monitors as well as scanners and many printers.

RIP - Raster Image Processor. RIP software bypasses a printer's driver to take control of the printer directly to do a better job of color management and to speed up printing.

rotation - the process of aligning a raster image into its true orientation.

rubber sheeting - the same as warp.

S

Scan-to-File - scanning drawings and saving them directly into a folder, often with incrementally updated file names e.g. file1.tif, file2.tif, etc.

Scan-to-Copy - scanning images specifically for the purposes of printing them out; aka Copy-to-Print, Scan-to-Print.

Scan-to-Print - see Scan-to-Copy.

scanner - device for capturing images or text and converting them to a digital raster image.

simple threshold - see threshold.

source point - see warp.

speckles - unwanted dirt on a raster image's background.

speckle removal - the automatic removal of unwanted dirt or noise from a raster image's background.

spline - a smooth curve that is easily edited by moving control points attached to its ends.

T

thinning - a process often used transparently during the raster to vector conversion process. During thinning, lines on the raster image are made thinner, often until only the center of the original line remains. There are many different methods for thinning.


[Figures: original image; thinned image.]

threshold - level that determines which pixels will be black in a raster image (data) and which will be white (background). If the threshold is set too low, too much of the drawing will be black and text and lines will merge into each other. If the threshold is too high, parts of the drawing will be set as background and will be lost. In a simple threshold, one level is used for the whole drawing. In an adaptive threshold, the drawing is divided up into local areas and a different level is used in each local area. This gives better thresholding results on faint and dirty drawings.
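
The difference between simple and adaptive thresholds can be sketched as follows. Several things here are assumptions made for illustration: gray values run from 0 (black) to 255 (white), the level is compared against each pixel's darkness (255 minus its value) so that a lower level makes more pixels black, and the adaptive version uses the mean darkness of each square block plus an offset. Real software may use different conventions and methods.

```python
def simple_threshold(gray, level):
    """Mark a pixel as data (1) when its darkness (255 - value) reaches level."""
    return [[1 if 255 - v >= level else 0 for v in row] for row in gray]


def adaptive_threshold(gray, block, offset):
    """Threshold each block x block area against its own mean darkness plus offset."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for by in range(0, height, block):
        for bx in range(0, width, block):
            cells = [(y, x)
                     for y in range(by, min(by + block, height))
                     for x in range(bx, min(bx + block, width))]
            mean_dark = sum(255 - gray[y][x] for y, x in cells) / len(cells)
            for y, x in cells:
                if 255 - gray[y][x] >= mean_dark + offset:
                    out[y][x] = 1
    return out


# Hypothetical row: a faint line on clean paper (left), a dark line on dirty paper (right).
gray = [[250, 180, 250, 250, 150, 40, 150, 150]]
print(simple_threshold(gray, 60))    # low level: the dirty background turns black
print(simple_threshold(gray, 120))   # high level: the faint line is lost
print(adaptive_threshold(gray, block=4, offset=20))
```

In this example no single level keeps both lines while rejecting the dirty background, whereas the block-by-block version recovers both.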

thresholding - useful scanner or software tool for tidying up an image by cleaning (whitening) the background while strengthening (blackening) the foreground drawing detail.

tiling - the process of taking many small images and placing them together to create one large image.

TIFF, TIF - Tagged Image File Format. A raster file format capable of storing compressed and uncompressed black and white, grayscale and color images without data loss. There are many different types of TIFF file including fax group 4.

training set - a set of example characters from which a neural network learns to recognize a font during font training.

transoptipolation - resolution that uses a mix of optical resolution in one direction and interpolated resolution in another.

TWAIN - standard software protocol regulating communication between software applications and popular imaging devices like scanners. If you have a scanner and it is "TWAIN compliant" (virtually all scanners are TWAIN compliant these days) you will be able to use it to scan directly into a software application that supports TWAIN. This means that after you scan, the scanned image will appear already open in the application. If your scanner were not TWAIN compliant, you would need to scan using the scanner's supplied software, save to file (TIFF, BMP or similar), then load the file into your application using your application's "Load" or "Open" command.

U

V

vector - a scalable image that uses mathematical coordinates to define the start and end points of each line, etc. Unlike a raster image, it can be edited in CAD.

vectorization - see raster to vector conversion.

vectorization settings - the variables that control a raster to vector conversion. For successful conversion, different images and applications need different vectorization settings. For example, architectural floorplans tend to be made up of horizontal and vertical lines that the raster to vector converter needs to try to keep exactly horizontal and vertical. Maps on the other hand are made up of curves and lines that go in all directions that the raster to vector converter must not try to make exactly horizontal and vertical.

W

warp - a process whereby a raster image is deformed by moving specified points on the image (source points), together with all the parts of the image attached to those points, to new positions (destination points). It can be likened to sticking pins through a stretched rubber sheet and then moving the pins. As you move the pins, the rubber sheet is deformed.
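
The rubber sheet analogy can be sketched by working out where an arbitrary point ends up, given a set of source-destination control point pairs. The sketch below uses inverse distance weighting, which is only one of several possible approaches and is an assumption here, as are the function name and the power parameter. It expects at least one control point pair and moves point positions only; a full warp would also resample the image pixels.

```python
def warp_point(point, control_pairs, power=2.0):
    """Move a point by a distance-weighted average of the control point displacements."""
    px, py = point
    total_w = dx = dy = 0.0
    for (sx, sy), (tx, ty) in control_pairs:
        dist_sq = (px - sx) ** 2 + (py - sy) ** 2
        if dist_sq == 0:
            return (float(tx), float(ty))   # a source point lands on its destination
        w = 1.0 / dist_sq ** (power / 2)    # nearer pins pull harder
        total_w += w
        dx += w * (tx - sx)
        dy += w * (ty - sy)
    return (px + dx / total_w, py + dy / total_w)


# Hypothetical pins: one held in place at (0, 0), one moved from (10, 0) to (12, 0).
pairs = [((0, 0), (0, 0)), ((10, 0), (12, 0))]
print(warp_point((10, 0), pairs))
print(warp_point((5, 0), pairs))
```

A point halfway between the two pins moves by half the displacement of the moved pin, much as the middle of a rubber sheet would.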

wide format - any drawing size bigger than A3. Aka large format.

wide format scanner - any scanner larger than A3. Aka large format scanner.

WMF - Windows Metafile Format, a type of vector file.

workflow - a description of business practice and processes. Workflow products automate repetitive tasks, specifically the processing of documents and drawings through a business.

X

x-coordinate - the value that tells you how far from the origin a point is on the horizontal, or x-axis.

Y

y-coordinate - the value that tells you how far from the origin a point is on the vertical, or y-axis.

Z

zoom - increase the magnification factor on a selected portion of a viewed image.