SCAN-TO-PDF: Benefits and Considerations

Scan-to-PDF

Scan-to-PDF is something that a growing number of CAD users want to do with a wide format scanner. PDF has gradually displaced traditional raster image formats such as BMP, TIFF and JPEG for sharing drawings, archiving and other purposes.

All scanning software supplied with the major makes of large format scanners now saves scans as PDF files. While this highlights the growing popularity of the PDF file format among CAD and scanner users, what exactly are the benefits of saving scans as PDF files, and do these benefits extend to scanned technical drawings?

PDF - What and why?

Adobe Systems Inc have been historic innovators in the graphics software market. In 1993 they created the Portable Document Format (PDF) for document exchange. An archival subset, PDF/A-1, was standardized as ISO 19005-1 in 2005, and the full format followed as ISO 32000-1 in 2008.

Adobe's PDF file format was originally intended for general purpose DTP graphics and office work, like saving artwork and marketing brochures as small files for email or web use, as well as converting general office documentation, like orders and invoices for storage. CAD functionality came late but its use in CAD is now widespread. Despite the best efforts of Autodesk to promote their DWF file format as a true CAD file viewing alternative to PDF, many AutoCAD users, perhaps the majority, choose to use Adobe's popular and easily viewable file format.

Why do CAD users want to save drawings as PDF files?

They do so for the following benefits, all of which are true.

  • PDF is popular, ubiquitous and free to view.
  • PDF files can be shared with customers and colleagues.
  • PDF is supported by a plethora of inexpensive writers and readers.
  • PDF files can be marked up with comments without affecting the original data.
  • PDF files enjoy flexible printing controls.

There are other perceived benefits that may or may not be true. These are:

  • PDF files retain some intelligent vector CAD information
  • PDF files are secure
  • PDF files are small and are suitable for emailing and web usage

Let's examine these perceived benefits from the standpoint of scan to PDF creation and raster to vector conversion.

Some PDF files include intelligent CAD information

There are essentially three types of PDF file that are used in CAD work: vector, raster and a mixture of both (hybrid).

Vector PDF

Vector PDF files are created inside a CAD program. Historically, you would use any one of a number of third party programs to save (or plot) CAD drawings to PDF files.

The most obvious product to use to turn a CAD drawing into a vector PDF file is Adobe Acrobat, but as this is generally regarded as an expensive option, the majority of CAD users use something more affordable, like AcroPlot Pro or Bluebeam PDF Revu, two professional alternatives. There are also many other solutions that are significantly cheaper or free. Today, many CAD programs will save PDF directly.

To all intents and purposes vector PDF files look as crisp and sharp as any DWG, DXF or DGN vector file when you zoom into them. Depending on the software used to create them, vector PDF files can contain some intelligent CAD information which can be accurately extracted by a competent PDF to DXF converter for easy editing in CAD.

Raster PDF

Raster PDF files are created when a technical drawing or document is scanned and saved as PDF using the scan-to-email or scan-to-PDF options in scanning software.

While scan-to-PDF users gain some of the benefits of saving PDF, such as ease of viewing and printing, for all practical purposes a scan-to-PDF file is just a raster file, like a TIFF or JPEG, enclosed in a PDF wrapper. It is a flat, unintelligent scanned image made up of thousands of square pixels each representing black or white or various colors. A raster file is as "dead" as the original drawing on paper.

The image quality of a scan-to-PDF file is subject to exactly the same requirements that determine the quality of any other type of scanned image, especially where automatic raster to vector conversion is concerned. Make a poor quality scan and you save a poor quality PDF file. The poorer the PDF file's scanned image quality, the less likely you are to get a good quality raster to vector conversion.

Hybrid PDF

Hybrid PDF is created within a CAD or graphics program by combining vector CAD or text with a scanned image, like a map or photo. The vector CAD or text data can be accurately extracted by PDF to CAD conversion but the raster is more difficult and is subject to the image quality constraints of raster to vector conversion.

PDF files are secure

Many companies thought PDF secure and saved their valuable proprietary CAD drawings as PDF files because they did not want them to be copied.

Today, there is an entire software industry dedicated to supplying programs that will convert vector PDF files to DXF or DWG. Because vector PDF files contain vectors, i.e. real entities like lines, they are relatively easy to convert back to CAD.

Because they are made up of pixels and lack any intelligence, raster PDF files are more difficult to convert into accurate CAD drawings than vector PDF files. Raster PDF files are scanned images made up of thousands of black and white or colored squares (pixels). They either have to be redrawn by hand or interpreted back to lines, arcs, circles, etc. using raster to vector conversion software.

Because of the ease with which vector PDF is converted back to CAD and the difficulties of converting raster PDF files to CAD, some companies now supply scan-to-PDF files as a means of frustrating those who would copy them.

PDF files are small and are suitable for emailing and web usage

The size of a raster PDF file is determined by the size of the drawing; the amount of detail on it; its scan resolution; whether it is scanned in monochrome, grayscale or color; the compression format used to save the scan within the PDF wrapper; and the scanning software used to create the PDF file.

Depending on the nature of the drawing you are scanning and your future requirements - for example viewing, archiving, printing or conversion to CAD - you may need to create a PDF file that is much larger than one that can be easily emailed.

Size of the drawing

All other factors being equal, larger drawings will create larger PDF files.

Amount of detail on the drawing

The more detailed the drawing, the bigger the file. We recently scanned two A1 monochrome architectural drawings on the same scanner at 300 dpi.

Drawing A (less detailed) - 134 KB PDF file
Drawing B (more detailed) - 404 KB PDF file
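
As a rough illustration of why detail drives file size, the sketch below builds two synthetic one-bit pages and compresses them losslessly. It uses Python's zlib as a stand-in for whatever lossless compression the scanning software applies, and the page size, segment counts and helper names are all invented for the example, so only the direction of the result carries over, not the figures.

```python
import random
import zlib

WIDTH, HEIGHT = 800, 600  # hypothetical page size in pixels (WIDTH is a multiple of 8)


def make_page(segment_count, seed):
    """Return a 1-bit page (0 = white, 1 = black) with random straight segments."""
    rng = random.Random(seed)
    page = [[0] * WIDTH for _ in range(HEIGHT)]
    for _ in range(segment_count):
        length = rng.randint(20, 200)
        x = rng.randrange(WIDTH - length)
        y = rng.randrange(HEIGHT - length)
        if rng.random() < 0.5:
            for dx in range(length):  # horizontal segment
                page[y][x + dx] = 1
        else:
            for dy in range(length):  # vertical segment
                page[y + dy][x] = 1
    return page


def pack_bits(page):
    """Pack eight pixels into each byte, as a monochrome bitmap would."""
    out = bytearray()
    for row in page:
        for i in range(0, len(row), 8):
            byte = 0
            for bit in row[i:i + 8]:
                byte = (byte << 1) | bit
            out.append(byte)
    return bytes(out)


def compressed_size(page):
    """Size in bytes after lossless compression (zlib, not Group 4)."""
    return len(zlib.compress(pack_bits(page), 9))


simple_size = compressed_size(make_page(10, seed=1))
detailed_size = compressed_size(make_page(400, seed=1))
print(f"simple: {simple_size} bytes, detailed: {detailed_size} bytes")
```

Both pages have the same dimensions and bit depth, so any difference in compressed size comes purely from how much is drawn on them.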

Scan resolution

The lower the scan resolution, the smaller the PDF file will be. As an example, we scanned an A0 mechanical drawing in monochrome at different resolutions. This created PDF files of the following sizes:

200 dpi - 102 KB
400 dpi - 210 KB
600 dpi - 330 KB
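
The figures above can be set against the number of pixels captured, which rises with the square of the resolution. The short calculation below is only a sketch using the three sizes quoted here; the function name is made up for the example.

```python
# File sizes quoted in the text for one A0 monochrome drawing (dpi -> KB).
reported_kb = {200: 102, 400: 210, 600: 330}


def growth_table(sizes, base_dpi=200):
    """Compare pixel-count growth with file-size growth relative to base_dpi."""
    rows = []
    for dpi in sorted(sizes):
        pixel_ratio = (dpi / base_dpi) ** 2  # pixels scale with dpi squared
        size_ratio = sizes[dpi] / sizes[base_dpi]
        rows.append((dpi, pixel_ratio, size_ratio))
    return rows


table = growth_table(reported_kb)
for dpi, pixel_ratio, size_ratio in table:
    print(f"{dpi} dpi: {pixel_ratio:.1f}x the pixels, {size_ratio:.1f}x the file size")
```

For this drawing the compressed file grew far more slowly than the pixel count, so the cost of scanning at a higher resolution was smaller than the raw numbers might suggest. Other drawings and other compression methods may behave differently.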

If you are only going to be viewing or printing the file you can typically get away with a lower resolution, like 200 dpi, than if you are intending to convert the file to CAD. For raster to vector conversion to CAD you need to scan your drawing at a high enough resolution, anywhere from 200 to 600 dpi depending on the drawing, so that parallel entities are separated by clean white space and small details are captured sharply.

NOTE: If you are unsure about the future use of your file, remember that if you scan at too high a resolution you can always reduce the resolution later, but that if you scan at too low a resolution you can never add or recover the detail you have lost without scanning again.
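
The point in the note can be shown on a single row of pixels. This is a toy sketch, not how any particular scanner works: it assumes a reduced pixel turns black if any of the pixels it replaces was black, and the function names are invented for the example.

```python
def downsample(row, factor):
    """Reduce resolution: an output pixel is black (1) if any source pixel was black."""
    return [1 if any(row[i:i + factor]) else 0 for i in range(0, len(row), factor)]


def upsample(row, factor):
    """Increase resolution by repeating each pixel; no new detail is created."""
    return [pixel for pixel in row for _ in range(factor)]


# 1 = black, 0 = white: two parallel lines separated by a one-pixel white gap.
fine = [0, 0, 1, 0, 1, 0, 0, 0]

coarse = downsample(fine, 2)    # the two lines merge into one thick line
restored = upsample(coarse, 2)  # same pixel count as before, but the gap is gone

print("fine:    ", fine)
print("coarse:  ", coarse)
print("restored:", restored)
```

Going from fine to coarse is always possible, but the white gap between the two lines cannot be recovered from the coarse row, which is exactly the separation a raster to vector converter needs.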

Monochrome, grayscale or color

Figure: JPEG compressed PDF file. A black and white drawing scanned in color at too low a resolution, then saved as PDF using JPEG compression, with consequent blurring and speckle artifacts.

Monochrome scans are much smaller than color scans. For example, we scanned the same A1 architectural drawing at 300 dpi in monochrome and color:

Monochrome - 134 KB PDF file
Color - 5 MB PDF file

We have seen color PDF files of CAD drawings up to 60 MB in size. By no stretch of the imagination is this a small, emailable PDF!
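
The gap between monochrome and color starts before any compression is applied. The sketch below estimates the uncompressed size of an A1 scan at 300 dpi, assuming the usual 1 bit per pixel for monochrome, 8 for grayscale and 24 for color, and ignoring any row padding. The function names are illustrative only.

```python
MM_PER_INCH = 25.4

# Assumed bit depths; individual scanners may offer other modes.
BITS_PER_PIXEL = {"monochrome": 1, "grayscale": 8, "color": 24}


def pixel_dimensions(width_mm, height_mm, dpi):
    """Convert a paper size in millimetres to scan dimensions in pixels."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def uncompressed_bytes(width_mm, height_mm, dpi, mode):
    """Raw image size before compression, ignoring row padding and headers."""
    width_px, height_px = pixel_dimensions(width_mm, height_mm, dpi)
    return width_px * height_px * BITS_PER_PIXEL[mode] // 8


A1 = (594, 841)  # ISO A1 sheet in millimetres

for mode in BITS_PER_PIXEL:
    size = uncompressed_bytes(*A1, dpi=300, mode=mode)
    print(f"{mode}: {size / 1_000_000:.1f} million bytes uncompressed")
```

A color scan carries 24 times the raw data of a monochrome scan of the same sheet, so the compressor has far more to work with and the lossy JPEG route becomes tempting. The monochrome and color file sizes quoted above are much smaller than these raw figures because both were compressed.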

If you have a black and white technical drawing you should always scan it using your scanner's monochrome scan settings. It sounds obvious, but you'd be amazed at the number of CAD users who scan black and white drawings using their scanner's color settings to produce a file many times larger than it needs to be and then reduce its quality by saving it using JPEG compression. Is this simply operator error or ignorance? (Scanners4CAD will be looking at the need for large format training soon.)

Compression methods

When your scan is saved within the PDF wrapper it is compressed to make it smaller. Different compression methods make larger or smaller PDF files. Some scanning software allows you to select the PDF compression method, others do not.

Typically the scanning software supplied with a scanner saves monochrome scans using TIFF Group 4 compression (the lossless CCITT Group 4 fax scheme) and color scans using JPEG compression. Both of these are effective compression schemes, but both have hidden issues that you need to be aware of.

  • TIFF Group 4 compression

    Some software supplied with large format scanners, including Contex, Oce and Xerox, gives you the option to save monochrome scans as "stripped" TIFF Group 4 PDFs. The danger is that you may inadvertently select this option. Avoid it.

    Figure: Stripped PDF file. Clicking on a stripped scan-to-PDF file in Adobe Reader 8 selects a thin horizontal slice.

    If you use the stripped option the scanned image in your PDF file will be split into a series of separate, horizontal strips. This is a hangover from the early days when PC resources were smaller and images were loaded piece-by-piece. Although a stripped PDF image appears to be a single image when you load it into Adobe Reader 8, if you click on the image to select it only a narrow horizontal strip is highlighted.

    Many programs that read PDF files will not read stripped images, or if they do read them, they will not display them as a single image. So, when scanning to PDF it is important to ensure you have selected the correct settings before saving the PDF file. Failure to do so may not be immediately obvious to you now but may result in unfortunate consequences later. Again, it's a case for training.

  • JPEG compression

    JPEG uses "lossy compression" where it loses or discards data that it thinks you can do without. This causes it to decrease the quality of scanned drawings by blurring the details and adding speckle artifacts.

    This is probably not an issue if all you want to do is view or print your scanned PDF files. However, if you want to perform raster to vector conversion on them you should avoid JPEG compression if your scanning software gives you the option to do so. This may mean that the PDF file you create is larger.
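
To see why lossy compression hurts line work, the toy example below runs one row of eight pixels through a transform-and-quantize round trip. It is a one-dimensional analogue of the DCT and quantization steps that JPEG applies to 8 x 8 blocks, not the real codec: the single quantization step size and the function names are assumptions made for the illustration.

```python
import math


def _scale(k, n):
    """Orthonormal DCT scaling factor for coefficient k of n."""
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(samples):
    """Forward DCT-II of a short row of pixel values."""
    n = len(samples)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                           for i, x in enumerate(samples))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct(): rebuild pixel values from coefficients."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def lossy_round_trip(samples, step):
    """Quantize every coefficient to a multiple of step, then reconstruct."""
    quantized = [round(c / step) * step for c in dct(samples)]
    return [min(255, max(0, round(v))) for v in idct(quantized)]


# 255 = white paper, 0 = black ink: a two-pixel-wide line on a white background.
edge = [255, 255, 255, 0, 0, 255, 255, 255]
restored = lossy_round_trip(edge, step=80)

print("original:", edge)
print("restored:", restored)
```

With a coarse step the restored row no longer consists of pure black and pure white: the line turns dark gray and the paper next to it picks up off-white values, which is the kind of softening and speckle a vectorizer then has to cope with. With a very fine step the row comes back unchanged, at the cost of a larger file.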

Scanning software used

Different scanning software can create different sized PDF files even when using the same compression method. We recently received a PDF file that was almost 12 MB in size. The same PDF file saved using the same compression method in another program was just 400 KB.

The reason for the difference in file size was that in addition to the scanned drawing, the 12 MB file contained information such as how to display colors on different systems. While this might be relevant for graphics files where color is important, it is irrelevant to technical drawings and just inflates the size of the PDF file.

If PDF file size is important to you, get a test drawing scanned on different scanners using different scan software and choose the solution that creates the smallest PDF files while retaining the quality you need for your intended end use.
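
When comparing test scans, a quick look inside each file can show where the size comes from. The sketch below is a crude heuristic rather than a PDF parser: it searches the raw bytes for the image, filter and color-profile names defined in the PDF specification, so it counts filters on every stream in the file (not only images), misses inline images, and may not work on encrypted or unusual files. The function names and the example file names are hypothetical.

```python
import re
from collections import Counter

# Names defined by the PDF specification, searched for as plain bytes.
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")
FILTER_RE = re.compile(
    rb"/(CCITTFaxDecode|DCTDecode|FlateDecode|JBIG2Decode|JPXDecode|LZWDecode)\b"
)
ICC_RE = re.compile(rb"/ICCBased\b")


def summarize_pdf_bytes(data):
    """Return a rough summary of what a PDF's raw bytes contain."""
    return {
        "size_bytes": len(data),
        # Many image objects in a one-page scan suggests a stripped file.
        "image_objects": len(IMAGE_RE.findall(data)),
        # CCITTFaxDecode = Group 3/4 fax compression, DCTDecode = JPEG.
        "filters": Counter(name.decode("ascii") for name in FILTER_RE.findall(data)),
        # Embedded ICC color profiles add bytes a line drawing rarely needs.
        "has_icc_profile": bool(ICC_RE.search(data)),
    }


def summarize_pdf(path):
    """Convenience wrapper that reads a file from disk."""
    with open(path, "rb") as handle:
        return summarize_pdf_bytes(handle.read())


# Hypothetical usage with two test scans of the same drawing:
# for name in ("scanner_a.pdf", "scanner_b.pdf"):
#     print(name, summarize_pdf(name))
```

A single-page monochrome scan that reports one image object using CCITTFaxDecode and no ICC profile is likely to be about as lean as scan-to-PDF gets. Dozens of image objects, DCTDecode on a black and white drawing, or an embedded color profile are each worth a second look at the scanner settings.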

Conclusion

Before CAD users scan-to-PDF they first need to have a firm idea of:

  • What the scan's ultimate purpose will be.
  • Which of the benefits of PDF they want to gain.

While scan-to-PDF users will gain some of the benefits of PDF such as ease of viewing and printing, for all practical purposes a PDF file created using scan-to-PDF is just a raster image - usually TIFF or JPEG - in a PDF wrapper.

As such, raster PDF files are subject to the same scanned image quality issues as any normal raster file and, like any normal raster file, they contain no intelligent information and can only be converted to CAD by redrawing or using a raster to vector conversion program.

Perhaps the most dangerous misconception is that PDF files are small and emailable. In trying to make PDF files as small as possible, users may degrade the level of detail and quality and make them unsuitable for CAD use in the future.

If the ultimate use of the PDF file is quality dependent, such as archiving or raster to vector conversion, the ONLY consideration should be to create a PDF file with the best possible image quality. Only image quality can provide a scan to PDF user with useful long-term results.