The terms scan, scans, scanner, and scanning are used in many places in multiple ways at DP.
"Scan" and "scans" (n.) usually refer to the image files created by Content Providers (occasionally referred to as "scanners" [n.], in the sense of people who scan [v.]), who use hardware known as "scanners" (n.) to "scan" (v.) the individual pages of a book or other textual material. This process is referred to as "scanning" (v. or gerund). In other words, "scans" are the results of running a "scanner" or "scanning." (Sometimes Content Providers harvest scans from other online sources instead of scanning them themselves.)
OCR software is used to create an OCR text from the scanned images (scans). As a project begins its journey through DP's rounds, the proofers working in P1 compare each page's OCR text to its original scan. Thus, "the scans" are the foundation of the e-texts produced by DP.
The DP workflow is focused, at its core, on individual scanned images of the pages of books (and other published works). Every page of a DP project has an associated page scan of the corresponding page in the original work. If the original work is illustrated, higher-resolution illustration scans will also be part of the project.
Many projects are produced using scans available online, but if you have qualifying books that you would like to see run through DP, you may wish to do the scanning yourself.
Before you begin
- You will need access to a scanner.
- Please review the Content Providing FAQ and the Page scans and Illustration scans wiki pages.
The term "resolution," as applied to images, is more relevant to printed images. A low-resolution printed image will have less detail; a high-resolution printed image will have more. Printed images, especially photographs, are where the concept of "dots per inch" or "dpi" comes from.
Digital images are measured in pixels. Depending on the amount of detail in the printed image, an image scanned at 600 pixels per inch (ppi) will have more detail than an image scanned at 300 ppi.
We commonly use "high-resolution" to refer to images that have been scanned at 400+ ppi, and "low-resolution" to refer to images that have been scanned at fewer ppi, though the quality and size of the original should be taken into account when determining what the appropriate scanner setting is. A large image with no fine detail can be adequately scanned at 300 ppi, but a small image with a lot of fine detail may need to be scanned at 600 ppi to adequately preserve the detail.
You will find the terms dpi, DPI, ppi and PPI used interchangeably at DP.
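To make the relationship between page size, ppi, and pixel count concrete, here is a minimal sketch in Python. The function names and the 5 x 8 inch page are hypothetical, chosen only for illustration; the 400 ppi cutoff is the rough convention described above, not a hard rule, since the size and detail of the original still matter.

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Return (width, height) in pixels for an area scanned at the given ppi."""
    return round(width_in * ppi), round(height_in * ppi)


def resolution_label(ppi):
    """Label a scan setting using the rough 400 ppi convention (an assumption
    drawn from the text above, not a fixed DP rule)."""
    return "high-resolution" if ppi >= 400 else "low-resolution"


# Hypothetical example: a 5 x 8 inch book page scanned at two settings.
for ppi in (300, 600):
    width_px, height_px = pixel_dimensions(5, 8, ppi)
    print(f"{ppi} ppi: {width_px} x {height_px} pixels ({resolution_label(ppi)})")
```

Doubling the ppi doubles each dimension, so the scan has four times as many pixels and a correspondingly larger file.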
What kind of scanner?
It depends on how much you want to spend. There is a wide variety of options, but if you want to scan books, here are some things to take into consideration:
- Will you be scanning bound books, with a need to keep them intact, or scanning destructively?
- Is a Letter/A4-sized scanner good enough, or will you want Legal (8.5" x 14") or even Ledger/A3-sized? The cost goes up quickly.
- For scanning books, it's easier on the book if you have a book-edged scanner, where the glass goes all the way to the edge of the scanner. This feature also adds cost.
- Digital cameras, and DIY overhead scanners built around them, have become popular over the past several years, but may not be as easy to work with.
- From the old FAQs: You probably will want to avoid "handheld" scanners where you run the scanning lens down the page of text. They require a smooth, steady motion, which can be difficult to achieve once or twice, let alone the 300 or 400 times needed for a full-length book. Some are also not wide enough to capture a page in a single scan, so the images need to be "stitched" back together, a process that can be painstaking and time-consuming.
Technology changes quickly. If you're interested in buying a scanner, ask for advice in the Providing Content forum and do some searching on the internet.
If there are only a couple of books you're interested in scanning, and you live close to a library with scanners, investigate that as a possible option.