Scanno


A scanno is an incorrect character in an OCR text.

  • EXAMPLE: table in the page image appears as tahte in the OCR text.

One of the primary goals of Distributed Proofreaders (DP) during the proofing process is to correct all of the scannos in the OCR text.

  • See the Proofing Advice page, which contains links to lists of common scannos in various languages.

DP tracks specific types of scannos to make them easier to identify. These types are listed below:

stealth scanno

A stealth scanno (also stealtho) is a specific type of scanno in which the misrecognized character or characters still produce a valid word in the OCR text, but not the word that appears in the page image.

  • EXAMPLE: and in the page image appears as arid in the OCR text.

Distributed Proofreaders' coining of this term was inspired by stealth bombers, which are designed to evade radar: in the same way, stealth scannos go undetected by normal spellcheck utilities, because the incorrect word is itself correctly spelled.
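
The difference between an ordinary scanno and a stealth scanno can be illustrated with a minimal sketch. It assumes a tiny hypothetical dictionary and a one-entry list of stealth-scanno suspects built from the examples on this page; the names DICTIONARY, STEALTH_SUSPECTS, and check_text are invented for illustration and do not come from any DP tool. A plain dictionary lookup catches tahte, but only the separate suspect list flags arid.

```python
import re

# Hypothetical, illustrative data (not a real DP word list).
# DICTIONARY stands in for a spellchecker's word list.
DICTIONARY = {"the", "table", "and", "arid", "chairs", "were", "old"}

# Words that are valid on their own but often replace another word in OCR
# text, mapped to the word they usually stand for.
STEALTH_SUSPECTS = {"arid": "and"}


def check_text(text):
    """Return (misspelled, suspects) for the words found in text.

    misspelled: (offset, word) pairs that are not in DICTIONARY.
    suspects:   (offset, word, likely_word) triples for valid words that
                appear in STEALTH_SUSPECTS.
    """
    misspelled = []
    suspects = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group().lower()
        if word not in DICTIONARY:
            # An ordinary scanno: a spellchecker would flag this.
            misspelled.append((match.start(), word))
        elif word in STEALTH_SUSPECTS:
            # A possible stealth scanno: spelled correctly, so only the
            # suspect list can draw attention to it.
            suspects.append((match.start(), word, STEALTH_SUSPECTS[word]))
    return misspelled, suspects


misspelled, suspects = check_text("The tahte arid chairs were old")
```

A suspect is only a candidate: arid is sometimes the correct word, so each hit still has to be compared against the page image.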

ftealth fcanno

See the ftealth fcanno article.

Proofers and Project Managers

See the Proofing Advice page, which contains links to lists of common stealth scannos in various languages.

Post-Processors

Post-processing tools useful for finding and correcting stealth scannos include: