Common Fraktur OCR errors

From DPWiki

Some of the most common OCR text errors are listed below. Keeping them in mind while you work will improve your proofing results. Double-checking for these common errors can make a BIG difference!


The Fraktur alphabet

[Image: Fraktur alphabet.gif]

Similar characters

Some characters have a very similar appearance:

  • f | ſ (long s) (yes, there are two different characters for 's'!)
  • a | u | n
  • c | e | o
  • t | k
  • i | l
  • d | ck (ligature)
  • w | sch (too many vertical lines?)
  • M | W
  • A | U
  • R | K
  • E | G
  • h | y

See also this site for a discussion of similar letters in Fraktur.
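
As a rough illustration of how the list above could be used when checking a doubtful word, the following Python sketch generates every reading that differs from the OCR result by one confusable swap. The function name `single_swap_variants` and the table `CONFUSABLE_GROUPS` are hypothetical names chosen for this example; the groups themselves are copied from the list above. It is only a helper for producing candidates to compare against the page image, not part of any DP tool.

```python
# Confusable groups copied from the list above. Multi-letter entries
# ("ck", "sch") are treated as plain substrings.
CONFUSABLE_GROUPS = [
    ("f", "ſ"),
    ("a", "u", "n"),
    ("c", "e", "o"),
    ("t", "k"),
    ("i", "l"),
    ("d", "ck"),
    ("w", "sch"),
    ("M", "W"),
    ("A", "U"),
    ("R", "K"),
    ("E", "G"),
    ("h", "y"),
]


def single_swap_variants(word):
    """Return all readings of `word` that differ by exactly one swap
    between members of the same confusable group."""
    variants = set()
    for group in CONFUSABLE_GROUPS:
        for source in group:
            start = word.find(source)
            while start != -1:
                for target in group:
                    if target != source:
                        variants.add(
                            word[:start] + target + word[start + len(source):]
                        )
                start = word.find(source, start + 1)
    return sorted(variants)


if __name__ == "__main__":
    # Print the candidate readings for one OCRed word.
    for candidate in single_swap_variants("auf"):
        print(candidate)
```

The candidates are only suggestions: most of them will not be real words, and the page image remains the authority.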

'long s' vs. 'f'

The 'normal' (round) s is used at the end of a syllable; the long s (ſ) is used elsewhere. If in doubt, this rule may help you decide which word is correct: aus vs. auf, ausſteigen vs. aufſteigen, etc.

For long s in general, see the "Proofing Old Texts" page (Proofing_old_texts#Long_s).

Characters that are very seldom recognized correctly

x

Numbers

Numbers set in a Fraktur font are sometimes unreadable to the OCR software. It then treats them as dirt on the page, and they do not appear in the text at all.

Unusual (older) spellings

This applies especially to German texts: OCR software often uses a spellchecker with a modern dictionary, so words whose spelling has changed may be OCRed in their modern form rather than as printed on the page:

  • Words formerly written with 'th' are now written with 't' alone (e.g. eigenthümlich -> eigentümlich, roth -> rot). The OCR output often drops the 'h'.
  • The same happens with umlaut dots: kömmt -> kommt.

Be sure to proof according to the page image, not according to modern spelling.
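
One way to catch such silent modernizations is to flag every word in the OCR text that has a known older form, so that you can compare it with the image. The Python sketch below assumes a small hand-made table, `OLDER_SPELLINGS`, filled only with the examples from this page; the table and the function name `flag_possible_modernizations` are hypothetical. A real list would have to be built for the book at hand.

```python
import re

# Modern spelling -> older spelling that may actually be on the page.
# Only the examples from this page are included; extend as needed.
OLDER_SPELLINGS = {
    "eigentümlich": "eigenthümlich",
    "rot": "roth",
    "kommt": "kömmt",
}


def flag_possible_modernizations(text):
    """Return (word, older_form, offset) for every word in `text`
    that matches a modern spelling in OLDER_SPELLINGS."""
    flags = []
    for match in re.finditer(r"\w+", text):
        word = match.group(0)
        older = OLDER_SPELLINGS.get(word.lower())
        if older is not None:
            flags.append((word, older, match.start()))
    return flags


if __name__ == "__main__":
    sample = "Das Haus ist rot und eigentümlich."
    for word, older, offset in flag_possible_modernizations(sample):
        print(f"offset {offset}: '{word}' might be '{older}' on the page")
```

A flagged word is not necessarily wrong: the book may really print the modern form. The flag only marks places worth a second look at the image.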

Proper Names

  • Names of people and places are often spelled in unexpected ways.
  • Double-check all proper names to be certain that what was OCRed as "Banks" isn't really "Bariks" (confusing "ri" with "n" is a very common OCR error).

Fixing common OCR errors in Preprocessing when Providing Content

Even with hand-trained patterns, OCR programs have problems with Fraktur fonts. To fix the most common errors, use frakprep.

External links

  • Fraktur tool: like DP's Greek transliteration tool, but for the Fraktur alphabet
  • Script Teacher: a site for learning Fraktur (and other) fonts. Select the first checkbox in the "Font Face" options to practice Fraktur (as opposed to other blackletter fonts).