Post-Processing German books

From DPWiki

„doppelte Anführungszeichen“

„German quotation marks“ are proofed as »inward pointing guillemets« because the character set that we use at PGDP does not contain the German quotation marks. How do we PP these?

Text version:

It is not necessary to provide a UTF-8 text version for the sole reason of having the German quotation marks. Just keep the guillemets.

HTML version:

Most people keep the guillemets in the HTML version, although HTML can display the German quotation marks. The choice is up to the PPer: keep the guillemets or change them, but be consistent within a book.
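If you do decide to change the double quotes for the HTML version, the replacement can be scripted. The following is only a minimal sketch: the function names are made up for illustration, and it assumes every » opens and every « closes a quotation, which should be checked against the book.

```python
# Hypothetical helper names; assumes » always opens and « always closes a quote.
GERMAN_DOUBLE = str.maketrans({"\u00bb": "\u201e", "\u00ab": "\u201c"})


def guillemets_to_german_quotes(text: str) -> str:
    """Replace »inward pointing guillemets« with low/high German quotes."""
    return text.translate(GERMAN_DOUBLE)


def mixes_quote_styles(text: str) -> bool:
    """Return True if the text contains both guillemets and German quotes."""
    has_guillemets = "\u00bb" in text or "\u00ab" in text
    has_german = "\u201e" in text or "\u201c" in text
    return has_guillemets and has_german
```

Running mixes_quote_styles over the finished file is a cheap way to catch a half-converted book.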

‚einfache Anführungsstriche‘

‚einfache Anführungsstriche‘ are usually proofed like >this<. (Check your project comments.)

Text version:

We do not provide a UTF-8 text version for the sole reason of having the German quotation marks. Just keep >this<.

Note (minstrel): I disagree. I personally always supply a UTF-8 version, even if only for this purpose. It also lets me use en dashes for „Gedankenstriche“. As I strongly dislike using >this markup<, I prefer to use single quotes in the Latin-1 version.

Keep in mind that plain-text versions should be readable in fixed-pitch fonts, where all dashes and hyphens are the same length.

HTML version:

If you keep the guillemets for double quotes, replace the single quotes with &rsaquo; and &lsaquo;, or with the characters › and ‹ if your HTML is UTF-8 encoded. The proofed quotes must be changed, because < and > are reserved characters in HTML. They can be displayed by entering &lt; and &gt;, but don't use those as quotation marks: it's not pretty and there's no reason for it. If you change the double quotes to the German low and high quotation marks, do the same with the single quotes. Mixing the two styles (if they're not mixed in the original book) is unreasonable.
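The single-quote replacement is easy to get wrong with a plain search and replace, because the proofed text also contains real tags such as <i>. Here is a hedged sketch of one way to do it. The function name and the style names are invented for this example, and it assumes that tags carry no attributes and that every < or > outside a tag is a quotation mark (a stray less-than sign would be converted too).

```python
import re

# Assumption: tags look like <i>, </i>, <g>, <f>, <sc> and have no attributes.
TAG = re.compile(r"(</?[A-Za-z][A-Za-z0-9]*>)")

# (opening, closing) replacements for the proofed > and <.
SINGLE_QUOTES = {
    "guillemet": ("\u203a", "\u2039"),   # single guillemets, UTF-8 HTML
    "entity": ("&rsaquo;", "&lsaquo;"),  # same marks as entities
    "german": ("\u201a", "\u2018"),      # low/high German single quotes
}


def convert_single_quotes(text: str, style: str = "guillemet") -> str:
    """Convert proofed >single quotes< while leaving real tags alone."""
    opening, closing = SINGLE_QUOTES[style]
    pieces = TAG.split(text)
    # With one capture group, even indices are text and odd indices are tags.
    for i in range(0, len(pieces), 2):
        pieces[i] = pieces[i].replace(">", opening).replace("<", closing)
    return "".join(pieces)
```

Pick "german" only if the double quotes were changed to German quotation marks as well, so that the two levels stay consistent.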

gesperrter Text

An easy way to handle gesperrt (letter-spaced) text in HTML is to replace <g>gesperrt</g> with <em class="gesperrt">gesperrt</em> and style the em.gesperrt class so that the <em> (emphasis) tag renders as letter-spaced text:

 em.gesperrt {
   font-style: normal;
   font-weight: normal;
   letter-spacing: .2em;
   padding-left: .2em;
 }

(Adjust the exact letter spacing to taste.) The left padding balances the extra space that letter-spacing adds after the last letter, so gesperrt words won't be off-center. Further complications arise when a paragraph starts with a gesperrt word, or in the rare browsers that handle letter-spacing differently, but this will work in most situations.
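The <g> replacement itself can be done with a regular expression. This is a minimal sketch, not a PGDP tool: the function name is hypothetical, and it assumes <g> tags are never nested.

```python
import re

# Assumption: <g>...</g> pairs are well formed and never nested.
GESPERRT = re.compile(r"<g>(.*?)</g>", re.DOTALL)


def gesperrt_to_html(text: str) -> str:
    """Turn proofed <g>gesperrt</g> markup into <em class="gesperrt">."""
    return GESPERRT.sub(r'<em class="gesperrt">\1</em>', text)
```

The non-greedy match keeps two gesperrt words on one line from being merged into a single element.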


Some Fraktur texts (especially older ones) have lower-case ä, ö, ü but use Ae, Oe, Ue instead of Ä, Ö, Ü, because the upper-case umlauts were not included in every typeface. You can keep it that way or change them to real umlauts; the choice is up to the PPer. However, if you do change them, say so in the transcriber's notes.
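If you decide to change Ae, Oe, Ue, a script can do the bulk of the work and collect the changed words for the transcriber's note. The sketch below is an assumption-laden illustration: the function name and the keep parameter are invented, it only looks at words that begin with Ae, Oe or Ue followed by lower-case letters, and words that legitimately start that way (Aeroplan, for example) have to be listed by hand.

```python
import re

UMLAUT = {"Ae": "\u00c4", "Oe": "\u00d6", "Ue": "\u00dc"}
# A word that starts with Ae/Oe/Ue and continues in lower case.
WORD = re.compile(r"\b(Ae|Oe|Ue)([a-z\u00e4\u00f6\u00fc\u00df]+)")


def restore_capital_umlauts(text, keep=frozenset()):
    """Return (new_text, changes); words listed in keep are left untouched."""
    changes = []

    def fix(match):
        word = match.group(0)
        if word in keep:
            return word
        new_word = UMLAUT[match.group(1)] + match.group(2)
        changes.append((word, new_word))
        return new_word

    return WORD.sub(fix, text), changes
```

Review the returned list of changes before accepting the result; it doubles as raw material for the transcriber's note.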

em-dash vs. en-dash

Duden publishes guidelines for typesetting. The section on „Gedankenstrich“ says:

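Der Gedankenstrich ist länger als der Bindestrich und in der Regel kürzer als das Minuszeichen. Gesetzt wird er mit vorausgehendem und folgendem Wortabstand. (In English: the Gedankenstrich is longer than the hyphen and as a rule shorter than the minus sign. It is set with a word space before and after it.)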

According to Wikipedia, the en dash entity &ndash; should be used when typesetting the German „Gedankenstrich“. However, some people prefer the longer &mdash; because it more closely resembles the dash in the actual Fraktur typeface. Whatever you do, be consistent about it within your project.
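Whichever dash you choose, applying it in one pass helps with consistency. The sketch below assumes that dashes were proofed as two hyphens (--); that is an assumption about the project, so check your project comments first. The function name is hypothetical, and the existing spacing around the dash is left as proofed.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"
# Exactly two hyphens; longer runs such as ---- are left alone.
DOUBLE_HYPHEN = re.compile(r"(?<!-)--(?!-)")


def convert_dashes(text: str, dash: str = EN_DASH) -> str:
    """Replace every proofed -- with the one dash chosen for the project."""
    return DOUBLE_HYPHEN.sub(dash, text)
```

Pass "&ndash;" or "&mdash;" as the dash argument if your HTML is not UTF-8 encoded.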

font changes

It is a typesetter's convention to set words or phrases in another language in a different style. In books printed in Fraktur, Antiqua type was used for this purpose. Such font changes are proofed <f>like this</f>.

HTML version: As with gesperrt text, you may replace the <f> tags with a customized class of the <em> (emphasis) tag to reflect the font change. Search for <f>antiqua</f>, replace it with <em class="antiqua">antiqua</em>, and style the em.antiqua class along these lines:

 em.antiqua {
   font-style: italic;
 }

The CSS above follows modern printing conventions where such a font-change for representing foreign phrases is usually printed in italics.

Text versions: Use some ASCII symbol of your liking to mark up the font change, e.g. #Antiqua#. You might want to point out the usage of the symbol in a transcriber's note.
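For the text version, the font-change markup can be swapped for the chosen symbol in one step. This is a rough sketch with a hypothetical function name. It assumes <f> tags are not nested, and it refuses to run if the marker already occurs in the text, since the markup would then be ambiguous.

```python
import re

# Assumption: <f>...</f> pairs are well formed and never nested.
ANTIQUA = re.compile(r"<f>(.*?)</f>", re.DOTALL)


def antiqua_to_text(text: str, marker: str = "#") -> str:
    """Replace <f>Antiqua</f> markup with #Antiqua# for the text version."""
    if marker in text:
        raise ValueError(f"marker {marker!r} already occurs in the text")
    return ANTIQUA.sub(lambda m: marker + m.group(1) + marker, text)
```

If the check fails, pick a different symbol and mention it in the transcriber's note.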