Formatting Guidelines

Version 1.9.e, revised July 19, 2007

Formatting Guidelines in French / Directives de Formatage en français
Formatting Guidelines in Portuguese / Regras de Formatação em Português
Formatting Guidelines in Dutch / Formatteer-Richtlijnen in het Nederlands
Formatting Guidelines in German / Formatierungsrichtlinien auf Deutsch

  Table of Contents

  • Formatting of the...
  • Specific Guidelines for Special Books
  • Common Problems

The Primary Rule

"Don't change what the author wrote!"

The final electronic book seen by a reader, possibly many years in the future, should accurately convey the intent of the author. If the author spelled words oddly, we leave them spelled that way. If the author wrote outrageous racist or biased statements, we leave them that way. If the author put italics, bold text, or a footnote every third word, we mark them italicized, bolded, or footnoted. (See Printer's Errors for proper handling of obvious misprints.)

We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (End-of-line Hyphenation). Changes such as these help us produce a consistently formatted version of the book. The rules we follow are designed to achieve this result. Please carefully read the rest of these Guidelines with this concept in mind.

To assist the next formatter and the post-processor, we also preserve line breaks. This allows them to easily compare the lines in the text to the lines in the image.

 

Summary Guidelines

The Formatting Summary is a short, 2-page printer-friendly (.pdf) document that summarizes the main points of these Guidelines, and gives examples of how to format. Beginning formatters are encouraged to print out this document and keep it handy while formatting.

You may need to download and install a .pdf reader. You can get one free from Adobe®.

About This Document

This document is written to explain the formatting rules we use to maintain consistency when formatting a single book that is distributed among many formatters, each of whom is working on different pages. This helps us all do formatting the same way, which in turn makes it easier for the post-processor to eventually combine all these pages into one e-book.

It is not intended as any kind of a general editorial or typesetting rulebook.

We've included in this document all the items that new users have asked about formatting and proofreading. If any items are missing, if you think something should be done differently, or if something is vague, please let us know.

This document is a work in progress. Help us to progress by posting your suggested changes in the Documentation Forum in this thread.

Project Comments

On the Project Page where you start formatting pages, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start formatting pages! If the Project Manager wants you to format something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Project Comments override the rules in these Guidelines, so follow them. (This is also where the Project Manager may give you interesting tidbits of information about the author or the project.)

Please also read the Project Thread (discussion). The Project Manager may clarify project-specific guidelines there, and volunteers often use it to alert one another to recurring issues within the project and how best to address them. (See below.)

On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other volunteers have made changes. This Forum thread discusses different ways to use this information.

Forum/Discuss this Project

On the Project Page where you start formatting pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other volunteers who are working on this book.

Fixing errors on Previous Pages

When you select a project for formatting, the Project Page is loaded. This page contains links to pages from this project that you have recently worked on. (If you haven't formatted any pages yet, there will be no links shown.)

Pages listed under either "DONE" or "IN PROGRESS" are available to make corrections or to finish formatting. Just click on the link to the page. So if you discover that you made a mistake on a page, or marked something incorrectly, you can click on that page here and reopen it to fix the error.

You may also use the "Images, Pages Proofread, & Differences" or "Just My Pages" links on the Project Page. These pages will display an "Edit" link next to the pages you have worked on in the current round that can still be corrected.

For more detailed information, refer to either the Standard Proofreading Interface Help or the Enhanced Proofreading Interface Help, depending on which interface you are using.

Formatting of the...

Front/Back Title Page

Format all the text, just as it was printed on the page, whether all capitals, upper and lower case, etc., including the years of publication or copyright.

Older books often show the first letter as a large ornate graphic—format this as just the letter.

Sample Image:
title page image
Correctly Formatted Text:

GREEN FANCY

BY

GEORGE BARR McCUTCHEON

AUTHOR OF "GRAUSTARK," "THE HOLLOW OF HER HAND,"
"THE PRINCE OF GRAUSTARK," ETC.

<i>WITH FRONTISPIECE BY</i>
<i>C. ALLAN GILBERT</i>

NEW YORK
DODD, MEAD AND COMPANY
1917

Table of Contents

Format the Table of Contents just as it is printed in the book, whether all capitals, upper and lower case, etc. and surround it with /* and */. Leave a blank line between these markers and the rest of the text. Page number references should be retained and be placed at least six spaces past the end of the line.

Remove any periods or asterisks (leaders) used to align the page numbers.

Sample Image:

Correctly Formatted Text:





CONTENTS

/*
CHAPTER                                         PAGE

I. <sc>The First Wayfarer and the Second Wayfarer
Meet and Part on the Highway</sc>      1

II. <sc>The First Wayfarer Lays His Pack Aside and
Falls in with Friends</sc>      15

III. <sc>Mr. Rushcroft Dissolves, Mr. Jones Intervenes,
and Two Men Ride Away</sc>      33

IV. <sc>An Extraordinary Chambermaid, a Midnight
Tragedy, and a Man Who Said "Thank You"</sc>      50

V. <sc>The Farm-boy Tells a Ghastly Story, and an
Irishman Enters</sc>      67

VI. <sc>Charity Begins Far from Home, and a Stroll in
the Wildwood Follows</sc>      85

VII. <sc>Spun-gold Hair, Blue Eyes, and Various Encounters</sc>      103

VIII. <sc>A Note, Some Fancies, and an Expedition in
Quest of Facts</sc>      120

IX. <sc>The First Wayfarer, the Second Wayfarer, and
the Spirit of Chivalry Ascendant</sc>      134

X. <sc>The Prisoner of Green Fancy, and the Lament of
Peter the Chauffeur</sc>      148

XI. <sc>Mr. Sprouse Abandons Literature at an Early
Hour in the Morning</sc>      167

XII. <sc>The First Wayfarer Accepts an Invitation, and
Mr. Dillingford Belabors a Proxy</sc>      183

XIII. <sc>The Second Wayfarer Receives Two Visitors at
Midnight</sc>      199

XIV. <sc>A Flight, a Stone-cutter's Shed, and a Voice
Outside</sc>      221
*/
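
The leader-removal and page-number spacing rules above can be sketched in a few lines of Python. This is only an illustration: `format_toc_line` and the `LEADERS` pattern are hypothetical names, not part of any Distributed Proofreaders tool, and the sketch assumes that page numbers are Arabic numerals at the end of the line and that leaders are runs of at least two periods or asterisks.

```python
import re

# Hypothetical pattern: two or more leader characters (periods or asterisks,
# possibly separated by spaces) followed by a trailing page number.
LEADERS = re.compile(r"(?:\s*[.*]){2,}\s*(\d+)\s*$")

def format_toc_line(line):
    """Drop the leaders and put the page number six spaces after the text."""
    match = LEADERS.search(line)
    if match is None:
        return line.rstrip()              # no leaders found: keep the line as printed
    title = line[:match.start()].rstrip()
    return title + " " * 6 + match.group(1)

print(format_toc_line("Outside</sc> . . . . . . 221"))
```

A title that itself ends in a period directly before the leaders would lose that period, so the result still needs to be compared with the image.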

Blank Page

Format as [Blank Page] if both the text and the image are blank.

If there is text in the formatting text area and a blank image, or if there is an image but no text, follow the directions for a Bad Image or Bad Text.

Page Headers/Page Footers

Remove page headers and page footers, but not footnotes, from the text.

The page headers are normally at the top of the image and have a page number opposite them. Page headers may be the same all through the book (often the title of the book and the author's name), they may be the same for each chapter (often the chapter number), or they may be different on each page (describing the action on that page). Remove them all, regardless, including the page number.

A chapter header will start further down the page and won't have a page number on the same line. See the next section for a specific example.


Sample Image:

Correctly Formatted Text:
/#
In the United States?[A] In a railroad? In a mining company?
In a bank? In a church? In a college?

Write a list of all the corporations that you know or have
ever heard of, grouping them under the heads <i>public</i> and <i>private</i>.

How could a pastor collect his salary if the church should
refuse to pay it?

Could a bank buy a piece of ground "on speculation?" To
build its banking-house on? Could a county lend money if it
had a surplus? State the general powers of a corporation.
Some of the special powers of a bank. Of a city.

A portion of a man's farm is taken for a highway, and he is
paid damages; to whom does said land belong? The road intersects
the farm, and crossing the road is a brook containing
trout, which have been put there and cared for by the farmer;
may a boy sit on the public bridge and catch trout from that
brook? If the road should be abandoned or lifted, to whom
would the use of the land go?
#/




CHAPTER XXXV.

<sc>Commercial Paper.</sc>


<b>Kinds and Uses.</b>--If a man wishes to buy some commodity
from another but has not the money to pay for
it, he may secure what he wants by giving his written
promise to pay at some future time. This written
promise, or <i>note</i>, the seller prefers to an oral promise
for several reasons, only two of which need be mentioned
here: first, because it is <i>prima facie</i> evidence of
the debt; and, second, because it may be more easily
transferred or handed over to some one else.

If J. M. Johnson, of Saint Paul, owes C. M. Jones,
of Chicago, a hundred dollars, and Nelson Blake, of
Chicago, owes J. M. Johnson a hundred dollars, it is
plain that the risk, expense, time and trouble of sending
the money to and from Chicago may be avoided,

[Footnote A: The United States: "Its charter, the constitution. * * * Its flag the
symbol of its power; its seal, of its authority."--Dole.]

Chapter Headers

Format chapter headers as they appear in the text.

A chapter header may start a bit farther down the page than the page header and won't have a page number on the same line. Chapter Headers are often printed all caps; if so, keep them as all caps. Chapter Headers are usually printed in a different or larger font which may appear to be bold or spaced out, but we do not mark them as a different font or as bold or spaced text. However, you should include italics or small-caps markup if it appears in the header.

Put 4 blank lines before the "CHAPTER XXX". Include these blank lines even if the chapter starts on a new page; there are no 'pages' in an e-book, so the blank lines are needed. Then leave one blank line between each additional part of the chapter header, such as a chapter description, opening quote, etc., and finally leave two blank lines before the start of the text of the chapter.

Old books often printed the first word or two of every chapter in all caps or small caps; change these to upper and lower case (first letter only capitalized).

Watch out for a missing double quote at the start of the first paragraph, which some publishers did not include or which the OCR missed due to a large capital in the original. If the author started the paragraph with dialog, insert the double quote.

Sample Image:

Correctly Formatted Text:
GREEN FANCY




CHAPTER I

THE FIRST WAYFARER AND THE SECOND WAYFARER
MEET AND PART ON THE HIGHWAY


A solitary figure trudged along the narrow
road that wound its serpentinous way
through the dismal, forbidding depths of
the forest: a man who, though weary and footsore,
lagged not in his swift, resolute advance. Night
was coming on, and with it the no uncertain prospects
of storm. Through the foliage that overhung
the wretched road, his ever-lifting and apprehensive
eye caught sight of the thunder-black, low-lying
clouds that swept over the mountain and bore
down upon the green, whistling tops of the trees.

At a cross-road below he had encountered a small
girl driving homeward the cows. She was afraid
of the big, strange man with the bundle on his back
and the stout walking stick in his hand: to her a
remarkable creature who wore "knee pants" and
stockings like a boy on Sunday, and hob-nail shoes,
and a funny coat with "pleats" and a belt, and a
green hat with a feather sticking up from the band.
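
As a rough illustration of the blank-line counts for chapter headers (4 before the header, 1 between its parts, 2 before the chapter text), here is a minimal Python sketch. The function name `format_chapter_opening` and its parameters are made up for this example.

```python
def format_chapter_opening(header_parts, first_paragraph):
    """Hypothetical helper: lay out a chapter opening with the blank-line counts above."""
    lines = [""] * 4                          # 4 blank lines before "CHAPTER ..."
    for index, part in enumerate(header_parts):
        if index > 0:
            lines.append("")                  # 1 blank line between header parts
        lines.extend(part.splitlines())       # keep the printed line breaks
    lines.extend(["", ""])                    # 2 blank lines before the chapter text
    lines.extend(first_paragraph.splitlines())
    return "\n".join(lines)

print(format_chapter_opening(
    ["CHAPTER I", "THE FIRST WAYFARER AND THE SECOND WAYFARER\nMEET AND PART ON THE HIGHWAY"],
    "A solitary figure trudged along the narrow",
))
```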

Section Headers

Some texts have sections within chapters. Format these headers as they appear in the text. Leave 2 blank lines before the header and one after, unless the Project Manager has requested otherwise. If you are not sure if a header indicates a chapter or a section, post a question in the Project Thread, noting the page number. Section Headers are often printed in a different or larger font which may appear to be bold or spaced out, but we do not mark them as a different font or as bold or spaced text. However, you should include italics or small-caps markup if it appears in the header.

Other Major Divisions in Texts

Major Divisions in the text such as Preface, Foreword, Table of Contents, Introduction, Prologue, Epilogue, Appendix, References, Conclusion, Glossary, Summary, Acknowledgements, Bibliography, etc., should be formatted in the same way as Chapter Headers, i.e. 4 blank lines before the heading and 2 blank lines before the start of the text.

Paragraph Side-Descriptions (Sidenotes)

Some books will have short descriptions of the paragraph along the side of the text. These are called sidenotes. Move sidenotes to just above the paragraph that they belong to. A sidenote should be surrounded by a sidenote tag [Sidenote:  and ], with the text of the sidenote placed in between. Format the sidenote text as it is printed, preserving the line breaks, italics, etc. Leave a blank line after the sidenote, so that it does not get merged into the paragraph when the text is rewrapped during post-processing.

If there are multiple sidenotes for a single paragraph, put them one after another at the start of the paragraph. Leave a blank line separating each of them.

If the paragraph began on a previous page, put the sidenote at the top of the page and mark it with * so that the post-processor can see that it belongs on the previous page, like this: *[Sidenote: (text of sidenote)]. The post-processor will move it to the appropriate place.

Sometimes a Project Manager will request that you put sidenotes next to the sentence they apply to, rather than at the top or bottom of the paragraph. In this case, don't separate them out with blank lines.

Sample Image:

Correctly Formatted Text:

*[Sidenote: Burning
discs
thrown into
the air.]

that such as looked at the fire holding a bit of larkspur
before their face would be troubled by no malady of the
eyes throughout the year.[1] Further, it was customary at
Würzburg, in the sixteenth century, for the bishop's followers
to throw burning discs of wood into the air from a mountain
which overhangs the town. The discs were discharged by
means of flexible rods, and in their flight through the darkness
presented the appearance of fiery dragons.[2]

[Sidenote: The Midsummer
fires in
Swabia.]

[Sidenote: Omens
drawn from
the leaps
over the
fires.]

[Sidenote: Burning
wheels
rolled
down hill.]

In the valley of the Lech, which divides Upper Bavaria
from Swabia, the midsummer customs and beliefs are, or
used to be, very similar. Bonfires are kindled on the
mountains on Midsummer Day; and besides the bonfire
a tall beam, thickly wrapt in straw and surmounted by a
cross-piece, is burned in many places. Round this cross as
it burns the lads dance with loud shouts; and when the
flames have subsided, the young people leap over the fire in
pairs, a young man and a young woman together. If they
escape unsmirched, the man will not suffer from fever, and
the girl will not become a mother within the year. Further,
it is believed that the flax will grow that year as high as
they leap over the fire; and that if a charred billet be taken
from the fire and stuck in a flax-field it will promote the
growth of the flax.[3] Similarly in Swabia, lads and lasses,
hand in hand, leap over the midsummer bonfire, praying
that the hemp may grow three ells high, and they set fire
to wheels of straw and send them rolling down the hill.
Among the places where burning wheels were thus bowled
down hill at Midsummer were the Hohenstaufen mountains
in Wurtemberg and the Frauenberg near Gerhausen.[4]
At Deffingen, in Swabia, as the people sprang over the mid-*

[Footnote 1: <i>Op. cit.</i> iv. 1. p. 242. We have
seen (p. 163) that in the sixteenth
century these customs and beliefs were
common in Germany. It is also a
German superstition that a house which
contains a brand from the midsummer
bonfire will not be struck by lightning
(J. W. Wolf, <i>Beiträge zur deutschen
Mythologie</i>, i. p. 217, § 185).]

[Footnote 2: J. Boemus, <i>Mores, leges et ritus
omnium gentium</i> (Lyons, 1541), p.
226.]

[Footnote 3: Karl Freiherr von Leoprechting,
<i>Aus dem Lechrain</i> (Munich, 1855),
pp. 181 <i>sqq.</i>; W. Mannhardt, <i>Der
Baumkultus</i>, p. 510.]

[Footnote 4: A. Birlinger, <i>Volksthümliches aus
Schwaben</i> (Freiburg im Breisgau, 1861-1862),
ii. pp. 96 <i>sqq.</i>, § 128, pp. 103
<i>sq.</i>, § 129; <i>id.</i>, <i>Aus Schwaben</i> (Wiesbaden,
1874), ii. 116-120; E. Meier,
<i>Deutsche Sagen, Sitten und Gebräuche
aus Schwaben</i> (Stuttgart, 1852), pp.
423 <i>sqq.</i>; W. Mannhardt, <i>Der Baumkultus</i>,
p. 510.]
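
The sidenote rules in this section (wrap the text in a tag, put the tags above the paragraph with blank lines between them, and add a leading * when the paragraph began on the previous page) could be sketched as follows. Both function names are hypothetical and exist only for this example.

```python
def sidenote_tag(text, continues_from_previous_page=False):
    """Hypothetical helper: wrap sidenote text, keeping its line breaks."""
    prefix = "*" if continues_from_previous_page else ""
    return f"{prefix}[Sidenote: {text}]"

def place_sidenotes(sidenotes, paragraph):
    """Put each sidenote above its paragraph, separated by blank lines."""
    blocks = [sidenote_tag(note) for note in sidenotes]
    blocks.append(paragraph)
    return "\n\n".join(blocks)

print(sidenote_tag("Burning\ndiscs\nthrown into\nthe air.", continues_from_previous_page=True))
```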

Paragraph Spacing/Indenting

Put a blank line before the start of paragraphs, even if a paragraph starts at the top of a page. You should not indent the start of paragraphs, but if paragraphs are already indented, don't bother removing those spaces—that can be done automatically during post-processing.

See the Chapter Headers image/text for an example.

Multiple Columns

Format ordinary text that has been printed in two columns as a single column.

Spans of multiple-column text within single column sections should be formatted as a single column by placing the text from the left-most column first, the text from the next one after it, and so on. You do not need to mark where the columns were split, just join them together.

If the columns are lists of items, mark the start of the list with /* and the end with */ so that the lines do not get rewrapped during post-processing. Leave a blank line between these markers and the rest of the text.
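
A minimal sketch of the column rule, assuming each column has already been separated into its own list of lines (the name `merge_columns` and the `is_list` flag are inventions for this example):

```python
def merge_columns(columns, is_list=False):
    """Hypothetical helper: join columns left to right; wrap lists in /* and */."""
    merged = []
    for column in columns:                # left-most column first
        merged.extend(column)
    if is_list:
        merged = ["/*", *merged, "*/"]    # protects the lines from rewrapping
    return "\n".join(merged)

print(merge_columns([["apples", "bread"], ["cheese"]], is_list=True))
```

The blank lines between the markers and the surrounding text still have to be added when the result is placed on the page.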

See also the Indexes, Lists of Items and Tables sections of these Guidelines.

Illustrations

Text for an illustration should be surrounded by an illustration tag [Illustration:  and ], with the caption text placed in between. Format the caption text as it is printed, preserving the line breaks, italics, etc.

If an illustration has no caption, add a tag [Illustration].

If the illustration is in the middle of or at the side of a paragraph, move the illustration tag to before or after the paragraph and leave a blank line to separate them. Rejoin the paragraph by removing any blank lines left by doing so.

If there is no paragraph break on the page, mark the illustration tag with an * like so *[Illustration: (text of caption)], move it to the top of the page, and leave a blank line after it.

Sample Image:

Correctly Formatted Text:

[Illustration: Martha told him that he had always been her ideal and
that she worshipped him.

<i>Frontispiece</i>

<i>Her Weight in Gold</i>]


Sample Image: (Illustration in middle of paragraph)

Correctly Formatted Text:

such study are due to Italians. Several of these instruments
have already been described in this journal, and on the present
occasion we shall make known a few others that will
serve to give an idea of the methods employed.

[Illustration: <sc>Fig.</sc> 1.--APPARATUS FOR THE STUDY OF HORIZONTAL
SEISMIC MOVEMENTS.]

For the observation of the vertical and horizontal motions
of the ground, different apparatus are required. The

Footnotes/Endnotes

Footnotes are placed out-of-line; that is, the text of the footnote is left at the bottom of the page and a tag placed where it is referenced in the text.

During formatting, this means:

1. The number, letter, or other character that marks a footnote location should be surrounded with square brackets ([ and ]) and placed right next to the word being footnoted[1] or its punctuation mark,[2] as shown in the text, and the two examples in this sentence.

When footnotes are marked with a series of special characters (*, †, ‡, §, etc.) we replace these with capital letters in order (A, B, C, etc.).

2. A footnote should be surrounded by a footnote tag [Footnote #:  and ], with the footnote text placed in between, and the footnote number or letter placed where the # is shown in the tag. Format the footnote text as it is printed, preserving the line breaks, italics, etc. Leave the footnote text at the bottom of the page. Be sure to use the same tag in the footnote as you used in the text where the footnote was referenced. Place each footnote on a separate line in order of appearance. Place a blank line between each footnote if there is more than one.

In some books, the Project Manager may ask that you move the footnotes in-line; read the Project Comments for instructions in this case.

See the Page Headers/Page Footers image/text for a sample footnote.

If there's a footnote at the bottom of the page with no footnote marker in the text, especially if it starts mid-sentence or mid-word, it's probably a continuation of a footnote from a previous page. Leave it at the bottom of the page near the other footnotes, and surround it with *[Footnote: (text of footnote)] (without any footnote number or marker). The * indicates that the footnote was continued, and brings it to the attention of the post-processor.

If a footnote continues on the next page (the page ends before the footnote does), leave the footnote at the bottom of the page, and just put an asterisk * where the footnote ends, like this: [Footnote 1: (text of footnote)]*. (The * indicates that the footnote ended prematurely, and brings it to the attention of the post-processor, who will eventually join it up with the rest of the footnote text.)

If a continued footnote ends or starts on a hyphenated word, mark both the footnote and the word with *, thus:
[Footnote 1: This footnote is continued and the last word in it is also con-*]*
for the leading fragment, and
*[Footnote: *tinued onto the next page.].
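
The asterisk placement for a footnote that runs over a page break can be sketched as below. The function `split_footnote` is hypothetical; it assumes the two halves of the footnote text are already known and treats a first half ending in a hyphen as a word broken across the pages.

```python
def split_footnote(marker, first_part, second_part):
    """Hypothetical helper: tags for the page where a footnote starts and where it resumes."""
    if first_part.endswith("-"):          # the break falls inside a hyphenated word
        first_part += "*"
        second_part = "*" + second_part
    leading = f"[Footnote {marker}: {first_part}]*"    # * after the tag: continues on next page
    trailing = f"*[Footnote: {second_part}]"           # * before the tag, no marker: continuation
    return leading, trailing

for tag in split_footnote("1", "the last word in it is also con-", "tinued onto the next page."):
    print(tag)
```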

If a footnote or endnote is referenced in the text but does not appear on that page, keep the footnote/endnote number or marker and surround it with square brackets [ and ]. This is common in scientific and technical books, where footnotes are often grouped at the end of chapters. See "Endnotes" below.

Original Text:
The principal persons involved in this argument were Caesar1, former military
leader and Imperator, and the orator Cicero2. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

1 Gaius Julius Caesar.
2 Marcus Tullius Cicero.
Format with Out-of-Line Footnotes:
The principal persons involved in this argument were Caesar[1], former military
leader and Imperator, and the orator Cicero[2]. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

[Footnote 1: Gaius Julius Caesar.]

[Footnote 2: Marcus Tullius Cicero.]
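
A small Python sketch of the marker handling described above: numbers and letters are kept, while special characters are replaced by capital letters. The names are hypothetical, and mapping each symbol by its position in the series *, †, ‡, § is an assumption of this sketch; the Guidelines only say the letters are assigned "in order".

```python
SYMBOL_SERIES = "*†‡§"    # series named in the Guidelines; longer series are not covered here

def footnote_label(marker):
    """Map a special-character marker to a capital letter; keep numbers and letters."""
    if len(marker) == 1 and marker in SYMBOL_SERIES:
        return chr(ord("A") + SYMBOL_SERIES.index(marker))
    return marker

def footnote_anchor(marker):
    """Reference placed next to the footnoted word in the text."""
    return f"[{footnote_label(marker)}]"

def footnote_tag(marker, text):
    """Tag left at the bottom of the page, using the same label as the anchor."""
    return f"[Footnote {footnote_label(marker)}: {text}]"

print("Caesar" + footnote_anchor("1"))
print(footnote_tag("1", "Gaius Julius Caesar."))
```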

In some books, footnotes are separated from the main text by a horizontal line. We don't keep this so please just leave a blank line between the main text and the footnotes. (See example above.)

Endnotes are just footnotes that have been located together at the end of a chapter or at the end of the book, instead of on the bottom of each page. These are formatted in the same manner as footnotes. Where you find an endnote reference in the text, just surround it with [ and ]. If you are formatting one of the ending pages with the endnotes text on it, surround the text of each note with [Footnote #: (text of endnote)], with the endnote text placed in between, and the endnote number or letter placed where the # is. Put a blank line after each endnote so that they remain separate paragraphs when the text is rewrapped during post-processing.

Footnotes in Poetry or Tables should be treated the same as other footnotes. Volunteers should tag them and leave them at the bottom of the page; the post-processor will decide on the final placement.

Original Footnoted Poetry:
Mary had a little lamb1
   Whose fleece was white as snow
And everywhere that Mary went
   The lamb was sure to go!

1 This lamb was obviously of the Hampshire breed,
well known for the pure whiteness of their wool.
Correctly Formatted Text:
/*
Mary had a little lamb[1]
  Whose fleece was white as snow
And everywhere that Mary went
  The lamb was sure to go!
*/

[Footnote 1: This lamb was obviously of the Hampshire breed,
well known for the pure whiteness of their wool.]

Italics

Format italicized text with <i> inserted at the start and </i> inserted at the end of the italics. (Note the "/" in the closing tag.)

Punctuation goes outside the italics, unless it is an entire sentence or section that is italicized, or the punctuation is itself part of a phrase, title, or abbreviation that is italicized.

The periods that mark an abbreviated word in the title of a journal such as Phil. Trans. are part of the title for italicization purposes, and are included within the italic tags, thus: <i>Phil. Trans.</i>.

For dates and similar phrases, format the entire phrase as italics, rather than marking the words as italics and the numbers as non-italics. The reason is that many typefaces found in older texts used the same design for numbers in both regular and italics.

If the italicized text consists of a series/list of words or names, mark these up with italics tags individually.

Examples—Italics:

Original Text:
Enacted 4 July, 1776
Correctly Formatted Text:
<i>Enacted 4 July, 1776</i>

Original Text:
God knows what she saw in me! I spoke
in such an affected manner.
Correctly Formatted Text:
<i>God knows what she saw in me!</i> I spoke
in such an affected manner.

Original Text:
As in many other of these Studies, and
Correctly Formatted Text:
As in many other of these <i>Studies</i>, and

Original Text:
(Psychological Review, 1898, p. 160)
Correctly Formatted Text:
(<i>Psychological Review</i>, 1898, p. 160)

Original Text:
L. Robinson, art. "Ticklishness,"
Correctly Formatted Text:
L. Robinson, art. "<i>Ticklishness</i>,"

Original Text:
December 3, morning.
1323 Picadilly Circus
Correctly Formatted Text:
/*
<i>December 3, morning.</i>
1323 Picadilly Circus
*/

Original Text:
Volunteers may be tickled pink to read
Ticklishness, Tickling and Laughter,
Remarks on Tickling and Laughter
and Ticklishness, Laughter and Humour.
Correctly Formatted Text:
Volunteers may be tickled pink to read
<i>Ticklishness</i>, <i>Tickling and Laughter</i>,
<i>Remarks on Tickling and Laughter</i>
and <i>Ticklishness, Laughter and Humour</i>.

Bold Text

Format bold text (text printed in a heavier typeface) with <b> inserted before the bold text and </b> after it. (Note the "/" in the closing tag.)

Punctuation goes outside the bold tags, unless it is an entire sentence or section that is in bold, or the punctuation is itself part of a phrase, title, or abbreviation that is in bold type.

See the Page Headers/Page Footers image/text for an example.

Some Project Managers may specify in the Project Comments that bold text be rendered as all caps.

Superscripts

Older books often abbreviated words as contractions, and printed them as superscripts. For example:
    Genʳˡ Washington defeated Lᵈ Cornwall's army.
Format these by inserting a single caret followed by the superscripted text, like this:
    Gen^rl Washington defeated L^d Cornwall's army.

In scientific & technical works, format superscripted characters with curly braces { and } surrounding them, even if there is only one character superscripted.
For example:
    ... up to xⁿ⁻¹ elements in the array.
would be formatted as
    ... up to x^{n-1} elements in the array.

The Project Manager may specify in the Project Comments that superscripted text be marked up differently.

Subscripts

Subscripted text is often found in scientific works, but is not common in other material. Format subscripted text by inserting an underline character _ and surrounding the text with curly braces { and }.
For example:
    H₂O.
would be formatted as
    H_{2}O.
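
The caret and underline markup for superscripts and subscripts can be sketched with two tiny helpers. The names `superscript` and `subscript` and the `technical` flag are hypothetical, chosen only for this example.

```python
def superscript(text, technical=False):
    """Caret markup; braces are added for scientific and technical works."""
    return f"^{{{text}}}" if technical else f"^{text}"

def subscript(text):
    """Underline character plus braces around the subscripted text."""
    return f"_{{{text}}}"

print("Gen" + superscript("rl") + " Washington")
print("x" + superscript("n-1", technical=True))
print("H" + subscript("2") + "O")
```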

Underlined Text

Format underlined text as Italics, with <i> and </i>. (Note the "/" in the closing tag.)

Underlining was often used to indicate emphasis when the typesetter was unable to actually italicize the text, for example in a typewritten document.

Some Project Managers may specify in the Project Comments that underlined text be marked up with the <u> and </u> tags.

S p a c e d   O u t   Text (gesperrt)

Format   s p a c e d   o u t   text with <g> inserted before the text and </g> after it. (Note the "/" in the closing tag.) Remove the extra spaces between letters in each word.

Punctuation goes outside the tags, unless it is an entire sentence or section that is spaced out, or the punctuation is itself part of a phrase that is spaced out.

This was a typesetting technique used to emphasize a piece of text in some older books, especially in German.
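
As a sketch of the steps above (remove the extra spaces between letters, then add the tags), the following Python function assumes the OCR text has a single space between the letters of a word and two or more spaces between words. That spacing, and the name `mark_gesperrt`, are assumptions of this example.

```python
import re

def mark_gesperrt(spaced):
    """Hypothetical helper: 's p a c e d   o u t' becomes '<g>spaced out</g>'."""
    words = re.split(r"\s{2,}", spaced.strip())             # wider gaps separate words
    joined = " ".join(re.sub(r"\s", "", word) for word in words)
    return f"<g>{joined}</g>"

print(mark_gesperrt("s p a c e d   o u t"))
```

If the OCR did not keep the wider gaps between words, the word boundaries have to be restored by hand from the image.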

Font Changes

Format a change of font within a paragraph or line of normal text by inserting <f> before the change in font and </f> after it. (Note the "/" in the closing tag.) Use this markup to identify any special font or other formatting, except bold, italic, small capped, and spaced out text, which have their own tags.

Possible uses of this markup include:

  • antiqua (a variant of roman font) inside fraktur
  • blackletter within a section of regular font
  • smaller or larger font only if it is within a paragraph in regular font (for a whole paragraph in a different font or size, see the block quotation section)
  • upright font inside of a paragraph of italicized text

The particular use or uses of this markup in a project will usually be spelled out in the Project Comments. Formatters should post in the Project Discussion if the markup appears to be needed and has not yet been discussed.

Punctuation goes outside the tags, unless it is an entire sentence that is in a different font, or the punctuation is itself part of a phrase, title, or abbreviation in the different font.
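
Every inline markup pair in these Guidelines (<i>, <b>, <u>, <g>, <f> and <sc>) needs a closing tag with a "/", and a forgotten slash is easy to miss. The following is a minimal sketch of a check for that, not an official tool; `unbalanced_tags` is a hypothetical name, and the sketch does not verify that differently named tags are nested in the right order.

```python
import re

INLINE_TAGS = {"i", "b", "sc", "g", "f", "u"}    # tags named in these Guidelines
TAG_PATTERN = re.compile(r"<(/?)([a-z]+)>")

def unbalanced_tags(text):
    """Return a list of problems such as a missing "/" in a closing tag."""
    problems = []
    open_tags = []
    for match in TAG_PATTERN.finditer(text):
        closing, name = match.group(1) == "/", match.group(2)
        if name not in INLINE_TAGS:
            continue
        if not closing:
            if name in open_tags:
                problems.append(f"<{name}> opened again before </{name}>")
            else:
                open_tags.append(name)
        elif name in open_tags:
            open_tags.remove(name)
        else:
            problems.append(f"</{name}> without a matching <{name}>")
    problems.extend(f"<{name}> never closed" for name in open_tags)
    return problems

print(unbalanced_tags("W. Mannhardt, <i>Der Baumkultus<i>, p. 510."))
```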

Font size changes

Normally we do not do anything to mark changes in font size.

The exceptions to this are when the font size changes to indicate a block quotation, or when the font size changes within a single paragraph or line of text (see Font Changes).

Words in all Capitals

Format words that are printed in all capital letters as all capital letters.

The exception to this is the first word of a chapter: many old books typeset the first word of these in all caps; this should be changed to upper and lower case, so "ONCE upon a time," becomes "Once upon a time,".
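
The first-word exception can be sketched as follows. Both function names are hypothetical, and the sketch handles only the first word of the paragraph; a leading quotation mark is left in place.

```python
def capitalize_first_letter_only(word):
    """Upper-case the first letter and lower-case the rest, skipping leading punctuation."""
    for index, char in enumerate(word):
        if char.isalpha():
            return word[:index] + char.upper() + word[index + 1:].lower()
    return word

def normalize_first_word(paragraph):
    """Hypothetical helper: change an all-caps first word of a chapter to mixed case."""
    first, separator, rest = paragraph.partition(" ")
    if first.isupper():
        first = capitalize_first_letter_only(first)
    return first + separator + rest

print(normalize_first_word("ONCE upon a time,"))
```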

Words in Small Capitals

The markup is different for Mixed Case Small Caps and all small caps:

Format words that are printed in Mixed Small Caps as mixed upper and lowercase, and surround the text with <sc> and </sc> markup.
    Example: This is Small Caps
    would correctly be: <sc>This is Small Caps</sc>.

Format words that are printed in all small caps as ALL-CAPS, and surround the text with <sc> and </sc> markup.
    Example: You cannot be serious about aardvarks!
    would correctly be: You cannot be serious about <sc>AARDVARKS</sc>!

Words in headings (Chapter Headings, Section Headings, Captions, etc.) that are entirely all-capped should be formatted as all-caps without any <sc> </sc>. The first word of a chapter that is in Small Caps should be changed to mixed case without the tags.

Large, Ornate opening Capital letter (Drop Cap)

Format a large and ornate graphic first letter of a chapter, section, or paragraph as if it were an ordinary letter.

Dashes, Hyphens, and Minus Signs

There are generally four such marks you will see in books:

  1. Hyphens. These are used to join words together, or sometimes to join prefixes or suffixes to a word.
    Leave these as a single hyphen, with no spaces on either side.
    Note that there is a common exception to this shown in the second example below.
  2. En-dashes. These are just a little longer, and are used for a range of numbers, or for a mathematical minus sign.
    Format these as a single hyphen, too. Spaces before or after are determined by the way it was done in the book; usually no spaces in number ranges, usually spaces around mathematical minus signs, sometimes both sides, sometimes just before.
  3. Em-dashes & long dashes. These serve as separators between words—sometimes for emphasis like this—or when a speaker gets a word caught in his throat——!
    Format these as two hyphens if the em-dash is short and four hyphens if the em-dash is long. Don't leave a space before or after, even if it looks like there was a space in the original book image.
  4. Deliberately Omitted or Censored Words or Names.
    Format these as 4 hyphens. When the dash represents a whole word, leave space around it as you would for any other word. If it represents only part of a word, use no spaces and join it to the rest of the word. If the dash is the same length as the ordinary em-dashes in the text, format it as a single em-dash, i.e. two hyphens.

Note: If an em-dash appears at the start or end of a line of your OCR'd text, join it with the other line so that there are no spaces or line breaks around it. Only if the author used an em-dash to start or end the paragraph or line of poetry or dialog should you leave it at the start or end of a line. See the examples below.

Examples—Dashes, Hyphens, and Minus Signs:

Original Image:
    semi-detached
Correctly Formatted Text:
    semi-detached
Type: Hyphen

Original Image:
    three- and four-part harmony
Correctly Formatted Text:
    three- and four-part harmony
Type: Hyphen

Original Image:
    discoveries which the Crus-
    aders made and brought home with
Correctly Formatted Text:
    discoveries which the Crusaders
    made and brought home with
Type: Hyphen

Original Image:
    factors which mold char-
    acter—environment, training and heritage,
Correctly Formatted Text:
    factors which mold character--environment,
    training and heritage,
Type: Hyphen

Original Image:
    See pages 21–25
Correctly Formatted Text:
    See pages 21-25
Type: En-dash

Original Image:
    –14° below zero
Correctly Formatted Text:
    -14° below zero
Type: En-dash

Original Image:
    X – Y = Z
Correctly Formatted Text:
    X - Y = Z
Type: En-dash

Original Image:
    2–1/2
Correctly Formatted Text:
    2-1/2
Type: En-dash

Original Image:
    I am hurt;—A plague
    on both your houses!—I am dead.
Correctly Formatted Text:
    I am hurt;--A plague
    on both your houses!--I am dead.
Type: Em-dash

Original Image:
    sensations—sweet, bitter, salt, and sour
    —if even all of these are simple tastes. What
Correctly Formatted Text:
    sensations--sweet, bitter, salt, and sour--if
    even all of these are simple tastes. What
Type: Em-dash

Original Image:
    senses—touch, smell, hearing, and sight—
    with which we are here concerned,
Correctly Formatted Text:
    senses--touch, smell, hearing, and sight--with
    which we are here concerned,
Type: Em-dash

Original Image:
    It is the east, and Juliet is the sun!—
Correctly Formatted Text:
    It is the east, and Juliet is the sun!--
Type: Em-dash

Original Image:
    "Three hundred——" "years," she was going to say, but the left-hand cat interrupted her.
Correctly Formatted Text:
    "Three hundred----" "years," she was going to say, but the left-hand cat interrupted her.
Type: Longer Em-dash

Original Image:
    As the witness Mr. —— testified,
Correctly Formatted Text:
    As the witness Mr. ---- testified,
Type: long dash

Original Image:
    As the witness Mr. S—— testified,
Correctly Formatted Text:
    As the witness Mr. S---- testified,
Type: long dash

Original Image:
    the famous detective of ——B Baker St.
Correctly Formatted Text:
    the famous detective of ----B Baker St.
Type: long dash

Original Image:
    “You —— Yankee”, she yelled.
Correctly Formatted Text:
    "You ---- Yankee", she yelled.
Type: long dash

Original Image:
    “I am not a d—d Yankee”, he replied.
Correctly Formatted Text:
    "I am not a d--d Yankee", he replied.
Type: Em-dash
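As a rough illustration of the dash rules above, the following Python sketch converts dash characters within a single line. It assumes that the OCR text represents a long dash as two or more consecutive em-dash characters, which will not hold for every project; the function name format_dashes is made up for this example. Judging whether a printed dash is short or long still has to be done by eye against the page image.

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"


def format_dashes(line: str) -> str:
    """Convert dash characters in one line of text (illustrative sketch)."""
    # Assumption: two or more em-dash characters in a row are a long dash.
    # Long dashes become four hyphens; surrounding spaces are left alone,
    # because a long dash may stand in for a whole word.
    line = re.sub(EM_DASH + "{2,}", "----", line)
    # A single em-dash becomes two hyphens with no space on either side.
    line = re.sub(" *" + EM_DASH + " *", "--", line)
    # An en-dash becomes one hyphen; spacing stays as printed.
    return line.replace(EN_DASH, "-")


print(format_dashes("See pages 21\u201325"))
print(format_dashes("As the witness Mr. S\u2014\u2014 testified,"))
```

Joining an em-dash that falls at the start or end of a line with the neighbouring line is not handled here, since that needs both lines at once.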

End-of-line Hyphenation

Where a hyphen appears at the end of a line, join the two halves of the hyphenated word back together. If it is really a hyphenated word like well-meaning, join the two halves leaving the hyphen in between. But if it was just hyphenated because it wouldn't fit on the line, and is not a word that is usually hyphenated, then join the two halves and remove the hyphen. Keep the joined word on the top line, and put a line break after it to preserve the line formatting—this makes it easier for the volunteers who come after you. See the Dashes, Hyphens, and Minus Signs section of these Guidelines for examples of each kind (nar-row turns into narrow, but low-lying keeps the hyphen). If the word is followed by punctuation, then carry that punctuation onto the top line, too.

Words like to-day and to-morrow that we don't commonly hyphenate now were often hyphenated in the old books we are working on. Leave them hyphenated the way the author did. If you're not sure if the author hyphenated it or not, leave the hyphen, put an * after it, and join the word together like this: to-*day. The asterisk will bring it to the attention of the post processor, who has access to all the pages, and can determine how the author typically wrote this word.
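The rejoining step described above can be sketched in Python as follows. The function rejoin_hyphenated and its hyphen parameter are hypothetical names for this illustration; the decision whether a word is "really" hyphenated is left to the caller, because it depends on the author's usage.

```python
def rejoin_hyphenated(top: str, bottom: str, hyphen: str = "drop") -> tuple[str, str]:
    """Move the second half of a broken word up to the top line.

    hyphen is "drop" (nar-row -> narrow), "keep" (low-lying stays
    hyphenated) or "unsure" (to-day -> to-*day for the post-processor).
    """
    # Lines ending in an em-dash ("--") or without a hyphen are left alone.
    if not top.endswith("-") or top.endswith("--"):
        return top, bottom
    joiner = {"drop": "", "keep": "-", "unsure": "-*"}[hyphen]
    # Everything up to the first space moves up, so punctuation attached
    # to the word is carried onto the top line as well.
    first_word, _, rest = bottom.lstrip().partition(" ")
    return top[:-1] + joiner + first_word, rest


top, bottom = rejoin_hyphenated(
    "discoveries which the Crus-", "aders made and brought home with"
)
print(top)
print(bottom)
```

Both lines are returned separately so that the line break is preserved.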

End-of-page Hyphenation

Format end-of-page hyphens or em-dashes by leaving the hyphen or em-dash at the end of the last line, and mark it with a * after the hyphen or em-dash.
For example, format:
 
    something Pat had already become accus-
as:
    something Pat had already become accus-*

On pages that start with part of a word from the previous page or an em-dash, place a * before the partial word or em-dash.
To continue the above example, format:
 
    tomed to from having to do his own family
as:
    *tomed to from having to do his own family

These markings indicate to the post-processor that the word must be rejoined when the pages are combined to produce the final e-book.
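A minimal Python sketch of the two markings, using hypothetical helper names. Whether a page begins with part of a word cannot be detected from the text alone, so it is passed in as a flag.

```python
def mark_page_end(last_line: str) -> str:
    """Add * after a hyphen or em-dash ("--") that ends the page."""
    stripped = last_line.rstrip()
    return stripped + "*" if stripped.endswith("-") else stripped


def mark_page_start(first_line: str, continues_word: bool) -> str:
    """Add * before a partial word or em-dash that starts the page."""
    return "*" + first_line if continues_word else first_line


print(mark_page_end("something Pat had already become accus-"))
print(mark_page_start("tomed to from having to do his own family", True))
```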

Single word at bottom of page

Format these by deleting the word, even if it's the second half of a hyphenated word.

In some older books, the single word at the bottom of the page (called a "catchword", usually printed near the right margin) indicates the first word on the next page of the book (called an "incipit"). It was used to alert the printer to print the correct reverse (called "verso"), to make it easier for printers' helpers to make up the pages prior to binding, and to help the reader avoid turning over more than one page.

Contractions

Remove any extra space in contractions: for example, would n't should be formatted as wouldn't.

This was often an early printers' convention, where the space was retained to indicate that 'would' and 'not' were originally separate words. It is also sometimes an artifact of the OCR. Remove the extra space in either case.

Some Project Managers may specify in the Project Comments not to remove extra spaces in contractions, particularly in the case of texts that contain slang, dialect, or are written in languages other than English.
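For projects where the default rule applies, the space removal could be sketched in Python like this. The list of contraction endings is an assumption made for the example (the Guidelines only show would n't), and the function name is hypothetical.

```python
import re

# Assumed endings for illustration; extend or shorten per project.
_CONTRACTION_GAP = re.compile(r"(?<=\w) (?=(?:n't|'ll|'ve|'re)\b)")


def close_contractions(text: str) -> str:
    """Remove the space that separates a word from its contraction ending."""
    return _CONTRACTION_GAP.sub("", text)


print(close_contractions("I would n't go, but they 'll come."))
```

The pattern only matches when the ending is a complete token, so an opening single quote before a longer word (as in 'really') is left untouched.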

Poetry/Epigrams

This section applies to an occasional Poem or Epigram in a mainly non-poetry book. For an entire book of poetry, see the special guidelines for Poetry Books.

Mark poetry or epigrams so the post-processor can find it more quickly. Insert a separate line with /* at the start of the poetry or epigram and a separate line with */ at the end. Leave a blank line between these markers and the rest of the text.

Preserve the relative indentation of the individual lines of the poem or epigram by adding 2, 4, 6 (or more) spaces in front of the indented lines to make them resemble the original.

When a line of verse is too long for the printed page, many texts wrap the continuation onto the next printed line and place a wide indentation in front of it. These continuation lines should be rejoined with the line above. Continuation lines usually start with a lower case letter. They appear at irregular points in the poem, unlike normal indentation, which occurs at regular intervals in the metre.

If the poetry is centered on the printed page, don't try to center the lines of poetry during formatting. Move the lines to the left margin, and preserve the relative indentation of the lines.

Footnotes in poetry should be treated the same as regular footnotes during formatting. See footnotes for details.

Line Numbers in poetry should be kept. Put them at the end of the line, leaving at least 6 spaces between them and the end of the text. See Line Numbers for details.

Check the Project Comments for the specific text you are formatting. Books of poetry often have special instructions from the Project Manager. Many times, you won't have to follow all these formatting guidelines for a book that is mostly or entirely poetry.


Sample Image:

Correctly Formatted Text:
to the scenery of his own country:

/*
          Oh, to be in England
          Now that April's there,
      And whoever wakes in England
      Sees, some morning, unaware,
That the lowest boughs and the brushwood sheaf
Round the elm-tree bole are in tiny leaf,
While the chaffinch sings on the orchard bough
              In England--now!

And after April, when May follows,
And the whitethroat builds, and all the swallows!
Hark! where my blossomed pear-tree in the hedge
Leans to the field and scatters on the clover
Blossoms and dewdrops--at the bent spray's edge--
That's the wise thrush; he sings each song twice over,
Lest you should think he never could recapture
The first fine careless rapture!
And though the fields look rough with hoary dew,
All will be gay, when noontide wakes anew
The buttercups, the little children's dower;
--Far brighter than this gaudy melon-flower!
*/

So it runs; but it is only a momentary memory;
and he knew, when he had done it, and to his

Letters/Correspondence

Format letters and correspondence as you would paragraphs. Put a blank line before the start of the letter; you do not need to duplicate any indenting.

Surround consecutive heading or footer lines (such as addresses, date blocks, salutations, or signatures) with /* and */ markers. Leave a blank line between the markers and the rest of the text. The markers will ensure the individual lines are kept in post-processing and not rewrapped.

Don't indent the heading or footer lines, even if they are indented or right justified in the original—just put them at the left margin. The post-processor will format them as needed.

Sample Image:

Correctly Formatted Text:

<i>John James Audubon to Claude François Rozier</i>

[Letter No. 1, addressed]

/*
<sc>M. Fr. Rozier</sc>,
Merchant-Nantes.
<sc>New York</sc>, <i>10 January, 1807.</i>

<sc>Dear Sir:</sc>
*/

We have had the pleasure of receiving by the <i>Penelope</i> your
consignment of 20 pieces of linen cloth, for which we send our
thanks. As soon as we have sold them, we shall take great
pleasure in making our return.

Lists of Items

Surround lists with /* and */ markers. Leave a blank line between these markers and the rest of the text. The markers will ensure the individual lines are not rewrapped during post-processing. Use this markup for any such list that should not be reformatted, including lists of questions & answers, items in a recipe, etc.

Original Text:
Andersen, Hans Christian   Daguerre, Louis J. M.    Melville, Herman
Bach, Johann Sebastian     Darwin, Charles          Newton, Isaac
Balboa, Vasco Nunez de     Descartes, René          Pasteur, Louis
Bierce, Ambrose            Earhart, Amelia          Poe, Edgar Allan
Carroll, Lewis             Einstein, Albert         Ponce de Leon, Juan
Churchill, Winston         Freud, Sigmund           Pulitzer, Joseph
Columbus, Christopher      Lewis, Sinclair          Shakespeare, William
Curie, Marie               Magellan, Ferdinand      Tesla, Nikola
Correctly Formatted Text:
/*
Andersen, Hans Christian
Bach, Johann Sebastian
Balboa, Vasco Nunez de
Bierce, Ambrose
Carroll, Lewis
Churchill, Winston
Columbus, Christopher
Curie, Marie
Daguerre, Louis J. M.
Darwin, Charles
Descartes, René
Earhart, Amelia
Einstein, Albert
Freud, Sigmund
Lewis, Sinclair
Magellan, Ferdinand
Melville, Herman
Newton, Isaac
Pasteur, Louis
Poe, Edgar Allan
Ponce de Leon, Juan
Pulitzer, Joseph
Shakespeare, William
Tesla, Nikola
*/
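The conversion shown above, from a multi-column printed list to a single column, can be sketched in Python. This assumes that columns in the OCR text are separated by two or more spaces and that the list reads down each column in turn, as in the example; neither is guaranteed for a real page. The function name is hypothetical.

```python
import re


def unstack_columns(rows: list[str]) -> list[str]:
    """Turn a multi-column list into one column wrapped in /* and */."""
    # Assumption: cells are separated by runs of two or more spaces.
    cells = [re.split(r" {2,}", row.strip()) for row in rows]
    widest = max(len(row) for row in cells)
    result = ["/*"]
    # Read down the first column, then the second, and so on.
    for column in range(widest):
        for row in cells:
            if column < len(row):
                result.append(row[column])
    result.append("*/")
    return result


sample = [
    "Andersen, Hans Christian   Daguerre, Louis J. M.    Melville, Herman",
    "Bach, Johann Sebastian     Darwin, Charles          Newton, Isaac",
]
print("\n".join(unstack_columns(sample)))
```

The blank lines between the markers and the surrounding text still have to be added when the result is placed on the page.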

Tables

Surround tables with /* and */ markers. Leave a blank line between these markers and the rest of the text. The markers will ensure the individual lines are not rewrapped during post-processing. Format the table with spaces to look approximately like the original table. Don't make the table wider than 75 characters. Project Gutenberg's guidelines go on to say "...except where it can't be helped. Never, ever longer than 80...".

Do not use tabs for formatting—use space characters only. Tab characters will line up differently between computers, and your careful formatting will not always display the same way.

It's often hard to format tables in plain ASCII text; just do your best. This is much easier if you use a mono-spaced font such as DPCustomMono or Courier. Remember that the goal is to preserve the Author's meaning, while producing a readable table in an e-book. Sometimes this requires sacrificing the original format of the table on the printed page. Check the Project Comments and discussion thread because other volunteers may have settled on a specific format. If there is nothing there, you might find something useful in the Gallery of Table Layouts forum thread.

Footnotes in tables should go at the end of the table. See footnotes for details.
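The two mechanical rules above (no tab characters, and a width of 75 characters with 80 as the absolute limit) lend themselves to a quick check. The following Python sketch uses a hypothetical function name and message wording.

```python
def check_table(lines: list[str]) -> list[tuple[int, str]]:
    """Report tab characters and over-wide lines in a formatted table."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if "\t" in line:
            problems.append((number, "contains a tab"))
        if len(line) > 80:
            problems.append((number, "wider than 80 characters"))
        elif len(line) > 75:
            problems.append((number, "wider than 75 characters"))
    return problems


table = ["Deg. C.\tPure Benzene.", " -10°               13.4                 43.5"]
for number, problem in check_table(table):
    print(f"line {number}: {problem}")
```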

Sample Image:

Correctly Formatted Text:
/*
Deg. C.   Millimeters of Mercury.    Gasolene.
               Pure Benzene.

 -10°               13.4                 43.5
   0°               26.6                 81.0
 +10°               46.6                132.0
  20°               76.3                203.0
  40°              182.0                301.8
*/

Sample Image:

Correctly Formatted Text:
/*
TABLE II.

-----------------------+----+-----++-------------------------+----+------
                       | C  |     ||                         |  C |
Flat strips compared   | o  |     ||                         |  o |
with round wire 30 cm. | p  |Iron.|| Parallel wires 30 cm.   |  p | Iron.
in length.             | p  |     || in length.              |  p |
                       | e  |     ||                         |  e |
                       | r  |     ||                         |  r |
                       | .  |     ||                         |  . |
-----------------------+----+-----++-------------------------+----+------
Wire 1 mm. diameter    | 20 | 100 || Wire 1 mm. diameter     | 20 |  100
-----------------------+----+-----++-------------------------+----+------
        STRIPS.        |    |     ||       SINGLE WIRE.      |    |
0.25 mm. thick, 2 mm.  |    |     ||                         |    |
  wide                 | 15 |  35 || 0.25 mm. diameter       | 16 |   48
Same, 5 mm. wide       | 13 |  20 || Two  similar wires      | 12 |   30
 "   10  "    "        | 11 |  15 || Four    "      "        |  9 |   18
 "   20  "    "        | 10 |  14 || Eight   "      "        |  8 |   10
 "   40  "    "        |  9 |  13 || Sixteen "      "        |  7 |    6
Same strip rolled up in|    |     || Same, 16 wires bound    |    |
  the form of wire     | 17 |  15 ||   close together        | 18 |   12
-----------------------+----+-----++-------------------------+----+------
*/

Block Quotations

Surround block quotations with /# and #/ markers. Leave a blank line between these markers and the rest of the text. The markers will ensure the block quotation is formatted properly during post-processing.

Apart from adding the markers, block quotations should be formatted as any other text.

Block quotations are long quotations (typically several lines and sometimes several pages) and are often (but not always) printed with wider margins or in a smaller font size—sometimes both.

Sample Image:

Correctly Formatted Text:

later day was welcomed in their home on the Hudson.
Dr. Bakewell's contribution was as follows:[24]

/#
The uncertainty as to the place of Audubon's birth has been
put to rest by the testimony of an eye witness in the person
of old Mandeville Marigny now dead some years. His repeated
statement to me was, that on his plantation at Mandeville,
Louisiana, on Lake Ponchartrain, Audubon's mother was
his guest; and while there gave birth to John James Audubon.
Marigny was present at the time, and from his own lips, I have,
as already said, repeatedly heard him assert the above fact.
He was ever proud to bear this testimony of his protection
given to Audubon's mother, and his ability to bear witness as
to the place of Audubon's birth, thus establishing the fact that
he was a Louisianian by birth.
#/

We do not doubt the candor and sincerity of the
excellent Dr. Bakewell, but are bound to say that the
incidents as related above betray a striking lapse of

Double Quotes

Format these as plain ASCII " double quotes. Do not change double quotes to single quotes. Leave them as the Author wrote them.

For quotes from non-English languages, use the quotation marks appropriate to that language if they are available. The French equivalent, guillemets, «like this», are available from the pulldown menus in the proofreading interface, since they are part of Latin-1. Remember to remove space between the guillemets and the quoted text; if needed, it will be added in post-processing. The same applies to languages which use reversed guillemets, »like this«.

The quotation marks used in some texts (in German or other languages), „like this” are not available in the pulldown menus, as they are not in Latin-1. In that case, follow the instructions in the project comments.

The Project Manager may instruct you in the Project Comments to format non-English language quotation marks differently for a particular book.

Single Quotes

Format these as the plain ASCII ' single quote (apostrophe). Do not change single quotes to double quotes. Leave them as the Author wrote them.
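For English-language text, the double and single quote rules amount to replacing typographic quote characters with their ASCII equivalents without swapping one kind for the other. A small Python sketch, with a hypothetical function name, follows. It should not be applied to projects whose Project Comments ask for other quotation marks to be kept.

```python
# Curly double quotes map to ", curly single quotes map to '.
_STRAIGHT = str.maketrans({
    "\u201c": '"',
    "\u201d": '"',
    "\u2018": "'",
    "\u2019": "'",
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with plain ASCII quotes; guillemets are kept."""
    return text.translate(_STRAIGHT)


print(straighten_quotes("\u201cYou ---- Yankee\u201d, she yelled."))
```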

Quote Marks on each line

Format quotation marks at the beginning of each line of a quotation by removing all of them except for the one at the start of the first line of the quotation.

If the quotation goes on for multiple paragraphs, each paragraph should have an opening quote mark on the first line of the paragraph.

Often there is no closing quotation mark until the very end of the quoted section of text, which may not be on the same page you are formatting. Leave it that way—do not add closing quotation marks that are not in the page image.

There are some language-specific exceptions. In French, for example, dialog within quotations uses a combination of different punctuation to indicate various speakers. If you are not familiar with a particular language, check the Project Comments or leave a message for the Project Manager in the Project Discussion for clarification.
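For English text, the rule at the start of this section can be sketched in Python as below. The function name is hypothetical, and the sketch assumes every quote mark at the start of a continuation line is a repeated one. A line that genuinely opens a new quotation would be altered wrongly, so the result needs checking against the image.

```python
def strip_repeated_quotes(paragraph_lines: list[str]) -> list[str]:
    """Keep the opening quote on the first line of a paragraph only."""
    if not paragraph_lines:
        return []
    first, *rest = paragraph_lines
    cleaned = [first]
    for line in rest:
        # Drop one leading quote mark and any space that followed it.
        cleaned.append(line[1:].lstrip() if line.startswith('"') else line)
    return cleaned


page = ['"The uncertainty as to the place', '"has been put to rest', '"by an eye witness.']
print("\n".join(strip_repeated_quotes(page)))
```

No closing quote is added, in line with the instruction to leave the text as it appears in the page image.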

End-of-sentence Periods

Format periods that end sentences with a single space after them.

You do not need to remove extra spaces after periods if they're already in the OCR'd text—we can do that automatically during post-processing. See the Chapter Headers image and text for an example.

Punctuation

In general, there should be no space before punctuation characters except opening quotation marks. If the OCR'd text has a space before punctuation, remove it. This applies even to languages, such as French, which normally use spaces before punctuation characters.

Spaces before punctuation sometimes appear because books typeset in the 1700's & 1800's often used partial spaces before punctuation such as a semicolon or comma.

Scanned Text:
and so it goes ; ever and ever.
Correctly Formatted Text:
and so it goes; ever and ever.
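The correction above can be sketched as a single regular expression in Python. The function name is hypothetical. The pattern deliberately skips a period that begins a run of dots, so that the space before an English ellipsis is not removed.

```python
import re

# A space before ; : , ! ? or before a single period (not before "...").
_SPACE_BEFORE_PUNCT = re.compile(r" +([;:,!?]|\.(?!\.))")


def tighten_punctuation(text: str) -> str:
    """Remove spaces that separate punctuation from the preceding word."""
    return _SPACE_BEFORE_PUNCT.sub(r"\1", text)


print(tighten_punctuation("and so it goes ; ever and ever."))
```

Spaces after punctuation and spaces before opening quotation marks are not touched.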

Line Breaks

Leave all line breaks in so that the next formatter and the post-processor can easily compare the lines in the text to the lines in the image. Be especially careful about this when rejoining hyphenated words or moving words around em-dashes. If the previous volunteer removed the line breaks, please replace them so that they once again match the image.

Extra blank lines that are not in the image should be removed except where we intentionally add them for formatting. But blank lines at the bottom of the page are fine—these are removed when you save the page.

Extra Spaces or Tabs Between Words

Extra spaces and tab characters between words are common in OCR output. You don't need to bother removing these—that can be done automatically during post-processing.

However, extra spaces around punctuation, em-dashes, quote marks, etc. do need to be removed when they separate the symbol from the word.

For example, in A horse ;  my kingdom for a horse. the space between the word "horse" and the semicolon should be removed. But the 2 spaces after the semicolon are fine—you don't have to delete one of them.

Trailing Space at End-of-line

Do not bother inserting spaces at the ends of lines of text. It is a waste of your time for something that we can take care of automatically later. Similarly do not waste your time removing extra spaces at the ends of lines.

Line Numbers

Keep line numbers. Place them at least six spaces past the right hand end of the line, even if they are on the left side of the poetry/text in the original image.

Line numbers are numbers in the margin for each line, or sometimes every fifth or tenth line, and are common in books of poetry. Since poetry will not be reformatted in the e-book version, the line numbers will be useful to readers.

Extra Spacing/Stars/Line Between Paragraphs

Most paragraphs start on the line immediately after the end of the previous one. Sometimes two paragraphs are separated to indicate a "thought break." A "thought break" may take the form of a line of stars, hyphens, or some other character, a plain or floridly decorated horizontal line, a simple decoration, or even just an extra blank line or two.

A "thought break" may represent a change of scene or subject, a lapse in time, or a bit of suspense. This is intended by the author, so we preserve it by putting a blank line, <tb>, and then another blank line.

Project Managers and/or Post-Processors may request that additional information be retained in the thought break markup. For example, some projects distinguish different types of breaks by using different styles, such as a line of stars in one place and a blank line in another. In these cases, the Project Comments may request that these be marked up as <tb stars> and <tb>. Please, as always, read the project comments carefully so that you will know what is required for each project. Also be careful not to carry these special requests into other projects with different requirements.

Sometimes printers used decorative lines to mark the ends of chapters. As we already mark Chapter Headers, there is no need to add a "thought break" marker.

The proofreading interface has the "thought break" marker available to cut and paste.


Sample Image:
thought break
Correctly Formatted Text:

like the gentleman with the spiritual hydrophobia
in the latter end of Uncle Tom's Cabin.
Unconsciously Mr. Dixon has done his best to
prove that Legree was not a fictitious character.

<tb>

Joel Chandler Harris, Harry Stillwell Edwards,
George W. Cable, Thomas Nelson Page,
James Lane Allen, and Mark Twain are Southern
men in Mr. Griffith's class. I recommend

Period Pause "..." (Ellipsis)

The guidelines are different for English and Languages Other Than English (LOTE).

ENGLISH: Leave a space before the three dots, and a space after. The exception is at the end of a sentence, when there would be no space, four dots, and a space after. This is also the case for any other ending punctuation mark: the 3 dots follow immediately, without any space.

For example:

     That I know ... is true.
     This is the end....
     Wherefore art thou Romeo?...

Sometimes you will see it with the punctuation at the end; so format it that way:

     Wherefore art thou Romeo...?

Remove extra dots, if any, or add new ones, if necessary, to bring the number to three (or four) as appropriate.

LOTE: (Languages Other Than English) Use the general rule "Follow closely the style used in the printed page." In particular, insert spaces, if there are spaces before or between the periods, and use the same number of periods as appear in the image. Sometimes the printed page is unclear; in that case, insert a [**unclear] to draw the attention of the post-processor. (Note: Post Processors should replace those regular spaces with non-breaking spaces.)

Accented/Non-ASCII Characters

Please format these using the proper accented Latin-1 characters, where possible. See Diacritical marks for ways to format some non-Latin-1 characters.

If they are not on your keyboard, there are several ways of inputting these characters:

  • The pull-down menus in the proofreading interface.
  • Applets included with your operating system.
    • Windows: "Character Map"
      Access it through:
      Start: Run: charmap, or
      Start: Accessories: System Tools: Character Map.
    • Macintosh: Key Caps or "Keyboard Viewer"
      For OS 9 and lower this is on the Apple Menu,
      For OS X through 10.2, this is located in the Applications, Utilities folder
      For OS X 10.3 and higher, this is in the Input Menu as "Keyboard Viewer."
    • Linux: Various, depending on your desktop environment.
      For KDE, try KCharSelect (in the Utilities submenu of the start menu).
  • An on-line program, such as Edicode.
  • Keyboard shortcuts.
    (See the tables for Windows and Macintosh below.)
  • Switching to a keyboard layout or locale which supports "deadkey" accents.
    • Windows: Control Panel (Keyboard, Input Locales)
    • Macintosh: Input Menu (on Menu Bar)
    • Linux: Change the keyboard in your X configuration.

The original Project Gutenberg will post as a minimum, 7-bit ASCII versions of texts, but versions using other character encodings which can preserve more of the information from the original text are accepted. Project Gutenberg Europe publishes UTF-8 as its default encoding, but other appropriate encodings are also welcomed.

Currently for Distributed Proofreaders this means using Latin-1 or ISO 8859-1 and -15, and in the future will include Unicode.

Distributed Proofreaders Europe already uses Unicode.

For Windows:

  • You can use the Character Map program (Start: Run: charmap) to select an individual letter, and then cut & paste.
  • The dropdown menus in the proofreading interface.
  • Or you can type the Alt+NumberPad shortcut codes for these characters.
    This is faster than using cut & paste, once you get used to the codes.
    Hold the Alt key and type the four digits on the Number Pad—the number row over the letters won't work.
    You must type all 4 digits, including the leading 0 (zero). Note that the capital version of a letter is 32 less than the lower case.
    These instructions are for the US-English keyboard layout. They may not work for other keyboard layouts.
    The table below shows the codes we use. (Print-friendly version of this table)
    Do not use other special characters unless the Project Manager tells you to in the Project Comments.

Windows Shortcuts for Latin-1 symbols
` grave: à Alt-0224 | À Alt-0192 | è Alt-0232 | È Alt-0200 | ì Alt-0236 | Ì Alt-0204 | ò Alt-0242 | Ò Alt-0210 | ù Alt-0249 | Ù Alt-0217
´ acute (aigu): á Alt-0225 | Á Alt-0193 | é Alt-0233 | É Alt-0201 | í Alt-0237 | Í Alt-0205 | ó Alt-0243 | Ó Alt-0211 | ú Alt-0250 | Ú Alt-0218
^ circumflex: â Alt-0226 | Â Alt-0194 | ê Alt-0234 | Ê Alt-0202 | î Alt-0238 | Î Alt-0206 | ô Alt-0244 | Ô Alt-0212 | û Alt-0251 | Û Alt-0219
~ tilde: ã Alt-0227 | Ã Alt-0195 | ñ Alt-0241 | Ñ Alt-0209 | õ Alt-0245 | Õ Alt-0213
¨ umlaut: ä Alt-0228 | Ä Alt-0196 | ë Alt-0235 | Ë Alt-0203 | ï Alt-0239 | Ï Alt-0207 | ö Alt-0246 | Ö Alt-0214 | ü Alt-0252 | Ü Alt-0220 | ÿ Alt-0255
° ring: å Alt-0229 | Å Alt-0197
Æ ligature: æ Alt-0230 | Æ Alt-0198
/ slash: ø Alt-0248 | Ø Alt-0216
Œ ligature: œ Use [oe] | Œ Use [OE]
ç cedilla: ç Alt-0231 | Ç Alt-0199
Icelandic: Þ Alt-0222 | þ Alt-0254 | Ð Alt-0208 | ð Alt-0240
sz ligature: ß Alt-0223
marks: © Alt-0169 | ® Alt-0174 | ™ Alt-0153 | ¶ Alt-0182 | § Alt-0167 | ¦ Alt-0166
accents: ´ Alt-0180 | ¨ Alt-0168 | ¯ Alt-0175 | ¸ Alt-0184
punctuation: ¿ Alt-0191 | ¡ Alt-0161 | « Alt-0171 | » Alt-0187 | · Alt-0183 | * Alt-0042
currency: ¢ Alt-0162 | £ Alt-0163 | ¥ Alt-0165 | $ Alt-0036 | ¤ Alt-0164
mathematics: ± Alt-0177 | × Alt-0215 | ÷ Alt-0247 | ¬ Alt-0172 | ° Alt-0176 | µ Alt-0181 | ¼ Alt-0188 [1] | ½ Alt-0189 [1] | ¾ Alt-0190 [1]
superscripts: ¹ Alt-0185 | ² Alt-0178 | ³ Alt-0179
ordinals: º Alt-0186 | ª Alt-0170

[1] Unless specifically requested by the Project Comments, please do not use the fraction symbols, but instead use the guidelines for Fractions. (1/2, 1/4, 3/4, etc.)
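For the Latin-1 characters, the four-digit Alt codes listed above are the character's code point written in decimal with a leading zero, which is also why a capital letter's code is 32 less than its lower-case form. The Python sketch below illustrates this with hypothetical function names. It covers Latin-1 only, so a character outside that range is rejected, and the 32 rule applies only to letters that have a capital in Latin-1 (not to ÿ or ß, for example).

```python
def alt_code(char: str) -> str:
    """Return the Alt+NumberPad code for a Latin-1 character."""
    code = ord(char)
    if code > 255:
        raise ValueError(f"{char!r} is not a Latin-1 character")
    # Four digits, including the leading zero.
    return f"Alt-{code:04d}"


def capital_alt_code(lower: str) -> str:
    """Return the code for the capital of an accented lower-case letter."""
    return alt_code(chr(ord(lower) - 32))


print(alt_code("\u00e0"), capital_alt_code("\u00e0"))
```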

For Apple Macintosh:

  • You can use the "Key Caps" program as a reference.
    In OS 9 & earlier, this is located in the Apple Menu; in OS X through 10.2, it is located in Applications, Utilities folder.
    This brings up a picture of the keyboard, and pressing shift, opt, command, or combinations of those keys shows how to produce each character. Use this reference to see how to type that character, or you can cut & paste it from here into the text in the proofreading interface.
  • In OS X 10.3 and higher, the same function is now a palette available from the Input menu (the drop-down menu attached to your locale's flag icon in the menu bar). It's labeled "Show Keyboard Viewer." If this isn't in your Input menu, or if you don't have that menu, you can activate it by opening System Preferences, the "International" panel, and selecting the "Input Menu" pane. Ensure that "Show input menu in menu bar" is checked. In the spreadsheet view, check the box for "Keyboard Viewer" in addition to any input locales you use.
  • The dropdown menus in the proofreading interface.
  • Or you can type the Apple Opt- shortcut codes for these characters.
    This is a lot faster than using cut & paste, once you get used to the codes.
    Hold the Opt key and type the accent symbol, then type the letter to be accented (or, for some codes, only hold the Opt key and type the symbol).
    These instructions are for the US-English keyboard layout. They may not work for other keyboard layouts.
    The table below shows the codes we use. (Print-friendly version of this table)
    Do not use other special characters unless the Project Manager tells you to in the Project Comments.

Apple Mac Shortcuts for Latin-1 symbols
` grave: à Opt-`, a | À Opt-`, A | è Opt-`, e | È Opt-`, E | ì Opt-`, i | Ì Opt-`, I | ò Opt-`, o | Ò Opt-`, O | ù Opt-`, u | Ù Opt-`, U
´ acute (aigu): á Opt-e, a | Á Opt-e, A | é Opt-e, e | É Opt-e, E | í Opt-e, i | Í Opt-e, I | ó Opt-e, o | Ó Opt-e, O | ú Opt-e, u | Ú Opt-e, U
^ circumflex: â Opt-i, a | Â Opt-i, A | ê Opt-i, e | Ê Opt-i, E | î Opt-i, i | Î Opt-i, I | ô Opt-i, o | Ô Opt-i, O | û Opt-i, u | Û Opt-i, U
~ tilde: ã Opt-n, a | Ã Opt-n, A | ñ Opt-n, n | Ñ Opt-n, N | õ Opt-n, o | Õ Opt-n, O
¨ umlaut: ä Opt-u, a | Ä Opt-u, A | ë Opt-u, e | Ë Opt-u, E | ï Opt-u, i | Ï Opt-u, I | ö Opt-u, o | Ö Opt-u, O | ü Opt-u, u | Ü Opt-u, U | ÿ Opt-u, y
° ring: å Opt-a | Å Opt-A
Æ ligature: æ Opt-' | Æ Opt-"
/ slash: ø Opt-o | Ø Opt-O
Œ ligature: œ Use [oe] | Œ Use [OE]
ç cedilla: ç Opt-c | Ç Opt-C
Icelandic: Þ (none) ‡ | þ (none) ‡ | Ð (none) ‡ | ð (none) ‡
sz ligature: ß Opt-s
marks: © Opt-g | ® Opt-r | ™ Opt-2 | ¶ Opt-7 | § Opt-6 | ¦ (none) ‡
accents: ´ Opt-E | ¨ Opt-U | ¯ Shift-Opt-, | ¸ Opt-Z
punctuation: ¿ Opt-? | ¡ Opt-1 | « Opt-\ | » Shift-Opt-\ | · Shift-Opt-9 | * Shift-8
currency: ¢ Opt-4 | £ Opt-3 | ¥ Opt-y | $ Shift-4 | ¤ (none) ‡
mathematics: ± Shift-Opt-= | × (none) ‡ | ÷ Opt-/ | ¬ Opt-l | ° Shift-Opt-8 | µ Opt-m | ¼ (none) ‡ [1] | ½ (none) ‡ [1] | ¾ (none) ‡ [1]
superscripts: ¹ (none) ‡ | ² (none) ‡ | ³ (none) ‡
ordinals: º Opt-0 | ª Opt-9

‡ Note: No equivalent shortcut, use drop-down menus.

[1] Unless specifically requested by the Project Comments, please do not use the fraction symbols, but instead use the guidelines for Fractions. (1/2, 1/4, 3/4, etc.)

Characters with Diacritical marks

In some projects, you will find characters with special marks either above or below the normal Latin A...Z character. These are called diacritical marks, and indicate a special pronunciation for this character. For formatting, we indicate them in our normal ASCII text by using a specific coding, such as: ă becomes [)a] for a breve (the u-shaped accent) above an a, or [a)] for a breve below.

Be sure to include the square brackets ([ ]) around these codes, so the post-processor knows which letter the mark applies to. The post-processor will eventually replace each code with whatever symbol works in each version of the text being produced, such as 7-bit ASCII, 8-bit, Unicode, or HTML.

Note that for some combinations of mark and character (mainly vowels), our standard Latin-1 character set already includes the character with its diacritical mark. In those cases, use the Latin-1 character itself, available from the drop-down lists in the proofreading interface.

The table below lists the special codings currently used:
The "x" represents a character with a diacritical mark.
When formatting, use the actual character from the text, not the x shown in the examples.

Proofreading Symbols for Diacritical Marks

diacritical mark            sample   above            below
macron (straight line)      ¯        [=x]             [x=]
2 dots (dieresis, umlaut)   ¨        [:x]             [x:]
1 dot                       ·        [.x]             [x.]
grave accent                `        [`x] or [\x]     [x`] or [x\]
acute accent (aigu)         ´        ['x] or [/x]     [x'] or [x/]
circumflex                  ˆ        [^x]             [x^]
caron (v-shaped symbol)              [vx]             [xv]
breve (u-shaped symbol)              [)x]             [x)]
tilde                       ˜        [~x]             [x~]
cedilla                     ¸        [,x]             [x,]
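
To show how a post-processor might later turn these codes back into real characters, here is a minimal sketch that maps each code to a Unicode combining mark. The choice of combining characters, the function name decode_diacritics, and the decision to skip a cedilla "above" are all our own assumptions; actual post-processing tools may work quite differently.

```python
import re
import unicodedata

# Marker -> Unicode combining mark placed ABOVE the letter (marker comes first).
ABOVE = {
    "=": "\u0304", ":": "\u0308", ".": "\u0307",
    "`": "\u0300", "\\": "\u0300", "'": "\u0301", "/": "\u0301",
    "^": "\u0302", "v": "\u030c", ")": "\u0306", "~": "\u0303",
}
# Marker -> Unicode combining mark placed BELOW the letter (marker comes second).
BELOW = {
    "=": "\u0331", ":": "\u0324", ".": "\u0323",
    "`": "\u0316", "\\": "\u0316", "'": "\u0317", "/": "\u0317",
    "^": "\u032d", "v": "\u032c", ")": "\u032e", "~": "\u0330",
    ",": "\u0327",
}

# Exactly two characters inside square brackets, e.g. [)a] or [s.]
CODE = re.compile(r"\[(.)(.)\]")


def decode_diacritics(text):
    """Replace bracketed diacritic codes with letter + combining mark (assumed mapping)."""
    def replace(match):
        first, second = match.group(1), match.group(2)
        if first in ABOVE and second.isalpha():
            # NFC composes to a single character where Unicode has one.
            return unicodedata.normalize("NFC", second + ABOVE[first])
        if second in BELOW and first.isalpha():
            return unicodedata.normalize("NFC", first + BELOW[second])
        return match.group(0)  # not a diacritic code, e.g. [oe]; leave unchanged
    return CODE.sub(replace, text)


if __name__ == "__main__":
    print(decode_diacritics("[)a] [=o] [s.] [oe]"))
```

Because "v" is both a letter and the caron marker, a two-letter code that begins with "v" is read as a caron above the second letter; a real tool would need a rule for such cases.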

Non-Latin Characters

Some projects contain text printed in non-Latin characters; that is, characters other than the Latin A...Z—for example, Greek, Cyrillic (used in Russian, Slavic, and other languages), Hebrew, or Arabic characters.

For Greek, you should attempt a transliteration. Transliteration involves converting each character of the foreign text into the equivalent Latin letter(s). A Greek transliteration tool is provided in the proofreading interface to make this task much easier.

Press the "Greek Transliterator" button near the bottom of the proofreading interface to pop up the tool. In the tool, click on the Greek characters that match the word or phrase you are transliterating, and the appropriate Latin-1 characters will appear in the text box. When you are done, simply cut and paste this transliterated text into the page you are formatting. Surround the transliterated text with the Greek markers [Greek:  and ]. For example, Βιβλος would become [Greek: Biblos]. ("Book"—so appropriate for DP!)
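
The idea behind the tool can be sketched in a few lines. The letter table below is a simplified assumption of ours (unaccented letters only, no breathing marks, upsilon always given as u) and is not the mapping used by the actual "Greek Transliterator"; the function name transliterate_greek is likewise hypothetical.

```python
# Simplified, assumed letter table: lowercase Greek -> Latin-1 equivalent.
GREEK_TO_LATIN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "ê", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "u", "φ": "ph", "χ": "ch", "ψ": "ps",
    "ω": "ô",
}


def transliterate_greek(text):
    """Transliterate letter by letter and wrap the result in [Greek: ...] markers."""
    out = []
    for ch in text:
        latin = GREEK_TO_LATIN.get(ch.lower())
        if latin is None:
            out.append(ch)                  # spaces, punctuation, unknown letters
        elif ch.isupper():
            out.append(latin.capitalize())  # e.g. capital theta -> "Th"
        else:
            out.append(latin)
    return "[Greek: " + "".join(out) + "]"


if __name__ == "__main__":
    print(transliterate_greek("Βιβλος"))  # [Greek: Biblos]
```

Accented letters and breathing marks pass through untouched in this sketch, which is one reason to rely on the pop-up tool and the Greek HOWTO rather than a simple lookup.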

If you are uncertain about your transliteration, mark it with ** to bring it to the attention of the next formatter or the post-processor.

For other languages that cannot be so easily transliterated, such as Cyrillic, Hebrew, or Arabic, surround the text with the appropriate markers: [Cyrillic: **], [Hebrew: **], or [Arabic: **], and leave it as scanned. Include the ** so the post-processor can address it later.
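
If you are unsure which marker applies, the script of a character can be read from its Unicode name. The sketch below is only an illustration under that assumption; the function names detect_script and placeholder_marker are hypothetical, and the marker strings are the ones given above.

```python
import unicodedata

# Scripts that these guidelines say to mark rather than transliterate.
SCRIPTS = ("Cyrillic", "Hebrew", "Arabic")


def detect_script(text):
    """Return the first of SCRIPTS found in text, judged by Unicode character names."""
    for ch in text:
        name = unicodedata.name(ch, "")
        for script in SCRIPTS:
            if name.startswith(script.upper()):
                return script
    return None


def placeholder_marker(text):
    """Return the marker for the detected script, e.g. "[Cyrillic: **]", or None."""
    script = detect_script(text)
    if script is None:
        return None
    return "[" + script + ": **]"


if __name__ == "__main__":
    print(placeholder_marker("книга"))
```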

  • Greek: Greek HOWTO (from Project Gutenberg) or see the "Greek Transliterator" pop-up tool in the proofreading interface.
  • Cyrillic: While a standard transliteration scheme exists for Cyrillic, we only recommend you attempt a transliteration if you are fluent in a language that uses it. Otherwise, just mark