LaTeX text formatting guidelines 2006

From DPWiki

These are guidelines for formatting the text of DP projects using LaTeX.

The proofing and formatting guidelines are also available as typeset manuals, suitable for online viewing and printing.


The Primary Rule

"Don't change what the author wrote!"

The final electronic book seen by a reader, possibly many years in the future, should accurately convey the intent of the author. If the author spelled words oddly, leave them spelled that way. If the author wrote outrageous racist or biased statements, leave them that way. If the author puts italics, bold text or a footnote every third word, mark them italicized, bolded or footnoted.

We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (end-of-line hyphenation). Changes such as these help us produce a consistently formatted version of the book. The proofreading and formatting rules we follow are designed to achieve this result. Please carefully read the rest of the Formatting Guidelines with this concept in mind.

To assist the next proofreader and the post-processor, we also preserve line breaks. This allows them to easily compare the lines in the text to the lines in the image.

Summary Guidelines

In LaTeX projects, formatters are more responsible for semantic markup than in non-LaTeX projects. Please take the online LaTeX formatting quizzes and practice with the self-study package (~1.3 MB zip file).

When you start working on project pages, please strive for a simple semantic representation of the page, not necessarily a visual facsimile. In cases of doubt, ask in the project thread or the "All questions welcome" forum. Take your time to do careful work (expect to spend about 30-60 minutes per page when you start out), and don't hesitate to ask for feedback when you start formatting, either by posting in the project forum or sending a PM to dp-feedback.

With the following exceptions, format text just as in non-LaTeX projects. Project comments supersede these guidelines, so please check the project comments and read the project forum before starting to work on a project.

  • Precede the characters # $ & and % with a backslash if they appear in text: \# \$ \& \%.
  • Retain Latin-1 characters added by the proofreading rounds; don't replace them with equivalent LaTeX commands.
  • Comment out most front- and back-matter--the title page(s), table of contents, and index--by placing \iffalse at the top of the page and \fi [sic] at the bottom.
  • Comment out proofers' notes with the LaTeX comment character "%", and add a newline after the text of the comment.
  • Add % [Blank page] and % [Illustration] notes as in non-LaTeX projects, but preceded with a "%".
  • Mark sectional units with appropriate semantic commands: \chapter{}, \section{}, \subsection{} for numbered units, or their starred variants \chapter*{}, etc. for unnumbered units. Place the unit's title in the braces, and retain any accompanying number as a comment: \chapter{Formatting Dashes in LaTeX} % Chapter XLII
  • Place footnotes in-line: The footnote text goes inside a \footnote{} command, itself placed at the location of the note.\footnote{If this were an actual footnote, the marker would come at the end of the sentence.}
  • Dashes: Use two hyphens for an en-dash, three hyphens for an em-dash.
  • Merely note changes of font size, e.g. % [** F1: Font size changes on this page]
  • Format lists--even numbered or multi-column lists--with the itemize environment. The optional argument of each \item[] command is the item label.
  • Do not use the double-quote (") character or the Latin-1 prime accent character (´); use the ASCII single-quote character (') instead (and the back-tick character (`) for open quotes).
  • Math formatting is discussed on a separate page.
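
As a rough illustration of the list rule above, a numbered list from a page scan might be formatted along these lines. The item text is invented for the example; only the itemize environment and the \item[] labels come from the guidelines.

```latex
\begin{itemize}
\item[1.] The first item, with its printed label kept in the optional argument.
\item[2.] The second item.
\item[(a)] Any printed label, not only a number, can go in the brackets.
\end{itemize}
```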

About This Document

This document is written in order to reduce formatting differences when proofreading of one book is distributed among many proofreaders, each working on different pages of the book. This helps us all do formatting the same way. That makes it easier for the post-processor to eventually combine all these proofread pages into one e-book.

It is not intended as any kind of a general editorial or typesetting rulebook.

We've included in this document all the items that new users have asked about formatting. If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know.

This document is a work in progress. Help us to progress by editing it, or by contributing to the discussion, or by posting your suggested changes in the Documentation Forum in this thread.

Project Comments

On the proofreading interface page (Project Page) where you start formatting pages, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start formatting pages! If the Project Manager wants you to format something in this book differently from the way specified in these Guidelines, that will be noted there. Instructions in the Project Comments override the rules in these Guidelines, so follow them. (This is also where the Project Manager may give you interesting tidbits of information about the author or the project.)

Please also read the Project Thread: The Project Manager may clarify project-specific guidelines there, and it is often used by proofreaders/formatters to alert others to recurring issues within the project and how they can best be addressed.

On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other formatters have made changes. This Forum thread discusses different ways to use this information.

Forum/Discuss this Project

On the proofreading interface page (Project Page) where you start proofreading pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other proofreaders who are working on this book.

Fixing errors on Previous Pages

When you select a project for proofreading, the Project Comments page is loaded. This page contains links to pages from this project that you have recently proofread. (If you haven't proofread any pages yet, there will be no links shown.)

Pages listed under either "DONE" or "IN PROGRESS" are available to make proofreading corrections or to finish proofreading. Just click on the link to the page. So if you discover that you made a mistake on a page, or marked something incorrectly, you can click on that page here and re-open it to fix the error.

For more detailed information, refer to either the Standard Proofreading Interface Help or the Enhanced Proofreading Interface Help, depending on which interface you are using.

About LaTeX

DP has elected to use LaTeX to mark up texts that contain a lot of mathematics that cannot easily be represented in plain text. LaTeX is a set of macros for the typesetting language TeX. DP uses LaTeX rather than TeX because it is easier for non-experts to get to grips with. This document will not make you an expert in LaTeX; rather, its aim is to provide a standard way for proofers to mark up maths books that are going through DP. If you wish to become an expert, some good reference materials are listed on the LaTeX resources page.

Special Characters in LaTeX

The characters # $ & _ ^ % { } ~ \ have special meanings in LaTeX. To use # $ & _ % { or } in text, precede it with a \ (backslash), e.g. The coat cost \$20 with a 30\% discount. The characters \ ^ and ~ cannot be escaped this way (\\ is a line break, while \^ and \~ produce accents); to use them in text, use the commands \textbackslash, \textasciicircum and \textasciitilde.

TeX was written when vanilla ASCII input was the norm. Now, thanks to the inputenc package, characters from extended character sets can be digested as-is by LaTeX. Since DP uses the Latin-1 character set, proofread text should come with accented letters and some symbols intact: unless the Project Comments direct otherwise, these should not be converted to TeX commands, because inputenc will do this automatically behind the scenes anyway. Hence there is no need to change £ to \pounds or é to \'e etc. This reduces the likelihood of introducing errors and also makes the source code easier to read.
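
A short sketch of both rules together, assuming the Latin-1 preamble shown under "Test-compiling pages" (the sentences are invented for illustration):

```latex
% Special characters are escaped; the Latin-1 characters £ and é stay as proofread.
The café charged £3 \& 50\% extra for table~\#7.
A backslash is typed \textbackslash{} and a tilde \textasciitilde{} in running text.
```

The empty braces after \textbackslash and \textasciitilde keep the following space from being swallowed by the command name.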

With the minimal preamble that tends to be used during the formatting rounds, some Latin-1 characters will trigger a warning from LaTeX if used in math mode (most commonly the degree symbol). If you find these warnings annoying, add

\makeatletter
\let\@inmathwarn\@gobble
\makeatother

to your preamble to switch them off.

Important note for Mac users

If you are a Macintosh user, you will need to set the document encoding to Latin-1 in your text editor while working on DP files. If the document encoding is not set to Latin-1, even though you see (for example) an é in your source file, compiling it with \usepackage[latin1]{inputenc} will generate an error.

Note: If you are using TeXShop, go to 'Preferences', and set 'Encoding' to 'Western (ISO Latin 1)'.

Installing LaTeX on your system

Installation packages for LaTeX are available for most operating systems. A good place to start looking is the TeX Users Group page or the LaTeX resources page. Note: You don't need to install LaTeX to format maths texts on DP but it makes the Post-Processor's job much easier if formatters in F2 can check that pages compile.

Test-compiling pages

If you have LaTeX on your system and want to try compiling pages, a sensible preamble to use is:

\documentclass{book} 
\usepackage{amsmath, amssymb, amsthm} 
\usepackage[latin1]{inputenc}  % even if you work on a Mac: DP .tex files are Latin-1
\begin{document}

If the text contains theorems, lemmas, conjectures, corollaries, propositions or definitions, you may also need one or more of the following lines, depending on which appear.

\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{conjecture}{Conjecture}
\newtheorem{corollary}{Corollary}
\newtheorem{proposition}{Proposition}
\newtheorem{definition}{Definition}

There may be additional preamble material to add (such as \usepackage{soul} or \let\so\textbf), which you should find in either the Project Comments or the project's discussion forum.
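
Once the matching \newtheorem line is in the preamble, the environment can be used on the page. A minimal sketch, assuming the project marks theorems with the theorem environment defined above (the theorem text is invented, and the Project Comments may prescribe different markup):

```latex
\begin{theorem}
The \emph{axis} of any circle of a sphere passes through the centre of the sphere.
\end{theorem}
```

In a test compile the theorem number will usually not match the book; as with chapter numbers, this is left to the post-processor.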

Copy your formatted text into a text editor, prepend the preamble, save as funproject.tex, and run funproject.tex through LaTeX. The file should look something like

\documentclass{book} 
\usepackage{amsmath, amssymb, amsthm} 
\usepackage[latin1]{inputenc}
% other project-specific preamble commands
\begin{document}

% code for 057.png

\clearpage

% code for 062.png

\end{document}

(It doesn't hurt to keep all the pages you do on a project in the same file.) Correct the code until there are no compilation errors (warnings can generally be ignored), then preview and check that the output sufficiently resembles the original page scan. Some things will almost certainly look 'wrong' because we are compiling a single page out of context. For example, LaTeX will probably insert a paragraph indentation before the first word even if it's obvious to you that the text is actually a continuation of a paragraph begun on a previous page: there's no need to insert an explicit \noindent to 'correct' this because when all the pages are processed together it will be superfluous. Remember that we are not aiming for a facsimile: what is important is that the formatting you have applied adequately and consistently allows the structure of the text to be represented. The final presentation for the structural elements will be fine-tuned during post-processing.

When you're happy with the output, copy the LaTeX code for the current page (without the preamble etc) back into the proofing interface and save it.

Alternatively, create a "wrapper" file:

% Contents of wrapper.tex
\documentclass{book}
\usepackage{amsmath, amssymb, amsthm}
% Other project-specific preamble code goes here
\usepackage[latin1]{inputenc}
\begin{document}
\input{mypage.tex} % load the 'scratch' file containing your formatted text
\end{document}

To perform a test compile, copy your formatted text from the proofer interface into the scratch file 'mypage.tex', then compile and preview the wrapper file wrapper.tex. The contents of wrapper.tex change only when the project preamble gets updated, while the contents of mypage.tex get replaced with each new page you format. Since only the contents of 'mypage.tex' get uploaded, this scheme minimizes the chance of committing a serious faux pas: checking in a page containing preamble code.

Formatting of the...

Title and Copyright Pages

There is no need to spend time formatting these, because the post-processor will format these according to the requirements of the documentclass they decide to use. Retain the proofed text, but comment it out by putting

% title page
\iffalse

at the top of the page and a matching \fi at the bottom.
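
For example, a commented-out title page might look roughly like this (the title text is invented for illustration):

```latex
% title page
\iffalse
THE ELEMENTS
OF
SPHERICAL GEOMETRY

LONDON
1886
\fi
```

LaTeX skips everything between \iffalse and the matching \fi, so the proofed text stays in the file without being typeset. The skipped text should not itself contain an unmatched \if... or \fi command, since TeX still pairs those up while skipping.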

Table of Contents

Do not format the Table of Contents since the post-processor will either auto-generate it using LaTeX's \tableofcontents machinery, or format it manually themselves. However, do not erase the text: comment it out by placing \iffalse at the beginning and \fi at the end, and leave a note to the PP.

Blank Page

Format as %[Blank Page] if both the text and the image are blank.

If there is text in the proofreading text area and a blank image, or if there is an image but no text, follow the directions for a bad image or bad text.

Page Headers/Page Footers

The proofreaders should already have removed page headers and page footers, but not footnotes, from the text. If the proofreaders have left headers/footers in, check the Project Comments in case there are special procedures for this project.

Chapter Headers

Chapter titles should be marked:

\chapter{Name of The Chapter Goes Here}

There is no need to put 4 blank lines before a chapter title since LaTeX creates the right spacing when the document is processed. If you compile the page to check your code, LaTeX will generate the wrong number for the chapter heading, and the formatting is unlikely to match the original book: the post-processor will sort this out.

If the chapter is unnumbered, use \chapter*. If the chapter is numbered, retain the chapter number in a comment.

\chapter*{Preface}
\chapter{The First Thrilling Installment.} % Chapter 1.

Old books often printed the first word or two of every chapter in all caps; change these to upper and lower case (first letter only capitalized).

Watch out for a missing double quote at the start of the first paragraph, which some publishers did not include or which the OCR missed due to a large capital in the original. If the author started the paragraph with dialog, insert the opening quotation marks, typed as two back-ticks (``) as elsewhere in LaTeX projects.

Sample Image:
chap1.png
Correctly Formatted Text:
\title % GREEN FANCY

\chapter{THE FIRST WAYFARER AND THE SECOND WAYFARER
MEET AND PART ON THE HIGHWAY} % CHAPTER I

A solitary figure trudged along the narrow
road that wound its serpentinous way
through the dismal, forbidding depths of
the forest: a man who, though weary and footsore,
lagged not in his swift, resolute advance. Night
was coming on, and with it the no uncertain prospects
of storm. Through the foliage that overhung
the wretched road, his ever-lifting and apprehensive
eye caught sight of the thunder-black, low-lying
clouds that swept over the mountain and bore
down upon the green, whistling tops of the trees. 

At a cross-road below he had encountered a small
girl driving homeward the cows. She was afraid
of the big, strange man with the bundle on his back
and the stout walking stick in his hand: to her a
remarkable creature who wore ``knee pants'' and
stockings like a boy on Sunday, and hob-nail shoes,
and a funny coat with ``pleats'' and a belt, and a
green hat with a feather sticking up from the band. 

Section Headers

Some texts have sections within chapters. Proof these headers as

\section{Name of The Section Goes Here.} 

Leave 1 blank line before the header and one after (even if the heading is run-in), unless the Project Manager has requested otherwise. As with chapters, retain any numbering in a comment; an unnumbered section should be marked up using \section*.
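
A small sketch of a numbered and an unnumbered section heading, with invented titles and numbering:

```latex
\section{Of Great Circles.} % Section 12.

\section*{Note on Terminology.}
```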

LaTeX provides a hierarchy of sectioning commands: \subsection, \subsubsection, and corresponding starred forms are available for lower-level sections, but these are not often needed. There is also a \paragraph command, which is slightly different in that it takes an argument, used to hard-code the paragraph numbering. For example,

5. The axis of any circle of a sphere is

could be marked up as

\paragraph{5.} The \emph{axis} of any circle of a sphere is 

If you are not sure if a header indicates a chapter or a section (or a subsubsection), post a question in the Project Thread, noting the page number.

Paragraph Spacing/Indenting

Put a blank line before the start of (unnumbered) paragraphs, especially if a paragraph starts at the top of a page. You do not need to indent the start of paragraphs, because LaTeX will handle paragraph indentation automatically.

See the chapter headers image/text for an example.

Other Indented or Justified Text

For indented text, such as poetry, letter headings, citations, etc. you can use \quad and \qquad. \quad inserts a horizontal space equal to the current typesize, e.g. a 10pt space for a 10pt font size, \qquad inserts twice as much. Don't worry too much about the exact spacing that \quad and \qquad generate as the Post-Processor will sort out any problems.

If the text is right-justified, enclose the text in a flushright environment, i.e.

\begin{flushright}
Justified text goes here.
\end{flushright}
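
As a hedged sketch of how these pieces might combine for a letter heading (the text is invented, and the exact amount of \quad spacing is a judgment call that the Post-Processor can adjust):

```latex
\begin{flushright}
Oxford, 4th March.
\end{flushright}

\quad My dear Sir,

\qquad I write in haste to thank you for your letter.
```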

Multiple Columns

Proofread ordinary text which has been printed in two columns as a single column.

Spans of multiple-column text within single column sections should be proofread as a single column by placing the text from the left-most column first, the text from the next one after it, and so on. You do not need to mark where the columns were split, just join them together.

See also the indexes, lists of items and tables sections of the Formatting Guidelines.

Illustrations

Illustrations are marked as in regular DP projects, except that each line is commented out with a %. If an illustration has no caption:

%[Illustration]

If an illustration has a caption, proofread the caption text as it is printed, preserving the line breaks, italics, etc. If the caption spans several lines add a % at the beginning of each one.

%[Illustration: Caption Text Goes Here.]

If the illustration is in the middle of or at the side of a paragraph, move the illustration tag to before or after the paragraph and leave a blank line to separate them. Rejoin the paragraph by removing any blank lines left by doing so.

If there is no paragraph break on the page, mark the illustration tag with an * like so
% *[Illustration: (text of caption)],
move it to the top of the page, and leave 1 (one) blank line after it.
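
For instance, a two-line italicized caption might be formatted roughly as follows (the caption text is invented):

```latex
%[Illustration: \textit{Fig.~3.} A great circle of the sphere
%and its two poles.]
```

Because every line begins with %, LaTeX ignores the whole tag during a test compile, while the post-processor can still find it in the source.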

Footnotes/Endnotes

Unlike regular DP practice, in LaTeX projects, footnotes are placed in-line; that is, the text of the footnote is put where it is referenced in the text.

During formatting, this means:

A footnote should be surrounded by a footnote tag \footnote{ and }, with the footnote text placed in between. The footnote marker will be generated automatically by LaTeX, so the printed marker is discarded. Format the footnote text as it is printed, preserving italics, etc. To improve clarity, consider putting the footnote text on its own line(s), indented. See examples below.

If there's a footnote at the bottom of the page with no footnote marker in the text, especially if it starts mid-sentence or mid-word, it's probably a continuation of a footnote from a previous page. Leave it at the bottom of the page, and surround it with %*\footnote{text of footnote}. The * indicates that the footnote was continued, and brings it to the attention of the post-processor.

If a footnote continues on the next page (the page ends before the footnote does), put an asterisk * at the end, like this:

\footnote
   {text of footnote *}. 

(The * indicates that the footnote ended prematurely, and brings it to the attention of the post-processor, who will eventually join it up with the rest of the footnote text.)

If a continued footnote ends or starts on a hyphenated word, mark both the footnote and the word with *, thus:

\footnote
   {This footnote is continued and the last word in it is also con-* *} 

for the leading fragment, and

%\footnote{*tinued onto the next page.}
Original Text:
The principal persons involved in this argument were Caesar¹, former military
leader and Imperator, and the orator Cicero². Both were of the aristocratic
(Patrician) class, and were quite wealthy.

¹ Gaius Julius Caesar.
² Marcus Tullius Cicero.
Formatted with In-Line Footnotes:
The principal persons involved in this argument were Caesar\footnote
  {Gaius Julius Caesar.},
former military
leader and Imperator, and the orator Cicero\footnote
  {Marcus Tullius Cicero.}.
Both were of the aristocratic
(Patrician) class, and were quite wealthy.

In some books, footnotes are separated from the main text by a horizontal line. We don't keep this, because again LaTeX can automatically insert it if required.

Footnotes in poetry or tables should be treated the same as other footnotes. Formatters should tag them and move them to the correct place within the text.

Original Footnoted Poetry:
Mary had a little lamb¹
   Whose fleece was white as snow
And everywhere that Mary went
   The lamb was sure to go!

¹ This lamb was obviously of the Hampshire breed,
well known for the pure whiteness of their wool.
Correctly Formatted Text:
\begin{verse}
Mary had a little lamb\footnote
  {This lamb was obviously of the Hampshire breed,
  well known for the pure whiteness of their wool.} \\
\quad Whose fleece was white as snow \\
And everywhere that Mary went \\
\quad The lamb was sure to go! \\
\end{verse} 

Italics

Format italicized text with \textit{ and } surrounding the italics, unless it is clear from the context that the italics are being used for emphasis, in which case \emph is more appropriate.

Punctuation goes outside the italics, unless it is an entire sentence or section that is italicized, or the punctuation is itself part of a phrase, title or abbreviation that is italicized.

The periods that mark an abbreviated word in the title of a journal such as Phil. Trans. are part of the title for italicization purposes, and are included within the italic tags, thus: \textit{Phil.\ Trans.}

For dates and similar phrases, proofread the entire phrase as italics, rather than marking the words as italics and the numbers as non-italics. The reason is that many typefaces found in older texts used the same design for numbers in both regular and italics.

If the italicized text consists of a series/list of words or names, mark these up with italics tags individually.

Also use \emph for text that is set differently from its surroundings, such as upright words inside a theorem environment whose body text is already italic.

The exception to the above guidelines is mathematical variables referenced in the text: these should be surrounded with $, like any other mathematics appearing in the main body of the text.


Examples—Italics:

Original Text:
Enacted 4 July, 1776
Correctly Formatted Text:
\textit{Enacted 4 July, 1776}

Original Text:
God knows what she saw in me! I spoke
in such an affected manner.
Correctly Formatted Text:
\emph{God knows what she saw in me!} I spoke
in such an affected manner.

Original Text:
As in many other of these Studies, and
Correctly Formatted Text:
As in many other of these \textit{Studies}, and

Original Text:
(Psychological Review, 1898, p. 160)
Correctly Formatted Text:
(\textit{Psychological Review}, 1898, p.~160)

Original Text:
L. Robinson, art. "Ticklishness,"
Correctly Formatted Text:
L.~Robinson, art.\ ``\textit{Ticklishness},''

Original Text:
Proofreaders may be tickled pink to read
Ticklishness, Tickling and Laughter,
Remarks on Tickling and Laughter
and Ticklishness, Laughter and Humour.
Correctly Formatted Text:
Proofreaders may be tickled pink to read
\textit{Ticklishness}, \textit{Tickling and Laughter},
\textit{Remarks on Tickling and Laughter}
and \textit{Ticklishness, Laughter and Humour}.

Original Text:
a, b, and c are the three sides of the triangle △ ABC.
Correctly Formatted Text:
$a$, $b$, and $c$ are the three sides of the triangle $\triangle ABC$.

Original Text:
Proof: Let the radius have unit length, and
Correctly Formatted Text:
Proof: \textit{Let the \emph{radius} have unit length, and}

Bold Text

Proofread bold text (text printed in a heavier typeface) with \textbf{  inserted before the bold text and } after it.

Punctuation goes outside the bold tags, unless it is an entire sentence or section that is in bold, or the punctuation is itself part of a phrase, title or abbreviation that is in bold type.
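
A brief sketch of both punctuation cases, using invented text:

```latex
The word \textbf{sphere}, printed in bold, keeps its comma outside the tag.
\textbf{An entire bold sentence keeps its full stop inside.}
```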

Superscripts

Older books often abbreviated words as contractions, and printed them as superscripts, for example:

Genrl Washington defeated Ld Cornwall's army.

Format these like this:

Gen\textsuperscript{rl} Washington defeated L\textsuperscript{d} Cornwall's army.

In mathematics, superscripts are marked up using a caret ^. Format superscripts containing more than a single character with curly braces { and }. For example:

... up to xn-1 elements in the array.

would be formatted as

... up to $x^{n-1}$ elements in the array.

Subscripts

Subscripted text is often found in scientific works. For example:

H2O.

would be formatted as

H${}_{2}$O.

Underlined Text

Format underlined text with \underline{ and }, unless the Project Comments/Project Discussion indicate that underlining is being used for emphasis, in which case use \emph instead.

S p a c e d   O u t   Text (gesperrt)

Format s p a c e d   o u t   text, called gesperrt in German, in which it was commonly used for emphasis when italics weren't available, with \so{ and }, and remove the extra spaces between letters in each word.

Note to Post-Processors: In order to use this command you'll need the soul package available from CTAN. If you don't have and/or can't get this package, use \let\so\textbf in your preamble to use bold instead of the spaced out text.
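
A minimal compilable sketch, assuming the soul package is installed (the German sentence is invented for illustration):

```latex
\documentclass{book}
\usepackage{soul} % provides \so; if soul is unavailable, use \let\so\textbf instead
\begin{document}
Er hat es \so{niemals} gesagt.
\end{document}
```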

Font size changes

Unless specified otherwise in the Project Comments, don't mark up changes in font size. Merely leave a comment on the page for the Post-Processor, e.g.

%Font size changes on this page

This will allow the Post-Processor to make sure that all font size changes are handled consistently throughout the text. The exception to this is when the font size changes to indicate a block quotation.

Word in all Caps or Small Caps

Proofread words that are printed in all capital letters as printed.

Format words that are printed in small caps according to semantics, if possible: Chapter, section, subsection, and paragraph headings should be marked as such, even if the test-compiled page does not match the fonts in the page scan. Do not add explicit font specifications to titles of sectional units.

Shorter snippets, such as authors' names or journal titles may be formatted visually, as \textsc{Small-cap Text} to produce SMALL-CAP TEXT. If a project contains many such occurrences, semantic tags (such as \Author{} or \Title{}) are desirable; inquire in the project forum.

One exception to these instructions, very rarely encountered in LaTeX projects, is the first word of a chapter: many old books typeset the first word of these in (sm)all caps; this should simply be changed to upper and lower case, so "ONCE upon a time," becomes ``Once upon a time,''
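
As an illustration of the semantic approach for small caps, a project might define a tag such as \Author once and use it on every page. The definition below is purely hypothetical; the actual tag names and definitions would come from the project forum.

```latex
% Hypothetical preamble definition, not a DP standard
\newcommand{\Author}[1]{\textsc{#1}}

% On the page:
as shown by \Author{Robinson} in the \textit{Psychological Review}
```

With a tag like this, the post-processor can change how every author's name is typeset by editing one line of the preamble.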

Large, Ornate opening Capital letter (Drop Cap)

Any large and ornate graphic first letters of a chapter, section, or paragraph should have been proofread as just the letter. No additional formatting is required.

Dashes, Hyphens, and Minus Signs

There are generally five such marks you will see in books:

  • Hyphens. These are used to join words together, or sometimes to join prefixes or suffixes to a word.
    Leave these as a single hyphen, with no spaces on either side.
  • En-dashes. These are just a little longer, and are used for a range of numbers. Format these as two hyphens. Spaces before or after are determined by the way it was done in the book; usually no spaces in number ranges, sometimes both sides, sometimes just before.
  • Em-dashes & long dashes. These serve as separators between words—sometimes for emphasis like this—or when a speaker gets a word caught in his throat—!
    Format these as three hyphens. Don't leave a space before or after, even if it looks like there was a space in the original book image.
  • Still longer dashes. These represent omitted or censored words or names.
    Format these as 6 hyphens. When it represents a word, we leave appropriate space around it like it's really a word. If it's only part of a word, then no spaces—join it with the rest of the word.
  • The minus sign. Format it by enclosing the formula in math mode: $ ... $.

Note: If an em-dash appears at the start or end of a line of your OCR'd text, join it with the other line so that there are no spaces or line breaks around it. Only if the author used an em-dash to start or end the paragraph or line of poetry or dialog should you leave it at the start or end of a line. See the examples below.


Examples—Dashes, Hyphens, and Minus Signs:

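Original Image:
semi-detached
Correctly Proofread Text:
semi-detached
Type: Hyphen

Original Image:
four-part harmony
Correctly Proofread Text:
four-part harmony
Type: Hyphen

Original Image:
discoveries which the Crus-
aders made and brought home with
Correctly Proofread Text:
discoveries which the Crusaders
made and brought home with
Type: Hyphen

Original Image:
factors which mold char-
acter—environment, training and heritage,
Correctly Proofread Text:
factors which mold character---environment,
training and heritage,
Type: Hyphen

Original Image:
See pages 21-25
Correctly Proofread Text:
See pages 21--25
Type: En-dash

Original Image:
-14° below zero
Correctly Proofread Text:
$-14°$ below zero
Type: Minus

Original Image:
X - Y = Z
Correctly Proofread Text:
$X - Y = Z$
Type: Minus

Original Image:
2 - ½
Correctly Proofread Text:
$2 - \frac{1}{2}$
Type: Minus

Original Image:
I am hurt;—A plague
on both your houses!—I am dead.
Correctly Proofread Text:
I am hurt;---A plague
on both your houses!---I am dead.
Type: Em-dash

Original Image:
sensations—sweet, bitter, salt, and sour
—if even all of these are simple tastes.
Correctly Proofread Text:
sensations---sweet, bitter, salt, and sour---if
even all of these are simple tastes. What
Type: Em-dash

Original Image:
senses—touch, smell, hearing, and sight—
with which we are here concerned,
Correctly Proofread Text:
senses---touch, smell, hearing, and sight---with
which we are here concerned,
Type: Em-dash

Original Image:
It is the east, and Juliet is the sun!—
Correctly Proofread Text:
It is the east, and Juliet is the sun!---
Type: Em-dash

Original Image:
As the witness Mr. —— testified,
Correctly Proofread Text:
As the witness Mr.\ ------ testified,
Type: Long dash

Original Image:
As the witness Mr. S—— testified,
Correctly Proofread Text:
As the witness Mr.\ S------ testified,
Type: Long dash

Original Image:
the famous detective of ——B Baker St.
Correctly Proofread Text:
the famous detective of ------B Baker St.
Type: Long dash

Original Image:
"You —— Yankee", she yelled.
Correctly Proofread Text:
``You ------ Yankee'', she yelled.
Type: Long dash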

Initials

Spaces in names printed as initials should be proofread as a non-breaking space, written as a tilde (~). For example, proofread H. M. S. Pinafore as H.~M.~S. Pinafore, and G. B. Shaw as G.~B. Shaw. This avoids the potential problem of the series of initials being broken across lines when text is rewrapped.

Contractions

Remove any extra space in contractions, for example: would n't should be proofread as wouldn't.

This was often an early printers' convention, where the space was retained to indicate that 'would' and 'not' were originally separate words. It is also sometimes an artifact of the OCR. Remove the extra space in either case.

Some Project Managers may specify in the Project Comments not to remove extra spaces in contractions, particularly in the case of texts which contain slang, dialect, or are written in languages other than English.

Poetry/Epigrams

This section applies to an occasional Poem or Epigram in a mainly non-poetry book. For an entire book of poetry, see the special guidelines for Poetry Books.

Enclose each poem in a verse environment, i.e.

\begin{verse} 
Text of poem goes here. 
\end{verse} 

Put a LaTeX linebreak \\ at the end of each line; otherwise LaTeX will try to rewrap them. Add a single linebreak on its own line to mark a blank line between stanzas.

Preserve the relative indentation of the individual lines of the poem or epigram by adding a \quad, \qquad, or three \quad spaces in front of the indented lines to make them resemble the original.
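
As a minimal sketch of the linebreak and indentation rules above (the verse lines are placeholders invented for this illustration, not taken from any project):

```latex
\begin{verse}
First line of the stanza, set flush left, \\
\quad second line, indented slightly, \\
\qquad third line, indented further, \\
Fourth line, flush left again. \\
\\
First line of the next stanza, \\
\quad and its indented second line.
\end{verse}
```

The \\ on its own line marks the blank line between the two stanzas.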

When a line of verse is too long for the printed page, many texts wrap the continuation onto the next printed line and place a wide indentation in front of it. These continuation lines should be rejoined with the line above. Continuation lines usually start with a lower case letter. They appear at irregular places, unlike normal indentation, which occurs at regular intervals in the metre of the poem.

If the poetry is centered on the printed page, enclose the relevant lines in a center environment; see the example below.

Footnotes in poetry should be treated the same as usual footnotes during proofreading. See footnotes for details.

Line Numbers in poetry should be kept. Put them at the end of the line, leaving at least 6 spaces between them and the end of the text. See line numbers for details.

Check the Project Comments for the specific text you are proofreading. Books of poetry often have special instructions from the Project Manager. Many times, you won't have to follow all these formatting guidelines for a book that is mostly or entirely poetry.

Sample Image:
poetry.png
Correctly Formatted Text:
to the scenery of his own country:

\begin{verse} 
\begin{center}  
Oh, to be in England  \\
Now that April's there,  \\
And whoever wakes in England  \\ 
Sees, some morning, unaware,  \\ 
\end{center} 

That the lowest boughs and the brushwood sheaf  \\
Round the elm-tree hole are in tiny leaf,  \\
While the chaffinch sings on the orchard bough  \\
\begin{center} 
In England---now! 
\end{center} 
\\ 
And after April, when May follows,  \\
And the whitethroat  builds, and all the swallows!  \\
Hark! where my blossomed pear-tree in the hedge  \\
Leans to the field and scatters on the clover  \\
Blossoms and dewdrops---at the bent spray's edge---  \\
That's the wise thrush; he sings each song twice over,  \\
Lest you should think he never could recapture  \\
The first fine careless rapture!  \\
And though the fields look rough with hoary dew,  \\
All will be gay, when noontide wakes anew  \\
The buttercups, the little children's dower;  \\
---Far brighter than this gaudy melon-flower!
\end{verse}

So it runs; but it is only a momentary memory;
and he knew, when he had done it, and to his

Letters/Correspondence

Proofread letters and correspondence as you would paragraphs. Put a blank line before the start of the letter.

If the header or footer lines are indented in the original, use \quad spacing to reproduce this. If they are right-justified, enclose the text in a flushright environment, i.e. \begin{flushright}Justified text goes here.\end{flushright}

Sample Image:
letter.png
Correctly Formatted Text:
\begin{center} 
\textit{John James Audubon to Claude François Rozier}
\end{center}

\noindent [Letter No. 1, addressed] 

\textsc{M. Fr.\ Rozier},  \\
\qquad Merchant-Nantes. 

\begin{flushright}
\textsc{New York}, \textit{10 January, 1807}.
\end{flushright}

\noindent\textsc{Dear Sir}:

We have had the pleasure of receiving by the \textit{Penelope} your 
consignment of 20 pieces of linen cloth, for which we send our
thanks. As soon as we have sold them, we shall take great
pleasure in making our return.

Lists of Items

For simple lists you can use the itemize environment. Please use the optional argument to the \item command to hard-code the bullets, dashes or whatever, as in the example below. Do not leave a blank line between these markers and the rest of the text unless the next line starts a new paragraph. 

Original Text:
Andersen, Hans Christian         Daguerre, Louis J. M.         Melville, Herman
Bach, Johann Sebastian Darwin, Charles Newton, Isaac
Balboa, Vasco Nunez de Descartes, René Pasteur, Louis
Bierce, Ambrose Earhart, Amelia Poe, Edgar Allan
Carroll, Lewis Einstein, Albert Ponce de Leon, Juan
Churchill, Winston Freud, Sigmund Pulitzer, Joseph
Columbus, Christopher Lewis, Sinclair Shakespeare, William
Curie, Marie Magellan, Ferdinand Tesla, Nikola
Correctly Formatted Text:
 \begin{itemize}
 \item[]  Andersen, Hans Christian
 \item[]  Bach, Johann Sebastian
 \item[]  Balboa, Vasco Nunez de
 \item[]  Bierce, Ambrose
 \item[]  Carroll, Lewis
 \item[]  Churchill, Winston
 \item[]  Columbus, Christopher
 \item[]  Curie, Marie
 \item[]  Daguerre, Louis J.~M.
 \item[]  Darwin, Charles 
 \item[]  Descartes, René
 \item[]  Earhart, Amelia
 \item[]  Einstein, Albert
 \item[]  Freud, Sigmund
 \item[]  Lewis, Sinclair
 \item[]  Magellan, Ferdinand
 \item[]  Melville, Herman
 \item[]  Newton, Isaac
 \item[]  Pasteur, Louis
 \item[]  Poe, Edgar Allan
 \item[]  Ponce de Leon, Juan
 \item[]  Pulitzer, Joseph
 \item[]  Shakespeare, William
 \item[]  Tesla, Nikola
 \end{itemize}

For numbered lists you can also use the itemize environment. Please use the optional argument to the \item command to hard-code the numbers, letters or whatever, as in the example below. Do not leave a blank line between these markers and the rest of the text unless the next line starts a new paragraph. 

Original Text:

1. Item a
2. Item b
3. Item c

Correctly Formatted Text:
\begin{itemize} 
\item[1.] Item a 
\item[2.] Item b 
\item[3.] Item c  
\end{itemize}

Tables

Tables can be thought of as a matrix, consisting of rows and columns. The data is entered one row at a time, with double backslashes (\\) separating the rows, and ampersands (&) separating the entries within each row. A surrounding pair \begin{tabular}{...} ... \end{tabular} then does all the work. Here the argument of the first brace {...} consists of a "template" that specifies the positioning of the table entries: l, r, or c, for left, right, or center. Here is a very simple example:

\begin{tabular}{l r r r c}
Name & Exam1 & Exam2 & Exam3 & Grade \\
John & 19    & 28    & 33    & C \\
% : (further rows would go here)
Jane & 49    & 35    & 60    & B \\
Jim  & 76    & 38    & 59    & A
\end{tabular}

LaTeX will automatically choose the width of the columns and the heights of the rows so that all table entries fit, with some room to spare.

Modern practice is to be sparing with frames and rules (vertical and horizontal lines) in tables, but these are common in the books we encounter at DP and easy to do in LaTeX. To get horizontal lines, add \hline after the linebreak of each row and at the beginning and end of the table. To get vertical lines (and also specify the positioning of the data entries: left, center, or right), add vertical bar (or "pipe") characters (|) to the template, e.g. { | l | r | r | r | r }. If you use two bar symbols || instead of a single bar, or use a double \hline, the separating lines get "doubled". This is sometimes used to separate the headers from the table contents.

Here is the above table with these embellishments.

\begin{tabular}{| l || r | r | r | c |}
\hline
Name & Exam1 & Exam2 & Exam3 & Grade \\
\hline\hline
John & 19    & 28    & 33    & C \\
\hline
% : (further rows would go here)
Jane & 49    & 35    & 60    & B  \\
\hline
Jim  & 76    & 38    & 59    & A  \\
\hline
\end{tabular}

LaTeX can handle more complicated table formatting, but if you don't know how to code something, just do your best and leave a %[**note] for the next round and the post-processor.

tabularx

The tabularx package can be useful if you need to deal with a table which should have a set width (such as the full width of the page) without having to specify and calculate all the column widths explicitly. It allows some columns to have their 'natural' width and other columns to stretch to use up any remaining width. This differs from tabular* (which allows a fixed-width table by stretching the space between columns).

tabularx is one of LaTeX's required tools, so it should already be installed in your LaTeX system, along with documentation (which can also be found at CTAN).
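
A minimal sketch of a full-width table using tabularx, shown as a complete small document so that the package loading is visible. The column layout and the cell contents are invented for this illustration; the X column type is the one tabularx provides for columns that stretch.

```latex
\documentclass{article}
\usepackage{tabularx} % one of the LaTeX required tools
\begin{document}
% The table is exactly \textwidth wide. The l and r columns keep their
% natural width; the X column stretches to fill whatever remains.
\noindent
\begin{tabularx}{\textwidth}{l X r}
\hline
Name & Remarks & Grade \\
\hline
John & Attended every lecture but missed the final paper & C \\
Jane & Steady improvement over the three examinations & B \\
\hline
\end{tabularx}
\end{document}
```

Text in the X column is wrapped automatically, so long entries do not need a hand-calculated width.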

Block Quotations

Use the quote environment for short (one paragraph) quotations and the quotation environment for long quotations.

Block quotations are long quotations (typically several lines and sometimes several pages) and are often (but not always) printed with wider margins or in a smaller font size—sometimes both.
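
For a quotation that runs over several paragraphs, the quotation environment might be used as in this sketch (the text is placeholder wording, not from a project page):

```latex
\begin{quotation}
First paragraph of a long quotation, keeping the line breaks
of the page image.

Second paragraph of the same quotation, separated from the
first by a blank line as usual.
\end{quotation}
```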

Sample Image:
bquote.png
Correctly Formatted Text:
later day was welcomed in their home on the Hudson. 
Dr. Bakewell's contribution was as follows:\footnote
  {* %[**F1: footnote text missing from scan]
  }

\begin{quote}
The uncertainty as to the place of Audubon's birth has been
put to rest by the testimony of an eye witness in the person
of old Mandeville Marigny now dead some years. His repeated
statement to me was, that on his plantation at Mandeville,
Louisiana, on Lake Ponchartrain, Audubon's mother was
his guest; and while there gave birth to John James Audubon.
Marigny was present at the time, and from his own lips, I have,
as already said, repeatedly heard him assert the above fact.
He was ever proud to bear this testimony of his protection
given to Audubon's mother, and his ability to bear witness as
to the place of Audubon's birth, thus establishing the fact that
he was a Louisianian by birth.
\end{quote}

We do not doubt the candor and sincerity of the
excellent Dr. Bakewell, but are bound to say that the
incidents as related above betray a striking lapse of 

Double Quotes

Format double quotes as `` (two grave accents) for opening quotes and '' (two apostrophes) for closing quotes.

Do not change double quotes to single quotes. Leave them as the Author wrote them.

The Project Manager may instruct you in the Project Comments to format non-English language quotation marks differently for a particular book.

Single Quotes

Format single quotes as ` (grave accent) for opening quotes and the plain ASCII ' single quote (apostrophe) for closing quotes.

Do not change single quotes to double quotes. Leave them as the Author wrote them.

Quote Marks on each line

If the page image shows quotation marks at the beginning of each line of a quotation, the proofreaders should already have removed all of them except for the one at the start of the first line of the quotation.

If the quotation goes on for multiple paragraphs, each paragraph should have an opening quote mark on the first line of the paragraph.

Often there is no closing quotation mark until the very end of the quoted section of text, which may not be on the same page you are proofreading. Leave it that way: do not add closing quotation marks that are not in the page image. For example,

"This is an example
"of the type of quoting
"it is talking about," he
remarked. 

should be formatted as:

``This is an example
of the type of quoting
it is talking about,'' he
remarked.

Line Breaks

Leave all line breaks in so that the next proofreader/formatter and the post-processor can compare the lines in the text to the lines in the image easily. Be especially careful about this when rejoining hyphenated words or moving words around em-dashes. If the previous proofreader removed the line breaks, please replace them so that they once again match the image.

It may sometimes be necessary to introduce additional linebreaks: for example, if you need to leave a %[**Note] right next to what it refers to, then a linebreak at the end of the %[**Note] ensures that LaTeX still sees the rest of the original line. Also, additional linebreaks in complicated tabular or mathematical formatting can assist in code legibility.
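
A sketch of such an added linebreak, assuming a hypothetical note placed in the middle of an original line. Everything after % on a line is ignored by LaTeX, so the rest of the original line is moved to a line of its own:

```latex
pleasure of receiving by the Penelope %[**F1: ship name in italics in the image?]
your consignment of 20 pieces of linen cloth,
```

Without the linebreak after the closing bracket, the words "your consignment of 20 pieces of linen cloth," would be treated as part of the comment and dropped from the output.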


Line Numbers

Proofers should have kept line numbers, and placed them at least six spaces past the right hand end of the line, even if they are on the left side of the poetry/text in the original image.

There is no standard way to format line numbers in LaTeX: ask in the project forum if you run into any.

Extra Spacing/Stars/Line Between Paragraphs

Most paragraphs start on the line immediately after the end of the previous one. Sometimes two paragraphs are separated to indicate a "thought break." A "thought break" may take the form of a line of stars, hyphens or some other character, a plain or floridly decorated horizontal line, a simple decoration, or even just an extra blank line or two.

A "thought break" may represent a change of scene or subject, a lapse in time or a bit of suspense. These breaks are intended by the author, so we mark them by putting in %<tb> as a marker for the PPer to decide how to handle them later.

%<tb>

Sometimes printers used decorative lines to mark the ends of chapters. As we already mark chapter headers, there is no need to add a "thought break" marker.

The proofreading interface has the standard "thought break" marker available to cut and paste. Remember to place a % at the start of the line if you use it.
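
In context, a marked thought break might look like the following sketch (the two paragraph lines are placeholders):

```latex
last line of the paragraph before the break.

%<tb>

First line of the paragraph after the break.
```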

Period Pause "..." (Ellipsis)

These should be formatted as \ldots with a space before and a space after.

For example:

That I know \ldots\ is true. 
This is the end\ldots. 
Wherefore art thou Romeo?\ldots 

Sometimes you will see it with the punctuation at the end; so proofread it that way:

Wherefore art thou Romeo\ldots?

Remove extra dots, if any, or add new ones, if necessary, to bring the number to three (or four) as appropriate.

LOTE: (Languages Other Than English) Use the general rule "Follow closely the style used in the printed page." Sometimes the printed page is unclear: in that case, insert a * to draw the attention of the post-processor. If spaces appear to exist between the dots, or between the word and the dots, replace the spaces with tildes: like this~... or like this~.~.~. depending on the style. This will avoid a linebreak before or between the dots.

Greek Characters

Greek characters used in mathematical expressions in LaTeX are represented by "\<name_of_character>", e.g. \alpha for alpha, \beta for beta, etc. For upper case Greek characters the first character of the LaTeX command is capitalised, e.g. \Gamma, \Delta. Upper case Greek letters that look the same as Latin capitals (such as Alpha and Beta) have no command of their own in standard LaTeX; type the Latin letter instead (A, B). Mark up Greek symbols semantically: do not distinguish between the variant forms (unless both forms are used in a book, in which, hopefully rare, case seek guidance in the Project Thread). Hence for example use \phi even if the scan looks more like a \varphi: the post-processor will take care of the final appearance.
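
A short sketch of Greek letters in math mode (the formula is invented for this illustration):

```latex
% Lower case: \alpha, \beta, \phi. Upper case: \Gamma, \Delta.
The angles satisfy $\alpha + \beta = \Gamma$, the difference is
$\Delta$, and the phase $\phi$ is small.
```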

Greek appearing as text should be left in DP-transliterated form and a note added to draw the post-processor's attention, because they will need to modify the transliteration to suit the syntax used by their favourite Greek package.

Fractions

Format fractions as follows: 2½ becomes $2\frac{1}{2}$.
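
In running text this might look like the following sketch (the sentence is a placeholder):

```latex
The rod is $2\frac{1}{2}$ feet long
and $\frac{3}{4}$ of an inch thick.
```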

Page References "See Pg. 123"

Format page number references within the text such as (see p. 123) as they appear in the image, but replace the space after the abbreviation's period with a non-breaking space (~) so that the abbreviation stays with its number: thus (see p.~123). Also leave a % [Xref] note at the end of the line, or if there are lots on the page just leave a % [Xrefs] note at the top of the page.
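
A sketch of a marked page reference, using placeholder text around it:

```latex
as was shown earlier (see p.~123), the orbit % [Xref]
is very nearly circular.
```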

Check the Project Comments to see if the Project Manager has special requirements for page references.

Indexes

The PPer may wish to use \makeindex to build an index by adding \index{Foo} commands throughout the document. In that case, all they need is the proofed text of the index pages. As with the table of contents, comment them out by putting \iffalse at the top of a page and \fi at the bottom.

However, if no such preference is expressed in the Project Comments or project forum, please format the index pages using \begin{theindex} .. \end{theindex}.

Please retain page numbers in index pages. You don't need to align the numbers as they appear in the scan; just put a comma or semicolon, followed by the page numbers. Indexes are often printed in 2 columns; this narrower space can cause entries to split onto the next line. Rejoin these back onto a single line.

Put \item before each entry and place one blank line between each entry in the index.

For sub-topic listings in an index, start each one with \subitem, on a new line.

Separate the alphabetical sections with \indexspace.

E.g.:

\begin{theindex}

 \item  Yerkes, C.~T. (yer$'$kez) (1837--1905), Am.\ patron,
   \subitem  Observatory 15, 200, 432, 434, 
   \subitem  telescope 15, 202, 424

 \item  Young, C.~A. (1834--1908), Am.\ ast.\ 270, 281, 282, 298

 \indexspace

 \item  Zenith defined 24, 58; 

\end{theindex}

Mathematics

See LaTeX math formatting guidelines.

LaTeX has defined markup for a wide range of special characters and symbols, as well as for representing the structure of a project as a whole. See LaTeX resources for detailed LaTeX documentation, including a comprehensive list of symbols. If the project that you are working on requires obscure symbols, it's a good idea to post a note about this in the Project Discussion thread. This will save every individual formatter having to look up the markup.

Anything else that needs special handling or that you're unsure of

While formatting, if you encounter something that isn't covered in these guidelines that you think needs special handling or that you are not sure how to handle, post your question, noting the png (page) number, in the Project Discussion thread (a link to the project-specific forum is in the Project Comments), and put a note in the proofread text explaining the problem. Your note will explain to the next proofreader or post-processor what the problem or question is.

Start your note with %, a square bracket and two asterisks (%[**), followed by the round you are working in, and end it with another square bracket ] and a linebreak (if there is any more text on the current line). This clearly separates it from the Author's text and signals the next proofreader to stop and carefully examine this part of the text and the matching image to address any issues.
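
A sketch of such a note, assuming the formatter is working in round F1 and the surrounding text is a placeholder:

```latex
the velocity is 300 metres per secnod %[**F1: typo for "second"?]
in still air.
```

The linebreak after the closing bracket keeps "in still air." outside the comment.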

If you are proofreading in a later round and come across a note from a proofreader in a previous round, once you have resolved the issue, please take a moment and provide feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future. Do not remove a previous proofer's note: the post-processor may find it helpful even if you have resolved the problem.

Handwritten Notes in Book

Do not include handwritten notes in a book (unless it is overwriting faded, printed text to make it more visible). Do not include handwritten marginal notes made by readers, etc.

Some Project Managers may ask that handwritten notes be marked with %[HW: (text of the note)].

Previous proofreader mistakes

If a previous proofreader made a lot of mistakes or missed a lot of things, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation so that they will know how in the future.

Please be nice! Everyone here is a volunteer and presumably trying their best. The point of your feedback message should be to inform them of the correct way to proofread, rather than to criticize them. Give a specific example from their work showing what they did, and what they should have done.

If the previous proofreader did an outstanding job, you can also send them a message about that—especially if they were working on a particularly difficult page.

If you are unsure, place a note in the txet %[**typo for text?] and ask in the Project Discussion thread. If you do make a change, include a note describing what you changed: %[**Transcriber's Note: typo fixed, changed from "txet" to "text"]. Include an * so the post-processor will notice it.

Errors in texts

In general, don't correct errors in the author's book. Many of the books we are proofreading have statements of fact in them that we no longer accept as accurate. Leave them as the author wrote them.

A possible exception is in technical or scientific books, where a known formula or equation may be given incorrectly, especially if it is shown correctly on other pages of the book. Notify the Project Manager about these, either by sending them a message via the Forum, or by inserting %[**note sic explain-your-concern] at that point in the text.