LaTeX postprocessing guidelines


September 2009: There's a printable LaTeX PPing manual based on these guidelines, other DP wiki pages, LaTeX Typesetters' Team discussions, and private communications. The wiki contents below should be regarded as slightly out-of-date.


These are guidelines for post-processing the text of DP projects using LaTeX.

There is a companion page containing tips and tricks germane to LaTeX post-processing.

Note: This document began as laurawisewell's description of her PPing practice. It is gradually being recast as general guidelines, but anything written in the first person should be read as suggestions rather than guidelines.

LaTeX Post-Processing Walkthrough

Please see the list of recently-posted projects for examples of DP LaTeX projects posted to PG, and the pglatex page for information on required structure of LaTeX source files.

Preparation

  1. Download the project. Put things in the usual directories.
  2. MacOS, Linux and Unix users: Change the line endings using dos2unix or a regex.
  3. Check all the pages are there. Read the Project Comments and the project thread. Comment out the page separators if they aren't already.
    • dcwilson suggests turning the page separators into something that makes the png numbers visible in the draft output, to make it easier to compare your LaTeX-in-progress with the scans. For example, with the memoir class
\ifdraftdoc
\def\PG#1 #2.png#3
{\marginpar{\noindent\null\hfill\small #2.png}}
\else
\def\PG#1 #2.png#3
{}
\fi

in the preamble lets you conceal the page separators thus:

\PG--File: 054.png---\*******\**************\********\******\******\-------

Decide if there's a compelling reason to prefer latex over pdflatex

At present it is simplest at the PG end of things if the book can be compiled using pdflatex (because the pdf generation doesn't require additional steps). The most likely reason for needing a different compilation pathway (such as LaTeX -> PostScript -> pdf) will be the illustrations. Decide whether you will recreate (most of) them in the picture environment or include them as graphics files, and if the latter, what filetype is best. pdflatex handles .png and .jpg natively (for bitmap graphics) as well as .pdf (for separately-compiled .eps). It can also handle MetaPost and, with assistance from suitable packages, some not-too-complicated .eps files. Some packages that permit drawing commands to be embedded in the .tex source (such as epic and eepic) will not work immediately with pdflatex. However, new packages are being developed all the time to extend pdflatex's capabilities in this area, so you need a really good reason to insist on latex.

Prepare any included graphics

There are two main classes of images: bitmap and vector. Vector images are generally preferable, because they scale well and have small file sizes. Bitmap images have limited resolution and the file size can be excessive if you insist on print-quality resolution. Many diagrams and simple line art are straightforward to recreate in a vector format: ask in the forums if you need a hand. Monochrome artwork can also be vectorised acceptably using tools like potrace (this is how the publishers' devices were created). There is usually no alternative to a bitmap for photographs or grayscale illustrations; some experimentation will be required to find a good balance between acceptable output resolution and file size.

If you'd rather deal with images later, temporarily comment out all the \includegraphics commands; otherwise the document won't compile, because the files aren't there. Alternatively, use the draft option of graphicx, provided you specify a bounding box in the \includegraphics optional argument. When you resize images, bear in mind the size of the text block: the figures recorded here for the default \textwidth and \textheight, 360pt and 595pt, were flagged as uncertain, so check the values for your own class and options. Save as eps and distill to pdf, or directly as png or jpg for scanned illustrations.

As with html projects, all graphics files must be located in a subdirectory /images within the project directory. When placing them in the LaTeX code it is best to specify the directory explicitly: \includegraphics{./images/illo743.jpg}

If you are providing both .eps and .pdf versions of illustrations (so that users can compile with either dvips or with pdflatex for example), then you can omit the file extension and let the graphicx package automatically locate the version appropriate to the workflow: \includegraphics{./images/illus-034b} (Using ./images rather than just /images when the graphic extension is omitted ensures TeX will locate files in the current directory rather than in any wider search trees, and does no harm when the graphics extension is explicitly specified.)

If illustrations have been compiled from some kind of code (such as PostScript) and you want this to remain with the project in case maintenance is required, files can be placed in a subdirectory /images/sources

Decide page dimensions and start to put a preamble into your document

The book documentclass is probably appropriate, or amsbook. Or consider memoir.cls (available from CTAN) which makes it easier to change the formatting of the document.

Specify the page size in the documentclass explicitly: \documentclass[12pt,reqno,letterpaper]{book}[2005/09/16] for example. Do not rely on defaults, because the WWer will be compiling the book using a TeX system whose defaults could well be set up differently from yours. Make sure that the dimensions for the output pdf are also specified explicitly. With a pdflatex workflow this can be achieved by, for example,

%% two-sided A4 PDF
\pdfcatalog{/PageLayout/TwoPageRight }
\setlength{\pdfpagewidth}{210truemm}% or replace the 210truemm with \paperwidth to get it from the documentclass
\setlength{\pdfpageheight}{297truemm}% or replace the 297truemm with \paperheight to get it from the documentclass

(The \pdfcatalog command makes Adobe Reader display the pdf like an open book.)

Put the \end{document} at the end, and \frontmatter, \mainmatter, \backmatter at appropriate points within the text.

You'll add to the preamble as you progress, but putting in packages like hyperref (and, for Mac users, pdfsync) at this stage can help you navigate.
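
As a rough illustration of the steps above, a skeleton at this stage might look something like the following. The class options and the comments marking where material goes are placeholders only, not recommendations for any particular book.

```latex
\documentclass[12pt,letterpaper]{book}% state the paper size explicitly
\usepackage{hyperref}% clickable ToC and references while you work

\begin{document}
\frontmatter
% title page, preface, table of contents
\mainmatter
% the chapters of the book
\backmatter
% index and similar material
\end{document}
```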

Decide output text encoding

The default input text encoding is Latin-1, because the LaTeX proofing guidelines encourage the use of Latin-1 symbols wherever possible. Books coming out of the rounds will normally be expecting the preamble to include

\usepackage[latin1]{inputenc}

LOTE projects will probably need the babel package too. The PG convention is for text files to be Latin-1, with Windows line endings, so PPers on other platforms may need to convert their .tex source before uploading.

You generally only need to worry about the output text encoding if the project is LOTE and uses lots of accented characters. Even though the input file is Latin-1, the inputenc package will convert é (for example) to \'e behind the scenes. This can interfere with hyphenation, because TeX will not hyphenate a word containing composite characters (it won't hyphenate a word containing an explicit hyphen either). If you want to have TeX hyphenate accented words, then you need to change the output text encoding using the fontenc package. This also requires that you--and whoever compiles your code--have available a set of fonts that contain single glyphs for all the composite characters (unlike Computer Modern, which constructs é from an e and a separate acute, rather than having a ready-made é).

Compile the document

Add packages (such as amsfonts and amssymb) and fix errors (e.g. millions of unescaped &s!) until the thing compiles. You don't really need to compile until much later, but I get curious! You can always comment out sections, such as a horrendously buggy table, that you don't want to fix right now. (As an alternative to putting percent signs on many lines, you can ignore a whole chunk by putting \iffalse before it and \fi after it.)
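
A minimal sketch of the \iffalse trick, using a made-up table as the chunk to be skipped. TeX still counts \if... and \fi tokens while skipping, so the ignored chunk should not itself contain an unmatched conditional.

```latex
Text before the troublesome table.

\iffalse
\begin{tabular}{lll}
First & second & third \\
a broken row that still needs fixing
\end{tabular}
\fi

Text after the troublesome table.
```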

Highlighting

(Guiguts) Turn on highlighting of en_common, so you can check for scannos while you do the next stages.

Remove unwanted whitespace

(Guiguts) Run "Remove end of line spaces", and "Remove blank lines before page separators". Page through the file checking that there are no other unwanted blank lines (for example around mid-paragraph figures or equations), and that blanks at the top of pages are present where needed. As you do so, check that the text makes sense across the page breaks, and keep an eye on those highlighted words.

Optionally, Run Fixup

(Guiguts) Run Fixup WITH GREAT CARE! You don't (or shouldn't) have block markups in, and if someone has kindly formatted a LaTeX table so that the source is readable, you don't want to lose that. And often spaces within curly brackets (such as \mbox{ }) are important. I'd say run Fixup, save the file under another name, and diff them to see what GG has actually done, then go back and modify the options accordingly. It is useful for things like spaces before colons.

Run Lprep

rfrank has written Lprep, which strips out all or most of the LaTeX commands from your file. The resulting text is what you should perform the next few stages on, although the corrections you make should go back in the original file. While you can probably work around some garbage left behind by Lprep, the WWer will also be using an Lprepped version of your uploaded LaTeX source, so it is worthwhile developing some Lprep-customising code to ensure output that is as clean as possible (see here). Lprep can handle most normal LaTeX, but any macros you build yourself or take from less common packages are likely candidates for needing some attention, providing they are called in the body of the document and outside math mode. The customisation code can be stored in your LaTeX source, immediately following \end{document} and surrounded by ### lines. Even if you don't need any customisation, it's good practice to include the

###

###

after \end{document}.

Word frequency checks

Look at the words occurring only once, and suspicious hyphens, accents, emdashes (yes, they're different from what GG expects, but it can still help you find them).

Gutcheck

If Gutcheck is run on a LaTeX file rather than the Lprep text, it will choke on a lot of things, even at this early stage without the markup that you'll add later. Just do your best. It does find useful things, like "no punctuation at paragraph end" and "missing space", which you MUST fix before lots more markup goes in.

Scanno checks

Run the guiguts checks on en_common, misspelled, and regex. You will very likely want to modify some of the regexes. (Anyone care to make a latexregex.rc file?)

Back to the .tex file. — Find obvious non-LaTeX-isms

Search for ", the double quote character. There probably should be none.
Search for \s', a space followed by a single quote. It probably should be a backtick.
Search for html markup, such as italics and bold and small caps.
Search for block markup of poetry etc. Guiguts will do this. (Its search for orphaned brackets isn't very useful though, as it can't handle nesting afaik.)
Search for hyphens and dashes and make them the right length.
Search for "..." and turn it into a \dots.
Find any footnotes, make sure they're inline, and sidenotes, which should become \marginpars.
(If you find some of these things, it may be worth sending feedback to the foofers about it.)
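
The searches above amount to conversions of roughly the following kind. The sample sentence is invented, and the html tags shown are assumed to be the usual DP proofing markup.

```latex
% Proofed text:  "Well," he said, "it was ... in 1850-1860 - or later."
% LaTeX markup:
``Well,'' he said, ``it was \dots\ in 1850--1860---or later.''

% Single quotes: opening backtick, closing apostrophe
`quoted word'

% html-style markup left over from proofing:
%   <i>word</i>    becomes  \textit{word}
%   <b>word</b>    becomes  \textbf{word}
%   <sc>Word</sc>  becomes  \textsc{Word}
```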

Spell check

I use Excalibur, and make a new dictionary just for the project. Don't be like me and forget to switch off "Ignore all-caps words" in preferences, cos typos on the title page are really embarrassing...!
Or, you could run your usual spellchecker on the Lprepped text.

Resolve proofer notes

Also check the project thread again for any issues you might have forgotten.

Open in your favourite TeXt editor

You're just about finished with the parts Guiguts can help you with, so you'd probably be better off using something that highlights LaTeX syntax. It will need to be capable of regexing though (so for us Mac lovers, the latest TeXShop's good, and so is SubEthaEdit).

(At this point, it's now safe to start doing some of the laborious dull stuff I've put at the end, e.g. the index, so that you don't have to do it all in one sitting.)

Sort out abbreviations

  • You probably want to get the abbreviations spaced if they are in the book, using non-breaking spaces ("~") rather than closed up as in normal proofing.
  • Also non-breaking between a person's name and their title (perhaps? if not then a normal "\ " space, as below, if their title ends in a full-stop), and between numerical quantities and units if the book did. This is a pain, since some will have put dollars round just the numerals, or the numerals and the units, or nothing at all, and some may have not left any space even if the book did.
  • Unless you are using \frenchspacing, you need to tell LaTeX whenever a lower-case letter followed by a full-stop isn't the end of a sentence (such as "e.g. an example", which becomes "e.g.\ an example"). Some proofers do this, some don't, and some inappropriately (?) use a non-breaking space.
  • Finally, (unless you are using \frenchspacing) tell LaTeX whenever an upper-case followed by a full-stop is the end of a sentence ("I live in the U.K. " becomes "I live in the U.K\@. ").
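
Put together, the spacing rules in this list look something like this in the source. The sentences are invented purely for illustration.

```latex
% Tie between a title and a name; "\ " after a full stop that does not end a sentence
Mr.~Smith and Dr.~Jones, i.e.\ the authors, met at 10~a.m.\ precisely.

% Tie between a quantity and its unit
The wall was 12~ft.\ high and weighed 3~cwt.\ per yard.

% Upper-case letter followed by a full stop that does end the sentence
I live in the U.K\@. It rains a great deal.
```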

Not all books use dots with abbreviations, particularly the ones like "Mr". But here are some common abbreviations to search for:

Mr., Mrs., Messrs., Dr., Prof., Rev., St.; hr. and hrs., m. and min. and mins., sec. and secs., a.m. and p.m., B.C. and A.D.; ft., in., oz., lb. and lbs., cwt.; fig. and figs., vol. and vols., p. and pp., ed.; i.e., e.g., viz., etc., &c., percent.

Proofers may have missed out dots that were in the image.

Format the title page

You may prefer the titlepage environment to the maketitle command.

Sort out the chapter and section structure

Put in \tableofcontents. Replace any \section* with \section and \chapter* with \chapter until you can see the structure in the generated ToC (change back later: this is just to check for any that are missing). Compare to the book's ToC. Page through the book, to find any section headings the proofer just marked as bold, centred etc, and turn them into section commands. Also find out whether the levels of headings in the book are like those in the book's ToC and/or in the generated ToC.

You can add additional things to the generated ToC using

\addcontentsline{toc}{level}{stuff}

for entries formatted like a <level>, or just

\addtocontents{toc}{code}

to insert raw code in the generated ToC (eg to force a pagebreak after a particular entry, or fiddle with vertical spacing).

It's probably good to put in a \label at each section or chapter, unless your documentclass does this automatically anyway. You can format the headings now or later.
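
A small sketch of how these pieces might fit together. The heading texts and label names are invented, and the \protect\newpage line is only one example of raw code you might write to the ToC.

```latex
% An unnumbered heading that should still appear in the generated ToC
\chapter*{Preface}
\addcontentsline{toc}{chapter}{Preface}

% Ordinary headings, each with a label for later cross-references
\chapter{Introductory}\label{chap:intro}
\section{Historical sketch}\label{sec:history}

% Raw code written into the ToC: here, a page break after the entries so far
\addtocontents{toc}{\protect\newpage}
```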

Sort out theorems and similar environments

Search for \begin to get some idea of what you'll need in the way of Theorem, Lemma, Corollary, Proof etc. Foofers may have put in formatting such as italic text, which you will probably want to remove and handle later. They may even have typed the word "Theorem" instead of opening an environment. If theorems are numbered, or talked about by page reference at all, then you may want to put \labels in now.

Equations

It is recommended that any numbered equations have the numbers hard-coded using \tag, so check the file for any unstarred equation environments without tags. Will you need \labels?

Page through the output checking for any obvious mathematical monstrosities. Fix any of your pet hates (mine is that I abhor the eqnarray environment).

Consider using the amsmath environments to get nice-looking multiline/aligned displays.
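
For instance, with amsmath loaded, a hard-coded equation number and an aligned multiline display might be written as below. The equations and the label name are made up for the example.

```latex
% preamble
\usepackage{amsmath}

% body: a single numbered equation, with the book's own number hard-coded
\begin{equation}
x^2 + y^2 = r^2 \tag{17}\label{eq:17}
\end{equation}

% body: a multiline display where only the last line carries a number
\begin{align*}
(a+b)^2 &= (a+b)(a+b) \\
        &= a^2 + 2ab + b^2 \tag{18}
\end{align*}

% body: a reference to the first equation, printed as (17)
This follows from~\eqref{eq:17}.
```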

You may find that some numerals are in text mode while others are in math mode. Although they look the same, they won't if someone changes the text font, so you might want to try for consistency. Similarly math letters in an upright font may have been proofed as text, or in math font, or with \textrm or \mathrm.

Sort out tables

You might want multirow or longtable. Don't forget that the tabbing environment is available as well as tabular and can right align columns and split across pages, making it good for some kinds of tables (including a hard-coded ToC, if you end up having to use one).
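
As a sketch of the tabbing approach, a hard-coded ToC could be set roughly as follows. The chapter titles, page numbers, and the 7em tab stop are invented; inside tabbing, \` pushes the text that follows it flush against the right margin.

```latex
\begin{tabbing}
\hspace{7em}\=\kill % template line: sets one tab stop and prints nothing
CHAPTER I   \> Introductory     \` 1 \\
CHAPTER II  \> The First Voyage \` 17 \\
CHAPTER III \> The Return       \` 42
\end{tabbing}
```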

If you'd like the source to be more legible, you can use Guiguts table fixup:

  • Copy and paste the rows of the table into a blank file in Guiguts.
  • Make sure each table row is a single line.
  • Remove all spaces, or if there are important spaces (such as cells containing more than one word) run the Fixup option to convert multiple spaces into single spaces. Also remove spaces after &.
  • Replace all instances of "&" with "  &" (i.e. put two spaces before them all). GuiGuts might miss some, so you may need to hit "replace all" more times.
  • Open the GuiGuts table fixup, select your table and hit "Autocolumns". Watch it line up beautifully!
  • Highlight each vertical line and delete it, then copy and paste back.

Sort out figures

Go through putting appropriate sizes into the \includegraphics commands, unless you're sure the file is already at the size you want. (It is an open question whether sizes are better specified relative to \textwidth or \textheight, or absolutely.)

Illustrations are required to live in a directory called images; because of the various searching conventions employed by different TeX systems, it is safest to specify the illustrations via

\includegraphics[options]{./images/filename}

Floats

Do you want the tables and/or figures to float? If so, wrap each in a table or figure environment, choosing a reasonable placement specifier ([!htbp]?) and caption if there is one. Some might be best fitted in landscape orientation; there are various ways, depending on whether you want a whole page landscape, want the caption or only the figure rotated, and whether you want any more text around it rotated. See TUG FAQ. You may want to have text flowing around small or narrow figures, if the book did. Packages wrapfig and floatflt can both do this, but they both have the disadvantage that they may place things out of sequence, and can be a bit pernickety near pagebreaks. Worry about that later.

You will probably want \labels, which need to go just after the caption. Captions, if any, are supposed to go below, but since hyperref links to the caption not the thing itself, you might want to put them above and sort the spacing.

Remember that if you float things, references to them as being "above", "below", "following", "opposite" etc in the text may end up being wrong. The varioref package can help with this to some extent, by allowing LaTeX to change the words used depending on how far the float has floated.
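
A floated figure with a caption, a label, and a varioref reference might look something like this. The width, the caption text, and the label name are placeholders.

```latex
% preamble
\usepackage{graphicx}
\usepackage{varioref}

% body
\begin{figure}[!htbp]
  \centering
  \includegraphics[width=0.8\textwidth]{./images/illo743.jpg}
  \caption{Caption text from the book.}
  \label{fig:743}% after \caption, so the label picks up the figure number
\end{figure}

% \vref adds wording such as "on the next page" when the float has moved
The apparatus is shown in Figure~\vref{fig:743}.
```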

Referencing pages, the index, contents, list of figures, list of tables, bibliography

Things to search for that may need to become \ref, \eqref or \pageref include mentions of

  • page
  • figure
  • plate
  • table
  • equation
  • lemma
  • theorem
  • corollary
  • chapter
  • section
  • appendix
  • part

AND all capitalisations and abbreviations of these. You are bound to miss something. And I've no idea how to find citations except by smoothreading. Put "colorlinks" as an option where you call hyperref even if you will turn it off later, as this helps missed-out references to get noticed.

Autogenerated index, ToC, LoF, LoT are preferred, but if your book has an inconsistent structure you may end up using what the proofers gave you and merely fixing the page references. Decide also whether you will take the lazy approach to page referencing, which is to just point to a \label automatically placed at each page separator, or will place appropriate \labels yourself at the exact points in the text. The former method is quick using regexes, but doesn't help you to find anomalies in things, and means that the references in the final document could be "too early" by about a page quite often. The latter method is preferred, since it's more accurate and helps you get to know your text and notice errors, but it's very time-consuming (especially when you can't find the thing referenced...) and does tend to make your source code rather illegible. If you are going to use makeindex then you have no choice but to place the \index commands manually. If there is a bibliography, edit the proofers' work into a thebibliography environment.
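
A minimal sketch of manually placed \index and \label commands together with a thebibliography environment. The sentences, label names, and the bibliography entry are placeholders, not taken from any real project.

```latex
% preamble
\usepackage{makeidx}
\makeindex

% body: each \index and \label sits at the exact point being referenced
The method of least squares\index{least squares}\label{ref:leastsq}
was introduced earlier.

% body: what the book printed as "see page 54" becomes
see page~\pageref{ref:leastsq}

% back matter
\begin{thebibliography}{9}
\bibitem{author1} A. Author, \textit{Title of the Work}, Place, Year.
\end{thebibliography}
\printindex
```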

Note: It's important to examine the logs when you run latex or makeindex, as these will warn you if any of your references are amiss. I'm told that on some set-ups the console window will close as soon as it has finished running makeindex, so that you will have to actually go and open the index log myfile.ilg and have a look.

Hyperlinks (optional)

Hyperref will automatically make all uses of \ref, \pageref, \eqref and \cite into hyperlinks, as well as hyperlinking the ToC, LoF, LoT and Index for you. But NOTE! Rather weirdly, \pageref will become a link not to the \label you wanted, but to the last sectioning or figure command preceding it. There are (at least!) two ways to fix this:

  • Insert \phantomsection before all the labels that aren't already a section or figure. This method makes the \pageref hyperlink to the exact point in the text. This is most accurate.
  • Place the following in your preamble:
\makeatletter 
\AtBeginDocument{\def\pageref#1{% 
\expandafter\@pagesetref\csname r@#1\endcsname\@empty{#1}}} 
\makeatother

This makes the \pageref hyperlink to the top of the page containing the thing referenced. This may look more natural than the other method, particularly if you placed \labels mid-sentence.

You can turn arbitrary text into a hyperlink:

\hyperlink{target_label}{linked text}
\hypertarget{target_label}{target text}

As with html, the target text is allowed to be empty. Things you might want to hyperlink (and/or fix with the varioref package) include

above, below, facing, overleaf, opposite, over the page, preceding, previous, t.o., next page, following, this figure, this table, here.

Bookmarks (i.e. the tree-like structure you may see in a side pane of your PDF viewer) are created automatically by hyperref corresponding to the autogenerated ToC. You can place extra bookmarks using \pdfbookmark[n]{Text}{label}, where n is the "level": -1 for part, 0 for chapter, 1 for section...

Various LaTeXy stuff cannot appear in bookmarks: the command \texorpdfstring{LaTeX version}{plain version} is useful for sanitising things when hyperref complains.
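
The hyperref commands mentioned in this section might be used along these lines. The label names and heading texts are invented for the example.

```latex
% A label that is not at a section or figure: give hyperref an anchor first
\phantomsection\label{page:054}

% Maths in a heading: supply a plain-text version for the bookmark
\section{The value of \texorpdfstring{$\pi$}{pi}}

% An extra chapter-level bookmark for something absent from the ToC
\pdfbookmark[0]{Transcriber's Note}{tnote}
```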

Fix lists and any exotic content

Such as Greek or Hebrew, poetry, drama...? You can tweak formatting of enumerated and itemized lists later, for now just get the syntax right.

Fix float placement

This can be a real pain, especially if you have some with text wrapped around and you need to get them in sequence. wrapfig allows you to insist that a figure is placed "right here", which can help once you see where the pagebreaks fall and can move the figure within the source code accordingly. But it's not very satisfactory. You can also tweak some of LaTeX's fussy parameters about float placement. Specifying raggedbottom for the awkward pages may also help. (The flafter package is supposed to at least stop things floating backwards.) Don't spend too much time on this yet—you want it ok for smoothreading, but the pagebreaks could still move once you format theorems and sections etc.
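
The float parameters in question are standard LaTeX ones; a preamble fragment loosening them might look like this. The particular values are only a starting point for experiment, not recommended settings.

```latex
\usepackage{flafter}% floats never appear before their place in the source

\renewcommand{\topfraction}{0.85}      % max fraction of a page for floats at the top
\renewcommand{\bottomfraction}{0.7}    % max fraction of a page for floats at the bottom
\renewcommand{\textfraction}{0.1}      % min fraction of a page that must remain text
\renewcommand{\floatpagefraction}{0.75}% min fill for a page containing only floats
\setcounter{topnumber}{3}              % max number of floats at the top of a page
\setcounter{bottomnumber}{2}           % max number of floats at the bottom of a page
\setcounter{totalnumber}{4}            % max number of floats on a text page
```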

Smooth-read

You will doubtless want to smooth-read it yourself, but you may like to upload it now, since the remaining tasks of tweaking the formatting can be done while the smoothies do their thing. It may be good to warn the smoothies in the comments if the filesize is very large, and also explain how to send their corrections since they can't mark them on the pdf.

Something very useful to get from the smooth-read is a note of where the text says things like "overleaf" or "in the figure below". The packages flafter and varioref can help with those.

Format chapter and section headings

You're not obliged to copy the original book's style, and in particular use of strange fonts should be avoided. But if you want to tweak things, memoir.cls is very flexible, or you might find book.cls or amsbook.cls close to what you want. Some changes can be made by copy-pasting part of the .cls file into your preamble and altering formatting commands in it. If the chunk you copy contains any @ signs, put \makeatletter before the copied chunk, and \makeatother after it. Some feel this is a dirty trick, others feel it is preferable to loading a special documentclass or package.
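
As an illustration of the copy-and-alter approach, here is the \section definition from book.cls with only the final formatting argument changed. The centred small-caps style is an arbitrary choice for the example.

```latex
\makeatletter
% book.cls uses \newcommand here; \renewcommand because \section already exists
\renewcommand\section{\@startsection{section}{1}{\z@}%
  {-3.5ex \@plus -1ex \@minus -.2ex}% space above the heading
  {2.3ex \@plus .2ex}%               space below the heading
  {\normalfont\large\scshape\centering}}% was \normalfont\Large\bfseries
\makeatother
```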

Sort out font issues

Because the posted pdf will be compiled by a WWer, you can only use fonts which are either universally available or which come with a 'standard' LaTeX installation, or are otherwise easily accessible to the WWer.

If you want to use decorative fonts for special headings (on the title page for example), create image files of the relevant pieces of text as .eps or .pdf, making sure to embed the fonts! Then incorporate them at the appropriate points in the document via \includegraphics.

If the project is LOTE, decide on the output font encoding (you will almost certainly be using an input encoding already via inputenc). The choice of output encoding affects both hyphenation and fontsets. If the standard TeX output encoding is used, hyphenation is suppressed for any word containing an accented character. For example, mathématique in the document source will be converted by inputenc into math\'ematique internally. In the standard TeX output font encoding OT1 the accent is a separate glyph, and this forces suppression of the hyphenation algorithm (just as the presence of an explicit hyphen suppresses TeX's hyphenation algorithm): \showhyphens{mathématique} writes math´ematique (that is, no hyphenation points) to the logfile under the OT1 encoding.

There are two ways around this: manually provide discretionary hyphens for accented words near bad line breaks, or switch to a different output font encoding in which the accented letters appear as single, rather than composite, glyphs (because then the hyphenation algorithm will see only letters). The most common is \usepackage[T1]{fontenc}: under this encoding \showhyphens{mathématique} writes math-é-ma-tique (three possible hyphenation points) to the logfile.

The problem now is that the fonts used in the document have to actually contain the necessary accented glyphs. LaTeX will default to the (bitmapped) EC fonts when the T1 output encoding is specified; it is probably better to use scalable ('Type 1') fonts though, which means using either the CM-super or the Latin Modern fonts, both of which are freely available from CTAN if not already provided with your LaTeX distribution. The CM-super collection includes literally hundreds of optically scaled font sizes, whereas the Latin Modern collection provides a more modest number of fonts and relies on linear scaling. To use the CM-super fonts, \usepackage{type1ec}, and to use the Latin Modern fonts, \usepackage{lmodern}.
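
The two approaches can be sketched as follows. The choice of Latin Modern rather than CM-super is arbitrary, and the hyphenation points actually found will depend on the hyphenation patterns loaded (for example via babel).

```latex
% preamble
\usepackage[latin1]{inputenc}% input: Latin-1 source file
\usepackage[T1]{fontenc}%      output: accented letters as single glyphs
\usepackage{lmodern}%          scalable Latin Modern fonts (or type1ec for CM-super)

% body, as a diagnostic: writes the hyphenation points found to the log file
\showhyphens{mathématique}

% body, staying with OT1 instead: mark discretionary hyphens by hand where needed
math\-é\-ma\-tique
```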

Format theorem-like environments

You can do a lot using the newtheoremstyle from the amsthm package. But you can handle some aspects, such as numbering, just using \newtheorem which doesn't require a package.
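
A sketch of an amsthm theorem style, assuming a book that sets theorem heads in small caps and bodies in italic. The style name oldbook and all the formatting choices are hypothetical.

```latex
% preamble
\usepackage{amsthm}
\newtheoremstyle{oldbook}% name of the style
  {\topsep}%  space above
  {\topsep}%  space below
  {\itshape}% body font
  {}%         indent amount (empty means no indent)
  {\scshape}% head font
  {.}%        punctuation after the head
  {.5em}%     space after the head
  {}%         head spec (empty means the default)
\theoremstyle{oldbook}
\newtheorem{theorem}{Theorem}[chapter]% numbered within chapters
\newtheorem{lemma}[theorem]{Lemma}%     shares the theorem counter

% body
\begin{theorem}\label{thm:main}
Statement of the theorem.
\end{theorem}
```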

Format lists

Deal with running headers

Footnotes

Some older books used printer's marks rather than numbers for footnotes, resetting the symbol sequence on each page. The perpage package can assist with this.
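
With perpage, a per-page sequence of printer's marks might be set up roughly like this. Note that \fnsymbol only provides nine symbols, and that the per-page reset relies on the .aux file, so more than one LaTeX run is needed before the marks settle.

```latex
\usepackage{perpage}
\MakePerPage{footnote}% reset the footnote counter on every page
\renewcommand{\thefootnote}{\fnsymbol{footnote}}% asterisk, dagger, double dagger, ...
```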

If the original book used unusual formatting for footnote markers (eg surrounding them with parentheses) consider making the parentheses (or whatever) part of the footnoting machinery rather than having them hard-coded throughout the text. This makes it easier for subsequent repurposing or adaptation of the code. For example,

\makeatletter
\def\@makefnmark{\hbox{(\@textsuperscript{\normalfont\@thefnmark})}}
\makeatother

in the preamble will make a \footnote come out as (¹). Tweaking things to look like an old book usually requires a bit of LaTeX hacking: ask in the forums if you need a hand.

Transcriber's notes

You might want a block of them at the start (for things people need to be aware of before they read the book), or interspersed as footnotes, or perhaps as endnotes. For footnotes/endnotes, you will want to have a separate sequence of markers from the book's own footnotes—memoir.cls can handle this, as can the footmisc package. Things that might be corrected silently in a "normal" project (such as trivial punctuation corrections) can be noted using comments in the LaTeX code if you want to preserve the history of absolutely every discrepancy between the original scans and the final ebook.

Tidy up the source code

With future maintenance in mind, it's probably best not to rewrap (so the lines of code more or less follow the lines of the original scans), but you might like to get rid of any excessively long lines, being careful not to accidentally put linebreaks in bad places, such as within index labels.

Check there are no proofers' notes remaining, and remove all embarrassing "notes to self".

Supply the optional date argument for your class and package files, e.g.

\documentclass[a4paper,11pt]{book}[2001/04/21]

The date to use is found in your log file, in a line like

Document Class: book 2001/04/21 v1.4e Standard LaTeX document class

Consider using the \listfiles declaration so that the log file contains a concise summary of files accessed.

Check that everything in your preamble is really needed, and put comments explaining why each thing is there. Also say what should be done if any of the packages is unavailable: suggest alternative packages, or alternative commands. You can use \providecommand: this defines a new command, but only if it is not already defined. So, rather than telling the user that if hyperref is unavailable they should do a whole bunch of regex replaces, I can put

\providecommand{\hyperlink}[2]{#2}
\providecommand{\hypertarget}[2]{#2}
\providecommand{\phantomsection}{}
\providecommand{\pdfbookmark}[3][0]{}

which will have no effect if hyperref is present, but will allow the file to compile if it is not. Another possibility is to use some if...then statements, for example

\IfFileExists{obscurepackage.sty}
  { % exploit stuff from package
  }{% else provide alternative commands
  }
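
A concrete instance of this pattern, using varioref purely as an example of a package that might be missing. The fallback definition is a simplification that covers only the unstarred \vref with no optional argument.

```latex
\IfFileExists{varioref.sty}
  {\usepackage{varioref}}% \vref adds wording such as "on the next page"
  {\newcommand{\vref}[1]{\ref{#1} on page~\pageref{#1}}}% plain fallback
```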

Use a nifty regex to turn each page separator into a comment giving the folio number, but also keep the png numbers as they are very useful. Remove the proofers' names though.

Run Lacheck.

This checks for things like spaced full-stops, \refs without ties etc.

Insert cataloguing data for the final pdf (optional)

You can add additional value to your final pdf by defining some of the metadata fields that can be stored in a pdf, and by arranging for the pdf to open neatly.

If you're using hyperref then you can use the \hypersetup command:

\providecommand{\ebook}{2xydw} % WWer will redefine \ebook to contain the actual PG number
\hypersetup{pdftitle=The Project Gutenberg eBook \#\ebook: Actual Title Here,
 pdfsubject=Subtitle or similar,
 pdfauthor=Whoever it is,
 pdfkeywords={PM, PP, CP, the DP team, other credits},
 pdfstartview=Fit,
 pdfstartpage=1,
 pdfpagemode=UseNone, % bookmark pane will not be visible initially
 pdfdisplaydoctitle,
 pdfpagelayout=TwoPageRight, % this is like an open paper book: if your book is not two-sided, use SinglePage
 }

If you aren't loading the hyperref package but are using pdflatex as your compiler, then you can use pdftex primitive commands:

\providecommand{\ebook}{2xydw}
\pdfcatalog{/PageMode /UseNone
 /ViewerPreferences <</DisplayDocTitle true >>
 /PageLayout /SinglePage } openaction goto page 1 {/Fit}
\pdfinfo{/Title (The Project Gutenberg eBook \#\ebook: Actual Title Here) % note the parentheses
 /Subject (Subtitle or similar)
 /Author (Whoever it is)
 /Keywords (PM, PP, CP, the DP team, other credits)}

Incorporate formatting for the PG header, credits and license

The WWer adds the actual header boilerplate text, credits line, and PG license, but you must set up the code to format it and place it within the structure of your document. It is a requirement that the header and license are presented in a fixed-width font, preserving line breaks and spacing. One method is to use a verbatim environment that has been enhanced using the verbatim package, so that the boilerplate and license retain their plaintext structure but any overlong lines wrap and are indented by 0.25in:

\usepackage{verbatim}[2003/08/22]
\makeatletter
\def\@xobeysp{~\hfil\discretionary{}{\kern\z@}{}\hfilneg}
\renewcommand\verbatim@processline{\leavevmode
  \null\kern-0.25in\the\verbatim@line\par}
\addto@hook\every@verbatim{\@totalleftmargin0.25in\small}
\makeatother

Remember that the inserted text may contain special characters, so you need an environment which isn't fazed by the odd naked & or #. You may want to set up an entry for the licensing information in the ToC (and perhaps a pdf bookmark). The WWer will expect three placeholders, of the form

*** Header boilerplate placeholder: can be 72 or more chars wide &#% *** 
***   Credits stanza placeholder   *** 
*** License placeholder: text is at least 72 characters wide <&#%^$> *** 

but it's probably a good idea to (temporarily) include some additional lines of dummy text, so you can check that (for example) the font size is appropriate, that the real text inserted by the WWer won't run off the page, and that the running heads, page numbers, ToC entries and links work as expected. If your book has a long title that can't easily be abridged, there is a good chance that a few lines of inserted text will be well over 72 characters wide. In a LOTE (language other than English) project, check that accented characters within the boilerplate will work properly: for example

*** Header boilerplate placeholder: can bê 72 or móre chärs wìde &#% ***

Once you're satisfied that your document will still look perfect after the WWer pastes the real stuff over your placeholders, remove the dummy text (but leave the placeholders!).
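
One possible way to place the license placeholder in the document, with a ToC entry that a pdf link can point at, is sketched below. It assumes a book-like class with chapters, the enhanced verbatim environment defined above, and that \phantomsection is available (from hyperref, or from a \providecommand fallback). The wording of the ToC entry is only a suggestion.

\cleardoublepage
\phantomsection % gives the ToC link something to point at
\addcontentsline{toc}{chapter}{Project Gutenberg License}
\begin{verbatim}
*** License placeholder: text is at least 72 characters wide <&#%^$> ***
\end{verbatim}

With hyperref loaded and bookmarks enabled, the \addcontentsline entry normally produces a pdf bookmark as well.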

Finish

Incorporate what the smoothies found.

Check pagination

Although LaTeX generally does a superb job, no algorithm can produce ideal output in every circumstance. One area of relative weakness for TeX is in pagebreaking, so an element of manual tweaking is nearly always required to avoid ugly pagination, especially since the option of rewriting the text isn't available.

  1. Finalise the figure placement.
  2. Check for orphaned or widowed headings (section headings etc.).
  3. Check the ToC and index for secondary entries separated from their main entry. For example, to prevent a break between an index entry and its subentries, find the definition of \subitem in your class file(s), and put a copy in the preamble with a judiciously-placed \nobreak added. Because the definition uses \p@, it must sit between \makeatletter and \makeatother, e.g.
    \makeatletter
    \renewcommand{\subitem}{\par\nobreak\hangindent 40\p@ \hspace*{20\p@}}
    \makeatother
    This may cause other pagination problems, so tread cautiously.
  4. Check for pages with too little content because of things like large unbreakable multi-equation displays: consider inserting \displaybreak (from the amsmath package) at semantically appropriate places and see if this improves the page breaking.
  5. Remember \enlargethispage if a bad pagebreak could be improved if the page were just a line or two longer. (You might want to enlarge the facing page by the same amount.)
  6. Would floating a fixed illustration or table give better pagebreaks?
  7. If all else fails, overrule TeX and resort to a hardcoded pagebreak.
  8. Go to 1.
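
The tweaks in items 4, 5 and 7 might look like this in the source. The fragments are illustrative only: the equations are arbitrary, \displaybreak needs the amsmath package, and the amount of enlargement is a placeholder to be tuned by eye.

% Item 4: allow (without forcing) a page break after the first line of a display
\begin{align*}
  (x+y)^2 &= x^2 + 2xy + y^2 \displaybreak[0] \\
  (x-y)^2 &= x^2 - 2xy + y^2
\end{align*}

% Item 5: make the current page one line longer
\enlargethispage{\baselineskip}

% Item 7: force a page break at this point
\pagebreak

\displaybreak with no optional argument forces the break, and is best placed immediately before the \\ where it should take effect.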

Document the process

Your book will need to be maintained in the future, and almost certainly by someone else. Therefore it's important to provide enough information about how your document is supposed to be compiled so a future compiler can be confident the book still looks the way you intended. List any features of the book that might break easily, such as figure placement, longtable alignment, hyperlinks etc. Say precisely how you compiled the file (e.g. run pdflatex, then makeindex, then pdflatex 3 more times) and give any alternatives. Note that you've run LaCheck and lprep/gutcheck etc. (you might want to run those again).

These notes must appear in a comment block near the top of the file, and must be presented in a box drawn with %. Each line must begin and end with %%, and it's probably a good idea to not use double %s anywhere else in the file. For example

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                                                       %%
%% Packages and substitutions:                                           %%
%%                                                                       %%
%% memoir:   Advanced book class. Required.                              %%
%% memhfixc: Part of memoir; needed to work with hyperref. Required.     %%
%% amsmath:  AMS mathematics enhancements. Required.                     %%
%% graphicx: Standard interface for graphics inclusion. Required.        %%
%%           Driver option needs to be set explicitly.                   %%
%% hyperref: Hypertext embellishments for pdf output. Required.          %%
%%           Driver option needs to be set explicitly.                   %%
%% indentfirst: Standard package to indent first line following chapter/ %%
%%              section headings. Recommended.                           %%
%% soul:     Facilitates effects like letterspacing. Recommended.        %%
%% wrapfig:  Allows placement of graphics inside text cutouts.           %%
%%             Strongly recommended.                                     %%
%%                                                                       %%
%%                                                                       %%
%% Producer's Comments: A fairly straightforward text, except for        %%
%%                      keeping the illustrations in sequence and not    %%
%%                      too far from the text referring to them.         %%
%%                                                                       %%
%% Things to Check:                                                      %%
%%                                                                       %%
%% Figure 23 fits snugly at the bottom of page 30 (pdf page 38): OK      %%
%% hyperref and graphicx driver option matches workflow: OK              %%
%% color driver option matches workflow (color package is called         %%
%%    by hyperref, so may rely on color.cfg): OK                         %%
%% Spellcheck: OK                                                        %%
%% Smoothreading pool: No                                                %%
%% LaCheck: OK                                                           %%
%% Lprep/gutcheck: OK                                                    %%
%% PDF page size: 422 x 652pt (non standard)                             %%
%% PDF bookmarks: created (point to figures) but closed by default       %%
%% PDF document info: filled in                                          %%
%% PDF Reader displays document title in window title bar                %%
%% Images: 35 png and 1 pdf                                              %%
%% No overfull boxes, one underfull hbox and ten underfull vboxes        %%
%%    (caused by placement of images)                                    %%
%%                                                                       %%
%% PDF pages:  57                                                        %%
%%                                                                       %%
%% Compile sequence:                                                     %%
%% pdflatex x2 #(only needs two runs)                                    %%
%%                                                                       %%
%% Compile History:                                                      %%
%% May 08: dcwilson.                                                     %%
%%         MiKTeX 2.7, Windows XP Pro                                    %%
%%                                                                       %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


Append your lprep configuration (delimited by ### lines, and strongly encouraged) after \end{document}.

Check that the preamble contains a \listfiles command on a line by itself, and compile the project one final time using the sequence of commands you've specified in the comment block. Double-check that the page count matches the PDF pages line in the preamble, and append the log file after the lprep configuration stanza. These steps provide useful diagnostic information to the WWer if your project mis-compiles at PG.
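
A minimal sketch of where \listfiles sits is shown below; the class and package are placeholders for whatever the project really uses. Putting the command before \documentclass is one option, and anywhere else in the preamble also works.

\listfiles
\documentclass{book}
\usepackage{amsmath}
\begin{document}
Text of the book.
\end{document}

With \listfiles in place, the log includes a *File List* block giving the version of each class and package file that was loaded.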

And finally...

Upload a .zip containing the .tex source, the /images directory (if there are any illustrations), and your compiled .pdf (so the PPVer/WWer can see how the book is meant to look). Breathe a sigh of relief.