LaTeX proofing guidelines

From DPWiki

These are guidelines for proofing DP LaTeX projects. There are separate guidelines for proofing mathematics in non-LaTeX projects. For other aspects of LaTeX at DP, please consult the

The proofing and formatting guidelines are also available as typeset manuals, suitable for online viewing and printing.


LaTeX proofing guidelines (short version)

  • Do not worry about the mathematics. Non-mathematicians are expressly encouraged to proof LaTeX projects.
  • Proofread everything you can using non-LaTeX DP guidelines. This includes displayed equations.
  • Any (sequence of) symbols that you can't handle should be replaced by $$ (two dollar-symbols).

Fine points:

  • Exceptions to the non-LaTeX proofing guidelines:
    • For Greek symbols in mathematics, use the LaTeX name (e.g., \alpha; see the table below), or invoke the $$ rule if you're unsure what the letter is. Only transliterate Greek when it is clearly text and not mathematics.
    • Use a space rather than inserting a hyphen between a number and a fraction.
  • Retain any prepped text, which will usually contain lots of $$s. Delete obvious OCR junk, which looks something like this:
 1) 5/3*--4aß--40*--5aß,
 0 =--«, « = I, O =--l,
 x ==|/02(2a-&)2-f2«/3(2o--&)(2&--a)-H32(2&-a)ä, d.h.
 (11) «^(ö/S*--4«/3) (a2 + &*) + 2(5a/3--
 9  «--b-\-x--/2a--6\2  a; + a--ft

LaTeX proofing guidelines (long version)

A LaTeX project is formatted differently because it has complex structure that is beyond the reach of normal DP styling. LaTeX is a powerful mark-up language used for typesetting mathematical and many other special characters and expressions. Because the formatting is different, some of the proofing conventions also need to be a bit different (to make life easier for the formatters). However, the primary goal in proofing is still to accurately match the characters on the scan.

  • For P1, P2, and P3, no knowledge of LaTeX is required. Do not worry about the mathematics. Non-mathematicians are expressly encouraged to work on LaTeX projects in the P rounds.
  • Aside from Greek letters (see below), do not add LaTeX formatting in the P rounds, even if you know LaTeX. Most proofers do not know LaTeX, so LaTeX code seriously interferes with the proofing rounds' work. Less is more. :) In the unusual event the text already contains some LaTeX formatting, leave it there. However, all non-LaTeX formatting should be stripped out of LaTeX projects. (Both problems should be almost non-existent, especially in projects started around 2007 or later.)
  • Proofread the text, and as much of the math as is covered by the normal DP proofing guidelines; e.g., proof the subscript in yₙ₊₁ as y_{n+1} and the superscript in z² as z^{2}. In-text fractions are handled slightly differently in LaTeX: don't join a fraction to a preceding number with a hyphen; just leave a space (see Examples below).
  • Latin-1 characters from the drop-down (or "pop-up") menu can and should be used in the P rounds. These include accented letters, as in non-LaTeX projects, but also the degree symbol °, section symbol §, plus-or-minus ±, multiplication sign ×, and mid-dot · (used for multiplication). To denote prime accents in math, use the ASCII single-quote character ' (a.k.a. close-quote, right-quote, or apostrophe), repeated as many times in succession as needed. Please do not use a superscript "o" for degrees or a letter "x" for multiplication.
  • Proofread parentheses (), square brackets [], and curly braces {} using the ordinary characters, even if the symbols on the page are large.
  • If a mathematical symbol or expression was read as junk by the OCR, or appears to be missing, type in what you can, or at a minimum replace it by $$ (two dollar-symbols).
  • For an expression set off on its own line, or group of lines, treat it as a separate paragraph, with a blank line before and after.
  • When a Greek character is used in a math expression, proof it as the name of the letter preceded by a backslash (e.g., π as \pi). For a capital letter, simply capitalize the command (e.g., \Pi). Capital Greek letters having a Roman equivalent (\Alpha, \Beta, \Epsilon, etc.) and the letter omicron do not have special LaTeX commands; if the project comments don't say how to handle these letters, please ask in the project thread.
    [Table: LaTeX's Greek alphabet commands]
    As a last resort, if you don't recognize a character, just mark it as $$.
  • Seven Greek letters have lowercase variants in LaTeX (in blue above). Unless the Project Comments explicitly require it, we don't distinguish variant forms of the same Greek letter. In rare cases, a project may use both forms of a letter with different meanings; should this occur, or if you're unsure, please seek advice in the project thread. Caution: The letters "vee" and "upsilon" are nearly identical, while the following pairs are quite similar-looking: "omega" and "variant pi"; "zeta" and "variant sigma"; "ex" and "variant kappa". Lowercase "upsilon" and "variant sigma" do not occur in mathematics, only in transcribed Greek.
  • LaTeX treats a backslash followed by a sequence of letters as a single entity. When Greek letters appear as part of a larger expression, their names must be followed by a non-letter--such as a space, backslash, or arithmetic operator--as in \pi r^{2} or \alpha+\beta\pi = \omega. Similarly, named functions (log, cos, tan, etc.) should generally be surrounded by spaces, as in "cos u sin v". If the text is easily human-readable, it's probably fine.
  • Within math expressions, spaces next to numbers and arithmetic signs are of no importance, since LaTeX will ignore them and put space around the elements by its own rules. So "x = 2 + 4", "x=2+4", and "x =2 +4" are all syntactically equivalent; they'll all be displayed as x = 2 + 4.
  • Complete rendering of complicated fractions, square roots, etc., is generally beyond the expectations of the proofing rounds. However, please do proof the parts of formulas that are covered by the regular DP proofing guidelines; see the table of examples below. For fractions, type the numerator, a slash, and the denominator, even if the fraction has a horizontal bar. Do not use braces to group the numerator and denominator unless they're present in the page scan; the proofed text need not be mathematically accurate. Finally, never use "ASCII art" to represent fractions, square roots, integrals, or other typographical constructs in a LaTeX project.
  • If you're not certain whether part of an expression is covered by a normal DP guideline or not, the safest thing is to match the scan. For example, you might be uncertain whether a string of dots is an ellipsis (which would be covered by a normal guideline) or just a string of dots (in which case you would just match the number of dots in the scan). Don't be upset if a subsequent round has a different opinion and changes your work: there are many shades of grey in LaTeX proofing. If you prefer black and white, raise the issue in the project thread and get a ruling from the PM.
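
For the curious, the points above about Greek-letter commands and spacing can be illustrated with a small LaTeX file. This is only a sketch for readers who want to see why the rules exist; proofers are not expected to write or compile anything like it, and the surrounding document (article class, sample sentences) is an assumption made for illustration.

```latex
\documentclass{article}
\begin{document}

% A command name ends at the first non-letter, so each Greek name below
% is followed by a space, a backslash, or an arithmetic operator.
Area: $\pi r^{2}$ \quad Sum: $\alpha+\beta\pi = \omega$

% Without the space, LaTeX would read "\pir" as a single, undefined command:
% $\pir^{2}$   % fails with "Undefined control sequence"

% Spaces inside math are ignored; these three lines typeset identically.
$x = 2 + 4$ \par
$x=2+4$ \par
$x =2 +4$ \par

% Capitalizing the command gives the capital letter, where one exists.
$\pi$, $\Pi$, $\omega$, $\Omega$

\end{document}
```

The commented-out line is left disabled on purpose so that the file compiles; removing the leading % shows the error that a missing space causes.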

Examples:

Image:                     Proof as:
x                          x
y₁                         y_{1}
z² + 2½                    z^{2} + 2 1/2
cos Ax sin By              cos Ax sin By
tan θ                      tan \theta   or   tan $$
a + b = 42                 a + b = 42
[image: 0frac01.png]       ( 1.234 × 10^{4} × 678  /  9023 )
[image: 0sqrt01.png]       $$ x^{2} + y^{2}
[image: 0sin01.png]        sin^{-1} A = \pi / 2
[image: 0frac03.png]       a + b / c + d
[image: 0exp01.png]        e^{a^{2} + ab + b^{2}}
[image: 0int01.png]        $$_{a}^{b} f(x) dx
[image: 0sum01.png]        $$ 1 / n
[image: Dblfrac.png]       dy / dx    dz / dy ; [another slash could be confusing, so use space]
[image: 0frac02.png]       dd y / dd x    dd z / dd y   [dd is becoming conventional for the partial derivative; be sure to put a space between dd and the following variable]
[image: 0braces01.png]     x = y
                           z = w
                           $$ for all real x, y


(Don't spend a lot of time trying to indicate the structure of the mathematics—just get the elements of the expressions proofed and in line, with the tops of fractions followed by the bottom parts. Use a space or two to separate the elements of an expression, but don't add parentheses or braces that aren't on the image. For fractions with a horizontal bar, either replace the bar by a slash or separate the top from the bottom by extra space—whichever gives the clearer output.)
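
To show how the proofed text is used afterwards, here is a hedged sketch of what a later formatting round might make of three of the proofed lines above. The page images are not reproduced here, so the structure of each formula (for example, that a + b sits over c + d) is an assumption, and the exact markup a given project uses may differ. None of this is done in the P rounds.

```latex
\documentclass{article}
\begin{document}

% Proofed text:  sin^{-1} A = \pi / 2
% (assumes the scan shows pi over 2 with a horizontal bar)
\[ \sin^{-1} A = \frac{\pi}{2} \]

% Proofed text:  a + b / c + d
% (assumes the scan shows a + b stacked over c + d)
\[ \frac{a + b}{c + d} \]

% Proofed text:  e^{a^{2} + ab + b^{2}}
% Already valid LaTeX; only the math delimiters are added.
\[ e^{a^{2} + ab + b^{2}} \]

\end{document}
```

The formatter supplies the grouping from the page image, which is why the proofed text only needs the elements in the right order.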

  • For tables, get the elements of the header and body
  1. correctly proofed (watch for O / 0 and I / l / 1),
  2. sorted into rows, and
  3. more-or-less in columns,

just as you would for an ordinary DP project.

Common problems

OCR problems: 1-l-I

OCR commonly has trouble distinguishing between the digit '1' (one), the lowercase letter 'l' (ell), and the uppercase letter 'I' (eye). This is especially true for books where the pages may be in poor condition.

Watch out for these. Read the context of the sentence to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.

Noticing these is much easier if you use a mono-spaced font such as DP Sans Mono or Courier.

OCR problems: 0-O

OCR commonly has trouble distinguishing between the digit '0' (zero) and the uppercase letter 'O'. This is especially true for books where the pages may be in poor condition.

Watch out for these. Normally the context of the sentence is sufficient to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.

Noticing these is much easier if you use a mono-spaced font such as DP Sans Mono or Courier.

OCR problems: hyphens and dashes

OCR commonly has trouble distinguishing between dashes and hyphens.

Noticing these is much easier if you use a mono-spaced font such as DP Sans Mono or Courier.

OCR problems: scannos

Scannos are another common OCR issue. A misrecognized character can result in a word that appears to be correct at first glance, but is actually misspelled. These can usually be caught by running the spellcheck from the proofreading interface.

Sometimes the scan produces a valid word that does not match what is in the page image. These are subtle because they can only be caught by someone actually reading the text. Possibly the most common example of the second type is "and" being OCR'ed as "arid." Other examples: "eve" for "eye", "Torn" for "Tom", "train" for "tram". This type is harder to spot and we have a special term for them: "stealth scannos." We collect examples of stealth scannos in this thread.

Spotting scannos is much easier if you use a mono-spaced font such as DP Sans Mono or Courier.

Handwritten notes in book

In general, do not include handwritten notes in a book. However, a note may be worth keeping if, for instance, it is a valid correction, or it overwrites faded printed text to make it more visible.

If in doubt, proof it, but distinguish it from the printed text, e.g. [**note: (text of the note)].

Bad images

If an image is bad (not loading, chopped off, unable to be read), please put a post about this bad image in the project thread. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.

Note that some page images are quite large, and it is common for your browser to have difficulty displaying them, especially if you have several windows open or are using an older computer. Before reporting this as a bad page, try clicking on the "Image" line on the bottom of the page to bring up just the image in a new window. If that brings up a good image, then the problem is probably in your browser or system.

It's fairly common for the image to be good, but the OCR scan is missing the first line or two of the text. Please just type in the missing line(s). If nearly all of the lines are missing in the scan, then either type in the whole page (if you are willing to do that), or just click on the "Return Page to Round" button and the page will be reissued to someone else. If there are several pages like this, you might post a note to the project thread to notify the Project Manager.

Wrong image for text

If the image does not match the text given, please post about it in the project thread. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.

Previous proofreader mistakes

If the previous proofreader made a lot of mistakes or missed a lot of things, please take a moment to give them feedback: click on their name in the proofreading interface and send them a private message explaining how to handle the situation, so that they will know what to do in the future.

Please be nice! Everyone here is a volunteer and presumably trying their best. The point of your feedback message should be to inform them of the correct way to proofread, rather than to criticize them. Give a specific example from their work showing what they did, and what they should have done.

If the previous proofreader did an outstanding job, you can also send them a message about that—especially if they were working on a particularly difficult page.

Printer errors/misspellings

Correct all of the words that the OCR has misread (scannos), but do not correct what may appear to you to be misspellings or printer errors that occur on the scanned image. Many of the older texts have words spelled differently from modern usage and we retain these older spellings, including any accented characters.

If you are unsure, place a note in the txet [**typo for text?] and ask in the project thread. Please do not make corrections to the text, even in case of obvious errors. Some LaTeX post-processors retain the text of errors in the project's source file, and use macros to effect corrections. It's easiest to do this when errors are noted as indicated above.
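
As a rough illustration of the macro approach mentioned above, a post-processor might define something like the following. The macro name \Typo and its two-argument form are hypothetical, chosen only for this sketch; individual projects define their own macros and may do it differently.

```latex
\documentclass{article}

% Hypothetical macro: the first argument is the text as printed (the error),
% the second is the correction. Only the correction is typeset, but the
% original reading stays in the source file.
\newcommand{\Typo}[2]{#2}

\begin{document}
Place a note in the \Typo{txet}{text} and ask in the project thread.
\end{document}
```

Changing the definition to \newcommand{\Typo}[2]{#1} would typeset the book exactly as printed instead, which is one reason it helps when proofers leave the printed error in place and flag it with a note.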

Factual errors in texts

In general, don't correct factual errors in the author's book. Many of the books we are proofreading have statements of fact in them that we no longer accept as accurate. Leave them as the author wrote them.

A possible exception is in technical or scientific books, where a known formula or equation may be given incorrectly, especially if it is shown correctly on other pages of the book. Notify the Project Manager about these, either by sending them a message via the project thread, or by inserting [**note sic explain-your-concern].