PPTools/PPsmq
ppsmq: A "smart quote" PP tool to convert straight (Latin-1) quote marks to curly (UTF-8) quote marks
Overview
Printed books use typographically correct quotes, or “curly quotes.” Sometimes the source files of books are available only with straight quotes. This often happens because the source file is encoded in Latin-1, which does not include the curly-quote characters. Here is an example of straight and curly quotes:
[1] "You'll have to try harder," was all she said.
[2] “You’ll have to try harder,” was all she said.
Line [1] uses straight quotes and line [2] uses typographically correct or “curly” quotes. The ppsmq program attempts to convert a file containing straight quotes to a file where those have been converted to curly quotes.
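To illustrate what such a conversion involves, here is a minimal Python sketch of a naive converter. It is not ppsmq's actual algorithm, and the function name naive_curl is hypothetical. It assumes a quote opens when it follows whitespace or an opening bracket, and closes otherwise.

```python
def naive_curl(text: str) -> str:
    """Naive straight-to-curly conversion. Illustration only, NOT ppsmq's algorithm."""
    out = []
    prev = " "  # treat the start of the text like whitespace
    for ch in text:
        if ch == '"':
            # Assumption: a double quote after whitespace or an opener is an opening quote.
            ch = "“" if prev.isspace() or prev in "([{‘" else "”"
        elif ch == "'":
            # Assumption: same rule for single quotes; anything else is a closing
            # quote or an apostrophe, which share the same curly character.
            ch = "‘" if prev.isspace() or prev in "([{“" else "’"
        out.append(ch)
        prev = ch  # remember the character as written to the output
    return "".join(out)


print(naive_curl('"You\'ll have to try harder," was all she said.'))
# prints: “You’ll have to try harder,” was all she said.
```

A rule this simple gets the example right but fails on cases such as a leading apostrophe ('Tis), which it would treat as an opening quote. Ambiguous cases like that are why a careful tool leaves some quotes alone or flags them for review.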
Running ppsmq online
Note: The ppsmq program is available for online use here.
If you use it online, skip the next section.
Running ppsmq locally
Ppsmq runs on the command line in Windows or on a Mac, and the current version requires Python 3. The ppsmq program takes an input file (“-i filename”) and generates an output file (“-o filename”) with straight quotes converted to curly quotes. Anything it can’t reliably convert, it leaves alone. Anything it detects as suspicious, it marks with an “@” character in the output file.
1. From the command line, run the command: python3 ppsmq.py -i input-file-name -o output-file-name
E.g., python3 ppsmq.py -i treasure.txt -o treasure2.txt
2. Then edit the output file (treasure2.txt in the example above). Resolve any @ marks and any single or double quotes that were not converted. If your input file was HTML, see the notes below.
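For the review in step 2, a small helper can list the lines that still need attention. The following Python sketch is not part of ppsmq; the names find_unresolved and report are hypothetical. It assumes the output file is UTF-8 and that any remaining @, ' or " on a line is worth a look.

```python
import re
from pathlib import Path

# Characters that signal unfinished work: the @ suspect marker
# and any straight quote that was left unconverted.
SUSPECT = re.compile(r"[@'\"]")


def find_unresolved(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still contain @, ' or "."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if SUSPECT.search(line)
    ]


def report(path: str) -> None:
    """Print every line of the given UTF-8 file that still needs review."""
    text = Path(path).read_text(encoding="utf-8")
    for number, line in find_unresolved(text):
        print(f"{number}: {line}")


# Example call, using the output file name from the example command:
# report("treasure2.txt")
```

A legitimate @ in the book's own text (an e-mail address, for instance) would also be reported, so the list is a starting point for manual checking, not a verdict.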
Notes:
- The ppsmq program is usually used on text files, but it can work on HTML. If there are any single or double quotes inside HTML tags, ppsmq converts them to placeholder characters: single quotes become ∮ and double quotes become ∯. Edit them back to ' and " after the step 2 resolution (above) is complete.
- The input file may be Latin-1 or UTF-8. The output file will be UTF-8.
- Even if you don't want to upload your book using curly quotes, you may find it beneficial to use ppsmq. Users report using this on their Latin-1 files as a check of correct straight-quote spacing. It almost always finds several errors in a typical book. Even if you discard the curly-quoted version, checking each “@” suspect may help you find errors in the straight-quote spacing.
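Two of the notes above can be sketched in Python. This is an illustration under stated assumptions, not code taken from ppsmq, and the names restore_tag_quotes and decode_book are hypothetical. The first function assumes the ∮ and ∯ characters appear in the file only as ppsmq placeholders. The second shows one common way to accept either Latin-1 or UTF-8 input, which may differ from how ppsmq itself decides.

```python
# Map the placeholder characters back to straight quotes.
PLACEHOLDERS = str.maketrans({"∮": "'", "∯": '"'})


def restore_tag_quotes(text: str) -> str:
    """Replace every ∮ with ' and every ∯ with " (run after step 2 is done)."""
    return text.translate(PLACEHOLDERS)


def decode_book(raw: bytes) -> str:
    """Decode bytes as UTF-8 if they are valid UTF-8, otherwise as Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")


html = "<a href=∯ch1.html∯ title=∮Chapter I∮>“Begin”</a>"
print(restore_tag_quotes(html))
```

Because the replacement is global, it is worth searching the input for ∮ and ∯ beforehand if the book might contain them as genuine mathematical symbols.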
Program History
Roger Frank created ppsmq and is still its primary maintainer. Walt Farrell (wfarrell) maintains this page and the downloadable copy of ppsmq for DP.
- 2016-02-05: Minor update to provide usage information if the user doesn't provide an input file name. No change to other functionality.
- 2016-02-06: ppsmq 1.12a-wf: Places the output files in the same directory as the source (input) file by default.
- 2018: ppsmq made available online as part of the Post-Processing workbench.