| Basic PPGEN Checklist for use with Guiguts. Prepared by Linda Hamilton, 7 October 2014 | |
| Setup | |
| Activity | Details | 
| Download the text and image files and unpack them in a new folder: | images (nnn.png) in subfolder pngs and hi-res illustration scans (imagenn.png) in subfolder originals | 
| Set image directory | "Prefs" -> "Set File Paths" -> "Set Images Directory" | 
| Set Gutcheck Options | "Fixup" -> "Gutcheck Options" -> Check "-v Enable verbose mode", "-p Report ALL unbalanced double quotes" & "-s Report ALL unbalanced single quotes". Also set the spell dictionary. | 
| Comments | Read Project and Forum Comments | 
| General Cleanup and Checking | |
| Activity | Details | 
| Page Through book | Check for end-of-page hyphens, missed paragraph starts at start of page, missed bold/italics, and blank pages, and note the types of formatting you'll need so you can start building ppgen macros. As you go, remove blank pages and extra continuation markers that cross pages. Mark poems. | 
| Illustrations/tables | Move Illustration tags & tables to paragraph breaks (and to appropriate spots) | 
| Blank pages | Check that they're really blank and remove [blank page] text | 
| Resolve *s, project comments left by proofers, and hyphens | Search for * and try to resolve each one. REMEMBER TO UNCHECK the "WHOLE WORD" setting in Search/Replace first. When you find a hyphenated word with a *, try search-replace to see if the text uses the version with or without the hyphen elsewhere in the text. Resolve *s that are at page or line endings. When you make a change, add a <target id='tn004'> specific to the page it's on and create the related TN Note on a separate page (#"dectective" changed to "detective" on Page 660:tn661#) | 
| Continuation Markers (this can be done with other checks) | Remove any continuation markers */*, #/# -- (Check that /* */, /$ $/, /p p/ & /# #/ markups match & are correct and on own lines.) - Delete cross-page /* /# /P | 
| Find Orphaned Brackets & Markup | Using Search & Replace/Find Orphaned Brackets, click "/* */" and hit "Search", correcting any mistakes the search routine finds. Correct broken /* markups but leave all other asterisks; they'll be fixed up later. Also try the regex (?<!/)\*(?!/) to search for orphaned asterisks. Do the same with /p and /# (you want to make sure that they're all in order before you do the following search and replace) | 
| Remove End of Line spaces | Fixup/Remove End of Line Spaces -- do this periodically | 
| First Pass | "Fixup" -> "Footnote Fixup" - "First Pass" button. Guiguts moves through the text identifying anything it thinks is a Footnote and its associated anchor. The pass will finish with the first footnote anchor highlighted in orange and the actual associated footnote in green. At the top of the Footnote pop-up, it shows how many footnotes are in the text. It may help to select "Unlimited Anchor Search" if your footnote markers are a long way away from the actual footnote text. | 
| Check Footnotes | Hit the "Check Footnotes" button. The pop-up lists all footnotes found. Footnotes marked in yellow have duplicate anchors; footnotes marked in pink have no anchors. Use the "Go to - #" footnote dropdown in the Footnote tool to fix any errors, then re-run First Pass & Check Footnotes to make sure all errors have been fixed. | 
| Check Footnote Count | "Fixup" -> "Run Word Frequency Routine" -> "Sort Alpha" -> "Re Run". Look for how many times the word "Footnote" occurs in the text. It should match the number of Footnotes found by the Footnote tool; if it doesn't, you have a missing Footnote somewhere. Use the features of Word Frequency lists mentioned in 2.6.3 to fix any errors. Re-run the Footnote "First Pass" step if you fix any errors. | 
| Step through all Footnotes | Use the "Next FN -->" & "<-- Last FN" buttons to step your way through the Footnotes Guiguts has found, fixing any errors and joining any multi-page footnotes. You can use the "See Anchor" & "See Footnote" buttons to make sure that the anchor is associated with the correct footnote text. If you fix a missing ending bracket problem, hit the "Adjust Bounds" button and Guiguts will find the new closing bracket. For each footnote, select the appropriate symbol style to use with the "Number", "Letter" or "Roman" buttons. You can use the "IMAGE" button to check the original symbol style if needed. Numbers are recommended if you have large numbers of footnotes. Don't worry about removing duplicate footnote symbols; this will be handled auto-magically in the next step. Guiguts depends on Footnotes being in the right format: [?] for markers and [Footnote ?: text] for the footnote. Make sure you correct all footnotes to this format. You can set manual anchors if you have lots of footnotes in [Footnote: ? text] format; see the Guiguts manual on how to handle those. | 
| Index Footnotes | Once you have stepped through and checked, adjusted, fixed and anchored all of your footnotes, hit the "Re Index" button. For out-of-line footnotes, this will renumber all of the footnotes using the same family of symbol that it had originally or a number if it had no anchor marker. This will close up any gaps in the numbers and remove duplicates. You can make changes and re index as often as you like. | 
| Check for inconsistent punctuation after markup | =. versus .=, and .</i> versus </i>. I search for the following regular expressions to check punctuation: =[^\w\s] and [^\w\s]= (try it with _ for italics and \* to see the asterisks; I think I saw some inconsistencies with the asterisks too). Also check brackets and punctuation. | 
| HTML Markup problems | "Fixup" -> "HTML Markup" -> "Find orphaned Markup" You can also use the Character Count feature of the Word Frequency function to see numbers of <'s & >'s and ( ) etc. | 
| Check for Tabs | Using Character Counts of Word Frequency | 
| Word Frequency - Em Dashes | Rarely finds stuff - Hit "Emdashes" - you will see a list of words that contain a hyphen, all words that are identical except that they DON'T have a hyphen, and the words that are identical except that they contain an emdash (two hyphens). Fix any words which are marked with em-dashes but should just be hyphens. This check will not find all occurrences of this error as it doesn't list all em-dashes. See the manual for more details. | 
| Word Frequency - Hyphens | VERY USEFUL!!! - Select "Alph"(abetical sort) & then Hit "Hyphens" You will see a list of hyphenated words from the project in alphabetical order. The number to the left represents how many times the word appears. If you scroll down the list you'll see non-hyphenated versions of the same word marked with *, resolve any conflicts by looking at page image, hit the 'See Image' button, if required. | 
| Word Frequency - Alpha-nums | Hit "Alpha/num" - will catch 1ine etc. Check consistency of dates. | 
| Word Frequency - Spelling | Hit "Spelling". You'll see an alphabetical list of words which occur in the text but are not in the dictionary. This is a useful check for resolving inconsistencies with proper names. Work through the list fixing any errors. This is just an initial check. A full spell check will be run shortly. | 
| Word Frequency - Bold/Italic | Hit "Ital/Bold Words" - You'll see a list of words and phrases which appear in <i> & <b> markers in the text & those which do not. This is a useful check for resolving inconsistencies with italics markers for abbreviations & journal titles. | 
| Word Frequency - Caps | Not so useful - Hit "ALL CAPS" - useful for checking you have all the CHAPTER headings. If you need to change the case of any text, just select it & use "Selection" -> "UPPERCASE Selection". The text will change to ALL CAPS. Other case changing features which are available from the "Selection" menu are "lowercase Selection", "Sentence case selection" and "Title Case Selection." | 
| Word Frequency - Mixed Caps | Not so useful - Hit "MiXeD CasE" - Useful for checking Chapter headings | 
| Word Frequency - Initial Caps | Hit "Initial Caps" This is useful for checking Proper Names. | 
| Word Frequency - Check Char Counts | Hit "Character Cnts" This is useful for weird characters & spotting missing brackets, (, ) counts should be the same for example. Mark-up mismatches for [] & <> are also easy to spot here too. Good way to catch tab characters that snuck through. | 
| Word Frequency - Check Upper | Hit "Check , Upper" This is useful for checking for , -> . scannos. If there's a lot of dialogue then this generates so many false positives it's not really much help | 
| Word Frequency - Check Lower | Hit "Check . Lower" This is useful for checking for . -> , scannos. Generally yields a manageable number of things to check (usually abbreviations in the middle of a sentence) | 
| Word Frequency - Accents | Hit "Check Accents" - Useful for resolving inconsistencies with accented characters; also good for catching cases where there are tabs. Works like the hyphen check. The Latin-1 chart available from "Help" -> "Latin-1 Chart" pops up a window which allows you to click on accented characters to easily add them to the text file. Characters are inserted in the text file at the cursor. I also use the Search & Replace Popup at this stage to check for ae & oe ligatures. I add them back into the .txt if I find they have been removed by the proofing rounds. I proof the oe ligature as [oe] so I can substitute it later on for an HTML entity in the HTML version. Use the "Check Accents" button in the word frequency window to see if downscaling the accents might cause problems, and if you spot any potential problems, generate an ascii file by hand and upload both ascii and latin1. (The sort of problems are things like "cañon" which will end up as "canon" instead of "canyon", or "coöperate" which will end up as "cooeperate", or an aligned table containing "Cæsar" that will stuff up the alignment when expanded to "Caesar".) For ascii/latin1 text we normally just put "coeur"; for html use "cœur". If there were lots of oe ligatures in the book, we might also generate a unicode text version (I never have, and a lone oe isn't important enough to justify a whole separate version). | 
| Check " and commas or periods | Check for ." followed by a space or a word, e.g. with the regexes ,'\w or \.'\w | 
| Check commas at end of paragraphs | ,\n\n -- also : or ; (Using Search & Replace set for regex) | 
| T all by itself | Scanno for I | 
| Mr. & Mrs. Dr. | Check that Mr. and Mrs. have periods: the regexes Mr[^.,^s] and Mrs[^.] work. ALSO try Mr\n and Mrs\n. Do the same for Dr. (Using Search & Replace set for regex) | 
| Regex Checks | "Search" -> "Stealth Scannos" -> use the Browser to go to the scannos directory in your winguts or guiguts folder. Select regex.rc (regex) and then later do en-commn (whole words), misspelled (whole words), others_w (not whole words), others_s (not regex), others_r (regex), scanno_w (full-words) , scanno_s (not whole word) and scanno_r (regex). | 
| Run Jeebies | From Fixup Menu | 
| Check quotes and commas | Search for space"space and space'space, and try out start/end of line combos too: regex ^'space and space'$, and the same for ". Also check for a comma or period with no space after it: [a-zA-Z],[a-zA-Z] or [a-zA-Z]\.[a-zA-Z] | 
| Check for spaces on first lines | ^(space) using Search and Replace regex | 
| Check initials | space between or not? [A-Z]\. [A-Z]\. Versus [A-Z]\.[A-Z]\. | 
| Check for double spaces | |
| Dashes and ellipses | Check space-- and --space, [A-Z,a-z] -- etc. This is good for TOC-type things and for proper names -- I think the program is mixing things up a bit on the proper names | 
| Check ellipses | For spacing | 
| Subscripts and Superscripts | Subscripts - an underline character _ and surrounding the text with curly braces { and } / Superscripts by inserting a single caret (^) followed by the superscripted text. If the superscript continues for more than one character, then surround the text with curly braces { and } as well. Check use throughout | 
| Check for no period at end of paragraph | [a-zA-Z]\n\n | 
| Check for spaced hyphens | hyphen followed by a space | 
| OE / AE | Check oe and ae diphthongs to make sure they didn't get mixed up | 
| Check i.e. i. e. | |
| Check for Greek use | And correct | 
| Spellcheck | "Search" -> "Spell Check" | 
| Check diphthongs | If the book has oe and ae diphthongs, double-check that they're used correctly (proofers sometimes mix them up) | 
| Move to Landing Zones | Once you have set landing zones for all your footnotes, hit "Move Footnotes To Landing Zone(s)" & your footnotes will be moved. You can now re-run the "First Pass" step using "Last FN" & "Next FN" buttons with the "See Anchor" & "See Footnote" buttons to make sure all are correct. | 
| Replace Page PNG info | Replace regex ^---*File: (\d\d\d\.png).*$ with // $1 to get something like this: // 019.png (try it out one at a time until you're sure it works OK) | 
| Gutcheck to catch common errors in the file, such as unbalanced quotes. | "Fixup" -> "Run Gutcheck" (selecting paragraphs of text & using the "Search" -> "Highlight double quotes in selection" or "Highlight single quotes in selection" is very useful in tracking down mismatched & wrongspaced quotes.). Don't look at line length. Make sure to try check lower. | 
| Check for triple hyphens | Make sure ---- isn't --- (can search for regex [^-]---[^-]) | 
| Check for dashes with spaces after | unclothed hyphens -- space-- or --space . Good to check all dashes anyway since some are words too and should have spaces, AND Guiguts seems to be cutting spaces before them at some point. Check for space-- and --space and --endofline and start of line | 
| Run pptext | |
| Fix Sidenotes | Step through sidenotes with: Search&Replace of [S, not regex, not whole word, ignore case. Click Search to find each Sidenote. Compare to page image. Move note above paragraph if feasible. Otherwise, position it above the sentence to which it applies, with blank lines to prevent rewrapping if you decide that is best. Remember in HTML to use span if more than one per paragraph | 
| Sidenote fix | Check *[Sidenotes to make sure I didn't delete any I shouldn't have.] | 
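Several of the regex checks in this section can also be run in one batch outside Guiguts. The following is only a minimal sketch under stated assumptions: the patterns are copied from the rows above, while the `CHECKS` table, its labels, and the `run_checks` function are hypothetical names made up for illustration. Python's `re` module is close to, but not identical with, the Perl regex dialect Guiguts uses, so treat the hits as pointers to review against the page images.

```python
import re

# Patterns copied from the checklist rows above; the labels are hypothetical.
CHECKS = {
    "orphaned asterisk": r"(?<!/)\*(?!/)",
    "comma ending paragraph": r",\n\n",
    "comma with no space": r"[a-zA-Z],[a-zA-Z]",
    "leading space": r"^ ",
    "triple hyphen": r"[^-]---[^-]",
}


def run_checks(text):
    """Return a sorted list of (line_number, label) for every match."""
    hits = []
    for label, pattern in CHECKS.items():
        for m in re.finditer(pattern, text, flags=re.MULTILINE):
            # Line number = newlines before the match start, plus one.
            line_no = text.count("\n", 0, m.start()) + 1
            hits.append((line_no, label))
    return sorted(hits)


if __name__ == "__main__":
    sample = "/*\nA poem line\n*/\nHe said,hello * there,\n\n Next---para.\n"
    for line_no, label in run_checks(sample):
        print(line_no, label)
```

The /* and */ markup lines are skipped by the lookbehind and lookahead, so only the stray asterisk is reported.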
| FPN Formatting | |
| Activity | Details | 
| Save again | With a new filename (do this regularly -- just incrementing the number -- e.g. school4-src.txt) | 
| Add comment re: book and date edited | e.g. two comment lines:
  // ppgen source school-src.txt for Knots Untied
  // last edit: 10 June 2014 | 
| Put in Book/Author near top of file | e.g. .dt The Project Gutenberg Book of Knots Untied: Or, Ways and By-ways in the Hidden Life of American Detectives, by George S. McWatters. Use caps for the first letters of major words (check this), add "edited by" if the book was edited, and don't make the title horribly long | 
| Set up Macros | Name and write up the formatting for the macros you identified that you'd need when you first paged through the book | 
| Page through the text and the images | Page through the book again in Guiguts, adding appropriate markup as you go. Don't worry about the front pages or ads at this point or even the TOC (unless you want to) | 
| Link up internal links | Add links to correct areas from within pages (eg. See Page 45 or In Chapter 5 etc.) | 
| Set for Not showing Page numbers | To mark page numbers as comments and not have them show, put .pn off and .pn link (each on its own line) near top of document | 
| Put in the Page advancement codes | Search and replace (regex): replace (// \d+\.png) with $1\n.pn +1. You'll need to remove the extra .pn +1s that appear for blank pages that are not to be numbered, and set start pages for pages that move from roman to arabic etc. or skip pages for some reason | 
| Set start for page numbers | Add .pn 1 or .pn v and .pn 340 etc. in the right spots where numbering starts or restarts (sometimes books have gaps -- pngs that are blank but aren't counted as page numbers). In those spots, get rid of the .pn +1 | 
| Chapter Starts | Add smcaps if needed for words at start of chapters. Check the page numbering before each chapter start and add newpg before it | 
| Link Plate/Page Numbers to correct areas | In text. In Hans (page ###) -- usually a ### followed by a ) or page ### or search the text in Firefox for Page and then look for Chapter and Figures etc. | 
| Check poems, closings | and check quotes and indents etc. around and in letters and closings and italics and whether a paragraph starts right after (or doesn't) | 
| Front Pages | Format them | 
| TOC | Do TOC and link page numbers to chapters | 
| Illustration TOC | Prepare TOC | 
| Illustration IDs | Put in id=i_586 or whatever into illustrations if you've got a TOC for illustrations. You can also use your regex to add a few other codes to get ready for finalizing the illustration, which you'll do when it's in smoothreading | 
| Do illustration TOC | Adding links etc. | 
| Run GG Fixup - HTML Markup | Find Some Orphaned Markup (it may act odd). And check using Word Frequency for number of > vs < | 
| Frontispiece Spacing | Should be four blank lines between frontispiece and title page (not two). | 
| Do Index (if needed) | http://www.pgdp.net/wiki/CSS_Cookbook/TOC_and_Index#Index_HTML http://www.pgdp.net/wiki/CSS_Cookbook/TOC/IX_regex | 
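The two page-separator replacements described in this checklist (turning the proofing "File:" separator into a // comment, then adding .pn +1 after each one) can be tried out on a copy of the file before doing them in Guiguts. This is a sketch only: the regexes come from the checklist, `convert_page_separators` is a hypothetical helper name, and the sample separator text is invented. Note that Python writes the backreference as `\1` where Guiguts uses `$1`.

```python
import re


def convert_page_separators(text):
    """Apply the two checklist replacements to a ppgen source string."""
    # Step 1: "-----File: 019.png---..." becomes "// 019.png".
    text = re.sub(r"^---*File: (\d\d\d\.png).*$", r"// \1", text,
                  flags=re.MULTILINE)
    # Step 2: add a page advancement line after each page comment.
    text = re.sub(r"(// \d+\.png)", r"\1\n.pn +1", text)
    return text


if __name__ == "__main__":
    sample = (
        "-----File: 019.png---proofer names---\n"
        "Some text.\n"
        "-----File: 020.png---proofer names---\n"
        "More text.\n"
    )
    print(convert_page_separators(sample))
```

The surplus .pn +1 lines for unnumbered blank pages, and the explicit .pn starts, still need to be handled by hand as described above.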
| After FPN Generation | |
| Activity | Details | 
| Run Generator | python d:\dp\tools\ppgen.py -i knots13-src.txt (if you're going to use the online generator, then make sure to call it school.zip).
  For a text-only version use: python d:\dp\tools\fpn.py -i knots10-src.txt -o t
  For a line by line output try: python d:\dp\tools\ppgen.py -i knots12-src.txt -d a | 
| Check Spacing and look (book specific) and sidenotes for headings | Throughout book -- especially around letters for text and HTML versions | 
| Check HTML versus HTML Tidy | And fix source file, regenerating as needed | 
| Run ppvtxt tool | On text version run ppvtxt on the Fixup Menu | 
| Run pphtml | On html version from HTML Generator menu | 
| Check Grammar and Spelling | Use Word on a separate version of the text or html file so there's no chance of having it change something inadvertently | 
| Remove end of line spaces | |
| Re-run Gutcheck | on text version to check for line length | 
| Smooth Reading | Submit for Smooth Reading | 
| Prepare Illustrations | Illustrations should be 700px (see http://www.pgdp.net/wiki/Guide_to_Image_Processing#Image_Display_Dimensions:_General_Guidelines guidelines). | 
| put in Caption Stylesheet info if needed | near top of
  file -- .de .figcenter p { font-family:sans-serif;\ font-size:smaller; } | 
| Check footnote/sidenote consistency | periods, commas, etc. | 
| TN Fixup | Fix up TN with UL and <li>s for html and - for text and add text about the page numbers being in the html source. You can get the HTML part of the TN (to add <li> etc. after generating the HTML) | 
| Specific to book | Change [=i] etc. for html if needed. Fix Beehive Hut image ALT | 
| Cover | Add cover
  to final copy of HTML plus update link to it. .if h .il fn=cover.jpg w=500px alt="Book Cover" .ca "Transcriber's Note: The cover image was created by the transcriber and is placed in the public domain." .pb .. | 
| Optimize JPGs | using jpegoptim | 
| Final Checks | |
| Activity | Details | 
| Check Smoothreading results | And fix source, updating TN as needed | 
| Links | Manually check all links including footnotes and TN (and for TN, check that the changes were actually made!) | 
| Check HTML | Look at HTML -- check margins stay same throughout and that all the main formatting is as expected (page through book) | 
| Check Text | Look at Text -- check that all the main formatting is as expected (page through book). I'm updating text only now (and renamed file so if I do regen I won't copy over) | 
| Fix up Odds and Ends | In HTML and Text. Correct Tables etc. And run the check for long and short lines again for text and end of line spaces | 
| Check pptext | One more time | 
| Check pphtml | One more time | 
| Check links | "Fixup" -> "HTML Fixup" -> "Link Checker" (on HTML Version) | 
| CHECK SPACING | IN TEXT VERSION | 
| Rerun previous checks | HTML tidy, etc. | 
| No CSS | Ensure HTML is basically readable without CSS enabled | 
| Final check on HTML | http://validator.w3.org/ | 
| Double-check links | http://validator.w3.org/checklink | 
| Check and validate css | http://jigsaw.w3.org/css-validator/ | 
| Epub | Check epub and correct issues (after zipping the html file and images) http://epubmaker.pglaf.org/index.php | 
| Final Cleanup | |
| Activity | Details | 
| Check for surplus images | Use Guiguts .2.4 - HTML - Link Check | 
| Check Page #s | Check that Page Numbers in Comments match real ones | 
| TOC | Double-check TOC for alignment and punctuation in Text and HTML | 
| Images | Check image sizes using ppvimage.pl - 700 max inline, 1200 max otherwise (100k max for inline, 200k max otherwise). Also check using Firefox that nothing has been incorrectly resized | 
| Final Checks on TXT | Spaces at end of lines and long and short lines Regex ^.{75,}. Check all blockquoted text such as footnotes are consistent. AND check \n\n\n etc. | 
| Final Checks on HTML | Look over in Firefox and IE and rerun checkers | 
| Name files correctly | |
| Make Zip files | .txt, and html and images directory. All lower case. NO .bins and NO other files from image directory (include bin file if you don't have DU yet) | 
| Upload Zips via PP Page | If you don't have DU |
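The "Final Checks on TXT" row can be approximated with a short script run over the finished text file. This is a hedged sketch, not part of Guiguts or ppgen: `final_txt_checks` is a hypothetical helper, the long-line test reuses the checklist's ^.{75,} regex, and trailing whitespace is detected by comparing each line with its right-stripped form. Short lines and runs of blank lines still need to be checked by eye.

```python
import re

LONG_LINE = re.compile(r"^.{75,}")  # regex taken from the checklist


def final_txt_checks(text):
    """Return (line_number, problem) pairs for long lines and trailing spaces."""
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        if LONG_LINE.match(line):
            problems.append((line_no, "long line"))
        if line != line.rstrip():
            problems.append((line_no, "trailing space"))
    return problems


if __name__ == "__main__":
    sample = "short line\n" + "x" * 75 + "\ntrailing \n"
    for line_no, problem in final_txt_checks(sample):
        print(line_no, problem)
```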