Talk:Guiguts PP Process Checklist

I changed the -exec flag to -ok in the find command, so that find asks for confirmation before running each rm. This should guard against unwanted deletions if someone mistypes the command. rm is deadly in the hands of the unknowing. 8-]
vls 09:26, 2 September 2006 (PDT)

Check for misspelled words

Point 9 says - Apply scanno searching based on misspelled.rc. Work through the list.

The same can be achieved much more quickly using the "Stealtho check" from the word frequency screen.

wymannmi 21:22, 1 May 2008

Good words, spell check, word frequency

Another time saver is to:

paste the Good words list at the beginning of the text so these are spell-checked before the rest of the text, adding them to the Project dictionary
11. Apply Spellcheck, adding words to Project dictionary, ignoring, or fixing as needed
delete Good words list from text
now perform 8. Apply Word-Frequency Checks -- and you'll spend less time "looking for oddities and obvious misspellings."

Tintazul 13:45, 20 December 2008 (PST)

Character coding: plain text, smooth reading

At the end of section 20. Determine character coding, it says:

Pure-ASCII etext bookname-asc and optional Latin-1 bookname-lt1 and bookname-utf8 are ready to upload!

No, they're not, if you are a Unix or Mac user. How about noting the end-of-line conversion that 'nixers and Mac'ers need to do? It's just a "unix2dos *.txt" away.

Also, how about a note here saying, "you may make your project available for smooth reading at this point if you want to," explaining why one should, how it's done, which of the text versions we should be offering, giving tips about the time allowance (7 days? 15?) etc.? In fact, I'd make it a whole point and not a section of another. Tintazul 11:02, 22 December 2008 (PST)
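For anyone without unix2dos at hand, the end-of-line conversion mentioned above can be sketched in a few lines of Python. This is only a rough equivalent, not the tool itself: the function name to_crlf is made up for illustration, and it assumes the file is small enough to read into memory in one go.

```python
def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending converted to CRLF.

    Existing CRLF pairs and bare CR (old Mac style) are first
    normalized to LF so that nothing ends up as CR CR LF.
    """
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.replace(b"\n", b"\r\n")


if __name__ == "__main__":
    # Mixed endings in, uniform CRLF out.
    print(to_crlf(b"line one\nline two\r\nline three\r"))
```

To convert a file, read it in binary mode, pass the bytes through to_crlf, and write the result back in binary mode, so that Python does not translate the line endings a second time.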

Thought breaks marked as unbalanced

Item 15. Check balanced markup will mark all occurrences of <tb> as unbalanced markup (opening tags without corresponding closing tags). Perhaps the instructions for converting <tb> should be moved before this item? --Tintazul 14:42, 28 January 2009 (PST)

HTML edition: HTML validator, CSS validator

Under item 21. Prepare HTML Edition maybe this should be added:

HTML validation: warning about the byte-order mark in UTF-8 versions. I use Linux, and apparently Guiguts (?) adds a byte-order mark (BOM) at the beginning of the file. The validator reports this only as a warning, but my PPVer told me to get rid of it. This is tricky because the BOM is an invisible character in most editing software. Here's how: I can open the file in Emacs and see the BOM plainly. Remove the stray character and we're good to go (remember to re-upload the file into the Validator and ensure a clean pass before sending it to PPV again).

CSS validation: a PPVer told me that my CSS wasn't valid. We should add to the list of things to do, after validating the (X)HTML: validate the CSS with the W3C CSS Validation Service. -- Tintazul 14:35, 5 February 2009 (PST)
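For those without Emacs, the BOM removal described above could also be done with a short script. The following is a minimal sketch that assumes the file is UTF-8 and that the BOM, if present, is the first three bytes; the function name strip_utf8_bom is invented for this example.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if any."""
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + b"<!DOCTYPE html>"
    print(strip_utf8_bom(sample))
```

Read and write the HTML file in binary mode so that nothing else in it is touched, then re-run the validator on the result.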

Step 1: lowercase filename

In Step 1, where it reads Text to bookname.txt, it should be made more explicit that the filename should be all lowercase. "bookname" alone doesn't convey this; after all, book names have uppercase letters. So, it's "warandpeace.txt", not "WarAndPeace.txt" or any other variant. -- Tintazul 14:51, 5 February 2009 (PST)

Blank Line Removal

Step 3 or perhaps top of 7:

Add an item stating that [Blank Page] indicators should be removed, e.g. with a regex search: search for '\n\[Blank Page\]\n' and replace with '\n'.

I overlooked this on my first project as it was not in this checklist.
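Outside Guiguts, the same search and replace can be tried in Python. This is a sketch using the pattern quoted above as-is; the function name remove_blank_page_markers is only illustrative.

```python
import re

# Same pattern as the Guiguts regex search suggested above.
BLANK_PAGE = re.compile(r"\n\[Blank Page\]\n")


def remove_blank_page_markers(text: str) -> str:
    """Replace each [Blank Page] line, with its newlines, by one newline."""
    return BLANK_PAGE.sub("\n", text)


if __name__ == "__main__":
    print(remove_blank_page_markers("end of page\n[Blank Page]\nnext page"))
```

One thing to watch for: two markers on directly adjacent lines share a newline, so a single pass removes only the first of them and the replace has to be run again.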

The ligatures in text files

It should be mentioned that the square brackets should be removed from around ligatures for the text files, e.g. Ph[oe]be should become Phoebe in the text files.

Perhaps step 17 or 20.

Another item I wish I had known about from my earlier PP projects.
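As an illustration of the bracket removal described above, here is a small Python sketch. It assumes that only the [oe] and [ae] ligatures (in any capitalization) need unbracketing and deliberately leaves every other bracketed markup alone; the function name unbracket_ligatures is invented for the example.

```python
import re

# Assumption: only the oe and ae ligatures are handled here.
LIGATURE = re.compile(r"\[(ae|oe)\]", re.IGNORECASE)


def unbracket_ligatures(text: str) -> str:
    """Drop the square brackets around [oe]/[ae], keeping the letters."""
    return LIGATURE.sub(r"\1", text)


if __name__ == "__main__":
    print(unbracket_ligatures("Ph[oe]be read the Encyclop[ae]dia."))
```

Other bracketed items, such as [Illustration] or footnote anchors, do not match the pattern and pass through unchanged.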

Step 17: Transliterate Greek text in the text versions

I've been told that Greek text should be transliterated and enclosed in equal signs in the text files.

Step 20: Lt1 file character coding

A regex search for '[\x{0100}-\x{ffff}]', which finds characters outside the Latin-1 range, should be done at least on the lt1 version and perhaps on all three versions of the text file as well as the html version. (I've made a practice of correcting them for all versions in my PPV submissions.) For some odd reason this is not in the regex scanno file nor in GutCheck.

Caught by the Smooth Reader facilitator. Apparently she performs this check on all SR submissions.
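The same check can be run outside Guiguts. The sketch below is a Python rendering of the regex above (Python writes \x{0100} as \u0100); it reports the line, column and character of each hit. The function name find_non_latin1 is made up, and like the original regex it does not look beyond U+FFFF.

```python
import re

# Python spelling of the Guiguts pattern [\x{0100}-\x{ffff}].
NON_LATIN1 = re.compile("[\u0100-\uffff]")


def find_non_latin1(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, character) for each character above U+00FF.

    Lines and columns are counted from 1.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in NON_LATIN1.finditer(line):
            hits.append((line_no, match.start() + 1, match.group()))
    return hits


if __name__ == "__main__":
    for line_no, col, char in find_non_latin1("caf\u00e9\n\u0153uvre"):
        print(f"line {line_no}, column {col}: U+{ord(char):04X}")
```

An empty result means every character fits in Latin-1; anything reported needs to be replaced or transliterated in the lt1 file.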

Step 23 -- Cover Images

A discussion of how to code the html file for cover images should be included, or at least links to [1] and [2].

Step 21 -- Tables of Contents and Indexes

The advice "Tables of Contents and Indexes, which are best formatted using unsigned lists, rather than the markup Guiguts generates for /$..$/" should probably be changed. The page numbers that Guiguts adds in the middle of lists cause dozens of W3C Validate HTML errors for anyone who follows this advice.

Tidy Up Footnotes before rewrapping?

Referring to Section 2.3 ("Rewrap and Clear Rewrap Markers" for the text file), it might be better to do "Tidy Up Footnotes" before rewrapping. Doing it after rewrapping means the first line of every footnote is left shorter than it need be when the "Footnote: " string is removed. Or am I missing some good reason to do it in the order listed? Noyster (talk) 06:33, 30 April 2022 (EDT)