GUIGUTS VERSION 2 MANUAL
Describes features included in release 2.0.3 (September 2025)
Tools Menu
The Tools menu is used to find and fix errors in the early stages of post-processing, and to help arrange the text into a standard layout that can be used later on to prepare both the Plain Text and HTML versions of the book. Many of the tools work by creating a "Checker" dialog that lists potential problems and may provide help in correcting them. The common features of these dialogs are described in the section below.
Checker Dialogs
Many tools display a customized "Checker" dialog, which all have the same basic layout and underlying features.
All Checker dialogs have a list of messages, which contain the output from the tool. These messages typically start with the line and column that the tool is querying, e.g. "71.29" for something that occurs on line 71, beginning at column 29. A simple left-click on such a message will take you to that location in the file, whereas a right-click will remove that message from the list and show you the location of the next message. In many tools, a part of the message will be highlighted, which is usually the key word or phrase that is being queried. When the message is clicked, that word or phrase will be highlighted in the main text. You can also select the previous or next message using the up and down arrow buttons above the list.
When focused on the checker dialog, you can type a single character, e.g. "G" to jump to the first entry in the list beginning with "G". Or you can type part or all of a word, in a similar way to Windows File Explorer or macOS Finder. If several characters are typed with no delay, they are treated as a search term. After a delay, the search term is reset. The default is to search the beginning of the message, or by enabling "Full Search", to search anywhere in the message. Example: run pptext, then type "ellipsis" in the dialog to jump to the ellipsis test results.
If you hold down the Ctrl key (Cmd key on Macs) when you left- or right-click, some tools will attempt to fix the problem (or process it in some way) in addition to the usual left- or right-click behavior. Holding down the Shift key means "do the same to all the matching messages in the list", usually meaning that the planned operation (remove or fix) will not only be executed on the clicked message, but also on all other messages that are identical. The exact details of which key combinations are supported, and what each tool does, are displayed in a tooltip if the mouse pointer is hovered over the list of messages.
Checker dialogs also have buttons that perform the above operations on whichever message is selected. This means that it is not necessary to remember or use Ctrl, Cmd or Shift keys or right-clicks. Instead, just left-click a message to select it, then use the Hide, Hide All, Fix, Fix All, Fix & Hide, or Fix & Hide All button to perform the operation you want. If a button is not there, that operation is not supported for that particular tool.
If you want to copy the list of messages, for example into another editor, or to send to another person, or add to a forum post, click the Copy Results button at the top of the dialog. Alternatively press Ctrl+a (Cmd+a on Macs) with the focus in the dialog to select all the messages, and Ctrl+c (Cmd+c) to copy to the clipboard. After using either of these methods, you can paste the list of messages elsewhere in the usual way.
At the top of a checker dialog, a label states how many messages are being displayed. For some tools, this label will also state how many of the messages are "Suspects", meaning that the tool suspects that some attention from the user may be needed. An example of this is in Footnote Fixup, where all footnotes are listed, but continuation footnotes are flagged as suspects. The "Suspects Only" checkbox allows you to hide all the other messages and only display the suspects. Also in the top section, two radio buttons allow the messages to be sorted either according to their order in the main text file, i.e. by line and column, or to be sorted alphabetically (or by type of error), which may be useful for some tools. Finally, a Re-run button in the top right corner allows you to re-run the tool at any point.
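The two sort orders can be pictured with a short Python sketch. It is illustrative only and is not Guiguts code: the function names are hypothetical, the sample messages are made up, and it assumes every message starts with a "line.column" prefix such as "71.29". The point it shows is that positions need to be compared as numbers, since a plain text sort would put "100.1" before "71.29".

```python
import re

# Assumed message shape: "LINE.COLUMN text", e.g. "71.29 Spaced open bracket".
_LOCATION = re.compile(r"^(\d+)\.(\d+)\s+(.*)$")


def parse_message(message: str) -> tuple[int, int, str]:
    """Split a checker message into (line, column, text)."""
    match = _LOCATION.match(message)
    if match is None:
        raise ValueError(f"no line.column prefix in {message!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


def sort_by_position(messages: list[str]) -> list[str]:
    """Order messages by line, then column, compared as numbers."""
    return sorted(messages, key=lambda m: parse_message(m)[:2])


def sort_alphabetically(messages: list[str]) -> list[str]:
    """Order messages by their text (ignoring case), then by position."""
    def key(message: str) -> tuple[str, int, int]:
        line, column, text = parse_message(message)
        return (text.lower(), line, column)
    return sorted(messages, key=key)


# Made-up sample messages for illustration.
messages = [
    "100.1 Spaced open bracket",
    "71.29 Trailing space",
    "9.4 Spaced open bracket",
]
print(sort_by_position(messages))
print(sort_alphabetically(messages))
```

Sorting alphabetically groups messages of the same kind together, which is why that order can be convenient when working through one type of problem at a time.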
Some tools have additional buttons and fields specific to that tool below the top row of controls. These are explained in the tool-specific sections below.
Summary:
- Left click - select and jump to the relevant place in the text
- Right click - remove message from list
- Ctrl/Cmd + left click - fix the problem indicated in the message
- Ctrl/Cmd + right click - fix the problem and remove the message
- Shift + right click - remove all matching messages from the list
- Shift+ Ctrl/Cmd + left click - fix the problem indicated in all matching messages
- Shift+ Ctrl/Cmd + right click - fix the problem and remove all matching messages
- <Home> (Cmd+Up on Macs) - go to first entry in the checker dialog
- <End> (Cmd+Down on Macs) - go to last entry in the checker dialog
One way these dialogs can be used conveniently is to right-click on a message if you do not want to make the suggested change, and Ctrl (or Cmd on Macs) right-click on it if you do want to make the change. Each time you do this, the next message in the list will be selected and the relevant place in the file will be displayed.
View Options
Some checker dialogs have a View Options button that controls which messages are shown and which are hidden. Clicking the View Options button will display the View Options dialog, which lists different message types. Only types that are checked will be displayed in the main dialog. You can also turn on or off all checkboxes, by using Show All or Hide All. A final checkbox "Gray out options with no matches" will disable any checkboxes where there are no messages that would be shown or hidden by toggling that checkbox.
Next to the View Options button are the Prev. Option and Next Option buttons, which make it easier to view messages one type at a time. Using Next Option will select the next available view option, and display messages of that type. Prev. Option goes to the previous view option. Both buttons respect the setting of "Gray out options with no matches".
Basic Fixup
- SCREENSHOT: Basic Fixup dialog
This tool finds a variety of small errors, for example "Spaced open bracket" and displays them in a checker dialog. The errors can then be automatically fixed by holding down the Ctrl key (or Cmd on Macs) while clicking on the message.
Basic Fixup View Options
To restrict the view to one type of error, use the View Options button, which will show this list:
- SCREENSHOT - Basic Fixup View Options
Using the checkboxes, you can then work on one type of error at a time, if that is your preferred method.
Word Frequency
Word Frequency uses a variant of the checker dialog to display a report on all words in the book and how often they occur. It can then look for various kinds of possible errors and inconsistencies.
On the very top row, a label indicates how many words of chosen type are being displayed.
The list can be sorted alphabetically, or by frequency of use, or by word-length. To change the sort order, click one of the three radio-buttons Alph, Freq, or Len.
Just below the Re-run button, the "Ignore Case" checkbox will cause the Word Frequency list to be regenerated, but ignoring the difference between upper and lower case.
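As a rough illustration of what the three sort orders and "Ignore Case" mean, here is a minimal Python sketch. It is not how Guiguts itself is implemented: the function names, the definition of a "word" (letters with internal apostrophes or hyphens) and the choice of longest-first for the length sort are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a word is a run of letters, optionally joined by ' or -.
_WORD = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def word_counts(text: str, ignore_case: bool = False) -> Counter:
    """Count how often each word occurs in the text."""
    if ignore_case:
        text = text.lower()
    return Counter(_WORD.findall(text))


def sorted_words(counts: Counter, order: str = "alph") -> list[tuple[str, int]]:
    """Return (word, count) pairs sorted by 'alph', 'freq' or 'len'."""
    items = counts.items()
    if order == "freq":   # most frequent first, ties alphabetical
        return sorted(items, key=lambda kv: (-kv[1], kv[0]))
    if order == "len":    # longest first (an assumption), ties alphabetical
        return sorted(items, key=lambda kv: (-len(kv[0]), kv[0]))
    return sorted(items, key=lambda kv: (kv[0].lower(), kv[0]))


sample = "The cat saw the other cat. The end."
print(sorted_words(word_counts(sample), "alph"))
print(sorted_words(word_counts(sample, ignore_case=True), "freq"))
```

With case respected, "The" and "the" are counted separately; with `ignore_case=True` they merge into a single entry with a count of 3.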
A collection of radio buttons allows you to display certain types of word or word combinations. These are detailed below. In some displays Guiguts identifies "suspects," items that might be errors. These are marked with four asterisks. The Suspects Only switch causes the display to show only suspects, and using it may produce shorter lists in some cases.
When you click a word in the list, Guiguts searches for the first or next occurrence of that word in the document and scrolls to it. Keep clicking the word to scan all uses of it. If you want to start searching from the top of the file again, click a different word, then return to the word you are searching for. Control-click (Command-click on Macs) a word in the list to load that word into the Search Text field of the Search & Replace dialog.
When focused on the WF dialog, you can type a single character, e.g. "G" to jump to the first entry in the list beginning with "G". Or you can type part or all of a word, in a similar way to Windows File Explorer or macOS Finder. If several characters are typed with no delay, they are treated as a search term. After a delay, the search term is reset.
Display Type Radio Buttons
Each of the radio buttons gives a different way to process and display the data.
Starting in the upper-left, they are:
All Words | Displays the full list of words found in the file. |
Diacritics/æ/œ | Displays all words that include an accented character or ae or oe ligatures. A word that is the same except for the special character is displayed as a suspect. Use to check for inconsistent use of accents and ligatures. |
Ligatures | This lists all words containing ligatures (œ Œ æ Æ) and words that might contain ligatures (e.g., "Caesar"). You may want words to be spelled consistently, even though the books didn't always manage to do so. |
ALL CAPITALS | Displays all words and hyphenated phrases spelled entirely in capital letters. Inactive if Ignore Case is enabled. |
MiXeD CasE | Displays all words and hyphenated phrases that include both a lowercase and a capital letter in the non-initial position. Use to find OCR errors that mis-capitalize c/C, o/O, s/S, u/U, v/V. Inactive if Ignore Case is enabled. |
Initial Capitals | Displays all words and hyphenated phrases that start with a single capital letter. Inactive if Ignore Case is enabled. |
Emdashes | Displays all phrases that include an emdash (two hyphens). If an identical phrase having only a single hyphen exists, it is displayed as a suspect. |
Hyphens | Displays all hyphenated phrases. A word that duplicates a hyphenated phrase ("flash-light" and "flashlight"), or a pair of words connected by an em-dash ("flash--light"), or optionally (using Include "Two Word" Matches checkbox) a pair of words separated by a space ("flash light") is displayed as a suspect. Use to find inconsistent hyphenation of words, particularly at ends of lines. |
Alpha/num | Displays all words and hyphenated phrases that contain a mix of alphabetic and numeric characters. Use to find one/ell and oh/zero errors. |
Ital/Bold/SC/etc | Displays all words and phrases (default is up to four words) that are enclosed in italic, bold, small-caps, f, g, u, cite, em or strong markup; and all matching words or phrases that are not so marked. Use to find inconsistent markup. Using the field next to this button you can change the maximum number of words in a phrase. |
Character Counts | Counts all character values in the document and displays the list. If Sort Alpha is checked, the list is sorted by character; otherwise it is sorted by count, most-used first. Used to check for non-ASCII character use and for equal counts of matching brackets and parens. |
Regular Expression | You can construct your own regular expressions in the box just to the right of this button, then click the button to list just those words that match your regular expression. "and" (no quotes) will find "and" but also "hand", "abandon", etc. By indicating word boundaries, "\band\b" will find only "and". Note that Word Frequency regular expressions operate on the "All Words" list, and are not equivalent to a regular expression in the Search & Replace dialog, which might find matches that include more than one word or include punctuation. Further discussion and examples are in THIS Forum discussion. |
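The difference between "and" and "\band\b" in the Regular Expression row can be tried out with Python's standard `re` module. This is only a sketch: the helper name is hypothetical, and the regex engine used by Guiguts may differ from `re` in some details.

```python
import re


def matching_words(pattern: str, words: list[str]) -> list[str]:
    """Return the words in which the pattern matches anywhere."""
    regex = re.compile(pattern)
    return [word for word in words if regex.search(word)]


# A tiny stand-in for the "All Words" list.
words = ["and", "hand", "abandon", "band"]

print(matching_words("and", words))       # every word contains "and"
print(matching_words(r"\band\b", words))  # only the whole word "and"
```

Because each pattern is tested against one word at a time, a pattern that spans a space or punctuation between words will not match anything here, which mirrors the note about the Search & Replace dialog.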
Bookloupe
Using this option is an essential part of post-processing, and you probably will use it at least twice while preparing the text: once at the end of preparing the "common" text that will be used by both the Plain Text and HTML versions, and again when (you think) the Plain Text version is done.
Bookloupe is an updated version of and replacement for the original Gutcheck program. Both will scan a text file looking for many common proofreading errors that the other tools do not find. When your finished project is submitted for publication to Project Gutenberg, it will be checked in several ways; one of them is with Bookloupe.
Running Bookloupe displays the results of its analysis in a checker dialog.
Bookloupe View Options
The diagnostics list can easily be overwhelmingly long, and, when sorted by line number, the 40+ kinds of diagnostics are all jumbled together, making what should be an invaluable resource difficult to use.
The solution is to click the View Options button, which will show this list:
At the bottom of the dialog, click Hide All. Every option will be selected to "Hide" and the diagnostics list will disappear. Now, click the first option in the View Options dialog to unhide its items in the diagnostics list; in many cases, there won't be any, and the diagnostics area won't change. Continue unhiding the options, one at a time, until some messages do appear. Then, click the first one and the document window will scroll to that line and highlight the possible error.
If it is an error, correct it; otherwise, right-click the diagnostic line to dismiss it and move to the next (if any) diagnostic within the same option. When there are no more of them, unhide the next option, and repeat this until you've worked your way through the entire View Options list.
Some options won't be relevant at the stage you're currently processing, so you can just skip them. For example, until you have re-wrapped the Plain Text, there may be long or short lines that won't be long or short later on.
Spelling
The Spelling check is best applied towards the end of the first phase of post-processing, after removing page separators, running Jeebies, Word Frequency, and checking for Stealth scannos, as these steps remove many trivial mistakes that would turn up as spelling errors.
The Spelling menu entry displays in a checker dialog any words that it does not find in any of the following dictionaries:
- the dictionary supplied with Guiguts (or dictionaries for multi-language texts);
- the user's dictionary (or dictionaries) for the relevant language(s), which is in the GGPrefs directory;
- the project dictionary, which is specific to the project you are working on.
If you have the good/bad words files for the project, you might want to add these to the project dictionary before you run the spelling check, but first you might also want to check that the good words list doesn't contain incorrect words. You will find the button to do this in the File-->Project menu.
Each suspect spelling is listed, followed by the number of times that word occurs in the text, e.g. "88.32 marche (3)" indicates that the word "marche" appears on line 88 at column 32, and that it occurs 3 times in total in the text. If "marche" had been added to the project dictionary as a bad word, using the File-->Project menu, it would also have 3 asterisks following the list entry.
At the top of the dialog the Threshold entry field tells the spelling checker to ignore words that occur in the file more than the given number of times. The default is 4, meaning that if "marche" appeared more than 4 times in the file, it would not be reported as a suspect word. This is useful to avoid some false positives, such as characters' names - the appropriate threshold setting may depend on the quality of the text. A text with many spelling errors may need a higher threshold. Setting the Threshold to zero disables this feature completely, meaning that words will be reported regardless of how many times they appear in the file. Your preferred threshold setting will be saved for next time you start Guiguts.
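The Threshold rule can be summarized in a few lines of Python. This is a sketch of the rule as described above, not the spelling checker's actual code; the function name and the sample words (apart from "marche") are invented for the example.

```python
from collections import Counter


def words_to_report(unknown_words: list[str], threshold: int = 4) -> list[tuple[str, int]]:
    """Return (word, count) for unknown words that should be listed.

    A word is skipped when it occurs more than `threshold` times.
    A threshold of 0 switches the filter off.
    """
    counts = Counter(unknown_words)  # keeps first-seen order
    return [
        (word, count)
        for word, count in counts.items()
        if threshold == 0 or count <= threshold
    ]


# Invented sample: one word seen 3 times, a name seen 12 times, one typo.
sample = ["marche"] * 3 + ["Dorothea"] * 12 + ["teh"]

print(words_to_report(sample))               # the frequent name is skipped
print(words_to_report(sample, threshold=0))  # everything is reported
```

The frequent name drops out at the default threshold of 4, while a rarely occurring word such as "marche" is still listed for review.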
- Add to Global Dict will add the current word to the global dictionary for the main language. The word will no longer be reported as a spelling error in any future projects - use with caution. This dictionary is named dict_en_user.txt (for English) and is created in the GGPrefs directory. It is a simple text file, so can easily be edited if necessary. (Shortcut: Cmd/Ctrl+A)
- Add to Project Dict (or Cmd/Ctrl+right-click) adds the current word to the project dictionary, which is a file called project_dict.json in the current folder. (Shortcut: Cmd/Ctrl+P)
- Skip (or right-click) deletes the highlighted entry from the list and moves to the next entry. Note that it does not add the word to the Project Dictionary, so, if you use "Run Checks" again, the word will appear in the refreshed list. (Shortcut: Cmd/Ctrl+S)
- Skip All (or Shift-right-click) deletes all queries about the current word from the list and moves to the next entry. Note that it does not add the word to the Project Dictionary, so, if you use "Run Checks" again, the word will appear in the refreshed list. If you do not want the word to appear again when spell checking this project, use Add to Project Dict instead. (Shortcut: Cmd/Ctrl+I)
Additional language dictionaries
- Go to the Guiguts supplementary materials page.
- Find the language you want - note that English, Dutch, French, German, Portuguese and Spanish are included with the Guiguts release anyway.
- Download the relevant zip file.
- Unzip the file - it will contain a file named dict_LANG_user.txt, where LANG is the two letter language code.
- Copy the dict_LANG_user.txt file to your GGPrefs folder.
- Restart Guiguts and ensure the project language is set correctly to pick up the new dictionary.
Jeebies
Jeebies examines an English text trying to find scanning errors that have replaced be with he or vice versa. Such "scannos" are both common and hard to find. When Jeebies completes, its report is displayed in a checker dialog:
The report identifies lines where the use of he and be suggest possible errors.
- Ctrl+left-click (Cmd+left-click on Macs) will make the suggested change for you (change he to be or vice versa).
- Ctrl+right-click (Cmd+right-click on Macs) will make the suggested change and remove the suggestion from the report.
Most or all of the items Jeebies identifies will already be correct and will not need changing, as modern image-to-text programs are very good and our proofreaders are even better.
The three radio buttons at the top of the dialog, "Paranoid", "Normal", and "Tolerant", control how sensitive Jeebies will be to possible errors. It's usually best to use "Paranoid", but if it reports an overwhelming number of possible errors, try one of the other options and click Re-run Jeebies to see if it finds any actual errors. If so, it's likely that more errors will be lurking in the "Paranoid" list.
Stealth Scannos
Scanno searching makes it practical to search for and correct many common OCR errors. Its reports are displayed in a checker dialog:
Three scannos files are distributed with Guiguts. They are found in its "data/scannos" folder:
- en-common.json - Several dozen stealth scannos often found in English text, such as "arid" for "and." These are particularly useful because the scannos are valid words and so would not be reported as errors by the spell checker.
- misspelled.json - A file with over 3000 scannos that may be found in English text. Unlike en-common.json, these scannos are not generally valid English words so may also be reported by a spell check. These words do not have an automatic replacement string - you will need to correct any errors yourself.
- regex.json - A file with a few dozen sophisticated regular expressions designed to find common errors. Some of these errors are also reported by other checking tools.
In the Stealth Scannos Results dialog that is opened when the Stealth Scannos Check is run, the dropdown menu at the top of the dialog allows you to choose from all the scannos files that Guiguts knows about, including your own personal scannos file(s) if you have created any - see next section. If you use a scannos file it is moved to the top of the list in the dropdown menu, so the most recently used scannos files are always near the top of the list.
When the dialog is opened, the scannos file at the top of the list is selected and its first entry displayed. (If that scannos file is not the one you want, simply select another.) The displayed entry will include the hint if there is one. If that word or regular expression doesn't appear in your project, the list below will be empty and the label "0 Entries" will be displayed in the top left corner. Once you have finished dealing with that entry (or if there are no occurrences), click the Next button next to the "match" field. Stealth Scannos will now check through each scanno in the current scannos file in turn, skipping over any where there are no occurrences in your project, until it finds one where there are some occurrences to display. After dealing with those, click Next again, or Prev to return to the previous scanno that has occurrences to display. This is similar, but not identical, in behavior to Auto Advance in Stealth Scannos in Guiguts 1.
As with other checker dialogs, a left-click in the list at the bottom will take you to the occurrence of the potential problem, and right-click in the list will remove that line from the list. If the current scanno being displayed at the top of the dialog has a Replacement, then you can click the Replace button to make that change. Alternatively you can Ctrl+left-click (Cmd+left-click on Macs) on any entry in the list to make the suggested change at that location. If you use Ctrl+right-click (Cmd+right-click on Macs) instead, the suggested change will be made, and the line will also be removed from the list. If you want to do the same operation on all the lines (remove and/or correct), you can hold the Shift key as well as the above key-mouse combinations. Thus Shift+Ctrl+left-click is equivalent to the Replace All button or the Fix All button.
If you Shift-click the Replace button, the Search and Replace dialog will be popped, pre-populated with the appropriate search and replace fields.
Creating Your Own Stealth Scannos Files
You can create your own personal scannos files that contain additional checks to those performed by the three scannos files distributed with Guiguts. Each file you create must be a .json file matching the format of en-common.json, misspelled.json, or regex.json. To ensure you get the format and behavior expected, the following is recommended. If your scannos file will consist of words to be checked for, with replacements, take a copy of en-common.json from the "data/scannos" folder under the Guiguts release (the path to the "data/scannos" folder is found on the first line of the Stealth Scannos Results dialog), and put it in a folder of your own, not under the release. Then when you upgrade or remove the Guiguts release, your personal scannos file will remain safe in your folder. Similarly, if your scannos file will consist of words to be checked but without replacements listed in the file, then begin by taking a copy of misspelled.json instead. If your scannos file will consist of some personalized regular expressions, then begin by taking a copy of regex.json instead. In all three cases, you may keep the existing contents of the file you copied, or you can remove some or all of the word/regex entries from it, and you can add your own. Note that the types of brackets used are important, as are the commas at the end of certain lines, and also that the line before a close curly bracket does not have a comma. If you copy/paste entries, take care that each entry finishes with a comma, except the last one (two lines above the end of the file), which must not have a comma. If your file has incorrect syntax, you will be notified with an error when Guiguts tries to load it.
Each entry in the scannos file has a "match" line (which is the regex or word to be searched for), optionally a "replacement" line (containing the replacement, which can be empty), and also optionally a "hint" line (which lets the user know what type of error the regular expression is attempting to trap). The hint is not usually necessary for simple words.
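Before loading a personal scannos file, it can help to check its JSON syntax yourself. The Python sketch below is not part of Guiguts. It uses the standard `json` module to report a syntax error with its line and column, then looks for entries containing a "match" line. The helper names are hypothetical, and the outer "scannos" key in the sample text is a guess made for illustration, so copy a shipped file to see the real layout.

```python
import json


def find_entries(node):
    """Yield every JSON object that has a "match" key, wherever it sits."""
    if isinstance(node, dict):
        if "match" in node:
            yield node
        else:
            for value in node.values():
                yield from find_entries(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_entries(item)


def check_scannos_text(text: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"JSON syntax error at line {err.lineno}, column {err.colno}: {err.msg}"]
    problems = []
    entries = list(find_entries(data))
    if not entries:
        problems.append('no entries with a "match" line found')
    for entry in entries:
        for key in ("match", "replacement", "hint"):
            if key in entry and not isinstance(entry[key], str):
                problems.append(f"{key!r} is not a string in entry {entry!r}")
    return problems


# Hypothetical file contents; the outer "scannos" key is an assumption.
GOOD = """{
  "scannos": [
    {
      "match": "arid",
      "replacement": "and",
      "hint": "arid for and"
    },
    {
      "match": "teh"
    }
  ]
}"""

# The same text with a comma wrongly added after the last entry.
BAD = GOOD.replace('"match": "teh"\n    }', '"match": "teh"\n    },')

print(check_scannos_text(GOOD))
print(check_scannos_text(BAD))
```

The second call reports a syntax error, which is the kind of mistake the comma rules above are meant to prevent. To check a real file, read it with `Path(...).read_text(encoding="utf-8")` from `pathlib` and pass the text to `check_scannos_text`.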
Using Your Own Stealth Scannos Files
If you have created a personal scannos file, then the very first time you want to use it, use the Load File button to the right of the dropdown menu. From now on, your personal scannos file will appear in the dropdown menu automatically with all the other available scannos files.
Regex Library
This feature allows the user to load a file containing a list of regexes, and use them in a similar way to the Stealth Scannos dialog. Its reports are displayed in a checker dialog:
There is one regex file distributed with Guiguts which is found in the "data/regex_library" folder:
- dashes.json - This contains several regexes that can be used to find where hyphens have been used by proofers, but the PPer wants to use a different kind of dash. An example is to replace occurrences of double hyphen with an em dash.
See the above section for details on how to use the dialog, how to switch to a different file, and how to create and use your own regex library files. While reading the Scannos section above, just interpret "scannos" or "stealth scannos" as "regexes".
Word Distance Check
This tool finds pairs of words that are very similar to one another, such as "Auguste" and "August", or "infrequently" and "unfrequently". Such pairs are displayed in a checker dialog:
The "Distance" in the name of the tool refers to the "Levenshtein Distance" between the words, which is the minimum number of single character changes (insert, modify, delete) needed to turn one word into the other. By default the distance used is 1, meaning there is just one character added, or removed, or changed between the pairs of words. You can also change the distance to 2, using the radio buttons at the top of the dialog.
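For readers curious about the measure, the Levenshtein distance can be computed with a standard dynamic-programming routine. The Python sketch below is a textbook version and makes no claim about how Guiguts implements the check; both function names are hypothetical, and the simple all-pairs comparison would be slow on a full book's word list.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes or changes."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # change char_a into char_b
            ))
        previous = current
    return previous[-1]


def similar_pairs(words: list[str], max_distance: int = 1) -> list[tuple[str, str]]:
    """Return pairs of distinct words within the given distance."""
    unique = sorted(set(words))
    return [
        (first, second)
        for first, second in combinations(unique, 2)
        if levenshtein(first, second) <= max_distance
    ]


words = ["Auguste", "August", "infrequently", "unfrequently", "window"]
print(levenshtein("Auguste", "August"))
print(similar_pairs(words))
```

Raising `max_distance` to 2 corresponds to the second radio button and will generally report many more pairs.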
Page Separator Fixup
This dialog makes it much easier to remove the -----Page Separators----- and rejoin words that were split across pages.
The page separators are useful to you in the early stages of post-proofing; they show you clearly the page units seen by proofers, the points at which they had to deal with hyphenated words, incomplete poems or italics or block quotes, and so forth. After the first pass through the book, however, they are not useful. When you have moved illustrations and footnotes outside of paragraphs, you can remove the page separator lines.
Use Tools>Page Separator Fixup to open a dialog dedicated to removing the page separator lines from the document:
After you have removed the page separator lines from the document, Guiguts still knows where the page boundaries are because it saves the information in the .json file (see this page). You can still jump to a page, display a page image, or message the proofers for a given page.
Page separators appear within paragraphs, between paragraphs, above or below illustrations and footnotes, and between chapters. Each case needs different handling. When the separator is within a paragraph, there may be blank lines above and below it, and a hyphenated word or phrase might have crossed the page boundary.
Suggestion: Before removing the separators, resolve and remove all proofer's notes, position all [Illustrations] between paragraphs, and run Footnote Fixup to rejoin continuation footnotes and perhaps move them to the end of the document.
The three radio buttons in the center of the dialog control how the dialog progresses through the file as you fix page separators. No Auto means that after you have fixed a page separator, that location will continue to be displayed, and you will need to click Refresh to move to the next page separator. Auto Advance causes the program to advance automatically to the next page separator as the previous one is fixed. The default Auto Fix will automatically fix as many page separators as it can, only pausing when the situation is ambiguous, such as when a hyphenated word crosses a page boundary and Guiguts does not know whether the hyphen needs to be retained. When Auto Fixing, Guiguts is also capable of handling the proper removal of rewrap markers immediately preceding and following page separators, and separators that are immediately followed by another separator.
Begin or continue the removal process by clicking Refresh (shortcut r). This scrolls to the first or next remaining page separator line and highlights it. Examine the line and choose your action as follows, either by clicking the button or pressing the shortcut key indicated:
- If the line is in the title page or a table, just click Delete (shortcut d). You will revisit such areas and adjust the spacing later.
- If the line precedes a chapter-level section (Preface, Contents, etc.), click Chapter (4 lines) (shortcut c). Guiguts deletes the line and then ensures that there are exactly four blank lines between the preceding and following nonblank lines.
- If the line precedes a section-level section, click Section (2 lines) (shortcut s). Guiguts deletes the line and then ensures that there are exactly two blank lines between the preceding and following nonblank lines.
- If the line falls between paragraphs (to be sure, you may have to refer to the page images by clicking See Img in the Status bar or enabling Auto Image), click Blank (1 line) (shortcut b). Guiguts deletes the line and then ensures that there is just one blank line between paragraphs.
- If the line falls within a paragraph, look for a hyphenated word above it. If there is one, decide if the word should be joined or left as a hyphenated phrase.
- To join the word, click Join (shortcut j).
- To retain a hyphenated phrase, click Join, Keep Hyphen (shortcut k).
- In either case, Guiguts deletes the line and closes up the paragraph.
When you disagree with a change, use Undo (shortcut u) to revert to the state just before the option you just used. The tool supports multiple Undo's. If you undo a change then want to redo it, use Redo (shortcut e). Although the usual undo and redo features in the main text window will also serve to undo and redo changes, the advantage of using the ones in the dialog is that, in addition to undoing/redoing the change, the dialog will also do a Refresh to show you the current page separator.
Some Limitations
- The tool does not attempt to rejoin Index entries whose page lists begin on one page and continue onto the next page. The Formatters are advised to handle that situation by beginning the continuation page with an opening no-wrap /* and to left-justify the rest of the page list on the next line, without leaving a blank line (which would indicate a new main entry) and without indenting it (which would indicate a new sub-entry). You will need to look for these and rejoin the two lines manually, either before or just after removing the page separators.
- The following regular expression can be used with Search & Replace to help you find and remove any unwanted hyphens in words at the end of lines:
Search: (\s)([A-Za-z]+?)-([a-z,;:\.!\?]+?)\n
Replace: \1\2\3\n
- When proofers have marked split words in the middle of a page with -*, a simple (non-regex) Search & Replace can find and help resolve these:
Search: -*
Replace: (empty/null line to just remove the -*)
or Replace: - (hyphen to make the word hyphenated)
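To see what the end-of-line hyphen regular expression above matches, it can be tried outside Guiguts with Python's standard `re` module. This is only a sketch: the sample text is made up, and the regex engine behind the Search & Replace dialog may differ from `re` in some details.

```python
import re

# The search pattern quoted in the list above.
PATTERN = re.compile(r"(\s)([A-Za-z]+?)-([a-z,;:\.!\?]+?)\n")

# Made-up sample text with two hyphenated words at line ends.
text = (
    "He held the flash-light\n"
    "in his hand, and a well-known\n"
    "face appeared.\n"
)

# List the candidates first, so each can be judged individually.
candidates = [m.group(2) + "-" + m.group(3) for m in PATTERN.finditer(text)]
print(candidates)

# Replacing everything at once removes both hyphens, wanted or not.
joined = PATTERN.sub(r"\1\2\3\n", text)
print(joined)
```

The blanket replacement also turns "well-known" into "wellknown", which is why it is safer to step through the matches one at a time and replace only the hyphens that are genuinely unwanted.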
Footnote Fixup
This option will help you find and resolve footnote-related errors such as mismatched anchors/footnotes, missing or duplicated footnotes, and missing closing square brackets that result in normal text seeming to be part of footnotes. It'll also help you rejoin the segments of continuation footnotes into one complete footnote.
Always save a copy of your file before running Footnote Fixup as some of its actions cannot be undone.
Where Notes Are Placed
If the project manager has instructed proofers to code inline notes, that is embedding [Footnote: etc] in the text, then these will have to be processed with the Guiguts editor or with a separate tool (not available yet).
Out-of-line footnotes are more common and it is this style of footnote that the Footnote Fixup tool deals with. Each note is proofed in two parts: an anchor (a symbol in square brackets) in the text, and the notes proper, which proofers batch at the end of each page. Since then, you will have moved those notes to the ends of paragraphs, if necessary. Now you need to make an editorial decision: where will the notes be placed in the final etext? You have three choices.
End of Paragraph. Each note is placed just below the paragraph that contains its anchor, or just below the block quote for which the note is a citation. This is appropriate when there are only a handful of notes per chapter, especially when notes are mostly brief citations.
End of Chapter. All the notes from one chapter are batched at the end of the chapter. Do this if the original book did so, or if footnotes are extremely numerous or verbose.
End of Book. All the notes are batched in a block at the end of the book. Do this if the original book did so, or if footnotes are moderately numerous.
The Footnote Fixup Dialog
This decision made, use Tools>Footnote Fixup to open the Footnote Check Results dialog. Guiguts does an initial scan of the document to find every identifiable note and anchor. Besides locating footnotes, during this pass Guiguts also looks for a number of common footnote typographical errors, such as "[ Footnote" with a space and "[footnote", and flags them in the checker dialog for you to correct. When the scan ends, the checker dialog is scrolled to display the first footnote anchor, which is highlighted in aqua, along with all the other anchors and footnotes similarly highlighted.
Errors, continuation, and continued footnotes are highlighted in red in the checker dialog and counted at the top of the dialog as "Suspects". This means that those entries require attention from the user.
Many of the footnote processing option buttons are disabled while there are suspect/error entries remaining in the checker dialog since attempting to reindex or move footnotes in that situation may lead to the file being corrupted. However the processing option buttons used to fix and clear suspects are always enabled.
How Footnotes and Anchors are Matched
Some books have distinct types of notes, some numeric and others alphabetic. Indeed, it is possible for a book to have three intertwined series of notes, numeric[2], Roman[IV] and alpha[C]. The numbers may restart on each page (or chapter). Some books use symbols (asterisks, daggers, etc.) and the proofers may have been inconsistent in how they coded them.
During the initial pass, Guiguts attempts to match each footnote [Footnote str : to its anchor [ str ] by searching from that footnote back towards the beginning of the file and using the first match. For purposes of linking a note and its anchor, it doesn't matter what type of symbol is used; only that the two strings be identical and that the footnote follows the anchor. That's why duplicate and inconsistent symbols don't matter at this time. After finding any anchor [Q] Guiguts looks for the next following [Footnote Q:.
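The matching rule described above (identical label, anchor before note, nearest anchor wins) can be sketched in Python as follows. This is an illustration of the rule only, not the Guiguts implementation: the two regular expressions, the function name, and the sample text are assumptions made for the example.

```python
import re
from typing import List, Optional, Tuple

# Assumed shapes: an anchor is [label]; a note opens with [Footnote label:
ANCHOR_RE = re.compile(r"\[([^\[\]\s:]+)\]")
NOTE_RE = re.compile(r"\[Footnote ([^\[\]\s:]+):")

def match_notes_to_anchors(text: str) -> List[Tuple[str, int, Optional[int]]]:
    """Return (label, note offset, anchor offset or None) for each footnote."""
    anchors = [(m.group(1), m.start()) for m in ANCHOR_RE.finditer(text)]
    results = []
    for note in NOTE_RE.finditer(text):
        label = note.group(1)
        # Anchors with an identical label that come before the note;
        # the last one in the list is the nearest when searching backwards.
        before = [pos for lab, pos in anchors
                  if lab == label and pos < note.start()]
        results.append((label, note.start(), before[-1] if before else None))
    return results

# Invented sample: the label 1 is reused after the first two notes.
sample = (
    "First claim.[1] Second claim.[2]\n\n"
    "[Footnote 1: Source one.]\n\n"
    "[Footnote 2: Source two.]\n\n"
    "Later claim.[1]\n\n"
    "[Footnote 1: Source three.]\n"
)
for label, note_pos, anchor_pos in match_notes_to_anchors(sample):
    print(label, note_pos, anchor_pos)
```

Because only the nearest preceding anchor with the same label is used, the reused label 1 in the sample still pairs each note with the right anchor, which is why duplicate symbols are harmless at this stage.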
Check Footnotes
Look through the checker dialog for entries with an error description highlighted in red; left-click those entries to see the referenced text in Guiguts' Main (document) window. After making some changes, you can refresh the list by clicking Re-run in the top-right corner of the checker dialog.
All identifiable footnotes and anchors in the file are listed in the checker dialog. You can work through the footnotes listed in the dialog either by clicking on each one in turn to select it and see its referenced text in the Main (document) window, or you can select a footnote in the dialog and then use the <- Prev. FN or Next FN -> buttons to more speedily select adjacent footnotes and display their referenced text.
The suspects identified in the checker dialog include those with problems such as NO ANCHOR, SAME ANCHOR, SEQUENCE or CONTINUATION. The last of those is not an error as such: it marks a footnote that was flagged in the proofing and formatting rounds as a continuation of the previous footnote, and the user needs to rejoin it to that footnote.
The usual reason for Guiguts not connecting a note to its anchor is that the note is improperly coded. Some very typical errors include:
- Missing colon, [Footnote A Text...
- Period instead of colon, [Footnote A. Text...
- Comma instead of colon, [Footnote A, Text...
- Two colons, [Footnote: A: Text...
- Missing symbol, [Footnote: Text...
- Various subtle and hard-to-see misspellings of Footnote.
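As a rough cross-check for the coding errors listed above, a short Python script can flag anything that looks like the start of a footnote but does not have the exact form "[Footnote label: ". This is a heuristic sketch under stated assumptions, not what Guiguts itself does: the two patterns and the function name are invented, and the candidate pattern only catches bracketed words beginning with "Fo", so some misspellings will slip through and some unrelated bracketed words may be flagged.

```python
import re

# Assumed well-formed opening: [Footnote, one space, a label, a colon,
# then whitespace or the end of the line.
WELL_FORMED = re.compile(r"\[Footnote [^\s:\]]+:(?:\s|$)")
# Heuristic: any bracketed word starting with "fo", in any letter case.
CANDIDATE = re.compile(r"\[\s*fo\w*", re.IGNORECASE)

def suspect_footnote_lines(text):
    """Return (line number, snippet) for openings that look mis-coded."""
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        for m in CANDIDATE.finditer(line):
            if not WELL_FORMED.match(line, m.start()):
                suspects.append((number, line[m.start():m.start() + 20]))
    return suspects

# The error types from the list above, one per line, plus one good note.
sample = "\n".join([
    "[Footnote A Text...",
    "[Footnote A. Text...",
    "[Footnote A, Text...",
    "[Footnote: A: Text...",
    "[Footnote: Text...",
    "[Footnate A: Text...",
    "[Footnote A: Text...]",
])
for number, snippet in suspect_footnote_lines(sample):
    print(number, snippet)
```

Only the last sample line is well formed, so the sketch reports lines 1 to 6.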
Inspect Footnotes
Use the Prev. FN and Next FN buttons to step through the notes, or click on a particular note in the checker dialog. Each selected note is highlighted in orange in the Main (document) window.
Verify from the blue highlighting that each note is correctly bounded by square brackets. If a closing bracket is missing or misplaced, correct it. If the note is still embedded in a paragraph, move it between paragraphs. After either change, click Re-run to rescan all notes.
If the distance between note and anchor is wider than you expect, look carefully. It is possible for a note to be mated with the wrong anchor. For example, if [Footnote 1 is mis-coded, say with a missing colon, it will be ignored during the first-pass scan. The anchor [1] will be mated with the next [Footnote 1 somewhere further along in the document.
If no anchor at all is highlighted, the syntax of either the note or the anchor is wrong (and the note should be marked in the checker dialog with an error text highlighted in red). Correct the note and click Re-run to rescan the notes.
If the note is correct but the anchor still is not found, the anchor may be missing or mis-coded. Click that note to select it then look at the page image and find where the anchor should be. If the anchor is malformed, delete it. Place the cursor at the insertion point where the anchor should be and click the Set Anchor button. Guiguts inserts an anchor using the symbol from the selected note.
When all notes are correct as to syntax and type of symbol, compare the count of notes shown at the top of the dialog to the count of the word Footnote from the Word Frequency report. If a discrepancy leads to the discovery of a "lost" footnote, correct it and Re-run again.
"SEQUENCE" before a note in the checker dialog is a warning. It doesn't necessarily mean something is wrong. Sometimes footnotes have footnotes and will appear out of sequence to the program. The same with one footnote that ties to two different anchors. Ask in the forum if you need help making the best decision for your book.
Do not go on to further steps until there are no errors shown in the checker dialog and you have verified the length and anchor of all notes. (A veteran of multiple books with several hundred footnotes each has observed that there is simply no substitute for inspecting each footnote in sequence. Footnote syntax is complex and prone to subtle errors. Not all errors are displayed in the Footnote Check Results dialog. You simply must verify proper scoping of every note as any uncaught errors will cause chaos later on.)
While inspecting notes do not be concerned about duplicate symbols at this time.
Roman vs. Alphabetic Symbols
At this stage you must be aware of how Guiguts tells the difference between an alphabetic symbol [A] and a Roman symbol [I]. They both consist of alphabetic characters, so are ambiguous to program logic. The arbitrary rule is that a Roman anchor ends in a dot, while an alpha anchor does not. Thus [I.] is a Roman number, and [I] is alphabetic. The dot is also required in the note number, as in [Footnote I.: (Roman) versus [Footnote I: (alphabetic). Guiguts recognizes lowercase Roman with a dot, as in [iv.] but it only generates uppercase Roman.
If your book has Roman anchors but the proofers did not include the dot (and why would they?) you must hope that the editors of the original work were careful enough to never use an ambiguous [i] or [v] footnote. Use regular expressions to find all the ought-to-be Roman anchors and notes and add dots to them. For anchors, search for a literal [, one or more lowercase Roman letters, and a literal ] using the regular expression \[([ivxl]+)\] and replace with [\1.]. Similarly, to find the notes, use the regular expression Footnote ([ivxl]+): and replace with Footnote \1.:
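The effect of the two substitutions above can be checked on a small sample in Python. This is a sketch for illustration only: it uses Python's standard `re` module, where the captured group is written `\1` in the replacement, and the function name and sample text are invented.

```python
import re

def add_roman_dots(text: str) -> str:
    """Add the trailing dot that marks a label as Roman rather than alpha."""
    # Anchors: [iv] becomes [iv.]
    text = re.sub(r"\[([ivxl]+)\]", r"[\1.]", text)
    # Notes: "Footnote iv:" becomes "Footnote iv.:"
    text = re.sub(r"Footnote ([ivxl]+):", r"Footnote \1.:", text)
    return text

# Invented sample text.
sample = "A claim.[iv]\n\n[Footnote iv: The source.]\n"
print(add_roman_dots(sample))
```

As the text warns, the patterns cannot distinguish a Roman [i] or [v] from an alphabetic one, so a book that mixes the two needs its matches reviewed one at a time.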
Indexing Footnotes
When all notes are correct, save the document. Again click Re-run. Guiguts will force all notes and their anchors to use a common type of symbol. Choose All to Number, All to Letter, or All to Roman, then click Reindex.
If your book uses Roman-style numbers in lowercase [iv.], the Reindex pass replaces them with uppercase Roman [IV.]. You can force these back to lowercase with two more regular expression search and replace operations, after completing the following step.
Placing Footnotes
To move notes to chapter-end or book-end you need to establish "landing zones" where Guiguts will collect the notes. A landing zone is simply a line containing only the word "FOOTNOTES:" and followed by a blank line. Footnotes preceding that line are moved to follow it.
You can use the Set LZ @ cursor button to insert a "FOOTNOTES:" line anywhere you choose in the file. Or you can use the Autoset Chap. LZ button to insert a "FOOTNOTES:" line preceding each chapter break (four blank lines). When you have inserted landing zones where you want the notes to gather, click either Move FNs to Landing Zone(s) to move them to the ends of their Chapters or the end of the book; or Move FNs to Paragraphs to move each footnote to just below the paragraph containing its anchor. (The heading line "FOOTNOTES:" will not be added in this case.) Guiguts moves each note to the landing zone next below it in the document, leaving a blank line above it. Note that, except when using Move FNs to Paragraphs, Guiguts always moves footnotes downward toward a landing zone on a higher-numbered line. Even if a footnote is sitting directly below a landing zone, it will be moved to the next one down in the document. If you want a footnote to stay where it is, place a landing zone just below it, not just above it.
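The "always downward" rule can be pictured with a few lines of Python that, for a footnote on a given line, pick the first landing zone on a higher-numbered line. This is only a sketch of the rule as described above, not the code Guiguts runs; the function name and the sample lines are invented.

```python
from bisect import bisect_right

def landing_zone_for(note_line, lz_lines):
    """Return the line number of the first landing zone below note_line.

    lz_lines must be sorted in ascending order. Returns None when no
    landing zone lies below the note.
    """
    i = bisect_right(lz_lines, note_line)
    return lz_lines[i] if i < len(lz_lines) else None

# Invented sample document, one list item per line (line 1 is the first).
lines = [
    "Chapter one text.[1]",
    "",
    "[Footnote 1: First note.]",
    "",
    "FOOTNOTES:",
    "",
    "[Footnote 2: Sitting just below a landing zone.]",
    "",
    "Chapter two text.[2]",
    "",
    "FOOTNOTES:",
]
lz_lines = [n for n, line in enumerate(lines, start=1) if line == "FOOTNOTES:"]
print(lz_lines)                        # landing zones found in the sample
print(landing_zone_for(3, lz_lines))   # note on line 3
print(landing_zone_for(7, lz_lines))   # note on line 7
```

In the sample, the note on line 7 sits directly below the first landing zone but is still assigned to the second one, which matches the advice to put a landing zone just below any footnote that should stay where it is.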
As a quality check examine the moved notes. You can use the Prev. FN and Next FN buttons to step through the notes.
If you used Autoset Chap. LZ, there may be unused Landing Zones at the ends of chapters with no footnotes. (Recall that a Landing Zone is a line containing only "FOOTNOTES:".) You will probably want to delete the unused Landing Zones. Similarly, if you moved all of the footnotes to the end of the document and later moved them to follow the paragraphs that referenced them, remember to delete FOOTNOTES: at the end of the document.
When everything looks correct, save the document.
Tidy Footnotes for the txt version
Later on, after you've done as much as possible with the "common" version of the text, and have saved a separate copy of it to use in preparing the HTML version, you can simplify the appearance of the footnotes for use in Plain Text. When you are ready to do so, start Tools>Footnote Fixup once more and make sure no errors are shown. Then click Tidy Footnotes. Guiguts changes all the notes from the form [Footnote 1: Text...] to the form [1] Text....
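The change of form that Tidy Footnotes makes can be sketched in Python. This is a simplified illustration of the transformation from [Footnote 1: Text...] to [1] Text..., not the Guiguts implementation: the pattern and function name are assumptions, and the real tool may treat spacing and line layout differently.

```python
import re

# Assumed opening form: [Footnote label: followed by an optional space.
NOTE_START = re.compile(r"\[Footnote ([^\s:\]]+): ?")

def tidy_footnotes(text: str) -> str:
    """Rewrite [Footnote N: text] as [N] text, dropping the closing bracket."""
    out = []
    pos = 0
    for m in NOTE_START.finditer(text):
        if m.start() < pos:
            continue  # this opening lies inside a note already rewritten
        # Find the bracket that closes this note, allowing nested brackets.
        depth = 0
        end = None
        for i in range(m.start(), len(text)):
            if text[i] == "[":
                depth += 1
            elif text[i] == "]":
                depth -= 1
                if depth == 0:
                    end = i
                    break
        if end is None:
            continue  # unbalanced brackets: leave this note untouched
        out.append(text[pos:m.start()])
        out.append("[" + m.group(1) + "] " + text[m.end():end])
        pos = end + 1
    out.append(text[pos:])
    return "".join(out)

# Invented sample with a bracketed word inside the second note.
sample = "[Footnote 1: See page 4.]\n\n[Footnote 2: Compare [sic] above.]\n"
print(tidy_footnotes(sample))
```

The bracket counting is there because a note may itself contain square brackets; a note whose closing bracket is missing is left as it was, which is one reason to clear all errors before tidying.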
Caution: once you save the file in this form, any further editing or arranging of footnotes must be manual; the Footnote Fixup dialog will no longer work. Also, automatic HTML generation cannot recognize tidied footnotes, so this is for the .txt version only.
Some Limitations
- GG2 Footnote Fixup does not handle mixed symbol footnotes. If a book has more than one distinct type of note, for example some numeric and others alphabetic, then reindexing will no longer maintain those distinctive labels. Reindexing forces all footnotes to have the same symbol type and gives them consecutive values of that type (letter, number, or Roman). To maintain distinct symbol types in notes you have two options:
- (1) for a small number of footnotes you can manually edit them to use more than one type of symbol. If you then click Re-run and the checker dialog shows no errors, you do not Reindex the notes before moving them to their landing zone(s);
- (2) for a large number of mixed-symbol footnotes, for example mainly numbers with some Roman, you deal with the two symbol types separately. This option combines reindexing with a 'masking' technique familiar to PPers. First, change the square brackets to (say) curly brackets on all the Roman footnotes, then do Reindex (All to Number) to handle all the number footnotes, ignoring the Roman ones. Then change the square brackets on the number footnotes to curly and all the Roman footnotes back to square brackets and Reindex (All to Roman) to deal with the Roman ones, ignoring the number footnotes. Finally, restore the square brackets to the number footnotes and click Re-run. If the checker dialog shows no errors it is safe to move the (separately reindexed) footnotes to their landing zone(s).
Sidenote Fixup
Sidenote Fixup lists every sidenote in a checker dialog, and lets you examine, and if necessary, move some of them up or down, a paragraph-break (blank line) at a time, until they are where you want them to be (usually just above the paragraphs they summarize). Note: in some books, some sidenotes are positioned mid-paragraph intentionally, and should remain there. Formatting such sidenotes is beyond the scope of this documentation, but there's at least one Wiki article HERE at DP that discusses this in some detail.
- SCREENSHOT: Sidenote fixup dialog
The instructions for selecting and moving sidenotes are identical to the ones for illustrations (and illustrations are far more common than sidenotes), so please see the Illustration-Sidenote Fixup Instructions, below.
Illustration Fixup
As received from the Rounds, Illustration tags often are not where you want them to be. If there only are a few of them in a project, using cut-and-paste probably is the simplest way to move them. However, if there are a lot of them (you must decide what is "a lot"), errors become more likely. This tool, and the mostly identical Sidenote Fixup, above, can make it easier and safer to reposition them correctly. The tools let you move each selected illustration or sidenote up or down to the next paragraph break (blank line), and repeat the move until the selection is in the right place.
In Illustration Fixup, you can also enable Preserve Illo's Page Number, which adjusts page break locations as you move illos so that the illo remains on the same page number. This is equivalent to moving the text in the opposite direction. For example, if you have a mid-paragraph illo at the top of a page, and you move it up, by default it will move to the first paragraph break searching backward from the page break, thus putting it onto the previous page. However, if you enable Preserve Illo's Page Number, the page break position will also be moved to just above the new illo position, so the illo will remain on the same page number. This process is equivalent to moving the partial paragraph that was at the bottom of the previous page down onto the next page.
Note that using Undo after moving an illo while preserving the Illo's Page Number, is susceptible to the known Undo issue described here. Therefore, under those circumstances, if you move an illo too far past a page break, do not use undo - reload your latest saved version and try again.
- SCREENSHOT: Illustration fixup dialog
Illustration-Sidenote Fixup Instructions
- When to use the tools:
- These tools should be used relatively early, while working with Plain Text, well before splitting off a copy that will become the HTML version.
- Before using either of these tools, process all proofer notes [**blah] and partially-process all Footnotes (rejoin continuations, resolve any errors, renumber, and move them to the end of the document or the ends of chapters).
- The tools probably should be used before removing page separators.
- If your project has both illustrations and sidenotes, either category may be processed first.
- On the Tools Menu, select either Illustration Fixup or Sidenote Fixup
- A list similar to the ones in the screen images above will appear in a new window
- If Guiguts believes an Illustration or Sidenote is mid-paragraph, it will say so: (MIDPARAGRAPH). You can list just the mid-paragraph entries by turning on the Suspects Only checkbox at the top of the dialog.
- Click an item in the list to select it in the main Guiguts window. That will be called the selection in the rest of these instructions
- The selection will begin at the left bracket and end at the first right bracket that is at the end of a line
- NOTE: The captions of some illustrations may contain a mid-caption right bracket at the end of a line, e.g., a "credits" line. That will mislead Guiguts into ending the selection prematurely. One solution is to add a space after that right bracket, click the "Re-run" button, then click the item in the list again. The selection now should include the entire caption, and the extra spaces can be removed later on with "Remove end-of-line spaces."
- If the illustration or sidenote is where you want it to be, just go on to another one in the list. Otherwise:
- To move the selection UP, click the Move Selection Up button. Similarly, you can move the selection DOWN with the Move Selection Down button. Repeat, if necessary, until the selection is where you want it to be.
- You can move the selection several paragraphs or even pages, but
- the tools will not move an illustration past another illustration, or a sidenote past another sidenote, so, when you have two adjacent illustrations that must be moved down, move the second one first.
Rewrapping
NOTE: Before rewrapping, save a copy of the file. If you are not satisfied with the rewrapping results, don't try to use UNDO, but revert to that saved copy.
Most of the rewrap options on this menu, including "Clean Up Rewrap Markers", are used only when preparing the Plain Text version of the book. (Using the "Rewrap Selection" option when preparing the HTML version may make it easier for you, the post-processor, to read what's in the Guiguts window, but it won't affect the appearance of the published ebook.) Unlike the rewrap options on this menu, Guiguts uses most of the rewrap markers (below) for both Plain Text and HTML. So, by refining these markers while working on the common file (before splitting off separate copies that will become the final Plain Text and HTML versions), you can save time and increase the likelihood that both versions will be presented in the same way.
Rewrap Markers
The DP guidelines specify only two rewrap markers: /# ... #/ for rewrappable text (block quotes), and /* ... */ for anything else requiring special handling during post-processing (e.g., poetry, tables, and lists). Guiguts, however, supports several additional rewrap markers which have varying effects on the rewrap rules for Plain Text and for the generated HTML. The markers using letters, e.g., /p, may be upper- or lower-case. Before rewrapping Plain Text, all inline tags should have been converted to their final Plain Text form, e.g., <i> should have been changed to an underscore:
Marker | Rewrap | Plain Text | HTML |
---|---|---|---|
/#...#/ | Yes | Rewraps within default or specified margins. (See below). | As block quote. |
/*...*/ | No rewrap. | Defaults to no-wrap indentation specified on the Preferences Dialog. | Preserves alignment and line breaks. |
/$...$/ | No rewrap. | Does not change indentation. | Preserves indentation by counting each leading space as 0.5em; ends each line with <br> to prevent rewrapping. |
/P...P/ | No rewrap. | Uses Poetry indentation specified on the Preferences Dialog. | As poetry. |
/C...C/ | No rewrap. (See below). | Centers each line within the block, but does not rejoin/rewrap them. | Assigns a <div class="center"> to the block, and adds a <br> at the end of each line to prevent rewrapping. |
/R...R/ | No rewrap. (See below). | Slides the block to the right, until the longest line in the block is at the right margin. You can specify a custom right margin by using /R[n], where n is the position to use, counting from the LEFT. Maintains relative indentations of the other lines. If the /R block is within a Block Quote /#...#/, the right margin of the containing block will be used unless /R[n] overrides it. | Assigns a <div class="right"> to the block, adds a <br> at the end of each line to prevent rewrapping, and attempts to maintain relative indentation of the lines within the /R block by using <span style="margin-right"> (with appropriate numeric values) for all but the longest line. |
/F...F/ | Limited rewrap in HTML. | Ignored in Plain Text. | Centered paragraphs ('f' stands for 'Front Matter'). Each set of non-blank lines within the block becomes a centered, wrappable paragraph: <p class="center"> ... </p>. |
/L...L/ | No rewrap. | Defaults to no-wrap indentation specified on the Preferences Dialog. Use /L[n] to set custom left margin to n. | Unordered list <ul>...</ul>. |
/X...X/ | No rewrap. | No indent. | Generates <pre>...</pre>. |
/I...I/ | Yes, within each entry and each sub-entry. | Indents and rewraps Text Index with hanging indents. (See below). | Generates a formatted, linked Index. |
Notes
- The opening and closing markers (/*, etc.) should stand alone on a line, but an opening markup must be preceded by a blank line, another opening rewrap marker, or a page separator (if you haven't already removed them), and a closing markup must be followed by a blank line, another closing rewrap marker, or a page separator (etc.). If the blank lines, permitted rewrap markers or page separators are missing, Guiguts will fail to recognize either the beginning or the end of the markup and mis-wrapping will result.
- All rewrap markers may be used within Block Quotes /# ... #/ but not within other rewrap markers. Attempting to do so usually will yield mis-wrapped results. See Nested Block Markers for further information.
- When inline tags occur within /C or /R blocks, automatic alignment may require some manual adjustment after rewrapping Plain Text or AutoGenerating HTML.
- After using any Plain Text Rewrap, it's advisable to check the results, particularly by looking at marked blocks. The Regex to find all blocks below can be used for this. One of the reasons for doing this sort of check is that excessively long lines within any no-wrap block must be processed manually (e.g., centering an 80-character line when maximum line length normally is 72). You also may find it useful to look for long lines, which can be done with Bookloupe or the Find Long Lines Regex, also below.
A Regex to find all blocks and selectively change some of them to other types
Note: Although this searches for all of the above markers, it's primarily intended to validate the Block Quote and No Wrap markers added in the Formatting Rounds, and to help you change some of them to the extra ones recognized by Guiguts. Use it while checking and correcting the common file, before making separate copies for Plain Text and HTML, and after Rewrap All, to make sure all blocks were wrapped or positioned as you want them:
Search: \n/([\*#$xXfFlLpPiIcCrR])((.|\n)*?)\n\1/
Replace for poetry: \n/P\2\nP/
Replace for centering: \n/C\2\nC/
Replace for right-alignment maintaining relative indentations: \n/R\2\nR/
A Regex to find lines longer than 72 characters
[^\n]{73,}
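Both regexes above can be tried outside Guiguts on a small sample to see what they select. The following Python sketch assumes the standard `re` module behaves like the Guiguts search for these patterns; the function names and the sample text are invented for illustration.

```python
import re

# Patterns copied from the two regexes above.
BLOCK_RE = re.compile(r"\n/([\*#$xXfFlLpPiIcCrR])((.|\n)*?)\n\1/")
LONG_LINE_RE = re.compile(r"[^\n]{73,}")

def list_blocks(text):
    """Return (marker character, line number of the opening marker)."""
    found = []
    for m in BLOCK_RE.finditer(text):
        # The match starts at the newline before the marker, so the
        # marker itself is on the following line.
        line_no = text.count("\n", 0, m.start()) + 2
        found.append((m.group(1), line_no))
    return found

def long_line_numbers(text):
    """Return the numbers of lines longer than 72 characters."""
    return [n for n, line in enumerate(text.split("\n"), start=1)
            if LONG_LINE_RE.search(line)]

# Invented sample with one no-wrap block and one block quote.
sample = "Intro paragraph.\n\n/*\nA table row\n*/\n\n/#\nA quotation.\n#/\n"
print(list_blocks(sample))

# Change only the first block found to poetry, as the poetry
# replacement above would do for a single hit.
as_poetry = BLOCK_RE.sub(r"\n/P\2\nP/", sample, count=1)
print(as_poetry)
```

Because the closing marker is matched with `\1`, an opening /P is only paired with a closing P/ in the same letter case.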
Multi-Page Rewrap Blocks
Long block quotes, tables, lists, Indexes, and poems that cross pages will normally have been formatted with opening (and closing) markers at the start (and end) of each continuation page. When those continuation-page markers are on the very first line of the page, immediately following the page separators, Guiguts understands that the block on the second page is just a continuation of what was on the preceding page, and removes the closing marker, the separator, and the opening marker. If the opening marker is followed by a blank line, Guiguts will preserve it, assuming it indicates a new paragraph, a new stanza, or a new row in a table or list.
However, if the first line of a continuation page is blank and the second line is an opening marker, Guiguts will remove only the page separator and will preserve the closing marker, the blank line, and the opening marker. That result most likely is wrong, because books hardly ever print two different quotes, poems, or tables without some regular text, such as an author's comment or a table heading, between them.
You can either prevent this kind of error by looking for incorrectly placed opening markers before rewrap, or fix it afterwards by looking for the sequence closing marker, blank line, opening marker. In either case, make sure blank lines remain where, and only where, they should occur.
If you look for these marker sequences before rewrap, you can either:
- remove the end-of-page closing marker and the following top-of-page opening marker, while keeping or deleting a blank line, depending on whether or not it should be there, or
- put the opening marker on the very first line and follow it with a blank line if it denotes a new paragraph, stanza, or line.
In the unlikely event that two blocks of the same type actually do follow each other with nothing in between, leave a blank line at the top of the second page and the opening marker directly below it.
Nested Block Markers
"Nesting" refers to placing marked blocks within the scope of other marked blocks. ("Within" means "entirely within": each marked block's opening marker must appear after the start of the containing block and its matching closing marker must appear before the end of that containing block.) Marked blocks may occur within Block Quotes, but not within other kinds of marked blocks. Block Quotes may be nested within other Block Quotes, even at more than one level. Plain Text Rewrap and the HTML Generator will issue warnings if there are more closing markers than opening ones.
Change compared to Guiguts 1 behavior: Block markup that is inside block quote markup, e.g. /* inside /# markup, will work relative to the current blockquote margins, e.g. if inside double nested block quotes (i.e. left margin = 2 * 2 = 4), the /* will indent the block a total of 6 (not 2 as in GG1). Similarly, centering and right-aligning will center or right-align to the "currently applying" blockquote margins. Also, a blockquote inside a customized blockquote will indent the left margin from the customized value, e.g. /# inside /#[10,60] will have an indent of 10 + 2 = 12.
Nested Block Quotes
When rewrapping Plain Text, nested Block Quotes will be indented within the containing Block Quote, so if the default indentation is [4,68], the outermost Block Quote will use those margins as usual, and the first nested Block Quote within it will use margins of [8,64]. You can specify custom indentations for nested Block Quotes, just as you can for the outermost ones, using absolute, not relative, values. For example, a nested hanging indent might be: [8.4,64].
The HTML Generator supports multiple levels of nested Block Quotes. As indicated above, each nested Block Quote must close before the one containing it closes. Although it isn't necessary, you may wish to supply additional CSS for this situation, e.g., .blockquot .blockquot {margin: blah; font-size: blah;}
Rewrap Commands
You can rewrap part of a document by selecting text and using the command "Tools>Rewrap Selection". Guiguts rewraps the selected text, adjusting unmarked text to the default margins and adjusting marked blocks according to the type of markup. Rewrap All doesn't need preselected text, as it applies to the entire document.
Combining rewrap operations with Undo/Redo may cause the location of page breaks to be lost, so for safety, save the file first, particularly if rewrapping a long section that crosses page boundaries or rewrapping the whole text. Rather than undoing, you can then just reload the saved file. If you rewrap to a wrong margin, another option may be to just re-select the same text and rewrap it again to the correct margin.
Table Indent
By default, a Plain Text table is indented the amount specified by the value set for NoWrap Blocks (/*...*/) in the Preferences dialog. However, you can set a specific indent for any table (/*..*/ markup) by placing an indent value, 0 or a positive integer, in brackets immediately after the opening /*. For example, this text:
/*[6]
Some Tabular Text
*/
will be changed as follows by the rewrapping operation:
/*[6]
      Some Tabular Text
*/
Block Quote Indent and Margins
This applies only to Plain Text, and the extra parameters described here should be added only to the Plain Text version. For HTML, the /#...#/ marker generates <div class="blockquot"> or <blockquote>.
By default, a block quote is rewrapped according to the margins specified in the Preferences dialog. But you can set a specific left margin, hanging indent, and right margin for any individual block quote (/#..#/ markup) by putting up to three numbers in brackets after the opening /#. In computerese, the syntax is: left[. first][, right], where left is the number of spaces (indentation) on the left side, first is the number of spaces (indentation) for the first line of the paragraph, and right is the line length; that is, the maximum number of characters per line you want, including any left-side indentation. Here are different combinations of those:
/#[ left ] | Wrap to left margin left, default right margin |
/#[ left , right ] | Wrap within margins left and right |
/#[ left . first ] | Wrap first line of each paragraph to margin first, remaining lines to left margin left, default right margin |
/#[ left . first , right ] | Wrap first line of each paragraph in margins first to right, other lines in margins left and right |
Examples of Indenting Block Quotes
For example, this quote:
/#[4.8,24]
I hope to find you well and expect to arrive Wednesday <i>inst</i> Eugenie asks to be remembered to all with love.
#/
will rewrap as follows (the top line is a ruler, not included in what actually happens):
....,....1....,....2....,....3
/#[4.8,24]
        I hope to find
    you well and expect
    to arrive Wednesday
    <i>inst</i> Eugenie
    asks to be
    remembered to all
    with love.
#/
A hanging indent may be done the same way if the first line is indented less than the others. For example this quote:
/#[8.4,24]
I hope to find you well and expect to arrive Wednesday <i>inst</i> Eugenie asks to be remembered to all with love.
#/
will rewrap as follows:
/#[8.4,24]
    I hope to find you
        well and expect
        to arrive
        Wednesday
        <i>inst</i>
        Eugenie asks to
        be remembered to
        all with love.
#/
You can change the margins midway through a block quote simply by closing it and starting a new block quote with different margin numbers.
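To experiment with margin values before committing to them, the left[.first][,right] syntax can be approximated in Python with the standard `textwrap` module. This is a sketch under stated assumptions, not the Guiguts rewrap algorithm: the function names are invented, the defaults of 2 and 72 are placeholders for whatever is set in the Preferences dialog, inline tags such as <i> are simply counted as ordinary characters, and Guiguts may break lines differently.

```python
import re
import textwrap

# Marker with optional parameters: /#[left], /#[left,right],
# /#[left.first] or /#[left.first,right]
SPEC_RE = re.compile(r"/#\[(\d+)(?:\.(\d+))?(?:,(\d+))?\]")

def parse_margins(marker, default_left=2, default_right=72):
    """Return (left, first, right); the defaults are assumed values."""
    m = SPEC_RE.fullmatch(marker.strip())
    if m is None:
        return default_left, default_left, default_right
    left = int(m.group(1))
    first = int(m.group(2)) if m.group(2) is not None else left
    right = int(m.group(3)) if m.group(3) is not None else default_right
    return left, first, right

def rewrap(marker, paragraph):
    """Wrap one paragraph to the margins given by a /#[...] marker."""
    left, first, right = parse_margins(marker)
    return textwrap.wrap(
        paragraph,
        width=right,                   # maximum line length, indent included
        initial_indent=" " * first,    # first line of the paragraph
        subsequent_indent=" " * left,  # all remaining lines
        break_on_hyphens=False,
    )

TEXT = ("I hope to find you well and expect to arrive Wednesday "
        "<i>inst</i> Eugenie asks to be remembered to all with love.")

for line in rewrap("/#[4.8,24]", TEXT):
    print(line)
```

Swapping the marker for "/#[8.4,24]" produces the hanging-indent variant, with the first line at 4 spaces and the rest at 8.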
When to Use (and not use) the /C Centering Marker
/C is intended for use with normal body text that should be centered but not rewrapped. Multi-line epitaphs, one-line aphorisms, and the title of a letter are examples of this; headings generally are wrappable and should be preceded by multiple blank lines, not enclosed in a /C block. When the /C block is within a Block Quote /#, centering is done within the margins of the Block Quote.
/C markers may be used within block quotes but not within any other markers.
Unlike Guiguts 1, /C accepts an optional column number to center lines on, e.g. /C[25] will center lines in the block on column 25.
When to Use (and not use) the /R Right-Align Marker
/R is primarily used with correspondence, as it facilitates positioning the lines in the city/date area at the top of the letter and the lines in the signature area at the bottom of the letter near the right margin. You (or people in the Formatting rounds) can indent the lines of each area to match their appearance in the original book, and Guiguts will attempt to preserve the indentations. /R also may be useful in positioning one-line credits just below illustrations, as it will place them at the right margin of the illustration's <div>. /R does not right-justify all of the lines within the block; it tries to move all of the lines the same distance towards the right, until one of them reaches the right margin. When the /R block is within a Block Quote /#, which will happen frequently, the indented right-margin of the Block Quote becomes the right-margin of the /R block.
/R[n] specifies a custom right margin, which may be to the left or right of the current default. For example, the normal right margin is at 72, but /R[68] will align the rightmost character within the /R block at column 68. n is only used in the Plain Text version of a document; Guiguts ignores it when generating HTML, and assigns a class of right to the entire block.
/R markers may be used within block quotes but not within any other markers.
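The "same distance" behaviour of /R described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not Guiguts' own code: the function name `right_align_block` is hypothetical, and the right margin is assumed to be the column where the longest line should end.

```python
def right_align_block(lines, right_margin=72):
    """Shift every line right by the same amount (hypothetical helper).

    The longest line ends at right_margin; shorter lines keep their
    position relative to it, so existing indentation is preserved.
    """
    stripped = [line.rstrip() for line in lines]
    longest = max((len(s) for s in stripped), default=0)
    shift = max(right_margin - longest, 0)  # never move lines leftwards
    return [(" " * shift + s) if s else s for s in stripped]


letter_head = ["Boston,", "  May 3rd, 1871."]
for line in right_align_block(letter_head, right_margin=72):
    print(line)
```

Passing `right_margin=68` would mimic /R[68]: the rightmost character of the block lands at column 68 instead of 72.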
Index Indent and Margins
This applies only to Plain Text, and the extra parameters described here should be added only to the Plain Text version.
The default settings for rewrapping a Plain Text Index are:
/I[8.2,72]
These values are used a little differently than in a Block Quote, because an Index may have multiple levels of entries and sub-entries:
- the first value specifies the rewrap column that will be used by all levels: any entry, at any level, whose length would cause it to go past the right margin specified by the third value will be rewrapped to the character column specified by this first value;
- the second value specifies the left margin for main entries. Each level of sub-entry will be indented two spaces further than the level above it; that additional indentation is not related to the '2' shown in the default example above, but is fixed;
- the third value specifies the right margin for the Index. It can be greater or less than 72, and any entry whose length would cause it to go past that right margin will be rewrapped to the character column specified by the first value.
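As a rough illustration of how the three values interact, here is a Python sketch. It is not Guiguts' implementation: the names `parse_index_marker` and `wrap_index_entry` are hypothetical, only the full `/I[a.b,c]` form and the bare `/I` form are handled, and a "column" is treated simply as a number of leading spaces.

```python
import re
import textwrap


def parse_index_marker(marker, defaults=(8, 2, 72)):
    """Return (rewrap_column, left_margin, right_margin) for an /I marker."""
    match = re.fullmatch(r"/I(?:\[(\d+)\.(\d+),(\d+)\])?", marker.strip())
    if match is None:
        raise ValueError(f"not an index marker: {marker!r}")
    if match.group(1) is None:  # bare /I uses the defaults
        return defaults
    return tuple(int(value) for value in match.groups())


def wrap_index_entry(entry, level, wrap_col, left_margin, right_margin):
    """Wrap one entry; each sub-entry level is indented two more spaces."""
    indent = left_margin + 2 * level
    return textwrap.wrap(
        entry.strip(),
        width=right_margin,
        initial_indent=" " * indent,
        subsequent_indent=" " * wrap_col,  # continuation lines, all levels
        break_on_hyphens=False,            # keep page ranges in one piece
    )


wrap_col, left, right = parse_index_marker("/I")
entry = ("Aerial Phenomena Research Organization (APRO), "
         "181, 219, 235-36, 275, 278")
print("\n".join(wrap_index_entry(entry, 0, wrap_col, left, right)))
```

With the defaults, the entry is too long for a right margin of 72, so its last page number moves to a continuation line that starts at the rewrap column.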
Examples of Indenting an Index
Using the defaults,
/I A Accra, Africa, 50 Adamski, G., 16, 203, 204, 278 Aerial Phenomena Group, U. S. Air Force, 2, 271, 272. _See also_ ATIC Aerial Phenomena Research Organization (APRO), 181, 219, 235–36, 275, 278 Aerospace Technical Intelligence Center (ATIC), 2, 271. _See also_ ATIC
will rewrap to:
/I A Accra, Africa, 50 Adamski, G., 16, 203, 204, 278 Aerial Phenomena Group, U. S. Air Force, 2, 271, 272. _See also_ ATIC Aerial Phenomena Research Organization (APRO), 181, 219, 235–36, 275, 278 Aerospace Technical Intelligence Center (ATIC), 2, 271. _See also_ ATIC
Using:
/I[14.4,80] Angel hair, 220–26; alleged origins of, 194, 221; arachnid, 220–24; industrial, 224 “Angels” on radar, Pl. IVc; collision course of, 153–54; defined, 151; conditions producing, 157–60, 164, 170; moisture inversion and, 151, 158–60; possible causes of, 157–58; ring, 150, 165–66; temperature inversion and, 151–52, 158–60; UFO reports based on, 5–6, 71, 72, 151–52, 155–57, 161, 164–71, 182, 190, 192, 200, 202, 204, 208, 220, 222, 230, 232, 240, 242, 250, 252, 254, 256, 258, 260, 272, 384, 292, 300 Ann Arbor, Mich., 241
(the line beginning "UFO reports" is one very long line, but your browser will rewrap it if it's too wide for your screen) will rewrap to:
/I[14.4,80] Angel hair, 220–26; alleged origins of, 194, 221; arachnid, 220–24; industrial, 224 “Angels” on radar, Pl. IVc; collision course of, 153–54; defined, 151; conditions producing, 157–60, 164, 170; moisture inversion and, 151, 158–60; possible causes of, 157–58; ring, 150, 165–66; temperature inversion and, 151–52, 158–60; UFO reports based on, 5–6, 71, 72, 151–52, 155–57, 161, 164–71, 182, 190, 192, 200, 202, 204, 208, 220, 222, 230, 232, 240, 242, 250, 252, 254, 256, 258, 260, 272, 384, 292, 300 Ann Arbor, Mich., 241
Rewrap All
Rewraps the entire document, using the margin values set in the Preferences dialog.
NOTE: Before rewrapping, save a copy of the file. If you are not satisfied with the rewrapping results, don't try to use UNDO, but revert to that saved copy.
Rewrap Selection
Rewraps the selected text (usually one or more complete paragraphs), using the margin values set in the Preferences dialog. Caution: If the selected text is not followed by a blank line (for example, if you are rewrapping only part of a paragraph), a blank line will be added to the rewrapped text.
Block Rewrap Selection
As above, but using the block quote margins.
Clean Up Rewrap Markers
Removes all of the rewrap markers from the document. It does not remove extra blank lines that may have been added to keep adjacent close/open markup separated from each other.
Once Clean Up Rewrap Markers has been run, and the result saved, it'll be more difficult to do some kinds of searches and automatic rewrapping, so this usually is one of the very last steps in preparing the Plain Text version of the book.
Convert to Curly Quotes
This converts most straight double and single quotation marks to “curly” UTF-8 opening and closing marks. It's best to use this Convertor near the end of the first stage of post-processing, after all errors have been corrected, but just before making separate copies that will become the Plain Text and HTML versions. That way, the results will be available to both versions.
In order to convert quotes in a file already converted to HTML, use Protect HTML Straight Quotes first, otherwise the tool would convert straight quotes within HTML tags. This option protects those quotes by converting them to "∮" or "∯". After you finish the conversion and correct any remaining issues (described below), use Restore HTML Straight Quotes, which will convert "∮" and "∯" back to single and double straight quotes.
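The protect/restore idea can be sketched in a few lines of Python. This is an illustration only, not the code Guiguts uses: the function names are hypothetical, a "tag" is assumed to be anything between `<` and `>`, and the mapping of "∮" to the single quote and "∯" to the double quote is an assumption read from the order in which the paragraph above lists them.

```python
import re

SINGLE_PLACEHOLDER = "\u222e"  # ∮, assumed to stand in for '
DOUBLE_PLACEHOLDER = "\u222f"  # ∯, assumed to stand in for "
TAG_RE = re.compile(r"<[^<>]*>")  # simplistic idea of an HTML tag


def protect_tag_quotes(text):
    """Replace straight quotes inside tags with placeholder characters."""
    def swap(match):
        tag = match.group(0)
        return tag.replace("'", SINGLE_PLACEHOLDER).replace('"', DOUBLE_PLACEHOLDER)
    return TAG_RE.sub(swap, text)


def restore_tag_quotes(text):
    """Turn the placeholders back into straight quotes."""
    return text.replace(SINGLE_PLACEHOLDER, "'").replace(DOUBLE_PLACEHOLDER, '"')


html = '<p class="x">"Hi," she said.</p>'
protected = protect_tag_quotes(html)
print(protected)                       # quotes in the tag are now placeholders
print(restore_tag_quotes(protected) == html)
```

Only the quotes in running text remain straight after protection, so a later curly-quote conversion cannot damage attribute values.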
The Curly Quotes Convertor does not attempt to convert single quotation marks that are preceded by a space or newline, as they might indicate omitted letters, such as the ones used in dialect or contractions such as ’Tis (the season to be jolly). Other ambiguous cases, particularly between close single quotation marks and apostrophes, may also not be converted (see below for how to find and fix these). The default behavior is therefore quite strict in not converting quotes that Guiguts is unsure of. If you wish Guiguts to convert more single quotes (and check later that it has done this correctly) you can turn off "Strict Single Curly Quote Conversion" in the Advanced tab of the Settings dialog.
Immediately after the conversion the Curly Quotes checker will pop up, listing places where straight quotes were not converted or where there are apparent errors. See the next option below for details.
Check Curly Quotes
As mentioned above, a list of quote-related queries is displayed in a checker dialog immediately after Convert to Curly Quotes is run. You may also run Check Curly Quotes at any point, before or after using Convert to Curly Quotes, or if you used a different method to do the curly quote conversion.
- SCREENSHOT: Curly Quotes Checker dialog
The curly quotes check will find and report several issues, and assist you in quickly fixing them:
- Open DQ unexpected: An open quote was found when quotes had already been opened.
- Close SQ/DQ unexpected: A close quote was found when quotes were not open.
- Close SQ/DQ at line start: Close quotes should not appear at the start of a line.
- Close SQ/DQ after space: Close quotes are not usually preceded by a space.
- Close SQ/DQ before letter: Close quotes are not usually followed by a letter.
- Open SQ/DQ at line end: Open quotes should not usually appear at the end of a line.
- Open SQ/DQ before space: Open quotes should not usually be followed by a space.
- Open SQ/DQ after letter/punc: Open quotes should not normally follow a letter or number.
- SQ/DQ not converted: A straight quote has been found.
The remedy in each of the above cases will depend on the specific error. For example, if an unexpected open quote is found, it is possible that it should be a close quote, or that there is a missing close quote earlier in the paragraph.
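Several of the context-based checks listed above can be approximated with regular expressions. The Python sketch below is a simplified illustration, not the logic Guiguts uses: it covers double quotes only (a naive single-quote version would flag every apostrophe), it omits the "unexpected" checks that need to track open/close state, and it assumes zero-based columns in the "line.column" location.

```python
import re

OPEN_DQ, CLOSE_DQ = "\u201c", "\u201d"  # curly double quotes

CHECKS = [
    ("Close DQ at line start", re.compile("^" + CLOSE_DQ)),
    ("Close DQ after space", re.compile("(?<= )" + CLOSE_DQ)),
    ("Close DQ before letter", re.compile(CLOSE_DQ + r"(?=[^\W\d_])")),
    ("Open DQ at line end", re.compile(OPEN_DQ + "$")),
    ("Open DQ before space", re.compile(OPEN_DQ + "(?= )")),
    ("DQ not converted", re.compile('"')),
]


def check_double_quotes(text):
    """Return (location, message) pairs, location given as 'line.column'."""
    reports = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for message, pattern in CHECKS:
            for match in pattern.finditer(line):
                reports.append((f"{line_no}.{match.start()}", message))
    return reports


sample = "\u201cHello \u201d she said.\nHe said \u201c\n\"plain\""
for location, message in check_double_quotes(sample):
    print(location, message)
```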
Clicking on an error message will take you to the correct point in the text file. At the top of the dialog, several buttons allow you to make common corrections:
- Changing an open quote to a close quote, or vice versa.
- Changing a straight quote to a curly quote, or vice versa.
- Swapping the positions of the quote and an adjacent space, e.g. if there is a space before a close quote, when the space should be after it.
- Deleting the incorrect space that precedes or follows a quote.
- Inserting any of the four main types of curly quote.
In addition, clicking an error message while holding down the Ctrl key (Cmd key on Macs) will do one of the first two corrections in the above list, as appropriate, i.e. either change an open to a close quote, or change a straight to a curly quote.
In the double quotes check, if a paragraph is missing a final close quote but the following paragraph has an open quote, you can choose whether or not that situation will be reported as an error or ignored (since it is common practice in many books) by checking or clearing the Allow "Next Paragraph Begins With quotes" Exception checkbox.
The View Options button allows you to control whether to show or hide messages about double or single quotes not being converted, as well as other types of double and single quote errors.
Most tags (in <angled brackets>), brackets (parentheses, square brackets, and curly braces), Block Markups, and curly quotes are used in opening-closing symmetrical pairs. When one member of a pair is missing or malformed, it's usually easy to find the error, but sometimes it can be difficult, especially when they are far apart, or two types look similar (round and curly brackets, for example, which sometimes confuse the OCR). These options can make it easier to find such errors: each one scans the entire file and lists any mismatches it finds in a checker dialog:
- SCREENSHOT: e.g. Unmatched Block Markup
Notes:
- The Block Markup option accepts nesting of Block Quotes /#...#/.
- There are situations where something is deliberately unpaired, such as a right square bracket following the attribution credit under an illustration or a left square bracket preceding Stage Directions. It is also quite common to see multi-paragraph quotes have an open quote at the beginning of each paragraph, but no close quote until the final paragraph of the quote.
- A footnote anchor such as [12] within an [Illustration] will also be listed as a possible error, although it is valid.
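For readers curious how such a scan can work, here is a minimal Python sketch for brackets. It is an assumption-laden illustration rather than Guiguts' own algorithm: each bracket type is tracked independently with its own stack, the function name is hypothetical, and columns are zero-based.

```python
CLOSER_TO_OPENER = {")": "(", "]": "[", "}": "{"}


def find_unmatched_brackets(text):
    """Return sorted (line, column, bracket) tuples for unpaired brackets."""
    open_stacks = {opener: [] for opener in CLOSER_TO_OPENER.values()}
    unmatched = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line):
            if char in open_stacks:
                open_stacks[char].append((line_no, col))
            elif char in CLOSER_TO_OPENER:
                stack = open_stacks[CLOSER_TO_OPENER[char]]
                if stack:
                    stack.pop()  # pairs with the most recent opener
                else:
                    unmatched.append((line_no, col, char))
    # Any opener still waiting on a stack never found its partner.
    for opener, stack in open_stacks.items():
        unmatched.extend((line_no, col, opener) for line_no, col in stack)
    return sorted(unmatched)


sample = "[Illustration: A cat (asleep]\nAll fine (here)."
print(find_unmatched_brackets(sample))
```

Because the scan only counts pairs, deliberately unpaired brackets such as those mentioned in the notes above would still be listed and need a human decision.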
Convert Fractions
This submenu contains entries to convert fractions to actual Unicode characters (if available) or to a mix of superscripts and subscripts, with a "⁄" (Fraction Slash) between them, simulating the appearance of fractions for which there are no Unicode equivalents. If you have selected some text before using Convert Fractions, then only the fractions within the selection will be converted. If no text is selected, fractions will be converted through the whole file. You have three conversion choices, as shown in this example:
- SCREENSHOT: Copy example from GG1 manual https://www.pgdp.net/wiki/PPTools/Guiguts/Guiguts_Manual/Tools_Menu#Convert_Fractions
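The two mechanisms described above (a real Unicode fraction where one exists, otherwise superscript and subscript digits joined by a Fraction Slash) can be sketched in Python as follows. This is a hedged illustration, not the Guiguts code: the table of precomposed fractions is a partial, hand-picked subset, the regular expression is an assumption about what counts as a fraction, and mixed numbers such as "2 1/2" are not treated specially.

```python
import re

# Partial table of precomposed Unicode fractions (illustrative subset).
PRECOMPOSED = {
    "1/4": "¼", "1/2": "½", "3/4": "¾",
    "1/3": "⅓", "2/3": "⅔",
    "1/8": "⅛", "3/8": "⅜", "5/8": "⅝", "7/8": "⅞",
}
SUPERSCRIPT = str.maketrans(
    "0123456789",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079")
SUBSCRIPT = str.maketrans(
    "0123456789",
    "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089")
FRACTION_SLASH = "\u2044"
# digits/digits, not touching other digits or slashes (so dates are skipped)
FRACTION_RE = re.compile(r"(?<![\d/])(\d+)/(\d+)(?![\d/])")


def convert_fractions(text):
    """Convert plain-text fractions in text to Unicode forms."""
    def replace(match):
        plain = match.group(0)
        if plain in PRECOMPOSED:
            return PRECOMPOSED[plain]
        numerator, denominator = match.groups()
        return (numerator.translate(SUPERSCRIPT) + FRACTION_SLASH
                + denominator.translate(SUBSCRIPT))
    return FRACTION_RE.sub(replace, text)


print(convert_fractions("Add 1/2 cup; drill a 5/16 inch hole on 3/4/1920."))
```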
This submenu contains several options to help you find and insert characters that may not easily be typed on your keyboard.
Unicode Search/Entry
In this dialog, you can search for a character if you know (part of) its name or its Unicode code point (often written as U+nnnn, where nnnn is a 4-digit hexadecimal number).
- SCREENSHOT: Unicode Search/Entry dialog
In the field at the top, you can type one or more words that make up the name of any Unicode character that you want to insert into your file. For example, if you type "acute" then click the Search button (or press Return/Enter), it will list all the Unicode characters that have the word "ACUTE" in their name, such as "ACUTE ACCENT", and "LATIN CAPITAL LETTER A WITH ACUTE". If you are only interested in those with a double acute accent, you could type "ACUTE DOUBLE" or "DOUBLE ACUTE", and characters such as "CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE" will be listed.
Instead of typing words from the Unicode name, you can also type the Unicode code point of the character you want, e.g. "2014" or "U+2014" will list the "EM DASH" character. Also, if you have copied a character from elsewhere, you can paste the character into the field and it will be listed with its code point, character name and Unicode block name.
Once a character is listed, clicking on its character, code point, or name will insert it into the current file; clicking on its block name will cause the Unicode Block dialog to be displayed, showing the whole of that block (see below).
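The same kinds of lookup can be tried outside Guiguts with Python's standard `unicodedata` module. The sketch below is only an approximation of the dialog's behaviour: the function name `search_unicode` is hypothetical, name words are matched as substrings in any order, a query of four to six hex digits is always treated as a code point, and block names are left out because `unicodedata` does not provide them.

```python
import re
import sys
import unicodedata

CODE_POINT_RE = re.compile(r"(?:U\+)?([0-9A-F]{4,6})", re.IGNORECASE)


def describe(char):
    """Return (character, 'U+XXXX', name) or None if the character is unnamed."""
    name = unicodedata.name(char, "")
    return (char, f"U+{ord(char):04X}", name) if name else None


def search_unicode(query):
    """Look up characters by pasted character, code point, or name words."""
    query = query.strip()
    if len(query) == 1:                      # a pasted character
        found = describe(query)
        return [found] if found else []
    match = CODE_POINT_RE.fullmatch(query)
    if match:                                # "2014" or "U+2014"
        code_point = int(match.group(1), 16)
        if code_point > sys.maxunicode:
            return []
        found = describe(chr(code_point))
        return [found] if found else []
    words = query.upper().split()
    if not words:
        return []
    results = []
    for code_point in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(code_point), "")
        if name and all(word in name for word in words):
            results.append((chr(code_point), f"U+{code_point:04X}", name))
    return results


print(search_unicode("U+2014"))
for entry in search_unicode("double acute"):
    print(entry)
```

The name search walks every code point, so it takes a noticeable moment; a real tool would more likely build an index once.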
Unicode Blocks
Unicode characters are grouped together in blocks, e.g. "Basic Latin" or "Combining Diacritical Marks", each containing from a few tens to a few hundreds of characters. This dialog can show any of the Unicode blocks, which is selected by using the dropdown at the top. The first entry in the dropdown is "Commonly Used Characters" which is not a Unicode block, but is a collection of some characters commonly used in our books: simple accented characters, quotes, fractions, etc. Hover over a character to see a tooltip describing its code point and name. Click the character to insert it into the file.
Typing when focused on the Unicode block dialog selects the block whose name contains the word or partial word that is typed. The string typed can be the beginning of any word in the block's title. Alternatively, if you type the start of a 4-digit Unicode codepoint, it will jump to the block that contains that character, e.g. knowing that open single quote is `U+2018`, typing `201` will jump to General Punctuation, which has the range `2000-206f`.
Normalize Selected Characters
Many Unicode characters used in our books consist of a base character with one or more accents or other diacritical marks, for example, å ('a' with ring above). There are sometimes two ways of creating such a character: either via a single Unicode character or via a base character followed by one or more combining characters. In the above example, there is a single "canonical composed" or "precomposed" Unicode character (Latin Small Letter A with Ring Above - decimal 229 / hex E5) that can be used. Alternatively, this can also be represented in a "decomposed" manner using a standard 'a' (Latin Small Letter A - decimal 97 / hex 61) followed by a combining character (Combining Ring Above - decimal 778 / hex 030A). Depending on the font you are using, these may look identical to one another.
However, there are good reasons to use the precomposed form if one exists - a precomposed form generally exists for more common combinations, but not necessarily for less common ones. The first reason is that if you have used the decomposed form, then if a reader wants to find the word and uses the precomposed characters in their search, they may not find the version with decomposed characters, depending on their browser or e-book reader (some tools may use normalization during searching to allow them to find both forms). The second reason is that the Nu HTML Checker will issue warnings about decomposed characters if an equivalent precomposed Unicode character exists.
Converting from the decomposed to precomposed form is referred to as converting to Unicode Normalization Form C, or less formally normalizing the text. This process also includes ensuring any remaining combining characters after normalization are placed in a standard order.
If you see such an error from the Nu HTML Checker, typically something like "Text run is not in Unicode Normalization Form C", or if you have been using combining characters to add accents to letters, you should check the text is normalized before submitting it. You can select the portion of text that includes the characters, then use the "Normalize Selected Characters" menu option to resolve it.
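If you want to see what normalization does, or to locate the lines that need it, Python's standard `unicodedata` module implements Normalization Form C. The helper names below are hypothetical, and this is a sketch for experimenting outside Guiguts rather than a description of how the menu option is implemented.

```python
import unicodedata


def to_nfc(text):
    """Return text converted to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def lines_needing_normalization(text):
    """Return 1-based numbers of lines that change when normalized."""
    return [
        line_no
        for line_no, line in enumerate(text.splitlines(), start=1)
        if to_nfc(line) != line
    ]


decomposed = "a\u030a"            # 'a' followed by Combining Ring Above
composed = to_nfc(decomposed)     # single character U+00E5
print(len(decomposed), len(composed))
print(lines_needing_normalization("caf\u00e9\ncafe\u0301"))
```

The two strings look the same on screen but have different lengths, which is why a reader's search for one form may miss the other.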
Compose Sequence
- SCREENSHOT: Compose Sequence
Using this menu entry, or Ctrl+I (Cmd+I on Macs), opens a small text entry dialog box, into which you can enter keyboard characters that will be converted to Unicode characters. See Compose Sequences on the Help Menu for a list of such sequences.
If the character you want is not in the list, or you already know its hex or decimal value, you can compose and insert it on the fly: press and release the Compose key to display the Compose dialog, type the hex value (or # followed by the decimal value), then click OK or press Enter. If you type a 4-character hex value, 'Compose' will insert it without waiting for you to click OK.
For example, if you type: /a into the Compose key dialog, an á will appear at the cursor in the main Guiguts window and the Compose Dialog will disappear; entering 2720 will insert a Maltese Cross: ✠, as will entering #10016 and clicking OK or pressing Return/Enter.
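The relationship between the hex and #decimal forms is plain base conversion, as this small Python sketch shows. The function name is hypothetical and only the numeric entries are modelled; the named sequences such as the one for á are not.

```python
def compose_code_point(entry):
    """Turn '2720' (hex) or '#10016' (decimal) into the character."""
    entry = entry.strip()
    if entry.startswith("#"):
        return chr(int(entry[1:], 10))   # decimal value after '#'
    return chr(int(entry, 16))           # otherwise treat as hexadecimal


print(compose_code_point("2720"), compose_code_point("#10016"))
print(int("2720", 16))
```

Both entries name the same code point, since hexadecimal 2720 equals decimal 10016.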
When processing our books, you may encounter symbols that are not available through the Compose Sequence feature, and some that are not in the extensive Unicode character set. Another way that may help you create such symbols is by combining existing normal characters with "combining" ones that are in the "Combining" blocks on the Unicode Blocks dialog. See Combining Characters for a clear explanation of how this works. Combining characters must follow the base character.
PP Workbench
This opens the online Post-Processing Workbench in your browser. The PP Workbench contains online versions of PPtext and PPhtml, which are similar to those tools within Guiguts. It also has ppsmq, which is similar to Guiguts' Curly Quote conversion tool. Finally, it has a link to ppcomp, which is a useful tool for comparing your text file to your HTML file, and can help you spot if you have accidentally made an edit in one file but not the other.