

GUIGUTS VERSION 2 MANUAL

Describes features included in release 2.0.0 (July 2025)


Introduction


Overview

Guiguts 2 is a free cross-platform tool designed to speed up and simplify every phase of post-processing an e-text. The successor to Guiguts 1, it is written in Python rather than Perl. This manual covers the current version of Guiguts 2. For brevity, Guiguts 2 is referred to below simply as "Guiguts", except where Guiguts 2 and Guiguts 1 behavior must be distinguished.

At heart, Guiguts is a simple text editor. You open a file; you scroll the text to view it; you change the text by selecting, cutting, pasting, etc., using familiar commands and keystrokes; then you save the file.

Under this modest exterior, Guiguts has many special features designed to speed your work as a post-processor, such as built-in searches for stealth scannos (the common OCR text errors), automatic moving and renumbering of footnotes, and automatic generation of HTML that complies with Project Gutenberg standards.

Purpose of this Manual

This manual's objective is to explain what Guiguts does, not how to post-process a project. For such information, please see the PP FAQ and other documentation.

The manual is organized based on the premise that there are three main phases to post-processing. The first one corrects errors, makes the text consistent, and produces a result that will be used by the other two phases. The second phase converts a copy of the first phase results into the final Plain Text that will be published. The third phase converts another copy of the first phase results into the final HTML document that will be published.

Within Guiguts, you can open a page of this manual that is relevant to the work you are doing at any time by pressing the F1 function key on your keyboard. See the Help Menu, Manual entry for more information.

Installing Guiguts 2

Summary of Post-processing Workflow

  • Before you start using Guiguts
    • on your computer, create a folder to contain everything for this project.
    • download the text and image zips for the project, and unzip them into appropriate folders.
    • prepare the illustrations, including a cover. You won't need them until you are creating the HTML version of the book, but if there are missing or unusable ones, or they don't match what you will find in the text file you downloaded, you will want to resolve those discrepancies before spending (or wasting) your time preparing the text.
      • Illustrations will look better if you calibrate your monitor and occasionally clean the screen.
      • Other people may be able to help you prepare some illustrations or covers, if you ask in the Illustrators' Forum.
  • Using Guiguts
    • open the project's text file in Guiguts.
    • configure page labels so the image names in the Rounds are matched with the page numbers printed in the book (optional, but recommended).
    • correct errors, organize the text, resolve inconsistencies in what the author wrote, what the publisher printed, what the proofreaders did, and how the formatters marked various elements and aligned tables. Generally, these steps will take at least one third of the total time you spend on the project.
    • finish preparing the text for conversion to the two final versions of the book: Plain Text and HTML.
    • make a separate copy for later use in preparing the HTML version.
    • process the current copy into the final Plain Text version.
    • open the copy you saved for use in preparing the HTML version.
    • prepare it for conversion to HTML.
    • convert it to HTML.
    • process the HTML copy into the final HTML version. Generally, these steps will take at least one third of the total time you spend on the project.
    • make sure the HTML version also works well on handheld devices.
  • Other than later corrections and improvements to what you've done, the post-processing steps after this do not use Guiguts and are not addressed here.
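
The "Before you start using Guiguts" steps can also be scripted. The following is only a minimal sketch using Python's standard library: the function name `set_up_project` and the subfolder names `text` and `images` are placeholders invented for this example, not names that Guiguts or the project downloads require.

```python
import zipfile
from pathlib import Path

def set_up_project(project_dir: Path, text_zip: Path, image_zip: Path) -> None:
    """Create a project folder and unpack the two downloaded zips into it."""
    # Subfolder names are placeholders; choose whatever layout suits you.
    for archive, subfolder in ((text_zip, "text"), (image_zip, "images")):
        target = project_dir / subfolder
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
```

Calling `set_up_project(Path("my_book"), Path("text.zip"), Path("images.zip"))` would leave the unzipped text and page images in separate subfolders of `my_book`, ready for the project's text file to be opened in Guiguts.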

Guiguts Main Window

Guiguts opens a single document window where you will do most of your post-processing. On Windows at least, it also opens a command-line window that is used only by the Python interpreter running Guiguts. Normally, you can ignore that command-line window and keep it minimized.

Guiguts edits only one document at a time. However, you can launch multiple copies of the Guiguts program and open a different document in each one. The title bar of the window shows the current file. You view, edit, and scroll the file in ways that are familiar to anyone who has used a text editor. You can resize, minimize, or move the window on the desktop with the mouse, using the normal methods of your operating system.

When you want to close Guiguts, do it from the document window. That also closes the command-line window on Windows.

NOTE: Don't close the command-line window unless Guiguts stops working. If you close the command-line window, the Guiguts document window will close immediately, without any chance to save the changes you might have made in it.

    • SCREENSHOT: Main Window


Menu Bar

The links in the Table of Contents lead to explanations of each function of each menu. Some of those functions are also available from the status bar and through keyboard shortcuts.

On most systems, in addition to clicking on menus to use the various functions available, you can also navigate the menus using a keyboard. Some users find they can activate certain options more quickly using this method once they have learnt the keystrokes necessary.

On Windows and Linux systems, pressing the Alt key together with a specific letter from the menu name will open that menu. For example, Alt+S will open the Search menu. Once the menu is open, you will see that each option has an underlined character, such as the C in Find Next Proofer Comment in the Search menu. This indicates that, while the menu is open, pressing C will activate the Find Next Proofer Comment function in the same way as clicking on the menu entry would.

On Mac systems, the way to open the menu is slightly different. Press the F10 function key, which should activate the menu bar, then press S for Search, then press C for Find Next Proofer Comment. If you have configured your function keys to activate other system functions, such as mute/unmute for F10, then you can combine the Fn modifier key with F10 to activate the menu bar, e.g. Fn+F10 S C will select Find Next Proofer Comment.

Note that the keystroke required to select a menu option will always be the underlined character in the text of the option. This is independent of, and is often different to, the letter that is used with the Ctrl key as an instant keyboard shortcut to the option. For example the shortcut Ctrl+f instantly brings up the Search and Replace dialog, whereas using keyboard navigation of the menus would require Alt+S R or F10 S R, since R is the underlined character in "Search & Replace..."

Status Bar

The status bar is always below the main text window.

      • SCREENSHOT of status bar

The fields of the status bar are buttons, most of which also display current status information. Hovering over a button shows a tooltip describing what clicking it will do, including other operations available by combining a modifier key (Shift, Ctrl, Cmd, etc.) with clicking. From left to right, they are used as follows:

  • "L:1/18001 C: 0": Line number and column of the character to the right of the insertion point. Left-click in this box to open a go-to-line dialog. Shift-click to show or hide the display of line numbers down the left edge of the document. Shift-right-click to show or hide the display of column numbers across the top of the document.
  • "Img:001": Current page-image number, corresponding to the page image file nnn.png. Left-click to open a go-to-image dialog.
  • "<": Button to go back one page.
  • "See Img": The See Image button opens the current page image in the image viewer. Shift-click to toggle Auto Image, which automatically shows the page image in an image viewer any time the cursor moves to a different page.
  • ">": Button to go forward one page.
  • "Lbl: Pg 1": Current page label, once labels have been set up. Left-click (Mac: Ctrl-click) to open a go-to-page dialog. Shift-click to open the Configure Page Labels dialog.
  • "No Selection": Displays the start and end points of the current selection, or No Selection. Left-click to restore the last selection. Shift-click to toggle between regular selection and column selection.
  • "Lang:en": Shows the current language (English in this example). Left-click to change languages.
  • "Dec 10: Hex 000A": Displays the Unicode code point of the character at the insertion point, or the selected character if exactly one character is selected, and optionally the name of the character. Left-click to toggle the display of the name of the character.
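
As a rough illustration of what the last field reports, the sketch below formats a character's code point in the same "Dec ...: Hex ..." shape using Python's standard library. The function name `describe_char`, the `<unnamed>` placeholder, and the exact spacing are assumptions made for this example; it is not Guiguts' own code.

```python
import unicodedata

def describe_char(ch: str, show_name: bool = False) -> str:
    """Format one character's code point, optionally with its Unicode name."""
    code = ord(ch)
    label = f"Dec {code}: Hex {code:04X}"
    if show_name:
        # Control characters such as newline have no Unicode name,
        # so fall back to a placeholder instead of raising ValueError.
        label += " " + unicodedata.name(ch, "<unnamed>")
    return label

print(describe_char("\n"))                       # Dec 10: Hex 000A
print(describe_char("\u00e9", show_name=True))   # Dec 233: Hex 00E9 LATIN SMALL LETTER E WITH ACUTE
```

The example value shown in the status bar above, Dec 10: Hex 000A, is the newline character, which is what sits at the insertion point when the cursor is at the end of a line.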

Tool Bar and Message Bar

The tool bar and message bar are always at the bottom of the main text window.

The tool bar contains commonly used buttons: Load, Save, Undo, Redo, Cut, Copy, Paste, Search/Replace, Go Back, Go Forward, and Help.

The message bar displays the latest message for several seconds. An example is a message reporting how many fixes were made using Fix All in a checker dialog.

      • SCREENSHOT of tool bar and message bar

Tear-Off Menus

All menus display a dashed "perforation" line at the top:

      • SCREENSHOT of tear-off

It shows that the menu can be "torn off" by clicking or clicking-and-dragging on the "perforation." The torn-off menu is placed on the desktop as an independent menu / small dialog. It can be moved to any convenient location and resized like any other window. (The example below shows it on top of the editing window, but you'll want to move it out of the way.) The original drop-down menu still can be used in the normal way.

      • SCREENSHOT of torn-off menu

Torn-off menus disappear when you close them or when Guiguts terminates. Menu status is not saved; you must tear them off afresh each time Guiguts launches.

Image Viewer

Guiguts has its own built-in Image Viewer, which can show the scan image corresponding to the text being edited. This means that the user does not need to configure an external image viewer or editor to work with Guiguts (though you can configure Guiguts to use one via the Settings dialog if you prefer). The Image Viewer can either be floating in its own window or "docked" as part of the main window. If docked, the text and image are separated by a splitter bar which can be dragged left or right to adjust the proportion of the screen given over to the image. When undocked, the user is responsible for sizing and positioning the floating Image Viewer at a convenient place on-screen. The user can either specifically request to see the image for the current page, or can turn on "Auto Image" so that whenever the current insert point in the text moves to a new page, the image for that page will be displayed.

    • SCREENSHOT Image Viewer

The image viewer has several control buttons:

  • "<"/">" show the previous or next image file, in alphabetical order, relative to the one being shown; usually this will correspond to the previous or next page. This can be useful if you want to look at the scan for the previous or next page while you are working on a page. If Auto Img is currently turned on, it will be paused while you view other pages using this method. When you move the mouse back out of the image viewer to return to editing, Auto Img will re-load the correct image for the current page, flashing the border if you have enabled that feature in the Preferences dialog.
  • "+"/"-" zoom the image in and out. You can also use keyboard shortcuts Cmd/Ctrl with the plus and minus keys to zoom, or Cmd/Ctrl with the mousewheel if you have one.
  • "📂" allows you to load any image into the image viewer, whether a scan png file, or jpeg illustration. The "<"/">" buttons described above will then show the previous or next images alphabetically.
  • "↺" rotates the image by 90 degrees, for example if a table or some text is rotated sideways in the book. The rotation for that page will be remembered in the project's json file when the file is saved.
  • "Fit←→"/"Fit↑↓" autofit the image to the width or height of the image viewer. You can also use Cmd/Ctrl with the zero key to fit to height.
  • "Invert" will invert the colors in the image, so that a black on white scan will become white on black instead. This may be more comfortable if you prefer a dark theme.
  • "Dock" controls whether the Image Viewer is docked or floating in its own window. Equivalent to Dock Image in the View menu.
  • "×" closes the Image Viewer, equivalent to Hide Image in the View menu.
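
The "<"/">" behavior, stepping to the alphabetically adjacent file, can be pictured with a short sketch. The helper name `neighbor_image`, the set of file extensions, and the choice to stop at either end of the list are assumptions made for illustration; Guiguts' actual logic may differ.

```python
from pathlib import Path
from typing import Optional

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed set of image types

def neighbor_image(current: Path, step: int) -> Optional[Path]:
    """Return the file `step` places from `current` in alphabetical order."""
    images = sorted(
        p for p in current.parent.iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    index = images.index(current) + step
    # Assumption: stop at the first and last image rather than wrapping.
    return images[index] if 0 <= index < len(images) else None
```

With page scans named 001.png, 002.png, and so on, `neighbor_image(path, 1)` returns the next page's scan. If an illustration such as a JPEG was loaded instead, the neighbors are whatever files sort next to it in that folder, which is why the result only "usually" corresponds to the adjacent page.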

Dialogs

Most tools within Guiguts will pop up a smaller window with controls that are specific to that tool. Such windows are called dialogs, and may contain labels, buttons, check boxes, fields to type numbers or words into, etc. There may also be a list of messages that can be acted upon.

Any dialog in Guiguts can be "pinned". This means that it will remain in front of the main window (not available on Linux). When pinned, a pin is shown in the title of the dialog. The current dialog may be pinned using Pin/Unpin Dialog in the View menu, or by right-clicking the dialog in an area where right-click is not normally used, i.e. most of the dialog except a list of messages. When you unpin a dialog, the title bar will say that it will be unpinned when it is next closed. Pinned dialogs are linked to the main window, so are raised, lowered, or iconized with the main window.

Where to go from Here

The Table of Contents at the top of this page is approximately in the sequence many people use when post-processing, and it has links to explanations of every menu and function in the current version of Guiguts.

Known Issues

The items listed here are known bugs or restrictions that are not planned to be fixed, usually because it is not possible/practical to do so.

Minor macOS user interface issues

  • On macOS, font size changes are not reflected in each dialog, e.g. Unicode Blocks, until the dialog is brought to the front and receives focus.
  • On macOS, using the green "gumdrop" Window button to go "full-screen" causes some unexpected behavior and is not recommended. Using Option+green does "maximize" instead, which works fine. An alternative way to maximize is to double-click the title bar.
  • On macOS, column selection of non-breaking spaces may not work as expected.
  • On macOS, changing tabs in the Preferences dialog with the tab and arrow keys rather than the mouse may lead to the tab not updating until the mouse is moved or clicked on the tab. Changing tabs by mouse works fine.

Regex issues

  • When using a regex in the Search dialog, \b represents the boundary between a word and non-word characters, so it matches the beginning and end of a word. Since regexes consider underscore to be a word character, searching for \bdog\b would not match the word "dog" in _dog_. In most cases, instead of using \b, you can just check the Whole Word box, which does not consider underscore to be a word character. A Whole Word search for dog will therefore find the word "dog" in _dog_.
  • In regexes, avoid combining ^ and $ (which are intended for single-line matches) with \n (or \s, which also matches \n), which is suited to multiline matches, particularly if this might result in zero-length matches. Also note that expressions like [^a-z] match newline characters in Guiguts 2 but not by default in Guiguts 1. If you don't want to match newline characters, use [^a-z\n].
  • Occasionally, a regex search may "time out" and an error message may appear. If this happens, try simplifying or adjusting the regex, then search again. Alternatively, try adjusting flags such as whether to match case, if they are not important to how your regex works. Regexes that match multiple lines appear to be more prone to timing out. An example is /#((\n|[^#])+?)#/. Since [^#] matches newline characters, this can be simplified to /#([^#]+?)#/.
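
The three points above can be reproduced outside Guiguts. The sketch below uses Python's standard `re` module purely for illustration; the Search dialog's own engine and flags may differ in detail. The `whole_word` helper is hypothetical and is not how Guiguts implements the Whole Word box; it only mimics the described behavior of not treating underscore as a word character.

```python
import re

text = "a _dog_ and a dog\nNext #note\nspans lines# end"

# \b treats underscore as a word character, so _dog_ is not matched.
print(re.findall(r"\bdog\b", text))  # only the bare "dog" is found

# Hypothetical stand-in for a Whole Word search: letters and digits
# count as word characters, underscore does not.
def whole_word(word: str) -> re.Pattern:
    return re.compile(rf"(?<![^\W_]){re.escape(word)}(?![^\W_])")

print(len(whole_word("dog").findall(text)))  # both occurrences are found

# A negated class matches newlines unless \n is excluded explicitly.
print(re.findall(r"[^a-z]+", "ab\ncd"))    # the newline is matched
print(re.findall(r"[^a-z\n]+", "ab\ncd"))  # nothing is matched

# Both patterns capture the same multi-line span; the second is simpler.
long_form = re.search(r"#((\n|[^#])+?)#", text)
short_form = re.search(r"#([^#]+?)#", text)
print(long_form.group(1) == short_form.group(1))
```

The simplified pattern avoids the alternation inside a repeated group, which is the kind of construct that tends to make a regex engine do more backtracking on long multi-line input.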

Undo issues

  • In general, the undo mechanism is unable to correctly undo the positioning of "marks" in the file. Marks are used to record the position of page breaks and bookmarks, as well as the start and end of problems found by tools such as Jeebies and Proofer Comments checks. Specific examples below.
    • As in Guiguts 1, Search/Replace attempts to preserve the position of page breaks. However, if you then use Undo, the page breaks could move. This is unavoidable. It only applies when a page break is enclosed within the matched text, which normally means a multi-line search string. Before doing a complex multi-line Search/Replace, it is therefore recommended that you save the file and, instead of undoing, revert to the saved version.
    • When using a checking tool, such as Proofer Comments, if an error is fixed/deleted using the buttons in the dialog, then Undo is used to undo that change, Guiguts may lose track of which piece of text the error refers to (the spotlighted text). Re-attempting a fix may therefore do nothing or work incorrectly. Under these circumstances, re-running the tool will refresh Guiguts' records of where remaining problems are located. Some tools will automatically clear their display and prompt you to refresh the list if you use Undo or Redo.
    • Similarly, when using Illustration Fixup to move illustrations while preserving the illustrations' page numbers, using Undo can leave a moved page break position behind. Instead of using Undo, revert to the previous saved version.

Minor display issues

  • The "Cursor Line" (i.e. where the insert cursor currently is) is highlighted if enabled in the Preferences dialog. If the text window is split, there are two "cursor lines", one for the location of the cursor in each window. This highlighting is shown in both parts of the split window, even though in one of those parts the cursor may be somewhere else. When focus is switched between the parts, the cursor line highlighting may therefore appear to move. This is solely a visual issue and has no effect on the behavior of editing features in Guiguts.
  • The accents on certain characters, especially Greek capital letters such as , may be displayed to the left of the base character. This behavior is font-dependent, and happens with DP Sans Mono and DejaVu Sans Mono. When such a character appears at the start of the line, one or more accents may be clipped and hence not visible. It is still possible to tell which character is at the start of a line by placing the cursor at the start of the line, or selecting the first character, and looking at the name of the character in the status bar. If you are working on accented Greek, and the accent clipping is causing difficulties, another solution is to temporarily use a different font. Depending on your system, one of the following may work well: Noto Sans Mono, Lucida Sans Typewriter (not on macOS), or Courier New (caveat: Noto Sans Mono appears to handle this by making these characters double-width).
  • When making a column selection, if the mouse is moved outside the text window, there may be some flickering of the selected region. In addition, if the mouse button is released outside the text window, the program will not detect this and will continue to slowly extend the selection. Click inside the text window once more to stop this.