PPTools/Guiguts/Spellcheck

Spellcheck

Spellcheck is best applied after you have

as these steps remove many trivial mistakes that would turn up as spelling errors.

The following process is the traditional spellcheck, in which you check words in document sequence. For a different, and possibly faster spellcheck process, see the word frequency section.

When you are ready to begin spellchecking, use Tools>Spell Check... or click the spellcheck button in the toolbar. Guiguts saves the document at this time and invokes the Aspell program to spell-check the document. (As of 2006, Guiguts is compatible only with Aspell version 0.5.x.)

When Aspell completes, Guiguts opens the spellcheck dialog:

[Image: the Guiguts spellcheck dialog]

The word in the top field (here, knes) is a word in the document that is not found in the Aspell dictionary or the project dictionary. Just above it is the count of how many times this word appears in the document. (The count is available only if the Word Frequency routine has been run beforehand.) The first or only use of the word is highlighted in search-orange in the document window, so you can see it in context.
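To make the occurrence count concrete, here is a minimal Python sketch of counting how often each word appears in a text. It is not how Guiguts or its Word Frequency routine is implemented; the word-splitting rule and the names WORD_RE and word_counts are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally joined by internal
# apostrophes or hyphens. Guiguts' own word-splitting rules may differ.
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def word_counts(text: str) -> Counter:
    """Count each distinct word in the text, case-sensitively."""
    return Counter(WORD_RE.findall(text))


sample = "He fell on his knes. Both knees hurt, but the left knee hurt more."
counts = word_counts(sample)
print(counts["knes"])  # prints 1: the suspect word occurs once
```

A count of 1 suggests Change is enough; a higher count means you should decide between Change and Change All.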

The text in the second field (here, knees) is Aspell's best guess as to a correct spelling. This text will replace the found word if you click the Change or Change All button. Below it is a list of other close matches from the dictionaries. You can move any of these to the Replacement Text field by double-clicking it.

Examine the word in context and decide what to do:

The word is an error. For example, knes was an uncaught scanno for knee. You may have to look at the page image to make sure of the author's intent and the proper correction. Put the correct spelling in the second field—in the example, double-click knee in the list. If the word appears only once, or if it might be correct in another context, click Change. The word is replaced and the next suspect is displayed. If the word appears more than once and would always be wrong, click Change All. (It's best to change the word using the Change buttons rather than by directly editing it in the document.)

The word is a valid English word. If you think the word is valid, check a dictionary (for example, Merriam-Webster Online; on the Mac, a comprehensive dictionary is built into the Dashboard). If the dictionary confirms the word, click Add To Aspell Dic. In a recent small project the words cresset and tufa came up in this way.

The word is valid in its context. Aspell questions proper nouns, archaic spellings, technical terms, and words from languages other than the one for the dictionary in use. There are two possibilities. If you are sure the word is valid everywhere it appears in the book, click either Skip All or Add To Project Dic (which in fact have the identical effect). If you are sure the word is valid in its visible context, but it conceivably might be invalid somewhere else, click Skip.

For example, take a book that has occasional references to French writers and text. When Aspell stops on the name Rochefoucauld—and after you have really looked to make sure it wasn't mis-scanned as Rochcfoucauld—you can confidently click Add to Project Dic. On the other hand, if Aspell stops on sur or sa in the context of a quotation from the French, you should only click Skip. The same letters, in the context of English text, could be scannos for sun or so, and you want to view every occurrence.

You aren't sure. Click Skip. Run spellcheck again later and the same word will come up again.

Usage hint: With practice, you can decide what to do with a word very quickly, but you must not let yourself decide too quickly. If the book contains many proper nouns, Latin tags, archaisms, etc., you may start clicking Add to Project Dic so fast that you click right past a real error. You must really look at every word Aspell presents. (Scannos can appear in Latin or archaic terms, too!) If your "click rate" exceeds one click per second, you are probably going too fast.

Hot-keys

The following key-equivalents are available while the spellcheck window has the keyboard focus:

ctl-a Add word to Aspell dictionary.
ctl-p Add word to project dictionary.
ctl-i Skip All ("ignore").
ctl-s Skip.

Stopping and Restarting

You can break off spellcheck at any time and start again later; just close the dialog window. When you later restart spellcheck, it resumes with the first uncorrected word. Words you have skipped will again be found as errors, and you will have to skip them again.

You can also click Set Bookmark before closing the dialog window. When you next open spellcheck in the same document, click Resume @ Bookmark. Checking begins at the bookmark, and you do not have to skip over the same skipped words.

Checking Part of the Document

To check only part of the document, select that part before you open the spellcheck dialog. Checking is confined to the selection. After checking has started, click in the document to clear the selection so you can see the found words.

Aspell and Unicode Text

Aspell does not properly handle Unicode characters. If your book contains Greek, Cyrillic, or other UTF-8 letters, Aspell will work properly on the Latin-1 text. However, when it reaches a word containing non-Latin-1 letters, it will detect the word as an error but will not display it, nor will the word be highlighted in the document window. When Aspell stops on such an "invisible" error, just keep clicking Skip until it emerges from the stretch of Unicode text and highlights a normal error again.
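If you want to know in advance where these "invisible" errors are likely to occur, you can list the words that contain characters outside Latin-1 (code points above U+00FF). The following Python sketch is a standalone helper, not part of Guiguts or Aspell; the function name non_latin1_words and the simple letters-only word rule are assumptions.

```python
import re

# Assumption: a word is a run of letters. This is only an approximation
# of how Aspell and Guiguts split text into words.
WORD_RE = re.compile(r"[^\W\d_]+")


def non_latin1_words(text: str) -> list[tuple[int, str]]:
    """Return (line_number, word) for each word with a character above U+00FF."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in WORD_RE.findall(line):
            if any(ord(ch) > 0xFF for ch in word):
                hits.append((lineno, word))
    return hits


# Greek and Cyrillic words are written with escapes so the sample is unambiguous.
sample = (
    "The word \u03bb\u03cc\u03b3\u03bf\u03c2 means reason.\n"
    "They met in a caf\u00e9 in \u041c\u043e\u0441\u043a\u0432\u0430.\n"
)
for lineno, word in non_latin1_words(sample):
    print(lineno, word)
```

Accented Latin-1 letters such as the é in café are not reported, because Aspell handles them normally.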

Changing the Main Dictionary

Aspell is usually installed with more than one main dictionary, including at least three for variants of English. (A book published in England will produce many fewer errors when checked with the EN_GB dictionary than with the EN_US one!) To use a different dictionary, click the Options button in the spellcheck window. This opens a dialog that lists all available dictionaries. Double-click one to move it to the Current Dictionary field and click Close. The spellcheck process is restarted with the new dictionary. (Caution: the restart can take several seconds, during which Guiguts appears to be hung.)

The same dialog can be used to locate a different executable program for Aspell (normally you set the path to the executable during Setup).

A set of four buttons at the bottom of the dialog specifies how generous Aspell should be in selecting similar words to display as possible replacements. The Ultra Fast setting will suggest only a few, very similar words. The Bad Spellers setting will suggest many words. This choice might have had a performance impact once, years ago on a time-sharing system. On today's personal machines, there is no detectable speed difference; you might as well choose Bad Spellers to see the largest set of suggestions.
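For readers who want to try the same settings outside Guiguts, the sketch below builds an Aspell command line from a dictionary name and a suggestion setting. The Aspell options --lang and --sug-mode and the mode names ultra and bad-spellers are real; the mapping from button labels to modes, the use of --lang rather than another dictionary option, and the function name aspell_command are assumptions. The exact command Guiguts issues is not documented here.

```python
# Assumed mapping from the two button labels named above to Aspell's
# suggestion modes. Aspell also offers "fast" and "normal".
SUG_MODES = {
    "Ultra Fast": "ultra",
    "Bad Spellers": "bad-spellers",
}


def aspell_command(lang: str, button: str, aspell_path: str = "aspell") -> list[str]:
    """Build an argument list for running Aspell in pipe mode (-a)."""
    return [
        aspell_path,
        "-a",                                # pipe mode: words in, suggestions out
        f"--lang={lang}",                    # main dictionary, e.g. en_GB or en_US
        f"--sug-mode={SUG_MODES[button]}",   # how generous the suggestions are
    ]


print(" ".join(aspell_command("en_GB", "Bad Spellers")))
```

The list is only built here, not executed. To see which dictionaries are installed, the command aspell dump dicts can be run in a terminal.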

Using the Project Dictionary

Because of the way that Guiguts implements spellcheck, there is no practical difference between the "Skip All" and "Add To Project Dic" buttons. Both result in adding the current word to the project dictionary, so that it will not be shown if you run spellcheck again.

The Project Dictionary is a text file that Guiguts writes in the same folder as the document file. It has the same filename as the document and the suffix .dic. You can edit it, for example, to remove a word that you added in error. Or you can remove all words from the project dictionary by simply deleting the file.
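As an illustration of the file naming described above, this Python sketch locates the project dictionary for a document and deletes it to start over. It assumes that "the same filename with the suffix .dic" means the document's extension is replaced (mybook.txt becomes mybook.dic); check your project folder to confirm. Both function names are hypothetical, and the sketch does not read or edit the file, since its internal format is not described here.

```python
from pathlib import Path


def project_dictionary_path(document: Path) -> Path:
    """Return the assumed location of the project dictionary.

    Assumption: the document's extension is replaced by .dic,
    so books/mybook.txt maps to books/mybook.dic.
    """
    return document.with_suffix(".dic")


def reset_project_dictionary(document: Path) -> bool:
    """Delete the project dictionary if present; return True if a file was removed."""
    dic = project_dictionary_path(document)
    if dic.exists():
        dic.unlink()
        return True
    return False


print(project_dictionary_path(Path("books/mybook.txt")))
```

Deleting the file means every word you previously added with Skip All or Add To Project Dic will be questioned again on the next run.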

Many projects come with a list of "good" words: words the proofers have noted as being correct, although they may fail a routine spellcheck. If your book has a file named good_words.txt you can use this as the basis for a project dictionary with the button "Add Good Words to Proj. Dic.". WARNING: you should first check this file for spelling mistakes such as multiple versions of the same word.

When you open your book and begin spellcheck, none of the good words should appear as errors.
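The check recommended in the warning above can be partly automated. The following Python sketch groups entries of a good-words list that differ only in capitalization, hyphens, or apostrophes, so you can see at a glance which words appear in multiple versions. It assumes good_words.txt holds one word per line in UTF-8; the function names load_words and variant_groups are hypothetical, and the sketch does not replace reading the list yourself.

```python
from collections import defaultdict
from pathlib import Path


def load_words(path: Path) -> list[str]:
    """Read a word list. Assumption: one word per line, UTF-8 encoded."""
    return [line.strip() for line in path.read_text(encoding="utf-8").splitlines()]


def variant_groups(words: list[str]) -> list[list[str]]:
    """Group words that match once case, hyphens, and apostrophes are ignored."""
    groups = defaultdict(set)
    for word in words:
        word = word.strip()
        if not word:
            continue  # ignore blank lines
        key = word.casefold()
        for ch in ("-", "'", "\u2019"):
            key = key.replace(ch, "")
        groups[key].add(word)
    # Only groups with more than one spelling need a human decision.
    return [sorted(group) for group in groups.values() if len(group) > 1]


sample = ["to-day", "today", "To-day", "cresset", "tufa"]
for group in variant_groups(sample):
    print(", ".join(group))
```

A reported group is not necessarily an error: a book may legitimately use both to-day and To-day. The point is to look at each group before adding the whole list to the project dictionary.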