PPTools/Guiguts/Guiguts 2 Manual/Exporting Aspell Dictionaries

Exporting Aspell Dictionaries for Use with Guiguts 2's Spelling checker

Many language-specific dictionaries are available for the Aspell Spelling Checker. They are in a format used by Aspell, but can be exported into one-word-per-line wordlists. Those lists can be used by Guiguts' own spelling checker. This page contains instructions for doing this on various platforms.

These instructions assume that you have already installed Aspell on your system and added the language dictionaries you use; without Aspell, you cannot export dictionaries. For Aspell 0.5 program and dictionary installation instructions, please see Windows Aspell. If you use Guiguts on Linux or Mac, you can use the current version of Aspell (0.60.8), and you probably already do.

Exporting Aspell Dictionaries in Windows

  1. right-click the Windows icon in the bottom-left corner of your screen
  2. click Run
  3. type: cmd and press ENTER (or click OK) to open a command-line window
  4. type: cd c:\program files (x86)\aspell\bin and press Enter to navigate to C:\Program Files (x86)\Aspell\bin, which is where the Aspell executable program is stored
  5. type: aspell -d xx dump master | aspell -l xx expand > %HOMEPATH%\Documents\GGprefs\dict_xx_user.txt and press Enter
    • where xx is a valid language code, e.g., it for Italian
  6. wordlists exported from Aspell 0.5 will be encoded in ANSI, not in UTF-8, so you may have to convert the encoding. Notepad++ can do this for you:
    1. use the normal Windows File Explorer (not a command-line window) to find the %HOMEPATH%\Documents\GGprefs\ folder
      1. you can start File Explorer by double-clicking "This PC" or by right-clicking the Windows icon and then clicking "File Explorer" on the pop-up menu
      2. copy and paste %HOMEPATH%\Documents\GGprefs\ into the File Explorer box containing the current folder name (e.g., "This PC") and press Enter
      3. the files in the GGprefs folder should appear
    2. open each dict file in Notepad++
    3. click the "Encoding" tab on its Menu bar
    4. if "ANSI" is marked, click "Convert to UTF-8", then save and close the dict file

NOTE: The vertical line after "master" is the vertical bar; what looks like a vertical line just before the second xx is the lowercase letter el, preceded by a dash.

The selected dictionary, as a wordlist, will be added to the GGprefs folder and will then be available to Guiguts' Spelling tool. You can repeat the last two steps of the procedure ("dump" and "convert to UTF-8") for other Aspell dictionaries installed on your system. Files in the GGprefs folder will be available to future versions of Guiguts with no further intervention on your part.
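The encoding check in step 6 can also be scripted. The following is a minimal Python sketch, not part of Guiguts or Aspell. It assumes Python 3 is installed and that "ANSI" means Windows code page 1252, which is usual on Western European systems but may differ on yours. The function name convert_to_utf8 is made up for this illustration.

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding="cp1252"):
    """Rewrite the file at `path` as UTF-8 unless it already is valid UTF-8.

    Returns True if the file was converted, False if it was left untouched.
    `source_encoding` is an assumption: cp1252 is a common "ANSI" code page.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already UTF-8 (plain ASCII counts as UTF-8 too)
    except UnicodeDecodeError:
        pass
    # Raises UnicodeDecodeError if the file contains bytes that are
    # undefined in the assumed source encoding.
    text = raw.decode(source_encoding)
    # Write bytes directly so that line endings are left exactly as they were.
    Path(path).write_bytes(text.encode("utf-8"))
    return True
```

For example, convert_to_utf8(Path.home() / "Documents" / "GGprefs" / "dict_it_user.txt") would process an Italian list, assuming your home folder is the one %HOMEPATH% refers to. Keep a copy of the original file until you have confirmed that accented words look right.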


Exporting Aspell Dictionaries on Linux or Mac systems

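The procedure is similar to that described for Windows above, but use the following command in a terminal to extract the word list (shown here for German, language code de):

aspell -d de dump master | aspell -l de expand | tr ' ' '\n' | sort --ignore-case

The command prints the word list to the terminal; redirect its output to a file named dict_de_user.txt in your GGprefs folder.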
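The last two stages of the command above put each word on its own line and sort the result without regard to case. As an illustration of what they do, here is a minimal Python sketch under stated assumptions: the function name to_wordlist is hypothetical, and, unlike the command above, the sketch also drops duplicate words.

```python
def to_wordlist(expanded_text):
    """Turn whitespace-separated words into a sorted list, one entry per word.

    Duplicates are removed (a choice made for this sketch), and sorting
    ignores case; ties are broken by the exact spelling so the order is stable.
    """
    words = set(expanded_text.split())
    return sorted(words, key=lambda word: (word.casefold(), word))


# Example: two lines of expanded output, with one word repeated.
sample = "Haus Häuser\nHaus Hauses apfel"
print("\n".join(to_wordlist(sample)))
```

The example prints apfel, Haus, Hauses and Häuser, each on its own line.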


Latin Dictionary for Spell Query

If you have access to Aspell 0.60.8 and its Latin dictionary, you can use the above procedure to export a Latin wordlist (language code: la). Otherwise, you may be able to obtain a smaller (almost 26,000 words) list using the following procedure:

  1. use your Browser to access THIS page, which is part of Wikimedia
  2. set Format to "Plain Text" and Sort to "default sort"
  3. leave the other settings as-is
  4. click the Do it! button
  5. after a few seconds, a wordlist will appear in the same Browser window
  6. click in it, Select All (ctrl+A), then Copy (ctrl+C)
  7. open a text editor on your computer (Notepad++, even Guiguts itself)
  8. paste the copied list into the editing window (ctrl+V)
  9. save the file. Its name should be dict_la_user.txt and it should be placed in the same %HOMEPATH%\Documents\GGprefs\ folder as the other wordlists you have exported from Aspell.
  10. as with Aspell 0.5 exported wordlists, check the encoding and convert to UTF-8 if necessary
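Before using a pasted or exported list, it may help to confirm that it really is UTF-8 with one word per line. The following Python sketch shows one possible check. It assumes Python 3 is available, and the function name check_wordlist is invented for this illustration; it is not a Guiguts feature.

```python
from pathlib import Path


def check_wordlist(path):
    """Return a list of problems found in a wordlist file (empty if none)."""
    raw = Path(path).read_bytes()
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # The file is probably still in an "ANSI" encoding.
        return [f"not valid UTF-8 (first bad byte at offset {err.start})"]
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line.split()) > 1:
            problems.append(f"line {number} holds more than one word: {line!r}")
    return problems
```

An empty result means neither problem was found; it does not say anything about whether the words themselves are correct.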


NOTE: Until we have determined the copyright and distribution status of these wordlists, they should not be shared with others; instead, refer people to this page of instructions so they can obtain the wordlists themselves.