User:Theshriek/my process for CP and PM

From DPWiki

My process for CP/PM:

I use this process for CPing because I do not have ABBYY FineReader (or any other OCR software).

I follow the steps given to me by user Netsirk021.

  • On TIA, download the ABBYY .gz file and unzip it into the work folder.
  • Open Guiguts and go to File > Content Providing > Import TIA Abbyy OCR file. Find the file you downloaded and save it as a .txt file.
  • Run Tools > Basic Fixup (with pretty much everything checked, but it's up to you how much you want to
  • Tools > Remove End of Line Spaces.
  • Find and replace two single quotes in a row with a double quote, and find ^ and delete it altogether.
  • Find and replace double quote plus space. Replace with just a double quote.
  • Once you have the text file as a whole where you want it, you should then use the GG feature in File > Content Providing > Export as Prep Text Files into a subfolder called "textw" to split the master text file into separate text files for each page.
  • Delete all the blank pages at the start and end.
  • From the Content Providing menu select the following:

  1. Run Dehyphenator
  2. Filter File
  3. Fix Common English Scannos
  4. Add [Blank Page] to Empty Pages
  5. Remove Headers/Footers
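
For anyone curious what the manual cleanup steps in the list above (Remove End of Line Spaces and the three find-and-replace passes) amount to, here is a minimal Python sketch. It is not part of Guiguts and is not how Guiguts does it internally; the function name `clean_ocr_text` and the sample text are made up for illustration. It applies every replacement globally, which is an assumption on my part: in Guiguts you can review each match instead.

```python
import re


def clean_ocr_text(text: str) -> str:
    """Hypothetical helper mirroring the manual find-and-replace steps."""
    # Remove spaces and tabs at the end of every line.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # Two single quotes in a row become one double quote.
    text = text.replace("''", '"')
    # Stray carets are deleted altogether.
    text = text.replace("^", "")
    # A double quote followed by a space becomes just the double quote.
    # Caution: applied globally, this also removes the space after a
    # closing quote, so matches are worth reviewing by hand.
    text = text.replace('" ', '"')
    return text


if __name__ == "__main__":
    sample = "He said, '' Hello^ there.''  \nNext line. \n"
    print(clean_ocr_text(sample))
```

Running this on the whole master text file before splitting it into pages would be the closest match to the order of the steps above, but I would still check the result in Guiguts afterwards.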