User:Theshriek/my process for CP and PM
From DPWiki
My process for CP/PM:
I use this process for CPing because I do not have ABBYY FineReader (or any other OCR software).
- Follow steps 1-6 of User:Monicas wicked stepmother/PM process
- Follow step 8 of User:Monicas wicked stepmother/PM process
Next, I use the steps given to me by user Netsirk021.
- On TIA (The Internet Archive), download the ABBYY .gz file and unzip it into the work folder.
- Open Guiguts and go to File > Content Providing > Import TIA Abbyy OCR file. Find the file you downloaded and save it as a .txt file.
- Run Tools > Basic Fixup with pretty much everything checked (how much you check is up to you).
- Tools > Remove End of Line Spaces.
- Find and replace two consecutive single quotes ('') with a double quote ("), and find ^ and replace it with nothing to delete it altogether.
- Find and replace a double quote followed by a space with just a double quote.
- Once the text file as a whole is where you want it, use the Guiguts feature File > Content Providing > Export as Prep Text Files to split the master text file into separate text files for each page, exporting them into a subfolder called "textw".
- Delete all the blank pages at the start and end.
- From the Content Providing menu, select the following, in order:
  1. Run Dehyphenator
  2. Filter File
  3. Fix Common English Scannos
  4. Add [Blank Page] to Empty Pages
  5. Remove Headers/Footers
- Follow steps 10-17 of User:Monicas wicked stepmother/PM process
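
For anyone who wants to see what the mechanical parts of the steps above amount to, here is a rough Python sketch. It is not how Guiguts does these things internally, only an illustration of the same ideas: unzipping the downloaded .gz file, the find-and-replace cleanup, and putting [Blank Page] into empty page files. The function names (unzip_abbyy, clean_text, mark_blank_pages) are made up for this example, and it assumes the page files are UTF-8 .txt files in one folder such as "textw".

```python
import gzip
import re
import shutil
from pathlib import Path


def unzip_abbyy(gz_path, work_dir):
    """Decompress a downloaded .gz file into work_dir and return the new path."""
    gz_path = Path(gz_path)
    out_path = Path(work_dir) / gz_path.stem  # "book_abbyy.gz" -> "book_abbyy"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def clean_text(text):
    """Apply the find-and-replace steps to text already imported into Guiguts."""
    # Remove spaces (and tabs) at the end of every line.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    text = text.replace("''", '"')  # two single quotes -> one double quote
    text = text.replace("^", "")    # delete carets altogether
    # Double quote plus space -> double quote. This hits EVERY match,
    # including the space after a closing quote, so review before using it blindly.
    text = text.replace('" ', '"')
    return text


def mark_blank_pages(page_dir):
    """Write [Blank Page] into every empty .txt file in page_dir."""
    marked = []
    for page in sorted(Path(page_dir).glob("*.txt")):
        if not page.read_text(encoding="utf-8").strip():
            page.write_text("[Blank Page]\n", encoding="utf-8")
            marked.append(page.name)
    return marked
```

In practice the Guiguts menus are the safer route, because you can check each replacement as you go. The sketch is mainly useful for seeing exactly what each step changes in the text.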