General Workflow Diagram

DP Official Documentation - General


"Content Providers" supply the scanned page images and the text produced from them by OCR. The two are compared page by page so that proofreaders can catch and correct mistakes in the text.
The "Proofreader" (or "proofer") chooses a project to work on, reads through the Project Comments on the associated Project Page, and clicks on the "Start Proofreading" link on that page.
The website shows proofreaders the page image of one page with the OCR text for that image.
Proofreaders read the OCR text and correct it to match the page image. They fix any OCR errors and apply some typographical markup according to the Proofreading Guidelines. The site stores each proofread page in our database for the next round. Each book goes through three rounds of proofreading for OCR errors, and each round displays the page images with their associated text.
Once the book has completed the proofreading rounds, it moves on to the formatting rounds in which things like bold or italic text are marked according to the Formatting Guidelines.
When all the pages in a book have completed the proofreading and formatting rounds, a Post-Processor does the finishing work of getting the book ready for submission: combining all the pages into one big file, making sure that all the formatting is consistent, checking one more time for errors, etc.
Often the book is then submitted for Smooth Reading, in which volunteers read it through and report anything that disrupts the sense or flow of the book.
Finally, the book is submitted to Project Gutenberg and is posted on mirror sites all over the world, freely available for anyone to read and enjoy.
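
For readers who find it easier to follow the workflow as an ordered sequence, the steps above can be sketched as a small Python model. This is only an illustration: the stage names, the `next_stage` and `full_path` functions, and the `smooth_read` flag are hypothetical and are not part of the DP site. The formatting rounds are shown as a single stage because this page does not say how many there are.

```python
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    """Workflow stages in the order this page describes them (names are illustrative)."""
    CONTENT_PROVIDED = "scans and OCR text supplied"
    PROOFREADING_1 = "proofreading round 1"
    PROOFREADING_2 = "proofreading round 2"
    PROOFREADING_3 = "proofreading round 3"
    FORMATTING = "formatting rounds"
    POST_PROCESSING = "post-processing"
    SMOOTH_READING = "smooth reading (optional)"
    POSTED = "posted to Project Gutenberg"


def next_stage(current: Stage, smooth_read: bool = True) -> Optional[Stage]:
    """Return the stage that follows `current`, or None once the book is posted."""
    order = list(Stage)  # Enum members iterate in definition order
    index = order.index(current)
    if index == len(order) - 1:
        return None
    following = order[index + 1]
    # Smooth Reading happens "often", not always, so it may be skipped.
    if following is Stage.SMOOTH_READING and not smooth_read:
        return order[index + 2]
    return following


def full_path(smooth_read: bool = True) -> List[Stage]:
    """Walk a book from content provision to posting and list every stage visited."""
    path = [Stage.CONTENT_PROVIDED]
    step = next_stage(path[-1], smooth_read)
    while step is not None:
        path.append(step)
        step = next_stage(step, smooth_read)
    return path


if __name__ == "__main__":
    for stage in full_path():
        print(stage.value)
```

Calling `full_path(smooth_read=False)` gives the same sequence without the Smooth Reading stage, reflecting that this step is common but not mandatory.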

To comment or request edits to this page, please contact lhamilton or wfarrell.

