PG Posting Team

From DPWiki

Whitewashers (often abbreviated as WW) is the widely used nickname for the Project Gutenberg (PG) Posting Team. The name honors the famous scene from Tom Sawyer and reminds everyone of the team's tireless work. That work is usually referred to as whitewashing.

The actual posting of an e-text to Project Gutenberg is done by the Whitewashers. As described in Gutenberg's Volunteers' FAQ, their main job is to verify that copyright clearance has been obtained for a potential e-text, check that it follows the standards and is basically correct, add the PG headers, and, finally, copy the text to the two PG servers.

In Project Gutenberg's early days, this job fell solely on the shoulders of Michael Hart, PG's founder. In 2001, he created the Posting Team to take over these duties.

Whitewashing Steps

Receiving Files

Completed projects are submitted to PG by many means. Projects from Distributed Proofreaders are usually submitted through a web form.

Checking Clearance

The Posting Team's first step is to check for copyright clearance on the submitted file. This is the one rule that Project Gutenberg will not bend. If there is no clearance, they will not post the file.

Checking and Editing

They then check the plain text file with the gutcheck tool and run a quick spellcheck. This first sanity check tells them immediately whether the file adheres to the standards and whether it has any serious problems.

If the checks reveal problems, they may spend some time correcting the text. In extreme cases, they may send the file back for re-proofing.

When they come across completely obvious errors, they will just fix them quietly unless specifically asked otherwise.
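
To give a feel for what an automated sanity check of this kind might flag, here is a minimal sketch in Python. It is not gutcheck and does not reproduce its rules. The function name, the three checks, and the 75-character line limit are all assumptions chosen for illustration.

```python
def sanity_check(text, max_line_length=75):
    """Return (line_number, message) pairs for suspicious lines.

    Illustrative only: the checks and the default limit are assumptions,
    not the rules applied by gutcheck or by the Posting Team.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > max_line_length:
            problems.append(
                (number, "line longer than %d characters" % max_line_length)
            )
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        if "\t" in line:
            problems.append((number, "tab character"))
    return problems


if __name__ == "__main__":
    sample = "A clean line.\nA line with a trailing space \n"
    for number, message in sanity_check(sample):
        print("line %d: %s" % (number, message))
```

A report like this only points at lines worth a look. Deciding whether to fix the text or send it back remains a human judgment.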

Header and Footer

The final change to the text is the addition of the PG header and footer. The Team replaces any existing header and footer so that the current version of the Project Gutenberg legalese appears in the text.
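
The replacement step can be pictured as follows. This is a rough Python sketch, assuming the body of a text is delimited by a start-marker line and an end-marker line. The function name, the marker strings, and the header and footer text are hypothetical placeholders, not the actual PG legalese or the Team's tooling.

```python
def replace_header_footer(text, header, footer, start_marker, end_marker):
    """Drop everything outside the markers and wrap the body in new boilerplate.

    If a marker is missing, the body is assumed to extend to that end of
    the file. All marker and boilerplate strings are supplied by the
    caller; none of them are the real PG wording.
    """
    lines = text.splitlines()
    start = next(
        (i for i, line in enumerate(lines) if line.startswith(start_marker)),
        None,
    )
    first = 0 if start is None else start + 1
    end = next(
        (i for i in range(first, len(lines)) if lines[i].startswith(end_marker)),
        None,
    )
    last = len(lines) if end is None else end
    body = "\n".join(lines[first:last]).strip("\n")
    parts = [header, start_marker, "", body, "", end_marker, footer]
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    START = "*** START ***"  # hypothetical marker
    END = "*** END ***"      # hypothetical marker
    old = "Old header\n*** START ***\n\nBody text.\n\n*** END ***\nOld footer\n"
    print(replace_header_footer(old, "New header", "New footer", START, END))
```

Because old boilerplate is discarded before the new is added, running the function twice with the same header and footer gives the same result as running it once.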


The PG e-text number is obtained from a custom program that guarantees no number is assigned twice, even with many e-texts being whitewashed every day.
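
The page does not describe how that program works. One common way to get such a guarantee is to hand out numbers from a single counter protected by a lock, as in the hypothetical Python sketch below. The class name, its method, and the starting value are assumptions for illustration only.

```python
import threading


class EtextNumberAllocator:
    """Hand out strictly increasing numbers, never the same one twice.

    Hypothetical sketch: the real PG program is not documented here and
    may work quite differently (for example, with persistent storage).
    """

    def __init__(self, last_assigned=0):
        self._last = last_assigned
        self._lock = threading.Lock()

    def next_number(self):
        # The lock makes increment-and-read a single atomic step, so two
        # callers asking at the same moment cannot receive the same number.
        with self._lock:
            self._last += 1
            return self._last


if __name__ == "__main__":
    allocator = EtextNumberAllocator(last_assigned=100)
    print(allocator.next_number())
    print(allocator.next_number())
```

An in-memory counter like this loses its state when the process exits. A real allocator would also need to record the last assigned number somewhere durable.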


Finally, the e-text files are posted to two servers, one at and one at