Project Managing FAQ

From DPWiki
DP Official Documentation - Content Providing and Project Management
Content of this page is being reviewed. If you have questions, please contact one of the page editors (shown in the footer at the bottom of this page).
FAQs

The Content Providing FAQ describes the life of a Distributed Proofreaders project up to the point at which the image files and text files have been uploaded to the DP server.

This Project Manager's Workflow describes everything that happens after that, from the point of view of a Project Manager (PM).

The PM and CP can be the same person. The PM may take over for a CP at any point in the process.

To become a PM, please check the PM Access requirements, and send an email to Linda (lhamilton) at dp-genmgr at pgdp dot net.

What you need to know first.

A few things will be helpful to know before reading this document.

Be aware of the time frame that it takes a project to move through the rounds. Depending on the popularity of the genre, and the complexity of the project, it may take more than a year for the project to complete the cycle. If you cannot commit to following the project, you should consider becoming a Content Provider. A Project Manager is expected to follow the project throughout all of the rounds, and be active in maintaining word lists and answering questions in the project thread.

Be sure you know how DP works. You should have at least 6 months' time on site before becoming a PM. You should also be very familiar with the proofreading guidelines and the formatting guidelines. These are less important if you just wish to provide scans, but may help you understand this document a little better.

Pay special attention to the Jargon Guides.

See also the DP Workflow Diagram, which gives an overview of how material moves through the site and what the PMs do.

You also might want to look over the PG FAQ. This FAQ is huge, and there is no need to read the entire thing, but you should be aware of its presence.

See the Content Providing FAQ for the Content Provider's Workflow &c.

It is now required to have a Project Manager Mentor for your first projects. Go to the PM Mentoring page to find out about it.

Project Manager's Workflow

Create a Project

When you have PM status, the first screen you see when you click the 'PM' link in the navigation bar is the Project Management page. What you will see on this page is determined by the Default PM Page setting in your user Prefs under the Project managing tab.

Note

This section is being revised and updated to more closely match the redesigned PM page.


You don't have any projects yet, so the options on this page don't really do anything. So, the first thing to do is to create a project! Click Create Project.

A new screen with a bunch of blank fields comes up. [Screenshot: Createaproject.png]

What's supposed to be done here? Well, in order to keep you from having to type in all sorts of details about your project, the Create Project system uses the information you put in here to try to find matches at the Library of Congress. In my experience, less is more. Don't put in too much info -- maybe just the title and author, and don't forget to drop the A and The words at the beginning of the title. When you are ready, press the Search button to start the search.
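
The advice about dropping the leading A and The can be sketched as a small helper. This is only an illustration: the function name search_title is hypothetical, and treating "An" the same way as "A" and "The" is an assumption not stated above.

```python
import re

# Leading articles to drop before searching. The text above names "A" and "The";
# including "An" is an assumption made for this sketch.
_LEADING_ARTICLE = re.compile(r"^\s*(?:a|an|the)\s+", re.IGNORECASE)


def search_title(title: str) -> str:
    """Return the title with one leading article removed and whitespace trimmed."""
    return _LEADING_ARTICLE.sub("", title, count=1).strip()


if __name__ == "__main__":
    for example in ("The Pickwick Papers", "A Tale of Two Cities", "Emma"):
        print(search_title(example))
```

Because the pattern requires whitespace after the article, titles that merely start with those letters (such as "Theology") are left alone.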

Shortly, depending on how responsive the LoC is, you'll get a group of possible matches for your project. Each possible match has a Title, Author, Publisher, Language, LCCN, and ISBN field, although they aren't all filled in. Next to each match is a little radio button. Click the one to the left of the closest match to your project. Here's an example:

[Screenshot: Resultscreen.png]

No match? Don't worry. You can keep trying different searches if you like, or you can create the project from scratch (which is often easier).

For the moment, let's assume that there is a match, and you've selected the proper radio button.

The only information transferred from the LoC record is the title and author. All you need is to be sure the Title and Author match your project.

At the bottom of the list of matches are several buttons:

Quit takes you back to the main DP: PM window.
Search Again takes you back one page to the empty fields.
Create the Project and No Matches take you on to the next page, which is the DP: Create a Project page.

If you were able to find a match (and chose Create the Project), some of the information is filled in for you. If not, and you chose No Matches, you get empty fields. But now you can enter in all of the project information and get your text going!

Enter Project Information

Once you have successfully found a project in the Library of Congress (or had no matches), you come to a new page, Create a Project, which has these items on the left and mostly blank fields on the right:

 Name of Work
 Author’s Name
 Language
 Genre
 Difficulty Level
 Special Day (optional)
 PPer/PPVer
 Original Image Source
 Image Preparer
 Text Preparer
 Extra Credits (to be included in list of names)
 Clearance Information
 Posted Number
 Project Comments

Some of these may be filled in for you (from the Library of Congress search), but you can edit all of them.

Name of Work

The first two are pretty obvious. Most PMs try to use a short but descriptive version of the title for Name of Work. Sometimes you can use the whole title, but sometimes it’s just too long! If your project is part of a series, this is a good place to note it, e.g. (Vol 2 of 2).

Some PMs also like to include the copyright year in project titles, as a point of information for proofreaders.

If the work is broken into pieces which will eventually need to be re-joined before the work is posted to PG, enclose the "part" information in square brackets, e.g. [Part 1 of 4] or [1 of 4].

Merging projects causes a fair amount of work for the squirrels, so this should be kept to a minimum. The general rule is that if the item will be posted to PG as one whole unit, it should be run through DP as one whole project, unless there is some compelling reason not to do so, such as for the BEGIN and P3 Qual projects.

Other descriptive information about a project which may be meaningful to the DP database and/or the DPers who may choose to work on the project can be enclosed in braces, e.g., {Fraktur}, {LaTeX}, {type-in}.

A useful general guideline when creating project titles is that anything in brackets or braces will definitely NOT be retained as part of the work's title in the PG Catalog, while anything in parentheses may become part of the work's title at PG.

Note: Not all PMs follow the above-mentioned guidelines on the use of brackets and braces in project titles. For some background on how certain SAs and squirrels thought the above guidelines would be helpful, see the Project Titles thread.
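
As an illustration of the bracket and brace convention, the following sketch separates a project title into the part that may survive at PG and the annotations that will not. The function name split_title and the sample title are hypothetical; this is not how the DP site itself parses titles.

```python
import re


def split_title(project_title: str) -> dict:
    """Separate a project title into its base title and bracket/brace notes."""
    parts = re.findall(r"\[([^\]]*)\]", project_title)  # e.g. [Part 1 of 4]
    tags = re.findall(r"\{([^}]*)\}", project_title)    # e.g. {Fraktur}
    # Remove the bracketed and braced notes; parentheses are left in place
    # because they may become part of the title at PG.
    base = re.sub(r"\[[^\]]*\]|\{[^}]*\}", "", project_title)
    base = re.sub(r"\s+", " ", base).strip()
    return {"title": base, "parts": parts, "tags": tags}


if __name__ == "__main__":
    print(split_title("History of Rome (Vol 2 of 2) [Part 1 of 4] {Fraktur}"))
```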

Author's Name

For the Author’s Name, list the author's last name first, as: Last name, First name (or initials). This is important for the queueing process, as multiple works by the same author shouldn’t be in the same round (with certain exceptions).

Language

Language is really obvious, and the allowable language definitions are on drop-down menus. If your text is half French and half German, you can pick which you want to be first (Primary Language and Secondary Language). If your text is English with only small bits of another language (Latin, for example), you’ll probably just want to specify English and leave the Secondary Language menu alone. The Word Check dictionaries are also derived from this option.

Genre

Genre is another drop-down menu. The items in it correspond to (some of) the Release Queues, but not necessarily in a one-to-one fashion. Its main purpose is to help proofers find works in the listings of available projects that they feel might appeal to them. Because of this, the genre is displayed in the lists of available projects, and can be used to filter these lists.

Difficulty Level

There is only one Difficulty Level associated with a project. Typically, an Easy project has simple formatting, clear scans and good OCR. See the What's Easy? thread. Novels are the most likely candidates for Easy, especially if they are not illustrated. Average is the most common. A little of this, a little of that. For Proofing, that means maybe longer pages, splotches, and/or some longer passages in a second language. For Formatting, there may be sidenotes or more than a few footnotes. Hard projects may have lots of transliterated characters, bad OCR or extensive specialized formatting requirements.

Special Day

Special Day is optional. If you choose anything here your project will be released as follows:

  • P1: The project will be held until that date. It will then be released with some special rules set up just for special day queues.
  • P2, P3: The project is released either on that date or when it reaches the top of its language/genre queue, whichever comes first.

For a list of author birthdays, see Authors' Birthdays.

Details of other special days can be seen at the Special Days page.

There are several people that need to be credited. Make sure they get credited properly:

PPer/PPVer (DP User ID)

If you have a PPer, set it here. If you do not, leave it blank and the project will be put on a special "No PPer Assigned" list where PPers can look for projects.

Original Image Source

Under Original Image Source pick DP Internal if it was scanned by someone within DP. If it was harvested from an online source, pick that source from the pull-down list. If your image source is not on the list (check the "Image Sources Info" list if the pull-down in the project creation form doesn't give you enough information to decide) you can propose a new image source from your Project Manager page by filling out the "Propose a new Image Source" form, or by contacting an Image Source Manager at ism at pgdp dot net.

If you choose to fill out the form, please provide as much information as possible (the first three fields must be filled in):

Image Source ID (10 characters) -- This is a unique tag (usually all-caps) that is used only in the site code. Image Source Managers or Squirrels should be able to help with suggestions.
Display Name (30 characters) -- The name that displays in the pull-down for the project creation/edit project form. If the Full Name is long, this may be an abbreviated version.
Full Name (100 characters) -- The name as it will be displayed on the project page in the Image Source field, and in the credits line.
Website (200 characters) -- Link to the main page for the site where the images are stored.
Credits Line (200 characters) -- If they have any special wording for how they should be credited in the final product. Otherwise, just use the standard form.
Permissions -- Should be determined from their usage policy. If it's not obvious from their online policy, the Image Source Managers may need to correspond with them to set it.
Description (public comments) (255 characters)
Notes (internal comments) (more than you should need)
See Image Sources for examples of the types of information that should be included in these fields. One or both may be empty.

Image Preparer (DP User ID)

This is the DP username of the person who prepped the images, if you choose DP internal for the Original Image Source.

Text Preparer (DP User ID)

This is the person who ran the OCR on the page images.

Extra Credits (to be included in list of names)

These are other people that should be credited. This would include if someone cleaned up the illustrations, found missing pages, or if the images were prepared by two people (one scanned them, the other cropped them and cleaned them). There's no need to enter your own name here; it will be included automatically as you are the PM. You can change the way you are credited by using the "My Preferences" link at the top of most non-forum pages.

Clearance Information

Clearance Information should be a key that you received when you got clearance for your project. If you do not have the key any more, you can go to the Copyright Clearance Requests site to get it. If you've got a key of the form "numbersauthor" please enter the entire alpha-numeric key, with nothing else. If you've got one of the older keys, starting with "gbn", please enter the "gbn############" part. You must have a copyright clearance before loading the project onto DP. Only the clearance key should be entered into this line.
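
A rough sketch of how the two key shapes described above might be recognized is shown here. It assumes that "numbersauthor" means a run of digits followed by letters and that each "#" in the older form stands for a digit; neither assumption is confirmed above. The helper name and the sample keys are hypothetical.

```python
import re
from typing import Optional

NEW_KEY = re.compile(r"\d+[A-Za-z]+")  # assumed shape of "numbersauthor"
OLD_KEY = re.compile(r"gbn\d+")        # assumed: each '#' stands for a digit


def normalize_clearance(raw: str) -> Optional[str]:
    """Return only the key portion of a clearance string, or None if none is found."""
    text = raw.strip()
    if NEW_KEY.fullmatch(text):
        return text
    match = OLD_KEY.search(text)  # older keys: keep just the "gbn..." part
    return match.group(0) if match else None


if __name__ == "__main__":
    print(normalize_clearance(" 20201231235959smith "))     # made-up key
    print(normalize_clearance("key: gbn012345678901 OK"))   # made-up key
```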

Unless you have Direct Upload access, please do not use this information for uploading Post-Processing projects.

Handling projects entering Public Domain on 1 January

On 1 January 2021 books first published in 1925 enter the Public Domain in the US. Project Gutenberg is now clearing books in anticipation of that day, but please don't load projects onto DP's server until they are fully in the public domain.

You may, if you have one of these clearances, set up a "shell" project for the book and leave the project in a "New Project" state, provided you do not load the book text or images until 1 January. If you set up a shell, please set the project up as a Public Domain Day special day, put it on a "P1 Hold in Waiting", and add the following text in red to the project comments: "This project will enter the public domain on 1 January 2021. No images or text will be loaded until that date." Setting up the project for Public Domain Day makes it easier to locate projects that will be entering the public domain on the first of the year.

On 1 January 2021 at 00:00 server time, you may load the project and remove the hold and the red warning. After January 1, you should remove the special day designation if you want the project to release before 2022. You should also double-check Project Gutenberg to be sure that a solo producer has not uploaded the book.

From now on, unless copyright laws change, another year's worth of books will become available each 1 January. PG is also approving clearances for books that may not enter the public domain until 2022. It is the Project Manager's responsibility to make sure that only material that is public domain at the time it is uploaded is put on the DP server.

Posted Number

Posted Number should be left blank. It will be filled by the Squirrel who marks the project posted, once DP gets notification from Project Gutenberg that the project is posted.

Write Project Comments

The Project Comments section of the Project Page is the place where the project manager can put information he or she wants the proofers and formatters to know about the project. This can serve many functions. The most important function is to let proofers and formatters know if there are deviations from the proofing and formatting guidelines.

If you are running projects in languages other than English, make sure that any formatting instructions are available in English if you wish to encourage formatters who are not fluent in the project's language to work in your projects.

The Project Comments are one of the most important parts of project creation. Do not rush through them. Look through the book in advance for anything that will need special attention. When in doubt about whether you should say something, ask for advice. Have a Project Facilitator or other experienced PM look over your comments for the first few projects you do if they are complicated at all.

The following sections provide some suggestions on composing project comments:

General comments about Project Comments

Deviations from the guidelines

It's best to keep deviations to a minimum. The greater the number of deviations, the higher the likelihood that proofers and formatters will make errors. Also, some proofers specifically avoid projects with deviations; that may slow your project down. With these in mind, consider carefully whether the project requires any deviations. Can the issue be dealt with easily by global searches and/or regular expressions in Post Processing? If so, it's best not to ask for any deviations from the guidelines.

If the project can be proofed and formatted using the standard guidelines, specifically state that in the project comments. Make sure it stands out. This will allow the proofers and formatters to skip over the informational portions of the project comments with confidence that they haven't missed an important instruction.

However, given the great variety in books being processed, some books require deviations from the standards (e.g. a Fortran compiler, a book with 5 levels of headings). In these cases, it's essential to provide some guidance to the proofers and formatters in order to get a consistently proofed and formatted book to the Post Processor. Make sure these stand out, make sure it's obvious that these are deviations (to be used for this project only), and provide examples if necessary. See common exceptions to the guidelines for examples of many exceptions that PMs have requested on past projects.


What to look out for/What to expect

In some cases, it's useful to reinforce the guidelines. Make it clear in your comments that these are reinforcements, not deviations. (e.g. "Proof diacritical marks just as the Guidelines specify; you may be doing that a lot in this book -- the OCR often missed them! So please be sure to include these when proofing.", or "This book contains thought breaks which are indicated by extra space on the page between the paragraphs. Format these as the Guidelines say: insert <tb> for these.") It's best to use these sparingly--perhaps one or two per project. Clearly separate these from any deviations to the guidelines.

Some project managers choose to give the proofers an idea of what to expect in the project (e.g. good OCR, poor OCR, index, tables, simple format, multi-line format, complex mathematical equations, dialect). This may be useful to proofers and formatters when they are choosing a project to work on.


Project Comments Checklist

Please see the Project Comments Checklist for a list of common issues that proofers and foofers have questions on. Many of these vary by project--or are not explicitly addressed in the Guidelines--and therefore require a PM ruling.

Information about the book

Some proofers and formatters enjoy having some information about the book and/or author when deciding whether to work on a book. Others find this information distracting. If you do include this sort of information, keep it short. Consider linking to Wikipedia and/or to information you've put in the DP Wiki.

If the project is marked as a special day, it's a good idea to briefly explain how the project relates to that special day.

Authors' lifespans

Proofers and formatters working in countries with "life+" copyright laws often appreciate having information on author and contributor birth and death dates included.


Offensive material

Take into consideration your future proofers' and formatters' desire to avoid material that they may find offensive. Warn them in advance about things like offensive language, criticism of religion, derogatory treatment of ethnic groups, racism, and offensive stereotyping, to name a few. If the title does not make it clear what the content of the book is, give details here.

Every subject has the right to enter PG, for historic value if nothing else, but not everyone is willing to be part of that preservation.


Format of project comments

Most HTML formatting is supported. You can keep it simple and just enter text, but if you want any formatting (including line breaks or paragraph breaks) you must use HTML.

Project managers may use any format they wish. When choosing a format consider the following:

  • Don't make proofers or formatters search through the project comments to find out if there are any deviations from the guidelines. Put this information first and make sure it stands out. Make it easy for proofers and formatters to see what is requested.
  • Exercise caution if you specify foreground or background colors. Differences in monitors and color vision (e.g. color blindness) can make some combinations very difficult to read.
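
One way to follow these suggestions is to assemble the comments from a small template that always puts deviations first. The sketch below is a hypothetical helper, not a DP tool; it uses only a handful of basic tags (p, b, i, ul, li) on the assumption that these fall within the HTML the project comments field accepts.

```python
from datetime import date
from html import escape


def build_comments(deviations, notes=(), updated=None):
    """Build a simple HTML project-comment block with deviations listed first."""
    parts = []
    if deviations:
        parts.append("<p><b>DEVIATIONS FROM THE GUIDELINES (this project only):</b></p>")
        items = "".join(f"<li>{escape(item)}</li>" for item in deviations)
        parts.append(f"<ul>{items}</ul>")
    else:
        parts.append("<p><b>Follow the standard guidelines; there are no deviations.</b></p>")
    for note in notes:
        # escape() turns markup such as <tb> into visible text instead of a tag.
        parts.append(f"<p>{escape(note)}</p>")
    if updated is not None:
        parts.append(f"<p><i>Last updated {updated.isoformat()}</i></p>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_comments([], notes=["Insert <tb> for thought breaks."],
                         updated=date(2021, 1, 1)))
```

No colors are set, which sidesteps the readability concerns mentioned above.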


Updating project comments

If postings in the project thread indicate that there is need for additional clarity on how to proof or format, be sure to update the project comments with this information. Instructions which are in the project comment are more likely to be read and followed than instructions which are buried somewhere on page 3 of a 5 page discussion.

Clearly identify any changes which occur mid-course. Date these so that returning proofers/formatters will know what has changed.

Comments targeted toward Proofers

Here you need to give very specific requirements on how to handle all situations that are not in the guidelines. It is also a good idea to quote the guidelines for the less common situations that are covered by the guidelines.

Comments targeted toward Formatters

Here you need to give very specific requirements on how to handle all situations that are not in the guidelines. It is also a good idea to quote the guidelines for the less common situations that are covered by the guidelines.

If you would like help creating formatting instructions for your project, please send a note to dp-format. One of the formatting-round experts from the dp-format group will contact you to help create the instructions for your Project Comments.

How headers are to be handled

How to handle the more common headers is covered in the guidelines.

There are times, though, when there is no clear-cut way to decide whether something should be treated as a chapter, a sub-chapter, or something else. Periodicals and poetry are the most common types of documents that need special handling instructions. Explain in detail how you want headers handled and give examples to reinforce what you want.

Comments targeted to PPers

While it is not necessary to give the PPer instructions, it is a good idea to place information here that you feel is important to the final production of the project. The standard is now to create an HTML version of every book, so a request for one is not needed, though some PMs still add it.

Details

-- Illustrations: for the benefit of new PPers let them know if there are illustrations to be processed. If there are none, say that also as that is the easiest type of Project to complete.

-- Sidenotes: specify (if and how) you want Sidenotes.

-- Small Caps: specify if you want to keep them.

Upload the Files

Before you upload your project, it is a good idea to check the images. Even if this was done in CP, it is a good idea to double check to make sure they are all there, and are all complete and usable, and that they all have the associated text files. It is much easier to fix a project now than later.

Missing pages are a big hassle to fix later on; do not load any projects to DP unless they are 100% intact. This means:

  • no missing pages
  • no bad pages
  • all illustrations included

You should also double check the names. The text file and image file for any given page should have the same base name (e.g., 001.txt and 001.png), and illustration files (if any) should have base names that are distinct from those of the page files (so fig001.jpg or 001-illus.jpg are okay, but 001.jpg is not). Please also note that while filenames containing the word "ad" are valid, many proofers have ad-blocking software that will not allow an image to be displayed if its filename contains this word. Therefore, names like ad001.png should not be used.

Filenames should adhere to the following rules: they must contain only (unaccented) letters, numbers, and the three symbols "-", "_" and ".". (In particular they can't contain spaces.) The extension must be one of .jpg, .png and .txt (in lower case only). Note that this is in line with the filename requirement documented in PG's wiki: PG: HTML FAQ#2. Requirement: File_names and extensions
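
The filename rules above can be checked mechanically before uploading. The following is a minimal sketch, not a DP tool: the function name is hypothetical, and flagging any filename that contains the letters "ad" anywhere is a deliberately conservative assumption, since the exact behavior of ad blockers is not specified here.

```python
import re
from typing import List

# Unaccented letters, digits, "-", "_" and "." only, with a lower-case
# .jpg, .png or .txt extension.
ALLOWED = re.compile(r"[A-Za-z0-9_.-]+\.(?:jpg|png|txt)")


def filename_problems(name: str) -> List[str]:
    """Return a list of rule violations for one filename (empty if none)."""
    problems = []
    if not ALLOWED.fullmatch(name):
        problems.append("invalid characters or extension")
    base = name.rsplit(".", 1)[0].lower()
    if "ad" in base:  # conservative: matches "ad001" but also "shadow01"
        problems.append('contains "ad" (may be hidden by ad blockers)')
    return problems


if __name__ == "__main__":
    for candidate in ("001.png", "001.PNG", "my page.png", "ad001.png"):
        print(candidate, filename_problems(candidate))
```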

The dpscans directory lives in a separate place on the DP server. In order to associate the text and image files with your new project, you first have to have the files available in dpscans. Most PMs have a subfolder in dpscans with their username (e.g. dpscans/Users/dpuser/) to which they can upload a zip using the Remote File Manager. A PM's dpscans directory is created the first time she or he accesses the Remote File Manager.

The rest of these instructions assume you've uploaded a zip of your OCR output (.txt), page scan (.png) and illustration (.jpg or .png) files to your dpscans directory.

After you've created your project, and edited its comments to your satisfaction, when you click on the Save and Go To Project button you will be taken to the project page. Following the project comments is a field with your user name already filled in ("Add/Replace text and images from directory or zip file: ~dpscans/Users/dpuser/"). The system assumes you've put the files somewhere in your user directory on dpscans. In the field, after your user name, type in the name of the zip containing the files, and click on the Add/Replace button. (Note that when a project is loaded for the first time, the system unzips the uploaded zip into a directory, and deletes the zip. If you subsequently delete the proofing images and text that have already been loaded, and wish to load everything again, you specify only the directory name that the files were unzipped into).

You will be supplied with a list of the illustrations, then a table with the text and the images. Look over this list to make sure everything is correct. You should have 1 text file per page scan, and vice versa. All pages should be there. And there should be no errors listed. If there are, you should fix them before you continue.
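
The same pairing check can be done locally before uploading. This sketch works on a plain list of filenames, such as the one returned by zipfile.ZipFile(path).namelist(); the function name and the report keys are hypothetical, and it assumes page scans are .png files as described above.

```python
from pathlib import PurePosixPath


def check_pairing(names):
    """Report text/image mismatches in a list of filenames."""
    texts = {PurePosixPath(n).stem for n in names if n.endswith(".txt")}
    pngs = {PurePosixPath(n).stem for n in names if n.endswith(".png")}
    jpgs = {PurePosixPath(n).stem for n in names if n.endswith(".jpg")}
    return {
        # A text file with no page scan of the same base name.
        "text_without_image": sorted(texts - pngs),
        # A .png with no text file: either a genuine illustration or a
        # page scan whose text is missing, so these deserve a second look.
        "png_without_text": sorted(pngs - texts),
        # An illustration that reuses a page's base name, e.g. 001.jpg.
        "illustration_name_clash": sorted(jpgs & texts),
    }


if __name__ == "__main__":
    sample = ["001.txt", "001.png", "002.txt", "003.png", "fig001.jpg", "001.jpg"]
    print(check_pairing(sample))
```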

Once everything is correct, click "Proceed with Load." Then at the next page, click "return to project page." If your pages are screwed up, you can fix them as long as the project is a "new project" or is in "p1 unavailable". Check "fixing a project" for more information.

Run Project Quick Check

Project Quick Check is a script that should be run for every project before it's released. It can also be useful to check projects between rounds to see if bad characters have been added by proofers or formatters.

In addition to providing the title, author and project manager, the output of the script gives you several useful pieces of information about the project:

Page images exist
Verifies that a proofing image exists for each text page. (Under normal circumstances, a missing image won't occur.)
Bad Bytes in Page Text
Text pages should have only characters from the selected character suite, no HTML entities, no BOMs, and no tabs--in fact, no invisible characters other than regular spaces and returns. If this test reports bad characters, please fix the problem pages and reload.
The links to text files link to the OCR; the test that checks the text looks at the project's latest page-texts for bad byte-sequences. If the project is being checked after it's released into the rounds, you may need to edit the URL to see the latest version of the page-text.
If there are bad bytes on any page, there will be a link in this section to the Explainer for Bad Bytes Reports, which shows in detail what is considered bad and why.
Corrupt page PNGs
Occasionally, uploaded page images may be somehow corrupted. This test will identify any that are, so that the PM can correct the issue.
Large Page Image Files
For most proofing images (page scans), file size should be under 100KB. If the results of the script show that there are images over this size, and you are having trouble, please ask for help.
Illustration Image Files
This is just a test to determine whether there are any illustration images with the project.
Good/Bad Wordlists
This test issues a warning if both of the project's Wordlists are empty. There is no warning if at least one of them is not empty. It is not considered an error for a project not to have Wordlists, but this test is a reminder to check.
Credited Source
Verify that the listed Credited Source is the correct one.
Correct Genre and Language
Check to make sure that the Genre and Language have been set correctly. If in doubt, check the Page Details for the project, and skim through a few pages.
Missing Pages
There's no way that the script can check for missing pages. This section is a reminder to PMs that missing pages should be checked for, and that a last check before releasing the project is a good idea.
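
Two of these checks, bad characters in page text and large page images, are easy to approximate locally before loading. The sketch below is not the Project Quick Check script itself. It does not know the project's character suite, so it only looks for BOMs, tabs, HTML entities and other invisible characters; treating 100KB as 100 * 1024 bytes is an assumption.

```python
import re
import unicodedata

ENTITY = re.compile(r"&(?:#\d+|#x[0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")
MAX_IMAGE_BYTES = 100 * 1024  # assumed meaning of "100KB"


def text_problems(text):
    """Return a list of problem descriptions for one page of text."""
    problems = []
    if "\ufeff" in text:
        problems.append("byte order mark")
    if "\t" in text:
        problems.append("tab")
    if ENTITY.search(text):
        problems.append("HTML entity")
    for ch in text:
        if ch in " \n\r\t\ufeff":
            continue  # regular spaces and returns are fine; tabs/BOMs reported above
        # Control, format and separator characters are invisible on screen.
        if unicodedata.category(ch) in ("Cc", "Cf", "Zs", "Zl", "Zp"):
            problems.append(f"invisible character U+{ord(ch):04X}")
            break
    return problems


def image_too_large(size_in_bytes):
    """True if a page image exceeds the suggested size limit."""
    return size_in_bytes > MAX_IMAGE_BYTES


if __name__ == "__main__":
    print(text_problems("Tom &amp; Jerry\tran"))
    print(image_too_large(150_000))
```

For files on disk, os.path.getsize(path) gives the byte count to pass to image_too_large.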


One of my proofing images was included as an illustration image

Sometimes a proofing image is accidentally loaded into a project without a corresponding text file. If this happens, that image will show up in the non-page images (illustrations) in the Image Index. If this happens, and you catch it before a project leaves P1 (preferably before it's released into P1), make sure that the project is in the New Project state, or in P1 Unavailable. Zip together the missing text file and the corresponding image file, upload it to dpscans, and load it into the project as you did the rest of the pages.

The script that loads the pages will load both of them, because you have matching text and image files. The older image will be overwritten. Because the database now has a text file for that image, it will no longer be sorted into the illustration listing.

If you don't catch it until after it leaves P1, it will need to be treated the way any other missing pages are treated.

OK, how do I finally get my project into the proofing rounds?

The project should show a status of "New Project" in the pull-down status column of the Project Managers page. Use that list to change the status first to "Proofreading Round 1: Unavailable". Once that's done, a new option appears in the pull-down list: "Proofreading Round 1: Waiting for Release". When you change the project to "Waiting for release" the software will automatically determine when it's time to actually enter P1.
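
The manual part of this sequence can be summarized as a short ordered list. The status names below are copied from the text above; the helper function is hypothetical and only illustrates the order in which a PM changes the status by hand.

```python
# Statuses a PM sets by hand, in order. Entry into P1 itself is automatic.
RELEASE_STEPS = [
    "New Project",
    "Proofreading Round 1: Unavailable",
    "Proofreading Round 1: Waiting for Release",
]


def next_status(current):
    """Return the next status to select, or None once the project is waiting for release."""
    index = RELEASE_STEPS.index(current)  # raises ValueError for an unknown status
    if index + 1 < len(RELEASE_STEPS):
        return RELEASE_STEPS[index + 1]
    return None


if __name__ == "__main__":
    status = "New Project"
    while status is not None:
        print(status)
        status = next_status(status)
```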

Note for first-time PMs: When the queues of waiting projects are long and you're starting out as a new PM, it can seem to take forever to get your project to the proofers. To help with this, you get one "get out of queue-jail free" card per round, to be used on the project of your choice (which may or may not be your first project). Post a note in the DP forum or ask a Project Facilitator for help manually "pushing" your project into the round.

A little bit of clean-up, please

You should now remove the files from dpscans, as the text has been copied into the database, and the images have been copied into the project's directory.

Project Holds

The normal default sequence is for a project to progress through all the proofing and formatting rounds automatically, but there are various reasons why you as Project Manager may choose to put a hold on this automatic sequence to allow you to consider the following.

  • P1-->P1 If a project has a large number of corrections applied in P1 to the OCR output, it may benefit from returning to the P1 round a second time. Use the Project retread/skip recommendations to compare the OCR to P1, and if you would like the project to repeat P1, send an email to db-req at pgdp dot net.
  • P3 skip See P3 skip evaluation for a discussion of this option.
  • F2 skip This normally requires an assigned Post Processor who agrees to this.

Shepherd the project

Monitoring the Project Threads

You'll need to keep a close eye on the project threads of your projects as they go through the rounds, as ultimately it is up to you to ensure the most consistent output possible for the post-processor. The post-processor (as well as Project Facilitators, and other DPers) may help answer questions for you, too, though it's important that you take an active role in helping your projects progress.

One way to help yourself monitor the project thread is to set yourself up (by selecting the "Subscribe topic" option at the top of the thread) to receive automatic forum notifications whenever someone posts in the thread. To subscribe to all of your project threads, go to the "Project managing" tab of the "Prefs" page, accessed via the link in the navbar on site pages. If you want to receive notifications by e-mail, your forum profile must have your correct e-mail address and your forum profile posting defaults must be set up to notify you when posts are made to your topics.

You may also create the thread before your project enters the rounds by clicking on the "Start a discussion..." link on the project page. The forum profile also allows you to be notified of private messages via e-mail, in case someone asks a question by Private Message instead of by using the project thread.

Answer what questions you can; most can be answered by referring to the Proofing Guidelines or Formatting Guidelines. If a question is too vague, please politely ask for the volunteer to expand upon his or her question. If you cannot answer the question yourself, you may wish to ask some of the experienced proofreaders or formatters, one of the Project Facilitators, or post in one of the forum threads related to the issue.

Above all, it's important to be calm, reasonable, and polite, and remember that a thank you goes a long way. Often another experienced proofreader or a Project Facilitator will get to the question before you; however, don't depend on someone doing that, or on that answer being correct. You need to verify such answers to make sure they are correct and follow what you want to happen in your project.

If you have a PPer assigned to the project and you believe that person could answer a question better, please let the PPer know so he or she may go in and answer it. PPers do not always follow a project they have been assigned until it reaches the formatting rounds, where their input is needed more.

Always remember that yours is the final word on how your project will be handled in the rounds. If something is wrong, you need to be there to correct it.

Maintaining WordCheck Lists

You will also need to maintain WordCheck lists as the project progresses through the rounds and proofreaders and formatters suggest words to add. Before the project is launched, you should at a minimum generate a basic WordCheck list with any common proper names, place names, or alternative spellings. Then, once the project is in the rounds, you must update the WordCheck lists as proofreaders provide suggestions. When the project is posted, you should look it over and make sure that it is what you had in mind when you started the project.

Checking Proofreading/Formatting Quality and Clearing Pages

You might also consider taking some time to spot check a few of the changes the proofreaders/formatters make to a page, especially volunteers with low page counts in the P1 and F1 rounds. You can do this by looking at their Diffs. If you see a major error, send the volunteer a polite Private Message explaining what they are doing wrong and how the problem should have been handled. If you do not feel up to doing this yourself, please consult a Mentor or a Project Facilitator.

When there are many pages saved by a proofreader/formatter that contain incorrect changes, it's best to ask the volunteer to revisit the pages and fix the problems you've identified before the project moves into the next round. However, if you receive no response from the volunteer within the time (often a week) you specified in your Private Message, as a last resort, you may consider clearing the affected pages. If the project is progressing well, the round may have finished by the time you hear from the proofreader or formatter. In such a case, you may need to add a "hold in Available" to keep the project in the active round until you have heard back.

If the project has left the round, then please make a comment in the Project Comments for the proofreader/formatter in the next round to watch for those specific errors and make corrections.

If you do find it necessary to clear pages, please be sure to notify the proofreader/formatter in advance to explain your proposed action, so he or she may have a chance to review those pages and make any appropriate corrections. Some Project Managers add a statement in their Project Comments to alert volunteers that their pages may be cleared if they fail to follow specific instructions provided in the PCs. Those PMs should still contact the volunteers about the problem, but then may clear the pages if they have not heard back in a reasonable period of time.

If you notice a continuing problem with specific proofreaders or formatters, and cannot resolve the problems yourself, please ask the Project Facilitators or Squirrels to deal with the issue.

Working with your team

Please keep in mind that, like you, everyone who works on your project is a volunteer. You can encourage people to work on your projects by always treating them with respect, and by always being helpful and courteous.

Some projects move slowly and you may wish to speed things up. There are several ways to do this:

Spice up the comments
Put something interesting from the book in the project comments. However, do keep it short. If your comments are too long, people may not read them.
Ask your friends
You can ask your friends via PM, in your team's discussion, or on Jabber.
Special teams
There are special teams with special focuses that can help you with your project. If your project is a type-in project, there are people who love these. If it is in a non-English language, find a thread for that language. If it is slow in a particular queue, post to that queue's team. If it is on quilting, there is a quilting team; if it is Christian, there is a Christian team. There are even teams for just parts of a book. For example, maybe the last 25 pages of your book are an index. Often books will slow down at this point, because many proofreaders and formatters find index pages hard and boring -- but not so the Index Team! So let that team know when your book reaches the index pages, and they may jump in to quickly proofread/format those pages for you.

A complete list of all teams is here.

Find a PP

If you plan on Post Processing the project yourself, then you can skip this.

Finding a PP can be easy if it is a project people want to work on, or hard if it is something weird and obscure. If you are not going to PP it, leave the PPer section blank when you create the project. Your project will then be listed on the no PPer yet list. It is also recommended that you put "PPer wanted" in the project comments until you get a PPer. This is usually enough, but not always. You can leave the project be; when it finishes F2 it will move into the PP pool, where someone will eventually pick it up. Or you can beg others to do it. It is your choice.

Once a PP is assigned, you can take a back seat to the PP, but should still be around as backup. After this point your job is mostly to update project comments, repair pages that have been marked as Bad Pages by the proofreaders, and make sure the project keeps moving.

Once you have a PP, go to the project and edit it to add in the PP's DP username.

Why have a PM queue?

A PM queue ensures that PMs always have one project available in any round for which they have a PM queue, as long as they have a project that is eligible for that round.

Another advantage of a PM queue is that the PM can check his or her PM queue for a given round and will see everything in that round's queue, in the order in which it arrived in the queue.

Add a PM queue

Send an email to db-req at pgdp dot net with your DP username and the round for which you want a PM queue. PM queues are currently available for P1, P2, P3 and F2, but are not enabled for F1.

Fixing a project, or it's broke, fix it

Someone reported a bad page, how do I fix it? (or I have 1 or a few pages to fix, but not many.)

NOTE: the techniques outlined below are not meant to be used to reorder pages that are out of order; for that, see "There are pages out of order, how do I fix them?", below.

It depends on why the page was marked bad. When a proofer or formatter reports a page as bad, they have to choose a reason: Missing Image, Missing Text, Image/Text Mismatch, Corrupt Image or Other.

In the project page, either use detail level 4, or click on "Images, Pages Proofread, & Differences" to see the list of pages. Find the page you are interested in fixing, then click fix. If the page is marked bad, it will say "bad" instead of "fix". Here you are given 4 options. You can view the text, view the image, modify the text, or modify the image. The text you would modify would be from the previous round, so do not use this once the page has been done for this round.

If the page was marked bad.
Once you have repaired the project, as above, click the radio button for "Fixed" and click "Continue" to return the page to available.
But the page is not really bad.
Sometimes someone will mark a page as bad when it is not really bad. This should not happen often, as the reporter now has to state why the page is bad, but if it does happen, just click the radio button for "Invalid Report" and click "Continue".
Missing Image
It's extremely unlikely that an image will be missing. There may be various reasons that it isn't showing up in the proofer's browser, though. The file size could be too large, or the image could be corrupted. You may need to get someone else to help you resolve this issue if you are not seeing the same problem from your browser.
Missing Text
Hopefully this will be caught before the project progresses past P1. If it's caught in P1, using the link in the Page Details to add the text to the OCR is a reasonable thing to do. You may wish to contact db-req and ask for advice. Some proofers may not report missing text, but just go ahead and type the page. If a page has been proofed and saved as done, and something you do affects the proofer's diffs, please let them know what's affected, and give them a chance to update any affected pages.
Image/Text Mismatch
As with missing text, this is likely to get caught in P1, and if you're lucky, the page or pages will be marked bad, and it won't leave the round. For a lot of mismatches, contact db-req for help or advice. If it's just one or two, you should be able to fix them yourself. If a page has been proofed and saved as done, and something you do affects the proofer's diffs, please let them know what's affected, and give them a chance to update any affected pages.
Corrupt Image
What may appear to be a corrupted image to some may not cause problems for others. As with a seemingly missing image, above, you may need to get someone else to help you resolve this issue if you are not seeing the same problem from your browser.
Other
Any issues that could cause a page to be reported bad should deal with image- or text-related problems. If "Other" is given as a reason (and the reason isn't that the wrong set of images was loaded into the project), and the reason isn't obvious, contact the proofer who marked the page bad and ask for clarification.

There are duplicate pages, how do I delete them?

If you need to delete a whole page (both image and text) from the project, go to detail level 4 or the project's Page Details and click the "Delete" link for the appropriate page.

To delete quite a few pages at once, you can click the checkboxes for the pages to be deleted, scroll to the bottom of the page, and in the "For each selected page" drop-down menu select "Delete".

After deleting the duplicates, you can email db-req to have the files renumbered, which will avoid questions later on from proofers or PPers who might wonder about the gap in the numbering.

If you need to delete an illustration, or other non-proofing-image files that are stored in the project directory, email db-req.

There are pages out of order, how do I fix them?

If the project hasn't been released yet,
you can reload the pages in question, in the correct order.
If the project has been released into the rounds
contact db-req, and ask them to reorder the pages. Please do not use the "fix" link -- this causes mis-matches with previous rounds and invalidates users' diffs.

The above deals only with pages where the text and images are properly matched, but just out of order in the project. A page where the image and text are mis-matched should be reported as a Bad Page.

How do I replace a poor proofing image, or poorly OCR'd text?

If you need to replace a poor proofing image with a better one, or replace poorly OCR'd text instead, use the "Fix" option explained above.

Note that if you need to delete duplicates and add missing pages in their place, do not use the "Fix" option. See "My project is missing some pages, how do I find them?", below, and then follow the db-req instructions: Db-req#Missing pages.

It is missing an illustration, how do I add it?

If there are no pages missing from the project, but the high-resolution scan of an illustration (or illustrations) needs to be added, it's pretty easy. Just load the illo scan (or scans) to dpscans, and send db-req at pgdp dot net an email message saying where the files are and which project they need to be added to. Make sure that they have different names to any existing illo or page files, unless you specifically want to overwrite existing files, in which case you should make that clear in your email.

It is missing a page, how do I add it?

This happens once in a while. You are fixing it now, that is what matters. What you do depends on what round the project is in, so read on.

Name your files correctly!
Make sure they are in order. If you are missing pages that go between 020.png and 021.png, then the replacement pages should be named 020a.png, 020b.png, etc. For the technically minded, name the pages in such a way that a lexical sort will put them into the right place in the main project.
My project has not yet left P1.
You are in luck. As long as your project has not left P1, you can easily fix it. The project needs to be in either "new" or "P1 Unavailable" to do this. If the project is in P1 Waiting or P1 Available, move it to P1 Unavailable.
Now go into the project and follow the instructions for Uploading the Files. Any files you include in the zip will replace files of the same name, or be added to the project if the names do not match.
My project has left P1.
After the project has left P1, you cannot use the bulk repair feature. The fix takes database work, so a System Admin will have to do it. You need to make a new project containing only the missing pages, make the project unavailable, and write to db-req at pgdp dot net with the project number and title. Those friendly squirrels will then push the project through the rounds until it has reached a state compatible with the main project, and merge the two projects. There are full instructions on the db-req page.
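
The naming advice under "Name your files correctly!" can be checked before you upload anything. The following is a minimal Python sketch, not a DP tool: the function names are hypothetical, and it assumes that "lexical sort" means plain character-by-character string comparison and that page files have an extension such as .png.

```python
import string


def inserted_page_names(previous_page: str, count: int) -> list[str]:
    """Build names such as 020a.png, 020b.png for pages that follow previous_page."""
    if not 1 <= count <= len(string.ascii_lowercase):
        raise ValueError("this sketch supports 1 to 26 inserted pages")
    stem, dot, extension = previous_page.rpartition(".")
    if not dot:
        raise ValueError("expected a file name with an extension, e.g. 020.png")
    # Append one letter per inserted page to the stem of the preceding page.
    return [f"{stem}{letter}{dot}{extension}"
            for letter in string.ascii_lowercase[:count]]


def sorts_between(names: list[str], previous_page: str, next_page: str) -> bool:
    """Check that every name sorts strictly between the two existing pages."""
    return all(previous_page < name < next_page for name in names)


if __name__ == "__main__":
    new_pages = inserted_page_names("020.png", 2)
    print(new_pages)
    print(sorts_between(new_pages, "020.png", "021.png"))
```

A name such as 20a.png fails the check, because without the leading zero it sorts after 021.png rather than between the two existing pages.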

I really messed up and I need to replace a lot of the pages.

If it is a question of replacing pages, rather than adding new ones, just upload the new files to dpscans and send db-req at pgdp dot net an email saying what you want done. Be sure to include the project name and project id in your email!

My project is missing some pages, how do I find them?

First, return to the source of the scans (the content provider, or your library/collection if you were the content provider) and see if you can get the missing pages yourself. If the scans were harvested, it may be a long wait until the archive can fix the project and get the pages to you. It is helpful to file an error report with them; however, it may be faster to go ahead and supplement the pages on your own...

  • Your Library. Try your library! If your local library does not have the book on the shelf, they may be able to get it for you through an interlibrary loan service.
  • Missing Pages Wiki. Post a request here with the edition information about the book and the pages/scans that you need, and other volunteers will also check their local libraries.
  • Missing Page Finders. You can search through these volunteers' library catalogues (or use Worldcat to get a list of libraries that have the book, and then cross-reference with this list), and then contact those that may have access about getting the missing material scanned.

Sometimes it takes a while to get missing page requests filled, but as new Missing Page Finders join the effort and add their libraries, the odds of your project getting fixed increase!

Once you've found the missing pages, please follow the instructions on the db-req page to get them added to the project.

The project is in PP and the PPer noticed something missing. Can I just give the PP the page?

NO! Please follow the instructions on the db-req page and get the files added to the project. The archives are going to be used to create an archive of the page scans. Missing pages will affect the quality of this archive.

FAQ, or what else do I need to know?

What is the difference between a CP and a PM? And what do those other abbreviations mean?

A: The CP or Content Provider supplies the scans to be processed at DP, and may also prepare the files for the proofreaders, but does not necessarily deal with the project beyond that. CPs do not have to be members of DP.

The PM or Project Manager is responsible for creating the project at DP, guiding it through the rounds, answering proofreader/formatter questions, and making decisions that will help create the most consistent output possible for the post-processor. PMs may provide their own content or acquire scans from another CP.

See Jargon related to Project Management for other abbreviations you should know.

Away from DP

If you are going to be away from DP for a period of time, please ask a Project Facilitator for help in monitoring your project. Proofreaders and formatters who wait for questions to be answered often abandon working on a project.

However, if you are a high-volume Project Manager and know you're going to be absent for more than a couple of days, please consider placing holds on your projects until you can return to actively monitoring them -- it can be challenging for Project Facilitators to try to watch over a large number of active projects.

If you are absent for a year or more, your Project Management access may be limited.

If you cannot continue with a project

If you feel, for some reason, that you cannot continue as Project Manager for any or all of your projects, please don't hesitate to ask for help, or to put the projects up for adoption at Content Providers seeking Project Managers.

How much of my time will PMing take?

It depends on how many projects you have going through and how complicated they are. Some projects require you to make lots of decisions regarding formatting, and so getting complete project comments established and providing examples can take some time. On the other hand, if you have a project that adheres completely to the current proofreading and formatting guidelines, it may flow all the way through the rounds without a single question.

If you do not have time for all this, please consider being a Content Provider, and allowing someone else to Manage your projects.

What are the qualifications necessary to become a PM?

A: While there are no specific requirements to become a PM, it is highly recommended that you have a minimum of 6 months' time on site. The general manager, Linda (lhamilton), processes the PM requests on an individual basis, and she should be contacted at dp-genmgr at pgdp dot net with requests.

What kind of equipment do I need to PM?

A: If you are not also the CP, there is no special equipment needed to PM. However, having a copy of GuiPrep installed, along with an OCR package and some image tools, is very helpful.

Are there deadlines? Who sets the schedule? What if the schedule is not met?

A: The only deadlines and schedules are those set by the PM. If as the PM you do not want to set a deadline or schedule, then don't. If you do set a deadline and it passes, then the only one who is going to come down on you is you. Some projects take very little time; others take a long time.

What files do I need to upload?

A: The CP will provide the files that you upload. You should double check these files, making sure they are complete and in good order.

For more information see the Content Providing FAQ.

My special day is not on the special day list, can I get it added?

A: Possibly. It depends on how many books will be added for that special day and whether it will happen only one year or people will use it every year. If it is only a small number of books, or people will only use it the one year, then use the "Otherday" option. If there are several books and it will likely be done again the next year, then send a message to db-req at pgdp dot net and a Site Administrator will consider your request.

I have harvested a project from an archive that is not on the list, can I get it added?

A: Yes. Send a request to ism at pgdp dot net (Image Sources Manager) to get it added.

What does the AutoModify button on the project page do?

A: This manually runs, on a particular project, the clean-up script that is normally run at the end of each round. Any pages out for more than four hours will be reclaimed back to available. Avoid using the button too much: some people check out pages to be done offline and count on having more than four hours to proof them if the project is not close to being finished. Mainly, it should be used if you are waiting to send a project on to the next round and you know the four hours are up on all the incomplete pages, or you want someone to get started on the ones that can be reclaimed.

For an example of what the process looks like, see Automodify log example for queues.
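
The four-hour rule described above can be illustrated with a small Python sketch. This is not the site's actual clean-up script: the function name, the dictionary of checkout times, and the strict "more than four hours" comparison are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Threshold stated in this FAQ; how the real script applies it is assumed.
RECLAIM_AFTER = timedelta(hours=4)


def reclaimable_pages(checkouts: dict[str, datetime], now: datetime) -> list[str]:
    """Return the pages that have been checked out for more than four hours.

    checkouts maps a (hypothetical) page name to the time it was checked out.
    """
    return sorted(page for page, checked_out in checkouts.items()
                  if now - checked_out > RECLAIM_AFTER)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    checkouts = {
        "001.png": datetime(2024, 1, 1, 7, 0),   # out for five hours
        "002.png": datetime(2024, 1, 1, 9, 0),   # out for three hours
    }
    print(reclaimable_pages(checkouts, now))
```

In this sketch only 001.png would be returned to available; 002.png is still within the window a proofer working offline may be counting on.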

What are the command words in the Project Comments?

A: There is currently only one command word that a PM can put at the top of a project's Project Comments. The command word (including the opening and closing parentheses) must be the first characters in the PC. The one remaining command word is: "(nopmq)" which excludes the project from a PM's PM Queue.
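
The "first characters" rule can be sketched as a simple prefix test. This Python snippet is only an illustration of the rule as stated here: how the site itself parses Project Comments is not shown in this FAQ, the function name is hypothetical, and an exact, case-sensitive match is assumed.

```python
NOPMQ = "(nopmq)"  # command word, including both parentheses


def excluded_from_pm_queue(project_comments: str) -> bool:
    """Return True if the Project Comments begin with the (nopmq) command word."""
    # No stripping is done: the command word must be the very first characters.
    return project_comments.startswith(NOPMQ)


if __name__ == "__main__":
    print(excluded_from_pm_queue("(nopmq) Please keep the long s as f."))
    print(excluded_from_pm_queue("Please note: (nopmq)"))
```

Under these assumptions, a leading space or any text before the command word means the project would not be excluded from the PM queue.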

"(HOLD)" has been discontinued as a command word. See Feature to allow PM to place Holds on a per-round basis, for the announcement of the hold state functionality.

How do I request a PM queue?

A: Ask db-req at pgdp dot net; instructions are on the Db-req page.

How do I write Project Comments for a project containing music?

Consult the Music Guidelines for detailed help with projects containing music.

Fixing an incorrect title of a project thread

Occasionally the topic line of the automatically-generated project thread needs to be edited because it contains incorrect characters or is truncated. Project managers can repair this, as long as the first post in the thread was made by them. Adopted projects will require intervention by a PF or squirrel. To fix an incorrect project thread topic line, click on the 'edit post' button on the first post of the thread (it may look like a pencil, depending on the board style you are using). Then edit the subject line to read:

"Book title" by Author

(often you can copy from the body of the post). The subject line of the first post becomes the title of the whole thread. Preview the changes, then submit them.

The same process can be used to edit incorrect text in the body of the first post.

To comment or request edits to this page, please contact Monicas wicked stepmother or DACSoft.
