Beginning Proofreaders'
Frequently Asked Questions (FAQ)

Version 1.7, released May 27, 2004

The purpose of this FAQ is to provide answers to common questions that new people joining us at the Distributed Proofreaders web site have asked. Obviously not all questions can be included here. If you don't find an answer here, you can look in our other documentation pages or email us at DP Help.

  1. What is Distributed Proofreaders?
  2. What is Project Gutenberg?
  3. Why do we pick the books that we do?
  4. How can I help?
  5. How do I handle ...?
  6. How do I contact ...?
  7. What is the entire process for creating an etext?
  8. How can I get copies of the etexts I've worked on?
  9. How can I get copies of other Gutenberg etexts?
  10. I think I messed something up (did something wrong), how can I fix it?
  11. I'm having trouble on the webpage trying to ... Log in/Proofread a page/Get a new page

1. What is Distributed Proofreaders?

Distributed Proofreaders is an effort to support Project Gutenberg, and a recognized affiliated site of Project Gutenberg. The basic concept is that our website software allows several proofreaders to work on the same book at the same time, each proofreading different pages. This significantly speeds up the proofreading process.

How it works:

  1. This website uses online software and databases to create a "library".
  2. People ("content providers") scan books and upload the scanned images into this library.
  3. People like you ("proofreaders") choose a project ("book") to work on today.
  4. The website then shows you a webpage containing the scanned image of one page and the text from that image (as produced by OCR software). This allows you to easily compare the OCR text to the image of the page, so you can note the differences and fix them.
  5. You read the text, and correct it to match the page image. This mostly means fixing OCR errors and marking things like bold or italic text, footnotes, etc. according to our guidelines (so we all mark them the same way).
  6. When done with that page, you save the page, and then either request another page to proofread or quit for the day.
    Note that, at the same time, others will be working on other pages from this book, or from different books. Each proofreader does just a bit (we suggest "a page a day"), but working together we can get a lot of books done! [In 2004, we average 300-400 proofreaders participating each day from countries all over the world, and we finish 4000-7000 pages per day. That's about 4 pages every minute of every day!]
  7. The site stores that proofread page in our database for the next round. (Each book goes through two rounds of proofreading, to try to catch all errors in the text.)
  8. When all the pages in a book have been proofread, a "post-processor" does the finishing work of getting this book ready: combining all the pages into one big file, making sure the markings are consistent, etc., and one last check for errors.
  9. Finally the book is submitted to the Project Gutenberg archive, and is posted on mirror sites all over the world, freely available for anyone to read and enjoy.
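
As a rough illustration of steps 3 through 8, here is a minimal Python sketch of a page moving through two proofreading rounds. It is not the site's actual software: the names `Page`, `Project`, `next_page`, `save` and `ready_for_post_processing` are hypothetical, and how pages are really handed out may differ.

```python
from dataclasses import dataclass, field

# The FAQ states that each book goes through two rounds of proofreading.
ROUNDS_REQUIRED = 2


@dataclass
class Page:
    """One scanned page: the raw OCR text plus the text saved after each round."""
    ocr_text: str
    rounds: list = field(default_factory=list)

    @property
    def current_text(self):
        # The latest proofread version, or the OCR text if nobody has saved yet.
        return self.rounds[-1] if self.rounds else self.ocr_text

    @property
    def done(self):
        return len(self.rounds) >= ROUNDS_REQUIRED


@dataclass
class Project:
    """A book: a title and its list of pages (hypothetical model)."""
    title: str
    pages: list

    def next_page(self, round_number):
        """Return the first page still waiting for the given round, or None."""
        for page in self.pages:
            if len(page.rounds) == round_number - 1:
                return page
        return None

    def save(self, page, corrected_text):
        """Store a proofreader's corrected text as the result of the next round."""
        page.rounds.append(corrected_text)

    def ready_for_post_processing(self):
        """True once every page has been through all required rounds."""
        return all(page.done for page in self.pages)
```

In this sketch a proofreader would call `next_page(1)`, correct `current_text` against the page image, and call `save`. A second proofreader later does the same with `next_page(2)`, and the book goes to the post-processor once `ready_for_post_processing()` returns true.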

2. What is Project Gutenberg?

Michael Hart founded Project Gutenberg in 1971. His idea was: anything that can be entered into a computer can be reproduced indefinitely. This led to the concept of entering books into computers and sharing these books with the whole world.

These Electronic Texts (E-texts) would be made available in the simplest, easiest to use form. This means "Plain Vanilla ASCII." Italics, underlines, and bolds would be converted to ASCII. In the same vein, the books selected would be those that appealed to the greatest number of people possible. Due to copyright laws, it is only legal to do this with older books (in general, copyrighted before 1923). As a result, Project Gutenberg consists mostly of the "Classics."

You can read more about the history of Project Gutenberg here.

3. Why do we pick the books that we do?

The Project Managers pick whatever books we can find. Due to US copyright laws, we are severely limited in the books we are allowed to work with. We go to used and rare bookshops and scour websites and online auctions. We check out rare books from libraries and scan them. We obtain page images from other archive sites. We try to find books that we think people would enjoy reading and that we can find at an acceptable price.

Before selecting a book to convert to an etext, we check Project Gutenberg's list (to make certain that it hasn't already been done) and we check David's In-Progress List (to make certain that it isn't being done by someone else).

In summary, we do whatever books people provide to us (that we legally can). If you have a book that you would like to see done (and it is copyright cleared) we can probably do it (with your help). Contact us at DP Help, or see the "Content Providers" Forum.

4. How can I help?

The process of creating an etext is a long one.

Distributed Proofreaders was set up to make that go faster, by letting you help the Project Managers by proofreading pages in their books. If you have not already done so, click on the "Register" link and make an account. This enables you to select an available book and proofread a few pages. We encourage people to try to do at least "a page a day", but any work done is greatly appreciated and goes a long way toward assisting in creating etexts. This is the way most people help.

If you really catch Distributed Proofreading fever, you may want to become a Project Manager. Project Managers mainly shepherd a project ("book") through the uploading, proofreading and post-processing stages on this website. Sometimes they do most of the tasks themselves; sometimes they coordinate others who are working on the tasks.

If you think that being a Project Manager is for you, read the Project Manager's FAQ. (We do have experienced Project Managers who will mentor you in this process.) When you feel ready, contact us at DP Help.

If you want to do more for the site, but don't have the time or inclination to become a Project Manager, you might consider making a donation. Funding for the site comes entirely from Charles and the Project Managers, and voluntary donations. See the "donate" button on our main page if you wish to make a tax-deductible donation.

You can also donate books (Public Domain) by shipping them to us for scanning (better if they do not need to be returned). You can also scan the books and send us the images (best if you want to keep the book). We would prefer it if you would clear the books first with Project Gutenberg before scanning and sending us the images. Please refer to the Content Provider's FAQ for more details on clearing and scanning books.

So if you want to do more than just proofreading, you can also help by taking on any of the following roles:

  • Content Provider. Does any or all of the following tasks:
    1. Find a suitable (non-copyright) book to proofread.
    2. Obtain copyright clearance for the book.
    3. Run each page of the book through a scanner.
    4. Process each page image through OCR (Optical Character Recognition) software.
    5. Run pre-processing software on the OCR'd file to fix common problems.
    6. Upload the page image files and OCR'd text files to the DP website.
  • Project Manager. See discussion above.
  • Post Processor. Does all the finishing work to take a project from a set of proofread pages into a combined etext file suitable for adding to the Project Gutenberg archive. Combines all the pages into one big file, deals with words or paragraphs split across pages, moves footnotes & sidenotes to the proper place, and generally makes sure that all the proofreaders were consistent in the way they proofread the text, and then finally sends it on to Project Gutenberg.
  • Website Help. We always welcome people to help in the work of maintaining and improving this website. Programmers (PHP, MySQL and some JavaScript) who can work on the website software, beta testers to check out new versions, and document writers to help with our documentation are all needed. Contact DP Help if you would like to help with any of these tasks.

You can do any of these entirely on your own, or you can work together with others to do the tasks. Most of our projects are done by a group of people working together.

5. How do I handle ...?

There are no set "Rules" enforced by Project Gutenberg, but in order to allow the distributed proofreading to work, we have written up our own Proofreading Guidelines and Formatting Guidelines. Please read these and any project comments that a project manager may have provided before starting to proofread. The main goal is to preserve as much formatting as possible, marked the same way, while making the etext readable on a computer. If you are a new proofreader it may be helpful to print out a copy of our 2-page summary, the Handy Proofreading Guide, and keep it handy while proofreading. This covers the basics of proofreading.

Also, some of our projects are marked "Beginners only". These are books that are straightforward, without complex proofreading issues. It's a good idea to choose one of these books when you first start proofreading.

6. How do I contact ...?

You can email DP Help.

Project Managers can be reached by clicking on their names on the Projects page. Each project has a link to the Project Manager in charge of it.

Also, the "Discuss this book" link on the opening page where you start proofreading the book links to the Forum for this book. That's the best place to contact the Project Manager of the book, or to ask questions about the book or ask how to handle some proofreading issue in the book.

7. What is the entire process for creating an etext?

A book follows a long road to become an etext. These steps are covered in more detail in the Project Manager's FAQ.

This Workflow Diagram for the site shows the general flow of material into and out of the site.

8. How can I get copies of the etexts I've worked on?

On the opening page where you start proofreading a book there is an item "Book Completed". Click on "Yes, I would like to be notified when this has been posted to Project Gutenberg." If you do that, when the book is eventually added to the Project Gutenberg Archive, you will receive an email notifying you and giving the link to download this book.

Also, on the DP main page, there is a weekly list of links to recent books completed and sent to Project Gutenberg, books proofread and being post-processed, and books currently in the process of proofreading.

9. How can I get copies of other Gutenberg etexts?

You can go to Project Gutenberg's online catalog and get copies of any etext in the library, including the ones done through Distributed Proofreaders.

10. I think I messed something up (did something wrong), how can I fix it?

Don't panic. We all make mistakes. If you think you made a mistake on the last few pages of a particular project, go back to the Project Page and note the "DONE" links. They reconnect to the last 5 pages you proofread for that project. Click on one, and you can make corrections to your proofreading of that page.

If it's earlier than one of these last 5 pages, or you are not sure that you handled something correctly, leave a note in the Project Forum for that book (reached from the opening page where you started proofreading -- click on "Discuss this Project"). Give the number of the page you were on (if you remember) and what you did. This lets the second round proofreader or the post-processor fix it if it was not correct.

Remember that all your proofread pages will be proofread again in the 'second round' of proofreading. Few mistakes make it by both proofreaders undetected! So just do your best and don't worry. (Second-round proofreading is limited to more experienced proofreaders.)

Also, feel free to leave short notes in the pages as you do them. Just make certain to mark them with an asterisk so that the next proofreader can find them. For example:
      John Smyth* [**image too faint--I can't tell if it's Smythe or Smith here.]
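
Because these notes follow a fixed pattern, they are easy to find mechanically. The Python sketch below shows one way a second-round proofreader or post-processor might list them. It assumes every note has the form `[**...]` as in the example above and contains no nested square brackets; the function name `find_notes` is hypothetical and not part of the site's software.

```python
import re

# Matches a note of the form [**some text], capturing the text inside.
# Assumption: a note never contains a closing square bracket of its own.
NOTE_PATTERN = re.compile(r"\[\*\*(.*?)\]", re.DOTALL)


def find_notes(page_text):
    """Return the text of every [**...] proofreader note found in a page."""
    return [note.strip() for note in NOTE_PATTERN.findall(page_text)]


page = "John Smyth* [**image too faint--I can't tell if it's Smythe or Smith here.]"
for note in find_notes(page):
    print(note)
```

Run on the sample line, this prints the note text without the surrounding `[**` and `]` markers.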

11. I'm having trouble on the webpage trying to ... Log in/Proofread a page/Get a new page

Almost all browser-related problems (not being able to log in, not seeing the proofreading page, not getting a fresh page to proofread after you have proofread your first page) can be solved by verifying that your computer is set with the correct time and date and that your browser options are set the following way:

  1. Cookies accepted/on*
  2. JavaScript enabled

    Also, if in your Preferences (located here), "Launch in New Window" is set to "Yes", then there is the following additional requirement:

  3. Pop-up Windows allowed* (and make sure they aren't being blocked by another utility)

Setting these options correctly solves most problems accessing or using the site. For specific examples of setting these options for various browsers, check the latest info on the DPWiki post, available by clicking here.

* Security note: for security and privacy reasons, many people have some of these options turned off. They must be turned on for the DP website to work. However, they can be limited:

  • Cookies: DP cookies are only for the DP Website, so rather than setting this option to "Accept All Cookies", you can set it to the more restricted option "Accept Cookies for the originating website only".
  • Pop-up Windows: Most browser Pop-up options or pop-up blocking utilities offer an option to list specific sites from which you accept pop-ups. So rather than simply setting it to "Accept all pop-ups", you can set the more restrictive option of "Suppress all pop-ups" but include the DP website (www.pgdp.net) in the Exceptions list.

Note: the exact wording of these options will depend on your browser.

The DP site attempts to cooperate with firewalls, web caches and proxies. If you still find that you get the same page to proofread over and over again, please email us at DP Help and include your browser details.

Revision History of this Document

05/27/2004 -- Version 1.7: Major changes by pourlean - updated for caching changes
04/16/2004 -- Version 1.6: Major changes by pourlean - removed all personal email addresses.
06/16/2003 -- Version 1.5: Additional updates & style revision done by Tim Bonham.
10/27/2002 -- updated version produced by Charles Franks.
10/16/2001 -- original version of this document produced by Robert Rowe.
