User:Srjfoo/temp

From DPWiki

Purpose of this draft: disambiguate/simplify references to duplicates and clearances.

Selecting a book

If you don't already have a book in mind, you'll need to find one. Some things to take into consideration:

Possible sources for books to CP

  • Personal scans from books you own, have borrowed (with the owner's permission) or have checked out of the library.
  • Online scan sources
  • Check the "Books I'd Like to See" topic for the current year to see if something that's been requested appeals to you.

If you plan to also be the Project Manager, be sure to pick a project on which you will enjoy working, because you will be shepherding this project through up to 5 Rounds. This may take several years from start to finish, but much of this time can be spent waiting for your project to be released in a round.

If you want the project to go through the system quickly, pick a popular genre; watch which release queues are moving fast, as this changes regularly.

If you choose to get a book from one of the on-line book archives, please follow the individual site guidelines regarding acceptable use and protocol. We don't want to be bad neighbors. It is considered good form to credit the source of the scan when the text is submitted to Project Gutenberg, so make sure the PM knows its source.

Projects that are part of another publication

If the project is a portion of another publication and the entire publication is no longer under copyright restrictions, please do not separate out the smaller portion of the publication as a separate project.

Check for completeness

At this juncture, regardless of the source, you should check the book for completeness before going any further. See the Project completeness checklist (For purposes of this draft: Project completeness checklist) for a list of the types of things to check.

Digital Library of India/Public Library of India scansets

Some scansets from the Internet Archive that were provided by "Digital Library of India"/"Public Library of India" have incorrect publication information so that works appear to be in the public domain in the US when they aren't. If you are considering using one of these scansets, please verify publication information independently with other sources. If you can't confirm that the publication is Public Domain in the US, please don't run the book. For more information, please read this forum thread.

Difficulty

This applies both to the difficulty of preparing the project and the difficulty for the proofreaders and formatters.

If you're doing your own scanning, see Scan/Download images for more information. Narrow gutters and a large number of illustrations mean more preparation time, and more time for the Post-Processor, but may not be terribly difficult for proofreaders and formatters. Conversely, a book with a good-sized gutter may be easy to scan, yet pages with lots of dense text, tables, inline formatting, sub- and/or superscripts, poetry or block-quotes will still make it more difficult for proofreaders and formatters. See Prepare the Illustrations for more on preparing illustrations.

Is it in the Public Domain?

DP works only on projects that are in the Public Domain as determined by a clearance from Project Gutenberg. For the current year (2022), that usually means the book was published in or before 1926. See clearances, below, for more detailed information.

On 1 January 2023 books first published in 1927 enter the Public Domain in the US. Project Gutenberg is now clearing books in anticipation of that day, but please don't load projects onto DP's server until they are fully in the public domain.

Each year on 1 January, there will be an update in the public domain publication year. For information about handling projects that enter the public domain on 1 January, please read the Handling projects entering Public Domain on 1 January section of the Project Managing FAQ.
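
The year arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb implied by the dates in this section (a book first published in year Y usually enters the US public domain on 1 January of year Y + 96). The function names are hypothetical, and the result is no substitute for an actual clearance from Project Gutenberg.

```python
from datetime import date

# Assumption: the offset of 96 is inferred from the dates given in this section
# (1926 is clearable in 2022; 1927 becomes clearable on 1 January 2023).
YEARS_UNTIL_PUBLIC_DOMAIN = 96


def latest_clearable_year(today: date) -> int:
    """Latest first-publication year that is usually clearable on the given day."""
    return today.year - YEARS_UNTIL_PUBLIC_DOMAIN


def enters_public_domain_on(publication_year: int) -> date:
    """Day on which a book first published in publication_year usually becomes clearable."""
    return date(publication_year + YEARS_UNTIL_PUBLIC_DOMAIN, 1, 1)


print(latest_clearable_year(date(2022, 6, 1)))  # 1926
print(enters_public_domain_on(1927))            # 2023-01-01
```

A book that only passes this check on a future date may already be clearable at PG, but it should not be loaded onto DP's server before that date.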

Check for Duplicates

Make sure the book is not already at PG, or underway at DP. PG's In Progress Search form searches the PG clearance and posted databases by title and/or author, so you can find out whether a book is already at PG, whether a clearance for it has been requested (and when), and whether that clearance has been approved. The clearances searched go back to 2004; the posted titles search covers the whole collection, including those cleared before 2004. You should also use DP's Project Search to check whether a book is in progress at DP, as well as DP Canada's search (which also has the option to search fadedpage).

"Cleared" Status means someone has requested and received copyright clearance, but has not yet finished the project. If this clearance is several years old, it has probably (though not certainly) been abandoned.

If you get a clearance for a book that someone else has already cleared, both you and the other clearance holder will receive an e-mail letting each of you know that the other has a clearance for it. You can then contact them to find out whether they are still working on it, or whether you are free to begin processing it.

Some projects, most notably periodicals and multi-volume editions, will have "blanket clearances." This does not mean that the person who requested the original clearance has all of the volumes ready to scan! Most of these clearances are associated with DP in some way, so if an Überproject doesn't exist for the periodical/set you have (where the PM will often list the volumes they have available), you can post in the Content Provider's forum to find out who's working on what.

If the project says "Posted" it has been posted to Project Gutenberg with the accompanying ebook number. It is a good idea to look at all of these lists by author and separately by title.

Running a project that is already in PG.

Even if a book is already in PG, it may be worth processing again. This will require some legwork to determine, so be sure you feel strongly about the book before pursuing this. PG welcomes different editions, illustrated versions, different translations, etc. In addition, many of the older ebooks have more errors than we would find acceptable today and reprocessing them through DP may be the best way to change that. If the book has a PG number under 10,000 then it probably doesn't have an illustrated version and might be a good candidate for an upgrade.

Below is a list of reasons you might provide an existing PG project through DP. You will need a copyright clearance for each of these cases. For reworks, the PM should put a note into the Project Comments section explaining why the project is being redone. If you are a CP only, then you should include details of why the project is being redone in a text file attached in the project zip file.

Basic upgrade
You have the same text version, there are no illustrations, and the PG version is riddled with errors: Be sure to let PG know, when you upload the final version, that this is a revision of an existing ebook, based on a paper copy in hand. If there are only a few problems, submit them via the PG errata process.
Illustration Upgrade
You have the same text version, but there are illustrations and they are not present in the PG version: Same as the basic upgrade except that you'll be submitting an illustrated html version.
Different Translation
PG will treat this as a completely different ebook and welcomes them. There are already at least half a dozen translations of the Iliad, for example, and more are always welcome.
Different Edition
Some books were published in very different editions. Where this is the case, PG welcomes them as separate ebooks. You will have to document the fact that your edition has significant differences from the version that is already in PG.

Selecting a Project

Which book you pick is up to you. The only requirement is that it be copyright clearable (discussion below). It is best if it is something in which you have interest. Chances are that you will find others who will work on it as well.

Finding a book.

There are several ways to find a project to CP (Content Provide). You can search the library, buy from a local bookstore, raid your own bookshelves, ask a friend, pull them out of the trash, or find projects that are already scanned at some of the many on-line sources for scans. Be sure to pick a project on which you will enjoy working, because you will be shepherding this project through up to 5 Rounds (if you choose to be the project manager), and until the project is posted. This may take several years from start to finish, but much of this time can be spent waiting for your project to be released in a round.

If you want the project to go through the system quickly, pick a popular genre; watch which release queues are moving fast, as this changes regularly.

If you choose to get a book from one of the on-line book archives, please follow the individual site guidelines regarding acceptable use and protocol. We don't want to be bad neighbors. It is considered good form to credit the source of the scan when the text is submitted to Project Gutenberg, so make sure the PM knows its source.

Projects that are part of another publication

If the project is a portion of another publication and the entire publication is no longer under copyright restrictions, please do not separate out the smaller portion of the publication as a separate project.

Check for completeness

At this juncture, regardless of the source, you should check the book for completeness before going any further. See the Project completeness checklist for a list of the types of things to check.

Digital Library of India/Public Library of India scansets

Some scansets from the Internet Archive that were provided by "Digital Library of India"/"Public Library of India" have incorrect publication information so that works appear to be in the public domain in the US when they aren't. If you are considering using one of these scansets, please verify publication information independently with other sources. If you can't confirm that the publication is Public Domain in the US, please don't run the book. For more information, please read this forum thread.

Difficulty.

Some things can make the project harder than others. The amount of time you wish to spend on this should be considered. Check the inner margin (gutter) of the book. The wider this is, the easier it will be to scan, and the fewer extra measures you'll need to take in OCR and answering forum questions. This does not mean that you should not work with books that have a narrow gutter, just that they will be much harder. Projects with a lot of illustrations are also harder and more time-consuming. This will be discussed more under Scan/Download images and Prepare the Illustrations.

Copyrights and clearances.

Do a preliminary check to see if it is clearable. For the current year (2022), usually that means it was published in or before 1926. See clearances, below, for more information.

On 1 January 2023 books first published in 1927 enter the Public Domain in the US. Project Gutenberg is now clearing books in anticipation of that day, but please don't load projects onto DP's server until they are fully in the public domain.

Each year on 1 January, there will be an update in the public domain publication year. For information about handling projects that enter the public domain on 1 January, please read the Handling projects entering Public Domain on 1 January section of the Project Managing FAQ.

Check for Duplicate Projects

Make sure the book is not underway or already at PG. PG's In Progress Search form searches the PG clearance and posted databases by title and/or author, so you can find out whether a book is already at PG, whether a clearance for it has been requested (and when), and whether that clearance has been approved. The clearance listings searched go back to 2004. You should also use DP's Project Search to check whether a book is in progress at DP, as well as DP Canada's search (which also has the option to search fadedpage).

"Cleared" Status means someone has requested and received copyright clearance, but has not yet finished the project. If this clearance is several years old, it has probably (though not certainly) been abandoned.

If you get a clearance for a book that someone else has already cleared, both you and the other clearance holder will receive an e-mail letting each of you know that the other has a clearance for it. You can then contact them to find out whether they are still working on it, or whether you are free to begin processing it.

Some projects, most notably periodicals and multi-volume editions, will have "blanket clearances." This does not mean that the person who requested the original clearance has all of the volumes ready to scan! Most of these clearances are associated with DP in some way, so if an Überproject doesn't exist for the periodical/set you have (where the PM will often list the volumes they have available), you can post in the Content Provider's forum to find out who's working on what.

If the project says "Posted" it has been posted to Project Gutenberg with the accompanying ebook number. It is a good idea to look at all of these lists by author and separately by title.

Running a project that is already in PG.

Even if a book is already in PG, it may be worth processing again. This will require some legwork to determine, so be sure you feel strongly about the book before pursuing this. PG welcomes different editions, illustrated versions, different translations, etc. In addition, many of the older ebooks have more errors than we would find acceptable today and reprocessing them through DP may be the best way to change that. If the book has a PG number under 10,000 then it probably doesn't have an illustrated version and might be a good candidate for an upgrade.

Below is a list of reasons you might provide an existing PG project through DP. You will need a copyright clearance for each of these cases. For reworks, the PM should put a note into the Project Comments section explaining why the project is being redone. If you are a CP only, then you should include details of why the project is being redone in a text file attached in the project zip file.

Basic upgrade
You have the same text version, there are no illustrations, and the PG version is riddled with errors: Be sure to let PG know, when you upload the final version, that this is a revision of an existing ebook, based on a paper copy in hand. If there are only a few problems, submit them via the PG errata process.
Illustration Upgrade
You have the same text version, but there are illustrations and they are not present in the PG version: Same as the basic upgrade except that you'll be submitting an illustrated html version.
Different Translation
PG will treat this as a completely different ebook and welcomes them. There are already at least half a dozen translations of the Iliad, for example, and more are always welcome.
Different Edition
Some books were published in very different editions. Where this is the case, PG welcomes them as separate ebooks. You will have to document the fact that your edition has significant differences from the version that is already in PG.

Get a clearance

You have obtained a book, and have decided that it is both clearable and not already in PG or in progress, or you have a book you think is clearable and need to find out for sure. In both cases, it is time to ask the experts.

You will need to have scans of the Title Page and Verso (the back of the title page), also known as the TP&V. You may need scans of other material as well, such as an inscription on the fly-leaf, in order to establish date.

Check for Duplicates

The clearance team does not check for duplicate clearances. In addition, having a clearance does not mean you "own" that title for some period of time.

Before you move forward with a clearance request, please make sure that the book isn’t already in progress at common proofreading sites:

  • David's List: Clearances more than five years old roll off; new clearances may not make it on the list for a month or two, but in general, this is a good place to check and see if a book has been cleared within the last five years.
  • In progress at DP, DP Canada. Note that the extended search at DP Canada, linked to above, can also search their FadedPage display site.

DP also has an in-progress check script that combines several searches, but carries the warning at the top of the page: Do not rely solely on the information returned by this script. It does not check DP Canada or their FadedPage display site or any other sites similar to ours which may have publicly available searches. It's often best to use multiple search terms and search sites individually, rather than depending on a composite search.

If your search string doesn't find anything, please try variations. Sometimes projects are duplicated because the search string was too restrictive. Common words may find too many results, but the shortest string that will uniquely identify a title is usually best. If the title is extremely common ("Poems", for example), try searching for both title and author.
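
One way to think about "try variations" is to generate a handful of progressively shorter search strings from the title before visiting each search site. The sketch below is illustrative only: the helper names are hypothetical, it does not query PG, DP, or DP Canada, and the choice of variations (full title, main title without subtitle, main title without a leading English article, title plus author) is an assumption rather than an official procedure.

```python
import re

LEADING_ARTICLES = ("the", "a", "an")


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def search_variants(title: str, author: str = "") -> list[str]:
    """Return distinct search strings to try, in the order they were generated."""
    full = normalize(title)
    # Main title only: everything before the first colon or semicolon.
    main = normalize(re.split(r"[:;]", title, maxsplit=1)[0])
    candidates = [full, main]

    words = main.split()
    if len(words) > 1 and words[0] in LEADING_ARTICLES:
        candidates.append(" ".join(words[1:]))

    # For very common titles, also try title and author together.
    if author:
        candidates.append(f"{main} {normalize(author)}")

    distinct = []
    for candidate in candidates:
        if candidate and candidate not in distinct:
            distinct.append(candidate)
    return distinct


for variant in search_variants("The Story of the Little Red Hen: A Tale"):
    print(variant)
```

Each resulting string still has to be entered by hand on the individual search pages; the point is simply to avoid relying on a single, overly restrictive search string.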

Copyright Clearance.

Copyright clearance is a process by which Project Gutenberg determines if a book is in the public domain according to the copyright laws of the United States. Project Gutenberg maintains a set of Rules that are used to determine if a book is clearable. This DP site operates under U.S. law; if you cannot obtain a clearance, your book cannot be processed through this site.

Please read PG's copyright clearance rules for details.

If your book is not clearable under PG's rules, but the author and everyone else associated with the book (e.g., the illustrator, editor, translator) have been deceased for at least 70 years, you may wish to send the book to one of our sister sites, DP Canada.

As of June, 2018, Project Gutenberg is approving clearances for books that will become Public Domain as of January 1, 2019 and January 1, 2020. Please do not upload projects that are not in the public domain in the US as of the time you upload the project files. See this post.

Create a PGLAF Account.

The next step is to set up an account at PGLAF (the branch of PG that handles clearances). If you have Direct Uploading or PPV access, you already have a PGLAF account.

To create a new account, browse to PGLAF and read the welcome page. It contains a lot of useful information on the clearance process, and a number of useful links. Next, click the New username link and fill out the form. Be sure that the email address you enter is valid and is checked regularly; this is the address where posted notices and clearance notifications go, and also where you will be contacted if a conflict occurs.

Submit a Clearance Request.

After completing the registration process, log in, and select "Submit a New Clearance Request". A large form will appear; most of the information required should be available directly from the title page of your book. If not, you will have to do some research. Document any findings in the field provided; be sure to list the source of any information not found on your book's title page and verso (the back of the title page). If a date is listed twice in different contexts (separate publication date and copyright date, for example), enter it twice. When attaching images, remember that they should be small (100 KB is a reasonable maximum; most should be smaller), but the smallest text should still be legible. Multi-volume works can be cleared in a single clearance request if the dates are the same, or if you provide the earliest and latest title and verso.
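
Before attaching scans, it can help to check their file sizes against the size guideline above. The following is a minimal sketch, not part of any PG or DP tool: the function names are hypothetical, and treating the limit as 100 * 1024 bytes is an assumption. It also cannot tell you whether the smallest text is still legible; that needs a look at the image itself.

```python
from pathlib import Path

# Assumption: the suggested maximum is read here as 100 * 1024 bytes.
MAX_SCAN_BYTES = 100 * 1024


def is_small_enough(size_bytes: int, limit: int = MAX_SCAN_BYTES) -> bool:
    """True if a file of this size is within the suggested maximum."""
    return size_bytes <= limit


def oversized_scans(paths, limit: int = MAX_SCAN_BYTES):
    """Return (file name, size in bytes) pairs for files larger than the limit."""
    too_large = []
    for path in map(Path, paths):
        size = path.stat().st_size
        if not is_small_enough(size, limit):
            too_large.append((path.name, size))
    return too_large
```

For example, oversized_scans(["title.png", "verso.png"]) would list whichever of those two (hypothetically named) files needs to be reduced before it is attached.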

Checking the Clearance Registration Form details

Before you submit your clearance request, it is important to carefully review the title page scan of your book and the publication information and verify that the information you have entered on the form is correct. Please ensure that:

  • Author field contains data
  • Author's name is spelled correctly
  • All authors listed on title page are listed in upload form
  • Any illustrator, translator, editor, etc., listed on title page is listed in upload form
  • Title is complete and all words are spelled correctly
  • Title is appropriately capitalized
  • If English, the title and subtitle should be capitalized using Sentence case.
  Example: The story of the little red hen
  • For titles in languages other than English (LOTE), please follow the capitalization used on the book's Title Page. However, if the title there is fully capitalized, please use the conventions for title capitalization common in the book's language (If in doubt, you may refer to the capitalization used for books in that language within the catalogs of major libraries such as the Library of Congress or WorldCat).
  • Subtitle listed on title page is listed in upload form
  • If you are uploading a periodical, please check Project Gutenberg for previously posted issues of that periodical and follow the title formatting they used.
  • If your project is part of a multiple volume set, state the number, e.g. English History, (Vol. 2/6).
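
The sentence-case rule for English titles in the checklist above can be illustrated with a short sketch. It is a simplification under stated assumptions: the function name is hypothetical, proper nouns must be supplied by the caller because they cannot be detected automatically, and words with attached punctuation (such as "England,") are not matched against the proper-noun list.

```python
def sentence_case(title: str, proper_nouns=()) -> str:
    """Capitalize the first word, lowercase the rest, and keep listed proper nouns."""
    keep = {noun.lower(): noun for noun in proper_nouns}
    result = []
    for index, word in enumerate(title.split()):
        lowered = word.lower()
        if lowered in keep:
            # Caller-supplied proper noun: use the spelling provided.
            result.append(keep[lowered])
        elif index == 0:
            result.append(lowered.capitalize())
        else:
            result.append(lowered)
    return " ".join(result)


print(sentence_case("The Story Of The Little Red Hen"))
# The story of the little red hen
print(sentence_case("A HISTORY OF ENGLAND", proper_nouns=["England"]))
# A history of England
```

This applies to English titles only; for other languages, follow the capitalization guidance given in the checklist.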

Note: It is very important that all the information you enter in the clearance registration form match the details of the project for which you are requesting clearance: The uploaded item MUST match the clearance. This certainly includes all the publication metadata (publisher, location and date). If the information does not match, then Project Gutenberg should not accept the upload of your project once it completes post-processing.

Types of Clearances.

There are several types of clearances. The most common is rule 1, but some others are used on occasion. Project Gutenberg only clears based on the United States Copyright Laws. However, if you would like a detailed discussion of copyrights in other countries, visit The Online Books Page.

Wait.

All that is left now is to wait for the results of your request. Basic clearances using the standard rules are usually processed within several weeks. Rule 6 clearances, which require more research, usually take longer (and may require further research on your part before they clear). You must receive the clearance before loading the project onto DP.

You may get a response that says NOT OK. A reason for the denial of the clearance will always be given. Be sure to check that reason, since technical difficulties such as corrupted files can easily generate this response. Feel free to resubmit your clearance request after correcting whatever problem was noted.