DP Jargon



Jargon Guides

Organizations and specialized activities develop their own sets of specialized terminology, or jargon, and DP is no exception to that. Accordingly, we have developed some FAQ-like Jargon Guides you can access in order to learn some of our lingo.

The LONG DP Jargon Guide, and the Jargon Guides related to The Guidelines, User Roles, and Workflow contain acronyms and terms you will likely encounter as a new volunteer at DP.

Other Jargon Guides contain terms that are a bit more specialized. The Group Activities Jargon Guide will become especially relevant to you if you start using Jabber. The remaining Jargon Guides shown in the Jargon Navigator box relate to the specific activities mentioned in their titles.

If you come across an acronym or term that isn't mentioned in one of these Jargon Guides, please ask about it in one of the DP forums.

Detailed suggestions on how best to add and edit Jargon-related information can be found at Help:Jargon.



Activity Hub

The Activity Hub is a main entry page for the Distributed Proofreaders (DP) website, containing links to all the rounds and other workflow stages. It provides a progress overview for the various stages of e-book production since midnight of the current day and shows which stages you can work in.


avatar

An avatar is a small graphic image, chosen by an individual Distributed Proofreaders (DP) volunteer, to display on their DP User Details page and on their posts to the DP forum.

BAE

See Bureau of American Ethnology.

bad word list

A bad word list (BWL) is a list of words compiled by Distributed Proofreaders that are flagged in WordCheck.

begin/beginners only project

A beginners only (also known as a BEGIN) project is an EASY project set aside to be proofed in the P1 round by Distributed Proofreaders' newest volunteers.

Big Eye, The

The Big Eye (a.k.a. "Show All Text") button gets its name from its icon in the Enhanced proofing interface. The same functionality is available in the Standard proofing interface through the button labeled "Show All Text."

Clicking this button opens up a new browser window and displays the proofread text as it would appear on an HTML-formatted page. You see italics, bold, etc.

Its usefulness for verifying mark-up during formatting rounds can be priceless. But, it can also help when you're proofreading because you will see the text in a different font and slightly different format. Sometimes that's all it takes for a sneaky scanno to suddenly jump off the page at you!


blackletter

Blackletter (also known as Gothic script) is an old font style used mostly in the 12th-15th centuries. Modern readers often find it difficult to read. There are many varieties of blackletter, but one example is:

Sample Text

For more on blackletter at DP, see also Proofing blackletter, Fraktur, and Common Fraktur OCR errors.

For more details on blackletter itself, consult Wikipedia.


block quotation or block quote

A block quotation is a long quotation (typically several lines and sometimes several pages) and is often (but not always) printed with wider margins or in a smaller font size—sometimes both.


Bureau of American Ethnology

The Bureau of American Ethnology (BAE) was a government-sponsored organization in the United States that coordinated and reported anthropologic research in the Americas. See Wikipedia's article about the Bureau of American Ethnology for more in-depth information.

BWL

See bad word list.

CP: Content Provision/Provider

Content Providing/Provision (CP) is the process of providing the page images used in proofreading, either by scanning a book or harvesting the images from an online source.

Also a person who does such work (Content Provider, or CPer).

If you are interested in becoming a CPer, visit Access Requirements.

You can automate some content providing tasks by using guiprep and guiguts. For more information you can see the Content Providing FAQ.


DG: Daily Goal

Each round has a Daily Goal (DG) which is the number of pages DP is aiming to proofread or format in that round for the given day. The daily goal for each round is visible in the top right corner of each round's page. It's labeled "Today's Goal" (just to confuse you!).


diacritical marks

Diacritical marks, sometimes referred to as diacritic marks or the "short-hand" term diacriticals, are small marks found above or below a basic character which change the pronunciation of that character. For example, the acute accent over the "e" in the character "é" is a diacritical mark.

Characters with diacritical marks may be proofed different ways in DP projects. If the needed character is available in an active character suite, it may be directly input or selected from a picker. If it is not in an active character suite it should be represented by the method described in Proofing Guidelines—Characters with Diacritical Marks.
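As a rough illustration of what "a basic character plus a mark" means in Unicode terms, the sketch below splits a character into its base letter and any combining marks using Python's standard unicodedata module. The helper name split_diacriticals is made up for this example; it is not part of any DP tool and says nothing about how DP character suites are implemented.

```python
import unicodedata

def split_diacriticals(ch: str) -> tuple[str, list[str]]:
    """Return the base character and the Unicode names of its combining marks."""
    # NFD normalization separates a precomposed character into base + marks.
    decomposed = unicodedata.normalize("NFD", ch)
    base = decomposed[0]
    marks = [unicodedata.name(c) for c in decomposed[1:] if unicodedata.combining(c)]
    return base, marks

# U+00E9 is the precomposed "e with acute accent" mentioned above.
print(split_diacriticals("\u00e9"))  # ('e', ['COMBINING ACUTE ACCENT'])
```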


diffs

Diffs (short for differences) are the changes made to the text of a project's individual pages as it progresses through each round at DP. The term can also refer to the Webpages at DP where you can view such changes. A "diff" doesn't necessarily mean there was something wrong on a page, just that the page text coming out of the subsequent round is different.

Proofreaders often help themselves improve their proofing and formatting skills by examining and analyzing their diffs. See Checking your diffs for more details.
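To make the idea of a diff concrete, here is a minimal sketch that compares two versions of a page's text with Python's standard difflib module. This is not the DP site's own diff viewer; the function name page_diff, the "P1"/"P2" labels, and the sample sentences are assumptions chosen for illustration.

```python
import difflib

def page_diff(before: str, after: str) -> list[str]:
    """Return unified-diff lines between two versions of a page's text."""
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="P1", tofile="P2", lineterm=""))

# Hypothetical page text as it left one round and the next.
p1_text = "It was a dark and ftormy night.\nThe rain fell in torrents."
p2_text = "It was a dark and stormy night.\nThe rain fell in torrents."

for line in page_diff(p1_text, p2_text):
    print(line)
```

Lines starting with "-" come from the earlier version and lines starting with "+" from the later one; identical texts produce no output at all, which matches the point that a diff only records that something changed.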


Distributed Proofreaders Europe

Distributed Proofreaders Europe (DPE, DP-EU or DP-Europe) is a sister site of Distributed Proofreaders. Click here to go to their website.


DP: Distributed Proofreaders

Distributed Proofreaders (DP) refers to this Website, the organization behind it, the community using it (sometimes referred to individually as DPers, short for DP users), etc., in any or all combinations.

For a general introduction to DP, the organization, see New Volunteer Frequently Asked Questions.


DP Wiki

The DP Wiki is this document you are reading right now, and its brothers, sisters, children, cousins, parents, and other relations. As with any family, it is always growing and changing. What makes it so special is that you and every other DPer are what/who can make it grow and change.

In the DP Forums, the DP Wiki will often be referred to simply as "the Wiki."

For more information (of a less allegorical type), see DP Wiki. Also, compare to DPWiki.


DP-EU

See Distributed Proofreaders Europe.


DP-Europe

See Distributed Proofreaders Europe.


dp-feedback

dp-feedback is a username shared by a group of Distributed Proofreaders volunteers who provide proofreading and formatting feedback on request. Although this assistance is most often requested by P1ers, P2ers, and F1ers, all volunteers are welcome to ask for feedback.

DPCustomMono

In order to help proofers detect OCR text errors, Big Bill developed a custom font for DP called DPCustomMono2. You can read about the history of the font and why it was developed in the Custom DP proofing font thread.


DPE

See Distributed Proofreaders Europe.


DPF: Distributed Proofreaders Foundation

The Distributed Proofreaders Foundation (DPF) is the legal entity behind DP.

Created in May 2006, DPF is a non-profit corporation registered in New Jersey, USA. Its purpose is to support the primary website (pgdp.net) and possibly other DP sites/implementations, as determined by the Board of Trustees.


DPWiki

DPWiki was formerly an experimental, shared public identity on the DP Forums. Information that was formerly in forum posts labeled as written by "DPwiki" was migrated to the DP Wiki in May of 2006.


DU

Direct-Uploading (DU) is the ability to send a post-processed text directly to the PG Whitewashers without it needing to be checked in PPV. It is given to any PPer who consistently produces quality work over a number of projects as stipulated in the access requirements according to the PPV Guidelines.

Also, a person who does such work (also DUer).


F1

F1 refers to Formatting round 1, the first round of formatting, in which markup for italics, boldface, SMALL CAPITALS, chapter and section headers, footnotes, etc., is added to individual pages in a project.

To see how you can qualify to work in F1, see the Access requirements article and the F1 round page.

See also F2, proofreading, DP-feedback, and Formatting Mentoring.


F2

F2 refers to Formatting round 2, the second round of formatting, in which F1 markup is checked and corrected.

To see how you can qualify to work in F2, see the Access requirements article, the F2 round page, and this forum post.

There is a team, F2 Fanatics, dedicated to moving projects through F2 towards completion more efficiently by concentrating efforts on a few projects. The F2 Fanatics project list shows the team's current and previous projects.


FAQ

We have so many write-ups of Frequently Asked Questions (FAQs) that we have a FAQ Central, and are developing an All FAQs page in DP Wiki.


Fast Formatting Feedback

Fast Formatting Feedback (FFF) is a designation for projects that are fast-tracked into F2 after finishing F1 so that F1 formatters get their diffs quickly. Since the F2 queue for English is currently quite long and getting longer, this is an opportunity for F1 formatters to learn from their diffs.

FFF projects have {Fast Formatting Feedback} in the title and are usually announced in the F1 news when available.


feedback

All well-designed systems have feedback mechanisms built into them, and the DP Workflow system is no exception in this regard. Accordingly, there are many kinds of feedback that are exchanged between volunteers here at DP.

  • One of the first types of feedback a new volunteer at DP is likely to receive, and the kind most likely to be the type being referred to when you see the word "feedback," will come from a mentor via the beginners only projects and/or the DP-feedback mechanisms, through which experienced proofreaders send detailed and constructive comments to proofreaders regarding what they did correctly and incorrectly (with respect to the Guidelines) on specific pages in specific projects, and on ways that the proofreaders can learn to become even better proofreaders.
  • As project pages progress through the rounds, each successive proofer and foofer may be creating diffs which can serve as feedback to the previous proofer or foofer. This is the primary mechanism by which proofers in Newcomers Only projects receive feedback.
  • PP Mentors give feedback to new PPers; HTML Mentors give help and feedback to PPers who are new to HTML; and one of the PPVers' primary tasks is to give feedback to PPers.

In addition to these relatively formalized and routine feedback mechanisms, any volunteer can ask for feedback on any issue, question, procedure, etc., in a project thread or any other place in the DP Forums which may seem to be appropriate.


foofing/foofer

Foofer is an informal term for formatter.

See also formatting, and compare to proofer.


footnote

See out-of-line footnote.

formatting

Formatting is the process of adding markup for italics, boldface, SMALL CAPITALS, chapter and section headers, footnotes, etc., to a project. Formatting of a project's individual pages is performed in rounds F1 and F2. Other, more project-wide, formatting is done in the PP stage.

See also foofing, and compare to proofreading.


Formatting Fast Track

Formatting Fast Track (FFT) projects are released into F2 immediately after completing F1. Formatting Fast Track allows F1s trying to qualify for F2 to work on projects with a variety of formatting needs that will get through F2 quickly so that they can be included in an F2 evaluation. Other F1s can also benefit from Fast Track by getting faster diffs.

FFT projects have {FFT} in their title.


Formatting Guidelines

Formatting Guidelines refers to a document that contains all the default instructions and standards for formatting (such as markup for italics, illustrations, footnotes, and poetry) in rounds F1 and F2. These standards apply to all projects, unless specifically overridden by instructions from the Project Manager in the Project Comments or the project thread.

You can access the Formatting Guidelines from FAQ Central and from any Proofing Interface window.

See also Proofreading Guidelines.


forum

A forum is an online discussion site and message board. Wikipedia's article about forums has a more detailed definition.

Fraktur

Fraktur is a particular kind of blackletter font, used principally for German, but occasionally for Scandinavian languages. Modern readers often find it difficult to read.

For more on Fraktur at DP, see also Common Fraktur OCR errors, Antiqua, and Proofing blackletter.

For more details on Fraktur itself, consult Wikipedia.

A small introduction into Fraktur can be found in keine Angst vor Fraktur or Don't be afraid of Fraktur.


FTC: Fine-Toothed Comb

A virtual Fine-Toothed Comb (FTC) is issued to all P3 proofers to help the proofers "comb out" those last little teeny tiny sneaky evil stealth scannos.


ftealth fcanno

A ftealth fcanno is a specific kind of stealth scanno that occurs when a long s character forms a valid word in an OCR text, but not the word that appears in the page image.
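The sketch below shows one way to hunt for possible ftealth fcannos, assuming (as the name suggests) that the OCR read a long s as "f". It swaps each "f" in a word for "s" and reports any result that is also a valid word. The function name, the tiny word set, and the lowercase-only handling are all assumptions for illustration; this is not a description of any DP tool.

```python
from itertools import product

def long_s_alternatives(word: str, dictionary: set[str]) -> list[str]:
    """List dictionary words reachable by changing one or more 'f's to 's'."""
    positions = [i for i, c in enumerate(word) if c == "f"]
    found = []
    # Try every combination of keeping 'f' or substituting 's' at each position.
    for choice in product("fs", repeat=len(positions)):
        chars = list(word)
        for pos, ch in zip(positions, choice):
            chars[pos] = ch
        candidate = "".join(chars)
        if candidate != word and candidate in dictionary:
            found.append(candidate)
    return found

# Hypothetical miniature dictionary.
WORDS = {"fame", "same", "fight", "sight", "fat", "sat", "fuss"}

print(long_s_alternatives("fame", WORDS))  # ['same']
print(long_s_alternatives("fuss", WORDS))  # []
```

A non-empty result only means the word deserves a second look against the page image; "fame" may well be what the book says.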

good word list

A good word list (GWL) is a list of words compiled by Distributed Proofreaders that are not flagged in WordCheck.
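How good and bad word lists might interact with a spelling check can be sketched as follows. This is a guess at the general idea, not WordCheck's actual logic: the function name flagged_words, the punctuation handling, and the rule that a bad-word entry wins over everything else are assumptions made for this example.

```python
def flagged_words(text: str, dictionary: set[str],
                  good_words: set[str], bad_words: set[str]) -> list[str]:
    """Return the words a simple checker would flag, in order of appearance."""
    flagged = []
    for raw in text.split():
        word = raw.strip('.,;:!?"()').lower()
        if not word:
            continue
        # Assumption: bad words are always flagged; unknown words are flagged
        # unless the good word list vouches for them.
        if word in bad_words or (word not in dictionary and word not in good_words):
            flagged.append(word)
    return flagged

# Hypothetical data: "modem" is a real word but a likely misreading of "modern".
dictionary = {"he", "wore", "a", "modem", "hat", "in"}
text = "He wore a modem hat in Zanzibar."

print(flagged_words(text, dictionary, {"zanzibar"}, {"modem"}))  # ['modem']
print(flagged_words(text, dictionary, set(), {"modem"}))         # ['modem', 'zanzibar']
```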

GoEG: Grammar of English Grammars

The Grammar of English Grammars (GoEG) is a huge prescriptive grammar of English (duh!) from the early 19th century; a paragon of linguistic fussiness and a legend for our times.

This text was proofed at DP during 2003, and is now available at Project Gutenberg (etext:11615).


groofing

Groofing refers to group-proofing, in which we cooperate and coordinate our efforts through the DP groupchat window on the Jabber network, the "Anyone for leapfrog?" forum thread, or the Groofers and Gfoofers team thread, proofreading the same text, encouraging each other, and sharing wisdom about the project.

A list of scheduled and past Groofs can be found at Scheduled groofs.

A Groofers and Gfoofers team has been established for those who enjoy Groofing and Gfoofing.


Guidelines, The

The Guidelines, often with a capital "G," refers to the Proofreading Guidelines and Formatting Guidelines documents, which contain the default instructions for working in the rounds.

"The Guidelines" can be accessed from FAQ Central and from any Proofing Interface window.

(Before June 2005, when "The Change" occurred, there was just one set of guidelines called the "Proofreading Guidelines," which were similar to the current formatting guidelines. Before June 2003, they were called the "Document Guidelines.")


GWL

See good word list.

Handy Guides

The "Handy Guides" are brief, printable (PDF format), self-exemplary summaries of the Guidelines. There are separate versions for proofreading and formatting.

You can access the Handy Guides at FAQ Central; they are listed as the "Proofreading and Formatting Summaries."


HTML: HyperText Markup Language

HTML is the abbreviation for Hyper-Text Markup Language. HTML text is normal (e.g. ASCII) plaintext but with certain parts of the text marked up to denote special formatting or layout or other properties, or to link it with other texts (hence the term hyper-text). A browser uses this information to render the text accordingly (for example with portions in bold or italics).


illo

Illo is an informal abbreviation for illustration.

Jabber

Jabber is an open-source instant messaging system commonly used by DPers. Members meet regularly in the conference rooms available through Jabber to socialize, ask/answer DP-related questions, and participate in group activities such as groofing and grentoring.



Jabber ID

In order to communicate via Jabber, each user must have a unique identity known as a Jabber ID, which looks confusingly like an e-mail address, and is, like an e-mail address, one of a kind.

Many DPers have Jabber IDs, most of which can be found in the PGDP Jabber IDs list.


LaTeX

LaTeX is a high-quality typesetting system, with features designed for the production of technical and scientific documentation. LaTeX is the de facto standard for the communication and publication of scientific documents. (From latex-project.org.)

Many DP projects that require technical markup, such as mathematical or scientific notation, are formatted using LaTeX.


Latin-1

Latin-1 (or more formally, ISO-8859-1) is a character encoding standard. It defines a set of characters used for major western European languages.

The Distributed Proofreaders website used Latin-1 for processing all of its books, from its creation until May 19th, 2020. After that, it changed to the UTF-8 encoding of Unicode.


More information can be found at Wikipedia.
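The practical difference between the two encodings can be seen in a few lines of Python. The sketch below uses only the standard str.encode method; the helper name fits is hypothetical, and nothing here reflects how the DP site code itself handles text.

```python
def fits(ch: str, encoding: str) -> bool:
    """Report whether a character can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

e_acute = "\u00e9"       # e with acute accent
oe_ligature = "\u0153"   # oe ligature

print(e_acute.encode("latin-1"))     # b'\xe9' (one byte)
print(e_acute.encode("utf-8"))       # b'\xc3\xa9' (two bytes)
print(fits(oe_ligature, "latin-1"))  # False: not part of Latin-1
print(fits(oe_ligature, "utf-8"))    # True: UTF-8 covers all of Unicode
```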


ligature

A ligature is a single typographical character that combines two letters that are usually printed as separate characters.

Here at DP, if the ligature is available in the Basic Latin character set, we usually proof it that way. For example, the "ae" ligature is normally proofed as the "æ" or "Æ" character, as appropriate.

Most other ligatures are proofed as their component characters with no special markup, but if a particular ligature will be seen frequently in a given project, the PM will usually have addressed the issue specifically in the Project Comments.

For more information related specifically to æ and œ, including how to distinguish between the two in italics fonts, see æ and œ ligatures.

To see some examples of other ligatures you may run across in projects, see Proofing blackletter, Proofing Civilité, Proofing old texts, and Transliterating Greek.


LoC/LOC

LoC and LOC are the standard abbreviations used to refer to the Library of Congress (U.S.).


LOTE: Language-Other-Than-English

Projects written in a Language Other Than English are generally referred to in postings to the DP forum by the acronym LOTE, both lovingly and rancorously.

DP primarily processes projects in LOTEs that use the characters found in our Basic Latin character suite. Additional character suites will allow other languages to be worked on as well.


markup

Here at DP, the term markup generally refers to the various tags that are or have been inserted into documents to format or otherwise designate data for some type of special handling.

Different styles of markup are used in different types of documents. For example, to indicate a reference to footnote "number 1" in

  • an HTML document, use markup like this: <sup>[<a name="1" href="#1">1</a>]</sup>
  • page text in the Proofing Interface, use markup like this: [1]

And to bold text in

  • a BBCode forum posting, use markup like this: [b]bold text[/b]
  • a DP Wiki article, use markup like this: '''bold text'''
  • an HTML document, use markup like this: <b>bold text</b>

For italic text in

  • a BBCode forum posting, use markup like this: [i]italic text[/i]
  • a DP Wiki article, use markup like this: ''italic text''
  • an HTML document, use markup like this: <i>italic text</i>
  • a plain text document, use markup like this: _italic text_
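The bold and italic examples above can be collected into a small lookup table. The sketch below is only an illustration of how the same emphasis is written in each markup style; the table and the function name mark_up are invented for this example, and the table deliberately contains only the combinations listed above (no bold style is given for plain text).

```python
# Opening and closing tags for each target format, taken from the lists above.
MARKUP = {
    "bbcode": {"bold": ("[b]", "[/b]"), "italic": ("[i]", "[/i]")},
    "wiki": {"bold": ("'''", "'''"), "italic": ("''", "''")},
    "html": {"bold": ("<b>", "</b>"), "italic": ("<i>", "</i>")},
    "plain": {"italic": ("_", "_")},
}

def mark_up(text: str, style: str, target: str) -> str:
    """Wrap text in the markup for the given style and target format."""
    try:
        opening, closing = MARKUP[target][style]
    except KeyError:
        raise ValueError(f"no {style!r} markup known for {target!r}") from None
    return f"{opening}{text}{closing}"

print(mark_up("bold text", "bold", "bbcode"))     # [b]bold text[/b]
print(mark_up("italic text", "italic", "wiki"))   # ''italic text''
print(mark_up("italic text", "italic", "plain"))  # _italic text_
```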


mentoring

There are various kinds of mentoring that go on at DP. There is mentoring for new Project Managers, mentoring for new Post-Processors, and mentoring for Post-Processors who are new to HTML. There is mentoring done for P1ers and F1ers via the DP-feedback mechanism. There is also a team of volunteers who provide Formatting Mentoring who will work with F1ers who would prefer one-to-one assistance.

The most visible form of mentoring at DP, however, is the mentoring of new proofreading volunteers.

The mentoring of new volunteers comes in two forms.

  • Experienced proofreaders proof Proof-only Mentoring projects, which were originally Newcomers Only projects proofed by new volunteers in P1. The diffs created serve as the primary feedback mechanism for newcomers who prefer this less-detailed, serve-yourself, type of mentoring approach.
  • Experienced proofers with "mentor status" proof mentors only projects in P2, in order to send detailed feedback via Private Message (PM) to the new volunteers who proofed the beginners only projects in P1. The PMs serve as the primary feedback mechanism for beginners who prefer this more-detailed, more full-service, type of mentoring approach.


mentors only project

A mentors only (also mentor) project is a Distributed Proofreaders (DP) project that is proofed in the P2 round by proofing mentors to provide feedback to DP's newest volunteers.

Missing Page Finders

The Missing Page Finders are volunteers who love to spend a lot of time in libraries and don't mind searching for the exact edition of obscure tomes, photographing or scanning the missing pages, and sending the files on to those who need to complete a project. In many ways, these hard workers are the unsung heroes of DP.

See also Missing Page Finders and Missing pages.


Newcomers Only project

Newcomers Only projects are (usually EASY) projects that have been set aside for our newest volunteers. These books contain most of the elements proofers need to deal with most frequently, such as "spacey quotes" and other wonky punctuation, a few diacritical marks here and there, an occasional "unclothed" dash or hyphen, and scannos you can really sink your teeth into.

New volunteers in P1 are asked to start with a beginners only project, but a Newcomer project is a lovely next step. Some Newcomers Only projects have page limits per proofreader. Please be sure to read through the Project Comments to determine the page limit before beginning.

Once a Newcomers project has been completed in P1, it is changed into a "Rapid Review" project for P2. These projects release quickly into P2. Experienced P2 proofers complete that round allowing for rapid turnaround. In many cases, each proofer who worked on the book in P1 will get individual PMs after P2, collecting and presenting all the diffs on all the pages they worked on. If individual PMs are not generated, the new proofreaders who worked on the project in round P1 are informed that their "diff files" for the project are ready, and they are given instructions on how to check their diffs, interpret them, and ask questions about them.

Depending on the availability of Newcomers Only projects, they are released in the queue to keep 2 projects available at any one time. The goal is to keep the turnaround time down so feedback is timely for the newcomers. Note to Project Managers with P1 PM queues submitting Newcomers Only projects: please put the line "(nopmq)" without the quotes as the first line of the project instruction. The NO queue is set to release no more than two NO projects at a time, so PMs with P1 queues should make sure their projects don't release through their PM queue and only through the NO queue.


OCR

See optical character recognition.

OED: Oxford English Dictionary

The Oxford English Dictionary, or simply the OED, is widely considered to be the historical dictionary of the English language.

While the OED staff is currently working on producing the Third Edition of this massive work, DP is currently considering taking the First Edition on as an Uberproject. Consisting of ten volumes containing over 400,000 words and phrases, the original edition was published during the span of 1884-1928, under the name of A New English Dictionary on Historical Principles.

The first Supplement to the dictionary was published in 1933, and during that same year the original ten volumes were re-packaged into twelve volumes, and republished under the new name of the Oxford English Dictionary. More information about the Oxford English Dictionary itself can be found at the OED Website.

Information about DP's discussion of, and preliminary preparations for, an OED Uberproject can be found on the New English Dictionary on Historical Principles page.


optical character recognition

Optical character recognition (OCR) is the electronic translation of scanned images of printed text into editable text.

At Distributed Proofreaders, the abbreviation OCR is used in various contexts (and tenses/forms) to refer to:

  • OCR software - the software that performs optical character recognition,
  • the process of using optical character recognition software,
  • the person using optical character recognition software, and
  • OCR text - the editable text produced by optical character recognition software.

For more information about optical character recognition, see this DP article.

out-of-line footnote

An out-of-line footnote is a style of handling footnotes where the text of the footnote remains at the end of the page, with only a reference of the form "[X]" shown in the text body. For example,

This marker[1] references an out-of-line footnote.

(See it down there at the bottom of this article.)

This is the footnote style used for proofreading and formatting projects under the current Guidelines.

Compare to in-line footnotes, which are no longer used in DP projects.

[Footnote 1: Sample footnote text.]
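As a sketch of how the two halves of an out-of-line footnote relate, the Python below pairs up "[X]" markers with "[Footnote X: ...]" blocks on a page and reports any that have no partner. The regular expressions assume markers are either digits or a single capital letter, which is an assumption made for this example rather than a rule taken from the Guidelines; the function name check_footnotes is likewise hypothetical.

```python
import re

# Assumed marker forms: [1], [23], [A]. Other bracketed text is ignored.
ANCHOR = re.compile(r"\[(\d+|[A-Z])\]")
FOOTNOTE = re.compile(r"\[Footnote (\d+|[A-Z]): ")

def check_footnotes(page: str) -> tuple[list[str], list[str]]:
    """Return (markers without a footnote, footnotes without a marker)."""
    anchors = set(ANCHOR.findall(page))
    notes = set(FOOTNOTE.findall(page))
    return sorted(anchors - notes), sorted(notes - anchors)

page = ("This marker[1] references an out-of-line footnote.\n\n"
        "[Footnote 1: Sample footnote text.]")

print(check_footnotes(page))            # ([], [])
print(check_footnotes("See here[2]."))  # (['2'], [])
```

A marker without a footnote is not necessarily a mistake on a single page, since a footnote's text may continue on or belong to a neighbouring page.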


P1

P1 refers to Proofreading round 1, which is the first of two or three rounds of proofing that each project goes through at DP. The initial OCR text is checked and corrected (except in the relatively infrequent type-in projects).

See also P2, P3, and formatting.


P2

P2 refers to Proofreading round 2, which is the second round of proofing. The page-texts have already been proofread, and now need to have the text spellchecked and carefully compared to the image.

Because P2 proofreaders are more experienced than most P1 proofreaders, P2 is expected to fix a variety of mistakes and oversights common in the P1 round. During P2 proofing, proofers can mentor P1 proofers by providing encouraging, helpful feedback via a PM.

To see how you can qualify to work in P2, see the Access requirements article and the P2 round page.

See also P1, P3 and formatting, and a discussion of the differences between the rounds.

(From April through June 2006, when we formally changed from two to three rounds of proofing, this round was called P2alt.)


P2alt

From April-June 2006, P2alt was the second round of proofing, in which the version of the page text produced in P1 was checked and corrected.

This round was added between P1 and the original P2 round in April 2006, as an experimental "alternative" to the "original" P2 round; hence the odd name. The P2alt experiment having been judged a success by Distributed Proofreaders' Site Administrators, an "official" third proofing round was added to the DP system. The original P2 round was transformed into the P3 round, and the experimental P2alt round became the new P2 round.

One of the problems resulting from the experimental and temporary nature of the P2alt round was the necessity of jerry-rigging the P1 diffs when a project moved from P2alt into P2. Projects with the label P2alt-r identify projects where "jerry-rigged diffs" have been retrieved and restored into the project's normal Page Details. Naturally, the P2alt-r label will become obsolete as the projects which were in progress when the formal transition to the current five-round system was made finish their time in the DP system.

P3

P3 refers to Proofing Round 3, which is the optional third round of proofing, in which the version of the page text produced in P2 is checked and corrected. See also P1 and formatting.

If you want to work in P3, you must satisfy the numerical requirements, and then apply for P3 qualification.

There is a team, P3 Junkies, dedicated to moving projects through P3 towards completion more efficiently by concentrating their efforts on a few projects. The P3 Junkies project list shows the team's current and previous projects.

(Prior to June 2006, when we formally changed from two to three rounds for proofing, this round was called "P2". See also a summary of recent changes to the DP process.)


What P3 can do.

P3 proofers examine the pages of projects in P3 for small errors. P3 is the last formal round for inspection of the characters on each page. Interestingly, this group of the most experienced proofers asks the most proofing questions about the project in the Forums.

P3 proofers proof the pages submitted for P3 qualification. The diffs they produce are inspected by a very small team to determine how many changes the P3 applicant missed. Only serious changes are counted. Not all diffs are errors and P3 should avoid unneeded diffs.

P3 can proof pages in P2 that were edited in P1 by relatively new proofers. These are labeled "Rapid Review". Feedback is provided to the P1 new proofer automatically based on the diffs generated. Again P3 should avoid unneeded diffs.

P3 Quals

Some projects in P2 have (P3 Qual) after their titles. After finishing P2, these projects will move quickly into P3, skipping the release queue, and will be proofed quickly once there. They are a way for proofers who have requested (or will request) P3 access to get diffs on their pages without having to wait a long time due to the length of the P3 queues. See more information at P3 qualification.


PCs: Project Comments

The Project Comments (PCs or PC) is a section in a Project Page, containing information specific to that project. These comments should be read before you start proofreading or formatting in that project. If the Project Manager (PM) wants any exceptions to be made to the regular Proofing Guidelines or Formatting Guidelines for the project, they will be noted here; instructions in the Project Comments override the rules contained in the Guidelines.

This is also where the Project Manager (PM) may give you interesting tidbits of information about the project or its author.


PD

See public domain.

PF: Project Facilitator

Project Facilitator (PF) is an administrative position at Distributed Proofreaders, similar to that of Site Administrator. A Project Facilitator's primary function is to help Project Managers, but only when the PMs need it. A PF can do anything a PM can do, with the difference that the PF can do it for all projects.

Visit DP Administrators to see a list of current Project Facilitators, and see this thread for more information.


PG: Project Gutenberg

You know: that place where all of the finished DP projects go.

Project Gutenberg (PG) is DP's "parent site," which hosts a growing online archive of public domain electronic texts available freely to all. See gutenberg.org.


PM

  1. Project Manager
  2. Private Message


PNG

See Portable Network Graphics.

Portable Network Graphics

Portable Network Graphics (PNG or png; file extension .png) is a lossless compressed image file format.


PP: Post-Processing

Post-Processing (PP) is the process of formatting and reassembling the pages of a project after it has completed the rounds of proofing and formatting. (Also called Post-Proofing.)

Also, a person who does such work (also Post-Proofer, or PPer).

If you are interested in becoming a PPer, visit Access requirements.

See also the Post-Processing FAQ, and Hands-on PPer. For more PPing resources in the DP wiki, see Post-Processing Advice. For LaTeX projects, see LaTeX postprocessing guidelines.


Proof-only Mentoring project

Proof-only Mentoring projects, which are found only in P3 (at least for right now), are the "reincarnation" of some of the Newcomers Only projects from P1. Except for the fact that these projects move straight from P1 to the active P3 list, the pages in these projects are proofed just as they would be in any other project.

Once all the pages in a given Proof-only Mentoring project have been proofed, the proofers who worked on the project in round P1 are notified via PM that their "diff files" for the project are ready, and they are given instructions on how to check their diffs, interpret them, and ask questions about them. The projects themselves move into F1 and complete the rest of their time at DP following the same process as all other projects.

Anyone with P3 status can do this type of mentoring. See the main Mentoring page for more information on the mentoring process.


PPV: Post-Processing Verification

Post-Processing Verification (PPV) is the process of final checking a post-processed text, done by a very experienced PPer. This is the last stage a project goes through at DP before being sent to the PG Whitewashers.

Also, a person who does such work (also PPVer).



Pre-processing

Pre-processing is the process of preparing a book (which becomes known as a "project") for proofreading here at DP. Steps include scanning the book (or "book-like thing"), running the OCR software (which generally includes some spellchecking function), and uploading the files to the DP servers using Remote File Manager. These tasks are performed by a person known as the Content Provider (CP), who may also serve as the Project Manager (PM).


Private Message (PM)

A Private Message (also PM, Personal Message) is an e-mail sent from one Distributed Proofreaders volunteer to another, using the same software as is used for the DP forum.

Check your DP Inbox here.


project

A project is a book (or book-like thing) that Distributed Proofreaders is converting to an e-text.

project discussion

See project thread.

project forum

See project thread.

Project Hospital

The Project Hospital is an ad-hoc clearinghouse for projects with problems that prevent them from being completed. These can include (but are not limited to) damaged, duplicate or missing pages, missing or poor illustrations, lacking copyright clearance, duplicate projects, and poorly prepared projects (huge page or illustration scans as proofing images).

The Project Hospital is currently outside the main DP workflow, though it is intended that the process become more formalized in the future. The current "patients" are listed on the Project Hospital page.

The Project Hospital page enables better tracking of projects that need to be fixed, to prevent them from being left in broken states indefinitely if the person who found the problem forgets about it.


Project Manager (PM)

The Project Manager (PM) is the person in charge of a project and its progress through the rounds. The ultimate goal of the PM is to help the project be as consistently proofed and formatted as possible for the PPer. One way the PM (usually) does this is by writing Project Comments.

Different PMs have different styles. Some provide a handful of books that they pre-process themselves, then during proofreading monitor the project threads closely, and finally post-process the project themselves; others provide large quantities of books and rely on others to PP them. Other PMs fall somewhere between, perhaps closely following some books, while only glancing in on others, as questions are asked in the project thread.

If you are interested in becoming a PM, visit Access Requirements. If you are a new PM, see the Project Managing FAQ.


Project Page

Each project going through DP has a sort of "home page," called its Project Page, which serves as a nexus to the various resources on the DP site related to the project. The Project Page provides basic information about each project, including its PM, PPer (if assigned), difficulty level, genre, Special Days (if any), its current stage (round, etc.), the date it was last worked on, its Project Comments, a link to its project thread, and other information. The page can be displayed in four different levels of detail.

Project Pages are customized for each individual DPer, providing easy access to the last five pages that each proofreader has started but not completed, and the last five pages each proofreader has finished processing in that project's current round. Access can also be gained to other pages in the project, including the "diffs" for the project, via the Page Details.


project thread

A project thread (also project discussion, project forum) is a thread in the DP forum dedicated to a specific Distributed Proofreaders project.

proofing/proofer

Proofer is a commonly used, relatively informal, term for proofreader.

See also proofing, and compare to foofer.


Proofing/Proofreading Interface

The Proofreading Interface, or Proofing Interface, is the part of the DP site where users can proofread a single page in a project. It shows an image of the page, and a textbox containing the text for that page (as produced by other DPers up to that point). The user compares the two and attempts to ensure that the text correctly reflects the content or formatting shown in the image.

The Proofing Interface comes in two versions, Standard and Enhanced, each of which can be toggled between horizontal and vertical layouts. For more information, see Working with the Proofing Interface.


proofreading

  1. In a specific sense, proofreading is the process of carefully correcting the OCR text's characters to match the text shown on the scanned pages of a project. This is often called "proofing", and is normally performed in rounds P1, P2, and P3. Compare to formatting.
  2. When used in a more generic sense, proofreading can refer to the entire process of getting a project ready for posting to the PG site. This is the sense in which the term is used in the name Distributed Proofreaders.


Proofreading Guidelines

Proofreading Guidelines refers to a document which contains all the "default" instructions and standards for proofreading (such as how to handle hyphenated words and letters with diacriticals) in rounds P1, P2, and P3. These standards apply to all projects, unless specifically overridden by instructions from the Project Manager in the Project Comments or the project thread.

You can access the Proofreading Guidelines from FAQ Central and from any Proofing Interface window.

There is also a Proofreading Summary, which is a 1-page document showing the most common proofing situations, and the proper way to proof them.

See also Formatting Guidelines and Proofreading Summary.


PTB: Powers That Be

See TPTB: The Powers That Be.

public domain

The term public domain (PD) refers to information, creative works, etc. that are part of the common body of knowledge or cultural heritage, which are not protected by any copyright or patent.

queue

See release queue.

R*/R1/R2

R* is a shorthand notation used to refer collectively to the old Workflow rounds of R1 and R2 from before "The Change." Each of the R* rounds involved both proofing and formatting.

The notation uses the asterisk character (*) as a wildcard, as it is used in many criteria and search strings in many computer contexts and applications.
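
As a rough illustration of the wildcard idea, the sketch below uses Python's standard fnmatch module to match round names against a pattern. The list of round names is taken from this guide; the function name matching_rounds is made up for the example and is not part of any DP software.

```python
from fnmatch import fnmatchcase

# Round names mentioned in this guide; the list exists only for illustration.
rounds = ["R1", "R2", "P1", "P2", "P3", "F1", "F2"]

def matching_rounds(pattern):
    """Return the round names that match a shell-style wildcard pattern."""
    return [name for name in rounds if fnmatchcase(name, pattern)]

print(matching_rounds("R*"))  # the old combined rounds
print(matching_rounds("P*"))  # the proofing rounds
```

Here the asterisk stands for "any characters," so "R*" picks out R1 and R2 in the same way the shorthand does in forum posts.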

Compare to P* and F*.

Occasionally when you see a reference to Distributed Proofreaders Europe (DPE), you may also see a reference to R*, since DPE still uses combined proofing and formatting rounds.

R* should not be confused with {R}, which is the shorthand notation used to refer to "retread" projects.


rank

  1. In each round at DP, a volunteer has an honorific rank which appears in the "Personal Statistics" section on the right-hand side of the window. These ranks are different for each round, and are determined by how many pages the volunteer has proofed or formatted in that round. Traditionally the names of these ranks, as well as the page counts at which they change, have not been shared in the DP forum so that they will be a surprise for others as they reach them.
  2. In each round, each volunteer has a page-count rank which appears in the Member Details section (under 'Page Statistics' and 'Neighbors'). This rank is (basically) the volunteer's numeric position in a list of all volunteers, sorted by the number of pages a volunteer has saved in that round. (E.g., the person with the highest page-count for that round has the rank of 1.)
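
The page-count rank in the second sense can be sketched as a simple sort. This is only an illustration with made-up user names and page counts; how the DP site actually computes ranks (for example, how it treats ties) is not described here, so the tie handling below is an assumption.

```python
def page_count_ranks(pages_by_user):
    """Map each user to a 1-based rank, highest page count first.

    Assumption: users with equal counts keep their input order; the real
    site may break ties differently.
    """
    ordered = sorted(pages_by_user.items(), key=lambda item: item[1], reverse=True)
    return {user: position for position, (user, _pages) in enumerate(ordered, start=1)}

# Hypothetical page counts saved in one round.
sample = {"alice": 1200, "bob": 45, "carol": 310}
ranks = page_count_ranks(sample)
print(ranks)
```

The volunteer with the highest page count for the round ends up with rank 1, matching the description above.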


release queue

A release queue (also queue) is a holding area for Distributed Proofreaders projects to be released into the rounds for proofing or formatting.

Retread

A "retread" is a project which has gone through one or more DP rounds and then was re-proofed through the same round or rounds. This may happen for several reasons, the most common of which is to bring a project up to the current proofreading guidelines.

In 2005, many projects were "retreaded" as part of the major updates to the DP site when moving from a two-round process to a four-round process, i.e., "The Change".

Retread projects are marked with an "{R}" [previously, an "(R)"] in the project title for easy recognition and automatic detection by the site software.
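
To show why a fixed marker makes automatic detection easy, here is a minimal sketch that checks a project title for either marker. The function name and sample titles are hypothetical, and this is not the site software's actual detection code.

```python
import re

# Matches the current "{R}" marker or the older "(R)" marker.
RETREAD_MARKER = re.compile(r"\{R\}|\(R\)")

def is_retread(title):
    """Return True if a project title carries a retread marker."""
    return RETREAD_MARKER.search(title) is not None

print(is_retread("A Sample Title {R}"))
print(is_retread("A Sample Title"))
```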


round

The word round refers to one of the several stages a project goes through at DP, in which it is prepared for Project Gutenberg's repository of e-texts. Since June 2005, each project goes through a minimum of two proofing rounds and one formatting round. See also P1, P2, P3, F1, and F2.

All projects also go through Post-Processing (PP) and Post-Processing Verification (PPV), and may go through the Smooth-Reading (SR) process, but these stages are seldom referred to as "rounds." See also the General Workflow Diagram.


SA: Site Administrator

Site Administrators (SA), aka "The Powers That Be" (TPTB), have the ultimate say-so on what happens at DP and on its Website, including the DP Forums and this DP Wiki. Note that because we are a community project, SAs rarely speak from "on high" in their official capacity.

For a little bit of background on the PTB moniker, see this post, especially starting about half-way through it.

Visit DP Administrators to see a list of current and former Site Administrators.


scans

The terms scan, scans, scanner, and scanning are used in many places in multiple ways at DP.

"Scan" and "scans" (n.) usually refer to the image files created by Content Providers (occasionally referred to as "scanners" [n.], in the sense of people who scan [v.]), who use hardware known as "scanners" (n.) to "scan" (v.) the individual pages of a book or other textual material. This process is referred to as "scanning" (v. or gerund). In other words, "scans" are the results of running a "scanner" or "scanning." (Sometimes Content Providers harvest scans from other online sources instead of scanning them themselves.)

OCR software is used to create an OCR text from the scanned images (scans). As a project begins its journey through DP's rounds, the proofers working in P1 compare each page's OCR text to its original scan. Thus, "the scans" are the foundation of the e-texts produced by DP.


scanno

A scanno is an incorrect character in an OCR text.

"silent correction"

The phrase "silent correction" is often used to refer to an intentional change a proofer made to the proofed text of a page to "correct" something shown in the scanned image without leaving a [**proofreader's note] informing the post-processor that the change has been made. Another way to put it is making a change that relies on reason rather than vision without [**noting] it.

A "silent correction" other than the very few changes specifically mandated by the Guidelines (such as removing page headers/footers and end-of-line hyphens) is pretty much the worst "sin" a proofer can commit at DP.

The reason behind this is that what we really do here at DP is transcribe hard-copy documents into another form (digital text), not edit them. PPers differ in how they handle apparent errors: some prefer to have the project text match the historical document rather than make any "obvious corrections"; others will make "minor" punctuation corrections, but not corrections that could just be old spelling inconsistencies; and some will tend to make spelling consistent throughout the entire project. No matter what course they choose, they are likely to leave a Transcriber's Note about the various "corrections" to the original that were and were not made, and it's hard to do that when they don't know what "corrections" have or have not been made (as is the case with "silent" ones).


Slashdot

See this thread to read about the "slashdotting of DP."


"spacey quotes"

The phrase "spacey quotes" is sometimes used to refer to the particular type of error found in OCR texts in which quotation marks (single or double) are separated from text by spaces.

For example:

Justice " in the hands of Madame de Meroul " Le


This is an especially important error for proofers to look out for and correct. A software script can find quote marks surrounded by spaces on both sides, but no automated tool can determine which neighboring word a quotation mark really belongs to as well as a human being who understands the text's context can.

In other words, it usually takes a human being to figure out if the text should be proofed as this

Justice "in the hands of Madame de Meroul" Le

or as this

Justice" in the hands of Madame de Meroul "Le
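
The kind of script mentioned above, one that only finds the suspect quote marks, might look like the following sketch. It uses a regular expression for a single or double straight quote with a space on each side. The function name is made up for this example, and the script makes no attempt to decide which reading is correct; that part is left to the proofer.

```python
import re

# A straight single or double quote with a space immediately before and after it.
SPACEY_QUOTE = re.compile(r"""(?<= )["'](?= )""")

def find_spacey_quotes(line):
    """Return the character positions of quote marks surrounded by spaces."""
    return [match.start() for match in SPACEY_QUOTE.finditer(line)]

line = 'Justice " in the hands of Madame de Meroul " Le'
positions = find_spacey_quotes(line)
print(positions)
```

Both quote marks in the sample line are reported, while either of the two corrected versions shown above produces no hits.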


Special Days

Special Days are days (or sets of days) when specific projects which have topical significance are released from the release queues for proofing and/or formatting.


Squirrels

The tireless technical support crew who help keep DP running (by running and running and running...) are usually referred to as the squirrels. Some squirrels have lycra suits, which enable them to perform heroic feats with the DP database.
See some history here, and in the link you will find there.

In February 2007 the term "Squirrel" became a formal title, including all System Administrators, db-req and some dphelp folks, and Forum Administrators. See DP Administrators for the current members in those positions.


SR: Smooth-Reading

The goal of Smooth Reading (SR) is to read a post-processed text attentively, as for pleasure, with just a little more attention than usual to punctuation, etc. This is not full-scale proofreading, and comparison with the project's scans is not needed. Just read it as your normal, sensitized-to-proofing-errors self, and report any problem that disrupts the sense or the flow of the e-text.

Smooth Reading: also referred to as Smooth-Reading, SRing, smooth-reading, smooth reading, Smoothreading, smoothying, and many other variations.

Smooth Reader: person who Smooth Reads e-texts; also affectionately known as an SRer, smooth-reader, smoothier, smoothyer, smoothie, smoothy, etc.

For more information, see the Smooth Reading FAQ and visit the Smooth Reading Pool.


stealth scanno

A stealth scanno (also stealtho) is a specific type of scanno in which a misread character produces a valid word in the OCR text, but not the word that appears in the page image.

Summary Guidelines

Summary Guidelines is another name for the Handy Guides.


tags

In general, tags are characters inserted in a document of some type in order to apply some formatting to a set of characters or to indicate that a set of characters needs some sort of special handling. Collectively, tags are often referred to as markup.

Different styles of tags are used in different environments. For example, HTML tags are surrounded by angle brackets (also known as "less than" and "greater than" signs), while BBCode tags are surrounded with square brackets, and a variety of tag styles are used in DP Wiki and the DP Proofreading Interface.


Task

One way that changes get made to the Proofreading Interface or DP Website design is by individual DPers submitting "Task Requests" in the Task Center. A Task Request, or just Task for short, is simply an entry on the Task Center page where you can make a request for a software change to fix a bug, add a feature, etc. Developers review the Task Requests and, with the approval of the Site Administrators, decide if, how, and when a Task Request might be implemented.

Before adding a new task, please search the tasks that are already there to see if a task has already been created for what you're thinking of. If there's already a task, you may add your "Me too" vote or even add a comment. The number of "Me too" votes a feature request task gets, in combination with the complexity of implementing it and how the change relates to the DP objectives, helps DP Administration determine whether and when a task request may be implemented.

When you create a new task, please create a meaningful summary and a detailed description. You should also select whether the task is a Bug Report, Feature Request, Support Request, or Site Administrator Request. You may also make an assessment of what you think the severity and priority are, but Task Center managers may later change that rating based on their more complete knowledge of the code. If your task is a Feature Request, please select "Enhancement" as the Severity.


team

A team is a self-identified group of DPers. The DP system software allows each registered DPer to belong to up to three teams. In addition, many DPers have informal affiliations and interests with far more than three teams. Teams often have active threads in the DP Team Talk Forum, and usually include a strong social element.

There are teams based on common language, geography, and outside interests, as well as teams dedicated to specific needs related to the process of moving projects through DP to PG.

The Teams List shows the current teams broken down by their unifying elements, and Special Teams shows a few teams which are organized to serve a particular type of purpose in processing projects.


Tesseract

Tesseract is an OCR software program, usable with Windows, OSX and Linux operating systems.

test server

As well as the regular server that hosts the DP site (including this wiki) we have a test server at http://www.pgdp.org. This is used to test website code before it is installed for general use on the main site (sometimes called the production site). It is also used to experiment with new ways of doing things using the existing code.


"The Change"

The phrase "The Change" refers to the major change made to the DP Workflow in June 2005 when it went from two rounds (R1 & R2, collectively referred to as R*), which each involved both proofing and formatting, to four rounds, two for proofing and two for formatting. The Smooth Reading Pool was added at this time.

For more information on "The Change" and the reasons why it was made, see the New Rounds, New Workflow, New Site and Discussion of Upcoming Site Changes threads. In addition, the Transition to Four Rounds document presents a description of "The Change" appropriate for people who had worked at DP using the old two-round processing system.

In June 2006, there was a "sequel" to "The Change," when an optional third proofing round was officially added to the Workflow process. A bit more detail on that update can be found in the June, 2006 Upgrade - 3rd Proofing Round etc thread.

See the General Workflow Diagram for a graphical representation of the current DP process for creating e-texts.


ToC or TOC

ToC and TOC are the standard abbreviations used to refer to a Table of Contents.

(For post-processing advice related to ToCs, see Tables of contents.)


TPTB: The Powers That Be

Site Administrators (SA), aka "The Powers That Be" (TPTB), have the ultimate say-so on what happens at DP and on its Website, including the DP Forums and this DP Wiki. Note that because we are a community project, SAs rarely speak from "on high" in their official capacity.

For a little bit of background on the PTB moniker, see this post, especially starting about half-way through it.

Visit DP Administrators to see a list of current and former Site Administrators.


transliteration

Transliteration is the process of converting a text from one writing system into another in a systematic way, such as converting Greek text Βιβλος to Roman text Biblos.
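
A systematic conversion of this kind can be sketched as a letter-by-letter lookup. The table below is a deliberately small, simplified subset chosen for illustration; it is not DP's official Greek transliteration scheme, and letters whose handling varies between schemes are left out and passed through unchanged.

```python
# Simplified, partial Greek-to-Roman table (an assumption for this example).
GREEK_TO_ROMAN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m", "ν": "n",
    "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s", "ς": "s",
    "τ": "t", "φ": "ph", "χ": "ch", "ψ": "ps",
}

def transliterate(text):
    """Replace each known Greek letter with its Roman equivalent."""
    result = []
    for ch in text:
        roman = GREEK_TO_ROMAN.get(ch.lower())
        if roman is None:
            result.append(ch)  # unknown character: leave as-is
        elif ch.isupper():
            result.append(roman.capitalize())
        else:
            result.append(roman)
    return "".join(result)

print(transliterate("Βιβλος"))
```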

type-in project

A type-in project is a Distributed Proofreaders (DP) project that does not have an OCR text when it is made available for proofing in P1.

uberproject

An uberproject is a large-scale, multi-volume Distributed Proofreaders project.

Wiki

A wiki is a type of website that allows users to add, remove, or otherwise edit all content quickly and easily. This ease of interaction and operation makes a wiki an effective tool for collaborative writing.

The term wiki can also refer to the collaborative software itself (wiki engine) that facilitates the operation of such a website, or to certain specific wiki sites, including the computer science site (and original wiki), WikiWikiWeb, and the online encyclopedia Wikipedia. When used to refer to a specific site, such as DP Wiki, wiki is often capitalized.

The word wiki is a shorter form of wiki wiki (weekie, weekie) which is from the native language of Hawaii (Hawaiian), where it is commonly used as an adjective to denote something "quick" or "fast" (Hawaiian dictionary). In English, it is an adverb meaning "quickly" or "fast".

For a fuller description, see Wikipedia

See also Wiki Jargon.

WordCheck

WordCheck is a tool for checking spelling and other details in the Proofreading Interface.

The WordCheck tool refined and expanded upon DP's previous spellchecking functionality in several inter-related areas:

  • page text can be checked in more than one language,
  • commonly occurring stealth scannos can be flagged for extra attention from proofers using bad word lists,
  • a good word list of project-specific proper nouns and terminology can be specified so that correct spellings of those words will no longer be flagged for proofers, and
  • punctuation characters are presented with a different background color.
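
The interplay between a dictionary, a good word list, and a bad word list can be sketched as follows. This is a simplified illustration, not WordCheck's actual code: the function name, the tiny dictionary, and the case-insensitive matching are all assumptions made for the example.

```python
import re

def flag_words(text, dictionary, good_words=(), bad_words=()):
    """Return the words in text that would be flagged for a proofer's attention.

    A word is flagged if it is on the bad word list, or if it is missing from
    both the dictionary and the good word list.
    """
    known = {w.lower() for w in dictionary}
    good = {w.lower() for w in good_words}
    bad = {w.lower() for w in bad_words}
    flagged = []
    for word in re.findall(r"[^\W\d_]+", text):  # runs of letters only
        key = word.lower()
        if key in bad or (key not in known and key not in good):
            flagged.append(word)
    return flagged

# Hypothetical data: "arid" is a valid word but a likely misreading of "and".
dictionary = {"the", "arid", "plain", "near"}
text = "The arid plain near Meroul"
print(flag_words(text, dictionary))
print(flag_words(text, dictionary, good_words=["Meroul"], bad_words=["arid"]))
```

With no word lists, only the unfamiliar proper noun is flagged; with both lists supplied, the proper noun is accepted and the possible stealth scanno is flagged instead.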

A great deal of thanks is due to cpeel and jmdyck for all their work in making these frequently requested features a reality.

In the Standard proofreading interface, you can find the WordCheck button grouped with the other buttons below the proofreading window. In the Enhanced proofreading interface, the WordCheck button displays a picture of a page with a blue "S" and checkmark.

For more details, see the WordCheck FAQ, What Proofreaders will see, Word lists, WordCheck/Project Management, and What PMs will see.


ZIP archive

See ZIP file.

ZIP file

A ZIP file (also ZIP archive, file extension .zip) contains one or more files that have either been stored intact or been compressed to reduce file size, using the ZIP file format. Wikipedia's article has more detailed information about the ZIP file format and ZIP files.
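
The difference between storing a file intact and compressing it can be seen with Python's standard zipfile module. The sketch below builds a small archive in memory with one member of each kind; the file names and contents are invented for the example.

```python
import io
import zipfile

buffer = io.BytesIO()
text = b"proofed page text\n" * 200  # repetitive data compresses well

with zipfile.ZipFile(buffer, "w") as archive:
    # Same data twice: once stored intact, once compressed with DEFLATE.
    archive.writestr("stored.txt", text, compress_type=zipfile.ZIP_STORED)
    archive.writestr("deflated.txt", text, compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(buffer) as archive:
    sizes = {
        info.filename: (info.file_size, info.compress_size)
        for info in archive.infolist()
    }

for name, (original, packed) in sizes.items():
    print(name, original, packed)
```

The stored member takes up as many bytes inside the archive as the original data, while the deflated member takes up fewer.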