Checking your diffs

DP Official Documentation - Proofreading

Many DP volunteers check their diffs to see what changes have been made in later rounds. This can be a good way to learn what you're doing correctly and what you may need to change in your proofreading or formatting habits.

When are diffs available?

Your pages must have been proofed (or formatted) in the next round in order for you to see what changes were made. For some short, popular projects it may only take a few days, but more often it takes weeks or even months. In addition, the release queues for some rounds are quite long, so the project may wait for quite a while before it is released into the next round. Because of this, it's generally impossible to know how long it will be before you can see the diffs.

To receive an alert when the project reaches the next round or rounds, subscribe to Event Subscriptions. Go to the project page, check the appropriate box or boxes, and then select Update Event Subscriptions.

Note: Because it can take a long time before your diffs are available, post in the project thread or send a Private Message to dp-feedback if you want feedback sooner.

How to access your diffs

You can access your diffs one project at a time by going to the project page, or get to the diffs for multiple projects using the special "reviewing work" page. Each of these methods is explained below.

Via the project page

To look at your diffs for a particular project, first you need to go to the Project Page. There are multiple ways of doing this:

  • At the top of a DP page (such as the Activity Hub or the P1 page), click on My Projects. Find the project that you want, and click on the title.
  • If you know which round the project is currently in, go to the page for that round (via the Activity Hub). Find the project, and click on the title.
  • Click on the Project Search link at the top of a DP page, and then search for the title, author, or other information. In the search results, click on the title.

Once on the Project Page, click on one of these links in order to view the Page Details for the project:

  • Images, Pages Proofread, & Differences: shows details for all the pages in the project
  • Just my pages: shows only your DONE and IN PROGRESS pages in the project. This is usually the most efficient option to use for viewing your diffs.
  • Detail Level 4 (link at the top and bottom of the Project Page): shows the project's Page Details below the Project Comments on the Project Page

Via the "Reviewing Work" page

This page allows you to get to your diffs from a particular round across multiple projects more easily. For the "Work Round", select the round that you worked in; "Review Round" should be one round later. "Max days" allows you to choose how long ago to include, and if you enter a number greater than zero for "Max diffs to show" then you'll get direct links to some of your diffs on the projects.

Once you've selected the options, press the Search button to pull up a list of projects that you worked on in the "Work Round" during the time period you selected. Each title is a link to your "Just my pages" page for that project.

Understanding the page details

On the Page Details page, there is a table with information for each page. Columns are added on the right as the project moves through the rounds. The columns (shown here only up to P2, but the same for every round up to F2) are:


  1. I (an ID number in ascending numerical order)
  2. Upload (the files, text and images, uploaded by the project manager)
    • Image: a .png file number, linking to the image
    • Text: the text as prepared by OCR (the number is the file size in bytes)
  3. Page State: this can end with page_avail (available), page_saved (saved as done), page_out (checked out, not saved), page_temp (saved as in progress), or page_bad (marked bad)
  4. P1
    • Diff: diff (link to the changes made in P1), or no diff if no changes were made
    • Date: date and time when the text for that page was last updated
    • User: name of the volunteer (only visible if you've worked on that page); in parentheses after the name is the number of pages proofread by that volunteer in the particular round
    • Text: the text as saved by P1 (the number is the file size in bytes)
    • Edit: a link to edit the page if the project is still in the particular round (only available if you worked on that page in that particular round)
  5. the same information as in (4) repeated for each round the project has advanced to
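
The Page State endings listed in item 3 can be summarised as a small lookup table. This is only an illustrative Python sketch, not part of the DP site: the function name is made up, and the "P1." prefix in the example is an assumed format, since the list above says only what the state "can end with".

```python
# Meanings are taken from the Page State list above.
PAGE_STATE_SUFFIXES = {
    "page_avail": "available",
    "page_saved": "saved as done",
    "page_out": "checked out, not saved",
    "page_temp": "saved as in progress",
    "page_bad": "marked bad",
}


def describe_page_state(state):
    """Return the plain-language meaning of a page state string."""
    for suffix, meaning in PAGE_STATE_SUFFIXES.items():
        if state.endswith(suffix):
            return meaning
    return "unknown state"


# "P1.page_saved" is a hypothetical example of a full state name.
print(describe_page_state("P1.page_saved"))
```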

All pages that you have worked on will be highlighted. If a DONE page is available for re-editing it will have a green background; IN PROGRESS pages have an orange background. A red background indicates that you worked on the page in a previous round, and can no longer edit it.

Looking at the diffs

For any page that has a "diff" link, clicking on it will show you the text before and after that round, with changes highlighted. For instance, this example is from an OCR/P1 diff:


The text on the left is from the original OCR text and the text on the right is from the text saved by the P1 proofreader. The proofreaders’ names are shown in parentheses after the round name. On the left, changes are indicated by a - followed by a yellow vertical line; characters that have been changed are highlighted in yellow. On the right, changes are indicated by a + followed by a vertical blue line; the characters that have been changed are highlighted in blue.

The "Previous," "Next," and "Jump to" options near the top and bottom of the page allow you to look at another diff page from the same project without first having to go back to the page details. If you worked on the page in the previous round (the one on the left side of the diff display) and you also did other pages in the project, you can move from one diff of yours to the next using the "Proofreader previous" and "Proofreader next" buttons.

"Line 1" indicates where on the page the first change was made. It is the same for both the OCR and the P1 text. "Line 14" on the left indicates the next change was further down the page; and on the right the corrected line is now closer to "Line 13". Similarly, the next change or diff is close to "Line 21" in the OCR, and "Line 20" in the P1 text.

The diffs shown on this page are as follows, with “a” indicating the left side and “b” indicating the right side. Note that these numbers have been added for demonstration purposes and do not appear on the actual diffs page:


1a A blank line has been removed from the top of the page because the paragraph continues from the previous page.

1b The text has been moved up to fill what previously was the blank line at the top of the page.

2a The left is highlighted because the line is no longer in this position.

2b The right is highlighted to show that the text that was there has been moved upwards—note that it does not indicate a blank line has been inserted in this instance.

3a The left side shows “hnghtly” and a comma highlighted in yellow.

3b The right side shows that “hnghtly” has been changed to “brightly” (highlighted in blue), and the comma has been removed so there is no character to be highlighted.

4a The line has been flagged with - and a yellow horizontal line, but no characters have been highlighted because the incorrect character is a space.

4b The line has been flagged with + and a blue horizontal line, but again no character is highlighted because the extra space has been removed. Diffs of this nature can often be tricky to spot.

5a The left side shows “auuts”.

5b The right side has been corrected to “aunts” as it appears in the original image.

The lines with a gray background and no highlights are the same in both the OCR and P1 versions of the text. They are included to show the changes in context.

Problems with the diffs

Lines are highlighted, but no change is apparent

Changes made to fix "spacey quotes" or other spacing problems may be tough to spot, so examine the spacing around punctuation marks and between words carefully for any differences.
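
If a spacing change is hard to see, one option is to copy the "before" and "after" versions of the line into a short script that makes spaces visible. The following is a minimal Python sketch using the standard `difflib` module. It is not a DP tool, and the function name and sample line are made up for illustration.

```python
import difflib


def show_spacing_changes(before, after):
    """List every edit between two lines, with spaces shown as '·'."""

    def visible(text):
        return text.replace(" ", "·")

    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            old = visible(before[i1:i2])
            new = visible(after[j1:j2])
            changes.append(f"{tag}: '{old}' -> '{new}'")
    return changes


# Hypothetical "spacey quote" fix: the space after the opening quote is removed.
for change in show_spacing_changes('He said, " Hello."', 'He said, "Hello."'):
    print(change)
```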

Lots of text is highlighted

Occasionally, if many changes are made to a page, the diff program may get confused and incorrectly show a large amount of highlighted text. This also happens if some text is moved from one place to another: it will be yellow on the left and blue on the right, even if nothing was actually changed within the block of text. In such cases you may need to compare the old and new texts yourself to see if any changes were made; for example, open the Text file from P1 to compare with the Text file from P2.
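
When a block of text has only been moved, comparing the two texts while ignoring line order can show whether anything inside the block really changed. This is a rough Python sketch of that idea, assuming you have pasted the two Text files into strings. The function name and sample texts are hypothetical.

```python
from collections import Counter


def changed_lines_ignoring_order(before, after):
    """Return (removed, added) lines, ignoring where each line sits on the page."""
    old = Counter(before.splitlines())
    new = Counter(after.splitlines())
    removed = list((old - new).elements())  # lines only in the earlier text
    added = list((new - old).elements())    # lines only in the later text
    return removed, added


# Hypothetical page where a line was moved to the top and one word was corrected.
p1_text = "line a\nline b\nmoved line\n"
p2_text = "moved line\nline a\nline B\n"
print(changed_lines_ignoring_order(p1_text, p2_text))
```

If both lists come back empty, the text was rearranged but not otherwise edited.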

Odd blank lines or other problems

Sometimes a diff may show a blank line being inserted or removed even if it wasn't. In that case, or in general if something looks odd in the diff, it's best to open the "before" (in this example the text under “Upload”) and "after" (the text under P1) page texts and compare them yourself to be sure of what was changed.
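
One way to compare the "before" and "after" texts yourself is a plain line-by-line diff. The sketch below uses Python's standard `difflib.unified_diff`. It is not how the DP site produces its diffs, and the function name, labels, and sample text are assumptions chosen for illustration.

```python
import difflib


def plain_diff(before, after, before_label="Upload", after_label="P1"):
    """Return a unified diff of two page texts as a list of lines."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=before_label,
            tofile=after_label,
            lineterm="",
            n=1,  # one line of unchanged context around each change
        )
    )


# Hypothetical page: a leading blank line is removed and one line is corrected.
ocr_text = "\nshining hnghtly,\nin the sun\n"
p1_text = "shining brightly\nin the sun\n"
for line in plain_diff(ocr_text, p1_text):
    print(line)
```

In this output a removed blank line appears as a lone "-" and an inserted one as a lone "+", so you can tell whether a blank line really changed.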

Getting help understanding diffs

On the Page Details page, each volunteer's name is a link to send them a Private Message. If you see a diff that makes you wonder, "Why did they do that?", feel free to send them a PM inquiring why they made the change.

If you prefer not to contact the person who made the change, you can send a private message to a Project Facilitator instead, or ask in the forums. There are two different "Explain that diff!" threads, one for proofreading diffs and one for formatting.

Keep in mind that diffs are only differences; they don't necessarily mean that you did something wrong. In some cases there may be multiple correct ways to proofread or format something.


To comment or request edits to this page, please contact jjz or John_NZ.
