Confidence in Page Perpetual P1


The Perpetual P1 experiment is an effort to quantify the P1 noise floor: the minimum average number of changes that a page experiences each time it travels through P1, as measured by the wa/w metric.

PM Recruitment Letter

[This is the letter I intend to send to PMs of projects randomly selected for the P1->P1 experiment.]

Your project <title>, <projectID> has been randomly selected for a study of P1->P1. Please let me know whether or not you are able to participate in this experiment.

The experiment takes a project through 10 rounds of P1. We hope to develop a curve which will help estimate the effectiveness of P1 for most projects. Additional information is available at [url=http://www.pgdp.net/wiki/Confidence_in_Page_Perpetual_P1]Perpetual P1[/url].

If you feel that participation is not practical, please let me know so that I can choose an alternate project.

Guidelines for PP1 PMs

These are guidelines for PMs running PP1 (Perpetual P1) projects.

If you have questions not answered here, try the Perpetual P1 thread, or the original Confidence in Page Algorithm thread.

At the end of each round collect the proofed pages, upload them to dpscans, and then arrange for another round of P1. After 10 rounds of P1, send your project on to P2. There is a high probability that the project will qualify for P3-skip after P2.

To collect the proofed pages, open the project page at detail level 2 or higher and find the section called "Post Downloads". In its subsection "Download Concatenated Text", select "P1" and "the text", then hit the download button. Upload the proofed pages for the round to dpscans in the directory CiP/<project id>/R<round number>. E.g. I would put the data for the first round of P1 for projectID 44aca0443c8cd in CiP/44aca0443c8cd/R1/.

At the start of the experiment, please put the OCR page bundle in CiP/<project id>/OCR/.
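The directory convention above can be sketched as a small helper. This is only an illustration: the function name cip_dir is hypothetical, and it does nothing more than build the relative path described in these guidelines (it does not talk to dpscans).

```python
from pathlib import PurePosixPath
from typing import Optional


def cip_dir(project_id: str, round_number: Optional[int] = None) -> PurePosixPath:
    """Build the dpscans directory for a PP1 project (hypothetical helper).

    With no round number, return the OCR directory; otherwise return
    the directory for that round, e.g. CiP/<project id>/R1.
    """
    base = PurePosixPath("CiP") / project_id
    if round_number is None:
        return base / "OCR"
    if round_number < 1:
        raise ValueError("round numbers start at 1")
    return base / f"R{round_number}"


# The example from the guidelines: first round of P1 for 44aca0443c8cd.
print(cip_dir("44aca0443c8cd", 1))
print(cip_dir("44aca0443c8cd"))
```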

Once you've collected the data for the round, please send a note to db-req@pgdp.net asking for another round of P1.

Each time your project enters P1, you may want to edit the project comments to start with (HOLD). This will keep the project from entering P2 prematurely. You will need to remove the (HOLD) once the project is re-queued for P1, and then replace it once it enters the round.

Here is some boilerplate for the project comments:

(HOLD)
Beginners are welcome! If you would work on this kind of book if it weren't experimental, we need your participation.
Please use [@@ this format] for comments about the original text (typesetting errors, etc.). Continue to use [** this format] for other comments.
This is an experiment in support of the CiP (Confidence in Page) Algorithm Project. The project is chartered with developing an algorithm for deciding that a particular page is "done" or if it needs more proofreading.
The purpose of this project is to understand the effects of running P1 over and over again. If this project accidentally releases into P2, please do not work on it there. It's important for this experiment that it be open to all potential P1 proofers.

Please add the following to the project title: {Perpetual P1 experiment}

BEGIN, RR, & NO Projects

If your project is a BEGIN, Rapid Review, or Newcomers Only project, there are special considerations. Since these projects need to go on to P2 quickly for feedback, we clone them for the experiment. The clone should NOT be marked as BEGIN, etc. Its "OCR" pages should be the output of P1 from the original project. You can use piggy's [pgdpsplit] program to make separate files out of the concatenated output of P1.
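For readers who want to see what splitting the concatenated output involves, here is a minimal sketch. It is not pgdpsplit and makes no claim about how that program works. It assumes each page in the concatenated text begins with a separator line of the form "-----File: 001.png-----" (possibly followed by more dashes or other text); check a real download before relying on that assumption. The name split_pages is hypothetical.

```python
import re
from typing import Dict, List, Optional

# Assumed separator format: "-----File: <page name>---..." at the start of a line.
SEPARATOR = re.compile(r"^-----File: (?P<name>\S+?)-{3,}")


def split_pages(concatenated: str) -> Dict[str, str]:
    """Split concatenated proofed text into a mapping of page name to page text."""
    pages: Dict[str, str] = {}
    current: Optional[str] = None
    lines: List[str] = []

    def flush() -> None:
        # Store the page collected so far, if any.
        if current is not None:
            pages[current] = "\n".join(lines) + "\n" if lines else ""

    for line in concatenated.splitlines():
        match = SEPARATOR.match(line)
        if match:
            flush()
            current = match.group("name")
            lines = []
        elif current is not None:
            lines.append(line)
    flush()
    return pages


sample = (
    "-----File: 001.png-----\n"
    "first page\n"
    "line two\n"
    "-----File: 002.png-----\n"
    "second page\n"
)
for name, text in split_pages(sample).items():
    print(name, repr(text))
```

Each entry could then be written to its own file to serve as the "OCR" pages of the clone.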

Create a directory under dpscans/CiP for the id of the NEW project. The OCR directory should contain the concatenated OCR from the original project. The R1 directory should contain the P1 from the original project (the "OCR" for the new project). The output of the first round of P1 for the new project will go in R2.
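The round numbering for a clone is offset by one, because R1 already holds the original project's P1 output. A tiny sketch of that bookkeeping, using a hypothetical helper name round_dir_name:

```python
def round_dir_name(pass_number: int, is_clone: bool = False) -> str:
    """Return the round directory for the Nth P1 pass of a project.

    For a clone, R1 is taken by the original project's P1 output,
    so the clone's own first pass goes in R2, its second in R3, etc.
    """
    if pass_number < 1:
        raise ValueError("pass numbers start at 1")
    offset = 1 if is_clone else 0
    return f"R{pass_number + offset}"


print(round_dir_name(1))                 # ordinary project, first pass
print(round_dir_name(1, is_clone=True))  # clone, first pass
```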

PP1 Projects

Notes

This is a project of the Confidence in Page Algorithm.