Statistics
From DPWiki
Distributed Proofreaders has a number of statistics pages. This is an overview of existing statistics, taken from the forum thread All Kinds of Statistics, which was started by Hael. For scripts that are not related to statistics, try the Other tools page.
Overview of Available Statistics:
General
- Statistics Central: all kinds of statistics for the rounds (especially daily and monthly proofreading statistics), User Logon Statistics, Most Requested Books, the General Site Statistic, etc.
Workflow
- Activity Hub: Pages remaining in round, and Site Progress Snapshot. The Round Backlog Graphs article explains how the Pages Remaining in Round graphs are calculated.
- Projects Posted to PG: Graph with Total Projects Posted to Project Gutenberg.
- Period: Monthly. (Daily figures are available from Statistics Central.)
- Time Span: 05/2006 - present.
- Completed Projects by Month: Table with Total Projects Posted to Project Gutenberg.
- Period: Monthly.
- Time Span: 03/2001, 01/2001-present.
- Projects completed yesterday for each round.
- Round Equilibria Graph: Pie Chart with Pages Saved by Round.
- Period: Current Day, Previous Day, Rolling 7, 28 and 180 Days.
- Time Span: not available.
- garweyne maintains Backlog Numbers as well as English Backlog Numbers showing the total size of projects waiting to enter a round.
- Period: Daily.
- Time Span: 17/12/2007 - present (English: 28/6/2010 - present).
- bfoley maintains Backlog Graph and Raw Numbers: Shows the volume of projects, split by waiting (Proofing and Formatting Rounds only), available, and checked out (PP and PPV only), for each round from P1 to PPV.
- Period: Daily.
- Time Span: 05/2007 - present.
- Backlog Trend: Queue length in pages for P1-P3, F1-2, PP.
- Period: Weekly.
- Time Span: Rolling 4 Weeks.
- Project Transitions: Total Projects Queued, Released, Completed, Round Growth and Queue Growth, for P1-3 and F1-2.
- Period/Time Span: Rolling 7, 30, 90, 180 days.
- An explanation of the data presented here, from acunning40's post in the All Kinds of Statistics forum thread: As I understand it, the page just adds up how many distinct projects entered a given state during a given time period. (The last 2 columns are just the results of subtraction involving the first 3 columns.) It's not counting how many projects actually were in that state, which means the numbers can be skewed in various ways by certain squirrel activities. For instance, if a project skips P3 it normally enters P3_waiting (adding it to the "queued" column) but it never releases into the round or completes it, so it will end up adding one to the "Queue Growth" even though it's no longer in the queue. As another example, BEGIN and P3 qual projects are split for proofing and then merged while waiting for F2. For one that was split into 4 parts, 4 projects would enter the F2 queue but only 1 would release. As with round skips, they'll add to the "Queue Growth" number on the noncvs page because it's just a tally of how many projects entered each state.
- F2 Wait Time: Average Days Waiting for project to be released into F2.
- Period: Daily.
- Time Span: 01/2007-08/2008. (no longer updated)
- Neglected Projects: Table showing the time elapsed since the last page was done on each project. This is used to identify 'neglected' projects.
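The tally described in the Project Transitions explanation above can be illustrated with a small sketch. This is not the site's actual code: the state names other than P3_waiting, the function name tally_transitions, and the exact subtractions used for the two growth columns are assumptions chosen to match the two examples in the explanation (a round skip and a four-way split that is merged before F2).

```python
from collections import defaultdict

# Hypothetical state suffixes; only "P3_waiting" appears in the explanation.
COLUMN_FOR_SUFFIX = {
    "waiting": "queued",
    "available": "released",
    "complete": "completed",
}


def tally_transitions(events):
    """Count distinct projects that entered each state, per round.

    `events` is an iterable of (project_id, state) pairs, for example
    ("a", "P3_waiting"), already filtered to the reporting window.
    """
    seen = defaultdict(set)  # (round, column) -> set of project ids
    for project_id, state in events:
        round_name, _, suffix = state.partition("_")
        column = COLUMN_FOR_SUFFIX.get(suffix)
        if column is not None:
            seen[(round_name, column)].add(project_id)

    rounds = sorted({round_name for round_name, _ in seen})
    table = {}
    for round_name in rounds:
        queued = len(seen[(round_name, "queued")])
        released = len(seen[(round_name, "released")])
        completed = len(seen[(round_name, "completed")])
        table[round_name] = {
            "queued": queued,
            "released": released,
            "completed": completed,
            # Assumed subtractions: the explanation only says the last two
            # columns are derived from the first three.
            "round_growth": released - completed,
            "queue_growth": queued - released,
        }
    return table


example_events = [
    # A project that skips P3: it enters the queue but is never released.
    ("skipper", "P3_waiting"),
    # A project that goes through P3 normally.
    ("normal", "P3_waiting"),
    ("normal", "P3_available"),
    ("normal", "P3_complete"),
    # A project split into 4 parts: all 4 enter the F2 queue, 1 is released.
    ("part1", "F2_waiting"),
    ("part2", "F2_waiting"),
    ("part3", "F2_waiting"),
    ("part4", "F2_waiting"),
    ("part1", "F2_available"),
]

table = tally_transitions(example_events)
```

With these example events, the skipped project leaves P3 with a queue growth of 1 and the merged project leaves F2 with a queue growth of 3, even though none of those projects is still waiting. This is the skew the explanation warns about.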
LOTE
- French Stats: Shows pages in each round for French projects. Note - article in French.
- Period: Monthly.
- Time Span: 01/2008-present.
- German Stats: Shows pages in each round for German projects.
- Period: Monthly.
- Time Span: 01/2008-present.
- Portuguese Stats: Shows pages in each round for Portuguese projects.
- Period: Monthly.
- Time Span: 01/2006-present.
- Note: Portuguese projects include both "Portuguese" and "Portuguese with..." projects.
- Italian Stats: Shows pages in each round for Italian projects.
- Period: 4 weeks.
- Time Span: 01/2009-present.
- Note: Originally the weekly stats were displayed in the Italian team thread. They include the data for DP-EU.
User Behaviour
- User registrations: Graph showing the number of new users each day.
- User by latest activity: Count of users who were last active in each month.
- Users by month joined: Signup and retention rates.
- Users active after registration by month: Users who were active 1, 7, and 28 days after registration, by month of registration.
- Proofer Retention Graph: Users who went on to proof at least one page, as a percentage of all users.
- Period: Monthly.
- Time Span: 09/2000-present.
- User Activity Graph and User Activity Raw Data: Registered Users, Users Active in Last Week and Users with 1 or more pages saved in P1-3 and F1-2.
- Period: Daily.
- Time Span: 28/2/2008-present.
- Web Server Statistics: Various figures on site activity, including volume of file requests; visitors' browsers and operating systems; and file size, type and directory.
- Period: All Time and Rolling 7 Days for file requests, Daily and Hourly for total activity, rest unspecified.
- Time Span: N/A.
- daniemers maintains proofers active in the last week for each round.
- Time Span: 24.07.2009 - 10.06.2011.
General Manager's annual report
- GM Report graphs: Shows graphs of the statistics in the First Annual GM Report.
- Time Span: June 1, 2009 - May 31, 2012.
- forumpost:860315