WordCheck Engine Enhancements
WordCheck Engine Interface Enhancements
Status
- Current status: Brainstorming
- Driving developers: cpeel
- Informed developers: jmdyck
- Informed non-developers: garweyne, kraester
- Discussion threads: WordCheck development: WordCheck Engine
Introduction
This page is designed to track the status of any WordCheck engine enhancements and provide a central location for things brought up in a forum thread.
Please feel free to contribute to this document.
Feature status legend
indicates the feature has been pushed down to the production server
indicates the feature has been checked into CVS and is likely available on the main test site, or will be shortly
indicates the feature is coded in someone's sandbox (see list of sandboxes below)
indicates that some developer has vouched for the feature and is willing to code it or is in the process of doing so
Distilled features
These are some features/changes that were distilled from the discussion forum (with an attempt to attribute the thought to the poster with a link to the post where applicable). Feel free to add to these lists but don't modify the status unless you're the developer.
General
- forumpost:334902 kraester: Allow adding "words" with spaces (such as Prairie du Chien) to the Good Words List.
- forumpost:334916 garweyne: Use language-dependent word formation rules (as detailed in the aspell dictionaries); for example, belle-mère is one word in French, two in English.
- forumpost:334916 garweyne: Allow hyphens in words, even if not present in the language rules.
Proofer Suggestion text -> database conversion
Currently the words that proofers suggest are stored in a text file, one per project, in the format:
timestamp/round/page/proofer/word1 word2 ...
Each time a proofer makes suggestions (regardless of how they exit the interface), a line in the above format is added to the file. If a proofer runs WordCheck but does not make a suggestion, no line is added.
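As an illustration of the line format above, here is a minimal Python sketch of a parser for one line of the suggestion file. The names Suggestion and parse_suggestion_line are hypothetical, and the sketch assumes the timestamp is an integer and that the suggested words are separated by whitespace; neither detail is confirmed by the format description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Suggestion:
    """One line of a project's suggestion file (hypothetical structure)."""
    timestamp: int
    round_id: str
    page: str
    proofer: str
    words: List[str]


def parse_suggestion_line(line: str) -> Suggestion:
    """Parse 'timestamp/round/page/proofer/word1 word2 ...'."""
    # Split on the first four slashes only, so a slash inside a
    # suggested word stays in the word list.
    timestamp, round_id, page, proofer, words = line.rstrip("\n").split("/", 4)
    # Assumption: the timestamp is an integer and words are whitespace-separated.
    return Suggestion(int(timestamp), round_id, page, proofer, words.split())
```

Because the words are split on whitespace, a multi-word entry such as Prairie du Chien cannot be represented in this format, which is one motivation for a different storage scheme.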
garweyne proposed using the suggestion file to track overall WordCheck usage, regardless of whether the proofer made a suggestion. During the discussion of that change, it was proposed to move the data to a more generic table that could track not only WordCheck usage and related data, but also other page checks. Something like:
create table page_checks (
    check_type  char(4) not null,           -- WC for WordCheck, maybe PC for Punctuation Check, etc.
    projectid   varchar(22) not null,
    timestamp   int(10) unsigned not null,
    image       varchar(12) not null,
    round_id    char(2) not null,
    username    varchar(25) not null,
    suggestions text,                       -- \n delimited list of suggestions
    corrections text,                       -- \n delimited list of corrections in wdiff form: [-ORIG-] {+NEW+}
    primary key (check_type, projectid, timestamp, image)
);
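To make the proposal concrete, the following Python sketch records WordCheck runs in a table shaped like the one above. It is only an illustration: it uses an in-memory-friendly SQLite schema with simplified column types rather than the MySQL definition, and the helper names record_wordcheck and suggestions_for_project are hypothetical.

```python
import sqlite3

# Assumption: SQLite stand-in for the proposed MySQL table, with
# simplified column types (text/integer) but the same columns and key.
SCHEMA = """
create table page_checks (
    check_type  text    not null,
    projectid   text    not null,
    timestamp   integer not null,
    image       text    not null,
    round_id    text    not null,
    username    text    not null,
    suggestions text,
    corrections text,
    primary key (check_type, projectid, timestamp, image)
)
"""


def record_wordcheck(conn, projectid, timestamp, image, round_id, username, words):
    """Insert one WordCheck run; a run with no suggestions is still recorded."""
    suggestions = "\n".join(words) if words else None
    conn.execute(
        "insert into page_checks "
        "(check_type, projectid, timestamp, image, round_id, username, suggestions) "
        "values ('WC', ?, ?, ?, ?, ?, ?)",
        (projectid, timestamp, image, round_id, username, suggestions),
    )


def suggestions_for_project(conn, projectid):
    """Return every suggested word for a project, oldest run first."""
    rows = conn.execute(
        "select suggestions from page_checks "
        "where check_type='WC' and projectid=? and suggestions is not null "
        "order by timestamp",
        (projectid,),
    )
    words = []
    for (text,) in rows:
        # Suggestions are newline-delimited, so entries may contain spaces.
        words.extend(text.split("\n"))
    return words
```

Two properties of the proposal show up here: a run without suggestions still produces a row, so overall usage can be counted, and the newline delimiter leaves room for entries that contain spaces.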
This format would let us do the following things (and thus searches):
- Find all WC suggestions made for a specific project
select suggestions from page_checks where check_type='WC' and projectid='PID';
- Find all WC suggestions made after a specific point in time
select suggestions from page_checks where check_type='WC' and projectid='PID' and timestamp > NUM;
- Get a list of all pages that have been WordCheck'd in a given round:
select image, max(timestamp) from page_checks where check_type='WC' and projectid='PID' and round_id='RID' group by image;
Queries that could not be answered entirely from the index with the above definition:
- Get a list of all proofers that interacted with a specific page
select username from page_checks where check_type='WC' and projectid='PID' and image='IMAGE';
Queries that would not use any index on the table above:
- Find all checks done on a specific page
select check_type from page_checks where projectid='PID' and image='IMAGE';
- Find all checks done on a specific project
select check_type from page_checks where projectid='PID';
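The index behaviour described in the lists above follows from the leftmost-prefix rule for multi-column indexes: a query can seek on the primary key only through its leading columns. The Python sketch below is a deliberately simplified model of that rule, not a description of the MySQL optimizer; it considers equality filters only, and the function name usable_prefix is hypothetical.

```python
# Column order of the proposed primary key.
PRIMARY_KEY = ("check_type", "projectid", "timestamp", "image")


def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that equality filters can seek on.

    Simplified model: stops at the first index column that the query
    does not filter on, and ignores range conditions and full index scans.
    """
    prefix = []
    for column in index_columns:
        if column not in equality_columns:
            break
        prefix.append(column)
    return prefix
```

Under this model, filtering on check_type and projectid can use the first two key columns, while filtering on projectid and image alone can use none of them, because check_type comes first in the key.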
It is unlikely that a punctuation check page would have 'suggestions', although it might have 'corrections'.
Testing sandboxes
None currently.