WordCheck Engine Enhancements


WordCheck Engine Interface Enhancements

Status


Introduction

This page is designed to track the status of any WordCheck engine enhancements and provide a central location for things brought up in a forum thread.

Please feel free to contribute to this document.


Feature status legend

  • Black (S16 000000.png) indicates the feature has been pushed to the production server
  • Blue (S16 0000ff.png) indicates the feature has been checked into CVS and is likely available on the main test site, or will be shortly
  • Green (S16 00ff00.png) indicates the feature is coded in someone's sandbox (see the list of sandboxes below)
  • Yellow (S16 ffff00.png) indicates that a developer has vouched for the feature and is willing to code it or is already doing so


Distilled features

These are some features/changes that were distilled from the discussion forum (with an attempt to attribute the thought to the poster with a link to the post where applicable). Feel free to add to these lists but don't modify the status unless you're the developer.

General

  • forumpost:334902 kraester: Allow adding "words" with spaces (such as Prairie du Chien) to the Good Words List.
  • forumpost:334916 garweyne: Use language-dependent word formation rules (as detailed in the aspell dictionaries); for example, belle-mère is one word in French but two in English.
  • forumpost:334916 garweyne: Allow hyphens in words, even if not present in the language rules.

Proofer Suggestion text -> database conversion

Currently the words that proofers suggest are stored in a text file, one per project, in the format:

timestamp/round/page/proofer/word1 word2 ...

Each time a proofer makes suggestions (regardless of how they exit the interface), a line in the above format is appended to the file. If a proofer runs WordCheck but does not make a suggestion, no line is added.
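As an illustration of the line format above, here is a minimal Python sketch of a parser. It assumes the timestamp is an integer Unix time and that the suggested words are separated by whitespace; the function name, the record type, and the sample values are hypothetical and not part of the WordCheck code.

```python
from typing import List, NamedTuple


class SuggestionLine(NamedTuple):
    timestamp: int
    round_id: str
    page: str
    proofer: str
    words: List[str]


def parse_suggestion_line(line: str) -> SuggestionLine:
    """Parse one 'timestamp/round/page/proofer/word1 word2 ...' line."""
    # Split on the first four slashes only, so a slash inside a suggested
    # word does not break the record apart.
    timestamp, round_id, page, proofer, words = line.rstrip("\n").split("/", 4)
    return SuggestionLine(int(timestamp), round_id, page, proofer, words.split())


rec = parse_suggestion_line("1190000000/P1/001.png/alice/foo bar\n")
```

Limiting the split to four separators keeps the trailing word list intact even if a suggestion happens to contain a slash.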


garweyne had put forth the idea of using the suggestion file to track overall WordCheck usage, regardless of whether the proofer made a suggestion. During the discussion of implementing that change, it was proposed to move the data to a more generic table that could track not only WordCheck usage and related data, but also other page checks. Something like:

create table page_checks (
  check_type char(4) not null, -- WC for WordCheck, maybe PC for Punctuation Check, etc
  projectid varchar(22) not null,
  timestamp int(10) unsigned not null,
  image varchar(12) not null,
  round_id char(2) not null,
  username varchar(25) not null,
  suggestions text, -- \n delimited list of suggestions
  corrections text, -- \n delimited list of corrections in wdiff form: [-ORIG-] {+NEW+}
  primary key(check_type,projectid,timestamp,image)
);
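To make the text-to-database conversion concrete, the following Python sketch loads suggestion-file lines into the proposed table. It is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than the site's MySQL server (SQLite accepts but does not enforce the column sizes), and the helper name `import_suggestions`, the project ID, and the sample lines are hypothetical.

```python
import sqlite3

# SQLite rendering of the proposed table; the MySQL column sizes are kept
# only as documentation because SQLite does not enforce them.
SCHEMA = """
create table page_checks (
  check_type char(4) not null,
  projectid varchar(22) not null,
  timestamp integer not null,
  image varchar(12) not null,
  round_id char(2) not null,
  username varchar(25) not null,
  suggestions text,
  corrections text,
  primary key (check_type, projectid, timestamp, image)
)
"""


def import_suggestions(conn, projectid, lines):
    """Insert one 'WC' row per suggestion-file line; return the row count."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        timestamp, round_id, image, username, words = line.split("/", 4)
        # Suggestions are stored newline-delimited, as in the proposal;
        # the old file carries no corrections, so that column stays NULL.
        rows.append(("WC", projectid, int(timestamp), image, round_id,
                     username, "\n".join(words.split()), None))
    conn.executemany(
        "insert into page_checks values (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
count = import_suggestions(conn, "projectID_demo", [
    "1190000000/P1/001.png/alice/foo bar",
    "1190000100/P1/002.png/bob/baz",
])
```

Each line of the old file maps to exactly one row, so a per-project import of this kind could be rerun against an empty table without extra bookkeeping.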


This format would support the following searches:

  • Find all WC suggestions made for a specific project
select suggestions from page_checks where check_type='WC' and projectid='PID';
  • Find all WC suggestions made after a specific point in time
select suggestions from page_checks where check_type='WC' and projectid='PID' and timestamp > NUM;
  • Get a list of all pages that have been WordCheck'd in a given round:
select image, max(timestamp) as last_check from page_checks where check_type='WC' and projectid='PID' and round_id='RID' group by image;
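Because each row stores its suggestions as one newline-delimited string, a caller of the suggestion queries above would still need to split and combine the results. A minimal Python sketch, assuming the rows arrive as one-column tuples the way most database drivers return them (the function name and the sample rows are hypothetical):

```python
from collections import Counter


def count_suggestions(rows):
    """Tally words across rows returned by a 'select suggestions' query."""
    counts = Counter()
    for (suggestions,) in rows:
        # The column is nullable and newline-delimited in the proposal.
        if suggestions:
            counts.update(w for w in suggestions.split("\n") if w)
    return counts


rows = [("foo\nbar",), ("foo",), (None,)]
counts = count_suggestions(rows)
```

The tally shows how many separate checks suggested each word, which may help when deciding what to add to a project's Good Words List.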


Queries that could not be answered entirely from the index defined above:

  • Get a list of all proofers that interacted with a specific page
select distinct username from page_checks where check_type='WC' and projectid='PID' and image='IMAGE';


Queries that would not use any index on the table above:

  • Find all checks done on a specific page
select check_type from page_checks where projectid='PID' and image='IMAGE';
  • Find all checks done on a specific project
select check_type from page_checks where projectid='PID';
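The reason is that the primary key starts with check_type, so a search that omits that column cannot seek into the key. One possible remedy, not something proposed in the discussion, would be a secondary index that starts with projectid. The Python sketch below checks the idea with SQLite's EXPLAIN QUERY PLAN; SQLite's planner differs from MySQL's, so this is only indicative, and the index name `idx_project_image` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
create table page_checks (
  check_type text not null,
  projectid text not null,
  timestamp integer not null,
  image text not null,
  round_id text not null,
  username text not null,
  suggestions text,
  corrections text,
  primary key (check_type, projectid, timestamp, image)
)
""")

QUERY = ("select check_type from page_checks "
         "where projectid='PID' and image='IMAGE'")


def plan(conn, sql):
    """Return the planner's description of how it would run sql."""
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = conn.execute("explain query plan " + sql)
    return " | ".join(row[-1] for row in rows)


before = plan(conn, QUERY)
# Hypothetical secondary index whose leading column is projectid.
conn.execute("create index idx_project_image on page_checks (projectid, image)")
after = plan(conn, QUERY)
print(before)
print(after)
```

With the extra index in place the planner can seek directly on projectid and image, at the cost of additional write and storage overhead for every recorded check.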


It is unlikely that a punctuation check page would have 'suggestions', although it might have 'corrections'.


Testing sandboxes

None currently.

See Also (on WordCheck)