Site conversion to Unicode/Esperanto

From DPWiki

In the first half of 2020, Distributed Proofreaders undertook a Site conversion to Unicode, which switched the site to UTF-8. For Esperanto public domain texts, this meant that we could process them using the native accented letters (ĉapelitaj literoj) and no longer needed the X-system surrogate, in which, for example, ĉ is written as cx.

Several steps had to be completed before this was fully functional. This page tracks them.

  1. Convert the main site to use the UTF-8 code. (At this point the site will be fully Unicode-capable, but at first will stay restricted to a set of characters similar to Latin-1.) (Done! May 19th)
  2. Enable the "Extended European Latin" character suite, which will allow many more characters to be used in the system. This includes upper and lower case versions of the six letters needed for Esperanto. (Done! July 16th)
  3. Recode all in-progress Esperanto projects to use accented letters. This would require making them unavailable, preferably between rounds. (An alternative would be to wait until all Esperanto projects have moved into PP, without starting any new ones, but that would be a long wait.) (Not needed. Instead, a pseudo-language "Esperanto-x" was added to the site on July 17th, so that WordCheck continues to function properly for texts still using the X-system encoding.)
  4. Install a new Esperanto dictionary for WordCheck that uses accented letters rather than the X-system, and test that it works as expected. Task 1887 (Done! July 17th)
  5. Update project comments, documentation in wiki, etc. (Done! July 17th)
  6. Make Esperanto projects available for proofing again. (Not needed, as no recoding was done.) The first Esperanto project available in the rounds for proofing with UTF-8 accented characters was La rabistoj.
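
For illustration only, the recoding described in step 3 (which in the end was not needed) could be sketched roughly as follows. This is a minimal sketch, not the site's code: the function name `x_system_to_unicode` and the mapping table are hypothetical, and it assumes the usual X-system convention that each of c, g, h, j, s and u followed by x stands for the corresponding accented letter.

```python
import re

# Hypothetical helper, not part of the DP site code.
# Assumed X-system convention: base letter + "x" stands for the accented letter.
X_TO_UNICODE = {
    "cx": "ĉ", "gx": "ĝ", "hx": "ĥ",
    "jx": "ĵ", "sx": "ŝ", "ux": "ŭ",
}

# One of the six base letters followed by x (either case).
_X_PATTERN = re.compile(r"([cghjsu])x", re.IGNORECASE)


def x_system_to_unicode(text):
    """Replace X-system digraphs with accented Esperanto letters."""

    def replace(match):
        base = match.group(1)
        accented = X_TO_UNICODE[base.lower() + "x"]
        # The case of the result follows the case of the base letter.
        return accented.upper() if base.isupper() else accented

    return _X_PATTERN.sub(replace, text)


print(x_system_to_unicode("cxapelitaj literoj"))
```

A blind replacement like this would also alter non-Esperanto words that happen to contain these letter pairs (for example French "deux"), so any real recoding would need a manual check of the results.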

Input of accented characters

Once those steps are done, DP users may need guidance on how to input the accented characters. This section held a draft of that guidance.

Draft moved to Esperanto for Proofers.