Custom characters

Custom characters are extra characters that Project Managers can add to a project. They are project-specific: proofreaders can use them only in the project where they are defined. They are intended to complement our precomposed character suites, and should be added on an "as-needed" basis.

Why do we have them?

Our character suites do not cover all the characters we might want to use. A Project Manager may wish to add custom characters for various reasons: to add only a few characters without enabling a whole character suite, or to make available characters that can't be used otherwise.

Some characters may not be in character suites because they are infrequently used, or have potential for confusion with other characters, or their suite has not been planned yet.

How do we use them?

When a Project Manager edits a project, they can add characters directly in the Custom Characters line. They should be input directly as the characters themselves (not the U+xxxx number), and without any separators.
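If you only have the code points to hand, a short script can produce the literal characters to paste into the Custom Characters line. This is a minimal sketch, not part of the DP site code; the function name `codepoints_to_field` and the sample code points are purely illustrative.

```python
def codepoints_to_field(codepoints):
    """Turn 'U+0101'-style labels into the literal characters, joined with no separators."""
    chars = []
    for label in codepoints:
        # Drop the leading "U+" and read the rest as a hexadecimal number.
        hex_digits = label.upper().replace("U+", "")
        chars.append(chr(int(hex_digits, 16)))
    return "".join(chars)


# Illustrative input: a, e and i with macron.
field_value = codepoints_to_field(["U+0101", "U+0113", "U+012B"])
print(field_value)
```

The printed string contains the characters themselves, with nothing between them, which is the form the field expects.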

For such characters to be allowed when a project is loaded, they need to be defined before loading the project pages. The characters can be edited at any point, but PMs should be very careful not to remove any characters that are in use in the project.

Please mention any custom characters in the project comments; otherwise proofreaders may not know to use them.

From the point of view of DP site code, custom characters are treated as belonging to a project-specific character suite. They will show up in a single pickerset in the proofing interface. When the pickerset is selected, the custom characters will be displayed in two rows, with the first half on the upper row, and the remainder on the lower row.
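The two-row layout can be pictured with the following sketch. It is not the actual site code: the function name `split_into_rows` is hypothetical, and giving the extra character to the upper row when the count is odd is an assumption, since the page only says "first half".

```python
def split_into_rows(chars):
    """Split a list of custom characters into an upper and a lower row.

    Each entry is one custom character (possibly a combining sequence).
    Assumption: with an odd count, the upper row gets the extra character.
    """
    upper_len = (len(chars) + 1) // 2
    return chars[:upper_len], chars[upper_len:]


upper, lower = split_into_rows(["a", "b", "c", "d", "e"])
print(upper)  # ['a', 'b', 'c']
print(lower)  # ['d', 'e']
```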

Custom characters have the following constraints:

  • PMs are allowed a maximum of 32 custom characters per project.
  • All custom characters in a project must be unique.
  • A custom character cannot be one of the characters that we convert to ASCII on project load.

These constraints are enforced when the project information is saved, and the site will tell you if any of them is violated.
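To check a planned set of characters before editing the project, the three constraints can be sketched as follows. This is an illustration only, not the site's validation code: the name `check_custom_characters` is hypothetical, and `ASCII_CONVERTED` is a placeholder set, because this page does not list which characters are converted to ASCII on project load.

```python
MAX_CUSTOM_CHARS = 32

# Placeholder only: the real list of characters converted to ASCII on
# project load is not given on this page. Curly quotes are assumed here
# purely for illustration.
ASCII_CONVERTED = {"\u2018", "\u2019", "\u201c", "\u201d"}


def check_custom_characters(chars):
    """Return a list of problems; an empty list means all three checks passed.

    `chars` is a list of strings, one entry per custom character.
    """
    problems = []
    if len(chars) > MAX_CUSTOM_CHARS:
        problems.append(f"too many characters: {len(chars)} > {MAX_CUSTOM_CHARS}")
    if len(set(chars)) != len(chars):
        problems.append("duplicate characters found")
    clashes = [c for c in chars if c in ASCII_CONVERTED]
    if clashes:
        problems.append("converted to ASCII on load: " + " ".join(clashes))
    return problems


print(check_custom_characters(["\u0101", "\u0113", "\u0101"]))
# ['duplicate characters found']
```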

Which characters can we use?

  • Any characters in already existing character suites.
  • Any characters in the recommended custom characters list.
  • With caution, other characters that are covered by the DejaVu Sans Mono font.
  • Please do not use characters that are not covered by our web fonts.

Further technical considerations

  • Just because you can find a Unicode character doesn't mean you should use it. Take font support into account. DejaVu Sans Mono and DP Sans Mono support a wide range of Unicode characters, but not all of them. Proofreaders can also specify their own font for proofreading, which may or may not include the characters available in the character suites. Specifying a character that isn't in the DP-provided web fonts may give unpredictable results, and makes it less likely that proofreaders will see the character as intended (or see it at all).
  • Because of the limit on the number of characters, custom characters are not a good way to define another alphabet or syllabary. If a new script may be needed, the DP administrators and developers will have to assess the need, and a new character suite will have to be created.
  • Note that custom characters will be NFC-normalized automatically. This will usually not make any difference, but there are a few cases (for example, some Greek letters with oxia) where one character will be replaced with another that the Unicode standard considers canonically equivalent.
  • A custom character can be a combining character sequence, so it is possible to provide a letter with a diacritical mark that does not have a precomposed form in Unicode.
  • A small number of Unicode characters are displayed as emoji by default in some fonts. To force them to display in text presentation, you have to include a variation selector. See the official list of such characters from the Unicode Consortium.
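The last three points can be tried out with Python's standard `unicodedata` module. This is a sketch for experimenting, not the site's own code; the sample characters are chosen for illustration, and U+FE0E (VARIATION SELECTOR-15) is the selector that Unicode defines for requesting text presentation.

```python
import unicodedata


def nfc(text):
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)


# Greek small alpha with oxia is replaced by alpha with tonos under NFC.
oxia = "\u1f71"
normalized = nfc(oxia)
print(f"U+{ord(oxia):04X} -> U+{ord(normalized):04X}")  # U+1F71 -> U+03AC

# "m" followed by a combining macron has no precomposed form, so NFC
# leaves it as a two-code-point combining character sequence.
m_macron = "m\u0304"
print(len(nfc(m_macron)))  # 2

# LEFT RIGHT ARROW followed by VARIATION SELECTOR-15 requests text
# presentation rather than emoji presentation.
arrow_text = "\u2194\ufe0e"
print([unicodedata.name(c) for c in arrow_text])
```

Running a candidate character through `nfc` before adding it shows whether the site will store a different code point from the one you entered.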