
I've been asked about guidelines translations quite a few times, so I figured I'd put my answers in one place instead of having to keep resending the same info. :) Feel free to PM me (acunning40) with questions!

Please note that all of this just summarizes what I've figured out as I've gone along, based on what I've seen in the English guidelines, the existing translations, and the new translations in progress. This isn't authoritative; it's just a collection of my recommendations and ideas.

Where to start?

Updating an existing translation

If you're going to be updating an existing translation, you can download the current version here. Look for the name of the document(s) that are in your language. For each, click on the title, then right click on the first Download link at the top of the page and choose the option called something like Save Target As or Save Link As.

To figure out what needs to be updated, you can compare the current English and [your language] versions carefully to look for differences. However, it's very easy to overlook minor differences, so a better option is to look at a diff comparing the current English document to the version that the existing translation was based on. That is, you'd look at a diff to find out all the changes that have been made to the English document, and then you'd make the same changes to yours. To get the diff that you need, go to the appropriate page:

The most recent version will be at the top; click on "select for diffs" there. Scroll down until you find the version that your existing translation was based on, and click on "to selected #.#". For instance, if the current version of the English document is 1.88 and the last translation into your language was done in January 2006, then you'd click on "select for diffs" on the 1.88 version, scroll down to version 1.80 (from January 2, 2006) and click on "to selected 1.88". That will show you the changes that have been made to the English document since January 2, 2006.
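
If you happen to have both English versions saved on your computer, you can also produce the same kind of diff offline. Here's a minimal Python sketch using the standard difflib module. The function name and the version labels are just illustrative (the labels reuse the 1.80 and 1.88 example), and the two short strings stand in for the real file contents.

```python
import difflib

def english_changes(old_text, new_text):
    """Return unified-diff lines showing what changed between two English versions."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="english-1.80",  # hypothetical label for the version your translation was based on
        tofile="english-1.88",    # hypothetical label for the current version
        lineterm="",
    ))

# Stand-in content; in practice you would read the two saved files instead.
old = "<p>Proofread the text.</p>\n<p>Leave a blank line.</p>"
new = "<p>Proofread the text.</p>\n<p>Leave two blank lines.</p>"

for line in english_changes(old, new):
    print(line)
```

Lines starting with "-" were removed from the English document and lines starting with "+" were added, so those are the places to revisit in your translation.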

Starting from scratch

There are various ways to go about this, and exactly how you do it is up to you. Here are some possibilities:

  1. You can download the English versions and make the changes directly. Use these links: formatting guidelines, proofreading guidelines. On each page, right click on Download at the very top of the page, and choose the option called something like Save Target As or Save Link As. Open each one up in a text editor and translate each bit of English into your language, ignoring the html <markup>. See What to translate below for more details.
  2. If that's difficult for you in some way, then you could copy all the text of the guidelines from your browser into a text editor and just translate the text itself. Then someone else would need to make copies of the English proofreading & formatting guidelines, and go through and put your translated text in place of the English. At least some of that would need to be done by someone who knows your language, in order to place the inline markup (like italics and links) correctly.
  3. If you want to distribute the translation process some more, you could do something like option 1 or 2, but instead of working on it in a text editor on your computer you could use a wiki page. That way anyone who wanted to could edit the page to translate what they wanted to, and then others could review the translations and make improvements.

For option #2, please note the utf-8 info below to make sure that you translate everything that you need to.

You could start with either the proofreading or formatting guidelines, and once you're finished you could use that as a basis for the other. Doing that will probably require a lot of careful checking to make sure that you don't accidentally include formatting instructions in the proofreading guidelines or vice versa. That said, though, it's probably a reasonable way to go about it because there is a lot of overlap in some sections. I doubt that any of the sections are completely identical, because even where all the instructions are the same they'll almost always start out differently, such as:

Proofread a large and ornate graphic first letter ...
Format a large and ornate graphic first letter ...

There are various other little differences scattered throughout, like the proofreading guidelines saying, "the next proofreader, the formatters, and the post-processor" where the formatting guidelines say, "the next formatter and the post-processor". When you're basically finished, doing a search for "format" (in your language ;)) in the proofreading guidelines or "proof" in the formatting guidelines might help you to find anything that you missed.
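
That final search can be done in any text editor, but here's a rough Python sketch of the same idea in case it's useful. It strips tags from each line first so that things like href="#formatting" don't show up as false hits. The function name is made up for this example, and it assumes each tag sits on a single line.

```python
import re

def find_leftovers(text, stem):
    """Return (line_number, line) pairs whose visible text contains the stem."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        visible = re.sub(r"<[^>]*>", "", line)  # drop tags so attribute values don't match
        if stem.lower() in visible.lower():
            hits.append((number, line))
    return hits

sample = (
    "<p>Proofread the page.</p>\n"
    "<p>Format the heading.</p>\n"
    '<a href="#format">Notes</a>'
)
print(find_leftovers(sample, "format"))
```

You'd pass in whatever the equivalent of "format" or "proof" is in your language.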

Making the edits

For the actual process of making changes, it's best to avoid using a WYSIWYG html editor. Those types of programs will often make many changes to the code, which will make quite a mess of the diff between the existing document and your new version. Using a text editor to edit the code directly is the best way to avoid making unintentional changes to the html and php code. Unfortunately, it's harder to read for content when the code is in the way. You may find it easier to convert html entities into the actual characters (e.g. convert &auml; into ä) while you're editing the file. (Don't forget to change them back when you're all finished, though!)
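
One possible way to do that conversion in both directions is sketched below in Python, using the standard html.entities tables. This is only a sketch under some assumptions: the function names and the KEEP set are my own choices, entities that matter to the markup itself (such as &amp; and &lt;) are deliberately left alone, and the round trip only reproduces the original file exactly if it used named entities throughout.

```python
import re
from html.entities import name2codepoint, codepoint2name

# Entities left untouched because they are significant to HTML itself
# (or, for nbsp, invisible once converted to a character).
KEEP = {"amp", "lt", "gt", "quot", "apos", "nbsp"}

def entities_to_chars(text):
    """Replace named entities such as &auml; with the characters they stand for."""
    def swap(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave unknown or protected entities as they are
        return chr(name2codepoint[name])
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", swap, text)

def chars_to_entities(text):
    """Replace each non-ASCII character with a named entity, or a numeric one if it has no name."""
    def swap(ch):
        code = ord(ch)
        if code < 128:
            return ch
        name = codepoint2name.get(code)
        return f"&{name};" if name else f"&#{code};"
    return "".join(swap(ch) for ch in text)

editable = entities_to_chars("K&auml;se &amp; Brot")
print(editable)
print(chars_to_entities(editable))
```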

If you know of any way to get around these problems and edit the content more easily without messing up the code, please feel free to add your ideas here or PM me. I'm sure there are people working on translations who would like to know!

What to translate

In general, just translate the content and leave the markup alone. For instance, if it says

<li><a href="#footnotes">Footnotes/Endnotes</a></li>

you'd need to translate "Footnotes/Endnotes" but you'd leave the rest alone.
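
If you'd like a quick sanity check that the markup survived your editing, something like the following Python sketch might help. It compares only the sequence of tag names in the English and translated text, ignoring attributes and content. The function names are hypothetical and the Dutch wording is just an illustration. Keep in mind that a mismatch isn't always an error, since inline markup can legitimately move around when word order differs.

```python
import re

# Matches the name of an opening or closing tag, e.g. "li" or "/li".
TAG = re.compile(r"<(/?[A-Za-z][A-Za-z0-9]*)")

def tag_sequence(text):
    """List the tag names in document order, e.g. ['li', 'a', '/a', '/li']."""
    return [name.lower() for name in TAG.findall(text)]

def first_mismatch(english, translated):
    """Return the index of the first differing tag, or None if the sequences match."""
    a, b = tag_sequence(english), tag_sequence(translated)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))

english = '<li><a href="#footnotes">Footnotes/Endnotes</a></li>'
good = '<li><a href="#footnotes">Voetnoten/Eindnoten</a></li>'
broken = '<li><a href="#footnotes">Voetnoten/Eindnoten</li>'  # closing </a> lost

print(first_mismatch(english, good))
print(first_mismatch(english, broken))
```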

The translation should be comprehensible to someone who doesn't know English.


Some people haven't been sure whether they should translate the contents of sample notes such as [**unclear] and [**typo fixed, changed from "txet" to "text"]. I don't see any reason to keep them in English; go ahead and translate them.

Text inside markup

These are a few places where there is content inside markup that could be translated:

Latin-1 shortcut tables
Each shortcut has markup such as
&agrave; Alt-0224
The title ("Small a grave" in this case) will appear when someone holds their mouse over the table cell in their browser. It would make sense to translate all of these titles.
Table summaries
Many tables have summaries such as summary="Play Example 1". I don't think it's a big deal either way, but you may as well translate these summaries.
Page title
Around the 10th line of the file, in the php code at the top, there's a line that says something like:
theme('Proofreading Guidelines','header',$theme_args);
Translate the phrase "Proofreading Guidelines" (or "Formatting Guidelines"), leaving the rest alone.

How to translate

My understanding is that you should do a direct translation, keeping all the same content as in the English guidelines. That is, your translations won't be guidelines for [your language] projects, they'll be guidelines for any project to the same extent as the English guidelines are.

In most cases this is pretty straightforward: just translate the content from English into your language. However, there are some things that aren't so obvious or that may need special handling.

Links to English webpages

Links leading to English pages have been dealt with in different ways. Here are some links that might need special handling:

  • Proofreading Quiz and Tutorial; Formatting Quiz
  • Proofreading Summary; Formatting Summary
  • Standard Proofreading Interface Help; Enhanced Proofreading Interface Help
  • Gallery of Table Layouts (only in the F* guidelines)

For these links we want people to know where they lead without having to know English, but it would also be nice to inform them that the link targets are English documents. A couple options are to put a translation of the document's title in parentheses, or to translate the link but put "(in English)" in parentheses. For instance, in Dutch these two options would be (the underlining represents the linked part):

  • Poetry Books (Poëzie)
  • Poëzie (in het Engels)

I think either of these methods would be fine, but it would probably be good to be consistent throughout.

There are also three links at the bottom of the page:

  • Distributed Proofreaders home page, DP FAQ Central page, Project Gutenberg home page.

I think these could be translated directly without doing anything else, since I'd think that people would be able to figure out that the links lead to English pages.

If the link is to a document that has already been translated into your language, then go ahead and change the link to lead to that translation, not to the English document. For instance, where the English Proofreading Guidelines give a link to the (English) Formatting Guidelines, in the German translation of the P guidelines that link should lead to the German translation of the F guidelines rather than to the English document.

Quotes from DP pages

The "Fixing errors on Previous Pages" section of the guidelines mentions various links and phrases that appear on the project page and diffs page:

  • Images, Pages Proofread, & Differences; Just My Pages
  • Edit

Since it's talking about what you'll see on the diffs or project page, and the phrases always appear in English on those web pages, I think you should leave them that way in the guidelines. However, you could put their translations in parentheses if it would help people understand the section better.


Some terms in the English guidelines may be problematic in translation. "LOTE" and "stealth scanno" are a couple examples.

It looks like all our current guidelines translations use the term "scanno" directly. Most also use "stealth scanno" (the French guidelines have "Scannos furtifs" instead). Some have translations in parentheses while others don't.

The term "LOTE" appears twice in the English guidelines for ellipses:

English and Languages Other Than English (LOTE)
LOTE: (Languages Other Than English)

The abbreviation only works from the English phrase, so this is problematic in translation. A quick survey of the current translations finds:

inglés y otros idiomas (LOTE)
LOTE: (Languages Other Than English--otros idiomas que no sean inglés.)
em inglês (English) ou não (Languages Other Than English -- LOTE)
LOTE: (Languages Other Than English)
en anglais ou non
für Englisch und für andere Sprachen (LOTE, Languages Other Than English)
LOTE (Languages Other Than English)
voor Engels en andere talen (Languages Other Than English (LOTE))
LOTE: (Andere talen dan Engels)

There's obviously a lot of inconsistency, and I don't have any particular preference. I'd just suggest trying to make it clear, as well as making sure that it's understandable without knowing English.

utf-8 text

In the proofreading and formatting guidelines, there are a few places that actually contain more text in the .php files than what you see in your browser. It's all utf-8 stuff, so it doesn't appear in the browser since DP isn't set to use utf-8. If you aren't working directly with a PHP file, you'll need someone to provide you with copies of the content to be translated. Contact me (acunning40) or anybody who has worked on site development for assistance.

A general note: if some of the markup or php code stuff doesn't make sense to you, don't worry about it. The important thing is to get the content translated and the inline markup correctly placed. The larger issues can be checked over and fixed up if necessary by me or someone else, but the inline markup should be done by the person translating things or at least by someone who knows the language.

If you're curious, the code "if(!$utf8_site)" means basically "if the site isn't set to use utf-8, then display the following." The code "else" means "otherwise (i.e. if the site is set to use utf-8), display the following." The final "}" marks the place where the different instructions end, so anything after that will be displayed no matter what. Each of those code bits also has some extra stuff around it that marks it as being code.

In all cases except for double quotes, the different versions are at the level of paragraphs, so it's easy to separate them. For the double quotes section it splits off into the separate versions in the middle of the sentence, so you may have to adjust where the split occurs due to different word order in your language.

Other stuff to do

Once you're finished translating the content you'll need to convert all non-ASCII characters to html entities. For instance, use &auml; instead of ä. Edit in 2010: I'm starting to think we shouldn't require going down to ASCII only. That was done to avoid any encoding problems, but as long as we're careful to check everything, Latin-1 should be okay as an option. But the files definitely have to use entities for anything outside Latin-1.
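
To find the characters that still need attention, a small check like the Python sketch below could be used. It reports every character that a given encoding can't represent, so with "latin-1" it lists what must become an entity, and with "ascii" it lists everything non-ASCII. The function name is my own invention for this example.

```python
def outside_charset(text, encoding="latin-1"):
    """Return (line, column, character) for every character the encoding cannot represent."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                problems.append((line_no, col, ch))
    return problems

sample = "K\u00e4se\n\u0153uvre"  # "ä" is in Latin-1, "œ" is not
print(outside_charset(sample))           # characters outside Latin-1
print(outside_charset(sample, "ascii"))  # all non-ASCII characters
```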

If you've produced a new translation, you'll also need to provide translations of these phrases:

Proofreading Guidelines in [your language]
Formatting Guidelines in [your language]

so that someone can go through all the other translations and make links to yours.

When you're finished

It's probably a good idea to put your translation on the test site so that you or others can check things over and make sure that things display properly and so forth. I'm happy to put translations in my sandbox on the test site if needed; let me know by PM and I'll give you my email address to send the files to me.

Once the translation is on the test site it should be checked over to make sure that everything is working correctly, and then a squirrel or admin will install it on the main site.

Other documents

If you want to translate other documents besides the proofreading and formatting guidelines that would be fine, as far as I know. You can see a list of all the FAQs here if you want to download an English document in order to translate it. For a handy guide (proofreading or formatting summary), you'll need to download the .sxw file, which is an OpenOffice document. Edit the document, then create a pdf from the file menu.

For developers/squirrels

For the moment this is just a list of reminders to myself, to make sure that I don't forget anything. It could apply to anyone who's putting a translation into their sandbox or a squirrel who receives a translated file and is going to commit it to CVS.

  • Update (or insert if necessary) the comment at the top of the php noting the translator and date.
  • For new translations:
    • Make sure all the other translations have links to the new one
    • Update FAQ Central to include a link to the translation
    • Make sure that the title (inside the php theme line at the top) has the translation; translators often seem to miss this
    • Add a php comment at the top stating the source
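
Since the untranslated title is easy to miss, here's a rough Python sketch of how that check might be automated. It assumes the theme line looks like the example quoted earlier on this page, theme('Proofreading Guidelines','header',$theme_args); the function names are hypothetical and the German title is only an illustration.

```python
import re

# Matches theme('Some Title','header' ... with either quote style.
THEME_LINE = re.compile(r"""theme\(\s*(['"])(.*?)\1\s*,\s*(['"])header\3""")

ENGLISH_TITLES = {"Proofreading Guidelines", "Formatting Guidelines"}

def page_title(php_source):
    """Return the title passed to theme(), or None if no theme line is found."""
    match = THEME_LINE.search(php_source)
    return match.group(2) if match else None

def title_untranslated(php_source):
    """True if the theme line still carries one of the English titles."""
    return page_title(php_source) in ENGLISH_TITLES

print(title_untranslated("theme('Proofreading Guidelines','header',$theme_args);"))
print(title_untranslated("theme('Richtlinien zum Korrekturlesen','header',$theme_args);"))
```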
