- 1 Put the <html>, <body>, </body>, and </html> tags each on their own line
- 2 Use heading elements for headings, and not for things that are not headings
- 3 Distinguish between purely decorative italics/bold/gesperrt and semantic uses of them
- 4 Do not use tables for things that are not tables
- 5 Do not use pre-formatted text
- 6 Do not use empty tags (with entities) or <br/> elements for vertical spacing
- 7 Be very careful when setting (large) margins
- 8 Include all text as text, not just as images
- 9 Always include a cover image
- 10 If you declare a language, declare the correct language
- 11 Consider if you need different CSS for different media types or mobile formats
- 12 Keep your code line lengths reasonable
- 13 Indent more complex parts of your code to convey their structure
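Put the <html>, <body>, </body>, and </html> tags each on their own line
The WhiteWashers at PG rely on these tags being on separate lines when they insert the PG boilerplate at the top and bottom of the e-book. If you disregard this rule, you create unnecessary extra work for already-overworked people.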
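A minimal sketch of a document in which the <html>, <body>, </body>, and </html> tags each sit on a line of their own. It assumes an HTML5 document; the title and body text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Placeholder Title</title>
</head>
<body>
<h1>Placeholder Title</h1>
<p>First paragraph of the book.</p>
</body>
</html>
```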
Use heading elements for headings, and not for things that are not headings
According to the W3C’s HTML Techniques for Web Content Accessibility Guidelines, headings should be used “to convey document structure” and “H2 elements should follow H1 elements, H3 elements should follow H2 elements, etc.” This means that just by looking at the headings of your document, a user should be able to get an idea of the structure of your document, or a kind of outline or hierarchy of the content.
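As an illustration, the following sketch shows headings that form an outline without skipping levels. The part and chapter names are invented placeholders.

```html
<h1>Book Title</h1>

<h2>Part I</h2>
<h3>Chapter 1</h3>
<p>Text of the chapter.</p>
<h3>Chapter 2</h3>
<p>Text of the chapter.</p>

<h2>Part II</h2>
<h3>Chapter 3</h3>
<p>Text of the chapter.</p>
```

Reading only the heading lines gives the outline: one book, two parts, three chapters.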
See the Case Study on Headings for more information.
Distinguish between purely decorative italics/bold/gesperrt and semantic uses of them
Use <em> for emphasis and <strong> for strong emphasis, no matter what they look like, and style them to match the original using CSS; use <i> for italics and <b> for bold that is purely for show, e.g., on title pages.
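A hedged sketch of the distinction, assuming an original that printed some emphasis as gesperrt (letter-spaced) text. The class names "gesperrt" and "titlepage" are hypothetical and chosen only for this example.

```html
<style>
/* Assumption: the original showed this emphasis as letter-spaced text. */
em.gesperrt { font-style: normal; letter-spacing: 0.2em; }
</style>

<!-- Semantic: the author stresses a word, whatever it looks like in print. -->
<p>I said I would <em>never</em> go back.</p>
<p>He called it <em class="gesperrt">impossible</em>.</p>

<!-- Decorative: title-page styling that carries no emphasis. -->
<p class="titlepage"><i>A Novel in Three Volumes</i></p>
<p class="titlepage"><b>London</b></p>
```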
See the Case Study on Inline Formatting for more information.
Do not use tables for things that are not tables
The <table> element is designed for representing tabular data in an HTML document. It is not meant to be used for, e.g., creating a border around content or centring text.
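The sketch below contrasts the two cases: a border and centring done with CSS, and a real table holding rows and columns of data. The class names and the table contents are invented for illustration.

```html
<style>
.boxed { border: 1px solid black; padding: 0.5em; }
.center { text-align: center; }
</style>

<!-- Border and centring come from CSS; no table is involved. -->
<div class="boxed">
<p class="center">Notice to the Reader</p>
</div>

<!-- A real table: each row relates an item to its price. -->
<table>
<tr><th>Item</th><th>Price</th></tr>
<tr><td>Cloth binding</td><td>2s.</td></tr>
<tr><td>Paper wrapper</td><td>1s.</td></tr>
</table>
```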
See the example in the Case Study on Tables for more information.
Do not use pre-formatted text
It is sometimes suggested that, e.g., poetry is “pre-formatted” text and that it might thus be formatted using either the HTML <pre> element or the “white-space” CSS property set to the value “pre”.
“Pre-formatted” text means that the browser (or e-reader, etc.) has to keep the spaces and line breaks intact exactly as written in the HTML document. This can cause real problems, especially on small screens, where a line that is not allowed to wrap may run off the screen and become inaccessible to the user.
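One possible alternative, sketched here under the assumption that each verse line is wrapped in its own element. The class names are hypothetical, and this is not necessarily the approach taken in the Case Study on Poetry.

```html
<style>
.stanza { margin: 1em 0; }
/* Hanging indent: if a verse line must wrap, the continuation is indented. */
.stanza .line { display: block; padding-left: 2em; text-indent: -2em; }
</style>

<div class="stanza">
<span class="line">First line of the stanza,</span>
<span class="line">second line of the stanza.</span>
</div>
```

Because the lines are ordinary text rather than pre-formatted text, a narrow screen can still wrap a long verse line instead of cutting it off.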
See the Case Study on Poetry for more information on formatting poetry.
Do not use empty tags (with entities) or <br/> elements for vertical spacing
Newbies to HTML often don’t know how to vertically set things off from one another and thus resort to hacks like empty tags (e.g., something like <p> </p>) or rows of several line breaks (<br /><br /><br />). These methods do not adequately reflect what is going on (an empty paragraph or a row of line breaks is not the same thing as vertical spacing above, e.g., a heading) and also needlessly clutter up the code.
The best way of controlling the spacing around elements is to use the “padding” and “margin” CSS properties. For some examples, see the Case Study on Title Pages.
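A small sketch of the CSS approach: the space above and below a chapter heading is set with the “margin” property, so no empty paragraphs or line breaks are needed. The values are arbitrary examples.

```html
<style>
h2 { margin-top: 4em; margin-bottom: 1em; }
</style>

<p>Last paragraph of the previous chapter.</p>
<h2>Chapter 2</h2>
<p>First paragraph of the new chapter.</p>
```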
Be very careful when setting (large) margins
Large margins are often set to approximately centre certain text (e.g., poetry) on the screen. Note that this might not come out correctly on differently-sized screens and with different font sizes even on a desktop browser. More importantly, setting left and right margins of, say, 25% means taking away half of the already-small screen space on handheld devices such as e-readers, tablets and smartphones—and will thus very likely result in hardly-legible text.
Use @media if you need different CSS for different media types, and also see the Case Study on Poetry for alternative ways of handling poetry.
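As one hedged illustration of avoiding large fixed margins, the sketch below limits the width of a block and lets the browser split the remaining space. The class name “poem” and the width are assumptions made for this example.

```html
<style>
/* Centres the block on wide screens; on narrow screens the automatic
   margins shrink to zero, so no screen space is lost. */
.poem { max-width: 30em; margin: 1em auto; }
</style>

<div class="poem">
<p>First line of the poem,<br>
second line of the poem.</p>
</div>
```

Unlike left and right margins of 25%, this does not take away half of the screen on a small device.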
Include all text as text, not just as images
It is perfectly legitimate to include text as an image in order to show what the original looked like, particularly if the text is ornately decorated or printed in an unusual typeface. If you do so, however, replicate these images using text as well so that the content is searchable.
This applies to things like illustrated title pages, passages in Gothic script, advertisements, etc.
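A sketch of how this might look for an illustrated title page: the image shows the original, and the same wording follows as real text. The file path, class names, title and author are placeholders.

```html
<div class="figcenter">
<img src="images/titlepage.jpg" alt="Decorated title page of the original book">
</div>

<!-- The same wording again as real text, so that it can be searched. -->
<div class="titlepage">
<h1>Placeholder Title</h1>
<p>by</p>
<p>Placeholder Author</p>
</div>
```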
Always include a cover image
On some e-readers, the cover is the only or the most obvious way of distinguishing e-books (see this screenshot of Apple’s iBooks for an example). If you do not specify a cover image in your HTML, the ebookmaker conversion software at PG will generate a random patterned image including basic title and author information.
So, in order to make your book recognisable, you need to provide a cover image. The most obvious choice is the actual book cover—if you have an adequate scan of it, and it states at least the book’s title and author. Alternatively, the book’s title page might work, which should include any relevant information. A third option is a custom cover created specifically for your e-book; if you want to go that route, make sure to read our PP guide to cover pages.
If you declare a language, declare the correct language
(X)HTML includes the “lang” and “xml:lang” attributes (depending on the flavour you’re using) to declare the language of an element’s content. They can be used on the <html> element to declare the document’s main language, and on other elements (e.g., <blockquote> for longer quotations in a foreign language, or <i> for italicised foreign phrases) to change the language for a part of the text.
If you specify the language for any part of your document, make sure you specify the correct language. (Ancient Greek is not the same thing as Modern Greek, for example.) The ISO 639-2 Language Code List can help with determining the code for a language.
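The following sketch assumes an HTML5 document, which uses “lang” only; in XHTML the “xml:lang” attribute would carry the same value. The sentences are invented examples.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Placeholder Title</title>
</head>
<body>
<p>He muttered <i lang="la">tempus fugit</i> and left.</p>
<blockquote lang="fr">
<p>Une longue citation en français.</p>
</blockquote>
<!-- "grc" is Ancient Greek; Modern Greek would be "el". -->
<p><span lang="grc">γνῶθι σεαυτόν</span></p>
</body>
</html>
```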
Consider if you need different CSS for different media types or mobile formats
If you want your document to be displayed differently depending on whether it is viewed on a computer screen, printed, or on a handheld device, you can use CSS @media declarations or special classes supported by ebookmaker.
The Introduction to @media for DP and PG Use explains how this is done, the Introduction to Mobile Formats tells you what will happen once your HTML file is converted to mobile formats, and the Case Study on Media Types and Mobile Formats gives some examples.
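A minimal sketch of a CSS @media declaration, assuming a hypothetical “pagenum” class for page numbers in the margin. It uses only the standard “print” media type; the special classes supported by ebookmaker are not shown here.

```html
<style>
/* Default rule, used on screen. */
.pagenum { position: absolute; right: 1%; font-size: small; color: gray; }

/* When the document is printed, hide the original page numbers. */
@media print {
  .pagenum { display: none; }
}
</style>

<p><span class="pagenum">12</span>Text of the paragraph.</p>
```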
Keep your code line lengths reasonable
When writing the HTML code, keep your lines at a reasonable length. Long lines will make it a lot harder to read and understand your code (even for yourself). The two best options to keep lines at a reasonable length are to either keep the lines as they are wrapped in the book, or to keep them as you have wrapped them in the text version. Use whichever option better fits your workflow.
Indent more complex parts of your code to convey their structure
Nested HTML code can get very hard to read if its indentation does not match its structure. As a courtesy to the people who might look at your code later on (your PPVer, the WhiteWashers at PG, the PG Errata Team, and possibly readers of the e-book), indent complex parts of your code to make them more readable.
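For instance, a small table might be indented as in the sketch below, with each level of nesting moved in by two spaces. The contents are placeholders, and the indentation width is a matter of taste.

```html
<table>
  <thead>
    <tr>
      <th>Chapter</th>
      <th>Page</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>The Arrival</td>
      <td>1</td>
    </tr>
    <tr>
      <td>The Departure</td>
      <td>27</td>
    </tr>
  </tbody>
</table>
```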
The Case Study on Tables contains some examples of neatly-indented tables.