PP guide to cross-volume links

From DPWiki

The problem...

You have a multi-volume book that contains links from one volume to another. These may be in the form of references in the text, e.g. "See Volume 1, Chapter 4 for a detailed description", or a Table of Contents or Index that covers all the volumes. When PPing the HTML version of any book, you would normally make hyperlinks for references to pages, chapters, etc. However, there are difficulties when the targets of those links are not in the same volume (more precisely, when those targets are in a text that will be uploaded to PG separately). There are several options available to you, all of which have advantages and disadvantages.

Simplest option

Only hyperlink the references that are within the volume. Leave the cross-volume references unlinked. As well as simplicity, another argument for this approach is that any cross-volume links you make using one of the options below will only work for the HTML versions of your volumes hosted at PG. In all other circumstances your reader will be unable to use the cross-volume links. In particular, the ever-more-popular epub versions will not support the links, nor will harvested copies of the PG library, whether hosted elsewhere online, or written to CD, or downloaded to the reader's computer to view offline. You may want to comment in your TN that only references within this volume are hyperlinked.

It would normally be acceptable to copy an index from the last volume into an earlier volume, and hyperlink the local references within the volume. If you do that, you should mention it in the TN.

Single reference

Perhaps you want to give your readers a little more help finding the other volumes, but do not want to hyperlink all the cross-volume references. You might therefore want to add a Transcriber's Note at the top, something like, "This book was published in two volumes, of which this is the first. The second volume was released as Project Gutenberg ebook #99992, available at https://www.gutenberg.org/ebooks/99992." The problem is, you don't know what number to put in place of 99992, because volume 2 has not yet been uploaded to PG. If you already have Direct Upload access, then contact a friendly WWer when you are close to being ready to upload - the best way to do this is to email the WWers' list (pgww AT lists.pglaf.org). Ask for ebook numbers to be reserved for you for a multi-volume set - they will need to know how many volumes, of course! Then use those ebook numbers in place of 99992 above to refer to the other volumes in your TN.

If you do not have DU access, then your volumes will need to be PPVed. Since there may be some delay before this happens, and the WWers have to manually work around reserved numbers, you should leave the numbers as 99991, 99992, etc. when you upload your volumes for PPV. Leave a note for the PPVer that this is what you have done. When they pick up your projects, they will be able to request the actual numbers from a WWer for you once the project is ready for uploading. You (or they) can then change the 99991-type numbers to the ones that have been allocated, just before the project is uploaded.
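Since the 99991-type placeholders have to be swapped out just before upload, it can be reassuring to check that none have been left behind. The following is only a minimal sketch in Python, not part of any DP tool: the function name find_placeholders is made up for illustration, and it assumes your placeholders are five-digit numbers of the form 9999N, as in the example above.

```python
import re

# Assumption: placeholders look like 99991, 99992, ... (9999 plus one digit).
PLACEHOLDER = re.compile(r"\b9999\d\b")


def find_placeholders(html):
    """Return (line_number, line) pairs that still contain a 9999N placeholder."""
    return [
        (number, line.strip())
        for number, line in enumerate(html.splitlines(), start=1)
        if PLACEHOLDER.search(line)
    ]
```

An empty list means no placeholder numbers were found; anything else shows the lines that still need the allocated ebook numbers.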

Again, you may also want to comment in your TN that only references within this volume are hyperlinked. This option is quite simple and a bit more helpful to the reader, even if they have downloaded an epub version to an ebook reader, but of course it is a bit more work for you, the PPVer, and the WWer.

Full cross-referencing

If you decide to go for full cross-referencing, then as described above, you will need to have ebook numbers reserved for you, either directly from a WWer or via your PPVer. As mentioned above, this must be done at the end of the PPing/PPVing process, so that the reserved numbers are not outstanding for long periods of time.

In order for you to test your links thoroughly before they are uploaded to PG, make a folder for each volume, e.g. vol1, vol2, etc. Have one HTML file (and images folder if needed) in each of these, e.g. vol1/vol1.html and vol1/images, vol2/vol2.html and vol2/images, etc. Now begin to create links...

Where a link is in the same volume, e.g. a reference to page 371 in an index, it might look like this: <a href="#Page_371">371</a> This is the same as you would do with a single volume text, and can be created in the same ways, either by manual editing, Guiguts HTML Markup Internal Link button, or regex wizardry (see elsewhere for helpful regexes).

If the index is in volume 2, but page 25 is in volume 1, then the link might look like this: <a href="../vol1/vol1.html#Page_25">25</a> The path to vol1.html is called a relative path because it describes the location of vol1.html relative to the location of vol2.html. It is important to use a relative path, rather than an absolute path (such as "C:/DP/vol1"), since an absolute path will not work if you move your folders around, or when you send your files to the PPVer (see below).

You can still use the manual editing or the regex method to create these, but you won't be able to use the Guiguts HTML Markup box, since it is not an Internal Link, and the External Link button only links to a file, not to an anchor within a file.
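As an illustration of the regex approach, here is a rough Python sketch that turns index references into either internal or cross-volume links. It rests on assumptions that will not match every book: index entries are assumed to be written like "i. 25" (lower-case roman volume numeral, full stop, page number), the anchors are assumed to be named Page_N, and the folders are assumed to follow the vol1/vol1.html layout described above. The names ROMAN, page_link and link_index_line are invented for this example.

```python
import re

# Assumption: volumes are labelled i, ii, iii in the index.
ROMAN = {"i": 1, "ii": 2, "iii": 3}

# Assumption: a reference looks like "ii. 371" (volume label, ". ", page number).
REF = re.compile(r"\b(i{1,3})\. (\d+)\b")


def page_link(current_vol, target_vol, page):
    """Build the link for a page, using a relative path for other volumes."""
    anchor = f"#Page_{page}"
    if target_vol == current_vol:
        href = anchor  # same volume: ordinary internal link
    else:
        href = f"../vol{target_vol}/vol{target_vol}.html{anchor}"
    return f'<a href="{href}">{page}</a>'


def link_index_line(line, current_vol):
    """Wrap every page number in the line in the appropriate link."""
    def repl(match):
        label, page = match.group(1), match.group(2)
        return f"{label}. " + page_link(current_vol, ROMAN[label], page)
    return REF.sub(repl, line)


print(link_index_line("Amputation, i. 25, ii. 371", current_vol=2))
# Amputation, i. <a href="../vol1/vol1.html#Page_25">25</a>, ii. <a href="#Page_371">371</a>
```

Run it only once over unlinked text, and check the results by eye, since any stray text that happens to look like "i. 25" would also be linked.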

Test your first link - when you view volume 2 in your browser, and click on a link to a page in volume 1, the browser should load volume 1 and jump to the correct page. Then go on to create all the other links, and complete any other work in all the volumes. Of course, you will want to describe the cross-volume links in your TN, and you may want to include a reference to the other volumes, as described in the "Single Reference" option above.

The only editing task left is to convert the relative paths to the absolute ones required by PG. However, if you have a PPVer, it is worth communicating with them again at this point, to see if they would prefer to receive a first upload of the files with the relative paths. This will make it easier for them to test your links during the PPV process.

Once the project has been PPVed and any necessary changes made, the final task is to do a search and replace in all the HTML files. Your PPVer may be happy to do this for you as their last task before uploading to PG, or they may ask you to do it and then check your work. If you have DU access, then of course you will make all these changes yourself - it is not part of the WWers' job.

Assuming your reserved ebook numbers were 77701, 77702, etc, replace all occurrences of

  • "../vol1/vol1.html" with "https://www.gutenberg.org/files/77701/77701-h/77701-h.htm"
  • "../vol2/vol2.html" with "https://www.gutenberg.org/files/77702/77702-h/77702-h.htm"

etc. Note that the extension of HTML files at PG is ".htm", not ".html".
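If you prefer to script the search and replace rather than do it by hand in your editor, something like the following Python sketch could work. It uses the example ebook numbers 77701 and 77702 and the vol1/vol1.html folder layout from this page; the function names are invented for illustration, and it assumes your HTML files are saved as UTF-8. Work on a copy of your project, since convert_project overwrites the files.

```python
from pathlib import Path

# Example reserved numbers from the text: volume number -> ebook number.
EBOOK_NUMBERS = {1: 77701, 2: 77702}


def pg_url(ebook):
    """Absolute address of the HTML file at PG (note the .htm extension)."""
    return f"https://www.gutenberg.org/files/{ebook}/{ebook}-h/{ebook}-h.htm"


def to_pg_links(html, ebook_numbers):
    """Replace each relative cross-volume path with its absolute PG address."""
    for vol, ebook in ebook_numbers.items():
        html = html.replace(f"../vol{vol}/vol{vol}.html", pg_url(ebook))
    return html


def convert_project(project_dir, ebook_numbers):
    """Rewrite volN/volN.html in place for every volume (assumes UTF-8 files)."""
    for vol in ebook_numbers:
        path = Path(project_dir) / f"vol{vol}" / f"vol{vol}.html"
        text = path.read_text(encoding="utf-8")
        path.write_text(to_pg_links(text, ebook_numbers), encoding="utf-8")
```

The anchor part of each link (e.g. #Page_25) is left untouched, so only the path in front of it changes. Internal links such as "#Page_371" contain no relative path and are not affected.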

Note for DUers and PPVers:

  • Only request reserved numbers once all volumes are ready to upload - the reserving process requires manual intervention from the WWers and can therefore occasionally go wrong, as well as potentially being awkward for them to work around for extended periods.

When the projects are being uploaded to PG, make sure you add a note to the WWers:

  • First, mark it for the attention of the WWer who reserved the ebook numbers
  • Second, remind the WWer that this is a multivolume set and you will soon be uploading (or have just uploaded) the other volumes.

Checking links

Some people have had trouble getting the W3C link checker to do cross-volume link checking if they don't have webspace to host their HTML files. Below is my description of how I downloaded and tested Linklint on Windows for a PPer, in case it is helpful for anyone else. I can't guarantee anything about Linklint's efficacy, only that it appears to be free and to do some checking of cross-volume links, albeit with a 15-year-old user interface.


I unzipped the download and it ended up in C:\DP\PP\Tools\PerlTools\linklint-2.3.5\linklint-2.3.5 on my computer. I edited the linklint.bat file using Notepad to look like this (it should all be on one line):

C:/DP/guiguts-win/perl/perl C:/DP/PP/Tools/PerlTools/linklint-2.3.5/linklint-2.3.5/linklint-2.3.5 -error -xref -out links.txt vol1/vol1.html vol2/vol2.html

The command starts with the path to the perl.exe that comes with Guiguts, followed by the path to the linklint-2.3.5 (Perl) file. Note that I have used forward slashes (/) throughout; it seems a bit sensitive about backslashes (\). Also note the last two arguments, vol1/vol1.html and vol2/vol2.html. These are the files that will be checked. You could add vol3, etc. if there are more.

I made folders for two volumes under my project folder which was C:/DP/Surgery, so I had C:/DP/Surgery/vol1 and C:/DP/Surgery/vol2. In the vol1 folder I made a vol1.html which had external links to things like "../vol2/vol2.html#Page_99". In the vol2 folder I made a vol2.html which had links like "../vol1/vol1.html#Chapter_7". Both files also had internal links, like "#Page_27".

To check the links, I copied the edited linklint.bat file from the folder where it unzipped, and pasted it into the Surgery folder (i.e. level with the vol1 and vol2 folders). Then I double-clicked it. It briefly showed a black command prompt window and created a file called links.txt in the Surgery folder, which contains the output from the program.
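If you cannot get Linklint running, a small script can do a similar basic check. The following Python sketch is not Linklint and makes no claim to match its output: it simply collects the id and name anchors in each volume, then reports any relative link whose target file or anchor cannot be found. The names LinkCollector, scan, broken_links and load are invented for this example, it assumes the files are saved as UTF-8, and it ignores web links and links to non-HTML files such as images.

```python
import posixpath
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect anchors (id/name attributes) and link targets (a href)."""

    def __init__(self):
        super().__init__()
        self.anchors = set()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.anchors.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.anchors.add(attrs["name"])
        if tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def scan(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.anchors, collector.hrefs


def broken_links(files):
    """files maps a path like 'vol1/vol1.html' to its HTML text.

    Returns (source_file, href) pairs for links that could not be resolved.
    """
    scanned = {name: scan(text) for name, text in files.items()}
    problems = []
    for name, (_anchors, hrefs) in scanned.items():
        for href in hrefs:
            if "://" in href or href.startswith("mailto:"):
                continue  # web and mail links are not checked here
            target, _sep, fragment = href.partition("#")
            if target:
                folder = posixpath.dirname(name)
                target_name = posixpath.normpath(posixpath.join(folder, target))
            else:
                target_name = name  # "#Page_27" points into the same file
            if target_name not in scanned:
                # Only complain about HTML targets; images etc. are skipped.
                if target.lower().endswith((".html", ".htm")):
                    problems.append((name, href))
                continue
            if fragment and fragment not in scanned[target_name][0]:
                problems.append((name, href))
    return problems


def load(project_dir, names):
    """Read the named files from the project folder (assumes UTF-8)."""
    return {n: (Path(project_dir) / n).read_text(encoding="utf-8") for n in names}
```

With the folder layout described above, calling broken_links(load("C:/DP/Surgery", ["vol1/vol1.html", "vol2/vol2.html"])) would return an empty list if every relative link resolves. Run it while the files still contain relative paths, before they are converted to PG addresses.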