User:Mairis/Workflow

From DPWiki

Work in progress...

Project Setup

Software

Reference Folder

  • \clearance
  • \illo_jpg
  • \illo_tifs
  • \img_jp2
  • \img_png
  • \img_raw_tifs
  • \img_ref
  • \img_split
  • \img_st
  • \scantailor
  • \text
  • \textw
  • \upload
  • notes.txt

Preparation

Check Suitability

  1. Check the work is in the public domain
    • USA published ≤ 1929
    • UK author's death ≤ 1954
  2. Search the In Progress List
  3. Compare sources for best scans:
    • Colour preferred over b&w
    • Good quality scans
      • Clear, legible print
      • High-quality illustrations (600 dpi)
    • Complete
      • No missing pages
      • Includes all illustrations listed
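
The two date checks in step 1 can be written down as a small helper. This is only a sketch: the function name and dictionary keys are hypothetical, the cutoff years are copied from the list above, and because the list does not say how the two checks combine, each result is reported separately.

```python
# Cutoff years copied from the checklist above. Both move forward over time,
# so treat them as values to update rather than fixed constants.
US_PUBLICATION_CUTOFF = 1929
UK_AUTHOR_DEATH_CUTOFF = 1954


def date_checks(publication_year, author_death_year):
    """Return each checklist comparison separately (hypothetical helper)."""
    return {
        "us_published_by_cutoff": publication_year <= US_PUBLICATION_CUTOFF,
        "uk_author_died_by_cutoff": author_death_year <= UK_AUTHOR_DEATH_CUTOFF,
    }
```

For example, `date_checks(1873, 1900)` reports both comparisons as true. The clearance request still decides whether the work can be used.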

Create Project Folder

  1. Make a copy of reference folder
  2. Rename to book title
  3. Download images from source
    1. Download the raw JP2 files (zip or tar)
    2. Extract to \img_jp2
    3. Delete blank pages before/after book content
  4. Prep the TP&V (title page and verso) images and save to \clearance
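
As an alternative to copying the reference folder by hand, the empty skeleton could be generated by a script. The sketch below is not part of the workflow itself: `create_project_folder` is a hypothetical helper, and the subfolder names are taken from the Reference Folder list.

```python
from pathlib import Path

# Subfolder names copied from the Reference Folder list on this page.
SUBFOLDERS = [
    "clearance", "illo_jpg", "illo_tifs", "img_jp2", "img_png",
    "img_raw_tifs", "img_ref", "img_split", "img_st", "scantailor",
    "text", "textw", "upload",
]


def create_project_folder(parent, book_title):
    """Create <parent>/<book_title> with the reference subfolders and notes.txt."""
    project = Path(parent) / book_title
    for name in SUBFOLDERS:
        # exist_ok=True makes the function safe to run again on the same folder.
        (project / name).mkdir(parents=True, exist_ok=True)
    (project / "notes.txt").touch(exist_ok=True)
    return project
```

Calling `create_project_folder("D:/dp", "Example Book")` (both arguments are placeholders) would leave an empty project folder ready for the downloaded JP2 files.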

Clearance Request

  1. Fill out a clearance request
    • Upload the TP&V images
    • Missing information can be found at WorldCat or national libraries.
    • Include explanations and links to sources
  2. Submit and wait for approval
    • Usually takes a few days to be reviewed
    • Receive an email with the results

Image Prep

Keep notes.txt open while you work and record anything that might be useful.

Convert JP2 files to TIF

  1. Open a JP2 image in XnView
  2. Browse (folder icon)
  3. Select all images
  4. Batch convert (icon)
  5. Select tif.xbs preset
  6. Convert
  7. Close XnView

Renumber TIF files

  1. Open \img_raw_tifs in Bulk Rename Utility
  2. Rename all pages prior to page 1 using "front matter" preset
    • Add > prefix > 000
    • Number > insert > at 3
    • Type > a-z
    • Remove > last 3
  3. Rename remaining pages using "renumber" preset
    • Numbering
      • Mode: prefix
      • Pad: 3
    • Remove > last 3
  4. Check last page; the page number and file number should match
  5. Close Bulk Rename Utility
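
The effect of the two presets can be previewed with a short script before renaming anything. This sketch rests on assumptions that the steps above do not spell out: front matter ends up as `000a`, `000b`, and so on, body pages as `001`, `002`, and so on, and there are no unnumbered plates inside the body. `plan_renumbering` is a hypothetical helper that only returns the planned names and does not touch any files.

```python
import string
from pathlib import PurePath


def plan_renumbering(filenames, front_matter_count):
    """Return (old_name, new_name) pairs in scan order.

    Assumed scheme: front matter -> 000a, 000b, ...; body pages -> 001, 002, ...
    """
    ordered = sorted(filenames)
    if not 0 <= front_matter_count <= len(ordered):
        raise ValueError("front_matter_count must be between 0 and the file count")
    if front_matter_count > len(string.ascii_lowercase):
        raise ValueError("this sketch only handles up to 26 front matter pages")

    plan = []
    for index, old in enumerate(ordered):
        suffix = PurePath(old).suffix  # keep the original extension, e.g. ".tif"
        if index < front_matter_count:
            stem = "000" + string.ascii_lowercase[index]
        else:
            stem = f"{index - front_matter_count + 1:03d}"
        plan.append((old, stem + suffix))
    return plan
```

With this scheme the last planned name should carry the same number as the last printed page, which is the check in step 4.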

Scan Tailor

  1. Open ST and create new project
    • Input: \img_raw_tifs
    • Output: \img_st
    • Select all images except cover
    • Set 600dpi for all pages
    • Save to \scantailor (save periodically as you work)
  2. Fix orientation
    1. Rotate first page the right way
    2. Apply to > every other page
    3. Do the same with the next page (opposite direction)
  3. Select content
    1. Press arrow button for auto
    2. Click ‘beep when finished’ and wait
    3. Arrange by height and check top and bottom images
      • Resize box as necessary until the pages are more or less the same size
      • ‘Remove content box’ from empty pages
    4. Repeat for arrange by width
    5. Scan image thumbnails and alter anything that doesn’t look right
  4. Margins
    1. Set all margins to 5.0 > apply to all pages
    2. Edit alignment of pages
      1. Order by height and focus on the top of the list
      2. Title page, dedication, etc. should be centred
      3. Set chapters that start midway down the page to bottom alignment
  5. Output
    1. Set output resolution to 600 > apply to all pages
    2. Black and white > apply to all pages
    3. Set despeckling to none > apply to all pages
    4. Select title page and press the arrow to apply to all pages
    5. Select ‘beep when finished’ and wait

Split Pages

  1. Index
    1. Start a new project in Scan Tailor
      1. Input > \img_raw_tifs
      2. Select pages that need to be split
      3. Output > \img_st_index
    2. Follow the same steps above, except:
      1. Split the page in half > apply to all and adjust
      2. Include the header only in the first image, crop to exclude in all others
    3. Run output
    4. Rename files using Bulk Rename
      1. Add “r_” prefix
      2. Replace “_1L” with “a”
      3. Replace “_2R” with “b”
    5. Move the reference images from \img_st to \img_ref
    6. Copy the split files from \img_st_index to \img_st
    7. Open the first split file and the first reference file in Paint
      1. Copy the full title onto the split page and save
  2. Change image format
    1. Open XnView
    2. Tools > Batch convert
      1. Input:
        1. Add TIF files from \book\img_st  
      2. Output
        1. Output folder: \img_png
        2. Filename: {Filename}
        3. Format: PNG
      3. Click ‘convert’ and wait
    3. Batch convert again
      1. Actions
        1. Resize shortest side to 1000px
          1. (Resize in a separate pass: if the format and size are changed at the same time, the output size comes out wrong)
    4. In the file explorer, sort the images and

Text Prep

gImageReader

  • Open \img_st
  • Select all TIF files > Recognise all English > Batch mode…
  • Leave options boxes unticked and click OK
  • When finished, close gImageReader

Guiprep

  1. Add all txt files from \img_st to \textw in the book folder
  2. Change directory > \book
  3. Process Text
    1. Leave ‘rename txt files’ unchecked
    2. Leave ‘convert to ISO 8859-1’ unchecked
    3. Leave ‘fix old English’ unchecked (unless necessary)
    4. Click ‘Start processing’ and wait.
  4. Check headers/footers and remove if necessary
  5. Close Guiprep
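
Step 1 can also be done with a few lines of Python. This is a sketch under two assumptions: "add" is read as copy (the originals stay in \img_st), and only files directly inside the source folder are wanted. `copy_txt_files` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def copy_txt_files(source_dir, target_dir):
    """Copy every .txt file from source_dir into target_dir; return the copied names."""
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for txt in sorted(source.glob("*.txt")):
        shutil.copy2(txt, target / txt.name)  # copy2 also keeps the timestamps
        copied.append(txt.name)
    return copied
```

Comparing the length of the returned list with the number of TIF files is a quick way to spot a page that OCR skipped.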

Notepad++

  1. Use Notepad++ to remove any tabs in the text files (this step may be unnecessary)
    1. Search > Find in Files
    2. Find what: \t
    3. Replace with: (one space)
    4. Directory: \text
    5. Search mode: Extended
    6. Click "Replace in Files" and OK when the window pops up "Are you sure?"
    7. Close Notepad++
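
The same replacement can be scripted. The sketch below assumes the text files are UTF-8 or another ASCII-compatible encoding and sit directly in the folder (subfolders are not searched). `replace_tabs` is a hypothetical helper, not a Notepad++ feature.

```python
from pathlib import Path


def replace_tabs(folder):
    """Replace each tab with one space in every .txt file; return changed file names."""
    changed = []
    for txt in sorted(Path(folder).glob("*.txt")):
        data = txt.read_bytes()
        if b"\t" in data:
            # Working on bytes leaves the encoding and line endings untouched
            # (valid for ASCII-compatible encodings such as UTF-8).
            txt.write_bytes(data.replace(b"\t", b" "))
            changed.append(txt.name)
    return changed
```

The returned list shows which pages contained tabs, which may help decide whether the step is worth keeping.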

Illustration Prep

  1. MOVE cover to folder illo_tifs
  2. COPY the title page and other illos to same folder
  3. Open Scan Tailor
  4. New project > open illo_tifs
  5. Crop and rotate
  6. Save project to same folder

Change illustration format

  1. Select files from \book\illo_tifs\out
  2. Leave all Actions unchecked
  3. Output
    1. Folder: \illo_jpg
    2. Filename: {Filename}
    3. Format: JPG
    4. Convert

Create ZIP

  1. Copy images, text, and illustrations (and ref images) into \book\book
  2. Zip this folder
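
Step 2 can be done with Python's standard library. In this sketch `zip_folder` is a hypothetical helper, and the archive is assumed to contain the folder itself rather than only its contents; adjust `base_dir` if the upload expects files at the top level.

```python
import shutil
from pathlib import Path


def zip_folder(folder):
    """Create <folder>.zip next to the folder and return the archive path."""
    folder = Path(folder).resolve()
    archive = shutil.make_archive(
        base_name=str(folder),        # archive path without the .zip extension
        format="zip",
        root_dir=str(folder.parent),  # paths inside the zip are relative to this
        base_dir=folder.name,         # only this folder is added
    )
    return Path(archive)
```

For \book\book this would produce \book\book.zip, ready for the upload step.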

Upload

Create Project

  1. Create a new project on DP
    1. PM tab > Create project
    2. Fill in the information (it should match the clearance request)
      1. Include publishing date in title: [1873]
      2. Author: Surname, First Name
      3. Add extra character sets if necessary
      4. Copy and paste standard format to project comments
      5. Fill in information (links to author wiki, information about context of book, things to look out for)
    3. Add project comments for proofreaders
      1. Are there other languages included? Do they have WordCheck dictionaries available?
      2. Have extra character sets or special characters been added?
      3. Are there characters not available in Unicode that need special handling?
      4. Is there anything that might make the characters difficult to read? (fading, ink blots)
  2. Upload files to DP
    1. Upload the zip file to the file manager
    2. On project page, enter zip file location in the Add/Replace field
    3. Delete the files from the file manager
  3. Update the word lists
    1. On project page, click ‘edit project words list’
  4. Review project page
    1. Project Quick Check
    2. Check images
  5. Message dp-format for advice on what to include in the notes for formatters
  6. Change state of project from “New Project” to “Proofreading Round 1: Unavailable”

Release the Project

  1. When dp-format provides feedback:
    1. If there are no notes, proceed to release the project
    2. If there are notes:
      1. Write the notes for formatters and save in a message draft
      2. Add a hold in F1 Waiting
  2. Release into P1
    1. Change project status to “Proofreading Round 1: Waiting for Release”
  3. Monitor project
    1. Review suggestions and keep the good/bad word lists up to date
    2. Answer questions in the project thread
    3. Search the concatenated text file for proofers’ notes
    4. Add any salient info to the project comments
  4. When the project reaches F1 Waiting:
    1. Add the formatting notes to the project comments
    2. Remove the hold
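
The search for proofers' notes in step 3 can be scripted. The sketch assumes notes follow the `[**note text]` convention from the DP proofreading guidelines and that each note sits on a single line; `find_proofer_notes` is a hypothetical helper.

```python
import re

# Assumption: notes look like [**text], per the DP proofreading guidelines.
NOTE_PATTERN = re.compile(r"\[\*\*[^\]]*\]")


def find_proofer_notes(text):
    """Return (line_number, note) pairs for every [**...] note in the text."""
    found = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in NOTE_PATTERN.finditer(line):
            found.append((line_number, match.group(0)))
    return found
```

Passing in the contents of the downloaded concatenated text file gives a list to work through when updating the project comments. Notes that run across a line break would be missed by this pattern.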

Notes

See also

Official Docs

User Guides