Regex in formatting
Regular expressions (regex) may assist with repetitive tasks in the formatting rounds, especially when there is a sequence of pages each requiring similar formatting.
Caution
Regex reduces the manual select-and-press work of formatting, but it does not automate the task. After running a regex, check the page in detail to confirm that it has had the desired effect and only that effect, and to identify any other formatting that is still needed.
How to use
- Bring up the search/replace window
- Type or copy code into the Search and Replace boxes
- Tick the Regular Expression box
- Press Replace All
If the regex goes wrong, there is an Undo button next to the Replace All button.
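The steps above can be rehearsed on a copy of the page text before touching the real page. The following is a minimal Python sketch and not part of the formatting interface: the helper name `replace_all` is hypothetical, and it assumes that `$1`-style references in the Replace box correspond to Python's `\g<1>`.

```python
import re

def replace_all(search, replace, text):
    """Mimic Replace All: return (new_text, number_of_replacements)."""
    # Assumption: "$1" in the Replace box means "group 1"; Python spells this "\g<1>".
    py_replace = re.sub(r"\$(\d+)", r"\\g<\1>", replace)
    return re.subn(search, py_replace, text)

page = "See cf. Smith, etc."
new_page, count = replace_all(r"(etc\.|Ib\.|e\.g\.|i\.e\.|cf\.)", "<i>$1</i>", page)
print(count)      # 2
print(new_page)   # See <i>cf.</i> Smith, <i>etc.</i>
```

A count that is higher or lower than expected is an early sign that the pattern is matching something unintended.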
Specimens of regex
Inline formatting in general text
Italicize a set of frequent abbreviations (example)
- Search
(etc\.|Ib\.|e\.g\.|i\.e\.|cf\.)
- Replace
<i>$1</i>
Italicize stand-alone letters (maths/science texts)
- Search
(^|[\s(])([B-Zb-z])([,.;:\-\n ])
- Replace
$1<i>$2</i>$3
Inline formatting in lists and works of reference
Italicize a single letter followed by a full stop and a space (add a space at the end of the text in both boxes)
- Search
([a-z]\.)
- Replace
<i>$1</i>
Bold first word in paragraph
- Search
(\n\n)([A-Za-z.,]+)
- Replace
$1<b>$2</b>
Indexes
Rejoin consecutive lines (unwrap)
- Search
(.)\n(.)
- Replace
$1 $2
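To show what the unwrap pattern does to paragraph breaks, here is a small Python sketch (an illustration only, with made-up index text; Python writes the `$1` reference as `\1`). Because `.` does not match a line break, a blank line between entries survives while wrapped lines are joined.

```python
import re

index_text = "Abbey, ruins of,\n12, 45\n\nAcorns, 7"
# (.)\n(.) needs a character other than a line break on both sides,
# so the blank line (two consecutive line breaks) is left alone.
unwrapped = re.sub(r"(.)\n(.)", r"\1 \2", index_text)
print(repr(unwrapped))  # 'Abbey, ruins of, 12, 45\n\nAcorns, 7'
```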
Line throw after a semicolon, with an indent on the next line. Keep pressing Replace All until all semicolons have been processed. Note: this does not work on the last line of the page.
- Search
; ([^\n]+)(\n)
- Replace
;$2 $1$2
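The instruction to keep pressing Replace All amounts to repeating the substitution until the text stops changing. A minimal Python sketch of that loop follows, for illustration only: the function name `split_at_semicolons` is hypothetical, `$1` is written as `\1`, and a single space is used as the indent, which is an assumption about the intended indent width.

```python
import re

def split_at_semicolons(text):
    """Repeat the substitution until a pass changes nothing."""
    while True:
        # Each pass splits every line at its first "; ".
        new_text = re.sub(r"; ([^\n]+)(\n)", r";\2 \1\2", text)
        if new_text == text:      # nothing left to replace
            return text
        text = new_text

entry = "Apples, 1; pears, 2; plums, 3\n"
print(repr(split_at_semicolons(entry)))
# 'Apples, 1;\n pears, 2;\n plums, 3\n'
```

In this sketch a final line with no line break after it is left untouched, because the pattern requires a line break at the end of the match.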
Close up sub-entries starting with a lower-case letter
- Search
\n([a-z])
- Replace
$1
Single line throw before a line ending with a page number, with an indent on that line. Note: this does not work on the last line of the page.
- Search
\n([^\n]+)([0-9])\n
- Replace
$1$2\n
Replace a row of dots & spaces before a number by comma & 1 space
- Search
([A-Za-z])[\. ]+([1-9])
- Replace
$1, $2
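A short Python sketch of this replacement, for illustration only (the function name `leaders_to_comma` is hypothetical, and Python writes `$1` as `\1`). The second sample line is made up to show why the page needs checking afterwards: the pattern cannot tell a row of dots from an ordinary abbreviation followed by a number.

```python
import re

def leaders_to_comma(line):
    # A letter, then any run of dots and spaces, then a digit 1-9.
    return re.sub(r"([A-Za-z])[\. ]+([1-9])", r"\1, \2", line)

print(leaders_to_comma("Abbey . . . . . 12"))   # Abbey, 12
print(leaders_to_comma("See vol. 2"))           # See vol, 2
```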
Contents
Small caps on a new line up to a punctuation mark
- Search
(\n\n)([A-Za-z:;., ]+)
- Replace
$1<sc>$2</sc>
Replace a row of dots & spaces by six spaces (Put six spaces in the Replace box)
- Search
[\. ]{3,}
- Replace
Lists
Delete blank lines between list items to match requirements of formatting guidelines
- Search
(\S)(\n)\n(\S)
- Replace
$1$2$3
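The effect on a short list can be sketched in Python as follows (illustration only, with made-up list items; `$1` is written as `\1`). The pattern removes exactly one blank line between two non-blank lines, so a larger gap of two or more blank lines is not closed up.

```python
import re

items = "Flour\n\nSugar\n\nButter"
# One blank line between items is removed.
closed_up = re.sub(r"(\S)(\n)\n(\S)", r"\1\2\3", items)
print(repr(closed_up))  # 'Flour\nSugar\nButter'

# Two blank lines in a row do not match and are kept.
wide_gap = re.sub(r"(\S)(\n)\n(\S)", r"\1\2\3", "Flour\n\n\nSugar")
print(repr(wide_gap))   # 'Flour\n\n\nSugar'
```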
Italicize "s." and "d." in price lists
- Search
([0-9])([sd])\.
- Replace
$1<i>$2.</i>
Italicize first word on every line that is followed by a comma. Used for a glossary.
- Search
(\n)(\w+?),
- Replace
$1<i>$2</i>,
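A small Python sketch of the glossary pattern, for illustration only (the glossary text is made up, and `$1` is written as `\1`). It shows two cases to look for when checking the page: in this sketch the first line is skipped because no line break precedes it, and an entry whose headword is more than one word is skipped because `\w` does not match a space.

```python
import re

glossary = (
    "abacus, a counting frame\n"
    "ad hoc, for this purpose\n"
    "zenith, the highest point"
)
marked = re.sub(r"(\n)(\w+?),", r"\1<i>\2</i>,", glossary)
print(marked)
# abacus, a counting frame
# ad hoc, for this purpose
# <i>zenith</i>, the highest point
```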
Tables
Insert a vertical bar | in front of each number on the page
- Search
( )([0-9\-\.,%·]+)
- Replace
$1|$1$2
Insert a vertical bar | in front of content with 2 or more spaces before it
- Search
(.)([ ]{1,80}) (.)
- Replace
$1$2| $3
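The following Python sketch illustrates the two-or-more-spaces rule on one made-up table row (illustration only; `$1` is written as `\1`). The last space of each gap is replaced by a bar followed by a space, and single spaces inside a cell are left alone.

```python
import re

row = "Wheat   1,200   15%"
with_bars = re.sub(r"(.)([ ]{1,80}) (.)", r"\1\2| \3", row)
print(with_bars)  # Wheat  | 1,200  | 15%

# A single space (as in "New York") is not treated as a column gap.
print(re.sub(r"(.)([ ]{1,80}) (.)", r"\1\2| \3", "New York   5"))
# New York  | 5
```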
Please add more here
[Description]
- Search
- Replace
See also
- Regex Cookbook (mainly for post-processing)
- Regex course