User:Lvl/Regex course

From DPWiki

A small course on regular expressions

What is it?

A "regular expression" (often abbreviated "regex") is a way of describing patterns in text strings.

It serves two purposes:

  • to search for text matching a given pattern (e.g. "find a comma followed by three digits")
  • to replace that text with other text, possibly reusing parts of the original text in the replacement string (e.g. "replace the comma with a dot, and keep the digits")
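
The two examples above (find a comma followed by three digits; replace the comma with a dot and keep the digits) can be sketched as follows. This is only an illustration, assuming Python's standard re module rather than any of the tools discussed in this course; the function names find_matches and comma_to_dot are made up for the example. The pattern itself uses only basic constructs, so it should look much the same in other regex flavours.

```python
import re

# A comma followed by exactly three digits.
# The parentheses capture the digits so they can be reused in the replacement.
PATTERN = re.compile(r",([0-9]{3})")


def find_matches(text):
    """Search: return every 'comma + three digits' substring found in text."""
    return [m.group(0) for m in PATTERN.finditer(text)]


def comma_to_dot(text):
    """Replace: turn the comma into a dot and keep the captured digits (group 1)."""
    return PATTERN.sub(r".\1", text)


if __name__ == "__main__":
    print(find_matches("1,234 and 5,67"))
    print(comma_to_dot("1,234,567"))
```

Note that the pattern does not check what follows the three digits, so a comma followed by four digits would also match on its first three.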


Variants

Regular expressions have been in use for decades, and several variants have emerged over time. We shall discuss the most common variant (the so-called "extended regular expressions", or ERE), which is supported by the widest range of software; then, in another lesson, we shall look at some extensions introduced by the Perl programming language (and therefore supported in Perl-based tools such as guiguts).


Software

What software supports regular expressions? Quite a lot, including these most obvious places:

  • many text editors
  • PP tools like guiguts
  • the search/replace function in the proofreading interface

It is not imperative to try things out in a software tool while learning, but it helps a lot.


Table of contents

Chapter 1
Basic concepts, with lots of examples, avoiding overly difficult problems.
Chapter 2
More complex examples, diacritics, details.
Chapter 3
Extensions: non-greedy ("sober") quantifiers; assertions; multi-line.


Links

TODO