What is a Pattern?

The main purpose of regular expressions is to perform advanced pattern matching on text strings. A pattern defines rules so that a computer can recognize a sequence of characters.

Let's take, for example, a pattern that matches a 7-digit phone number.

$test = '867-5309';
$pattern = '|^[0-9]{3}-[0-9]{4}$|';
$isphone = preg_match($pattern, $test);
echo 'result = ' . $isphone;
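
The value returned by preg_match() is worth a closer look: it is 1 when the pattern matches, 0 when it does not, and false if an error occurred. The following is only a minimal sketch of how the result might be checked; the sample strings and messages are made up for illustration.

```php
<?php
// Same pattern as in the example above.
$pattern = '|^[0-9]{3}-[0-9]{4}$|';

// Illustrative inputs (assumptions, not from the original page).
$tests = ['867-5309', '8675309'];

foreach ($tests as $test) {
    $result = preg_match($pattern, $test);

    if ($result === 1) {
        echo $test . " looks like a 7-digit phone number\n";
    } elseif ($result === 0) {
        echo $test . " does not match\n";
    } else {
        // preg_match() returns false when the pattern cannot be evaluated.
        echo "the pattern could not be evaluated\n";
    }
}
```

Comparing with === keeps a non-match (0) distinct from an error (false).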

A pattern like this may seem daunting at first if you are new to regular expressions. Let's break down the individual components and see how this pattern works.

  • Pipe (|) marks the beginning and the end of the pattern. It acts as the delimiter: the text between the two pipes is the pattern we want to match. PHP accepts other delimiter characters as well; the forward slash (/) is a common choice.
  • Caret (^) means we want to match the beginning of the string. Without this, the pattern can start matching at any character in the string. With this, the pattern CAN ONLY match at the start of the string.
  • Brackets ([ and ]) indicate that we are defining a set or range of characters. By itself, this type of set will only match one character (belonging to the set or range).
  • Hyphen (-) when used within brackets indicates that we are defining a range of characters. So when we say [0-9], what we mean is any character between 0 and 9 (meaning any digit).
  • Braces ({ and }) give the exact number of occurrences that the preceding element should match. So [0-9]{3} matches exactly 3 digits in a row.
  • Hyphen (-) when used outside of brackets just matches a normal hyphen character.
  • Dollar Sign ($) matches the end of the search text (by default it also matches just before a single trailing newline). By including caret (^) and dollar sign ($), we are telling the pattern matcher that the pattern must match the entire search string.
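
To see what the caret and dollar sign contribute, it can help to run the same pattern with and without them. The sketch below is only an illustration; the variable names and sample inputs are assumptions, not part of the original example.

```php
<?php
// The pattern from above, and a copy with ^ and $ removed.
$anchored   = '|^[0-9]{3}-[0-9]{4}$|';
$unanchored = '|[0-9]{3}-[0-9]{4}|';

// Illustrative inputs. The last one ends with a newline character.
$inputs = ['867-5309', 'call 867-5309 now', '8675-309', "867-5309\n"];

foreach ($inputs as $input) {
    $a = preg_match($anchored, $input);
    $u = preg_match($unanchored, $input);
    // json_encode() is used only to make the trailing newline visible.
    echo json_encode($input) . ": anchored=$a unanchored=$u\n";
}
```

Only the unanchored pattern finds the number inside 'call 867-5309 now'. The last input shows that $ by default also accepts a single trailing newline, so the anchored pattern still reports a match there.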

So the pattern we have made will only match a 7-digit phone number with a hyphen between the first 3 digits and the last 4 digits. The set [0-9] is so common that a special shorthand exists for it (\d). You could also do away with the brace notation and just repeat this shorthand the required number of times, like:

$pattern = '|^\d\d\d-\d\d\d\d$|';

For simple patterns, this repeated form can be easier to read. We leave it up to you to decide which is best for your purposes.
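
If you want to convince yourself that the two spellings behave the same, a quick comparison along these lines may help. This is a rough sketch; the variable names and sample strings are made up for illustration.

```php
<?php
// The brace form and the repeated \d form of the same pattern.
$withBraces = '|^[0-9]{3}-[0-9]{4}$|';
$withDigits = '|^\d\d\d-\d\d\d\d$|';

// Illustrative inputs: one valid number and a few near misses.
$samples = ['867-5309', '8675309', '867-530', 'abc-defg'];

foreach ($samples as $sample) {
    $same = preg_match($withBraces, $sample) === preg_match($withDigits, $sample);
    echo $sample . ': ' . ($same ? 'same result' : 'different result') . "\n";
}
```

Because the patterns are written in single quotes, PHP passes \d through to the regular expression engine unchanged.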

 
pattern_modifiers.txt · Last modified: Apr 18, 2008 - 2:01pm (external edit)