Regular Expressions

10. Predefined Classes

A couple of tutorials back we looked at character classes, which are defined using []. Any combination of alternative characters can be written that way, but a few classes are so common that a shorter notation was developed to stand in for them.

The most obvious character class that deserves (and has) a shorter alternative is the one that tests for a digit. Instead of specifying [0-9] we can simply specify \d, which means the same thing. The backslash (slosh) in front of the d escapes its regular meaning as a lowercase 'd' character and gives it the alternative meaning of any digit (i.e. 0 through 9).
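As a rough illustration of that equivalence, here is a minimal sketch in Python. The choice of Python and the sample text are assumptions made for this example, not part of the tutorial. Python's re module lets \d match other Unicode digits by default, so the sketch passes re.ASCII to restrict \d to 0 through 9 as described above.

```python
import re

# re.ASCII restricts \d to the characters 0-9, matching the definition in the text.
DIGIT_SHORT = re.compile(r"\d", re.ASCII)
DIGIT_LONG = re.compile(r"[0-9]")

sample = "Room 42, floor 7"  # hypothetical sample text

short_matches = DIGIT_SHORT.findall(sample)
long_matches = DIGIT_LONG.findall(sample)

print(short_matches)                  # ['4', '2', '7']
print(short_matches == long_matches)  # True

# Without the backslash, d is just the lowercase letter d.
plain_d = re.findall(r"d", "add 3")
print(plain_d)                        # ['d', 'd']
```

Both patterns find the same three digits, while the unescaped d only ever matches the letter itself.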

There are a number of additional predefined classes that you will find useful.

  • \D any non-digit ~ equivalent to [^0-9]
  • \w any normal word character (letters, numbers, and underscore) ~ equivalent to [a-zA-Z0-9_]
  • \W any non-word character (anything except letters, numbers, and underscore) ~ equivalent to [^a-zA-Z0-9_]
  • \s any whitespace character (spaces, tabs, line feeds, carriage returns, and form feeds)
  • \S anything except whitespace characters
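The classes in the list above can be tried out with a short sketch such as the following. It assumes Python's re module, and both the helper name find_all and the sample strings are made up for illustration. The re.ASCII flag is used so that \w and \W follow the ASCII-only definitions given in the list.

```python
import re


def find_all(pattern, text):
    """Hypothetical helper: return every match of pattern in text, ASCII rules only."""
    return re.findall(pattern, text, re.ASCII)


non_digits = find_all(r"\D+", "a1b22c")
words = find_all(r"\w+", "user_1 logged-in!")
non_words = find_all(r"\W+", "user_1 logged-in!")
spaces = find_all(r"\s", "a b\tc\nd")
non_spaces = find_all(r"\S+", "a b\tc\nd")

print(non_digits)   # ['a', 'b', 'c']
print(words)        # ['user_1', 'logged', 'in']
print(non_words)    # [' ', '-', '!']
print(spaces)       # [' ', '\t', '\n']
print(non_spaces)   # ['a', 'b', 'c', 'd']
```

Each uppercase class picks up exactly the pieces its lowercase partner leaves behind, which is an easy way to remember the pairs.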

All of these predefined classes use the same format of escaping a letter of the alphabet to give that letter a special meaning. There is one more predefined class that has a special meaning without being escaped (and which therefore needs to be escaped to give it back its normal character meaning). That character is the period (.), which will match any character except for a line feed or carriage return.
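The difference between the bare and the escaped period can be seen in a small sketch like this one, again assuming Python and made-up sample strings. Exactly which line-ending characters the period refuses to match varies between regular expression flavors: Python's re module excludes only the line feed by default, while JavaScript, for example, excludes the carriage return as well. The sketch therefore only checks the line feed case.

```python
import re

ANY_CHAR = re.compile(r"a.c")      # unescaped period: any character except a line break
LITERAL_DOT = re.compile(r"a\.c")  # escaped period: only an actual "."

any_matches_letter = ANY_CHAR.fullmatch("abc") is not None
any_matches_dot = ANY_CHAR.fullmatch("a.c") is not None
any_matches_newline = ANY_CHAR.fullmatch("a\nc") is not None
literal_matches_letter = LITERAL_DOT.fullmatch("abc") is not None
literal_matches_dot = LITERAL_DOT.fullmatch("a.c") is not None

print(any_matches_letter)      # True
print(any_matches_dot)         # True
print(any_matches_newline)     # False
print(literal_matches_letter)  # False
print(literal_matches_dot)     # True
```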

©2014 About.com. All rights reserved.