Character classes - JavaScript | MDN

Character classes

Character classes distinguish between kinds of characters, such as letters and digits.

Try it

```js
const chessStory = "He played the King in a8 and she moved her Queen in c2.";
const regexpCoordinates = /\w\d/g;
console.log(chessStory.match(regexpCoordinates));
// Expected output: Array ['a8', 'c2']

const moods = "happy 🙂, confused 😕, sad 😢";
const regexpEmoticons = /[\u{1F600}-\u{1F64F}]/gu;
console.log(moods.match(regexpEmoticons));
// Expected output: Array ['🙂', '😕', '😢']
```

Types

Examples


Looking for a series of digits

In this example, we match a sequence of 4 digits with \d{4}. \b indicates a word boundary (i.e., do not start or end matching in the middle of a number sequence).

```js
const randomData = "015 354 8787 687351 3512 8735";
const regexpFourDigits = /\b\d{4}\b/g;

console.table(randomData.match(regexpFourDigits));
// ['8787', '3512', '8735']
```
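
To see why the word boundaries matter, the following sketch runs the same input with and without `\b`. The variable names `withoutBoundaries` and `withBoundaries` are illustrative and not part of the original example.

```js
const randomData = "015 354 8787 687351 3512 8735";

// Without \b, a four-digit slice of a longer run ("6873" from "687351") also matches.
const withoutBoundaries = randomData.match(/\d{4}/g);
console.log(withoutBoundaries);
// ['8787', '6873', '3512', '8735']

// With \b on both sides, only runs of exactly four digits match.
const withBoundaries = randomData.match(/\b\d{4}\b/g);
console.log(withBoundaries);
// ['8787', '3512', '8735']
```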

See more examples in the character class escape reference.

Looking for a word (from the Latin alphabet) starting with A

In this example, we match a word starting with the letter A. \b indicates a word boundary (i.e., do not start matching in the middle of a word). [aA] matches the letter "a" or "A". \w+ matches one or more word characters, that is, Latin letters, digits, or the underscore (+ is a quantifier). Because \w+ already consumes characters until there are no more word characters, a closing \b boundary is not necessary.

```js
const aliceExcerpt =
  "I'm sure I'm not Ada,' she said, 'for her hair goes in such long ringlets, and mine doesn't go in ringlets at all.";
const regexpWordStartingWithA = /\b[aA]\w+/g;

console.table(aliceExcerpt.match(regexpWordStartingWithA));
// ['Ada', 'and', 'at', 'all']
```
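
Note that \w also accepts digits and underscores, so the pattern above can match tokens that are not purely alphabetic. The sketch below contrasts it with a letters-only variant; the sample string and the variable names `withWordClass` and `lettersOnly` are made up for illustration.

```js
const sample = "An a1 a_b apple";

// \w matches letters, digits, and "_", so "a1" and "a_b" are included.
const withWordClass = sample.match(/\b[aA]\w+/g);
console.log(withWordClass);
// ['An', 'a1', 'a_b', 'apple']

// Restricting to [a-z] with the i flag keeps only purely alphabetic words.
// The closing \b is needed here so that a partial word is not accepted.
const lettersOnly = sample.match(/\ba[a-z]+\b/gi);
console.log(lettersOnly);
// ['An', 'apple']
```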

See more examples in the character class escape reference.

Looking for a word (from Unicode characters)

Instead of the Latin alphabet, we can use a range of Unicode characters to identify a word, which lets us handle text in other languages such as Russian or Arabic. The "Basic Multilingual Plane" (BMP) of Unicode contains most of the characters used around the world, and we can use character classes and ranges to match words written with those characters. The pattern in this example treats any run of BMP characters other than the space as a word.

```js
const nonEnglishText = "Приключения Алисы в Стране чудес";
// The BMP spans U+0000 to U+FFFF; the space (U+0020) is excluded.
// The first range ends at \u001F (hexadecimal), the code point just before the space.
const regexpBMPWord = /([\u0000-\u001F\u0021-\uFFFF])+/gu;

console.table(nonEnglishText.match(regexpBMPWord));
// ['Приключения', 'Алисы', 'в', 'Стране', 'чудес']
```
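
A range that covers the whole BMP also matches digits and punctuation, so words followed by a comma or exclamation mark keep that character attached. As an alternative sketch, a Unicode property escape such as `\p{L}` (any letter, available with the `u` flag) matches letters only. The `greeting` string and the variable names below are illustrative assumptions.

```js
const nonEnglishText = "Приключения Алисы в Стране чудес";
const greeting = "Привет, мир!";

// \p{L} matches any code point in the Unicode "Letter" general category.
const regexpLetters = /\p{L}+/gu;

const wordsInTitle = nonEnglishText.match(regexpLetters);
console.log(wordsInTitle);
// ['Приключения', 'Алисы', 'в', 'Стране', 'чудес']

// Punctuation is not a letter, so it is left out of the matches.
const wordsInGreeting = greeting.match(regexpLetters);
console.log(wordsInGreeting);
// ['Привет', 'мир']
```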

See more examples in the Unicode character class escape reference.

Counting vowels

In this example, we count the number of vowels (A, E, I, O, U, Y) in a text. The g flag is used to match all occurrences of the pattern in the text. The i flag is used to make the pattern case-insensitive, so it matches both uppercase and lowercase vowels.

```js
const aliceExcerpt =
  "There was a long silence after this, and Alice could only hear whispers now and then.";
const regexpVowels = /[aeiouy]/gi;

console.log("Number of vowels:", aliceExcerpt.match(regexpVowels).length);
// Number of vowels: 26
```
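
The example above relies on the text containing at least one vowel: with the g flag, `match()` returns `null` when nothing matches, and reading `.length` from `null` throws. A minimal sketch of a defensive version follows; `countVowels` is a hypothetical helper name, not part of the original example.

```js
// Hypothetical helper: counts vowels and returns 0 when there are none.
function countVowels(text) {
  // match() returns null on no match, so fall back to an empty array.
  return (text.match(/[aeiouy]/gi) ?? []).length;
}

console.log(
  countVowels(
    "There was a long silence after this, and Alice could only hear whispers now and then.",
  ),
);
// 26

console.log(countVowels("Shh, hmm"));
// 0
```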

See also