GROUPS & ALTERNATION

Backreferences

Groups capture what they matched. \1 demands it again.

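Every plain (capturing) group stores the text it actually matched. Groups are numbered by the position of their opening paren, left to right: the first is \1, the second \2, and so on. A backreference means "the exact same text again" - not the same pattern, the same text. So (\w+)\1 matches "byebye" but finds nothing in "byelo", even though "lo" fits \w+ just as well as "bye" does.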
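A quick way to see "same text, not same pattern" is to run both side by side. This is a small sketch in JavaScript; the variable names sameText and samePattern are made up for illustration.

```javascript
// Group 1 captures some word characters; \1 then requires that exact text again.
const sameText = /(\w+)\1/;
// For contrast: repeating the pattern (not the text) accepts any further word characters.
const samePattern = /(\w+)\w+/;

console.log(sameText.test("byebye"));     // true: "bye" followed by "bye"
console.log(sameText.test("byelo"));      // false: no text is followed by a copy of itself
console.log(samePattern.test("byelo"));   // true: the pattern repeats, the text need not
console.log("byebye".match(sameText)[1]); // "bye", the text stored in group 1
```

Note that the engine has to backtrack to get there: \w+ first grabs all of "byebye", finds no second copy, and gives back characters until the group holds just "bye".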

Classic use: finding doubled words. \b(\w+) \1\b matches "is is" and "the the". Both boundaries matter. Without the leading one, the "is" hiding inside "this" could pair with the "is" after it in "this is". Without the trailing one, "the theory" would match as "the the".
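Here is the doubled-word pattern run against a few strings, as a rough sketch. The helper findDoubledWords and the name unbounded are illustrative, not part of the lesson.

```javascript
// The doubled-word pattern from the text, with the g flag to collect every match.
const doubledWord = /\b(\w+) \1\b/g;

// Hypothetical helper: returns all doubled-word matches in a string.
function findDoubledWords(text) {
  // match() with the g flag returns every full match, or null when there are none.
  return text.match(doubledWord) || [];
}

console.log(findDoubledWords("this is is a typo"));  // [ 'is is' ]
console.log(findDoubledWords("the the end"));        // [ 'the the' ]
console.log(findDoubledWords("nothing wrong here")); // []

// The same pattern without boundaries, to show the false positives.
const unbounded = /(\w+) \1/;
console.log(unbounded.test("this is"));              // true: "is" inside "this" pairs up
console.log(unbounded.test("the theory"));           // true: stops after "the the"
console.log(/\b(\w+) \1\b/.test("this is"));         // false
console.log(/\b(\w+) \1\b/.test("the theory"));      // false
```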

Counting parens gets error-prone fast. You can also NAME a group: (?<word>\w+) \k<word> reads better and survives reordering, and in a replacement string you refer to it as $<word>. See MDN: named capturing group.
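The named form can be sketched like this in JavaScript, collapsing each doubled word to a single copy. The helper collapseDoubledWords is a made-up name for illustration.

```javascript
// Named version of the doubled-word pattern: \k<word> refers back to (?<word>...).
const doubledNamed = /\b(?<word>\w+) \k<word>\b/g;

// Hypothetical helper: replaces each doubled word with one copy of it.
function collapseDoubledWords(text) {
  // In the replacement string, $<word> inserts whatever the named group captured.
  return text.replace(doubledNamed, "$<word>");
}

console.log(collapseDoubledWords("this is is a typo")); // "this is a typo"
console.log(collapseDoubledWords("the the end"));       // "the end"

// The captured text is also available by name on the match object.
const firstMatch = "the the end".match(/\b(?<word>\w+) \k<word>\b/);
console.log(firstMatch.groups.word);                    // "the"
```

If another group is later added in front of (?<word>...), the numbered \1 would silently point at the wrong group, while \k<word> keeps working.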

PRACTICE - 2 DRILLS
DRILL 1/2

Match every doubled word - a word, a space, then the same word again.

this is is a typo
must match: "is is"
the the end
must match: "the the"
nothing wrong here
must match nothing
DRILL 2/2 - Double letters

Match any character immediately followed by itself - the doubled letters in a word.

balloon
must match: "ll" "oo"
hello
must match: "ll"
abc
must match nothing