GROUPS & ALTERNATION
Backreferences
Groups capture what they matched. \1 demands it again.
Every plain group stores the text it actually matched, numbered by opening paren: first group is \1, second \2, and so on. A backreference means "the exact same text again" - not the same pattern, the same text. So (\w+)\1 matches "byebye" but not "byelo".
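To see "same text, not same pattern" in action, here is a small sketch in JavaScript (assumed as the flavor, since the lesson points at MDN). The variable names are made up for illustration.

```javascript
// (\w+)\1 : group 1 captures a run of word characters,
// then \1 requires that exact text to appear again right after it.
const sameText = /(\w+)\1/;

const hit = "byebye".match(sameText);
console.log(hit[0]); // "byebye" (the whole match)
console.log(hit[1]); // "bye"    (what group 1 captured)

// No stretch of "byelo" is immediately followed by a copy of itself.
const missesByelo = sameText.test("byelo");
console.log(missesByelo); // false

// Contrast: repeating the PATTERN instead of the text accepts any two runs.
const samePattern = /(\w+)(\w+)/;
const patternHitsByelo = samePattern.test("byelo");
console.log(patternHitsByelo); // true
```

The engine first tries the longest capture and backs off until the text after it is an exact copy, which is why group 1 ends up holding "bye".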
Classic use: finding doubled words. \b(\w+) \1\b matches "is is" and "the the". The boundaries matter - without the leading \b, the "is" hiding inside "this" could pair with the "is" that follows it in "this is"; without the trailing \b, "the them" would match as "the the".
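The effect of the boundaries is easy to check side by side. This is a sketch in JavaScript, not the lesson's own checker; `findAll` is a hypothetical helper and the extra sample strings ("this is fine", "the them") are added here for illustration.

```javascript
const withBoundaries = /\b(\w+) \1\b/g;
const withoutBoundaries = /(\w+) \1/g;

// Hypothetical helper: return every full match found in the text.
function findAll(pattern, text) {
  return Array.from(text.matchAll(pattern), (m) => m[0]);
}

const realTypo = findAll(withBoundaries, "this is is a typo");
console.log(realTypo); // ["is is"]

// "this is fine" has no doubled word, and the bounded pattern agrees.
const cleanBounded = findAll(withBoundaries, "this is fine");
console.log(cleanBounded); // []

// Without \b, the "is" inside "this" pairs with the next word.
const cleanUnbounded = findAll(withoutBoundaries, "this is fine");
console.log(cleanUnbounded); // ["is is"]

// Without the trailing \b, a word can pair with the start of a longer word.
const prefixUnbounded = findAll(withoutBoundaries, "the them");
console.log(prefixUnbounded); // ["the the"]

const prefixBounded = findAll(withBoundaries, "the them");
console.log(prefixBounded); // []
```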
Counting parens gets error-prone fast. You can also NAME a group: (?<word>\w+) \k<word> reads better and survives reordering, and in a replacement string you refer to the capture as $<word>. See MDN: named capturing group.
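As a rough sketch of the named form in JavaScript (variable names are illustrative only), the same doubled-word pattern can be written with a name and then used to collapse each pair in a replacement:

```javascript
// Named version of the doubled-word pattern.
const doubledWord = /\b(?<word>\w+) \k<word>\b/g;

// The capture is available by name on each match's `groups` object.
const names = Array.from("the the end".matchAll(doubledWord), (m) => m.groups.word);
console.log(names); // ["the"]

// In a replacement string, $<word> inserts the named capture,
// so each doubled word collapses to a single copy.
const fixedNamed = "this is is a typo".replace(doubledWord, "$<word>");
console.log(fixedNamed); // "this is a typo"

// The numbered equivalent uses \1 in the pattern and $1 in the replacement.
const fixedNumbered = "this is is a typo".replace(/\b(\w+) \1\b/g, "$1");
console.log(fixedNumbered); // "this is a typo"
```

Both forms do the same job here. The named one keeps working if another group is later added in front of it, whereas \1 and $1 would then point at the wrong group.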
Match every doubled word - a word, a space, then the same word again.
this is is a typo
the the end
nothing wrong here
Match any character immediately followed by itself - the doubled letters in a word.
balloon
hello
abc