Regex for Text Cleaning: 15 Essential Patterns
Regular expressions are the Swiss Army knife of text cleaning. Here are the 15 patterns every developer should know.
Essential Patterns
| Task | Regex |
|---|---|
| Strip HTML tags | `<[^>]*>` |
| Collapse whitespace | `\s+` → single space |
| Remove URLs | `https?://\S+` |
| Remove emails | `[\w.+-]+@[\w-]+\.[\w.-]+` |
| Remove numbers | `\d+` |
| Remove non-ASCII | `[^\x00-\x7F]` |
| Remove punctuation | `[^\w\s]` |
| Trim each line | `^[ \t]+` or `[ \t]+$` (multiline) |
| Remove blank lines | `^\s*$\n` (multiline) |
| Remove duplicate spaces | `[ ]{2,}` → single space |
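To show how these patterns combine in practice, here is a minimal Python sketch that chains several of them with the standard `re` module. The function name `clean_text`, the constant names, and the order of the steps are illustrative assumptions, not a prescribed pipeline. Two deliberate choices: HTML tags are replaced with a space so adjacent words do not fuse, and the line-trim pattern uses `[ \t]` rather than `\s`, because in multiline mode `\s` also matches newlines and can merge neighbouring lines.

```python
import re

# Patterns from the table, compiled once. All names here are illustrative.
HTML_TAG = re.compile(r"<[^>]*>")
URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# [ \t] instead of \s so the trim never consumes a newline.
TRIM_LINE = re.compile(r"^[ \t]+|[ \t]+$", re.MULTILINE)
BLANK_LINE = re.compile(r"^\s*$\n", re.MULTILINE)
MULTI_SPACE = re.compile(r"[ ]{2,}")


def clean_text(text: str) -> str:
    """Apply a subset of the table's patterns in one possible order."""
    text = HTML_TAG.sub(" ", text)     # tags become spaces so words don't fuse
    text = URL.sub("", text)           # drop URLs
    text = EMAIL.sub("", text)         # drop email addresses
    text = TRIM_LINE.sub("", text)     # strip spaces/tabs at both ends of each line
    text = BLANK_LINE.sub("", text)    # drop lines that are empty or whitespace-only
    text = MULTI_SPACE.sub(" ", text)  # squeeze runs of spaces to one
    return text.strip()


if __name__ == "__main__":
    sample = (
        "<p>Hello   <b>world</b></p>\n"
        "\n"
        "  Visit https://example.com now  \n"
        "Mail me@example.com please\n"
    )
    print(clean_text(sample))
```

The order matters: removing tags, URLs, and emails first leaves stray spaces behind, so the whitespace patterns run last. The more destructive patterns (numbers, non-ASCII, punctuation) are left out of this sketch because whether to apply them depends on the text. Note also that in Python 3 `\w` is Unicode-aware by default, so `[^\w\s]` keeps accented letters and underscores.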
Online Tool
Use our plain text converter: it applies these patterns with toggleable options, so no regex knowledge is needed.