Every programmer has encountered a seemingly simple task that requires a complex solution. Often, such tasks involve handling and understanding text, whether it's cleaning data, validating input, or extracting information from disorganized files. Enter the world of regular expressions, an essential tool that is easy to start with and difficult to master.
What are Regular Expressions?
A regular expression (also known as a regex or regexp) is a sequence of characters that defines a pattern to search for in text. Regular expressions are widely supported in popular programming languages like Python, JavaScript, Java, and PHP, and are commonly used in text editors and Unix-based systems to search, manipulate, and validate text.
Enough theory, let's try it out. If you don't have a regex interpreter handy, several websites provide one for free:
regex101 or
regexr
Example 1: Extracting dates
Extract dates formatted like "01.05.2026" with the following regex:
\d{2}\.\d{2}\.\d{4}
Feel free to follow along with regex101: copy and paste the regex into the "regular expression" text input, then type a sentence such as "today's date is the 20.05.2023 and it's sunny".
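The same experiment can be run from code. Here is a minimal sketch in Python, assuming the standard `re` module; the names `DATE_PATTERN` and `find_dates` are made up for this illustration.

```python
import re

# Two digits, a literal dot, two digits, a literal dot, four digits.
DATE_PATTERN = re.compile(r"\d{2}\.\d{2}\.\d{4}")


def find_dates(text):
    """Return every substring of text shaped like DD.MM.YYYY."""
    return DATE_PATTERN.findall(text)


sentence = "today's date is the 20.05.2023 and it's sunny"
print(find_dates(sentence))  # ['20.05.2023']
```

Note that the pattern only checks the shape of the text: a string like "99.99.9999" would also be returned, so real date validation needs an extra step.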
Example 2: Match one word or another
You can match one whole word or another, for example months of the year, with the following regex:
(August)|(September)
You can type: "This is a line with August in it"
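In code, the alternation behaves the same way. The sketch below assumes Python's `re` module; `MONTH_PATTERN` and `find_month` are illustrative names only.

```python
import re

# The | operator means "either the left pattern or the right one".
MONTH_PATTERN = re.compile(r"(August)|(September)")


def find_month(line):
    """Return the first month name found in line, or None if there is none."""
    match = MONTH_PATTERN.search(line)
    return match.group(0) if match else None


print(find_month("This is a line with August in it"))  # August
print(find_month("This is a line with July in it"))    # None
```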
Example 3: Validate an email address
The following regex pattern checks whether a string contains an "@" and a period, along with valid email characters.
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}
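To use this pattern for validation rather than search, the whole string has to match, not just a part of it. A possible sketch in Python uses `re.fullmatch` for that; `EMAIL_PATTERN` and `looks_like_email` are hypothetical names chosen for the example.

```python
import re

# Local part, "@", domain, a literal dot, then a 2-6 letter ending.
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}")


def looks_like_email(candidate):
    """Return True if the entire string has the shape of an email address."""
    return EMAIL_PATTERN.fullmatch(candidate) is not None


print(looks_like_email("jane.doe@example.com"))  # True
print(looks_like_email("jane.doe@example"))      # False
```

This is a shape check only: it cannot tell whether the address actually exists.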
We hope you'll play around with these, or at least know they exist. Here are a few more examples to help you discover this beautiful tool.
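1. Alphanumeric Check: ^[a-zA-Z0-9]+$
Ensure a provided string contains only letters or digits. The ^ and $ anchors make the pattern cover the whole string, and + (instead of *) rejects an empty string. This is helpful for usernames, IDs, or simple passwords.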
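A sketch of this check in Python, assuming the whole string must be alphanumeric and non-empty: `re.fullmatch` plays the role of the `^` and `$` anchors, and `+` is used so that an empty string is rejected. The function name `is_alphanumeric` is illustrative.

```python
import re

# One or more ASCII letters or digits, and nothing else.
ALNUM_PATTERN = re.compile(r"[a-zA-Z0-9]+")


def is_alphanumeric(value):
    """Return True if value consists only of ASCII letters and digits."""
    return ALNUM_PATTERN.fullmatch(value) is not None


print(is_alphanumeric("user42"))   # True
print(is_alphanumeric("user 42"))  # False
print(is_alphanumeric(""))         # False
```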
2. Numeric Range: ^([1-9]\d?|100)$
Validate that a whole number is between 1 and 100, useful for percentage or rating checks. The first alternative covers 1 to 99 without a leading zero, and the second covers 100.
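The range check can be sketched in Python as follows. It uses the variant `[1-9]\d?|100` so that "0" and "00" are rejected, and `re.fullmatch` so that longer numbers such as "1000" do not slip through; `in_range_1_to_100` is a name invented for this example.

```python
import re

# 1-99 without a leading zero, or exactly 100.
RANGE_PATTERN = re.compile(r"[1-9]\d?|100")


def in_range_1_to_100(text):
    """Return True if text is a whole number from 1 to 100."""
    return RANGE_PATTERN.fullmatch(text) is not None


for sample in ["1", "42", "100", "0", "101"]:
    print(sample, in_range_1_to_100(sample))
```

For anything beyond a simple range, converting the string with `int()` and comparing numbers is usually easier to read than a regex.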
3. US Phone Number: (\+\d{1,2}\s)?\(?\d{3}\)?[\-\s]?\d{3}[\-\s]?\d{4}
Validate the formatting of US phone numbers, supporting multiple formats like "(123) 456-7890" and "123-456-7890".
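Here is a possible Python sketch of the phone check. It uses a variant with an explicit optional `\)?` after the area code, followed by an optional separator, so that "(123) 456-7890" is accepted; `is_us_phone` is an illustrative name.

```python
import re

# Optional country code, area code with optional parentheses,
# then 3 digits and 4 digits, each optionally preceded by a dash or space.
PHONE_PATTERN = re.compile(
    r"(\+\d{1,2}\s)?\(?\d{3}\)?[\-\s]?\d{3}[\-\s]?\d{4}"
)


def is_us_phone(text):
    """Return True if the entire string is formatted like a US phone number."""
    return PHONE_PATTERN.fullmatch(text) is not None


for sample in ["(123) 456-7890", "123-456-7890", "+1 123 456 7890", "123-456-789"]:
    print(sample, is_us_phone(sample))
```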
4. Simple Password Check: ^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$
Ensure a password is at least eight characters long, contains at least one letter and one digit, and uses only letters and digits.
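The two `(?=...)` groups are lookaheads: each one checks a condition without consuming any characters. A minimal Python sketch, written with `[A-Za-z]` in the first lookahead (a range typed as `A-z` would also admit a few punctuation characters) and with `re.fullmatch` covering the whole string; `is_simple_password` is a made-up name.

```python
import re

# (?=.*[A-Za-z]) requires a letter somewhere, (?=.*\d) requires a digit,
# and [A-Za-z\d]{8,} requires 8+ characters that are all letters or digits.
PASSWORD_PATTERN = re.compile(r"(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}")


def is_simple_password(candidate):
    """Return True if candidate passes the simple password rule."""
    return PASSWORD_PATTERN.fullmatch(candidate) is not None


print(is_simple_password("password1"))  # True
print(is_simple_password("password"))   # False, no digit
print(is_simple_password("pass1"))      # False, too short
```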
5. URL: (https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})([\/\w.-]*)\/?
Validate a URL with support for various domains and paths, including an optional "http://" or "https://".
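A Python sketch of the URL check follows. It uses a single `*` on the path group, since a nested quantifier of the form `(...*)*` can backtrack heavily on input that does not match. The name `looks_like_url` is illustrative, and the pattern only accepts lowercase host names.

```python
import re

# Optional scheme, host name, a literal dot, a 2-6 character ending,
# then an optional path and an optional trailing slash.
URL_PATTERN = re.compile(
    r"(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})([\/\w.-]*)\/?"
)


def looks_like_url(candidate):
    """Return True if the entire string has the shape of a simple URL."""
    return URL_PATTERN.fullmatch(candidate) is not None


for sample in ["https://www.example.com/path/page.html", "example.com", "not a url"]:
    print(sample, looks_like_url(sample))
```

Query strings and fragments (anything after "?" or "#") are not covered by this pattern, so a dedicated URL parser is a better fit when those matter.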