Regular Expressions

Learn how to use regular expressions to search, match, and manipulate text.


Regular Expressions in Python

Regular expressions (often shortened to "regex" or "regexp") are sequences of characters that define a search pattern, used to find, match, and manipulate text. Python's standard-library re module provides the functions for working with them.

The `re` Module

To use regular expressions in Python, you need to import the re module:

import re

Key Functions in the `re` Module

The re module provides several key functions for working with regular expressions. Here's an overview of the most commonly used ones:

`re.search(pattern, string, flags=0)`

The re.search() function searches the string for the first occurrence of the pattern. If the pattern is found, it returns a match object; otherwise, it returns None. The optional flags argument can be used to modify how the pattern is matched (e.g., case-insensitive matching).

Example:

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = "fox"

match = re.search(pattern, text)

if match:
    print("Pattern found:", match.group())  # Output: Pattern found: fox
    print("Start index:", match.start())    # Output: Start index: 16
    print("End index:", match.end())      # Output: End index: 19
else:
    print("Pattern not found.")
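
The flags argument mentioned above can be illustrated with a short sketch. It assumes the same sample sentence and uses re.IGNORECASE, one of the standard flags in the re module, to make the search case-insensitive:

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# Without a flag, matching is case-sensitive, so "FOX" is not found.
print(re.search("FOX", text))  # Output: None

# re.IGNORECASE lets the same pattern match regardless of letter case.
match = re.search("FOX", text, flags=re.IGNORECASE)
if match:
    print("Pattern found:", match.group())  # Output: Pattern found: fox
```

Note that match.group() returns the text as it appears in the string ("fox"), not as it was written in the pattern.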

`re.match(pattern, string, flags=0)`

The re.match() function attempts to match the pattern only at the beginning of the string. If the pattern matches there, it returns a match object; otherwise, it returns None, even if the pattern occurs later in the string. Like re.search(), it also accepts optional flags.

Example:

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = "The"
pattern2 = "quick"

match = re.match(pattern, text)
match2 = re.match(pattern2, text)


if match:
    print("Pattern found at the beginning:", match.group())  # Output: Pattern found at the beginning: The
else:
    print("Pattern not found at the beginning.")

if match2:
    print("Pattern found at the beginning:", match2.group())
else:
    print("Pattern not found at the beginning.")  # Output: Pattern not found at the beginning. ("quick" is not at the start)
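
To make the difference from re.search() concrete, here is a minimal sketch that runs both functions with the same pattern on the same sentence. The variable names at_start and anywhere are only illustrative:

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# re.match() only tries position 0, so "quick" is not found.
at_start = re.match("quick", text)

# re.search() scans forward through the string and finds it.
anywhere = re.search("quick", text)

print(at_start)         # Output: None
print(anywhere.span())  # Output: (4, 9)
```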

`re.findall(pattern, string, flags=0)`

The re.findall() function returns a list of all non-overlapping matches of the pattern in the string, in the order they are found. What each list item contains depends on the pattern: with no capturing groups, each item is the full matched string; with exactly one capturing group, each item is the string matched by that group; with more than one capturing group, each item is a tuple containing one string per group.

Example:

import re

text = "The cat sat on the mat. Another cat is here."
pattern = "cat"

matches = re.findall(pattern, text)

print("All matches:", matches)  # Output: All matches: ['cat', 'cat']

# Example with capturing groups
text = "user1@example.com, user2@domain.net"
pattern = r"(\w+)@(\w+\.\w+)"  # Capture username and domain

matches = re.findall(pattern, text)

print("Matches with groups:", matches)  # Output: Matches with groups: [('user1', 'example.com'), ('user2', 'domain.net')]
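
One case the examples above do not cover is a pattern with exactly one capturing group. In that case re.findall() returns a plain list of the group's text rather than a list of tuples. A small sketch, reusing the same email text (the variable names are only illustrative):

```python
import re

text = "user1@example.com, user2@domain.net"

# One capturing group: each item is just the text matched by that group.
usernames = re.findall(r"(\w+)@\w+\.\w+", text)
print(usernames)  # Output: ['user1', 'user2']

# No capturing groups: each item is the whole match.
addresses = re.findall(r"\w+@\w+\.\w+", text)
print(addresses)  # Output: ['user1@example.com', 'user2@domain.net']
```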

`re.sub(pattern, replacement, string, count=0, flags=0)`

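The re.sub() function returns a new string in which non-overlapping occurrences of the pattern are replaced with the replacement; the original string is not modified. The count argument specifies the maximum number of replacements to make. If count is 0 (the default), all occurrences are replaced. The replacement can be a string or a function. If it is a function, it is called with the match object for each occurrence and must return the replacement string.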

Example:

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = "fox"
replacement = "wolf"

new_text = re.sub(pattern, replacement, text)

print("Original text:", text)           # Output: Original text: The quick brown fox jumps over the lazy dog.
print("Modified text:", new_text)     # Output: Modified text: The quick brown wolf jumps over the lazy dog.

# Example with count
text2 = "apple banana apple cherry apple"
pattern2 = "apple"
replacement2 = "orange"

new_text2 = re.sub(pattern2, replacement2, text2, count=2)

print("Original text:", text2)      # Output: Original text: apple banana apple cherry apple
print("Modified text:", new_text2)  # Output: Modified text: orange banana orange cherry apple
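
The replacement passed to re.sub() can also be a function. The sketch below assumes a hypothetical helper named double and a made-up sample sentence. The helper receives each match object and returns the string to substitute:

```python
import re

def double(match):
    # Hypothetical helper: convert the matched digits to an int,
    # double the value, and return it as the replacement string.
    return str(int(match.group()) * 2)

text = "3 apples and 12 oranges"

new_text = re.sub(r"\d+", double, text)

print("Modified text:", new_text)  # Output: Modified text: 6 apples and 24 oranges
```

A function is useful when the replacement depends on what was matched, which a fixed replacement string cannot express.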