Regular Expressions (RegEx) in Python

Kishore V


A Regular Expression (RegEx) is a sequence of characters that defines a search pattern.

It is mainly used for matching, searching, and manipulating text.

With RegEx, you can:

    • Validate input

    • Search text patterns

    • Extract specific data

    • Replace parts of strings

RegEx Module in Python

Python provides a built-in module called re to work with regular expressions.

Importing the re Module

import re

Once imported, you can use RegEx functions to search and manipulate strings.

Using RegEx in Python

Example: Check Pattern at Start and End of a String

The following example checks whether a sentence starts with "Hello" and ends with "World".
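A minimal sketch of such a check, assuming the input sentence is "Hello Python World" (inferred from the match shown in the output) and that the pattern anchors both ends with ^ and $:

```python
import re

# Assumed input, inferred from the match text shown in the output
txt = "Hello Python World"

# ^Hello anchors the start, World$ anchors the end,
# and .* matches whatever lies in between
match = re.search(r"^Hello.*World$", txt)
print(match)
```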

Output:

<re.Match object; span=(0, 18), match='Hello Python World'>

If a match is found, a Match object is returned; otherwise, None is returned.

RegEx Functions

The re module provides several useful functions:

Function     Description
findall()    Returns all matching patterns as a list
search()     Returns the first match as a Match object
split()      Splits a string based on a pattern
sub()        Replaces matched patterns with new text

Metacharacters

Metacharacters have special meanings in RegEx patterns.

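Symbol    Meaning                                                      Example
[]        Set of characters                                            [a-z]
\         Special sequence (also escapes special characters)           \d
.         Any character (except newline)                               c.t
^         Starts with                                                  ^Hi
$         Ends with                                                    end$
*         Zero or more occurrences                                     go*
+         One or more occurrences                                      go+
?         Zero or one occurrence                                       go?
{}        Specified number of occurrences                              \d{3}
|         Either or                                                    a|b
()        Grouping                                                     (abc)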
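To make the table above concrete, here is a small sketch that runs several of the example patterns through re.findall(). The sample strings are made up for illustration.

```python
import re

# . matches any single character between 'c' and 't'
dot = re.findall(r"c.t", "cat cot cut ct")

# Quantifiers applied to the letter 'o' after 'g'
star = re.findall(r"go*", "g go goo")      # zero or more 'o'
plus = re.findall(r"go+", "g go goo")      # one or more 'o'
optional = re.findall(r"go?", "g go goo")  # zero or one 'o'

# {3} requires exactly three digits in a row
three_digits = re.findall(r"\d{3}", "12 345 6789")

# | matches either alternative
either = re.findall(r"a|b", "abc")

print(dot)           # ['cat', 'cot', 'cut']
print(star)          # ['g', 'go', 'goo']
print(plus)          # ['go', 'goo']
print(optional)      # ['g', 'go', 'go']
print(three_digits)  # ['345', '678']
print(either)        # ['a', 'b']
```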

Flags in Regular Expressions

Flags modify how RegEx patterns behave.

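Flag             Short    Description
re.IGNORECASE    re.I     Case-insensitive matching
re.MULTILINE     re.M     ^ and $ match at the start and end of each line
re.DOTALL        re.S     Dot matches newline as well
re.ASCII         re.A     ASCII-only matching for \w, \W, \b, \B, \d, \D, \s, \S
re.VERBOSE       re.X     Allows whitespace and comments inside patterns for readability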
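The following sketch shows how a few of these flags change matching behavior. The sample text and the date pattern are illustrative assumptions, not taken from the original examples.

```python
import re

text = "first line\nSecond line"

# re.IGNORECASE: 'second' matches 'Second'
ignore_case = re.findall(r"second", text, flags=re.IGNORECASE)

# re.MULTILINE: ^ matches at the start of every line, not just the string
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# re.DOTALL: without it, . stops at the newline and the search fails
without_dotall = re.search(r"first.*Second", text)
with_dotall = re.search(r"first.*Second", text, flags=re.DOTALL)

# re.VERBOSE: whitespace and # comments inside the pattern are ignored
date_pattern = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.VERBOSE)
date_parts = date_pattern.search("Released on 2024-05-20").groups()

print(ignore_case)             # ['Second']
print(line_starts)             # ['first', 'Second']
print(without_dotall)          # None
print(with_dotall is not None) # True
print(date_parts)              # ('2024', '05', '20')
```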

Special Sequences

Special sequences start with a backslash \ and have specific meanings.

Sequence    Description            Example
\A          Start of string        \AHello
\b          Word boundary          r"\bcat"
\B          Not word boundary      r"\Bcat"
\d          Digits (0–9)           \d+
\D          Non-digits             \D
\s          Whitespace             \s
\S          Non-whitespace         \S
\w          Word characters        \w+
\W          Non-word characters    \W
\Z          End of string          end\Z
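As a rough illustration of the sequences above, the sketch below applies several of them to one made-up sample string. Raw strings (r"...") are used so that Python does not interpret the backslashes itself.

```python
import re

text = "cat concat 42 cats_7"

# \b requires a word boundary before 'cat'; \B requires there to be none
at_word_start = re.findall(r"\bcat", text)
inside_word = re.findall(r"\Bcat", text)

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of letters, digits, underscore
pieces = re.split(r"\s", text)      # split on each whitespace character

# \A and \Z anchor to the very start and very end of the string
starts_with_cat = re.search(r"\Acat", text) is not None
ends_with_7 = re.search(r"7\Z", text) is not None

print(at_word_start)    # ['cat', 'cat']
print(inside_word)      # ['cat']
print(digits)           # ['42', '7']
print(words)            # ['cat', 'concat', '42', 'cats_7']
print(pieces)           # ['cat', 'concat', '42', 'cats_7']
print(starts_with_cat)  # True
print(ends_with_7)      # True
```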

Sets in RegEx

Sets are defined using square brackets [].

Set         Description
[xyz]       Matches x, y, or z
[a-z]       Lowercase letters
[^abc]      Anything except a, b, c
[0-9]       Digits
[A-Za-z]    Upper and lowercase letters
[+]         Matches literal +
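A short sketch of the sets listed above, using made-up sample strings. Note that inside a set, + has no special meaning and matches a literal plus sign.

```python
import re

text = "a1+B2"

lowercase = re.findall(r"[a-z]", text)         # lowercase letters only
digits = re.findall(r"[0-9]", text)            # digits only
letters = re.findall(r"[A-Za-z]", text)        # letters of either case
not_lowercase = re.findall(r"[^a-z]", text)    # ^ inside a set negates it
plus_signs = re.findall(r"[+]", text)          # literal +
xyz = re.findall(r"[xyz]", "xylophone zone")   # any one of x, y, z

print(lowercase)      # ['a']
print(digits)         # ['1', '2']
print(letters)        # ['a', 'B']
print(not_lowercase)  # ['1', '+', 'B', '2']
print(plus_signs)     # ['+']
print(xyz)            # ['x', 'y', 'z']
```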

The findall() Function

Returns all non-overlapping matches of the pattern as a list, in the order they are found.

Example: Find All Occurrences
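A minimal sketch, assuming a sample sentence that contains the word "Python" twice (the sentence itself is made up; only the result is taken from the output shown):

```python
import re

# Hypothetical sample text containing two occurrences of 'Python'
txt = "Python is fun. I love Python."

matches = re.findall(r"Python", txt)
print(matches)

# A pattern that does not occur yields an empty list
no_matches = re.findall(r"Java", txt)
```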

Output:

['Python', 'Python']

If no matches are found, an empty list is returned.

The search() Function

Scans the string and returns a Match object for the first location where the pattern matches.

Example: Search for First Digit
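A minimal sketch, assuming the value in the output is the index of the first digit as reported by the Match object's start() method. The sample sentence is made up so that its first digit sits at index 16.

```python
import re

# Hypothetical sample text; its first digit is at index 16
txt = "The rain lasted 3 days"

match = re.search(r"\d", txt)

# Guard against None before using the Match object
if match:
    position = match.start()
    print(position)
```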

Output:

16

If no match exists, None is returned.

The split() Function

Splits a string at each match.

Example: Split by Comma
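A minimal sketch, assuming a comma-separated input string that is consistent with the output shown:

```python
import re

# Assumed input, inferred from the list shown in the output
txt = "apple,banana,orange"

parts = re.split(r",", txt)
print(parts)
```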

Output:

['apple', 'banana', 'orange']

Split with Limit
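The maxsplit argument of re.split() caps the number of splits, and the remainder of the string is returned as the final element. A minimal sketch, assuming a hyphen-separated date string and maxsplit=2 (both inferred from the output, not confirmed by the original):

```python
import re

# Assumed input, inferred from the list shown in the output
date = "2024-05-20"

# At most two splits, which is enough to separate all three fields
limited = re.split(r"-", date, maxsplit=2)
print(limited)

# For comparison: a single split leaves the rest of the string intact
one_split = re.split(r"-", date, maxsplit=1)
```

With maxsplit=1, the result would be ['2024', '05-20'].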

Output:

['2024', '05', '20']

The sub() Function

Replaces matching patterns with new text.

Example: Replace Digits
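A minimal sketch, assuming a four-digit PIN in the input (the digits themselves are made up; only the masked result is taken from the output shown):

```python
import re

# Hypothetical input with a four-digit PIN
txt = "My pin is 1234"

# Replace every digit with an asterisk
masked = re.sub(r"\d", "*", txt)
print(masked)
```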

Output:

My pin is ****

Limit Replacements
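The count argument of re.sub() limits how many matches are replaced, working from left to right. A minimal sketch, assuming a ten-digit number whose first four digits are made up and count=4:

```python
import re

# Hypothetical input; only the last six digits appear in the output shown
txt = "Call 9876543210 now"

# Replace only the first four digits
partly_masked = re.sub(r"\d", "X", txt, count=4)
print(partly_masked)
```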

Output:

Call XXXX543210 now

Match Object & Methods

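A Match object contains details about the match, such as its position (span()), the string that was searched (string), and the matched text (group()).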

Example: Get Match Information
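A minimal sketch, assuming the searched string is "Sun rises in the East" (as it appears in the output) and that the pattern is the literal word "East":

```python
import re

# Input string as it appears in the output
txt = "Sun rises in the East"

match = re.search(r"East", txt)

print(match)          # the Match object itself
print(match.span())   # (start, end) positions of the match
print(match.string)   # the string that was searched
print(match.group())  # the text that matched
```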

Output:

<re.Match object; span=(17, 21), match='East'>
(17, 21)
Sun rises in the East
East
