Regular Expressions (RegEx) in Python

Kishore V



A Regular Expression (RegEx) is a sequence of characters that defines a search pattern. It is mainly used for matching, searching, and manipulating text.

With RegEx, you can:

  • Validate input
  • Search text patterns
  • Extract specific data
  • Replace parts of strings

RegEx Module in Python

Python provides a built-in module called re to work with regular expressions.

Importing the re Module:

import re

Once imported, you can use RegEx functions to search and manipulate strings.

Using RegEx in Python

Example: Check Pattern at Start and End of a String

The following example checks whether a sentence starts with "Hello" and ends with "World".
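A minimal sketch of this check. The input string and the exact pattern are assumptions inferred from the result shown below; `^` anchors the pattern to the start of the string and `$` to the end.

```python
import re

# Assumed input, inferred from the match text in the output
txt = "Hello Python World"

# ^Hello anchors the start, World$ anchors the end,
# and .* spans whatever lies between them
match = re.search(r"^Hello.*World$", txt)

print(match)
```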

<re.Match object; span=(0, 18), match='Hello Python World'>

If a match is found, a Match object is returned; otherwise, None is returned.

RegEx Functions

The re module provides several useful functions:

Function     Description
findall()    Returns all non-overlapping matches as a list
search()     Returns a Match object for the first match, or None if there is none
split()      Splits a string at each match of a pattern and returns a list
sub()        Replaces matches of a pattern with new text

Metacharacters

Metacharacters have special meanings in RegEx patterns.

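Symbol   Meaning                                        Example
[]       Set of characters                              [a-z]
\        Special sequence, or escapes a metacharacter   \d
.        Any character except a newline                 c.t
^        Starts with                                    ^Hi
$        Ends with                                      end$
*        Zero or more occurrences                       go*
+        One or more occurrences                        go+
?        Zero or one occurrence                         go?
{}       Specified number of occurrences                \d{3}
|        Either or                                      a|b
()       Grouping (captures the matched text)           (abc)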
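The table's example patterns can be tried out with a short sketch like the one below. The sample strings and variable names are hypothetical, chosen only to exercise each metacharacter.

```python
import re

# Hypothetical sample strings, one per metacharacter from the table
in_set    = re.findall(r"[a-c]", "abcdef")        # ['a', 'b', 'c']
any_char  = re.findall(r"c.t", "cat cot cut ct")  # ['cat', 'cot', 'cut']
zero_more = re.findall(r"go*", "g go goo")        # ['g', 'go', 'goo']
one_more  = re.findall(r"go+", "g go goo")        # ['go', 'goo']
optional  = re.findall(r"go?", "g go goo")        # ['g', 'go', 'go']
three     = re.findall(r"\d{3}", "12 345 6789")   # ['345', '678']
either    = re.findall(r"a|b", "cab")             # ['a', 'b']

# Anchors: search() returns a Match object only when the position fits
starts = re.search(r"^Hi", "Hi there")            # matches at the start
ends   = re.search(r"end$", "the end")            # matches at the end

# Grouping: the quantifier applies to the whole group
group = re.search(r"(abc)+", "abcabc")

print(in_set, any_char, zero_more, one_more, optional, three, either)
print(starts.group(), ends.group(), group.group())
```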

Flags in Regular Expressions

Flags modify how RegEx patterns behave.

Flag            Short   Description
re.IGNORECASE   re.I    Case-insensitive matching
re.MULTILINE    re.M    ^ and $ match at the start and end of each line
re.DOTALL       re.S    Dot (.) also matches a newline
re.ASCII        re.A    \w, \d, \s, and \b match ASCII characters only
re.VERBOSE      re.X    Allows whitespace and comments inside a pattern for readability
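A short sketch of how each flag changes matching. The sample text, the pattern names, and the date pattern are hypothetical; flags are passed as the last argument and can be combined with `|`.

```python
import re

# Hypothetical two-line sample text
text = "Hello\nhello world"

ignore_case = re.findall(r"hello", text, re.IGNORECASE)  # ['Hello', 'hello']
first_line  = re.findall(r"^hello", text, re.I)          # ['Hello']
every_line  = re.findall(r"^hello", text, re.I | re.M)   # ['Hello', 'hello']

# Without re.DOTALL the dot stops at the newline
no_dotall = re.search(r"Hello.hello", text)              # None
dotall    = re.search(r"Hello.hello", text, re.DOTALL)   # matches 'Hello\nhello'

# re.ASCII restricts \w to ASCII letters, digits, and underscore
unicode_words = re.findall(r"\w+", "caf\u00e9")           # ['café']
ascii_words   = re.findall(r"\w+", "caf\u00e9", re.ASCII) # ['caf']

# re.VERBOSE ignores whitespace in the pattern and allows comments
date_pattern = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.VERBOSE)
date_parts = date_pattern.search("2024-05-20").groups()  # ('2024', '05', '20')

print(ignore_case, first_line, every_line)
print(no_dotall, dotall.group() == "Hello\nhello")
print(unicode_words, ascii_words, date_parts)
```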

Special Sequences

Special sequences start with a backslash \ and have specific meanings.

Sequence   Description                                             Example
\A         Start of string                                         \AHello
\b         Word boundary                                           r"\bcat"
\B         Not a word boundary                                     r"\Bcat"
\d         Digits (0–9; other Unicode digits too without re.ASCII) \d+
\D         Non-digits                                              \D
\s         Whitespace (space, tab, newline)                        \s
\S         Non-whitespace                                          \S
\w         Word characters (letters, digits, underscore)           \w+
\W         Non-word characters                                     \W
\Z         End of string                                           end\Z
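The sequences above can be checked against one sample string, as in this sketch. The string and variable names are hypothetical. Patterns are written as raw strings (`r"..."`) so that Python does not interpret `\b` as a backspace character.

```python
import re

# Hypothetical sample string
text = "Hello cat, concat 42 end"

at_start = re.search(r"\AHello", text)    # matches only at the very start
at_end   = re.search(r"end\Z", text)      # matches only at the very end

# \b requires a word boundary before "cat"; \B requires that there is none
word_cat   = re.search(r"\bcat", text)    # the standalone word, index 6
inside_cat = re.search(r"\Bcat", text)    # the "cat" inside "concat", index 14

digits    = re.findall(r"\d+", text)      # ['42']
words     = re.findall(r"\w+", text)      # ['Hello', 'cat', 'concat', '42', 'end']
non_space = re.findall(r"\S+", text)      # ['Hello', 'cat,', 'concat', '42', 'end']
spaces    = re.findall(r"\s", text)       # four single spaces

print(at_start.group(), at_end.group())
print(word_cat.start(), inside_cat.start())
print(digits, words, non_space, len(spaces))
```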

Sets in RegEx

Sets are defined using square brackets [].

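Set        Description
[xyz]      Matches x, y, or z
[a-z]      Any lowercase letter from a to z
[^abc]     Any character except a, b, or c
[0-9]      Any digit from 0 to 9
[A-Za-z]   Any uppercase or lowercase letter
[+]        A literal + (most metacharacters lose their special meaning inside a set)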
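A sketch that runs each set from the table against hypothetical sample strings:

```python
import re

# Hypothetical sample string
text = "Zip 2+2=4 abc xyz"

xyz_chars   = re.findall(r"[xyz]", text)         # ['x', 'y', 'z']
lower_runs  = re.findall(r"[a-z]+", text)        # ['ip', 'abc', 'xyz']
not_abc     = re.findall(r"[^abc]", "aabbccdd")  # ['d', 'd']
digit_chars = re.findall(r"[0-9]", text)         # ['2', '2', '4']
letter_runs = re.findall(r"[A-Za-z]+", text)     # ['Zip', 'abc', 'xyz']
plus_signs  = re.findall(r"[+]", text)           # ['+']

print(xyz_chars, lower_runs, not_abc)
print(digit_chars, letter_runs, plus_signs)
```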

The findall() Function

Returns all matches as a list.

Example: Find All Occurrences
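A minimal sketch, assuming a hypothetical sentence in which the word "Python" appears twice:

```python
import re

# Hypothetical input containing two occurrences of the search term
txt = "Python is fun. Python is powerful."

# findall() scans left to right and collects every non-overlapping match
matches = re.findall("Python", txt)

print(matches)
```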

['Python', 'Python']

If no matches are found, an empty list is returned.

The search() Function

Returns the first match found in the string.

Example: Search for First Digit
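A minimal sketch, assuming the value shown below is the index at which the first digit occurs (the Match object's `start()`). The input string is hypothetical and was chosen so that its first digit sits at index 16.

```python
import re

# Hypothetical input; the first digit ("4") is at index 16
txt = "The order ID is 4521"

# \d matches a single digit; search() stops at the first one it finds
m = re.search(r"\d", txt)

# Guard against None before calling Match methods
if m:
    print(m.start())
```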

16

If no match exists, None is returned.

The split() Function

Splits a string at each match.

Example: Split by Comma
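A minimal sketch, assuming a comma-separated input string without spaces:

```python
import re

# Assumed input, inferred from the list in the output
txt = "apple,banana,orange"

# The string is cut at every comma; the commas themselves are dropped
parts = re.split(",", txt)

print(parts)
```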

['apple', 'banana', 'orange']

Split with Limit
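The `maxsplit` argument caps the number of splits. The sketch below assumes the input "2024-05-20" and a limit of 2, which is enough to separate all three parts and reproduces the output shown below; a limit of 1 would leave the remainder of the string unsplit.

```python
import re

# Assumed input, inferred from the list in the output
date = "2024-05-20"

# At most two splits: enough to separate year, month, and day
limited = re.split("-", date, maxsplit=2)

# For contrast: one split leaves the rest of the string intact
one_split = re.split("-", date, maxsplit=1)   # ['2024', '05-20']

print(limited)
```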

['2024', '05', '20']

The sub() Function

Replaces matching patterns with new text.

Example: Replace Digits
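A minimal sketch, assuming a hypothetical four-digit PIN in the input. Each digit is replaced individually, so the number of asterisks equals the number of digits.

```python
import re

# Hypothetical input; the actual digits are not given in the output
txt = "My pin is 1234"

# \d matches one digit at a time, so every digit becomes one "*"
masked = re.sub(r"\d", "*", txt)

print(masked)
```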

My pin is ****

Limit Replacements
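The `count` argument caps the number of replacements. In this sketch the first four digits of the phone number are hypothetical, since the output only shows them masked.

```python
import re

# Hypothetical input; only the last six digits appear in the output
txt = "Call 9876543210 now"

# count=4 replaces the first four digits and leaves the rest untouched
masked = re.sub(r"\d", "X", txt, count=4)

print(masked)
```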

Call XXXX543210 now

Match Object & Methods

A Match object contains details about the match.

Example: Get Match Information
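A minimal sketch that prints the Match object followed by `span()`, `string`, and `group()`. The input string is taken from the output below; the pattern, a word starting with an uppercase "E", is an assumption.

```python
import re

txt = "Sun rises in the East"

# Assumed pattern: a word boundary, an uppercase E, then more word characters
m = re.search(r"\bE\w+", txt)

print(m)           # the Match object itself
print(m.span())    # (start, end) indexes of the match
print(m.string)    # the string that was searched
print(m.group())   # the text that matched
```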

<re.Match object; span=(17, 21), match='East'>
(17, 21)
Sun rises in the East
East
