How to check if a string contains a substring in Python

Jul 29, 2023 · #python #strings

In Python, a string is a sequence of characters enclosed within single quotes (' '), double quotes (" "), or triple quotes (""" """ or ''' '''). Strings are immutable, meaning once you create a string, you cannot change its content. However, you can create new strings based on existing ones through various string manipulation techniques.

# Strings can be defined using single quotes
str_single = 'Hello, World!'

# Or double quotes
str_double = "Python Programming"

# Triple quotes can be used for multiline strings
str_multiline = '''This is a
multiline
string.'''

A substring is a smaller sequence of characters that appears within a larger string. It is essentially a part of the original string. Python provides different methods to check for substrings within a larger string.

Choose the method that fits your requirements. The in keyword is the simplest and most common choice when you only need to know whether a substring exists, and find() is the usual choice when you also need its position. For more complex patterns, regular expressions provide powerful capabilities.

  1. Using the in keyword

In Python, the in keyword (formally, the membership operator) checks for membership or containment. It tells you whether an element (or, for strings, a substring) is present in a container such as a list, tuple, string, set, or dictionary. For a dictionary, it checks the keys rather than the values.

element in iterable

The in keyword returns a boolean value (True or False) depending on whether the element is found in the iterable.

my_string = "Hello, World!"

# Check if a substring is present in the string
is_present = "Hello" in my_string  # True
is_present = "Python" in my_string  # False
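
A few details of the in keyword are easy to miss: the check is case-sensitive, it has a negated form (not in), and on a dictionary it looks at keys only. The following is a minimal sketch of those behaviors; the variable names and sample data are illustrative.

```python
text = "Hello, World!"

# The check is case-sensitive: "hello" and "Hello" are different substrings.
exact_match = "hello" in text  # False

# Normalizing both sides with casefold() gives a case-insensitive check.
loose_match = "hello".casefold() in text.casefold()  # True

# "not in" is the negated form of the same check.
missing = "Python" not in text  # True

# For a dictionary, "in" tests the keys, not the values.
ages = {"alice": 30}
key_present = "alice" in ages  # True
value_present = 30 in ages  # False
```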

  2. Using the find() method

In Python, the find() method is a built-in string method used to find the index of the first occurrence of a substring within a string. If the substring is present, the method returns the index (position) where the substring starts. If the substring is not found, it returns -1.

string.find(substring, start, end)

start (optional): The index where the search begins. It is inclusive, meaning the character at that index is considered. The default value is 0, which means the search starts from the beginning of the string.

end (optional): The index where the search stops. It is exclusive, meaning the character at that index is not considered. The default value is the length of the string, which means the search goes until the end of the string.

string = "Hello, World!"

# Find the index of the first occurrence of the substring "World"
index = string.find("World")  # Returns 7

# Search for a substring that doesn't exist
index_not_found = string.find("Python")  # Returns -1

# Find the index of the first occurrence of the substring "l"
index_l = string.find("l")  # Returns 2

# Search within a specific range
index_range = string.find("l", 3, 9)  # Returns 3 (search between index 3 and 8)

# Find the index of the first occurrence of the substring "o" after index 5
index_after = string.find("o", 5)  # Returns 8
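
Because find() accepts a start index, it can be called repeatedly to locate every occurrence rather than just the first. The helper below, find_all, is a hypothetical name used for illustration (it is not a built-in string method), and returning an empty list for an empty substring is a design choice of this sketch.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub in text."""
    if not sub:
        # An empty substring matches at every position; this sketch reports nothing.
        return []
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search just past the end of the current match.
        start = text.find(sub, start + len(sub))
    return positions


print(find_all("Hello, World!", "l"))       # [2, 3, 10]
print(find_all("Hello, World!", "Python"))  # []
```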

  3. Using the index() method

In Python, the index() method is a built-in string method used to find the index of the first occurrence of a substring within a string. If the substring is present, the method returns the position where the substring starts. If the substring is not found, it raises a ValueError.

string.index(substring, start, end)

The index() method is similar to the find() method. However, the main difference is in how they handle cases when the substring is not found. find() returns -1 when the substring is not found, while index() raises a ValueError.

string = "Hello, World!"
try:
    index = string.index("World")
    print("Substring found at index:", index)
except ValueError:
    print("Substring not found.")
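
One practical reason to care about the difference: the -1 that find() returns is also a valid negative index in Python, so an unchecked result can silently point at the last character. The sketch below illustrates this with the same sample string; the variable names are illustrative.

```python
text = "Hello, World!"

# find() signals "not found" with -1, which is also a valid negative index,
# so using the result without checking it can silently pick the wrong character.
pos = text.find("Python")
print(pos)        # -1
print(text[pos])  # "!" (the last character, not an error)

# index() raises instead, so a missing substring cannot go unnoticed.
try:
    text.index("Python")
    outcome = "found"
except ValueError:
    outcome = "not found"
print(outcome)  # not found
```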

  4. Using regular expressions with the re module

Regular expressions (often abbreviated as regex or regexp) are powerful tools for pattern matching and text manipulation. They allow you to search, extract, and manipulate strings based on specific patterns.

In Python, the re module is a built-in module that provides support for working with regular expressions, including searching for patterns, replacing substrings, and splitting strings based on patterns.

Some of the key functions provided by the re module are:

  • re.search(pattern, string): Searches for a pattern in a given string and returns a match object if the pattern is found, or None otherwise.
  • re.match(pattern, string): Checks if the pattern matches at the beginning of the string and returns a match object if there is a match, or None otherwise.
  • re.findall(pattern, string): Returns all non-overlapping occurrences of a pattern in the string as a list of strings.
  • re.finditer(pattern, string): Returns an iterator yielding match objects for all occurrences of the pattern in the string.
  • re.sub(pattern, replacement, string): Replaces occurrences of the pattern in the string with the specified replacement.
  • re.split(pattern, string): Splits the string by occurrences of the pattern and returns a list of substrings.

Here’s a simple example of using re.search to check for a substring:

import re

string = "Hello, World!"
if re.search(r"World", string):
    print("Substring found!")
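
When the substring comes from user input or contains characters such as "." or "$", it may be interpreted as a pattern rather than as literal text. A minimal sketch of one way to handle this, using re.escape() and the re.IGNORECASE flag, follows; the sample text and variable names are made up for illustration.

```python
import re

text = "Total price: $5.00 (approx.)"
needle = "$5.00"  # contains "$" and ".", which are regex metacharacters

# Unescaped, "$" anchors to the end of the string, so this pattern cannot match here.
unescaped = re.search(needle, text)

# re.escape() turns the substring into a pattern that matches it literally.
escaped = re.search(re.escape(needle), text)

# Flags such as re.IGNORECASE make the search case-insensitive.
ignore_case = re.search(re.escape("TOTAL"), text, flags=re.IGNORECASE)

print(unescaped is None)    # True
print(escaped.start())      # 13
print(ignore_case.group())  # Total
```

For plain literal substrings, the in keyword or find() is usually simpler; regular expressions pay off once the search involves an actual pattern.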

  5. Using the startswith() and endswith() methods

In Python, startswith() and endswith() are built-in string methods used to check whether a string starts or ends with a specific substring, respectively. They return a boolean value (True or False) based on whether the given conditions are satisfied.

string = "Hello, World!"
if string.startswith("Hello"):
    print("String starts with 'Hello'")
if string.endswith("World!"):
    print("String ends with 'World!'")
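
Both methods also accept a tuple of candidates and optional start and end positions, which can replace a chain of or conditions. The sketch below uses a made-up filename to show both features.

```python
filename = "report_2023.csv"

# A tuple of candidates returns True if any one of them matches.
is_data_file = filename.endswith((".csv", ".tsv", ".json"))  # True
is_image = filename.endswith((".png", ".jpg"))  # False

# An optional start argument restricts the check to the string from that index on.
has_year_at_7 = filename.startswith("2023", 7)  # True
```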

  6. Using the count() method

In Python, the count() method is a built-in string method used to count the number of non-overlapping occurrences of a substring within a given string. It returns an integer that represents how many times the specified substring appears in the original string.

string.count(substring, start, end)

The count() method is useful when you need to know how many times a particular substring appears in a string, regardless of where the occurrences are. Because it counts non-overlapping matches, overlapping occurrences are not counted separately: "aaaa".count("aa") returns 2, not 3. It can be helpful for tasks such as counting specific characters or validating the frequency of certain patterns in a text.

string = "Hello, Hello, Hello!"

# Count the number of occurrences of the substring "Hello" in the string
count_hello = string.count("Hello")  # Returns 3

# Count the number of occurrences of the substring "o"
count_o = string.count("o")  # Returns 3

# Count the number of occurrences of the substring "o" after index 5
count_o_after = string.count("o", 5)  # Returns 2

Keep in mind that count() is different from find() and index() methods, which return the index of the first occurrence of the substring or raise an error if the substring is not found. The count() method simply provides a count of occurrences and doesn’t give you the specific locations.
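
To make the non-overlapping behavior concrete, the sketch below compares count() with a hand-written counter that allows overlaps. The helper count_overlapping is a hypothetical name for illustration, not part of the standard library.

```python
text = "aaaa"

# count() skips past each match, so "aa" is found at indices 0 and 2 only.
non_overlapping = text.count("aa")  # 2


def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing matches to overlap."""
    if not sub:
        return 0
    total = 0
    start = text.find(sub)
    while start != -1:
        total += 1
        # Advance by one character instead of len(sub) so overlaps are seen.
        start = text.find(sub, start + 1)
    return total


overlapping = count_overlapping(text, "aa")  # 3 (indices 0, 1, 2)

# A count greater than zero is also a containment check, though "in" is more direct.
contains = text.count("aa") > 0  # True
```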