subset pandas dataframe based on str.contains in python

To subset a pandas DataFrame based on a partial string match, use the str.contains() method to build a boolean mask:

main.py
subset_df = df[df['column_name'].str.contains('partial_string')]

Here, df is the original DataFrame, "column_name" is the column to search, and "partial_string" is the substring to look for. The resulting subset_df contains only the rows where that column contains "partial_string".
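A column with missing values can complicate this: depending on the pandas version and the column dtype, str.contains() may return a missing value instead of True or False for those rows, and such a mask may not be usable for filtering. The sketch below passes na=False so that missing values count as non-matches. The sample data, including the "qty" column, is hypothetical and only there to make the example runnable.

```python
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only.
df = pd.DataFrame({
    "column_name": ["apple pie", "banana", None, "pineapple"],
    "qty": [1, 2, 3, 4],
})

# na=False treats missing values as non-matches, so the mask is purely boolean.
mask = df["column_name"].str.contains("apple", na=False)

# Keep only the rows where the mask is True.
subset_df = df[mask]
print(subset_df)
```

Building the mask as a separate variable also makes it easy to inspect or to combine with other conditions using & and |.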

str.contains() also accepts optional parameters. Pass case=False for a case-insensitive match. The regex parameter controls how the pattern is interpreted: by default (regex=True) it is treated as a regular expression, so pass regex=False if you want it matched as a literal substring.

main.py
subset_df = df[df['column_name'].str.contains('partial_string', case=False, regex=True)]
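The difference between regex and literal matching shows up as soon as the search string contains a character with special meaning in regular expressions, such as ".". The following is a minimal sketch on made-up data, not a definitive recipe; the values in "column_name" are assumptions chosen to make the difference visible.

```python
import pandas as pd

# Hypothetical sample data chosen so that "." behaves differently per mode.
df = pd.DataFrame({"column_name": ["v1.0", "v1x0", "V1.0 beta", "v2.0"]})

# Default (regex=True): "." matches any character, so "v1x0" matches too.
regex_df = df[df["column_name"].str.contains("1.0")]

# regex=False: the pattern is a literal substring, so "." is just a dot.
literal_df = df[df["column_name"].str.contains("1.0", regex=False)]

# case=False ignores case; combined here with a literal match.
ci_df = df[df["column_name"].str.contains("v1.0", case=False, regex=False)]

print(regex_df, literal_df, ci_df, sep="\n\n")
```

If you need regex matching but the search string comes from user input, escaping it first with re.escape() from the standard library is one way to keep special characters from being interpreted.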
