subset columns of a dataframe if they contain a string in python

To subset columns of a dataframe based on whether they contain a string in Python, you first need to import the pandas library.

Then you can use the .filter() method with its like argument, which keeps every column whose name contains the given string. Note that .filter() selects labels through its items, like, or regex arguments and does not accept a lambda function. Here's an example:

main.py
import pandas as pd

# create a sample dataframe
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'Address': ['123 Main St', '456 Elm St', '789 Oak St']}
df = pd.DataFrame(data)

# subset columns whose names contain the string 'dd'
df_subset = df.filter(like='dd', axis=1)

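In this example, the resulting df_subset dataframe would only contain the Address column, since 'Address' is the only column name that contains the string 'dd'. The match is case-sensitive and is made against the column names, not the values stored in the columns.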

Note that the axis=1 argument specifies that we are filtering columns, not rows.
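
If you need more control than a plain substring match on column names, a boolean mask built from df.columns is a common alternative. The sketch below reuses the sample dataframe from above; the variable names (mask, df_mask, df_ci, df_comp) are chosen here for illustration only.

```python
import pandas as pd

# same sample dataframe as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Address': ['123 Main St', '456 Elm St', '789 Oak St'],
})

# boolean mask over the column names; regex=False treats 'dd' as a literal substring
mask = df.columns.str.contains('dd', regex=False)
df_mask = df.loc[:, mask]

# case-insensitive variant: 'AD' matches 'Address' only when case is ignored
mask_ci = df.columns.str.contains('AD', case=False, regex=False)
df_ci = df.loc[:, mask_ci]

# a list comprehension works for any custom test on the column name
cols = [c for c in df.columns if 'dd' in c]
df_comp = df[cols]

print(list(df_mask.columns))
print(list(df_ci.columns))
print(list(df_comp.columns))
```

The mask approach is useful when you want to combine several conditions (for example with & or ~), while the list comprehension is the closest equivalent to testing each column name with a function.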
