subset columns if they contain a string in python

To subset columns whose names contain a particular string, we can use the DataFrame filter() method and pass the string as the like argument. Here is an example:

main.py
import pandas as pd

# create a sample dataframe
data = {
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['New York', 'Boston', 'Chicago'],
    'job_title': ['Data Scientist', 'Software Engineer', 'Data Analyst']
}

df = pd.DataFrame(data)

# subset the columns whose names contain the string 'title'
df_filtered = df.filter(like='title')

print(df_filtered)

This will output:

main.py
           job_title
0     Data Scientist
1  Software Engineer
2       Data Analyst

In this example, the filter() method looks for columns whose names contain the string 'title' and returns a new DataFrame with only those columns (job_title, in this case). The match is made against column labels, not cell values, and it is case-sensitive, so like='data' would return no columns here even though job_title holds values such as 'Data Scientist'.
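
Because like compares against column labels and is case-sensitive, two variations are often useful: matching labels without regard to case, and selecting columns by what their cells contain. The following is a minimal sketch using the same sample data; the variable names (by_name_regex, by_name_mask, has_data, by_value) are illustrative choices, not part of the original snippet.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['New York', 'Boston', 'Chicago'],
    'job_title': ['Data Scientist', 'Software Engineer', 'Data Analyst'],
})

# Case-insensitive match on column names, using an inline regex flag.
by_name_regex = df.filter(regex='(?i)TITLE')

# Same idea as a boolean mask built from the column index.
by_name_mask = df.loc[:, df.columns.str.contains('title', case=False)]

# Match on cell values instead: keep columns where any value contains 'Data'.
# Values are cast to str so numeric columns such as 'age' can be searched too.
has_data = df.apply(lambda col: col.astype(str).str.contains('Data').any())
by_value = df.loc[:, has_data]

print(list(by_name_regex.columns))
print(list(by_name_mask.columns))
print(list(by_value.columns))
```

With this sample data, all three selections keep only the job_title column. The regex and mask forms are interchangeable for label matching; the value-based form scans every cell, so it may be noticeably slower on large DataFrames.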
