how to subset columns whose names contain a string using a list comprehension in python

You can use a list comprehension to select the columns of a pandas DataFrame whose names contain a given substring. Here's an example:

main.py
# Import the pandas library
import pandas as pd

# Create a sample dataframe
df = pd.DataFrame({
    'fruit': ['apple', 'banana', 'orange'],
    'color': ['red', 'yellow', 'orange'],
    'shape': ['round', 'curved', 'round']
})

# Use a list comprehension to keep only the columns whose names contain the substring 'ru'
new_df = df[[col for col in df.columns if 'ru' in col]]

# Print the new dataframe
print(new_df)

This will output:

    fruit
0   apple
1  banana
2  orange

In this example, new_df contains only the columns whose names contain the string 'ru'. Here that is just 'fruit', since neither 'color' nor 'shape' contains 'ru'.
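
To check which columns match, it can help to look at the list the comprehension produces before using it to index the dataframe. The sketch below does that and compares the result with pandas' built-in DataFrame.filter(like=...), which also selects columns by substring. The variable names matching and via_filter are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['apple', 'banana', 'orange'],
    'color': ['red', 'yellow', 'orange'],
    'shape': ['round', 'curved', 'round']
})

# Column names that contain the substring 'ru'
matching = [col for col in df.columns if 'ru' in col]
print(matching)  # ['fruit']

# Built-in alternative: keep columns whose label contains 'ru'
via_filter = df.filter(like='ru', axis=1)
print(list(via_filter.columns))  # ['fruit']
```

Both approaches select the same single column here. The list comprehension is the more flexible of the two, because the condition can be any Python expression.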

You can modify the string in the list comprehension to look for different patterns in the column names.
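
A few variations on the condition are sketched below, assuming every column name is a string (the in test raises a TypeError on integer labels). The sample column names and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical dataframe with mixed-case and suffixed column names
df = pd.DataFrame({
    'Fruit_name': ['apple', 'banana'],
    'fruit_color': ['red', 'yellow'],
    'shape': ['round', 'curved'],
    'price_usd': [1.2, 0.5],
})

# Case-insensitive match: lowercase the column name before testing
ci_cols = [col for col in df.columns if 'fruit' in col.lower()]

# Match any of several substrings
wanted = ['color', 'price']
any_cols = [col for col in df.columns if any(s in col for s in wanted)]

# Match on a prefix or a suffix instead of anywhere in the name
prefix_cols = [col for col in df.columns if col.startswith('fruit')]
suffix_cols = [col for col in df.columns if col.endswith('_usd')]

print(ci_cols)      # ['Fruit_name', 'fruit_color']
print(any_cols)     # ['fruit_color', 'price_usd']
print(prefix_cols)  # ['fruit_color']
print(suffix_cols)  # ['price_usd']
```

Each list can be passed to df[...] exactly as in the example above. If no column name matches, the list is empty and df[[]] returns a dataframe with no columns.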
