how to select only the columns that contain missing values in pandas (python)

You can use the isnull() method to flag missing values (NaN or None) in a pandas DataFrame. You can then use the resulting boolean values to select only the columns that contain at least one missing value.

Here's an example code snippet:

main.py
import pandas as pd

# create sample data with missing values
data = {'col1': [1, 2, None, 4], 
        'col2': [None, 6, 7, 8],
        'col3': [9, None, None, 12]}
df = pd.DataFrame(data)

# select only columns with missing values
missing_cols = df.columns[df.isnull().any()]
df_missing = df[missing_cols]

The isnull() method returns a DataFrame of boolean values with the same shape as the original DataFrame, where True marks a missing value. The any() method then reduces that result column by column (its default axis), returning a boolean Series that is True for every column containing at least one missing value. Indexing df.columns with this Series yields the names of the columns that have missing values.
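To make the intermediate steps concrete, here is a minimal sketch that inspects each one separately. It uses a slightly different sample frame than the one above: col2 is deliberately complete (an assumption for illustration), so the filter has something to exclude. The variable names mask and has_missing are also illustrative.

```python
import pandas as pd

# Illustrative data: col2 has no missing values here, unlike the sample above.
df = pd.DataFrame({
    "col1": [1, 2, None, 4],
    "col2": [5, 6, 7, 8],
    "col3": [9, None, None, 12],
})

mask = df.isnull()                      # boolean DataFrame, same shape as df
has_missing = mask.any()                # boolean Series, one entry per column
missing_cols = df.columns[has_missing]  # Index of column names with missing values

print(mask.shape == df.shape)   # True
print(has_missing.tolist())     # [True, False, True]
print(list(missing_cols))       # ['col1', 'col3']
```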

Finally, you can use these column names to select the relevant columns using DataFrame indexing. The resulting DataFrame df_missing contains only the columns that have at least one missing value. In the sample data above, every column has a missing value, so df_missing contains all three columns.
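The same selection can also be written in one step by passing the boolean Series straight to df.loc. The sketch below wraps it in a helper named columns_with_missing, which is a hypothetical name used only for illustration, and runs it on sample frames that are assumptions rather than the data above.

```python
import pandas as pd


def columns_with_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep only columns with at least one missing value."""
    # df.isnull().any() is a boolean Series indexed by column name;
    # df.loc[:, <boolean Series>] keeps the columns where it is True.
    return df.loc[:, df.isnull().any()]


# Illustrative data: col2 is complete, so it should be dropped.
df = pd.DataFrame({
    "col1": [1, 2, None, 4],
    "col2": [5, 6, 7, 8],
    "col3": [9, None, None, 12],
})

df_missing = columns_with_missing(df)
print(list(df_missing.columns))  # ['col1', 'col3']

# A frame with no missing values yields an empty selection (all rows, zero columns).
complete = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(columns_with_missing(complete).shape)  # (2, 0)
```

Whether to use df.loc or the two-step version with df.columns is mostly a matter of taste: the two-step version is handy when you also need the list of column names on its own.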
