drop columns if there is more than one non-missing value in that column, otherwise keep them, in a pandas dataframe in python

You can use the count method to count the non-missing values in each column of the dataframe. Then, you can use boolean indexing to select the columns with more than one non-missing value. Finally, you can use the drop method to drop those columns from the original dataframe. Columns with zero or one non-missing value are kept.

Here's an example:

main.py
import pandas as pd

# create example dataframe
df = pd.DataFrame({'A': [1, 2, 3, None, None],
                   'B': [None, 1, None, 2, None],
                   'C': [None, None, None, None, None],
                   'D': [1, None, 2, None, None],
                   'E': [None, None, None, None, 1]})

# count non-missing values for each column
count = df.count()

# perform boolean indexing to select columns with more than one non-missing value
cols_to_drop = count[count > 1].index

# drop the selected columns from the original dataframe
new_df = df.drop(columns=cols_to_drop)

print(new_df)

Output:

      C    E
0  None  NaN
1  None  NaN
2  None  NaN
3  None  NaN
4  None  1.0

In this example, columns A, B, and D have more than one non-missing value (3, 2, and 2 respectively), so they are dropped. Column C has no non-missing values and column E has exactly one, so both are kept in the new dataframe. Column C prints as None rather than NaN because a column built only from None values has object dtype.
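
The filtering asked for in the title can also be written as a selection of the columns to keep, instead of a drop of the others. The following is a minimal sketch, not the only way to do it. The helper name keep_sparse_columns and its max_non_missing parameter are hypothetical and are not part of pandas; only DataFrame.count and DataFrame.loc are real pandas calls.

```python
import pandas as pd


def keep_sparse_columns(df: pd.DataFrame, max_non_missing: int = 1) -> pd.DataFrame:
    """Keep only the columns with at most `max_non_missing` non-missing values.

    Hypothetical helper for illustration; not a pandas function.
    """
    # df.count() returns the number of non-missing values per column
    keep_mask = df.count() <= max_non_missing
    # a boolean mask on the column axis of .loc keeps only the True columns
    return df.loc[:, keep_mask]


df = pd.DataFrame({'A': [1, 2, 3, None, None],
                   'C': [None, None, None, None, None],
                   'E': [None, None, None, None, 1]})

result = keep_sparse_columns(df)
print(list(result.columns))
```

Raising max_non_missing loosens the rule: with max_non_missing=3, column A (three non-missing values) would be kept as well.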
