for each column in a pandas dataframe, take the first value that's not nan in python

You can use df.apply to run a custom function on each column; the function returns the first value in that column that is not NaN. Here is an example:

main.py
import pandas as pd
import numpy as np

# Create example dataframe
df = pd.DataFrame({
    'col1': [np.nan, np.nan, 1, 2],
    'col2': [3, np.nan, np.nan, 4],
    'col3': [np.nan, 5, np.nan, np.nan]
})

# Define custom function to get first non-NaN value
# (raises IndexError if the column contains only NaN values)
def first_non_nan(col):
    return col.dropna().iloc[0]

# Apply function to each column
first_vals = df.apply(first_non_nan)

# Output results
print(first_vals)

This will output:

col1    1.0
col2    3.0
col3    5.0
dtype: float64
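
The function above assumes every column has at least one non-NaN value. As a rough sketch of how an all-NaN column could be handled, the variant below returns NaN in that case, and also shows a vectorized alternative that back-fills each column and reads the first row. The name first_non_nan_safe and the extra column col4 are made up for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Same example data, plus a hypothetical all-NaN column (col4)
df = pd.DataFrame({
    'col1': [np.nan, np.nan, 1, 2],
    'col2': [3, np.nan, np.nan, 4],
    'col3': [np.nan, 5, np.nan, np.nan],
    'col4': [np.nan, np.nan, np.nan, np.nan],
})

# Hypothetical helper: first non-NaN value, or NaN if the column has none
def first_non_nan_safe(col):
    valid = col.dropna()
    return valid.iloc[0] if not valid.empty else np.nan

safe_vals = df.apply(first_non_nan_safe)

# Vectorized alternative: back-fill each column, then take the first row
bfill_vals = df.bfill().iloc[0]

print(safe_vals)
print(bfill_vals)
```

Both approaches should give 1.0, 3.0 and 5.0 for the first three columns and NaN for col4. The back-fill version avoids a Python-level function call per column, which may matter on wide dataframes.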