You can use the `duplicated` and `mask` methods in pandas to achieve this.
Assuming you have a DataFrame called `df`, you can keep only the first occurrence of each value in each column using the following code:
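The original listing is not reproduced here, so the following is only a minimal sketch of the approach described, not necessarily the author's exact code. The sample data and the names `is_repeat` and `first_only` are assumptions made for illustration. Because `DataFrame.duplicated` compares whole rows, the sketch applies `Series.duplicated` to each column separately.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "a": [1, 1, 2, 2],
    "b": ["x", "y", "x", "z"],
})

# Build a boolean frame column by column: True marks a value that
# already appeared earlier in the same column (keep="first" is the default).
is_repeat = df.apply(lambda col: col.duplicated())

# mask() replaces entries where the condition is True; the default
# replacement value is NaN.
first_only = df.mask(is_repeat)

print(first_only)
```

Note that a column of integers is typically upcast to floating point once `NaN` is introduced.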
This will replace all duplicated values (excluding the first occurrence) with `NaN`. The `mask` method replaces values where the condition is True with a given value (in this case, `NaN`).
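Alternatively, if you want to replace all duplicated values (including the first occurrence) with `NaN`, you can use the `where` method with a negated condition instead. Note that `duplicated` flags the first occurrence only when it is called with `keep=False`: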
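As above, the original listing is not available, so this is a hedged sketch of the `where` variant rather than the author's exact code. The sample data and the names `is_repeated` and `unique_only` are assumptions. It relies on `keep=False` so that every occurrence of a repeated value is flagged, including the first.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "a": [1, 1, 2, 3],
    "b": ["x", "y", "x", "z"],
})

# keep=False marks every occurrence of a repeated value as True,
# not just the second and later ones.
is_repeated = df.apply(lambda col: col.duplicated(keep=False))

# where() keeps entries where the condition is True and replaces the
# rest with NaN, so the condition is negated with ~.
unique_only = df.where(~is_repeated)

print(unique_only)
```

Writing `df.mask(is_repeated)` would give the same result; the choice between `mask` and `where` only determines whether the condition needs to be negated.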
This will replace all duplicated values with `NaN`. The `~` operator negates the boolean condition element-wise (True becomes False and vice versa), so the `where` method replaces values where the negated condition is False with a given value (`NaN`).