Create a new column in pandas based on a condition in Python

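To create a new column in pandas based on a condition, you can either call the apply method on a column with a lambda function, which evaluates the condition one value at a time, or use the numpy where function, which evaluates it on the whole column at once.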

Consider the following example DataFrame:

main.py
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40]}
df = pd.DataFrame(data)

Suppose you want to add a new column called "Category" in which people aged 30 or above are labeled "Senior" and everyone else is labeled "Junior".

Using apply and a lambda function:

main.py
df['Category'] = df['Age'].apply(lambda x: 'Senior' if x >= 30 else 'Junior')

Using numpy where:

main.py
import numpy as np

df['Category'] = np.where(df['Age'] >= 30, 'Senior', 'Junior')

Both methods will produce the same result:

main.py
      Name  Age Category
0    Alice   25   Junior
1      Bob   30   Senior
2  Charlie   35   Senior
3    David   40   Senior
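
One way to confirm that the two approaches agree is to build the column both ways and compare the values. The following is a minimal sketch that rebuilds the example DataFrame; the names via_apply, via_where and same are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
})

# Element-wise: the lambda runs once per value in the Age column.
via_apply = df["Age"].apply(lambda x: "Senior" if x >= 30 else "Junior")

# Vectorized: the condition is evaluated on the whole column at once.
# np.where returns a NumPy array, so it is wrapped in a Series here
# (hypothetical helper variable, not part of the original example).
via_where = pd.Series(np.where(df["Age"] >= 30, "Senior", "Junior"), index=df.index)

# Compare as plain lists so that dtype differences do not matter.
same = via_apply.tolist() == via_where.tolist()
print(same)
```

On larger DataFrames the numpy where version is usually the faster of the two, because it avoids calling a Python function for every row.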

Note that the condition in the lambda function can be any expression that fits your requirements, and each branch can return whatever category value you need.
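
When the condition has more than two outcomes, a named function passed to apply is often easier to read than a nested lambda. The sketch below assumes a third, hypothetical category "Veteran" for ages 40 and above; the function name categorize and the thresholds are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
})


def categorize(age):
    """Map an age to a category label (hypothetical thresholds)."""
    if age >= 40:
        return "Veteran"  # assumed extra category, for illustration only
    if age >= 30:
        return "Senior"
    return "Junior"


# apply calls categorize once for each value in the Age column.
df["Category"] = df["Age"].apply(categorize)
print(df)
```

The checks run from the highest threshold down, so each age falls into the first branch it satisfies.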
