Create a variable based on a condition using pandas in Python

To create a variable based on a condition using pandas in Python, you can use the np.where() function from NumPy, which works element-wise on pandas columns.

Here is an example of how to create a new variable new_var based on a condition in a pandas DataFrame df:

main.py
import pandas as pd
import numpy as np

# Create a DataFrame
df = pd.DataFrame({'A': [10, 20, 30, 40, 50]})

# Create a new variable based on a condition
df['new_var'] = np.where(df['A'] > 30, 'Greater than 30', 'Less than or equal to 30')

# Print the DataFrame
print(df)

In this example, the condition is df['A'] > 30, which is evaluated for each row. Where it is True, new_var gets the value 'Greater than 30'; where it is False, new_var gets 'Less than or equal to 30'.

The resulting DataFrame has the new column new_var alongside the original column A.

Output:

    A                   new_var
0  10  Less than or equal to 30
1  20  Less than or equal to 30
2  30  Less than or equal to 30
3  40           Greater than 30
4  50           Greater than 30
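
To see why the rows split this way, it can help to look at the condition on its own. The sketch below rebuilds the same DataFrame and inspects the boolean mask before passing it to np.where(). The names mask and labels are illustrative only and are not part of the original example.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [10, 20, 30, 40, 50]})

# The condition by itself is a boolean Series with one value per row
# ("mask" is a hypothetical name used only for this illustration)
mask = df['A'] > 30
print(mask.tolist())  # [False, False, False, True, True]

# np.where takes the first value where mask is True, otherwise the second
# ("labels" is likewise a hypothetical name)
labels = np.where(mask, 'Greater than 30', 'Less than or equal to 30')
print(labels.tolist())
```

Note that the row with A equal to 30 is False, because the comparison is strictly greater than.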

np.where() assigns one of two values to each row depending on a condition. The condition can be any boolean expression over the DataFrame's columns, so you can adapt it to your specific requirements.
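
As a rough sketch of what customizing the condition can look like, the example below combines two comparisons with &, and then uses np.select() for a case with more than two outcomes. The column names in_range and band, and the thresholds, are assumptions made up for illustration. np.select() is a different NumPy function from the np.where() call shown above.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [10, 20, 30, 40, 50]})

# Combine conditions with & (and) or | (or); wrap each comparison in parentheses
# ("in_range" is a hypothetical column name)
df['in_range'] = np.where((df['A'] >= 20) & (df['A'] <= 40), 'yes', 'no')

# np.select handles more than two outcomes; the first matching condition wins
# ("band" and the thresholds are hypothetical)
conditions = [df['A'] < 20, df['A'] <= 30]
choices = ['low', 'medium']
df['band'] = np.select(conditions, choices, default='high')

print(df)
```

Passing a string default to np.select() keeps it the same type as the string choices.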
