create a new label in a pandas dataframe based on a condition for 3 possible labels in python

To create a new label column in a pandas DataFrame based on conditions with three possible labels, you can use NumPy's np.where() function or the DataFrame.loc[] indexer.

Here's an example using np.where():

main.py
import pandas as pd
import numpy as np

# Create a DataFrame
data = {'col1': [1, 2, 3, 4, 5],
        'col2': [6, 7, 8, 9, 10]}
df = pd.DataFrame(data)

# Define the conditions
condition1 = (df['col1'] > 3)
condition2 = (df['col1'] <= 3)
condition3 = (df['col2'] > 8)

# Define the labels
label1 = 'High'
label2 = 'Low'
label3 = 'Medium'

# Create a new column based on the conditions and labels
df['label'] = np.where(condition1, label1, np.where(condition2, label2, np.where(condition3, label3, '')))

# Display the updated DataFrame
print(df)

Output:

   col1  col2  label
0     1     6    Low
1     2     7    Low
2     3     8    Low
3     4     9   High
4     5    10   High

In the above example, we define three conditions (condition1, condition2, condition3) and three labels (label1, label2, label3).

The nested np.where() calls evaluate the conditions in order, and the first condition that is true for a row determines its label. If none of the conditions are met, the row gets an empty string. Note that with this sample data condition1 and condition2 together cover every row, so condition3 is never reached and no row is labeled 'Medium'. For all three labels to appear, the conditions need to be ordered from most to least specific, or be mutually exclusive.
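As a possible alternative to nesting np.where() calls, np.select() takes a list of conditions and a matching list of labels, and uses the first condition that is true for each row. The sketch below is illustrative only: the thresholds (col1 >= 4 and col1 >= 2) are assumptions chosen so that all three labels appear, not values from the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                   'col2': [6, 7, 8, 9, 10]})

# Hypothetical thresholds, chosen for illustration only.
# Conditions are listed from most to least specific, because np.select
# picks the first condition that is true for each row.
conditions = [
    df['col1'] >= 4,
    df['col1'] >= 2,
]
labels = ['High', 'Medium']

# Rows that match no condition fall back to the default label.
# The default is a string so that it shares a dtype with the labels.
df['label'] = np.select(conditions, labels, default='Low')

print(df['label'].tolist())
# ['Low', 'Medium', 'Medium', 'High', 'High']
```

Keeping the conditions and labels in two parallel lists tends to stay readable as more labels are added, whereas nested np.where() calls grow one level deeper per label.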

Alternatively, here's an example using DataFrame.loc[]:

main.py
import pandas as pd

# Create a DataFrame
data = {'col1': [1, 2, 3, 4, 5],
        'col2': [6, 7, 8, 9, 10]}
df = pd.DataFrame(data)

# Define the conditions
condition1 = (df['col1'] > 3)
condition2 = (df['col1'] <= 3)
condition3 = (df['col2'] > 8)

# Define the labels
label1 = 'High'
label2 = 'Low'
label3 = 'Medium'

# Create a new column based on the conditions and labels
df.loc[condition1, 'label'] = label1
df.loc[condition2, 'label'] = label2
df.loc[condition3, 'label'] = label3

# Display the updated DataFrame
print(df)

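With this sample data the result is not the same as in the np.where() example. Rows 3 and 4 satisfy both condition1 and condition3, and because the assignments run in order, the later 'Medium' assignment overwrites 'High'. Rows 0 to 2 are labeled 'Low' and rows 3 and 4 are labeled 'Medium'.

Using DataFrame.loc[], each condition is applied as a separate assignment, and the label is written wherever that condition is true. When conditions overlap, the last assignment wins. Rows that match no condition are left as NaN.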
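Because later .loc[] assignments overwrite earlier ones, one way to get predictable results is to start from a default label and then apply the conditions from least to most specific. The following is a minimal sketch of that ordering; the thresholds are assumptions made for illustration and do not come from the example above.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                   'col2': [6, 7, 8, 9, 10]})

# Start with a default so that no row is left as NaN.
df['label'] = 'Low'

# Hypothetical thresholds, chosen for illustration only.
# Apply the broader condition first and the narrower one last,
# since the last matching assignment determines the final label.
df.loc[df['col1'] >= 2, 'label'] = 'Medium'
df.loc[df['col1'] >= 4, 'label'] = 'High'

print(df['label'].tolist())
# ['Low', 'Medium', 'Medium', 'High', 'High']
```

Note that this order is the reverse of the nested np.where() approach, where the first matching condition wins.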

Note: The np.where() example needs both pandas and numpy to be imported, while the DataFrame.loc[] example needs only pandas.
