difference between multiple columns of two different dataframes in python

To compute the difference between multiple columns of two different dataframes in Python, first make sure the dataframes share at least one column that can serve as a join key. Then merge the two dataframes on that key with the merge() function from the pandas library. Columns that exist in both dataframes but are not part of the key come out of the merge with the suffixes _x (left dataframe) and _y (right dataframe), so you can subtract each pair to compute the difference.

Here's an example:

main.py
import pandas as pd

# create two dataframes
df1 = pd.DataFrame({'id': [1, 2, 3],
                    'col1': [10, 20, 30],
                    'col2': [100, 200, 300]})

df2 = pd.DataFrame({'id': [2, 3, 4],
                    'col1': [15, 25, 35],
                    'col2': [150, 250, 350]})

# merge on 'id' (inner join by default); overlapping columns get _x / _y suffixes
merged_df = pd.merge(df1, df2, on='id')

# compute difference between columns of interest
merged_df['diff_col1'] = merged_df['col1_x'] - merged_df['col1_y']
merged_df['diff_col2'] = merged_df['col2_x'] - merged_df['col2_y']

# select relevant columns
result_df = merged_df[['id', 'diff_col1', 'diff_col2']]

In this example, we create two dataframes, df1 and df2, that share the columns id, col1, and col2. We then merge them on id using pd.merge(). Because the default is an inner join, only the ids present in both dataframes (2 and 3) are kept, and the overlapping columns are renamed col1_x and col2_x (from df1) and col1_y and col2_y (from df2). Finally, we subtract each pair and select the id and difference columns to obtain the final result dataframe.
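When there are more than a couple of columns to compare, writing one subtraction line per column gets repetitive. The following is a minimal sketch of the same merge-then-subtract approach driven by a list of column names. The names value_cols, merged, and result, and the suffixes _df1 and _df2, are chosen here for illustration and are not part of the original example.

```python
import pandas as pd

# same sample data as the example above
df1 = pd.DataFrame({'id': [1, 2, 3],
                    'col1': [10, 20, 30],
                    'col2': [100, 200, 300]})

df2 = pd.DataFrame({'id': [2, 3, 4],
                    'col1': [15, 25, 35],
                    'col2': [150, 250, 350]})

# columns to compare (illustrative name; adjust to your data)
value_cols = ['col1', 'col2']

# explicit suffixes make it clear which dataframe each column came from
merged = df1.merge(df2, on='id', how='inner', suffixes=('_df1', '_df2'))

# one difference column per compared column: df1 value minus df2 value
for col in value_cols:
    merged[f'diff_{col}'] = merged[f'{col}_df1'] - merged[f'{col}_df2']

# keep only the key and the difference columns
result = merged[['id'] + [f'diff_{col}' for col in value_cols]]
print(result)
```

With the sample data, result contains rows for ids 2 and 3 only, with a difference of 5 in diff_col1 and 50 in diff_col2 for both rows. Passing how='outer' instead would also keep ids 1 and 4, with NaN in the difference columns because one side is missing.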
