how to join two dataframes of different years on the same days in python

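Assuming your dataframes have columns date, value1 and value2, you can join them on the same calendar days by extracting the month and day from each date and passing those to merge as join keys, with the how argument set to "inner". Merging on the date column directly would find no matches, because the years differ. Here's an example: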

main.py
import pandas as pd

# create sample dataframes
df1 = pd.DataFrame({'date': ['2020-01-01', '2020-01-02', '2020-02-01'], 'value1': [10, 20, 30], 'value2': [100, 200, 300]})
df2 = pd.DataFrame({'date': ['2021-01-01', '2021-01-02', '2021-02-01'], 'value1': [15, 25, 35], 'value2': [150, 250, 350]})

# convert date column to datetime
df1['date'] = pd.to_datetime(df1['date'])
df2['date'] = pd.to_datetime(df2['date'])

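# add month and day key columns so rows can match regardless of year
left = df1.assign(month=df1['date'].dt.month, day=df1['date'].dt.day)
right = df2.assign(month=df2['date'].dt.month, day=df2['date'].dt.day)

# join on same day and month; suffixes mark which year each column came from
result = pd.merge(left, right, on=['month', 'day'], how='inner', suffixes=('_2020', '_2021'))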

print(result)

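This outputs a dataframe containing only the rows whose month and day appear in both dataframes, with the columns from each side placed next to each other. You can customize the join condition by changing the keys passed to the on argument, for example matching on the month alone.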
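As one possible variation on the join condition, the sketch below builds a single string key instead of two separate keys and uses an outer join so that days present in only one year are kept. The column name month_day, the suffixes, and the sample values are assumptions made for illustration, not part of the original example.

```python
import pandas as pd

# sample data: the last row of each dataframe has no partner in the other year
df1 = pd.DataFrame({'date': pd.to_datetime(['2020-01-01', '2020-01-02', '2020-02-01']),
                    'value1': [10, 20, 30]})
df2 = pd.DataFrame({'date': pd.to_datetime(['2021-01-01', '2021-01-02', '2021-03-01']),
                    'value1': [15, 25, 35]})

# hypothetical helper column: a zero-padded 'MM-DD' string used as the only join key
left = df1.assign(month_day=df1['date'].dt.strftime('%m-%d'))
right = df2.assign(month_day=df2['date'].dt.strftime('%m-%d'))

# how='outer' keeps unmatched days; the missing side is filled with NaN / NaT
result_outer = pd.merge(left, right, on='month_day', how='outer',
                        suffixes=('_2020', '_2021'))

print(result_outer)
```

Switching how back to 'inner' would drop the unmatched rows and leave only the days shared by both years.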

Note that if both dataframes share column names other than the join keys (here date, value1 and value2), pandas appends _x and _y to them by default. Use the suffixes argument to give the colliding columns clearer names.
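To see what the suffixes argument changes, here is a minimal sketch comparing the default column names with custom ones. The tiny dataframes and the '_2020' / '_2021' suffixes are assumptions chosen for illustration.

```python
import pandas as pd

# both dataframes carry a 'value1' column in addition to the join keys
df1 = pd.DataFrame({'month': [1, 1], 'day': [1, 2], 'value1': [10, 20]})
df2 = pd.DataFrame({'month': [1, 1], 'day': [1, 2], 'value1': [15, 25]})

# default behaviour: overlapping non-key columns get '_x' and '_y'
default_names = pd.merge(df1, df2, on=['month', 'day'])

# custom suffixes make it clear which year each column came from
custom_names = pd.merge(df1, df2, on=['month', 'day'], suffixes=('_2020', '_2021'))

print(list(default_names.columns))  # ['month', 'day', 'value1_x', 'value1_y']
print(list(custom_names.columns))   # ['month', 'day', 'value1_2020', 'value1_2021']
```

The key columns themselves are never suffixed, since they are shared between the two sides.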
