how to merge multiple dataframes with different numbers of rows into one giant dataframe using pd.merge in python

You can use the pd.merge function with how='outer' to combine multiple dataframes with different numbers of rows into a single dataframe. When no on argument is given, pd.merge joins on the columns the dataframes have in common.

Here is an example:

main.py
import pandas as pd

# create sample dataframes
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'c']})
df2 = pd.DataFrame({'A': [4, 5], 'B': ['d', 'e']})
df3 = pd.DataFrame({'A': [6, 7, 8, 9], 'B': ['f', 'g', 'h', 'i']})

# merge all dataframes into one
result = pd.merge(df1, df2, how='outer').merge(df3, how='outer')

# print final result
print(result)

This will result in the following output:

   A  B
0  1  a
1  2  b
2  3  c
3  4  d
4  5  e
5  6  f
6  7  g
7  8  h
8  9  i
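
Chaining .merge calls gets unwieldy as the number of dataframes grows. One possible approach, sketched here under the assumption that the dataframes are collected in a list (the name frames is hypothetical), is to fold the list pairwise with functools.reduce:

```python
from functools import reduce

import pandas as pd

# same sample dataframes as in the example above
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'c']})
df2 = pd.DataFrame({'A': [4, 5], 'B': ['d', 'e']})
df3 = pd.DataFrame({'A': [6, 7, 8, 9], 'B': ['f', 'g', 'h', 'i']})

# hypothetical list holding every dataframe to combine
frames = [df1, df2, df3]

# outer-merge the running result with the next dataframe in the list
result = reduce(lambda left, right: pd.merge(left, right, how='outer'), frames)

print(result)
```

This should produce the same nine rows as the chained version, and adding another dataframe only requires appending it to the list.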

Note that how='outer' keeps every row from every dataframe, even rows that have no match in the other dataframes. If you only want the rows that match in all dataframes, use how='inner' instead. In the example above no row appears in more than one dataframe, so an inner merge would return an empty dataframe.
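
The difference between the two modes is easier to see when the dataframes overlap only partially. The sketch below uses two made-up dataframes (the names left and right and the id, x and y columns are illustrative assumptions, not part of the example above) that share a key column but differ in row count:

```python
import pandas as pd

# hypothetical dataframes sharing an 'id' key, with different row counts
left = pd.DataFrame({'id': [1, 2, 3], 'x': ['a', 'b', 'c']})
right = pd.DataFrame({'id': [2, 3, 4, 5], 'y': [20, 30, 40, 50]})

# outer: keeps every id from both sides, filling the gaps with NaN
outer = pd.merge(left, right, on='id', how='outer')

# inner: keeps only the ids present on both sides
inner = pd.merge(left, right, on='id', how='inner')

print(outer)
print(inner)
```

Here the outer merge keeps all five distinct ids and leaves NaN where one side has no value, while the inner merge keeps only ids 2 and 3.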
