The `diff()` method in the pandas library for Python calculates the difference between consecutive rows of a DataFrame.
Here's the general syntax of `diff()`, shown with its default arguments: `df.diff(periods=1, axis=0)`, where `df` is a DataFrame.
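`periods` specifies how many rows back to look when taking the difference (the lag). When `periods=1`, the default, it calculates the difference between the current and previous row; with `periods=2`, each row is compared with the row two positions earlier. `axis` specifies the direction along which the difference is calculated: `axis=0` (the default) works down the rows, and `axis=1` works across the columns.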
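To make the two parameters concrete, here is a small sketch. The DataFrame `df`, its columns `A` and `B`, and all values are assumptions chosen for illustration.

```python
import pandas as pd

# Illustrative values only; `df` and its columns are hypothetical.
df = pd.DataFrame({"A": [1, 3, 6, 10], "B": [2, 5, 9, 14]})

lag1 = df.diff()            # defaults (periods=1, axis=0): row i minus row i-1
lag2 = df.diff(periods=2)   # row i minus row i-2
across = df.diff(axis=1)    # each column minus the column to its left

print(lag1)
print(lag2)
print(across)
```

Note that `periods=2` is a lag-2 difference, not a second-order difference; chaining the call, as in `df.diff().diff()`, is one way to get the latter.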
Here's an example: given a DataFrame with a numeric column `A`, assigning `df['A'].diff()` to a new column `diff_A` places each row's change next to its value.
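A minimal sketch of such an example follows. The values in `A` are made up for illustration, and the variable name `df` is an assumption.

```python
import pandas as pd

# Illustrative values; any numeric column works the same way.
df = pd.DataFrame({"A": [1, 3, 6, 10, 15]})

# Difference between each row of A and the row before it
df["diff_A"] = df["A"].diff()

# diff_A holds NaN, 2.0, 3.0, 4.0, 5.0
print(df)
```

The result is a float column even though `A` holds integers, because the leading `NaN` cannot be stored in an integer column.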
The `diff_A` column is the difference between consecutive rows of the `A` column. The first row contains a `NaN` value because there is no previous value to subtract from it.
Note that `diff()` can also be used with time series data: on a datetime column, it returns the time elapsed between consecutive timestamps as `Timedelta` values.
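As a sketch of the time series case, assuming a hypothetical log with a `timestamp` column and a `reading` column (names and values are invented for illustration):

```python
import pandas as pd

# Hypothetical event log; timestamps and readings are made up.
ts = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2024-01-01 09:00", "2024-01-01 09:15", "2024-01-01 10:00"]
        ),
        "reading": [10.0, 12.5, 11.0],
    }
)

# On a datetime column, diff() yields Timedelta values (NaT in the first row)
ts["elapsed"] = ts["timestamp"].diff()

# On a numeric column, it yields the change between consecutive observations
ts["change"] = ts["reading"].diff()

print(ts)
```

Here `elapsed` is 15 minutes for the second row and 45 minutes for the third, which makes irregular sampling intervals easy to spot.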