run a difference in differences analysis in r

To run a difference-in-differences (DiD) analysis in R, you can use the plm package, which fits linear models for panel data. The approach compares the change in outcomes from before to after treatment in a treatment group with the change over the same period in a control group that did not receive the intervention. Here are the basic steps to run a DiD analysis:

  1. Load the necessary packages:
main.r
library(plm)
library(tidyverse)
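  2. Load your data, making sure treatment status is coded as a 0/1 indicator and the time periods are coded consistently (in a two-period design, for example, 0 before the intervention and 1 after).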
main.r
data <- read.csv("data.csv")
  3. Declare the panel structure with pdata.frame() by naming the panel ID variable and the time variable, then fit the model with plm(), specifying the outcome variable and the interaction between treatment status and time period.
main.r
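# Declare the panel structure: unit identifier first, then the time index
paneldata <- pdata.frame(data, index = c("panel_id", "time_period"))
# "within" adds unit fixed effects; the treatment:time_period interaction carries the DiD estimate
DiD_model <- plm(outcome ~ treatment * time_period, data = paneldata, model = "within")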

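The term treatment * time_period expands to the two main effects plus their interaction. The coefficient on the interaction is the DiD estimate: how much more the outcome changed over time in the treatment group than in the control group. With model = "within", a treatment indicator that never changes within a unit is absorbed by the unit fixed effects, so only the time and interaction coefficients are reported.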
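
To make the role of the interaction term concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data frame sim, the number of units, and a treatment effect of 2 built into the simulation. It is not the analysis of any real dataset.

```r
library(plm)

set.seed(42)

# Simulated two-period panel: 200 units, the first 100 treated (hypothetical data).
n_units <- 200
sim <- expand.grid(panel_id = 1:n_units, time_period = c(0, 1))
sim$treatment <- as.numeric(sim$panel_id <= 100)

unit_effect <- rnorm(n_units)   # time-invariant differences between units
true_effect <- 2                # treatment effect built into the simulation

sim$outcome <- unit_effect[sim$panel_id] +
  0.5 * sim$time_period +                          # common time trend
  true_effect * sim$treatment * sim$time_period +  # effect only for treated units after treatment
  rnorm(nrow(sim), sd = 0.5)

# Same two calls as in the steps above, applied to the simulated data.
sim_panel <- pdata.frame(sim, index = c("panel_id", "time_period"))
sim_model <- plm(outcome ~ treatment * time_period,
                 data = sim_panel, model = "within")

# Pick out the interaction coefficient, which is the DiD estimate.
did_estimate <- coef(sim_model)[grep(":", names(coef(sim_model)))]
print(did_estimate)
```

Because the effect is known in a simulation, the printed estimate can be compared against true_effect; with real data there is no such benchmark.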

  4. You can now read the DiD estimate, the coefficient on the interaction term, from the model summary:
main.r
summary(DiD_model)
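
To see what the interaction coefficient measures, the simplest two-group, two-period case can be worked out by hand: the DiD estimate is the before/after change in the treatment group's mean minus the same change in the control group's mean. The sketch below uses base R only and made-up numbers; the toy data frame is hypothetical.

```r
# Toy data: two groups observed in two periods (made-up numbers).
toy <- data.frame(
  treatment   = c(0, 0, 0, 0, 1, 1, 1, 1),
  time_period = c(0, 0, 1, 1, 0, 0, 1, 1),
  outcome     = c(10, 12, 13, 15, 11, 13, 18, 20)
)

# Mean outcome in each treatment-by-period cell.
cell_means <- with(toy, tapply(outcome, list(treatment, time_period), mean))

# Change over time within each group, then the difference between those changes.
change_control <- cell_means["0", "1"] - cell_means["0", "0"]
change_treated <- cell_means["1", "1"] - cell_means["1", "0"]
did_by_hand <- change_treated - change_control

# The interaction coefficient from a plain OLS fit gives the same number.
ols_fit <- lm(outcome ~ treatment * time_period, data = toy)
did_ols <- coef(ols_fit)[["treatment:time_period"]]

print(c(by_hand = did_by_hand, ols = did_ols))
```

For these numbers the control group's mean rises by 3 and the treatment group's by 7, so both calculations give a DiD estimate of 4.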

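A DiD estimate is only credible if certain assumptions hold, most importantly parallel trends: without the intervention, the treatment and control groups would have followed the same trend. It is therefore recommended to plot each group's average outcome over time, before and after treatment, and check whether the groups already diverge before the intervention or whether a break at some other point suggests another explanation for your results.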
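
One possible way to draw such a plot is sketched below with ggplot2, which is loaded as part of the tidyverse. The simulated data frame trend_data, the six periods, and the treatment start between periods 3 and 4 are all assumptions made for illustration; substitute your own data and intervention date.

```r
library(ggplot2)

set.seed(1)

# Hypothetical panel: 100 units over 6 periods, treatment starting in period 4.
trend_data <- expand.grid(panel_id = 1:100, time_period = 1:6)
trend_data$treatment <- as.numeric(trend_data$panel_id <= 50)
trend_data$outcome <- 1 + 0.5 * trend_data$time_period +
  1.0 * trend_data$treatment +
  2 * trend_data$treatment * (trend_data$time_period >= 4) +
  rnorm(nrow(trend_data), sd = 0.5)

# Average outcome for each group in each period.
group_means <- aggregate(outcome ~ treatment + time_period,
                         data = trend_data, FUN = mean)
group_means$group <- ifelse(group_means$treatment == 1, "Treatment", "Control")

# One line per group; the dashed line separates pre- and post-treatment periods.
trend_plot <- ggplot(group_means,
                     aes(x = time_period, y = outcome, colour = group)) +
  geom_line() +
  geom_point() +
  geom_vline(xintercept = 3.5, linetype = "dashed") +
  labs(x = "Time period", y = "Mean outcome", colour = "Group")

print(trend_plot)
```

Roughly parallel lines to the left of the dashed line are consistent with the parallel-trends assumption, although a plot like this cannot prove it.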
