how to make a shapiro test on some columns from a data table in r

To perform a Shapiro-Wilk test on some columns of a data.table in R, we can use the shapiro.test() function from the stats package, which ships with R and is attached by default. Here's an example:

main.r
# Load required packages
library(data.table)

# Create example data (seed set so the random draws are reproducible)
set.seed(42)
dt <- data.table(x = rnorm(100), y = rnorm(100), z = rnorm(100))

# Columns to test
cols <- c("x", "y")

# Function to perform Shapiro-Wilk test on selected columns
shapiro_dt <- function(dt, cols) {
  # Perform the Shapiro-Wilk test on each column
  res <- lapply(dt[, ..cols], shapiro.test)
  
  # Combine the test results into a data.table
  test <- data.table(Column = cols,
                     W = vapply(res, function(x) unname(x$statistic), numeric(1)),
                     p_value = vapply(res, function(x) x$p.value, numeric(1)))
  
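  # Return the results (auto-printed when the function is called at the top level,
  # so an explicit print() here would show the table twice)
  return(test)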
}

# Apply function to test columns x and y
shapiro_dt(dt, cols)

In this example, we create a data.table with three columns, x, y, and z, and list the columns to test in cols (here x and y). The shapiro_dt() function takes the data.table and the column names, applies shapiro.test() to each selected column, and returns the results as a new data.table with one row per tested column.
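
The same idea can also be written without a helper function. The following is only a minimal sketch of one alternative, assuming the same dt and cols as in the example; the seed value is an arbitrary choice for reproducibility. It builds one small data.table per column and stacks them with rbindlist():

```r
library(data.table)

set.seed(1)  # arbitrary seed, assumed here only for reproducibility
dt <- data.table(x = rnorm(100), y = rnorm(100), z = rnorm(100))
cols <- c("x", "y")

# Test each selected column and collect one result row per column
res <- rbindlist(lapply(cols, function(col) {
  s <- shapiro.test(dt[[col]])
  data.table(Column = col,
             W = unname(s$statistic),  # drop the "W" name from the statistic
             p_value = s$p.value)
}))

print(res)
```

Extracting each column with dt[[col]] keeps the column name available inside the loop, so the Column field does not depend on the order of a separate vector.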

Calling shapiro_dt(dt, cols) shows the Shapiro-Wilk statistic W and the p-value p_value for columns x and y. The null hypothesis of the test is that the data are normally distributed, so a small p-value (for example, below 0.05) is evidence against normality, while a large p-value means only that the test found no evidence of non-normality. Note that shapiro.test() requires between 3 and 5000 non-missing values per column.
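
To act on the p-values, one option is to add a logical flag at a chosen significance level. This is a hedged sketch, not part of the original example: the threshold alpha = 0.05, the column name reject_normality, and the exponential column y are all assumptions made for illustration.

```r
library(data.table)

set.seed(1)  # arbitrary seed, assumed here only for reproducibility
# x is drawn from a normal distribution; y is deliberately non-normal
dt <- data.table(x = rnorm(100), y = rexp(100))

alpha <- 0.05  # assumed significance level

# One row per column, with a flag for p-values below alpha
res <- rbindlist(lapply(names(dt), function(col) {
  s <- shapiro.test(dt[[col]])
  data.table(Column = col,
             W = unname(s$statistic),
             p_value = s$p.value,
             reject_normality = s$p.value < alpha)
}))

print(res)
```

A TRUE flag means the test rejects normality at the chosen level; a FALSE flag does not prove the column is normal. When many columns are tested at once, a multiple-testing correction such as p.adjust() may be worth considering.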
