Group by then summarize in R

To group a dataset by one or more columns and apply a summary function to each group in R, you can use the dplyr or data.table packages. Here's an example using the dplyr package:


main.r
library(dplyr)
data %>%
  group_by(column_name) %>%
  summarise(new_column = mean(column_to_summarise))

This code groups the data frame named data by column_name and then calculates the mean of column_to_summarise for each group using the summarise() function. The result is a new data frame (a tibble) with one row per group and two columns: the grouping column column_name and the new column new_column.
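To make the pattern concrete, here is a minimal runnable sketch on the built-in mtcars data set. The data set and the names by_cyl, by_cyl_gear, mean_mpg and n_cars are illustrative assumptions, not part of the example above, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# mtcars ships with base R: cyl is the cylinder count, mpg is miles per gallon
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())

# Grouping by more than one column: pass several names to group_by().
# .groups = "drop" returns an ungrouped result instead of one still grouped by cyl.
by_cyl_gear <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(mean_mpg = mean(mpg), .groups = "drop")

print(by_cyl)
print(by_cyl_gear)
```

by_cyl has three rows, one per value of cyl (4, 6 and 8), and by_cyl_gear has one row for each combination of cyl and gear that occurs in the data.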

If using the data.table package, the code would look like this:


main.r
library(data.table)
data <- as.data.table(data)
data[, .(new_column = mean(column_to_summarise)), by=.(column_name)]

In this code, .() is data.table's alias for list(). In the second position inside the brackets it builds the named summary column, and in the by argument it lists the columns to group by, here column_name.
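The same idea can be tried on the built-in mtcars data set. This is only a sketch: the data set and the names dt, by_cyl, by_cyl_gear, mean_mpg and n_cars are assumptions made for illustration.

```r
library(data.table)

# Copy mtcars (ships with base R) into a data.table; mtcars itself is left untouched
dt <- as.data.table(mtcars)

# j = .(...) builds the summary columns; by = .(...) names the grouping columns.
# .N is data.table's built-in row count for the current group.
by_cyl <- dt[, .(mean_mpg = mean(mpg), n_cars = .N), by = .(cyl)]

# keyby sorts the result by the grouping columns; by keeps first-appearance order
by_cyl_gear <- dt[, .(mean_mpg = mean(mpg)), keyby = .(cyl, gear)]

print(by_cyl)
print(by_cyl_gear)
```

With by, the groups appear in the order they are first seen in the data, so use keyby (or sort afterwards) when a sorted result is needed.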

data.table also accepts a summary expression that is not named:


main.r
library(data.table)
data <- as.data.table(data)
data[, mean(column_to_summarise), by=.(column_name)]

This code is very similar to the previous example, but the summary expression is not wrapped in .() and is not given a name, so data.table calls the result column V1. Both versions use the [] operator of the data.table to compute the summary within each group.
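As a sketch of what the unnamed form returns, assuming the data.table package is the one loaded (as.data.table() belongs to it) and using the built-in mtcars data as stand-in data, the two variants can be compared side by side. The names dt, unnamed, named and mean_mpg are illustrative only.

```r
library(data.table)

dt <- as.data.table(mtcars)

# Unnamed expression in j: data.table calls the result column V1
unnamed <- dt[, mean(mpg), by = cyl]

# Wrapping the expression in .() with a name sets the column name directly
named <- dt[, .(mean_mpg = mean(mpg)), by = cyl]

print(names(unnamed))
print(names(named))
```

Both calls compute the same per-group means; only the name of the result column differs.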
