how to remove duplicates in r

To remove duplicates in R, you can use the base unique() function. Given a vector, it returns the distinct elements in order of first appearance, so any repeated values are dropped. Given a data frame, it returns the data frame with duplicate rows removed.
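
As a minimal sketch of the behavior described above, the following uses only base R. The vector and data frame are made-up illustration data, and duplicated() is shown alongside unique() as a related base function that flags the repeated entries instead of dropping them.

```r
# Base R only; the values below are invented for illustration.
v <- c(3, 1, 3, 2, 1)
unique_v <- unique(v)         # keeps the first occurrence of each value, in order

df <- data.frame(x = c(1, 1, 2), y = c("a", "a", "b"))
unique_df <- unique(df)       # drops rows that repeat an earlier row
dup_flags <- duplicated(df)   # TRUE for each row that repeats an earlier row

print(unique_v)
print(unique_df)
print(dup_flags)
```

Note that unique() on a data frame keeps the original row names of the rows it retains, so the result may not be numbered consecutively.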

If you want to remove duplicate rows from a data frame as part of a dplyr workflow, you can use the distinct() function from the dplyr package.

Here is an example:

# create a dataframe with duplicates
df <- data.frame(x = c(1, 1, 2, 3, 3), y = c("a", "a", "b", "c", "c"))

# remove duplicates using the distinct() function
library(dplyr)
df_no_dup <- distinct(df)

# display the result
df_no_dup

This will return a data frame with only the unique rows:

  x y
1 1 a
2 2 b
3 3 c
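
The example above treats a row as a duplicate only when every column matches. If you instead want one row per value of a single column, a possible sketch is to pass that column to distinct() together with .keep_all = TRUE. The data frame df2 and the choice of column x are assumptions made for illustration.

```r
library(dplyr)

# Invented data: rows 1 and 2 share the same x but differ in y.
df2 <- data.frame(x = c(1, 1, 2), y = c("a", "b", "c"))

# Deduplicate on x only; .keep_all = TRUE keeps the remaining columns,
# taking them from the first row found for each x value.
df2_by_x <- distinct(df2, x, .keep_all = TRUE)

df2_by_x
```

Without .keep_all = TRUE, distinct(df2, x) would return only the x column.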
