Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Select groups based on number of unique / distinct values

I have a data frame like the one below:

sample <- data.frame(ID = 1:9,
                     Group = c('AA','AA','AA','BB','BB','CC','CC','BB','CC'),
                     Value = c(1,1,1,2,2,2,3,2,3))

ID       Group    Value
1        AA       1
2        AA       1
3        AA       1
4        BB       2
5        BB       2
6        CC       2
7        CC       3
8        BB       2
9        CC       3

I want to select groups according to the number of distinct (unique) values within each group. For example, select groups where all values within the group are the same (one distinct value per group). If you look at the group CC, it has more than one distinct value (2 and 3) and should thus be removed. The other groups, with only one distinct value, should be kept. Desired output:

ID       Group    Value
1        AA       1
2        AA       1
3        AA       1
4        BB       2
5        BB       2
8        BB       2

What is a simple and fast way to do this in R?


1 Reply


Here's a solution using dplyr:

library(dplyr)

sample <- data.frame(
  ID = 1:9,
  Group = c('AA', 'AA', 'AA', 'BB', 'BB', 'CC', 'CC', 'BB', 'CC'),
  Value = c(1, 1, 1, 2, 2, 2, 3, 2, 3)
)

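sample %>%
  group_by(Group) %>%                  # evaluate the filter once per group
  filter(n_distinct(Value) == 1) %>%   # keep groups with exactly one distinct Value
  ungroup()                            # optional: drop the grouping from the result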

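We group the data by Group and then keep only the groups in which Value has exactly one distinct value. Because filter() is evaluated per group on a grouped data frame, the condition n_distinct(Value) == 1 yields a single TRUE or FALSE for each group, so all rows of a group are kept or dropped together.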
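
If you would rather not load a package, the same per-group logic can be sketched in base R. This is only one possible approach, not part of the original answer: ave() computes the number of distinct values within each Group and repeats that count on every row of the group, and the rows where the count is 1 are then kept. The names n_unique and result are chosen here for illustration.

```r
sample <- data.frame(
  ID = 1:9,
  Group = c('AA', 'AA', 'AA', 'BB', 'BB', 'CC', 'CC', 'BB', 'CC'),
  Value = c(1, 1, 1, 2, 2, 2, 3, 2, 3)
)

# For every row, the number of distinct Values in that row's Group
n_unique <- ave(sample$Value, sample$Group,
                FUN = function(x) length(unique(x)))

# Keep only rows whose group has exactly one distinct Value
result <- sample[n_unique == 1, ]
result
```

Changing the comparison (for example n_unique >= 2) selects groups by a different number of distinct values.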

