Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
471 views
in Technique[技术] by (71.8m points)

r - Reshape multiple categorical variables to binary response variables

I am trying to convert the following format:

mydata <- data.frame(movie = c("Titanic", "Departed"), 
                     actor1 = c("Leo", "Jack"), 
                     actor2 = c("Kate", "Leo"))

     movie actor1 actor2
1  Titanic    Leo   Kate
2 Departed   Jack    Leo

to binary response variables:

     movie Leo Kate Jack
1  Titanic   1    1    0
2 Departed   1    0    1

I tried the solution described in Convert row data to binary columns but I could get it to work for two variables, not three.

I would really appreciate if there is a clean way to do this.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

How much spice is too much? Here is a solution via tidyr:

library(dplyr)
library(tidyr)

mydata %>%
  gather(actor,name,starts_with("actor")) %>%
  mutate(present = 1) %>%
  select(-actor) %>%
  spread(name,present,fill = 0)

       movie Jack Kate Leo
 1 Departed    1    0   1
 2  Titanic    0    1   1

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...