Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Concatenate row-wise across specific columns of dataframe

I have a data frame with columns that, when concatenated (row-wise) as a string, would allow me to partition the data frame into a desired form.

> str(data)
'data.frame':   680420 obs. of  10 variables:
 $ A              : chr  "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
 $ B              : chr  "2011-01-26" "2011-01-27" "2011-02-09" "2011-02-10" ...
 $ C              : chr  "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
 $ D              : chr  "AAA" "AAA" "BCB" "CCC" ...
 $ E              : chr  "A00001" "A00002" "B00002" "B00001" ...
 $ F              : int  9 9 37 37 37 37 191 191 191 191 ...
 $ G              : int  NA NA NA NA NA NA NA NA NA NA ...
 $ H              : int  4 4 4 4 4 4 4 4 4 4 ...

For each row, I would like to concatenate the data in columns F, E, D, and C into a string (with the underscore character as separator). Below is my unsuccessful attempt at this:

data$id <- sapply(as.data.frame(cbind(data$F,data$E,data$D,data$C)), paste, sep="_")

And below is the undesired result:

> str(data)
'data.frame':   680420 obs. of  10 variables:
 $ A              : chr  "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
 $ B              : chr  "2011-01-26" "2011-01-27" "2011-02-09" "2011-02-10" ...
 $ C              : chr  "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
 $ D              : chr  "AAA" "AAA" "BCB" "CCC" ...
 $ E              : chr  "A00001" "A00002" "B00002" "B00001" ...
 $ F              : int  9 9 37 37 37 37 191 191 191 191 ...
 $ G              : int  NA NA NA NA NA NA NA NA NA NA ...
 $ H              : int  4 4 4 4 4 4 4 4 4 4 ...
 $ id             : chr [1:680420, 1:4] "9" "9" "37" "37" ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr  "V1" "V2" "V3" "V4"

Any help would be greatly appreciated.


1 Reply

Try

 data$id <- paste(data$F, data$E, data$D, data$C, sep="_")

instead. The beauty of vectorized code is that you do not need row-by-row loops, or loop-equivalent *apply functions.
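As a minimal sketch of that vectorized behaviour, the snippet below builds a two-row stand-in for the real data frame (the name toy and its values are made up for illustration, only the column names follow the question) and joins the columns in one call:

```r
# Toy data frame standing in for the real one (hypothetical values).
toy <- data.frame(
  C = c("2011-01-26", "2011-02-09"),
  D = c("AAA", "BCB"),
  E = c("A00001", "B00002"),
  F = c(9L, 37L),
  stringsAsFactors = FALSE
)

# paste() is vectorized: it joins the i-th element of every argument,
# so one call produces one string per row. Integers are coerced to text.
toy$id <- paste(toy$F, toy$E, toy$D, toy$C, sep = "_")

# toy$id is now c("9_A00001_AAA_2011-01-26", "37_B00002_BCB_2011-02-09")
stopifnot(is.character(toy$id), length(toy$id) == nrow(toy))
```

The result is a plain character vector with one element per row, which is what a data frame column should be.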

Edit: Even better is

 data <- within(data, id <- paste(F, E, D, C, sep="_"))

which evaluates the column names inside data, so there is no need to repeat data$.
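If the columns to join are held as a vector of names, a common idiom is do.call() with paste(). The sketch below is not part of the original answer. It uses a made-up data frame toy and a hypothetical vector cols, and it also shows why the sapply() attempt in the question returned a four-column matrix: sapply() handed paste() one column at a time, so there was never anything to join across.

```r
# Toy data frame standing in for the real one (hypothetical values).
toy <- data.frame(
  C = c("2011-01-26", "2011-02-09"),
  D = c("AAA", "BCB"),
  E = c("A00001", "B00002"),
  F = c(9L, 37L),
  stringsAsFactors = FALSE
)

# Columns to join, in the desired order (hypothetical helper vector).
cols <- c("F", "E", "D", "C")

# What went wrong: paste() is called once per column, each call returns
# that column unchanged as character, and sapply() binds the results
# into a matrix with one column per input column.
bad <- sapply(toy[cols], paste, sep = "_")
stopifnot(is.matrix(bad), identical(dim(bad), c(2L, 4L)))

# Working variant: pass all selected columns to a single paste() call.
# c() turns the data frame into a list and appends the sep argument.
toy$id <- do.call(paste, c(toy[cols], sep = "_"))

# toy$id is now c("9_A00001_AAA_2011-01-26", "37_B00002_BCB_2011-02-09")
```

This gives the same strings as spelling the columns out by hand, and changing cols is enough to change which columns go into the id.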
