Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

tidyr - Reading "X%"-formatted percentages into R

I am reading a CSV into R, where several columns contain percentages that are formatted as text strings with a percentage symbol at the end, e.g. "35%". readr::read_csv() interprets these as character-type data, but I want the data to be numeric so I can perform analysis.

The following code achieves this, but seems like a lot of "hoops" to jump through. Is there a standard function (or option for a function) that does the same thing? There doesn't seem to be a relevant option in the read_csv() function.

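library(dplyr)  # provides %>% and mutate_at()
library(readr)  # provides read_csv()

# Extract the first run of digits from strings such as "35%" and convert it to numeric
convert_percentage_string <- function(percentage_string) {
  percentage_string %>%
    stringr::str_extract("[0-9]+") %>%
    as.numeric()
}

# columns_with_percentages: assumed to be a character vector naming the percentage columns
read_csv("my_data.csv") %>%
  mutate_at(columns_with_percentages, convert_percentage_string)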

Sample data:

tribble(~name, ~count, ~percentage, 
   "Alice", 4, "40%", 
   "Bob", 10, "65%", 
   "Carol", 15, "15%")

Expected result:

tribble(~name, ~count, ~percentage, 
       "Alice", 4, 40, 
       "Bob", 10, 65, 
       "Carol", 15, 15)
Question from: https://stackoverflow.com/questions/65600364/reading-x-formatted-percentages-into-r


1 Reply

Here's a solution using dplyr, readr, and stringr:

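library(dplyr)   # version >= 1.0.0, for across() and where()
library(readr)   # parse_number()
library(stringr) # str_detect()

# `data` is the sample tibble from the question
data %>%
  mutate(across(where(~ any(str_detect(.x, "%"))), parse_number))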
# A tibble: 3 x 3
  name  count percentage
  <chr> <dbl>      <dbl>
1 Alice     4         40
2 Bob      10         65
3 Carol    15         15

Feel free to replace any() with all() if you would rather convert only the columns in which every value contains a %.
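To illustrate the difference, here is a small sketch. The tibble `mixed` and its columns `share` and `note` are made up for this example and are not part of the question's data.

```r
library(dplyr)
library(readr)
library(stringr)

# Hypothetical data: `share` has a "%" in every row, `note` in only one row
mixed <- tribble(
  ~share, ~note,
  "40%",  "up 5%",
  "65%",  "flat",
  "15%",  "n/a"
)

# any(): both columns are selected, so `note` is parsed as well
# (its non-numeric entries become NA, with a parsing warning)
with_any <- mixed %>%
  mutate(across(where(~ any(str_detect(.x, "%"))), parse_number))

# all(): only `share` is selected; `note` stays a character column
with_all <- mixed %>%
  mutate(across(where(~ all(str_detect(.x, "%"))), parse_number))
```

With free-text columns like `note`, all() is probably the safer choice, since a single stray % is enough for any() to convert the whole column.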

A benefit of this approach is that it detects the columns that contain a % and parses only those, so you don't need to know in advance which columns need to be converted.
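As a rough end-to-end sketch, the same idea can be applied straight after read_csv(). The inline `csv_text` below stands in for my_data.csv and is an assumption of this example, as is the use of readr >= 2.0.0 (for I() and show_col_types). For comparison, the second option shows what could be done when the column names are known up front: readr's col_number() uses the same flexible number parser as parse_number().

```r
library(dplyr)   # >= 1.0.0 for across() and where()
library(readr)   # >= 2.0.0 assumed, for I() literal input and show_col_types
library(stringr)

# Inline CSV text standing in for my_data.csv (same values as the sample data)
csv_text <- I("name,count,percentage\nAlice,4,40%\nBob,10,65%\nCarol,15,15%\n")

# Option 1: read first, then detect and parse the "%" columns
detected <- read_csv(csv_text, show_col_types = FALSE) %>%
  mutate(across(where(~ any(str_detect(.x, "%"), na.rm = TRUE)), parse_number))

# Option 2: if the column names are known, parse them while reading
typed <- read_csv(
  csv_text,
  col_types = cols(percentage = col_number()),
  show_col_types = FALSE
)
```

The na.rm = TRUE in option 1 is a small addition so that columns containing missing values still yield TRUE or FALSE inside where().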

