Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Create violin plots, spaceship charts or similar for discrete variables in R using ggplot2

I'm trying to create a series of charts like the ones at the link below. These charts show the population at each of the UK Civil Service grades, with one chart per government department. This makes it easy to compare how the departments are structured. For example, I can quickly see that DfID is very senior-heavy whereas MOJ is much more bottom-heavy.

https://www.instituteforgovernment.org.uk/charts/grade-composition-and-change-department

I'd like to do this in R and have been trialling some solutions using ggplot. I've tried the following approaches so far:

  • violin plots (not suitable for a discrete variable on the vertical axis)
  • line charts stuck back to back, one positive one negative, to recreate the shape (struggling to fill the space in between)
  • population pyramids (I want a smooth line rather than bars)

I've included an example below which would create a pair of lines showing the average fantasy football points by position for a particular team. I'd then like to do this across all Premier League teams, in a similar way to what has been done across the Civil Service departments at the link above.

library(tidyverse) # attaches ggplot2, dplyr and tidyr, so a separate library(dplyr) is not needed

position <- c('Goalkeeper','Defender','Midfielder','Forward')
average_points <- c(100, 150, 185, 170)

football_df <- data.frame(position, average_points) %>%
  dplyr::mutate(negative_average_points = average_points * -1) %>% # create a column that shows the negative to create the mirrored line
  gather(key = key, value = average_points, -position, na.rm = TRUE) # turn into long format to create the line chart

ggplot(football_df, 
       aes(x = position, y = average_points, group = key)) + 
  geom_line() +
  coord_flip()

This is the route I'm heading down at the moment. I'd love to do something more like an area chart but the stacking won't allow negative values.

There are still a couple of issues with taking this approach:

  • Filling the area under the line to make it look more like an area chart
  • The positions have now gone out of order - I want it to maintain the 'goalkeeper, defender, midfielder, forward' order. I have tried using factors to do this but the long format of the data won't allow factors to be used as each position appears twice.

I'd welcome any thoughts on better approaches, or how I might develop the line chart idea to make it look more like the charts in the example at the link above. Thank you!

question from:https://stackoverflow.com/questions/65944736/create-violin-plots-spaceship-charts-or-similar-for-discrete-variables-in-r-usi


1 Reply


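To get the filled shape, you could switch from geom_line() to geom_area(), which fills the region between each line and zero. To fix the ordering, you could pass the positions in the desired order to the limits argument of scale_x_discrete():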

library(tidyverse)

position <- c('Goalkeeper','Defender','Midfielder','Forward')
average_points <- c(100, 150, 185, 170)

football_df <- data.frame(position, average_points) %>%
  dplyr::mutate(negative_average_points = average_points * -1) %>% # create a column that shows the negative to create the mirrored line
  gather(key = key, value = average_points, -position, na.rm = TRUE) # turn into long format to create the line chart

ggplot(football_df, 
       aes(x = position, y = average_points, group = key)) + 
  geom_area() +
  scale_x_discrete(limits = position) +
  coord_flip()
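
As a possible alternative (not part of the approach above), the same mirrored shape could be sketched with geom_ribbon(), which takes ymin and ymax directly. This avoids the extra negative column and the long format, so the order can be set with ordinary factor levels. The names football_wide and p_ribbon are made up for this illustration, and group = 1 is assumed to be needed so the discrete positions are joined into a single ribbon.

```r
library(ggplot2)

position <- c("Goalkeeper", "Defender", "Midfielder", "Forward")
average_points <- c(100, 150, 185, 170)

# One row per position; the factor levels fix the order on the axis
football_wide <- data.frame(
  position = factor(position, levels = position),
  average_points = average_points
)

# The ribbon spans from -average_points to +average_points at each position
p_ribbon <- ggplot(football_wide, aes(x = position, group = 1)) +
  geom_ribbon(aes(ymin = -average_points, ymax = average_points)) +
  coord_flip()

p_ribbon
```

Because the data stays in wide format, each position appears only once, so the factor ordering problem mentioned in the question should not come up here.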

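To repeat the chart for several teams, one option is to add a team column and facet on it. The sketch below reuses the geom_area() and scale_x_discrete() approach; the team names and points are invented purely for illustration, and pivot_longer() is used here in place of gather() as an assumption about what a current tidyr setup would prefer.

```r
library(tidyverse)

position <- c("Goalkeeper", "Defender", "Midfielder", "Forward")

# Hypothetical teams and made-up points, for illustration only
teams_df <- data.frame(
  team = rep(c("Team A", "Team B"), each = 4),
  position = rep(position, times = 2),
  average_points = c(100, 150, 185, 170, 120, 140, 160, 200)
)

# Add the mirrored column, then reshape to long format (one row per team, position and key)
teams_long <- teams_df %>%
  mutate(negative_average_points = -average_points) %>%
  pivot_longer(c(average_points, negative_average_points),
               names_to = "key", values_to = "points")

# One panel per team, each with the same position order
p_teams <- ggplot(teams_long, aes(x = position, y = points, group = key)) +
  geom_area() +
  scale_x_discrete(limits = position) +
  facet_wrap(~ team) +
  coord_flip()

p_teams
```

facet_wrap() keeps the scales shared across panels by default, which is what makes the shapes comparable between teams.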
