September 2019 Democratic Debates Added to {ggchicklet}

[This article was first published on R – rud.is, and kindly contributed to R-bloggers.]

The latest round of the 2020 Democratic debates is over and the data from all the 2019 editions of the debates have been added to {ggchicklet}. The structure of the debates2019 built-in dataset has changed a bit:

library(ggchicklet)
library(hrbrthemes)
library(tidyverse)

debates2019
## # A tibble: 641 x 7
##    elapsed timestamp speaker   topic   debate_date debate_group night
##      <dbl> <time>    <chr>     <chr>   <date>             <dbl> <dbl>
##  1   1.04  21:03:05  Warren    Economy 2019-09-13             1     1
##  2   1.13  21:04:29  Klobuchar Economy 2019-09-13             1     1
##  3   1.13  21:06:02  O'Rourke  Economy 2019-09-13             1     1
##  4   0.226 21:07:20  O'Rourke  Economy 2019-09-13             1     1
##  5   1.06  21:07:54  Booker    Economy 2019-09-13             1     1
##  6   0.600 21:09:08  Booker    Economy 2019-09-13             1     1
##  7   0.99  21:09:50  Warren    Economy 2019-09-13             1     1
##  8   0.872 21:11:03  Castro    Economy 2019-09-13             1     1
##  9   1.07  21:12:00  Gabbard   Economy 2019-09-13             1     1
## 10   1.11  21:13:20  de Blasio Economy 2019-09-13             1     1
## # … with 631 more rows

There are now debate_date, debate_group and night columns to make it easier to segment out or group together the debate nights.
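
As a minimal sketch of how the new columns might be used, the following totals speaking time for each debate night. The `toy` tibble is a made-up stand-in that borrows column names from debates2019 (its values are invented, not real debate data); swapping in debates2019 should work the same way, assuming the columns shown above.

```r
library(dplyr)

# Hypothetical stand-in rows: column names mirror debates2019, values are invented
toy <- tibble(
  elapsed      = c(1.0, 0.5, 2.0, 1.5, 0.75),
  speaker      = c("A", "B", "A", "C", "B"),
  debate_group = c(1, 1, 1, 2, 2),
  night        = c(1, 1, 2, 1, 1)
)

# Sum the elapsed minutes within each (debate_group, night) pair
per_night <- toy %>%
  count(debate_group, night, wt = elapsed, name = "minutes")

per_night
```

Using `count()` with `wt = elapsed` keeps this in line with the weighted counts used later in the post.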

The topic names in the online JavaScript data for the June, July and September debates weren’t uniform, so they’ve been cleaned up as well:

distinct(debates2019, topic) %>% 
  arrange(topic) %>% 
  print(n=nrow(.))
## # A tibble: 26 x 1
##    topic                  
##    <chr>                  
##  1 Abortion               
##  2 Age                    
##  3 Campaign Finance Reform
##  4 Civil Rights           
##  5 Climate                
##  6 Closing                
##  7 Economy                
##  8 Education              
##  9 Elections Reform       
## 10 Foreign Policy         
## 11 Gun Control            
## 12 Healthcare             
## 13 Immigration            
## 14 Lead                   
## 15 Opening                
## 16 Other                  
## 17 Party Strategy         
## 18 Politics               
## 19 Race                   
## 20 Resilience             
## 21 Socialism              
## 22 Statement              
## 23 Trade                  
## 24 Trump                  
## 25 Veterans               
## 26 Women's Rights 

This should make it easier to compare speaker times per-topic across the debates.
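
One way such a comparison might look, as a rough sketch: total each speaker's minutes per topic, then spread the debates into side-by-side columns. The `toy` tibble and the `debate_` column prefix are assumptions made for illustration; only the column names come from debates2019.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in rows: column names mirror debates2019, values are invented
toy <- tibble(
  elapsed      = c(1.0, 0.5, 2.0, 1.5, 0.75, 1.25),
  speaker      = c("A", "A", "B", "A", "B", "B"),
  topic        = c("Economy", "Economy", "Economy", "Economy", "Climate", "Economy"),
  debate_group = c(1, 1, 1, 2, 2, 2)
)

# Minutes per speaker and topic, one column per debate;
# a speaker/topic pair missing from a debate is filled with 0
topic_by_debate <- toy %>%
  count(speaker, topic, debate_group, wt = elapsed, name = "minutes") %>%
  pivot_wider(
    names_from   = debate_group,
    values_from  = minutes,
    names_prefix = "debate_",
    values_fill  = list(minutes = 0)
  )

topic_by_debate
```

This only works cleanly because the topic labels are now consistent; with the original mismatched names, the same topic would land in separate rows.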

Here’s how to generate the chart in the featured image slot for the September debate:

debates2019 %>%
  filter(debate_group == 3) %>% 
  mutate(speaker = fct_reorder(speaker, elapsed, sum, .desc=FALSE)) %>%
  mutate(topic = fct_inorder(topic)) %>% 
  ggplot(aes(speaker, elapsed, group = timestamp, fill = topic)) +
  geom_chicklet(width = 0.75) +
  scale_y_continuous(
    expand = c(0, 0.0625),
    position = "right",
    breaks = seq(0, 18, 2),
    labels = c(0, sprintf("%d min.", seq(2, 18, 2))),
    limits = c(0, 18)
  ) +
  ggthemes::scale_fill_tableau("Tableau 20") +
  guides(
    fill = guide_legend(nrow = 2)
  ) +
  coord_flip() +
  labs(
    x = NULL, y = NULL, fill = NULL,
    title = "How Long Each Candidate Spoke",
    subtitle = "September 2019 Democratic Debates",
    caption = "Each bar segment represents the length of a candidate’s response to a question.\nOriginal <https://www.nytimes.com/interactive/2019/09/12/us/elections/debate-speaking-time.html>\n#rstats reproduction by @hrbrmstr"
  ) +
  theme_ipsum_rc(grid="X") +
  theme(axis.text.x = element_text(color = "gray60", size = 10)) +
  theme(legend.position = "top")

Now that the field has been thinned a bit (yes, others are still running, but really?) we can see who has blathered the most on stage so far:

debates2019 %>%
  filter(debate_group == 3) %>% 
  distinct(speaker) %>% 
  left_join(debates2019, by = "speaker") %>% 
  count(speaker, wt=elapsed, sort=TRUE) %>% 
  mutate(speaker = fct_inorder(speaker) %>% fct_rev()) %>% 
  ggplot(aes(speaker, n)) +
  geom_col(fill = ft_cols$slate, width=0.55) +
  coord_flip() +
  scale_y_continuous(expand = c(0, 0.55), position = "right") +
  labs(
    x = NULL, y = "Speaking time (minutes)",
    title = "Total Speaking Time Across All 2019 Debates\nfor Those Left Standing in September"
  ) +
  theme_ipsum_es(grid="X")
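
The filter/distinct/left_join chain above keeps every 2019 row for speakers who appeared in the September debate. A filtering join can express the same idea; here is a minimal sketch using `semi_join()`, with a made-up `toy` tibble standing in for debates2019 (values are invented, and `september_speakers` is just an illustrative name).

```r
library(dplyr)

# Hypothetical stand-in rows: column names mirror debates2019, values are invented
toy <- tibble(
  elapsed      = c(1.0, 2.0, 0.5, 1.5, 0.75),
  speaker      = c("A", "B", "C", "A", "B"),
  debate_group = c(1, 1, 1, 3, 3)
)

# Speakers who took part in debate group 3
september_speakers <- toy %>%
  filter(debate_group == 3) %>%
  distinct(speaker)

# Keep all rows for those speakers, then total their minutes across every debate
totals <- toy %>%
  semi_join(september_speakers, by = "speaker") %>%
  count(speaker, wt = elapsed, sort = TRUE, name = "minutes")

totals
```

Speaker "C" only appears in the first debate group, so it drops out of the totals, which is the same thinning the chart relies on.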


And, here’s what they’ve all blathered about:

debates2019 %>%
  filter(debate_group == 3) %>% 
  distinct(speaker) %>% 
  left_join(debates2019, by = "speaker") %>% 
  count(topic, wt=elapsed, sort=TRUE) %>% 
  mutate(topic = fct_inorder(topic) %>% fct_rev()) %>% 
  ggplot(aes(topic, n)) +
  geom_col(fill = ft_cols$slate, width=0.55) +
  coord_flip() +
  scale_y_continuous(expand = c(0, 0.25), position = "right") +
  labs(
    x = NULL, y = "Topic time (minutes)",
    title = "Total Topic Time Across All 2019 Debates\nfor Those Left Standing in September"
  ) +
  theme_ipsum_es(grid="X")
