Statswars and Matrices

Relational analysis that says more than 25 000 emotional words (Recraft)

In a short but sweet article that appeared on r-bloggers, Kieran Healy uses one image to convey 25 000 emotional words about the asymmetric relationships between five user communities in the data science world: R, Python, STATA, SPSS and SAS. A full-resolution version of this single compact “emo matrix” is published on Kieran Healy’s blog.

How can we create a matrix-like visual like this, filled with images that summarize how a community views itself and the other communities, while also including the opposite perspective: how those other communities view that community from the outside?

Deconstructing the emo matrix

The data displayed in the emo matrix is “relational data”: data describing a relationship between pairs of communities. For instance, the Python community sees the R community in a certain way, one that evokes emotions such as those inspired by a particular image.
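As a minimal sketch of what such relational data looks like, the snippet below lists every ordered pair of communities in base R. The column names `viewer` and `viewed` are illustrative assumptions and do not come from the original post.

```r
five <- c("STATA", "R", "SAS", "Python", "SPSS")

# every ordered pair (viewer, viewed), including a community's view of itself
relations <- expand.grid(viewer = five, viewed = five, stringsAsFactors = FALSE)
nrow(relations)  # 25

# the two directions of one relationship are separate rows,
# which is what allows the views to be asymmetric
both <- subset(relations,
               (viewer == "R" & viewed == "Python") |
               (viewer == "Python" & viewed == "R"))
both
```

Because the pairs are ordered, "R as seen by Python" and "Python as seen by R" each get their own image in the matrix.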

We start deconstructing the emo matrix by breaking out the sub-images so we can learn more about it.


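library(tidyverse)  # %>%, purrr::map(), readr, dplyr, tidyr
library(magick)     # image_read(), image_crop(), image_resize(), image_write()

# the path or URL of the full-resolution emo matrix image goes in the string below
emomatrix <- 
  "" %>%
  image_read()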

imagemagick_geometries <- function() {
  # assemble ImageMagick geometry strings
  # data for image regions to crop
  # :) inspired by TI Extended Basic DATA syntax
  widths <- c(400, 400, 300, 278, 278)
  heights <- c(270, 270, 270, 270, 240)
  offsets_w <- cumsum(c(266, 400, 440, 350, 300))
  offsets_h <- cumsum(c(226, 270, 270, 270, 270))
  paste_plus <- function(x, y) paste0("+", paste(x, y, sep = "+"))
  paste_x <- function(x, y) paste0(x, "x", y)
  xy <- outer(offsets_w, offsets_h, paste_plus)
  wh <- outer(widths, heights, paste_x)
  whxy <- paste0(wh, xy)
  dim(whxy) <- c(5, 5)
  return(whxy)
}

# sub-images
five <- c("STATA", "R", "SAS", "Python", "SPSS")
geoms <- imagemagick_geometries()
emoimages <- map(geoms, function(x) image_crop(emomatrix, x))
dim(emoimages) <- c(5, 5)
colnames(emoimages) <- rownames(emoimages) <- five
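To make the geometry strings concrete: ImageMagick reads a string of the form WIDTHxHEIGHT+X+Y as a crop region of the given size whose top-left corner sits at the given offset. The sketch below rebuilds a 2 x 2 corner of the grid in base R, using only the first two widths, heights and offsets from the function above. The names `size`, `offset` and `toy_geoms` are made up for this illustration.

```r
# first two values of each vector from the post, so the result stays small
widths    <- c(400, 400)
heights   <- c(270, 270)
offsets_w <- cumsum(c(266, 400))   # horizontal offsets: 266, 666
offsets_h <- cumsum(c(226, 270))   # vertical offsets: 226, 496

# outer() pairs every horizontal value (first index) with every vertical value (second index)
size   <- outer(widths, heights, function(w, h) paste0(w, "x", h))
offset <- outer(offsets_w, offsets_h, function(x, y) paste0("+", x, "+", y))
toy_geoms <- matrix(paste0(size, offset), nrow = 2)

toy_geoms[2, 1]  # "400x270+666+226"
```

Note that the first index follows the horizontal offset and the second the vertical one, so an element `[i, j]` describes the tile in the i-th column and j-th row of the picture.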

We can now inspect a particular relation from both ends.

For example, how does the R community view the Python community? The image seems to indicate a mostly positive view, perhaps with some envy mixed in.

emoimages["R", "Python"][[1]]

From the other end, we can see how the Python community views the R community. If you’re Homer, wouldn’t you like to own and drive Homer’s Dream Car, which has all the features you would ever need? There is some ambiguity, however: don’t you perceive a hint of ridicule there?

emoimages["Python", "R"][[1]]

Each image says a thousand emotional words in the form of inside jokes: not easy to decode for outsiders to the communities involved, but packed with emotional content that members of these communities can quickly decipher.

An outsider would, at this point, have to take the individual images and feed them to, for example, a reverse image search API to locate their origins, learn more about a specific view, and get to the gist of what a particular image is trying to communicate. The sub-images can be written to individual files for that purpose:

# crop each sub-image, resize it to 240 px high and write it to its own PNG file
res <- map(geoms, function(x) image_crop(emomatrix, x) %>% 
  image_resize("x240") %>% image_write(paste0("statswars-", x, ".png")))

The emotional matrix reconstructed

What follows is an attempt to reproduce this diagram using DT, which gives us an HTML table that can link to images and provide tooltips that add some clarification.


imagedata <- read_delim(delim = ";", trim_ws = TRUE, file = "emo; url
IBM Business Automation Content Analyzer on Cloud;
A Rube Goldberg Machine;
IBM 1800 DACS;
Tech Bro Says You Just Do Not Understand the Blockchain;
Giphy Monkey Stuff;
The Matrix 4 Rebooted;
Distracted Boyfriend;
Spongebob and Patrick Are Best Friends Forever;
European Teenage Business Man;
Avalon - Where Excalibur Gets Forged - Messy Computer Desk;
Analytics Dashboard;
Tech Bro Says You Just Do Not Understand the Blockchain;
Baby with Laptop Feels Confused;,1315555947,1/stock-photo-baby-with-laptop-confused-84320152.jpg
I Hate New Excel;
The Homer - Your Dream Car With All Features;
Tech Bro Says You Just Do Not Understand the Blockchain;
Correct Execution of the Process;
I Hate Old Excel;
IBM Selective Sequence Electronic Calculator (SSEC)- Electromechanical Computing technology from 1959;
HAL 900 Eye;
African Rock Python;
Homework for Schoolkids;$img400$")

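# function to generate a base64-encoded local thumbnail that links to the external source image
img_uri <- function(url, title) { 
  x <- tempfile(fileext = ".png")
  # download the image, convert it to PNG, shrink it and write it to the temp file
  path <- image_read(url) %>% 
    image_convert(format = "png") %>% 
    image_resize("64x64^") %>% 
    image_write(path = x)
  message("writing image at ", x, " got return ", path)
  # embed the thumbnail as a data URI; clicking it leads to the original image
  sprintf('<a href="%s"><img height=64 title="%s" src="%s"/></a>', 
          url, title, knitr::image_uri(x)) 
}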

# add the from, to relation as well as tooltip data and a link to the image url
id <- imagedata %>% bind_cols(tibble(
    alt = as.vector(outer(five, five, function(x, y) paste0(x, " as seen by ", y))),
    from = as.vector(outer(five, five, function(x, y) paste0(x))),
    to = as.vector(outer(five, five, function(x, y) paste0(y)))
  ))
# alternative that links straight to the remote image instead of a local thumbnail:
#  mutate(imgurl = paste0("<a href='", url, "'><img height=52 title='", emo, "' src='", url, "'/></a>"))

img <- purrr::map2(id$url, id$emo, img_uri)
id$img <- unlist(img)

id %>% 
  select(alt, from, to, emo) %>%
  filter(from == "R", to == "R") %>% 
  mutate(url = "") %>%
  knitr::kable()

|alt            |from |to |emo                   |url |
|:--------------|:----|:--|:---------------------|:---|
|R as seen by R |R    |R  |The Matrix 4 Rebooted |    |

With the source data expressed as above, with the relationships (from, to) added alongside the image URLs, we can use tidyr to spread this tall table into a wide format that DT can display as a web-friendly table of image thumbnails, where hovering shows extra information in a tooltip and clicking leads to the source image.
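A minimal sketch of that last step, under a few assumptions: it uses a three-community toy table with placeholder image tags rather than the real `id` data, and it uses `tidyr::pivot_wider()`, the current replacement for `spread()`. The names `tall`, `wide` and `emo_table` are made up for this illustration. Passing `escape = FALSE` to `DT::datatable()` is what lets the HTML in the cells render as images instead of showing up as text.

```r
library(tidyr)

# toy stand-in for the tall table: one row per (from, to) pair with an HTML cell
communities <- c("R", "Python", "SAS")
tall <- expand.grid(from = communities, to = communities, stringsAsFactors = FALSE)
tall$img <- sprintf('<img height=64 title="%s as seen by %s" src="placeholder.png"/>',
                    tall$from, tall$to)

# one row per `from` community, one column per `to` community
wide <- pivot_wider(tall, names_from = to, values_from = img)

# build the HTML table widget; guarded so the sketch also runs where DT is not installed
if (requireNamespace("DT", quietly = TRUE)) {
  emo_table <- DT::datatable(wide, escape = FALSE, rownames = FALSE)
}
```

Printing `emo_table`, or leaving it as the last expression of an R Markdown chunk, renders the table. With the real data, the `img` column built by `img_uri()` would take the place of the placeholder tags.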