- Understanding the Grammar of Graphics
- Terminology for Interactive Graphics
- Interactive Graphic Examples
- Leading Platforms and Packages
- Future Research
Matthew Sigal (msigal@yorku.ca)
York University
The grammar of graphics takes us beyond a
The grammar is broken up into three components: Specification, Assembly, and Display.
Assembly and Display are typically products of the software and hardware we use, so Wilkinson's primary emphasis is on Specification.
Important notes:
The first step is to extract data into variables.
varset
We then can apply various algebraic techniques to the varset, which will define the structure (or frame) of our plot.
Three primary operators:
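In Wilkinson's (2005) algebra these operators are cross (*), nest (/), and blend (+). The base R sketch below is only an illustration of how each one shapes the frame of a plot. The toy data frame and all object names are assumptions made for the example, not part of the original slides.

```r
# Toy data, assumed purely for illustration
dat <- data.frame(
  item  = c("a", "b", "c"),
  group = c("g1", "g1", "g2"),
  y1    = c(1, 2, 3),
  y2    = c(4, 5, 6)
)

# Cross (item * group): the frame spans every combination of the two domains
cross_frame <- expand.grid(item = unique(dat$item), group = unique(dat$group))

# Nest (item / group): the frame keeps only the combinations observed in the data
nest_frame <- unique(dat[, c("item", "group")])

# Blend (y1 + y2): both variables are placed on a single shared dimension
blend_values <- c(dat$y1, dat$y2)

nrow(cross_frame)     # 3 items x 2 groups
nrow(nest_frame)      # one row per observed pairing
length(blend_values)  # values of y1 followed by values of y2
```

Crossing and nesting differ only in the domain of the resulting frame, which is why the sketch compares `expand.grid()` with `unique()` rather than the case-level tuples.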
These are functions that are used to map varsets to dimensions (size, shape, and location).
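As a rough illustration of such a function, the sketch below linearly maps a numeric variable onto a range of point sizes. The helper name `scale_linear` and the target range are hypothetical choices, and the helper assumes the input is not constant.

```r
# Hypothetical helper: linearly rescale x onto the interval given by `to`
scale_linear <- function(x, to = c(1, 5)) {
  rng <- range(x, na.rm = TRUE)
  to[1] + (x - rng[1]) / (rng[2] - rng[1]) * (to[2] - to[1])
}

# Map petal length onto point sizes between 1 and 5
sizes <- scale_linear(iris$Petal.Length)
range(sizes)
```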
Statistical operations can be employed to reduce the number of rows in the varset.
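For instance, averaging within groups collapses many cases into one row per group. A minimal base R sketch using the `iris` data (the object name `species_means` is an assumption for the example):

```r
# Reduce 150 flowers to one mean sepal width per species
species_means <- aggregate(Sepal.Width ~ Species, data = iris, FUN = mean)

nrow(iris)           # rows before the statistical operation
nrow(species_means)  # rows after: one per species
species_means
```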
These functions create graph objects that can be represented by magnitudes in a space.
Our next step is to choose and apply a coordinate system.
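One way to picture this step is as a transformation applied to positions that have already been computed. The sketch below, which assumes a switch from rectangular to polar coordinates purely as an example, converts the `iris` sepal measurements and then converts them back.

```r
x <- iris$Sepal.Length
y <- iris$Sepal.Width

# Rectangular -> polar: radius and angle of each point
polar <- data.frame(r = sqrt(x^2 + y^2), theta = atan2(y, x))

# Polar -> rectangular: recovers the original positions
back <- data.frame(x = polar$r * cos(polar$theta),
                   y = polar$r * sin(polar$theta))

head(polar)
```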
| Form | Surface | Motion | Sound | Text |
|---|---|---|---|---|
| Position | Color | Direction | Tone | Label |
| Size | * Hue | Speed | Volume | |
| Shape | * Brightness | Acceleration | Rhythm | |
| * Polygon | * Saturation | | Voice | |
| * Glyph | Texture | | | |
| * Image | * Pattern | | | |
| Rotation | * Granularity | | | |
| Resolution | * Orientation | | | |
| | Blur | | | |
| | Transparency | | | |
(Wilkinson, 2005, p. 274)
In GPL, any statistical graphic can be expressed in terms of six statements:
DATA: x = "SepalLength"
DATA: y = "SepalWidth"
DATA: z = "species"
TRANS: x = x
TRANS: y = y
ELEMENT: point(position(x*y), color(z))
COORD: rect(dim(1,2))
SCALE: linear(dim(1))
SCALE: linear(dim(2))
GUIDE: axis(dim(1), label("Sepal Length"))
GUIDE: axis(dim(2), label("Sepal Width"))
However, as most of these actions would be the default of a well-organized graphical system, only the ELEMENT statement is truly necessary.
-Murrell, 2011
gg in ggplot2:
General Principles for ggplot2:
- ggplot()
- aes()
- Geometric objects (geoms) that will be used to view the data, such as geom_point() or geom_line()
Components of a ggplot2 object:
- Layers, consisting of:
  - Data: What we want to see!
  - Mapping: Defines the aesthetics of the graphic
  - Stat: Statistical transformations of the data (e.g., binning or averaging)
  - Geom: Geometric objects that are drawn to represent the data (simple or complex)
  - Position: Position adjustments for each geom (e.g., jitter, dodge, stack)
- Scale: Controls mapping between data and aesthetics (variable or constant; colour/position)
- Themes: Relatively new ggplot2 feature that allows for visual adjustments of a plot object
- Coord: The coordinate system (provides axes and gridlines)
- Facet: Allows us to break up the data into subsets

GPL and ggplot2
(Based upon Wickham, 2010)
Building the grouped scatterplot:
library(ggplot2)
dat <- iris
p1 <- ggplot(data = dat,
             aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
  geom_point() +
  theme_bw()
p1
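To connect the plot back to the component list, the sketch below spells out several components that a basic scatterplot otherwise leaves at their defaults. It is only an illustration: the jittered position, the "Dark2" palette, and the faceting by species are assumptions chosen for the example, and `p2` is a hypothetical name.

```r
library(ggplot2)

p2 <- ggplot(data = iris,                                   # Data
             mapping = aes(x = Sepal.Length,                # Mapping
                           y = Sepal.Width,
                           colour = Species)) +
  geom_point(stat = "identity", position = "jitter") +      # Geom, Stat, Position
  scale_colour_brewer(palette = "Dark2") +                  # Scale
  coord_cartesian() +                                       # Coord
  facet_wrap(~ Species) +                                   # Facet
  theme_bw()                                                # Theme
p2
```

Each `+` adds one component to the plot object, and nothing is drawn until the object is printed.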
Is this a perfect implementation of the Grammar of Graphics?
In ggplot2, we have to deal with:
- data.frame object
- ggplot2
- ggplot2 object
- print() or ggsave() is called

Is this problematic?
- ggplot2 maintains the core beliefs of the system
- str(plot)
- The + operator allows us to make changes to the general plot object

Dynamic, interactive visualizations... to explore the data for themselves.
-Murray, 2013, p. 2
Overview first, zoom and filter, then details-on-demand.
-Shneiderman, 1996, p. 337
(the "Visual Information Seeking Mantra")
The most basic interactions allow the user to dynamically alter the parameters of a plot. This feature is already built into RStudio with the manipulate package.
For example, the following code allows users to dynamically alter:
library(manipulate)  # only works inside RStudio
manipulate(
  plot(iris$Sepal.Length, iris$Sepal.Width,
       xlim = c(x.min, x.max), type = type, axes = axes, ann = label),
  x.min = slider(0, 10, initial = 4),   # lower x-axis limit
  x.max = slider(0, 10, initial = 8),   # upper x-axis limit
  type = picker("points" = "p", "line" = "l", "none" = "n",
                initial = "points"),    # plot type
  axes = checkbox(TRUE, "Draw Axes"),
  label = checkbox(TRUE, "Draw Axes Labels")
)
manipulate package
While easy to use, manipulate unfortunately has some drawbacks:
clickme Scatterplot
A more interesting example comes courtesy of Nacho Caballero's clickme package.
The goals of this package are to:
For example, here are the results of conducting multidimensional scaling on the iris dataset:
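The coordinates behind such a plot can be computed in base R. The slides do not say which variant of multidimensional scaling was used, so the sketch below assumes classical (metric) scaling on Euclidean distances between the four numeric measurements; a static `plot()` call stands in for the interactive scatterplot.

```r
# Euclidean distances between flowers, based on the four numeric columns
iris_dist <- dist(iris[, 1:4])

# Classical MDS: a 150 x 2 matrix of coordinates
iris_mds <- cmdscale(iris_dist, k = 2)

# Static preview, coloured by species
plot(iris_mds, col = as.integer(iris$Species), pch = 19,
     xlab = "Dimension 1", ylab = "Dimension 2")
```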
Interactivity can also be harnessed for pedagogical purposes. For instance, while teaching introductory statistics, we might want to visually demonstrate how skewness and kurtosis affect a distribution.
We can do this live via the shiny package, a web application framework for R with "reactive bindings".
(This is an approximation based upon the sinh-arcsinh transformation; Jones & Pewsey, 2009)
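For readers who want to experiment outside the app, the sinh-arcsinh density can be sketched in a few lines of base R. This is not the code behind the demonstration: the function name `dsinh_arcsinh` is hypothetical, and the parameter names follow Jones & Pewsey (2009), where `epsilon` controls skewness and `delta` controls tail weight (values below 1 give heavier tails than the normal).

```r
# Density of X = sinh((asinh(Z) + epsilon) / delta), with Z standard normal
dsinh_arcsinh <- function(x, epsilon = 0, delta = 1) {
  s <- sinh(delta * asinh(x) - epsilon)
  delta * sqrt(1 + s^2) / sqrt(2 * pi * (1 + x^2)) * exp(-s^2 / 2)
}

x <- seq(-6, 6, length.out = 400)
plot(x, dsinh_arcsinh(x), type = "l", ylab = "Density")  # epsilon = 0, delta = 1: standard normal
lines(x, dsinh_arcsinh(x, epsilon = 1), lty = 2)         # positive skew
lines(x, dsinh_arcsinh(x, delta = 0.6), lty = 3)         # heavier tails
```

In a shiny app, `epsilon` and `delta` would typically be wired to slider inputs so the curve redraws as they change.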
- R (if it should be done, there probably is a package)
- D3.js (Data-Driven Documents)
- Protovis from the Stanford Visualization Group

D3.js (Data-Driven Documents):
- sigma.js: A library for interactive networks
- Polymaps: US Unemployment 2009 Example
- rCharts: Highcharts.js
- rCharts: NVD3.js

Processing and Processing.js:
R Packages
- Acinonyx, aka "iPlots eXtreme": designed for large data (development limbo?)
  install.packages("Acinonyx", repos = "http://rforge.net")
- animint has a similar feature set to clickme, but is targeted specifically at ggplot2 graphics.
  require(devtools) and then install_github("animint", "tdhock")
- animation (self-explanatory; no interactivity)
- cranvas reimplements GGobi (parallel coordinate plots; limited interactivity)
- d3Network allows for the creation of D3-based force-directed graphs in R
  require(devtools); install_github("d3Network", "christophergandrud")
- gridSVG creates interactive ggplot2 + D3 objects
  install.packages("gridSVG", repos = "http://R-Forge.R-project.org")
- ggvis is still in development, but already has a plethora of examples.
  - ggplot2
  - In ggplot2, geom is kind of abstract (e.g., geom_histogram() combines geom_bar() and stat_bin()). In ggvis, pure geoms are called "marks", and combined geoms and stats are referred to as "branches".
  - qplot() (or overloaded +)!
  - mark_symbol(props(size = input_slider(100, 1000)))
# Installation:
library(devtools)
install_github(c("assertthat", "testthat"))
install_github(c("httpuv", "shiny", "ggvis"), "rstudio")
- rCharts leverages this somewhat by allowing us to utilize JS libraries from within R.
- ggvis and ggplot2 are both attempts at implementing Wilkinson's Grammar of Graphics in R.
- ggplot2.
- ggsubplot), we can think of an interactive plot as a Grammar of Graphics pipeline that is continuously rendering.

Matthew J. Sigal, MA
Department of Psychology
262 Behavioural Science Building
York University, 4700 Keele St.
Toronto, ON, Canada M3J 1P3
(416) 736-2100 x66163
matthewsigal@gmail.com / msigal@yorku.ca
http://www.matthewsigal.com
http://www.dfconsulting.org
- ggplot2 and ggvis
- clickme
  require(devtools); install_github("clickme", "nachocab")
- manipulate and shiny
- Ramnath Vaidyanathan's rCharts
  require(devtools); install_github('rCharts', 'ramnathv')
- Slides made with RStudio via Ramnath Vaidyanathan's slidify
  require(devtools); install_github('slidify', 'ramnathv')
  require(devtools); install_github('slidifyLibraries', 'ramnathv')