Data Applications Services: Data Visualization

What Is Data Visualization?

Data visualization is the graphical representation of data.

Generally, there are two categories of data visualization: exploration and explanation. 

Exploration

Exploratory data visualizations are a great starting point when you are not sure what's in your data. They give you a sense of what your data set contains and let you identify patterns and trends in it.

Explanation

Explanatory data visualizations are most appropriate when you have a sense of what your data has to say and you are ready to tell the story of your data. 

Why Do We Visualize Data?

Let's look at Anscombe's quartet, which makes the case for examining data visually and shows the importance of exploration in data analysis. First published by Francis Anscombe in the 1973 paper Graphs in Statistical Analysis, Anscombe's quartet presents four made-up datasets, each containing eleven observations of two variables, x and y.

Obs.   x1     y1    x2     y2    x3     y3    x4     y4
 1     10   8.04    10   9.14    10   7.46     8   6.58
 2      8   6.95     8   8.14     8   6.77     8   5.76
 3     13   7.58    13   8.74    13  12.74     8   7.71
 4      9   8.81     9   8.77     9   7.11     8   8.84
 5     11   8.33    11   9.26    11   7.81     8   8.47
 6     14   9.96    14   8.10    14   8.84     8   7.04
 7      6   7.24     6   6.13     6   6.08     8   5.25
 8      4   4.26     4   3.10     4   5.39    19  12.50
 9     12  10.84    12   9.13    12   8.15     8   5.56
10      7   4.82     7   7.26     7   6.42     8   7.91
11      5   5.68     5   4.74     5   5.73     8   6.89

By design, these datasets have nearly identical means, variances, and correlation coefficients. When the datasets are plotted, however, they look strikingly different. This illustrates how exploratory data visualizations reveal patterns that may not be readily apparent from summary statistics alone.
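The near-identical summary statistics can be checked directly. The following is a minimal sketch in Python using only the standard library, with the values transcribed from the table above. The names QUARTET, pearson_r, summarize, and SUMMARY are illustrative choices, not part of Anscombe's paper or any library.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet, transcribed from the table above.
# Datasets 1-3 share the same x values; dataset 4 has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Sample mean and sample variance of x and y, plus their correlation."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


SUMMARY = {k: summarize(xs, ys) for k, (xs, ys) in QUARTET.items()}

# Print one row of summary statistics per dataset.
for k, s in SUMMARY.items():
    print(
        f"dataset {k}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
        f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f}"
    )
```

The four printed rows should agree to about two decimal places, even though the underlying point patterns differ, which is why the numbers alone are not enough to tell the datasets apart.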

 

Four scatter plots displaying each Anscombe's Quartet dataset.

 

Historical Examples of Data Visualizations

References

Iliinsky, N., & Steele, J. (2011). Designing data visualizations: Representing informational relationships. O'Reilly Media, Inc.

Anscombe, F. J. (1973). Graphs in statistical analysis. The American Statistician, 27(1), 17-21.

Dutta, D. (2017). Anscombe’s quartet. RPubs. https://rpubs.com/debosruti007/anscombeQuartet.

R Core Team (2024). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.