Data visualization is a great way to represent huge amounts of data in a simple and intuitive fashion. All data visualizations have the same goal: to help viewers grasp information easily so they can make quick inferences or decisions. However, it is important that visualizations are not overdone and that they hit the sweet spot where they are catchy, informative, and easy to navigate. This requires a bit of learning. Building a good data visualization is not just a matter of throwing some data into colorful charts.
Today we are excited to announce the official 1.0 release of Vega-Lite, a high-level format for rapidly creating visualizations for analysis and presentation. With Vega-Lite, one can concisely describe a visualization as a set of encodings that map from data fields to the properties of graphical marks, using a JSON format. Vega-Lite also supports data transformations such as aggregation, binning, filtering, and sorting, along with visual transformations including stacked layouts and faceting into small multiples. As you might have guessed, Vega-Lite is built on top of Vega, a visualization grammar built using D3. Vega and D3 provide a lot of flexibility for custom visualization designs; however, that power comes with a cost.
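To make the idea of "encodings that map from data fields to the properties of graphical marks" concrete, here is a minimal sketch of what such a JSON description might look like, built as a Python dictionary and serialized with the standard library. The dataset and the field names `category` and `amount` are invented for illustration, and the property names (`data.values`, `mark`, `encoding`, `aggregate`) follow the Vega-Lite documentation as we understand it, so check them against the version you are using.

```python
import json

# Hypothetical inline dataset; "category" and "amount" are made-up field names.
rows = [
    {"category": "A", "amount": 28},
    {"category": "A", "amount": 55},
    {"category": "B", "amount": 43},
    {"category": "C", "amount": 91},
    {"category": "C", "amount": 81},
]

# A bar chart: one bar per category, with bar height set to the mean amount.
spec = {
    "data": {"values": rows},
    "mark": "bar",
    "encoding": {
        # Map the "category" field to the horizontal position.
        "x": {"field": "category", "type": "ordinal"},
        # Aggregate "amount" per category and map the result to the vertical position.
        "y": {"aggregate": "mean", "field": "amount", "type": "quantitative"},
    },
}

# Serialize to the JSON text that a Vega-Lite renderer would consume.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The specification never says how to draw axes, scales, or legends; in this sketch those details are left to the defaults that the compiler to Vega is expected to fill in, which is where the conciseness comes from.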
Comparisons and context are at the core of data visualization. We humans have a hard time grasping large numbers, such as "9 trillion gallons of rain," so transforming that magnitude into a pictorial illustration may help. Here's a graphic by The Washington Post, part of this story. John Grimwade has a great collection of this kind of side-by-side pictorial comparison. These comparisons surprise and illuminate, but they are quite limited. Sometimes it may be preferable to present data in a more abstract and precise manner, like this (h/t Sam Lillo). Which graphics are better, the pictorial or the abstract?
Effective visualization resizing is important for many visualization tasks, where users may have display devices with different sizes and aspect ratios. Our recently designed framework can adapt a visualization to different displays by transforming the resizing problem into a non-linear optimization problem. However, it does not scale to large amounts of dense information: when dense information is presented on the target display, the result is undesirably cluttered. We present an extension to our resizing framework that seamlessly integrates a sampling-based data abstraction mechanism, so that it scales not only with different display sizes but also with different amounts of information.
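As a rough illustration of how sampling can tie the amount of displayed data to the size of the target display, the sketch below keeps at most one point per cell of a grid whose resolution matches the display. This is a simple stand-in chosen for clarity, not the abstraction mechanism of the framework described above; the function name `sample_for_display`, the grid-cell strategy, and the assumption that points are normalized to the unit square are all ours.

```python
from typing import List, Set, Tuple

Point = Tuple[float, float]


def sample_for_display(points: List[Point], cols: int, rows: int) -> List[Point]:
    """Keep the first point that falls into each cell of a cols x rows grid.

    Points are assumed to be normalized to [0, 1] x [0, 1]. The result has
    at most cols * rows points, so a smaller display yields a sparser sample.
    """
    if cols <= 0 or rows <= 0:
        raise ValueError("cols and rows must be positive")

    occupied: Set[Tuple[int, int]] = set()
    kept: List[Point] = []
    for x, y in points:
        # Map the point to a grid cell, clamping x == 1.0 or y == 1.0 into the last cell.
        cell = (min(int(x * cols), cols - 1), min(int(y * rows), rows - 1))
        if cell not in occupied:
            occupied.add(cell)
            kept.append((x, y))
    return kept


# Hypothetical dense dataset: a 100 x 100 lattice of points in the unit square.
dense = [(i / 100, j / 100) for i in range(100) for j in range(100)]

# Abstract the same data for a small and a larger target display.
small = sample_for_display(dense, cols=10, rows=10)
large = sample_for_display(dense, cols=50, rows=50)
print(len(dense), len(small), len(large))
```

Because the number of kept points is bounded by the number of grid cells, the same dataset yields a sparser sample on a small display and a denser one on a large display, which is the kind of coupling between display size and information density that the extension aims for.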