Ben Shneiderman from the University of Maryland, College Park, visited Yahoo! on Monday, February 9, 2009, and gave a fabulous talk entitled "Information Visualization for Knowledge Discovery."
A well-known professor in the research community, Ben founded the University of Maryland's Human-Computer Interaction Laboratory and has authored many books and articles on effective human-computer interaction. It is unsurprising, therefore, that his work focuses on how people understand and interact with datasets, and on creating tools and applications for domain experts.
Ben emphasized that a deep understanding of human information processing capabilities and propensities can help application designers and developers create tools that help people "use vision to think." My favorite phrase from his talk: "discovery takes place between the ears." In Ben's view, major conceptual leaps and significant new discoveries are made by people -- not by machine learning or visualization techniques.
Over the years, Ben and his students have created great applications that combine powerful data-mining methods with visualizations, where the interfaces are user-controlled and designed with human perception and creativity in mind. A recurring theme, or Ben's "mantra" for viewing big datasets, is "overview first, zoom and filter, then details-on-demand." Breaking that out, Ben stresses the importance of well-designed data overviews, combined with methods for allowing people to zoom in on areas of interest and filter out unwanted items, and the ability to click for details on demand.
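To make the mantra concrete, here is a minimal sketch of the three steps as plain Python functions. It is not taken from Ben's talk or from any of his tools: the `Stock` record, the sample data, and the function names are all invented for illustration, and a real tool would render these results visually rather than return them.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Stock:
    ticker: str
    sector: str
    change_pct: float


# Hypothetical sample data, invented for illustration only.
STOCKS = [
    Stock("AAA", "Tech", 1.2),
    Stock("BBB", "Tech", -0.4),
    Stock("CCC", "Energy", 2.5),
    Stock("DDD", "Energy", -1.5),
    Stock("EEE", "Health", 0.3),
]


def overview(stocks):
    """Overview first: one summary number (mean change) per sector."""
    by_sector = {}
    for s in stocks:
        by_sector.setdefault(s.sector, []).append(s.change_pct)
    return {sector: round(mean(vals), 2) for sector, vals in by_sector.items()}


def zoom_and_filter(stocks, sector, min_change=None):
    """Zoom into one sector, then optionally drop items below a threshold."""
    selected = [s for s in stocks if s.sector == sector]
    if min_change is not None:
        selected = [s for s in selected if s.change_pct >= min_change]
    return selected


def details(stocks, ticker):
    """Details on demand: the full record for one item, or None if absent."""
    return next((s for s in stocks if s.ticker == ticker), None)


if __name__ == "__main__":
    print(overview(STOCKS))
    print(zoom_and_filter(STOCKS, "Tech", min_change=0.0))
    print(details(STOCKS, "CCC"))
```

The point of the ordering is that each step narrows what the person has to look at: the summary suggests where to look, the filter removes what is not of interest, and the full record is fetched only when asked for.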
Ben illustrated his points with a number of demos and screenshots, including Spotfire, a business intelligence visualization tool that was acquired by Tibco in 2007; the Smart Money treemap, a powerful interactive treemap visualization that shows the changing stock prices of over 500 publicly traded companies in (almost) real-time on a single screen; IBM's Many Eyes data visualization tools; and the Hive Group's treemap software, used by many enterprises and government agencies. He also showed work he has done more recently with students and colleagues.
TimeSearcher is an application that allows visual exploration of large time-series datasets, with applications in financial, medical, and genomic domains. The Hierarchical Clustering Explorer includes the rank-by-feature framework. The LifeLines project illustrates how unifying statistics with visualization over network data and electronic health records can reveal surprising relationships between geographic location, poverty, health, and so on. One of the most compelling visualizations for me personally was SocialAction, which is the work of Ben and his colleague Adam Perer. Here again, activity statistics and visualization are combined with social network analysis. Check out the work they did on visualizing the voting patterns among United States senators. Fascinating!
Ben clearly believes -- and I agree with him -- that manipulable, nested, and faceted data visualizations are particularly helpful for analysis of dynamic data sets. Imagine being able to visualize activity on your network or drill into what is happening with a suite of APIs from multiple perspectives and to surface unexpected correlations or what appear to be data anomalies. With powerful, flexible visualization tools, exploratory data analysis becomes fun and game-like, and you find answers to questions you didn't know you had. Which leads to new questions that lead to new answers, and so on down the rabbit hole, until you end up seeing your data world in a fresh way.
Such exploratory, interactive analyses are good for business. Think of mashups, which are quintessentially about surfacing unexpected relationships through data integration/combination and visualization. Richard MacManus, in a ReadWriteWeb blog post from 2006, spells out some thoughts on the advantages of rich data visualization for businesses such as advertising, services (see simplyhired.com), lead generation, affiliate programs, and subscriptions. More recently we have been hearing a lot about "business intelligence mashups." Michael Ogrinz's book "Mashup Patterns: Designs and Examples for the Modern Enterprise" outlines some ways in which data mashups and visualization can offer advantages for enterprises. (Here's a review.)
In the context of the tools he presented, Ben emphasized three ways in which rich, interrogable and usable data visualizations offer a competitive advantage for companies: by offering strategies for discovery that accommodate missing and uncertain data; by visualizing complex data types such as time series, patient histories, maps, and social networks, so people can handle a wider array of problems; and by offering a tighter integration of data exploration and discovery into organizational workflows that can not only "amplify individual creativity" but also offer the "catalytic benefits" of social creativity.
The talk ended with a rallying cry: There is much more work to be done and we need new ways of thinking about dataset visualization and exploration tools. And if you need convincing that this is a good area to work in (and I mean lucrative as well as intellectually stimulating), check out Ben's web post about Spotfire's humble beginnings and its acquisition by Tibco.
For more general information, check out Ben's home page, and for related presentations, check out the research group's presentations website.
Principal Research Scientist