Visual Exploration of Multi-dimensional Data Sets
Sam Gerber (Duke Math)
In many scientific fields, data acquisition and storage are relatively cheap and fast, leading to large data sets that capture very detailed information. This talk will cover two new approaches to visualizing and exploring such large multi-dimensional data sets. The first approach focuses on the visualization of high-dimensional scalar functions. An important goal of scientific data analysis is to understand the behavior of a system or process based on a sample of the system. In many instances it is possible to observe both input parameters and system outputs, and to characterize the system as a high-dimensional function. Such data sets arise, for instance, in numerical simulations, as energy landscapes in optimization problems, or in the analysis of socio-economic data. The second approach concerns visualizing correlation. The degree of correlation between random variables is a key quantity in many scientific inquiries. In today's scientific process, correlation is often used as an exploratory tool to help form new hypotheses and sift through vast amounts of data. However, low-level visualization tools for exploratory correlation analysis lack the capacity to deal with these increasingly large data sets. We develop a visualization method for Pearson's correlation that takes advantage of human pattern recognition capabilities to explore correlations and is able to scale to data sets with tens to hundreds of thousands of random variables.
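As background for the second approach, the sketch below computes Pearson's correlation for every pair of variables. It is not the visualization method of the talk, only an illustration of the quantity being visualized. It assumes NumPy, rows as observations and columns as variables, and no constant columns; the function name pearson_matrix and the synthetic data are hypothetical.

```python
import numpy as np

def pearson_matrix(X):
    """Pearson correlation between the columns of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    # Center each column, then scale it to unit Euclidean norm.
    # Assumption: no column is constant, otherwise the norm is zero.
    Z = X - X.mean(axis=0)
    Z = Z / np.linalg.norm(Z, axis=0)
    # The correlation of two variables is the dot product of their unit vectors.
    return Z.T @ Z

# Synthetic example data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2.0 * X[:, 0] + 1.0   # perfectly correlated with column 0
X[:, 2] = -X[:, 0]              # perfectly anti-correlated with column 0
R = pearson_matrix(X)
```

After centering and normalization, each variable is a unit vector and its correlation with another variable is the cosine of the angle between them. A dense matrix of this kind for 100,000 variables would hold 10^10 entries, roughly 80 GB in double precision, which gives a sense of why exploratory tools need a different strategy at the scale mentioned above.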