Mondrian data analysis
Mondrian is a general-purpose statistical data-visualization system. It offers visualization techniques for data of almost any kind and is particularly strong, compared to other tools, when working with categorical data, geographical data and large data sets.
All plots in Mondrian are fully linked, and offer various interactions and queries. Any case selected in a plot in Mondrian is highlighted in all other plots.
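
The linking described above can be pictured as one selection shared by every open plot. The following is a minimal sketch of that idea in Python, not Mondrian's actual implementation; the class and method names (Plot, LinkedSelection, on_selection) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Plot:
    """Hypothetical stand-in for one plot window."""
    name: str
    highlighted: set = field(default_factory=set)

    def on_selection(self, selected_cases):
        # A real plot would redraw here; this sketch only records the cases.
        self.highlighted = set(selected_cases)


class LinkedSelection:
    """Holds the single selection shared by all registered plots."""

    def __init__(self):
        self.plots = []
        self.selected = set()

    def register(self, plot):
        # A newly opened plot immediately receives the current selection.
        self.plots.append(plot)
        plot.on_selection(self.selected)

    def select(self, case_ids):
        # Replace the selection and notify every registered plot.
        self.selected = set(case_ids)
        for plot in self.plots:
            plot.on_selection(self.selected)


link = LinkedSelection()
barchart = Plot("barchart")
scatterplot = Plot("scatterplot")
link.register(barchart)
link.register(scatterplot)

# Selecting cases 2 and 5 in any one plot highlights them in all plots.
link.select({2, 5})
print(sorted(scatterplot.highlighted))
```

Keeping the selection in one place, rather than in each plot, is what lets a selection made in one view appear in every other view without the plots knowing about each other.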

Currently implemented plots comprise mosaic plots, scatterplots and scatterplot matrices (SPLOM), maps, barcharts, histograms, missing value plots, parallel coordinates/boxplots and boxplots y by x.

Mondrian works with data in standard tab-delimited or comma-separated ASCII files and can load data from R workspaces. There is basic support for working directly on data in databases.

Mondrian links to R and offers statistical procedures like interactive density estimation, scatterplot smoothers, multidimensional scaling (MDS) and principal component analysis (PCA).

Overview

Development of Mondrian started in 1997, initially with a focus on visualization techniques for categorical data and enhanced selection techniques. Over the years, a complete suite of visualizations for univariate and multivariate data measured on any scale was added. The link to R offers well-tested statistical procedures, which integrate seamlessly into the interactive graphics. Today, even geographical data is supported with highly interactive maps.

Supported data sources

Mondrian works on plain text files with tab-separated columns and a header row of variable names, as exported from Microsoft Excel as ".txt". If the Rserve link and R are present, Mondrian also reads data directly from R workspace files (.RData files).
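
As an illustration of the tab-separated layout described above, the following Python sketch writes a small file with a header row of variable names and reads it back. The variable names, values and file name are invented for the example; the sketch assumes nothing about Mondrian beyond the tab-separated columns and header row stated in the text.

```python
import csv
import os
import tempfile

# Invented example data: one header row of variable names, one row per case.
fieldnames = ["name", "class", "age"]
rows = [
    {"name": "case1", "class": "first", "age": "29"},
    {"name": "case2", "class": "second", "age": "41"},
    {"name": "case3", "class": "third", "age": "35"},
]

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Write tab-separated columns with the variable names in the first line.
with open(path, "w", newline="", encoding="ascii") as f:
    writer = csv.DictWriter(
        f, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to check the layout: header first, then the cases.
with open(path, newline="", encoding="ascii") as f:
    reader = csv.reader(f, delimiter="\t")
    header = next(reader)
    cases = list(reader)

print(header)
print(len(cases), "cases")
```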

Visualizations

  • 1-d: Barchart, Spineplot, Histogram, Spinogram, Boxplot
  • 2-d: Scatterplot, Boxplot y by x
  • High-D:
    • Multivariate continuous: Scatterplot matrix, Parallel coordinates
    • Multivariate categorical: Mosaic plot (see also Treemapping)
  • Geographical: Map
  • Special: missing value plot

Further reading

  • Theus, M. (2002). Interactive Data Visualization using Mondrian. Journal of Statistical Software 7 (11): 1–9.
  • Theus, M. and Urbanek, S. (2008). Interactive Graphics for Data Analysis: Principles and Examples (Computer Science and Data Analysis), Chapman & Hall / CRC.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 