Data drilling
Data drilling refers to any of various operations and transformations on tabular, relational, and multidimensional data. The term is used in various contexts, but is primarily associated with specialized software designed for data analysis.

Common data drilling operations

Certain operations are common to applications that support data drilling. Among them are:

Query operations:
  • tabular query
  • pivot query

Tabular query

Tabular query operations consist of standard operations on data tables.

Among these operations are:
  • search
  • sort
  • filter (by value)
  • filter (by extended function or condition)
  • transform (e.g., by adding or removing columns)


Consider the following example:

Fred and Wilma table (Fig 001):

gender , fname , lname , home
male , fred , chopin , Poland
male , fred , flintstone , bedrock
male , fred , durst , usa
female , wilma , flintstone , bedrock
female , wilma , rudolph , usa
female , wilma , webb , usa
male , fred , johnson , usa

The preceding is an example of a simple flat-file table formatted as comma-separated values. The table lists the gender, first name, last name, and home for various people named Fred or Wilma. Although the example uses this format, tabular query operations (like all data drilling operations) can be applied to data of any type, regardless of the underlying formatting. The only requirement is that the data be readable by the software application in use.
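The tabular query operations listed earlier can be sketched against the Fred and Wilma table. The following Python is a minimal illustration, not the behavior of any particular data drilling application: the helper name load_rows, the search text "flint", and the "longer than six characters" condition are all assumptions chosen for the example.

```python
import csv
import io

# The Fred and Wilma table (Fig 001), reproduced as comma-separated text.
RAW = """\
gender , fname , lname , home
male , fred , chopin , Poland
male , fred , flintstone , bedrock
male , fred , durst , usa
female , wilma , flintstone , bedrock
female , wilma , rudolph , usa
female , wilma , webb , usa
male , fred , johnson , usa
"""


def load_rows(text):
    """Parse the table into a list of dicts, trimming padding around values.

    load_rows is a hypothetical helper written for this example.
    """
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    return [
        dict(zip(header, (cell.strip() for cell in line)))
        for line in reader
        if line
    ]


rows = load_rows(RAW)

# search: rows where any field contains the text "flint"
found = [r for r in rows if any("flint" in v for v in r.values())]

# sort: order rows by last name
by_lname = sorted(rows, key=lambda r: r["lname"])

# filter (by value): rows whose home is exactly "usa"
in_usa = [r for r in rows if r["home"] == "usa"]

# filter (by condition): rows whose last name is longer than six characters
long_names = [r for r in rows if len(r["lname"]) > 6]

# transform: drop the gender column and add a derived full_name column
transformed = [
    {"full_name": f"{r['fname']} {r['lname']}", "home": r["home"]}
    for r in rows
]
```

Each operation returns a new list and leaves rows untouched, so the operations can be chained or applied in any order.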

Pivot query

A pivot query allows multiple representations of data according to different dimensions. This query type is similar to a tabular query, except that it also allows data to be represented in summary format, according to a flexible, user-selected hierarchy. This class of data drilling operation is known, formally and informally, by different names, including crosstab query, pivot table, data pilot, selective hierarchy, intertwingularity, and others.

To illustrate the basics of pivot query operations, consider the Fred and Wilma table (Fig 001). A quick scan of the data reveals that the table contains redundant information. This redundancy could be consolidated using an outline, a tree structure, or some other representation. Moreover, once consolidated, the data could be presented in many alternate layouts.

Using a simple text outline as output, the following alternate layouts are all possible with a pivot query:

Summarize by gender (Fig 001):
female
  flintstone, wilma
  rudolph, wilma
  webb, wilma
male
  chopin, fred
  flintstone, fred
  durst, fred
  johnson, fred

(Dimensions = gender; Tabular fields = lname, fname;)
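A single-dimension summary like the one above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name summarize_by_gender and the two-space indent are choices made for this example, and rows are kept in table order within each group.

```python
from collections import defaultdict

# Rows of the Fred and Wilma table (Fig 001): (gender, fname, lname, home).
ROWS = [
    ("male", "fred", "chopin", "Poland"),
    ("male", "fred", "flintstone", "bedrock"),
    ("male", "fred", "durst", "usa"),
    ("female", "wilma", "flintstone", "bedrock"),
    ("female", "wilma", "rudolph", "usa"),
    ("female", "wilma", "webb", "usa"),
    ("male", "fred", "johnson", "usa"),
]


def summarize_by_gender(rows):
    """Group rows under their gender and list 'lname, fname' beneath each.

    summarize_by_gender is a hypothetical name used only in this sketch.
    """
    groups = defaultdict(list)
    for gender, fname, lname, _home in rows:
        groups[gender].append(f"{lname}, {fname}")

    lines = []
    for gender in sorted(groups):  # dimension values in alphabetical order
        lines.append(gender)
        lines.extend("  " + entry for entry in groups[gender])
    return "\n".join(lines)


outline = summarize_by_gender(ROWS)
print(outline)
```

The dimension value (gender) appears once per group rather than once per row, which is how the redundancy in the flat table is consolidated.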

Summarize by home, lname (Fig 001):
bedrock
  flintstone
    fred
    wilma
Poland
  chopin
    fred
usa
  ...

(Dimensions = home, lname; Tabular fields = fname;)
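The user-selected hierarchy can be generalized so that the list of dimensions is a parameter rather than fixed in code. The sketch below is one possible approach, not a description of how any spreadsheet or database product implements pivot queries; the names pivot and render, the nested-dict representation, and the case-insensitive ordering of keys are all assumptions made for this example.

```python
FIELDS = ("gender", "fname", "lname", "home")

# Rows of the Fred and Wilma table (Fig 001) as dicts keyed by field name.
ROWS = [
    dict(zip(FIELDS, values))
    for values in [
        ("male", "fred", "chopin", "Poland"),
        ("male", "fred", "flintstone", "bedrock"),
        ("male", "fred", "durst", "usa"),
        ("female", "wilma", "flintstone", "bedrock"),
        ("female", "wilma", "rudolph", "usa"),
        ("female", "wilma", "webb", "usa"),
        ("male", "fred", "johnson", "usa"),
    ]
]


def pivot(rows, dimensions, tabular_fields):
    """Nest rows by each dimension in turn (hypothetical helper).

    Returns a nested dict with one level per dimension; the innermost
    values are lists of strings built from the tabular fields.
    """
    if not dimensions:
        return [", ".join(r[f] for f in tabular_fields) for r in rows]
    first, rest = dimensions[0], dimensions[1:]
    groups = {}
    for r in rows:
        groups.setdefault(r[first], []).append(r)
    # Order keys case-insensitively so "Poland" sorts between the others.
    ordered = sorted(groups.items(), key=lambda kv: kv[0].lower())
    return {key: pivot(group, rest, tabular_fields) for key, group in ordered}


def render(tree, depth=0):
    """Flatten the nested structure into indented outline lines."""
    indent = "  " * depth
    if isinstance(tree, list):
        return [indent + leaf for leaf in tree]
    lines = []
    for key, subtree in tree.items():
        lines.append(indent + key)
        lines.extend(render(subtree, depth + 1))
    return lines


# Dimensions = home, lname; Tabular fields = fname
lines = render(pivot(ROWS, ["home", "lname"], ["fname"]))
print("\n".join(lines))

# The same data under a different hierarchy:
# Dimensions = gender; Tabular fields = lname, fname
gender_lines = render(pivot(ROWS, ["gender"], ["lname", "fname"]))
```

Changing the layout only requires passing a different dimensions list; the data and the functions stay the same.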

Uses

Pivot query operations are useful for summarizing a corpus of data in multiple ways, thereby illustrating different representations of the same basic information. Although this type of operation appears prominently in spreadsheets and desktop database software, its flexibility is arguably under-utilized. Many applications allow only a 'fixed' hierarchy for representing data, which is a substantial limitation.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 