Kalido
Kalido is a software company headquartered in Burlington, Massachusetts with offices in the US, London and India.

History

The ideas behind Kalido started in 1985, when the Royal Dutch/Shell Group began twelve years of advanced data-modeling research, involving highly generic models and time variance.

Between 1997 and 2000, a Shell team led by Andy Hayler spotted the opportunity to develop Kalido software on the basis of this research to solve the challenge of obtaining performance information across multiple Shell organizations throughout business change. The software was deployed within Shell in 100 countries worldwide, powering dozens of projects and generating tens of millions of dollars of annual cost savings.

Kalido DIW

The architecture of Kalido DIW (Dynamic Information Warehouse) is based on "generic data modeling" principles. Generic data modeling is an advanced database design technique that offers advantages over conventional designs. Shell developed the technique and offered the data design approach to the ISO standards community. The approach is now used extensively within the ISO STEP world.

In this approach, the structure of the data is itself held as data, rather than being fixed by a specific physical database design. Generic data modeling is therefore a radical departure from traditional data modeling principles.
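
To make the idea of "structure held as data" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The three tables (entity_class, attribute, attribute_value), the helper functions and the sample records are all assumptions invented for illustration; they are not Kalido DIW's actual schema or API.

```python
import sqlite3

# Hypothetical generic schema: these three tables never change shape.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity_class    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE attribute       (id INTEGER PRIMARY KEY, class_id INTEGER, name TEXT);
    CREATE TABLE attribute_value (entity_id INTEGER, attribute_id INTEGER, val TEXT);
""")


def define_class(name, attributes):
    """Describe a new kind of entity purely by inserting metadata rows."""
    class_id = conn.execute(
        "INSERT INTO entity_class (name) VALUES (?)", (name,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO attribute (class_id, name) VALUES (?, ?)",
        [(class_id, attr) for attr in attributes],
    )
    return class_id


def store(entity_id, class_name, **fields):
    """Store one entity as rows, one row per attribute value."""
    for attr_name, value in fields.items():
        row = conn.execute(
            "SELECT a.id FROM attribute a "
            "JOIN entity_class c ON c.id = a.class_id "
            "WHERE c.name = ? AND a.name = ?",
            (class_name, attr_name),
        ).fetchone()
        if row is None:
            raise KeyError(f"{class_name}.{attr_name} is not defined in the metadata")
        conn.execute(
            "INSERT INTO attribute_value VALUES (?, ?, ?)",
            (entity_id, row[0], str(value)),
        )


def fetch(entity_id):
    """Reassemble an entity from its attribute rows."""
    rows = conn.execute(
        "SELECT a.name, v.val FROM attribute_value v "
        "JOIN attribute a ON a.id = v.attribute_id "
        "WHERE v.entity_id = ?",
        (entity_id,),
    ).fetchall()
    return dict(rows)


def table_count():
    return conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
    ).fetchone()[0]


# Sample data is invented for the example.
define_class("product", ["name", "brand"])
store(1, "product", name="Widget", brand="BrandX")

# A new class of data arrives: only metadata rows are added, no ALTER TABLE.
define_class("territory", ["name", "region"])
store(2, "territory", name="North", region="Region-1")
```

In this sketch the physical design stays at three tables however many classes are defined, which is the property the paragraph above describes.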

Kalido MDM

Kalido MDM (Master Data Management) is a software application for harmonizing, storing and managing master data over time. It increases the
consistency and accuracy of corporate performance reporting by enabling business people to collaboratively manage
and control master data in a workflow-driven environment. It produces a master data warehouse from which “golden-copy”
master data can be distributed to enterprise applications and business people throughout the organization.

Kalido MDM features:
  • Manages any type of master data, from products and customers to brands, markets, territories and more.
  • Facilitates data governance in a collaborative, workflow-driven environment.
  • Provides flexible master data modeling, with cataloging, segmenting, merging and mapping facilities.
  • Loads non-conforming master data: all master data is loaded even if it doesn't conform to the master data model. Workflows can be used to ensure that the data, or the model, is revised accordingly.
  • Maintains master data history.

Generic Modeling and the Data Warehouse

The generic structure, compared to the traditional data warehouse design based on third normal form schemas and snowflake or star schemas, has both advantages and disadvantages.

Advantages
  • The generic structure can store time-variant business context data (i.e., changes to the business context data that happen over time, such as a reorganization where departments are grouped differently), without requiring any database design changes. By contrast, traditional data models represent a snapshot of the requirements that were valid at the time the model was created. This makes it difficult to store historic data, which may require as much analysis as the current data. Often historic information is discarded due to the extra design required.

  • The generic structure presents a highly standardized approach to loading and retrieval, enabling the automatic creation of loading and retrieval routines by Kalido DIW.

  • The generic structure enables the loading of new classes of data through the simple addition of a few records of metadata. Conventionally, changes in requirements cause changes to the design, requiring a database administrator to alter the table structure of the warehouse and to reorganize the data in the database. The costs and time involved can be considerable.

  • The generic structure allows the capture of complex business rules that are difficult to capture using a conventional relational structure.

  • The use of metadata allows the structure of business context and transaction data to be easily understood by business users.
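
The time-variance advantage listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Membership record, the parent_as_of function and the department names are hypothetical, and the half-open validity interval is one common convention rather than anything the source specifies.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Membership:
    """One dated link between a child and its parent in the business structure."""
    child: str
    parent: str
    valid_from: date                  # inclusive
    valid_to: Optional[date] = None   # exclusive; None means still current


# Invented example: a department is regrouped on 2000-01-01.
# The reorganization adds a row; no table design changes.
history = [
    Membership("Sales-East", "Region-1", date(1998, 1, 1), date(2000, 1, 1)),
    Membership("Sales-East", "Region-2", date(2000, 1, 1)),
]


def parent_as_of(child, when, rows):
    """Return the parent that applied to `child` on the date `when`, if any."""
    for m in rows:
        starts_ok = m.valid_from <= when
        ends_ok = m.valid_to is None or when < m.valid_to
        if m.child == child and starts_ok and ends_ok:
            return m.parent
    return None


before = parent_as_of("Sales-East", date(1999, 6, 1), history)
after = parent_as_of("Sales-East", date(2001, 6, 1), history)
```

Because each link carries its own validity dates, both the historic and the current grouping remain queryable side by side.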


Disadvantages
A pure implementation of generic modeling principles brings some disadvantages:
  • A conventional star schema can give better performance than a physical implementation of the generic structure. Kalido DIW addresses this issue by combining elements of the generic structure with those of a star schema.

  • The generic structure represents the business structure as multiple rows linked by pointers, instead of as conventional columns in a table. This makes the data difficult to read and the SQL difficult to write, so a code-generating front-end is required to read and load data. Kalido DIW has such a front-end.
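
As a rough illustration of why generated SQL helps, the sketch below builds a query that pivots attribute rows back into conventional columns. It uses Python's sqlite3 module; the table layout, the generate_pivot_sql function and the sample rows are assumptions made for this example and do not reflect how Kalido DIW's own front-end works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical generic tables: attributes are metadata, values are rows.
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, class_name TEXT, name TEXT);
    CREATE TABLE attribute_value (entity_id INTEGER, attribute_id INTEGER, val TEXT);
    INSERT INTO attribute VALUES (1, 'customer', 'name'), (2, 'customer', 'country');
    INSERT INTO attribute_value VALUES
        (10, 1, 'Acme'),   (10, 2, 'UK'),
        (11, 1, 'Globex'), (11, 2, 'US');
""")


def generate_pivot_sql(class_name):
    """Generate a SELECT with one output column per attribute of the class."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM attribute WHERE class_name = ? ORDER BY id",
            (class_name,),
        )
    ]
    if not names:
        raise ValueError(f"no attributes defined for class {class_name!r}")
    for name in names:
        # Attribute names are interpolated into SQL, so only accept plain identifiers.
        if not name.isidentifier():
            raise ValueError(f"unsafe attribute name {name!r}")
    columns = ",\n".join(
        f"    MAX(CASE WHEN a.name = '{name}' THEN v.val END) AS {name}"
        for name in names
    )
    return (
        "SELECT v.entity_id,\n"
        f"{columns}\n"
        "FROM attribute_value v\n"
        "JOIN attribute a ON a.id = v.attribute_id\n"
        "WHERE a.class_name = ?\n"
        "GROUP BY v.entity_id\n"
        "ORDER BY v.entity_id"
    )


sql = generate_pivot_sql("customer")
customers = conn.execute(sql, ("customer",)).fetchall()
```

The generated statement grows with every attribute in the metadata, which hints at why writing such queries by hand becomes impractical.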


Although the generic structure differs from conventional designs, it is far easier to query once understood, because it combines the business metadata dictionary with the business context data. Finding out where something is stored is far simpler than navigating through hundreds of obscure tables.

Implications
Given the above advantages and disadvantages, the ideal arrangement is a mix of the generic design for business context data and the star schema for transaction data and retrieval. This has been the basis for the physical implementation of Kalido DIW. The results of the Kalido implementation have shown that this design can, and does, work. Kalido has UK patents on this design. The generic design of Kalido DIW is highly flexible, but on its own it could have made processing transactions against the hierarchies of business context data rather inefficient. To improve performance, the complex hierarchies are automatically flattened out by Kalido DIW to create "mapping" tables.

These mapping tables are complex and contain the full structure of the business context data hierarchies, including the date and time stamping of changes. They are regenerated when either the master data or its structure changes, so Kalido DIW fully manages both the generic data storage and its replication in mapping tables. This replication is done incrementally and can be delayed, so that bulk changes can be made over a period with only a single generation of the mapping tables concerned. This ensures that optimum performance is delivered when accessing both the generic data for exploration queries and the mapping tables for OLAP queries.
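
The flattening step can be sketched as follows. This is a simplified illustration, not Kalido's algorithm: the flatten function, the MappingCache class and the node names are hypothetical, date and time stamps are omitted, and the whole mapping is rebuilt on demand rather than incrementally.

```python
def flatten(parent_of):
    """Expand a child -> parent dict into (descendant, ancestor, depth) rows."""
    rows = []
    for node in parent_of:
        seen = {node}
        current, depth = node, 0
        # Walk up the hierarchy until a node with no recorded parent is reached.
        while parent_of.get(current) is not None:
            current = parent_of[current]
            if current in seen:
                raise ValueError(f"cycle detected at {current!r}")
            seen.add(current)
            depth += 1
            rows.append((node, current, depth))
    return sorted(rows)


class MappingCache:
    """Defers regeneration so that bulk edits trigger a single rebuild."""

    def __init__(self, parent_of):
        self.parent_of = dict(parent_of)
        self._rows = None
        self.generations = 0  # how many times the mapping rows were rebuilt

    def set_parent(self, child, parent):
        self.parent_of[child] = parent
        self._rows = None  # mark the mapping rows as stale; do not rebuild yet

    def rows(self):
        if self._rows is None:
            self._rows = flatten(self.parent_of)
            self.generations += 1
        return self._rows


# Invented hierarchy for the example.
cache = MappingCache(
    {"store_1": "district_a", "district_a": "region_x", "region_x": None}
)
initial = list(cache.rows())

# Two bulk changes, then one regeneration when the rows are next read.
cache.set_parent("store_2", "district_a")
cache.set_parent("district_a", "region_y")
updated = cache.rows()
```

A query that rolls transactions up to any level can then join against the flat rows instead of walking the hierarchy each time.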

The creation of mapping tables makes a Kalido warehouse appear like any other star schema. Conventional star schemas include the business context data, but they hold it in keyed reference tables with all the attributes, classifications, etc. as columns. This causes duplication of data and difficulty in maintenance, but is fast to process. Because the mapping tables give the Kalido warehouse the same shape, it can equal the query performance of a conventional design. The creation of the mapping tables can be a scheduled task, or the user can initiate it. Batch tasks can also be used for business context data loading, transaction loading, summary generation, mapping table generation, data mart building, or export of transaction or business context data.

Data marts are generated by extracting information from the warehouse in a form that can be analyzed using tools such as Excel or BusinessObjects to slice and dice, or drill down through it. The data mart can be separated from the database, and small ones can take the form of Excel pivot tables, which can be taken away on a portable computer for offline analysis.

In summary, one of the requirements of a data warehouse is that it should be capable of storing and managing almost any data from any source.

In a Kalido warehouse:
  • Information is held in a neutral format, i.e. not limited to a particular type of business data.
  • There are neutral formats for transaction data and business context data.


Metadata is used for:
  • validation and loading of data into the warehouse
  • structuring data in the warehouse
  • defining data marts


The neutral formats allow you to select and view information as you want in data marts.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 