Talend Open Profiler
Talend Open Profiler is an open source computer software project for data profiling. The project is driven by the commercial open source vendor Talend.

Talend Open Profiler Project

Talend Open Profiler is open source software that enables companies to assess the quality of data contained in their databases and business applications, and to decide which actions must be taken to correct erroneous or incomplete data.

Talend Open Profiler addresses the following profiling needs:
  • Metadata discovery, which identifies the structure of the databases that need to be analyzed.
  • Statistics definition, which defines the statistics and metrics that need to be measured on each data item.
  • Results and graphs, which make it easy to view the results and assess the level of quality of the data.
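
The three profiling needs listed above can be illustrated with a small, generic sketch. It is not Talend Open Profiler's implementation and does not use any Talend API; it assumes an SQLite database accessed through Python's standard sqlite3 module, and the function names (discover_columns, profile_column, profile_table, demo) and the customers table are hypothetical, chosen only for illustration.

```python
import sqlite3


def discover_columns(conn, table):
    """Metadata discovery: list (column name, declared type) pairs for a table."""
    # PRAGMA table_info returns one row per column: (cid, name, type, ...).
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [(row[1], row[2]) for row in rows]


def profile_column(conn, table, column):
    """Statistics definition: row count, null count and distinct count for one column."""
    total, nulls, distinct = conn.execute(
        f'SELECT COUNT(*), SUM("{column}" IS NULL), COUNT(DISTINCT "{column}") '
        f'FROM "{table}"'
    ).fetchone()
    # SUM over an empty table yields NULL, so fall back to 0.
    return {"rows": total, "nulls": nulls or 0, "distinct": distinct}


def profile_table(conn, table):
    """Results: map each discovered column to its measured statistics."""
    return {
        name: profile_column(conn, table, name)
        for name, _declared_type in discover_columns(conn, table)
    }


def demo():
    """Build a tiny in-memory table (hypothetical data) and profile it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "a@example.com"), (2, None), (3, "a@example.com")],
    )
    return profile_table(conn, "customers")


if __name__ == "__main__":
    for column, stats in demo().items():
        print(column, stats)
```

In this sketch a missing email shows up as a null count of 1 and the repeated address as a distinct count of 1, which is the kind of signal a profiler reports so that users can decide which data needs correcting. The table and column names are interpolated directly into SQL, so the sketch should only be used with trusted identifiers.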

Licence and extensions

Talend Open Profiler is distributed under GPLv2 and was launched in June 2008.

Talend provides services on Talend Open Profiler, including hosting a community forum for support and tutorials.

Talend also provides Talend Data Quality, integrated in Talend Integration Suite, which is a commercial extension to Talend Open Profiler with additional features, technical support and IP indemnification.

Other major data profiling products

  • Microsoft SQL Server 2008
  • Oracle Data Quality
  • DataCleaner

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 