Clover.ETL
CloverETL is a Java-based data integration framework used to transform, cleanse, standardize and distribute data to applications, databases or warehouses. Its component-based structure allows easy customization and embeddability.

CloverETL suite

  • CloverETL Community - free-of-charge data integration tool with GUI
  • CloverETL Designer - the graphical user interface to create and modify data transformations for CloverETL Server and Engine.
  • CloverETL Engine - executes the transformations (run-time); can be embedded as a library. Available under LGPL.
  • CloverETL Server - full-fledged server application with a rich web-based administrative interface, built on the existing CloverETL Engine.
  • CloverETL Cluster/Cloud - server extension that allows multiple instances to form a data transformation cluster, running in-house or in a cloud environment, with dynamic load balancing and node provisioning.

Main Features

Platform, Application and Database independent
  • utilizing industry standards of Java (SE/EE) and Eclipse


Embeddable
  • both the engine and the server can be embedded as a transformation library/service


Scalable & Efficient
  • Data transformation is performed by independent components, each running as an independent thread – allows utilization of multiple CPUs or cores
  • Compact size, very small memory footprint
  • High speed – Clover outperforms custom scripts by 45% and is the fastest Java Open Source ETL tool
  • Clustering extension is available for CloverETL Server which allows both pipeline & data parallelism


Stable & Mature
  • CloverETL was founded in 2002 and is continuously developed by a stable team of programmers
  • Project is self-funded - no venture capital involved


Customizable
  • Can be quickly and easily extended by custom components

More details

CloverETL is a Java-based ETL tool with open-source components. It can be used standalone, as a command-line application or server application, or can be easily embedded in other applications (as a Java library). CloverETL is accompanied by CloverGUI, a graphical user interface developed as an Eclipse plugin. A data transformation is described by a transformation graph that is represented by a Java class. The description of a graph can also be stored in XML format. A graph consists of nodes (which perform various simple transformations) and edges (which connect nodes and pass data around). Each node runs as a separate thread, which helps utilize more CPUs (cores). The CloverETL engine can also be used in transaction mode, i.e. transformation graphs are executed repeatedly as a step in a transaction.
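
The node-and-edge model described above can be illustrated with a minimal sketch in plain Java. This is not the CloverETL API: the class name GraphSketch, the method run, the EOF sentinel and the use of bounded queues as edges are all assumptions made for illustration. It only shows the general idea of three nodes (reader, transformer, writer), each running in its own thread and passing records along edges.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; none of these names come from CloverETL.
public class GraphSketch {
    // Sentinel record marking end of data on an edge (an assumed convention).
    private static final String EOF = "\u0000EOF";

    // Runs a three-node graph: reader -> transformer -> writer.
    static List<String> run(List<String> input) throws InterruptedException {
        // Edges are modelled as bounded queues, so a fast node blocks
        // instead of flooding a slow downstream node.
        BlockingQueue<String> edge1 = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> edge2 = new ArrayBlockingQueue<>(16);
        List<String> output = new ArrayList<>();

        // Node 1: reads records from the source and sends them downstream.
        Thread reader = new Thread(() -> {
            try {
                for (String record : input) {
                    edge1.put(record);
                }
                edge1.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Node 2: a simple transformation (trim and upper-case each record).
        Thread transformer = new Thread(() -> {
            try {
                String record;
                while (!(record = edge1.take()).equals(EOF)) {
                    edge2.put(record.trim().toUpperCase(Locale.ROOT));
                }
                edge2.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Node 3: collects the transformed records.
        Thread writer = new Thread(() -> {
            try {
                String record;
                while (!(record = edge2.take()).equals(EOF)) {
                    output.add(record);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        reader.start();
        transformer.start();
        writer.start();
        reader.join();
        transformer.join();
        writer.join();
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of(" alice ", "bob", " carol")));
        // prints [ALICE, BOB, CAROL]
    }
}
```

Because each node only talks to its neighbours through an edge, the three stages can work on different records at the same time, which is the pipeline parallelism the article refers to.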

CloverETL can be easily extended by creating new custom components in Java. Such components can be registered within the GUI and used like any other component delivered in the standard pack.

The server version of CloverETL allows parallel execution of transformations and also supports execution of transformations in transaction mode. The server runs inside an application container.

CloverETL currently contains connectors for the following data sources:

  • text files: delimited*, fixed-length* & combined*
  • XML (supports very large XML data files, in the gigabyte range)
  • XLS (MS Excel)*
  • any RDBMS through JDBC
  • Web services through REST/SOAP protocols
  • JMS
  • LDAP
  • dBase/FoxBase/FoxPro
  • bulk-loaders for Oracle, DB2, MS SQL, Informix, MySQL and PostgreSQL
  • QuickBase (by Intuit)


* Remote reading/writing through FTP/SFTP/HTTP/HTTPS protocols, and also from ZIP/GZIP/TAR archives, is supported.

CloverETL has been successfully deployed on the following OS platforms:

  • Linux (both 32- and 64-bit)
  • Windows (both 32- and 64-bit)
  • HP-UX
  • AIX
  • AS/400 (IBM System i)
  • Solaris
  • Mac OS X

Other open-source Java ETL frameworks

  • Apatar
  • Talend Open Studio
  • expressor
  • Enhydra Octopus (launches from web browser via Java Web Start)
  • Pentaho Data Integration

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 