Waffles (machine learning)
Waffles is a collection of command-line tools for performing machine learning operations, developed at Brigham Young University. These tools are written in C++ and are available under the GNU Lesser General Public License.

Description

The Waffles machine learning toolkit contains command-line tools for performing various operations related to machine learning, data mining, and predictive modeling. The primary focus of Waffles is to provide tools that are simple to use in scripted experiments or processes. For example, the supervised learning algorithms included in Waffles are all designed to support multi-dimensional labels, to perform both classification and regression, to impute missing values automatically, and to apply automatically whatever filters are needed to transform the data into a type that the algorithm supports, so that arbitrary learning algorithms can be used with arbitrary data sets. Many other machine learning toolkits provide similar functionality but require the user to explicitly configure data filters and transformations to make the data compatible with a particular learning algorithm. The algorithms provided in Waffles can also tune their own parameters automatically, at the cost of additional computational overhead.
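
To make the idea of automatic imputation concrete, the following is a minimal C++ sketch of one common strategy: replacing each missing value with the mean of the observed values in its column. It is an illustration only and is not taken from the Waffles source. The names `Matrix` and `impute_column_means` are hypothetical, missing values are assumed to be encoded as NaN, and the strategy Waffles actually applies may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types and names for illustration; not part of the Waffles API.
using Matrix = std::vector<std::vector<double>>;

// Replaces every NaN entry with the mean of the observed (non-NaN)
// values in the same column. Assumes missing values are encoded as NaN.
void impute_column_means(Matrix& rows) {
    if (rows.empty()) return;
    const std::size_t cols = rows.front().size();
    for (std::size_t c = 0; c < cols; ++c) {
        double sum = 0.0;
        std::size_t count = 0;
        for (const auto& row : rows) {
            assert(row.size() == cols);  // every row must have the same width
            if (!std::isnan(row[c])) {
                sum += row[c];
                ++count;
            }
        }
        // A column with no observed values falls back to 0.0 (an arbitrary choice).
        const double fill = count > 0 ? sum / static_cast<double>(count) : 0.0;
        for (auto& row : rows) {
            if (std::isnan(row[c])) row[c] = fill;
        }
    }
}
```

A learner that cannot handle missing values could call such a routine on its training data before fitting, which is the kind of step the toolkit is described as performing on the user's behalf.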

Because Waffles is designed for script-ability, it deliberately avoids presenting its tools in a graphical environment. It does, however, include a graphical "wizard" tool that guides the user to generate a command that will perform a desired task. This wizard does not actually perform the operation, but requires the user to paste the command that it generates into a command terminal or a script. The idea motivating this design is to prevent the user from becoming "locked in" to a graphical interface.

All of the Waffles tools are implemented as thin wrappers around functionality in a C++ class library. This makes it possible to convert scripted processes into native applications with minimal effort.
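
The thin-wrapper design can be sketched in C++ as follows. This is a hypothetical illustration, not Waffles code: the namespace `toylib`, the function `run_tool`, and the `mean` subcommand are all invented for the example. The point is only the layering, in which the library function holds the logic and the tool function does nothing beyond parsing arguments, calling the library, and printing the result.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// "Library" layer: a hypothetical stand-in for a class library.
namespace toylib {
double mean(const std::vector<double>& values) {
    assert(!values.empty());  // callers must supply at least one value
    return std::accumulate(values.begin(), values.end(), 0.0) /
           static_cast<double>(values.size());
}
}  // namespace toylib

// "Tool" layer: parses arguments, delegates to the library, prints the result.
// Returns 0 on success and 1 on a usage or parse error.
int run_tool(const std::vector<std::string>& args,
             std::ostream& out, std::ostream& err) {
    if (args.size() < 2 || args[0] != "mean") {
        err << "usage: toytool mean <number>...\n";
        return 1;
    }
    std::vector<double> values;
    for (std::size_t i = 1; i < args.size(); ++i) {
        std::istringstream in(args[i]);
        double v = 0.0;
        // Require that the whole argument parses as a number.
        if (!(in >> v) || !in.eof()) {
            err << "not a number: " << args[i] << "\n";
            return 1;
        }
        values.push_back(v);
    }
    out << toylib::mean(values) << "\n";
    return 0;
}
```

An executable's `main` would simply forward its arguments and the standard streams to `run_tool`, while a native application could skip the tool layer and call `toylib::mean` directly. That is the sense in which a scripted process built on such tools can be converted into a native application with little effort.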

Waffles was first released as an open source project in 2005. Since that time, it has been developed at Brigham Young University, with a new version released approximately every 6–9 months. Waffles is not an acronym; the toolkit was named after the food for historical reasons.

Advantages

Some of the advantages of Waffles in contrast with other popular open source machine learning toolkits include:
  • Waffles automatically handles many data-format issues, such as imputing missing values and applying the filters an algorithm requires, which keeps its tools simple to use.
  • Because it is implemented in C++, many of its algorithms are particularly fast. Also, the lack of dependency on any virtual machine makes it easier to deploy in conjunction with other applications.
  • The functionality included in Waffles is very broad, including algorithms for dimensionality reduction, collaborative filtering, visualization, clustering, supervised learning, optimization, linear algebra, data transformation, image and signal processing, policy learning, and sparse matrix operations.

Disadvantages

  • Although Waffles provides significant breadth, it lacks the depth of many toolkits that focus on a particular area of machine learning. The Weka toolkit, for example, provides many more classification algorithms than Waffles provides.
  • Waffles has only a limited graphical interface.

See also

  • Weka (machine learning)
  • RapidMiner (formerly YALE (Yet Another Learning Environment)), an open-source machine learning framework implemented in Java, fully integrating Weka
  • List of numerical analysis software
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 