Machine-generated data
Machine-generated data is the generic term for information that is automatically created by a computer process, application, or other machine without human intervention. However, there is some disagreement about the breadth of the term. Curt Monash of Monash Research, who is generally credited with introducing the term, defines it as "data that was produced entirely by machines OR data that is more about observing humans than recording their choices." Meanwhile, Daniel Abadi, a computer science professor at Yale, proposes a narrower definition: "Machine-generated data is data that is generated as a result of a decision of an independent computational agent or a measurement of an event that is not caused by a human action." Regardless of this difference, both definitions exclude data manually entered by an end user. Machine-generated data crosses all industry sectors, and people increasingly generate such data unknowingly.

Relevance of machine-generated data

Machine-generated data tends to be amorphous, and users typically never modify it. Machines often generate this data as a consistent response to an event that has occurred. Because the event is historical, the data is less prone to updates and modifications. Partly because of this quality, U.S. court systems consider machine-generated data highly reliable.

Handling machine-generated data

In 2009, Gartner projected that data would grow by 650% over the following five years. Most of that growth is a byproduct of machine-generated data.
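To put the five-year figure in annual terms, the following sketch converts it into an equivalent compound yearly growth rate. It assumes that "grow by 650%" means the final volume is 7.5 times the starting volume, and that growth compounds evenly each year; neither assumption comes from the Gartner projection itself, and the function name is illustrative.

```python
def annual_growth_rate(total_growth_percent, years):
    """Return the constant yearly rate that compounds to the total growth."""
    # Assumption: growing "by 650%" means ending at 1 + 6.5 = 7.5 times the start.
    growth_factor = 1 + total_growth_percent / 100
    return growth_factor ** (1 / years) - 1


rate = annual_growth_rate(650, 5)
print(f"Equivalent annual growth: {rate:.1%}")
```

Under these assumptions the projection corresponds to data volume growing by roughly half each year.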

Processing machine-generated data

Given the fairly static yet voluminous nature of machine-generated data, data owners rely on highly scalable tools to process and analyze the resulting datasets. Almost all machine-generated data starts out unstructured and is then derived into a common structure. Typically, these derived structures contain many data points (columns). With so many data points, the challenge lies mostly in analyzing the data. Given high performance requirements and large data sizes, traditional database indexing and partitioning limit the size and history of the dataset that can be processed. Columnar databases offer an alternative approach, because a particular analysis accesses only particular "columns" of the dataset.
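The two ideas in this section, deriving a common structure from raw records and reading only the columns an analysis needs, can be illustrated with a small sketch. The log lines below are invented samples in the widely used Common Log Format, and the names `parse_line` and `to_columns` are illustrative. Plain Python lists stand in for a columnar store here, so this shows only the access pattern, not how any particular database is implemented.

```python
import re
from collections import defaultdict

# Hypothetical web-log lines (Common Log Format), used only for illustration.
RAW_LINES = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 512',
    '192.0.2.1 - - [10/Oct/2023:13:56:02 +0000] "POST /api/login HTTP/1.1" 200 128',
]

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_line(line):
    """Derive a structured record from one raw log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = int(record["size"])
    return record


def to_columns(records):
    """Pivot row-oriented records into one list of values per column."""
    columns = defaultdict(list)
    for record in records:
        for name, value in record.items():
            columns[name].append(value)
    return dict(columns)


records = [r for r in (parse_line(line) for line in RAW_LINES) if r is not None]
columns = to_columns(records)

# Each analysis reads only the column it needs; the other columns are never touched.
total_bytes = sum(columns["size"])
error_count = sum(1 for status in columns["status"] if status >= 400)
print(total_bytes, error_count)
```

In a row-oriented layout, the same two aggregates would require reading every field of every record, which is the cost that columnar storage is meant to avoid on wide datasets.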

Examples of machine-generated data

  • Web logs
  • Call detail records
  • Financial instrument trades
  • Network event logs
  • SIEM logs
  • Telemetry collected by the government
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 