Avro (serialization system)
Avro is a remote procedure call and serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it can provide both a serialization format for persistent data and a wire format for communication between Hadoop nodes and from client programs to the Hadoop services.
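
To make the idea of a JSON-defined schema driving a compact binary format concrete, here is a minimal sketch in plain Python. It does not use the Avro library. It assumes the encoding rules published in the Avro specification: int and long values are written as zig-zag variable-length integers, a string is a length followed by UTF-8 bytes, and a record is its fields written in schema order. The User schema and all function names are hypothetical, and only the "string", "int" and "long" types are handled.

```python
import json

# Hypothetical example schema; the record and field names are illustrative only.
SCHEMA = json.loads("""
{"type": "record", "name": "User",
 "fields": [{"name": "name", "type": "string"},
            {"name": "age", "type": "int"}]}
""")


def encode_long(n):
    """Zig-zag map a signed integer to unsigned, then emit base-128 varint bytes."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s):
    """Write the UTF-8 byte length as a long, followed by the bytes themselves."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Assumption: only these three primitive types are supported in this sketch.
ENCODERS = {"int": encode_long, "long": encode_long, "string": encode_string}


def encode_record(schema, datum):
    """Concatenate the encoded fields in the order the schema declares them."""
    return b"".join(ENCODERS[f["type"]](datum[f["name"]]) for f in schema["fields"])


encoded = encode_record(SCHEMA, {"name": "Al", "age": 27})
print(encoded.hex())  # 04416c36
```

The four output bytes carry no field names or type tags; that information lives only in the schema, which is what keeps the binary form compact.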

It is similar to Thrift, but does not require running a code-generation program when a schema changes (unless desired for statically typed languages).
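
The point about code generation can be illustrated with a small sketch, again in plain Python rather than the Avro library. A single generic reader walks whatever schema it is handed at run time, so a changed schema only means supplying new JSON, not regenerating classes. The schemas, field names and function names below are hypothetical; the zig-zag varint and length-prefixed string layouts are taken from the Avro specification, and only "string", "int" and "long" are handled.

```python
import json


def read_long(buf, pos):
    """Read one zig-zag varint starting at pos; return (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:  # continuation bit clear: last byte of this number
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def read_string(buf, pos):
    """Read a length-prefixed UTF-8 string; return (text, next position)."""
    length, pos = read_long(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


# Assumption: only these three primitive types are supported in this sketch.
READERS = {"int": read_long, "long": read_long, "string": read_string}


def decode_record(schema_json, buf):
    """Decode one record by walking a schema that is parsed at run time."""
    schema = json.loads(schema_json)
    pos = 0
    datum = {}
    for field in schema["fields"]:
        datum[field["name"]], pos = READERS[field["type"]](buf, pos)
    return datum


# Two hypothetical versions of the same record; version 2 adds an "age" field.
SCHEMA_V1 = ('{"type": "record", "name": "User", '
             '"fields": [{"name": "name", "type": "string"}]}')
SCHEMA_V2 = ('{"type": "record", "name": "User", '
             '"fields": [{"name": "name", "type": "string"}, '
             '{"name": "age", "type": "int"}]}')

print(decode_record(SCHEMA_V1, bytes.fromhex("04416c")))    # {'name': 'Al'}
print(decode_record(SCHEMA_V2, bytes.fromhex("04416c36")))  # {'name': 'Al', 'age': 27}
```

Each payload here is decoded with the schema it was written with. The sketch does not attempt to reconcile data written under one schema with a reader expecting another.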

Languages with APIs

Though theoretically any language could use Avro, the following languages have already had APIs written for them:
  • Java
  • C#
  • C
  • C++
  • Python
  • Ruby

Avro IDL

In addition to supporting JSON for type and protocol definitions, Avro includes experimental support for an alternate interface description language (IDL) syntax known as Avro IDL. Previously known as GenAvro, this format is designed to ease adoption by users familiar with more traditional IDLs and programming languages, with a syntax similar to C/C++, Protocol Buffers and others.

See also

  • Apache Thrift
  • Google's Protocol Buffers
  • Cisco's Etch
  • ZeroC's ICE
  • Microsoft's "M"
  • MessagePack

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.