Au file format
The Au file format is a simple audio file format introduced by Sun Microsystems. The format was common on NeXT systems and on early Web pages. Originally it was headerless, being simply 8-bit µ-law-encoded data at an 8000 Hz sample rate. Hardware from other vendors often used sample rates as high as 8192 Hz, often integer factors of video clock signals. Newer files have a header that consists of six unsigned 32-bit words, an optional information chunk, and then the data (in big-endian format).
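
The difference between the two layouts can be illustrated with a minimal Python sketch. It assumes a reader distinguishes newer files by the ".snd" magic number listed in the header table, and that legacy headerless data is mono; the function names are illustrative and not part of any specification.

```python
# Sketch: telling a headered Au file from legacy headerless data.
# Function names are illustrative; the mono assumption is ours, not the source's.

AU_MAGIC = b".snd"            # 0x2e736e64, the first header word
LEGACY_RATE_HZ = 8000         # sample rate of the original headerless files
LEGACY_BYTES_PER_SAMPLE = 1   # 8-bit mu-law


def has_au_header(data: bytes) -> bool:
    """Return True if the data starts with the .snd magic number."""
    return data[:4] == AU_MAGIC


def legacy_duration_seconds(num_bytes: int) -> float:
    """Duration of headerless 8-bit mu-law data at 8000 Hz, assuming mono."""
    return num_bytes / (LEGACY_RATE_HZ * LEGACY_BYTES_PER_SAMPLE)
```

Under these assumptions, headerless data occupies 8000 bytes per second of audio, so a 16000-byte file would last two seconds.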

Although the format now supports many audio encoding formats, it remains associated with the µ-law logarithmic encoding. This encoding was native to the SPARCstation 1 hardware, where SunOS exposed the encoding to application programs through the /dev/audio interface. This encoding and interface became a de facto standard for Unix sound.

New format

All fields are stored in big-endian format, including the sample data.
Each row describes one unsigned 32-bit word of the header. Hexadecimal numbers are in C notation.

Word  Field         Description/Content
0     magic number  the value 0x2e736e64 (four ASCII characters ".snd")
1     data offset   the offset to the data in bytes. The minimum valid number is 24 (decimal), since this is the header length (six 32-bit words) with no space reserved for extra information.
2     data size     data size in bytes. If unknown, the value 0xffffffff should be used.
3     encoding      Data encoding format:
  • 1 = 8-bit G.711 µ-law
  • 2 = 8-bit linear PCM
  • 3 = 16-bit linear PCM
  • 4 = 24-bit linear PCM
  • 5 = 32-bit linear PCM
  • 6 = 32-bit IEEE floating point
  • 7 = 64-bit IEEE floating point
  • 8 = Fragmented sample data
  • 9 = DSP program
  • 10 = 8-bit fixed point
  • 11 = 16-bit fixed point
  • 12 = 24-bit fixed point
  • 13 = 32-bit fixed point
  • 18 = 16-bit linear with emphasis
  • 19 = 16-bit linear compressed
  • 20 = 16-bit linear with emphasis and compression
  • 21 = MusicKit DSP commands
  • 23 = 4-bit ISDN µ-law compressed using the ITU-T G.721 ADPCM voice data encoding scheme
  • 24 = ITU-T G.722 ADPCM
  • 25 = ITU-T G.723 3-bit ADPCM
  • 26 = ITU-T G.723 5-bit ADPCM
  • 27 = 8-bit G.711 A-law
4     sample rate   the number of samples/second, e.g., 8000
5     channels      the number of interleaved channels, e.g., 1 for mono, 2 for stereo; more channels are possible, but may not be supported by all readers.
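
The header layout above maps directly onto six big-endian unsigned 32-bit integers. The following Python sketch shows one way to build and parse such a header with the standard struct module; the function names, the returned dictionary keys, and the choice to report an unknown size as None are illustrative assumptions rather than part of the format.

```python
import struct

AU_MAGIC = 0x2E736E64        # ".snd"
HEADER_FORMAT = ">6I"        # six big-endian unsigned 32-bit words
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes
UNKNOWN_SIZE = 0xFFFFFFFF    # data size value used when the size is unknown


def build_au_header(encoding, sample_rate, channels,
                    data_size=UNKNOWN_SIZE, info=b""):
    """Pack a header, followed by an optional information chunk."""
    data_offset = HEADER_SIZE + len(info)
    header = struct.pack(HEADER_FORMAT, AU_MAGIC, data_offset, data_size,
                         encoding, sample_rate, channels)
    return header + info


def parse_au_header(data):
    """Unpack the six header words and apply basic sanity checks."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short for an Au header")
    magic, data_offset, data_size, encoding, sample_rate, channels = \
        struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    if magic != AU_MAGIC:
        raise ValueError("missing .snd magic number")
    if data_offset < HEADER_SIZE:
        raise ValueError("data offset is smaller than the 24-byte header")
    return {
        "data_offset": data_offset,
        # Report the 0xffffffff sentinel as None (our convention).
        "data_size": None if data_size == UNKNOWN_SIZE else data_size,
        "encoding": encoding,
        "sample_rate": sample_rate,
        "channels": channels,
    }
```

For example, `parse_au_header(build_au_header(1, 8000, 1))` describes a mono 8-bit µ-law stream at 8000 Hz whose sample data starts at byte 24.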


The type of encoding depends on the value of the "encoding" field (word 3 of the header). Formats 2 through 7 are uncompressed PCM, therefore lossless. Formats 23 through 26 are ADPCM, which is a lossy, roughly 4:1 compression. Formats 1 and 27 are µ-law and A-law, respectively, both lossy. Several of the others are DSP commands or data, designed to be processed by the NeXT MusicKit software.
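
To show what decoding format 1 involves, here is a minimal Python sketch of µ-law expansion. This article does not give the expansion itself; the sketch follows the widely published G.711 reference approach (inverted bits, a 3-bit exponent, a 4-bit mantissa, and a bias of 0x84), and the function names are illustrative.

```python
BIAS = 0x84  # bias used by the common G.711 mu-law reference code


def ulaw_to_linear(byte):
    """Expand one 8-bit mu-law code to a signed linear sample."""
    u = ~byte & 0xFF             # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07   # segment number
    mantissa = u & 0x0F          # step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data):
    """Expand a bytes object of mu-law codes to a list of samples."""
    return [ulaw_to_linear(b) for b in data]
```

The expanded samples fit in 16 bits. The mapping is logarithmic, so small amplitudes are represented more finely than large ones, and the rounding applied when encoding is what makes the format lossy.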

Note: PCM data appear to be encoded as signed, rather than unsigned.
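
If the note above holds, 16-bit linear PCM (format 3) can be read as big-endian signed integers. The Python sketch below works under that assumption and groups interleaved samples into one tuple per frame; the function name and the decision to drop a trailing partial frame are illustrative choices.

```python
import struct


def decode_pcm16(data, channels=1):
    """Decode big-endian signed 16-bit samples into per-frame tuples."""
    frame_size = 2 * channels
    usable = len(data) - (len(data) % frame_size)  # drop any partial frame
    count = usable // 2
    # ">" selects big-endian, "h" a signed 16-bit integer.
    samples = struct.unpack(">%dh" % count, data[:usable])
    return [tuple(samples[i:i + channels]) for i in range(0, count, channels)]
```

With two channels, each tuple holds the left and right samples of one frame, in the order in which they are interleaved in the file.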

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.