Real-time Transport Protocol
The Real-time Transport Protocol (RTP) defines a standardized packet format for delivering audio and video over IP networks. RTP is used extensively in communication and entertainment systems that involve streaming media, such as telephony, video teleconference applications, television services and web-based push-to-talk features.

RTP is used in conjunction with the RTP Control Protocol (RTCP). While RTP carries the media streams (e.g., audio and video), RTCP is used to monitor transmission statistics and quality of service (QoS) and aids synchronization of multiple streams. By convention, RTP is sent and received on an even port number and the associated RTCP communication uses the next higher odd port number.

RTP is one of the technical foundations of Voice over IP and in this context is often used in conjunction with a signaling protocol which assists in setting up connections across the network.

RTP was developed by the Audio-Video Transport Working Group of the Internet Engineering Task Force (IETF) and first published in 1996 as RFC 1889, superseded by RFC 3550 in 2003.

Overview

RTP was developed by the Audio/Video Transport working group of the IETF standards organization. RTP is used in conjunction with other protocols such as H.323 and RTSP. The RTP standard defines a pair of protocols, RTP and RTCP. RTP is used for transfer of multimedia data, and RTCP is used to periodically send control information and QoS parameters.

RTP is designed for end-to-end, real-time transfer of stream data. The protocol provides facilities for jitter compensation and for detecting out-of-sequence arrival of data, both of which are common during transmission on an IP network. RTP supports data transfer to multiple destinations through IP multicast. RTP is regarded as the primary standard for audio/video transport in IP networks and is used with an associated profile and payload format.
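
Jitter compensation depends on the receiver measuring how much the spacing between packets varies. As a rough sketch, RFC 3550 describes a running interarrival jitter estimate that smooths the change in transit time between consecutive packets with a gain of 1/16. The function name and the sample arrival times below are illustrative assumptions, and both times are expressed in RTP timestamp units.

```python
def update_jitter(jitter, prev_transit, arrival, rtp_timestamp):
    """Return (new_jitter, transit) after processing one packet.

    arrival and rtp_timestamp must use the same clock units.
    prev_transit is None for the first packet of a stream.
    """
    transit = arrival - rtp_timestamp
    if prev_transit is None:
        # Nothing to compare against yet.
        return jitter, transit
    d = abs(transit - prev_transit)
    # Move the estimate 1/16 of the way toward the latest deviation.
    jitter += (d - jitter) / 16.0
    return jitter, transit


# Hypothetical (arrival, timestamp) pairs for four 20 ms audio packets
# at an 8 kHz clock (160 ticks apart); the third one arrives 16 ticks late.
packets = [(1000, 0), (1160, 160), (1336, 320), (1480, 480)]

jitter, transit = 0.0, None
for arrival, ts in packets:
    jitter, transit = update_jitter(jitter, transit, arrival, ts)
```

A receiver could use such an estimate to size its playout buffer; how it does so is left to the application.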

Real-time multimedia streaming applications require timely delivery of information and can tolerate some packet loss to achieve this goal. For example, loss of a packet in an audio application may result in the loss of a fraction of a second of audio data, which can be made unnoticeable with suitable error concealment algorithms. The Transmission Control Protocol (TCP), although standardized for RTP use, is not normally used in RTP applications because TCP favors reliability over timeliness. Instead, the majority of RTP implementations are built on the User Datagram Protocol (UDP). Other transport protocols specifically designed for multimedia sessions are SCTP and DCCP, although they are not in widespread use.

Protocol components

The RTP specification describes two sub-protocols, RTP and RTCP, which are typically deployed alongside optional supporting protocols:
  • The data transfer protocol, RTP, which deals with the transfer of real-time data. Information provided by this protocol includes timestamps (for synchronization), sequence numbers (for packet loss and reordering detection) and the payload format, which indicates the encoded format of the data.
  • A control protocol, RTCP, used to convey quality of service (QoS) feedback and to synchronize the media streams. The bandwidth of RTCP traffic compared to RTP is small, typically around 5%.
  • An optional signaling protocol such as H.323 or Session Initiation Protocol (SIP).
  • An optional media description protocol such as Session Description Protocol.


Sessions

An RTP Session is established for each multimedia stream. A session consists of an IP address with a pair of ports for RTP and RTCP. For example, audio and video streams will have separate RTP sessions, enabling a receiver to deselect a particular stream. The ports which form a session are negotiated using other protocols such as RTSP (using SDP in the setup method) and SIP. According to the specification, an RTP port should be even and the RTCP port is the next higher odd port number. RTP and RTCP typically use unprivileged UDP ports (1024 to 65535), but may use other transport protocols (most notably, SCTP and DCCP) as well, as the protocol design is transport independent.
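
The port pairing rule can be sketched in a few lines. The helper name below is hypothetical, and 5004 is only an example value; in practice the ports come from the negotiation described above.

```python
def rtcp_port_for(rtp_port):
    """Return the RTCP port paired with an RTP port.

    Follows the convention that RTP uses an even unprivileged port
    and RTCP uses the next higher odd port.
    """
    if not 1024 <= rtp_port <= 65534:
        raise ValueError("RTP port must leave room for RTCP on the next port")
    if rtp_port % 2 != 0:
        raise ValueError("RTP port should be even")
    return rtp_port + 1


rtp_port = 5004                      # example value only
rtcp_port = rtcp_port_for(rtp_port)  # 5005
```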

Profiles and Payload formats

One of the design considerations of RTP was to support a range of multimedia formats (such as H.264, MPEG-4, MJPEG and MPEG) and to allow new formats to be added without revising the RTP standard. The design of RTP is based on the architectural principle known as application level framing (ALF). Information that only a specific application needs is not included in the generic RTP header, but is instead provided through RTP profiles and payload formats.
For each class of application (e.g., audio, video), RTP defines a profile and one or more associated payload formats. A complete specification of RTP for a particular application usage will require a profile and payload format specification(s).

The profile defines the codecs used to encode the payload data and their mapping to payload format codes in the Payload Type (PT) field of the RTP header (see below). Each profile is accompanied by several payload format specifications, each of which describes the transport of a particular kind of encoded data. Audio payload formats include G.711, G.723, G.726, G.729, GSM, QCELP, MP3 and DTMF, and video payload formats include H.261, H.263, H.264 and MPEG-4.

Examples of RTP Profiles include:
  • The RTP profile for Audio and video conferences with minimal control (RFC 3551) defines a set of static payload type assignments, and a mechanism for mapping between a payload format and a payload type identifier (in the header) using Session Description Protocol (SDP).
  • The Secure Real-time Transport Protocol (SRTP) (RFC 3711) defines a profile of RTP that provides cryptographic services for the transfer of payload data.
  • The experimental Control Data Profile for RTP (RTP/CDP) for machine-to-machine communications.
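
As an illustration of how a profile maps payload type codes to formats, the sketch below lists a handful of the static assignments from RFC 3551 together with their clock rates, and falls back to a session-specific table for the dynamic range 96 to 127. The table is deliberately incomplete, and the function name and the dynamic H.264 entry are assumptions standing in for whatever a real session would negotiate through SDP.

```python
# A few static payload types from RFC 3551: code -> (encoding, clock rate in Hz).
STATIC_PAYLOAD_TYPES = {
    0: ("PCMU", 8000),    # G.711 mu-law
    3: ("GSM", 8000),
    4: ("G723", 8000),
    8: ("PCMA", 8000),    # G.711 A-law
    18: ("G729", 8000),
    31: ("H261", 90000),
    34: ("H263", 90000),
}


def describe_payload_type(pt, dynamic_map=None):
    """Return (encoding, clock_rate) for a 7-bit payload type, or None."""
    if not 0 <= pt <= 127:
        raise ValueError("payload type is a 7-bit field")
    if pt in STATIC_PAYLOAD_TYPES:
        return STATIC_PAYLOAD_TYPES[pt]
    if 96 <= pt <= 127 and dynamic_map:
        # Dynamic types only have meaning within one negotiated session.
        return dynamic_map.get(pt)
    return None


# Hypothetical result of a session negotiation that bound type 96 to H.264.
session_map = {96: ("H264", 90000)}
audio = describe_payload_type(0)
video = describe_payload_type(96, session_map)
```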

Packet header

RTP packet header
Bit offset | 0-1     | 2 | 3 | 4-7 | 8 | 9-15 | 16-31
0          | Version | P | X | CC  | M | PT   | Sequence Number
32         | Timestamp
64         | SSRC identifier
96         | CSRC identifiers
           | ...
96+32×CC   | Profile-specific extension header ID (16 bits) | Extension header length (16 bits)
128+32×CC  | Extension header
           | ...


The RTP header has a minimum size of 12 bytes. After the header, optional header extensions may be present. This is followed by the RTP payload, the format of which is determined by the particular class of application. The fields in the header are as follows:
  • Version: (2 bits) Indicates the version of the protocol. Current version is 2.
  • P (Padding): (1 bit) Used to indicate if there are extra padding bytes at the end of the RTP packet. Padding might be used to fill up a block of a certain size, for example as required by an encryption algorithm.
  • X (Extension): (1 bit) Indicates presence of an Extension header between standard header and payload data. This is application or profile specific.
  • CC (CSRC Count): (4 bits) Contains the number of CSRC identifiers (defined below) that follow the fixed header.
  • M (Marker): (1 bit) Used at the application level and defined by a profile. If it is set, it means that the current data has some special relevance for the application.
  • PT (Payload Type): (7 bits) Indicates the format of the payload and determines its interpretation by the application. This is specified by an RTP profile. For example, see RTP Profile for audio and video conferences with minimal control (RFC 3551).
  • Sequence Number: (16 bits) The sequence number is incremented by one for each RTP data packet sent and is used by the receiver to detect packet loss and to restore packet sequence. RTP does not specify any action on packet loss; it is left to the application to take appropriate action. For example, video applications may play the last known frame in place of the missing frame. According to RFC 3550, the initial value of the sequence number should be random to make known-plaintext attacks on encryption more difficult. RTP provides no guarantee of delivery, but the presence of sequence numbers makes it possible to detect missing packets.
  • Timestamp: (32 bits) Used to enable the receiver to play back the received samples at appropriate intervals. When several media streams are present, the timestamps are independent in each stream, and may not be relied upon for media synchronization. The granularity of the timing is application specific. For example, an audio application that samples data once every 125 µs (8 kHz, a common sample rate in digital telephony) could use that value as its clock resolution. The clock granularity is one of the details that is specified in the RTP profile for an application.
  • SSRC: (32 bits) Synchronization source identifier uniquely identifies the source of a stream. The synchronization sources within the same RTP session will be unique.
  • CSRC: (32 bits each) Contributing source IDs enumerate contributing sources to a stream which has been generated from multiple sources. The number of entries is given by the CC field.
  • Extension header: (optional) The first 32-bit word contains a profile-specific identifier (16 bits) and a length specifier (16 bits) that indicates the length of the extension (EHL=extension header length) in 32-bit units, excluding the 32 bits of the extension header.
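
The field layout above can be made concrete with a small parser. This is a minimal sketch using Python's standard struct module, not a complete implementation: the function name and dictionary keys are assumptions, padding is not stripped, and the sample packet is fabricated for illustration.

```python
import struct


def parse_rtp_header(packet):
    """Decode the RTP header fields and locate the start of the payload."""
    if len(packet) < 12:
        raise ValueError("an RTP header is at least 12 bytes")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    offset = 12 + 4 * cc
    if len(packet) < offset:
        raise ValueError("packet too short for its CSRC list")
    csrcs = [struct.unpack_from("!I", packet, 12 + 4 * i)[0] for i in range(cc)]
    header = {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": cc,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": csrcs,
    }
    if header["extension"]:
        if len(packet) < offset + 4:
            raise ValueError("packet too short for its extension header")
        ext_id, ext_len = struct.unpack_from("!HH", packet, offset)
        header["extension_id"] = ext_id
        # The length counts 32-bit words after the 4-byte extension header.
        offset += 4 + 4 * ext_len
    header["payload_offset"] = offset
    return header


# Fabricated packet: version 2, payload type 0, sequence number 1,
# timestamp 160, followed by 160 bytes of dummy payload.
sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x12345678) + b"\x00" * 160
header = parse_rtp_header(sample)
```

With an 8 kHz clock, a timestamp of 160 corresponds to 20 ms of audio, since each tick is 125 µs.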

RTP-based systems

A complete network-based system will include other protocols and standards in conjunction with RTP. Protocols like SIP, RTSP, H.225 and H.245 are used for session initiation, control and termination. Other standards like H.264, MPEG and H.263 are used to encode the payload data as specified via the RTP Profile.

An RTP sender captures the multimedia data, which is then encoded, framed and transmitted as RTP packets with appropriate timestamps and increasing sequence numbers. The Payload Type field is set according to the RTP Profile in use. The RTP receiver captures the RTP packets, detects missing packets and may reorder packets. The frames are decoded according to the payload format and presented to the end user.
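
A receiver's loss detection can be sketched as follows. Because the sequence number is only 16 bits wide, it wraps from 65535 back to 0, so comparisons have to be made modulo 65536. The function names are assumptions, and this simplified version ignores packets that arrive from before the first one seen.

```python
def sequence_gap(prev_seq, seq):
    """Signed distance from prev_seq to seq, modulo 2**16."""
    delta = (seq - prev_seq) & 0xFFFF
    # Distances of half the number space or more are treated as backwards.
    return delta - 0x10000 if delta >= 0x8000 else delta


def count_lost(seqs):
    """Count packets missing from a list of received sequence numbers."""
    highest_seq = seqs[0]
    highest_ext = 0          # position of the highest packet relative to the first
    seen = {0}
    for seq in seqs[1:]:
        gap = sequence_gap(highest_seq, seq)
        ext = highest_ext + gap
        if ext >= 0:
            seen.add(ext)    # the set also absorbs duplicates
        if gap > 0:
            highest_seq, highest_ext = seq, ext
    expected = highest_ext + 1
    return expected - len(seen)


# Sequence wraps past 65535, one packet arrives late, and number 2 never arrives.
received = [65534, 65535, 1, 0, 3]
lost = count_lost(received)
```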

RFC references

  • RFC 3550, Standard 64, RTP: A Transport Protocol for Real-Time Applications
  • RFC 3551, Standard 65, RTP Profile for Audio and Video Conferences with Minimal Control
  • RFC 6184, Proposed Standard, RTP Payload Format for H.264 Video
  • RFC 3984, Obsolete, RTP Payload Format for H.264 Video
  • RFC 4103, RTP Payload Format for Text Conversation
  • RFC 3640, RTP Payload Format for Transport of MPEG-4 Elementary Streams
  • RFC 3016, RTP Payload Format for MPEG-4 Audio/Visual Streams
  • RFC 2250, Proposed Standard, RTP Payload Format for MPEG1/MPEG2 Video

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 