Supported Protocols

HLS

The HLS (HTTP Live Streaming) protocol, developed by Apple, allows Live and VoD content to be delivered over HTTP. It is currently standardized as RFC 8216.

The protocol is supported by a large number of devices: set-top boxes (STB), Smart TVs, various browsers (some with built-in support, some via JavaScript players), etc. The reference implementations, however, are Apple's own products: QuickTime Player, the Safari web browser, AVPlayer (part of the AVFoundation framework), and others.

The basic principle of HLS is to segment the stream into small fragments (chunks) that are subsequently downloaded over HTTP. Using HTTP allows the stream to pass through firewalls and proxy servers (as opposed to UDP-based protocols such as RTSP). Standard HTTP caches and HTTP-based content delivery networks can be used with HLS without any software specialized for video delivery.

At the beginning of the session, the device requests a master playlist containing metadata about the available streams (the so-called “variants”) and their characteristics: track type (audio, video, subtitles), bitrate, resolution, language, etc. After retrieving the list of available variants from the master playlist, the client device selects the appropriate streams based on bandwidth and user preferences and requests a media playlist for each stream, which contains the media URLs.

When the bandwidth changes or the user chooses another playback quality setting, the player can select another stream and, if possible, switch to it smoothly. As a rule, smooth switching requires that the key frames of the video tracks are synchronized and that chunks correspond to each other across all variants.

All playlists use the M3U8 format.

The master playlist can contain:

  • list of content variants depending on the bitrate (tag #EXT-X-STREAM-INF);
  • list of alternative tracks: audio, video, subtitles, Closed Captions (tag #EXT-X-MEDIA);
  • list of playlists for fast playback (the so-called “trickplay”, tag #EXT-X-I-FRAME-STREAM-INF), and other information.

The simplest master playlist for content with one “variant” may look like this (the URL and attribute values here are illustrative):
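#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720,CODECS="avc1.64001F,mp4a.40.2"
playlist.m3u8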

The simplest media playlist (playlist.m3u8) for a Live broadcast may look like this (chunk names and durations are illustrative):
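#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:100
#EXTINF:6.000,
chunk_100.ts
#EXTINF:6.000,
chunk_101.ts
#EXTINF:6.000,
chunk_102.ts

Note that for Live content the playlist has no #EXT-X-ENDLIST tag, so the player periodically re-requests it to discover new chunks.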

Support for multiple audio and subtitle languages in HLS can be implemented in two ways:

  • In accordance with RFC 8216: for each language, there is an individual set of chunks and an individual playlist, specified in the tag #EXT-X-MEDIA. Alternative tracks are thus downloaded only when they are needed. As a rule, in this case the “main” data chunks contain only video and no other tracks, i.e. no audio or subtitles (for the M4F container this is the only possible option).
  • In accordance with the MPEG2-TS standard: within a single TS chunk. Each chunk contains the video track and the full set of audio and subtitle tracks. This method is not supported by all devices and is less preferable, because the subscriber device downloads all tracks even when they are not needed.

A master playlist with multiple audio tracks may look like this (group names, languages, and URLs are illustrative):
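#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English",LANGUAGE="en",DEFAULT=YES,URI="audio_en.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="Français",LANGUAGE="fr",DEFAULT=NO,URI="audio_fr.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720,CODECS="avc1.64001F,mp4a.40.2",AUDIO="aud"
video.m3u8

The AUDIO attribute of #EXT-X-STREAM-INF references the GROUP-ID of the alternative audio renditions, so the player downloads only the selected language's chunks alongside the video.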

Until 2016, only the MPEG2-TS container was supported on Apple devices; in 2016 (iOS 10 and later, macOS 10.12 and later, tvOS 10 and later), support for the M4F container was added. The protocol does not impose limitations on codecs. The H.265 codec has been supported since 2017 on iOS 11 devices (with the M4F container only).

MPEG-DASH

The MPEG-DASH (Dynamic Adaptive Streaming over HTTP) protocol, just like HLS, is based on HTTP and on segmenting a continuous stream into chunks. Unlike HLS, it was developed by the MPEG group and standardized as an ISO standard in 2012. The current version of the standard is ISO/IEC 23009-1:2014. In addition to the main standard, there are several sets of recommendations, the best known being the DASH-IF Interoperability Points (DASH-IF-IOP). At the time of writing, the current version of DASH-IF-IOP is v4.0.

All information about media data (the list of chunks, their duration, events in the stream, etc.) is contained in an XML file called Media Presentation Descriptor (MPD).

The MPD has a multi-level nested structure, a simplified representation of which is shown below:

[Figure: MPD structure: MPD → Period → AdaptationSet → Representation]
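As a rough sketch (the element nesting follows the standard; attribute values here are illustrative), an MPD looks like this:

<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static" mediaPresentationDuration="PT600S">
    <Period id="1">
        <AdaptationSet contentType="video">
            <Representation id="video_2500" bandwidth="2500000" width="1280" height="720"/>
            <Representation id="video_5000" bandwidth="5000000" width="1920" height="1080"/>
        </AdaptationSet>
        <AdaptationSet contentType="audio" lang="en">
            <Representation id="audio_en" bandwidth="128000"/>
        </AdaptationSet>
    </Period>
</MPD>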

The MPEG-DASH standard involves dividing a continuous stream into periods. In the simplest case there is only one period, but in some cases (considered below) there are several.

Each period includes one or more so-called AdaptationSets. An AdaptationSet, in turn, contains one or more Representations.

In fact, an AdaptationSet is a “logical” stream: one video track, one audio track in a specific language, one subtitle track, etc. If the content contains several audio tracks (in different languages, for example), there will be several AdaptationSets. If there are several video tracks (for example, from different viewing angles or with different localizations), there will also be several AdaptationSets.

The stream that is actually delivered to the client is a Representation. For example, if the video stream is encoded at several bitrates and the player can automatically switch between them, these are several Representations within the same AdaptationSet.

The composition of AdaptationSets within a period, and the composition of Representations within an AdaptationSet of one period, cannot change. If such changes are necessary, the media server must generate a new period with a new set of AdaptationSets and Representations. In this case, the identifiers of Representations that carry the same stream must be preserved.
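For example (a sketch; period ids, durations, and other values are illustrative), a stream split into two periods could preserve the Representation identifier like this:

<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">
    <Period id="p1" duration="PT300S">
        <AdaptationSet contentType="video">
            <Representation id="video_main" bandwidth="2500000"/>
        </AdaptationSet>
    </Period>
    <Period id="p2">
        <AdaptationSet contentType="video">
            <!-- same id as in p1: the player treats it as a continuation of the same stream -->
            <Representation id="video_main" bandwidth="2500000"/>
        </AdaptationSet>
    </Period>
</MPD>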

A Representation contains information about the list of media chunks. Unlike HLS, where each chunk must be explicitly listed in a playlist, MPEG-DASH offers three options for describing and enumerating chunks in the MPD:

  • SegmentBase. Suitable only for VoD content. The list of chunks is contained in the sidx atom of the MP4 container; only the location of the sidx atom is described in the MPD. If the content is encoded in the MP4 container, the sidx atom is usually placed in the same file, and its position is specified as a byte range in the indexRange attribute:

<Representation id="rep1" width="1280" height="720" mimeType="video/mp4" codecs="avc1.64001F" bandwidth="1743409">
    <BaseURL>vid_bw2500000.mp4</BaseURL>
    <SegmentBase timescale="10000000" indexRange="685-2192">
        <Initialization range="0-684"/>
    </SegmentBase>
</Representation>

In this example, the so-called initialization data (the moov atom of the MP4 container) occupies bytes 0–684, and the sidx atom occupies bytes 685–2192.

If the content is encoded in the MPEG2-TS container, the sidx atom may be placed in a separate file:

<Representation id="rep1" bandwidth="5000000" width="1920" height="1080">
    <BaseURL>content.ts</BaseURL>
    <SegmentBase>
        <RepresentationIndex sourceURL="content.sidx"/>
    </SegmentBase>
</Representation>

  • SegmentList. The URL of each chunk is specified explicitly. The chunks must either all have the same duration (in which case it is described in the duration attribute of the SegmentList tag) or have their durations described in the SegmentTimeline tag, as in the example below:

<Representation id="0" width="1920" height="1080" mimeType="video/mp4" codecs="avc1.640028" bandwidth="4500000">
    <SegmentList timescale="10000000">
        <Initialization sourceURL="vid_bw4500000_.mp4"/>
        <SegmentTimeline>
            <S d="59999904" r="99"/>
        </SegmentTimeline>
        <SegmentURL media="vid_bw4500000_1.mp4"/>
        <SegmentURL media="vid_bw4500000_2.mp4"/>
        <SegmentURL media="vid_bw4500000_3.mp4"/>
        ...
        <SegmentURL media="vid_bw4500000_100.mp4"/>
    </SegmentList>
</Representation>

  • SegmentTemplate. The URL of each chunk is generated from a template containing:

    • Representation's identifier — $RepresentationID$
    • Representation's bitrate — $Bandwidth$
    • chunk's sequence number (identifier) — $Number$
    • chunk's start time — $Time$

The chunk durations must either all be the same or be described in the SegmentTimeline tag. If all chunks have the same duration, an “endless” playlist that describes the Live stream and does not require updating can be created.

Example of a Representation with SegmentTimeline:

<Representation width="1920" height="1080" mimeType="video/mp4" codecs="avc1.640028" id="0" bandwidth="3141752">
    <SegmentTemplate timescale="10000000" startNumber="1" media="vid_bw4500000_$Number$.mp4" initialization="vid_bw4500000_.mp4">
        <SegmentTimeline>
            <S d="59999904" r="99"/>
        </SegmentTimeline>
    </SegmentTemplate>
</Representation>

Example of a Representation for an “endless” playlist:

<Representation width="1920" height="1080" mimeType="video/mp4" codecs="avc1.640028" id="0" bandwidth="3141752">
    <SegmentTemplate duration="59999904" timescale="10000000" startNumber="1" media="vid_bw4500000_$Number$.mp4" initialization="vid_bw4500000_.mp4"/>
</Representation>

The MPEG-DASH standard allows two container types: ISO BMFF and MPEG2-TS. It does not depend on the audio/video codecs used and supports various content protection schemes. De facto, however, MPEG-DASH implies the use of ISO BMFF as the container and Common Encryption (CENC) for content protection, which allows a single encrypted copy of the content to be broadcast to clients supporting different DRM systems.
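A sketch of such signaling in the MPD might look as follows; the Widevine and PlayReady scheme URIs are their publicly registered system UUIDs, while the rest of the markup is illustrative:

<AdaptationSet contentType="video">
    <!-- signals that the chunks are encrypted with Common Encryption (cenc scheme) -->
    <ContentProtection schemeIdUri="urn:mpeg:dash:mp4protection:2011" value="cenc"/>
    <!-- Widevine -->
    <ContentProtection schemeIdUri="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"/>
    <!-- PlayReady -->
    <ContentProtection schemeIdUri="urn:uuid:9a04f079-9840-4286-ab92-e65be0885f95"/>
</AdaptationSet>

Each client picks the ContentProtection descriptor of the DRM system it supports, while all clients download the same encrypted chunks.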

Smooth Streaming

The Smooth Streaming protocol allows Live and VoD content to be broadcast, similar to the HLS and DASH protocols. Client devices supporting Smooth Streaming include various Silverlight players, mobile devices running Windows Phone/Windows 10 Mobile, and other devices.

Similar to the HLS and MPEG-DASH protocols, Smooth Streaming uses a fragmented media stream structure with HTTP delivery. The content is segmented and packaged by the SmartMEDIA server itself, eliminating the need to install the IIS web server.

Smooth Streaming is the predecessor of the MPEG-DASH protocol and has fewer capabilities (for example, periods are not supported, and the chunk durations of all tracks must match).
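For reference, a Smooth Streaming client manifest is an XML document with a structure roughly like this (a sketch; attribute values and CodecPrivateData are illustrative):

<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" TimeScale="10000000" Duration="6000000000">
    <StreamIndex Type="video" Chunks="100" QualityLevels="2"
                 Url="QualityLevels({bitrate})/Fragments(video={start time})">
        <QualityLevel Index="0" Bitrate="2500000" FourCC="H264" MaxWidth="1280" MaxHeight="720" CodecPrivateData="..."/>
        <QualityLevel Index="1" Bitrate="4500000" FourCC="H264" MaxWidth="1920" MaxHeight="1080" CodecPrivateData="..."/>
        <!-- chunk durations in TimeScale units (here 100 ns); the same timeline applies to every quality level -->
        <c t="0" d="60000000"/>
        <c d="60000000"/>
    </StreamIndex>
</SmoothStreamingMedia>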

RTSP

The SmartMEDIA server also supports VoD and Live content streaming to devices that support the Real Time Streaming Protocol (RTSP). These can be software players and hardware devices, including the QuickTime player (version 10 and later), the VideoLAN VLC player, various set-top boxes, and other devices.

The RTSP protocol is now considered obsolete, and the number of RTSP-enabled subscriber devices that do not support HTTP-based delivery (HLS, MPEG-DASH, Smooth Streaming) is steadily decreasing. However, SmartMEDIA continues to support such devices, allowing a wider range of devices to be covered.
