GStreamer

From LinuxTVWiki

GStreamer is a toolkit for building audio- and video-processing pipelines. A pipeline might stream video from a file to a network, add an echo to a recording, or (most interesting to us) capture the output of a Video4Linux device. GStreamer is most often used to power graphical applications such as Totem, but can also be used directly from the command line. This page explains how GStreamer is better than the alternatives, and how to build an encoder using its command-line interface.

Before reading this page, see V4L capturing to set your system up and create an initial recording. This page assumes you have already implemented the simple pipeline described there.

Introduction to GStreamer

No two use cases for encoding are quite alike. What's your preferred workflow? Is your processor fast enough to encode high quality video in real-time? Do you have enough disk space to store the raw video then process it after the fact? Do you want to play your video in DVD players, or is it enough that it works in your version of VLC? How will you work around your system's obscure quirks?

Use GStreamer if you want the best video quality possible with your hardware, and don't mind spending a weekend browsing the Internet for information.

Avoid GStreamer if you just want something quick-and-dirty, or can't stand programs with bad documentation and unhelpful error messages.

Why is GStreamer better at encoding?

GStreamer isn't as easy to use as mplayer, and doesn't have as advanced editing functionality as ffmpeg. But it has superior support for synchronising audio and video in disturbed sources such as VHS tapes. If you specify your input is (say) 25 frames per second video and 48,000Hz audio, most tools will synchronise audio and video simply by writing 1 video frame, 1,920 audio frames, 1 video frame and so on. There are at least three ways this can cause errors:

  • initialisation timing: audio and video desynchronised by a certain amount from the first frame, usually caused by audio and video devices taking different amounts of time to initialise. For example, the first audio frame might be delivered to GStreamer 0.01 seconds after it was requested, but the first video frame might not be delivered until 0.7 seconds after it was requested, causing all video to be about 0.7 seconds behind the audio
    • mencoder's -delay option solves this by delaying the audio
  • failure to encode: frames that desynchronise gradually over time, usually caused by audio and video shifting relative to each other when frames are dropped. For example if your CPU is not fast enough and sometimes drops a video frame, after 25 dropped frames the video will be one second ahead of the audio
    • mencoder's -harddup option solves this by duplicating other frames to fill in the gaps
  • source frame rate: frames that aren't delivered at the advertised rate, usually caused by inaccurate clocks in the source hardware. For example, a low-cost webcam that advertises 25 FPS video and 48kHz audio might actually deliver 25.01 video frames and 47,999 audio frames per second, causing your audio and video to drift apart by a second or so per hour
    • video tapes are especially problematic here - if you've ever seen a VCR struggle during those few seconds between two recordings on a tape, you've seen them adjusting the tape speed to accurately track the source. Frame counts can vary enough during these periods to instantly desynchronise audio and video
    • mencoder has no solution for this problem
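The arithmetic behind the frame-counting approach and the third problem can be checked with a few lines of shell. This is only a sketch: it reuses the illustrative figures from the text (25.01 FPS and 47,999 Hz are examples, not measurements) and assumes a tool that plays back every frame and sample at the nominal rate.

```shell
#!/bin/sh
# Nominal rates the capture tool was told to expect
nominal_fps=25
nominal_rate=48000

# Rates the hardware actually delivers (example figures from the text);
# the frame rate is stored in hundredths to stay in integer arithmetic
actual_fps_x100=2501
actual_rate=47999

# Audio samples written per video frame by a naive interleaver
samples_per_frame=$(( nominal_rate / nominal_fps ))

# What arrives during one hour of real time
frames=$(( actual_fps_x100 * 3600 / 100 ))
samples=$(( actual_rate * 3600 ))

# How long that material lasts when played back at the nominal rates
video_ms=$(( frames * 1000 / nominal_fps ))
audio_ms=$(( samples * 1000 / nominal_rate ))
drift_ms=$(( video_ms - audio_ms ))

echo "audio samples per video frame: $samples_per_frame"
echo "drift after one hour: $drift_ms ms"
```

With these figures the two streams end up roughly one and a half seconds apart after an hour of capturing.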

GStreamer solves these problems by attaching a timestamp to each incoming frame based on the time GStreamer receives the frame. It can then mux the sources back together accurately using these timestamps, either by using a format that supports variable framerates or by duplicating frames to fill in the blanks:

  1. If you choose a container format that supports timestamps (e.g. Matroska), timestamps are automatically written to the file and used to vary the playback speed
  2. If you choose a container format that does not support timestamps (e.g. AVI), you must duplicate other frames to fill in the gaps by adding the videorate and audiorate plugins to the end of the relevant pipelines
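The choice between the two cases can be made mechanically when a script assembles a pipeline. The helper below is a hypothetical sketch (rate_stage is not part of GStreamer): it only knows that Matroska stores timestamps, and treats every other container like AVI.

```shell
#!/bin/sh
# Hypothetical helper: print the rate-correcting stage for a stream type
# ("video" or "audio") unless the output container stores timestamps.
rate_stage() {
    stream=$1
    outfile=$2
    case "$outfile" in
        *.mkv)
            # Matroska stores timestamps, so nothing extra is needed
            ;;
        *)
            # AVI and anything unrecognised: force a constant rate
            echo "! ${stream}rate"
            ;;
    esac
}

echo "video stage for capture.avi: $(rate_stage video capture.avi)"
echo "video stage for capture.mkv: $(rate_stage video capture.mkv)"
```

Defaulting to the rate elements for unknown containers is a deliberately cautious choice; whether a given format really needs them is something to check for your own setup.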

Getting GStreamer

GStreamer, its most common plugins and tools are available through your distribution's package manager. Most Linux distributions include both the legacy 0.10 and modern 1.0 release series. Each series has bugs that stop it from working on some hardware; this page focuses mostly on the modern 1.0 series. Converting from 0.10 to 1.0 is mostly just search-and-replace work (e.g. changing instances of ff to av because of the switch from ffmpeg to libav). See the porting guide for more.
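As a rough illustration of that search-and-replace work, the sketch below rewrites a 0.10 command line for 1.0. The function name port_to_1_0 is made up for this example, and it only covers the renames it lists (the ff to av element prefixes, ffmpegcolorspace to videoconvert, and the tool version suffix). Caps strings and element properties still need manual review, and each renamed element should be checked with gst-inspect-1.0.

```shell
#!/bin/sh
# Rough first pass at porting a 0.10 command line to 1.0.
# Reads the old command on stdin and prints the converted one.
port_to_1_0() {
    sed -E \
        -e 's/(^|[^[:alnum:]_])ff(enc|dec|mux)_/\1av\2_/g' \
        -e 's/ffmpegcolorspace/videoconvert/g' \
        -e 's/(gst-launch|gst-inspect)-0\.10/\1-1.0/g'
}

echo 'gst-launch-0.10 v4l2src ! ffmpegcolorspace ! ffenc_mpeg2video ! ffmux_mpeg name=mux' \
    | port_to_1_0
```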

Other plugins are also available, such as GEntrans (used in some examples below). Google might help you find packages for your distribution, otherwise you'll need to download and compile them yourself.

Using GStreamer with gst-launch-1.0

gst-launch is the standard command-line interface to GStreamer. Here's the simplest pipeline you can build:

gst-launch-1.0 fakesrc ! fakesink

This connects a single (fake) source to a single (fake) sink using the 1.0 series of GStreamer:

Very simple pipeline

GStreamer can build all kinds of pipelines, but you probably want to build one that looks something like this:

Idealised pipeline example

To get a list of elements that can go in a GStreamer pipeline, do:

gst-inspect-1.0 | less

Pass an element name to gst-inspect-1.0 for detailed information. For example:

gst-inspect-1.0 fakesrc
gst-inspect-1.0 fakesink
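Before building a longer pipeline it can save time to check that every element is installed. The sketch below relies on the --exists option of gst-inspect-1.0; the function name and the list of elements are only examples, and the INSPECT variable is an assumption that lets you point the check at a differently named binary.

```shell
#!/bin/sh
# Name of the inspection tool; override INSPECT if yours is installed differently
INSPECT=${INSPECT:-gst-inspect-1.0}

# Print every element from the argument list that is not installed
missing_elements() {
    for element in "$@"; do
        "$INSPECT" --exists "$element" >/dev/null 2>&1 || echo "$element"
    done
}

if command -v "$INSPECT" >/dev/null 2>&1; then
    missing=$(missing_elements v4l2src alsasrc videoconvert avimux filesink)
    if [ -n "$missing" ]; then
        echo "Missing elements:"
        echo "$missing"
    else
        echo "All elements found"
    fi
else
    echo "$INSPECT is not installed"
fi
```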

The images above are based on graphs created by GStreamer itself. Install Graphviz to build graphs of your pipelines. For faster viewing of those graphs, you can also install xdot:

mkdir gst-visualisations
GST_DEBUG_DUMP_DOT_DIR=gst-visualisations gst-launch-1.0 fakesrc ! fakesink
xdot gst-visualisations/*-gst-launch.*_READY.dot

You can also compile those graphs to PNG, SVG or other supported formats:

 dot -Tpng gst-visualisations/*-gst-launch.*_READY.dot > my-pipeline.png

To get graphs of the example pipelines below, prepend GST_DEBUG_DUMP_DOT_DIR=gst-visualisations to the gst-launch-1.0 command. Then run this command to view the graph of GStreamer's most interesting stage, the transition from PAUSED to PLAYING:

xdot gst-visualisations/*-gst-launch.PAUSED_PLAYING.dot

Remember to empty the gst-visualisations directory between runs.
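The clean-up and the lookup of the right graph can be wrapped in two small functions. This is a sketch with made-up helper names; it assumes graph files are named after the state transition they were dumped on (for example PAUSED_PLAYING), and it only deletes .dot files.

```shell
#!/bin/sh
# Hypothetical helpers around the graph directory used above
graph_dir=${graph_dir:-gst-visualisations}

# Create the directory if needed and delete graphs left by earlier runs
reset_graph_dir() {
    mkdir -p "$1"
    rm -f "$1"/*.dot
}

# Print the graph dumped on the PAUSED to PLAYING transition, if there is one
playing_graph() {
    for graph in "$1"/*-gst-launch.PAUSED_PLAYING.dot; do
        if [ -e "$graph" ]; then
            echo "$graph"
            return 0
        fi
    done
    return 1
}

reset_graph_dir "$graph_dir"
# Run your pipeline here with GST_DEBUG_DUMP_DOT_DIR="$graph_dir", then:
playing_graph "$graph_dir" || echo "no PAUSED_PLAYING graph in $graph_dir yet"
```

The path printed by playing_graph can be passed straight to xdot or dot.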

Using GStreamer with entrans

gst-launch-1.0 is the main command-line interface to GStreamer, available by default. But entrans is a bit smarter:

  • it provides partly-automated composition of GStreamer pipelines
  • it allows you to cut streams, for example to capture for a predefined duration. That ensures headers are written correctly, which is not always the case if you close gst-launch-1.0 by pressing Ctrl+C. To use this feature one has to insert a dam element after the first queue of each part of the pipeline

Building pipelines

You will probably need to build your own GStreamer pipeline for your particular use case. This section contains examples to give you the basic idea.

Note: for consistency and ease of copy/pasting, all filenames in this section are of the form test-$( date --iso-8601=seconds ). Your shell should automatically convert this to e.g. test-2010-11-12T13:14:15+01:00.avi

Record raw video only

A simple pipeline that initialises one video source, sets the video format, muxes it into a file format, then saves it to a file:

gst-launch-1.0 \
    v4l2src device=$VIDEO_DEVICE \
        ! $VIDEO_CAPABILITIES \
        ! avimux \
        ! filesink location=test-$( date --iso-8601=seconds ).avi

This will create an AVI file with raw video and no audio. It should play in most software, but the file will be huge.
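To get a feel for how huge, the data rate of raw video can be worked out directly. The figures below are assumptions chosen for illustration (PAL resolution at 25 frames per second with 2 bytes per pixel, as in a packed 4:2:2 format); substitute the values from your own $VIDEO_CAPABILITIES.

```shell
#!/bin/sh
# Assumed capture format: PAL resolution, 25 frames per second,
# 2 bytes per pixel (e.g. a packed 4:2:2 format)
width=720
height=576
fps=25
bytes_per_pixel=2

bytes_per_frame=$(( width * height * bytes_per_pixel ))
bytes_per_second=$(( bytes_per_frame * fps ))
gb_per_hour=$(( bytes_per_second * 3600 / 1000000000 ))

echo "raw video: $bytes_per_second bytes per second, about $gb_per_hour GB per hour"
```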

Record raw audio only

A simple pipeline that initialises one audio source, sets the audio format, muxes it into a file format, then saves it to a file:

gst-launch-1.0 \
    alsasrc device=$AUDIO_DEVICE \
        ! $AUDIO_CAPABILITIES \
        ! avimux \
        ! filesink location=test-$( date --iso-8601=seconds ).avi

This will create an AVI file with raw audio and no video.

Record video and audio

gst-launch-1.0 \
    v4l2src device=$VIDEO_DEVICE \
        ! $VIDEO_CAPABILITIES \
        ! mux. \
    alsasrc device=$AUDIO_DEVICE \
        ! $AUDIO_CAPABILITIES \
        ! mux. \
    avimux name=mux \
        ! filesink location=test-$( date --iso-8601=seconds ).avi

Instead of a straightforward pipe with a single source leading into a muxer, this pipe has three parts:

  1. a video source leading to a named element (! name. with a full stop means "pipe to the name element")
  2. an audio source leading to the same element
  3. a named muxer element leading to a file sink

Muxers combine data from many inputs into a single output, allowing you to build quite flexible pipes.

Create multiple sinks

The tee element splits a single source into multiple outputs:

gst-launch-1.0 \
    v4l2src device=$VIDEO_DEVICE \
        ! $VIDEO_CAPABILITIES \
        ! avimux \
        ! tee name=network \
        ! queue \
        ! filesink location=test-$( date --iso-8601=seconds ).avi \
    network. \
        ! queue \
        ! tcpclientsink host=127.0.0.1 port=5678

This sends your stream to a file (filesink) and out over the network (tcpclientsink). The second branch starts from the named tee (network.), and each branch gets its own queue so that one sink cannot stall the other. To make this work, you'll need another program listening on the specified port (e.g. nc -l 127.0.0.1 -p 5678).

Encode audio and video

As well as piping streams around, GStreamer can manipulate their contents. The most common manipulation is to encode a stream:

gst-launch-1.0 \
    v4l2src device=$VIDEO_DEVICE \
        ! $VIDEO_CAPABILITIES \
        ! videoconvert \
        ! theoraenc \
        ! queue \
        ! mux. \
    alsasrc device=$AUDIO_DEVICE \
        ! $AUDIO_CAPABILITIES \
        ! audioconvert \
        ! vorbisenc \
        ! mux. \
    oggmux name=mux \
        ! filesink location=test-$( date --iso-8601=seconds ).ogg

The theoraenc and vorbisenc elements encode the video and audio using the Theora and Vorbis codecs. The two streams are then muxed together into an Ogg container before being saved.

Add buffers

Different elements work at different speeds. For example, a CPU-intensive encoder might fall behind when another process uses too much processor time, or a duplicate frame detector might hold frames back while it examines them. This can cause streams to fall out of sync, or frames to be dropped altogether. You can add queues to smooth these problems out:

gst-launch-1.0 -q -e \
    v4l2src device=$VIDEO_DEVICE \
        ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! $VIDEO_CAPABILITIES \
        ! videoconvert \
        ! x264enc interlaced=true pass=quant quantizer=0 speed-preset=ultrafast byte-stream=true \
        ! progressreport update-freq=1 \
        ! mux. \
    alsasrc device=$AUDIO_DEVICE \
        ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! $AUDIO_CAPABILITIES \
        ! audioconvert \
        ! flacenc \
        ! mux. \
    matroskamux name=mux min-index-interval=1000000000 \
        ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! filesink location=test-$( date --iso-8601=seconds ).mkv

This creates a file using FLAC audio and x264 video in lossless mode, muxed into a Matroska container. Because we used speed-preset=ultrafast, the buffers should just smooth out the flow of frames through the pipelines. Even though the buffers are set to the maximum possible size, speed-preset=veryslow would eventually fill the video buffer and start dropping frames.

Some other things to note about this pipeline:

  • FFmpeg's H.264 page includes a useful discussion of speed presets (both programs use the same underlying library)
  • quantizer=0 sets the video codec to lossless mode (~30GB/hour). Anything up to quantizer=18 should not lose information visible to the human eye, and will produce much smaller files (~10GB/hour)
  • min-index-interval=1000000000 improves seek times by telling the Matroska muxer to create one cue data entry per second of playback. Cue data is a few kilobytes per hour, added to the end of the file when encoding completes. If you try to watch your Matroska video while it's being recorded, it will take a long time to skip forward/back because the cue data hasn't been written yet
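Two of the numbers in these notes are easy to recompute for your own recordings. The sketch below converts a cue interval in seconds to the nanoseconds used by min-index-interval, and estimates file sizes from the rough per-hour figures quoted above. Both helper names are made up for this example, and the size estimates are only as good as those per-hour figures.

```shell
#!/bin/sh
# Convert seconds to the nanoseconds expected by min-index-interval
seconds_to_ns() {
    echo $(( $1 * 1000000000 ))
}

# Estimate the file size in GB for a tape of the given length in minutes,
# using a rough GB-per-hour figure
estimate_gb() {
    minutes=$1
    gb_per_hour=$2
    echo $(( minutes * gb_per_hour / 60 ))
}

echo "one cue entry per second: min-index-interval=$(seconds_to_ns 1)"
echo "180-minute tape, quantizer=0: about $(estimate_gb 180 30) GB"
echo "180-minute tape, quantizer=18: about $(estimate_gb 180 10) GB"
```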

Common capturing issues and their solutions

Reducing Jerkiness

If motion that should appear smooth instead stops and starts, try the following:

Check for muxer issues. Some muxers need big chunks of data, which can cause one stream to pause while it waits for the other to fill up. Change your pipeline to pipe your audio and video directly to their own filesinks - if the separate files don't judder, the muxer is the problem.

  • If the muxer is at fault, add ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 immediately before each stream goes to the muxer
    • queues have hard-coded maximum sizes - you can chain queues together if you need more buffering than one queue can hold

Check your CPU load. When GStreamer uses 100% CPU, it may need to drop frames to keep up.

  • If frames are dropped occasionally when CPU usage spikes to 100%, add a (larger) buffer to help smooth things out.
    • this can be a source's internal buffer (e.g. alsasrc buffer-time=2000000), or it can be an extra buffering step in your pipeline (! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0)
  • If frames are dropped when other processes have high CPU load, consider using nice to make sure encoding gets CPU priority
  • If frames are dropped regularly, use a different codec, change the parameters, lower the resolution, or otherwise choose a less resource-intensive solution

As a general rule, you should try increasing buffers first - if it doesn't work, it will just increase the pipeline's latency a bit. Be careful with nice, as it can slow down or even halt your computer.

Check for incorrect timestamps. If your video driver works by filling up an internal buffer then passing a cluster of frames without timestamps, GStreamer will think these should all have (nearly) the same timestamp. Make sure you have a videorate element in your pipeline, then add silent=false to it. If it reports many framedrops and framecopies even when the CPU load is low, the driver is probably at fault.

  • videorate on its own will actually make this problem worse by picking one frame and replacing all the others with it. Instead install entrans and add its stamp element between v4l2src and queue (e.g. v4l2src do-timestamp=true ! stamp sync-margin=2 sync-interval=5 ! videorate ! queue)
    • stamp intelligently guesses timestamps if drivers don't support timestamping. Its sync- options drop or copy frames to get a nearly-constant framerate. Using videorate as well does no harm and can solve some remaining problems

Avoiding pitfalls with video noise

If your video contains periods of video noise (snow), you may need to deal with some extra issues:

  • Most devices send an EndOfStream signal if the input signal quality drops too low, causing GStreamer to finish capturing. To prevent the device from sending EOS, set num-buffers=-1 on the v4l2src element.
  • The stamp plugin gets confused by periods of snow, causing it to generate faulty timestamps and framedropping. stamp will recover normal behaviour when the break is over, but will probably leave the buffer full of weirdly-stamped frames. stamp only drops one weirdly-stamped frame each sync-interval, so it can take several minutes until everything works fine again. To solve this problem, set leaky=2 on each queue element to allow dropping old frames
  • Periods of noise (snow, bad signal etc.) are hard to encode. Variable bitrate encoders will often drive up the bitrate during the noise then down afterwards to maintain the average bitrate. To minimise the issues, specify a minimum and maximum bitrate in your encoder
  • Snow at the start of a recording is just plain ugly. To get black input instead from a VCR, use the remote control to change the input source before you start recording

Investigating bugs in GStreamer

GStreamer comes with an extensive tracing system that lets you figure out what is going wrong, but you often need to understand the internals of GStreamer to be able to read those traces. You should read this documentation page for the basics of how the tracing system works. When something goes wrong you should:

  1. check for a useful error message by enabling the WARNING debug level, which also shows errors: GST_DEBUG=2 gst-launch-1.0
  2. try similar pipelines: reduce yours to its most minimal form, then add elements back until you can reproduce the issue
  3. as the problem is most likely in a V4L2 element, enable full V4L2 traces using GST_DEBUG="v4l2*:7,2" gst-launch-1.0
  4. find an error message that looks relevant and search the Internet for information about it
  5. try more variations based on what you learnt, until you eventually find something that works
  6. ask on Freenode #gstreamer or through the GStreamer mailing list
  7. if you think you have found a bug, report it through GNOME Bugzilla
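When repeating the debugging steps above many times, a small wrapper saves typing. The function below is a hypothetical convenience, not part of GStreamer: it sets GST_DEBUG for a single command and also sets GST_DEBUG_NO_COLOR so that a saved log is free of colour codes.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with a given GST_DEBUG setting and
# colour codes switched off, so the output is easier to search and share
run_with_debug() {
    debug_spec=$1
    shift
    GST_DEBUG="$debug_spec" GST_DEBUG_NO_COLOR=1 "$@"
}

# Debug output goes to stderr, so redirect it to keep a copy, e.g.:
#   run_with_debug 2 gst-launch-1.0 fakesrc num-buffers=10 ! fakesink 2> gst-debug.log
#   run_with_debug 'v4l2*:7,2' gst-launch-1.0 v4l2src ! fakesink 2> gst-debug.log
```

Keeping the log in a file makes it easier to search for the relevant message and to attach it to a question or bug report.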

Sample pipelines

Record from a bad analog signal to MJPEG video and raw mono audio

gst-launch-1.0 \
    v4l2src device=$VIDEO_DEVICE do-timestamp=true \
        ! $VIDEO_CAPABILITIES \
        ! videorate \
        ! $VIDEO_CAPABILITIES \
        ! videoconvert \
        ! $VIDEO_CAPABILITIES \
        ! jpegenc \
        ! queue \
        ! mux. \
    alsasrc device=$AUDIO_DEVICE \
        ! $AUDIO_CAPABILITIES \
        ! audiorate \
        ! audioresample \
        ! $AUDIO_CAPABILITIES, rate=44100 \
        ! audioconvert \
        ! $AUDIO_CAPABILITIES, rate=44100, channels=1 \
        ! queue \
        ! mux. \
    avimux name=mux ! filesink location=test-$( date --iso-8601=seconds ).avi

The chip that captures audio and video might not deliver the exact framerates specified, which the AVI format can't handle. The audiorate and videorate elements remove or duplicate frames to maintain a constant rate.

View pictures from a webcam (GStreamer 0.10)

gst-launch-0.10 \
    v4l2src do-timestamp=true device=$VIDEO_DEVICE \
        ! video/x-raw-yuv,format=\(fourcc\)UYVY,width=320,height=240 \
        ! ffmpegcolorspace \
        ! autovideosink

In GStreamer 0.10, videoconvert was called ffmpegcolorspace.

Entrans: Record to DVD-compliant MPEG2 (GStreamer 0.10)

entrans -s cut-time -c 0-180 -v -x '.*caps' --dam -- --raw \
    v4l2src queue-size=16 do-timestamp=true device=$VIDEO_DEVICE norm=PAL-BG num-buffers=-1 \
        ! stamp silent=false progress=0 sync-margin=2 sync-interval=5 \
        ! queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! dam \
        ! cogcolorspace \
        ! videorate silent=false \
        ! 'video/x-raw-yuv,width=720,height=576,framerate=25/1,interlaced=true,aspect-ratio=4/3' \
        ! queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! ffenc_mpeg2video rc-buffer-size=1500000 rc-max-rate=7000000 rc-min-rate=3500000 bitrate=4000000 max-key-interval=15 pass=pass1 \
        ! queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! mux. \
    pulsesrc buffer-time=2000000 do-timestamp=true \
        ! queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! dam \
        ! audioconvert \
        ! audiorate silent=false \
        ! audio/x-raw-int,rate=48000,channels=2,depth=16 \
        ! queue silent=false max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! ffenc_mp2 bitrate=192000 \
        ! queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
        ! mux. \
    ffmux_mpeg name=mux \
        ! filesink location=test-$( date --iso-8601=seconds ).mpg

This captures 3 minutes (180 seconds, see first line of the command) to test-$( date --iso-8601=seconds ).mpg and even works for bad input signals.

  • I wasn't able to figure out how to produce an MPEG file with AC-3 sound, as neither ffmux_mpeg nor mpegpsmux supports AC-3 streams at the moment. mplex does, but I wasn't able to get it working: it needs very big buffers to prevent the pipeline from stalling, and my GStreamer build didn't allow for such big buffers.
  • The limited buffer size on my system is also the reason why I had to add a third queue element to the middle of both the audio and the video part of the pipeline to prevent jerking.
  • Many HOWTOs use ffmpegcolorspace instead of cogcolorspace. That works too, but cogcolorspace is much faster.
  • It seems to be important that the video/x-raw-yuv,width=720,height=576,framerate=25/1,interlaced=true,aspect-ratio=4/3 statement comes after videorate, as videorate otherwise seems to drop the aspect-ratio metadata, resulting in files with an aspect ratio of 1 in their headers. Those files will probably be played back warped, and programs like dvdauthor complain.

Bash script to record video tapes with entrans

For most use cases, you'll want to wrap GStreamer in a larger shell script. This script protects against several common mistakes during encoding.

See also the V4L capturing script for a wrapper that represents a whole workflow.

#!/bin/bash
 
 targetdirectory="$HOME/videos"
 
 
 # Check whether another instance is already running
 
 if [[ -e "$HOME/.lock_shutdown.digitalisieren" ]]; then
     echo ""
     echo ""
     echo "Capturing already running. It is impossible to capture two tapes simultaneously. Hit a key to abort."
     read -n 1
     exit
 fi
 
 # clean up on exit and on catchable signals such as Ctrl+C (SIGKILL cannot be trapped)
 trap control_c 0 SIGHUP SIGINT SIGQUIT SIGABRT SIGALRM SIGSEGV SIGTERM
 
 control_c()
 # run if user hits control-c
 {
   cleanup
   exit $?
 }
 
 cleanup()
 {
   # -f: the lock may already have been removed at the end of a normal run
   rm -f "$HOME/.lock_shutdown.digitalisieren"
   return $?
 }
 
 touch "$HOME/.lock_shutdown.digitalisieren"
 
 echo ""
 echo ""
 echo "Please enter the length of the tape in minutes and press ENTER. (Press Ctrl+C to abort.)"
 echo ""
 while read -e laenge; do
     # accept only a non-empty string of digits
     if [[ $laenge =~ ^[0-9]+$ ]]; then
         break
     else
         echo ""
         echo ""
         echo "That's not a number."
         echo "Please enter the length of the tape in minutes and press ENTER. (Press Ctrl+C to abort.)"
         echo ""
     fi
 done
 
 # add a 10-minute safety margin in case the tape is longer, then convert to seconds
 laenge=$(( (10#$laenge + 10) * 60 ))
 
 echo ""
 echo ""
 echo "Please type in the description of the tape."
 echo "Don't forget to rewind the tape!"
 echo "Hit ENTER to start capturing. Press Ctrl+C to abort."
 echo ""
 read -e name;
 name=${name//\//_}
 name=${name//\"/_}
 name=${name//:/_}
 
 # If the name is already taken, append a number
 if [[ -e "$targetdirectory/$name.mpg" ]]; then
     nummer=0
     while [[ -e "$targetdirectory/$name.$nummer.mpg" ]]; do
        let nummer=nummer+1
     done
     name=$name.$nummer
 fi
 
 # Audio settings: unmute and set the level
 amixer -D pulse cset name='Capture Switch' 1 >& /dev/null      # switch on the capture channel
 amixer -D pulse cset name='Capture Volume' 20724 >& /dev/null  # set the capture level
 
 # Select the video input and configure the card
 v4l2-ctl --set-input 3 >& /dev/null
 v4l2-ctl -c saturation=80 >& /dev/null
 v4l2-ctl -c brightness=130 >& /dev/null
 
 let ende=$(date +%s)+laenge
 
 echo ""
 echo "Working"
 echo "Capturing will be finished at "$(date -d @$ende +%H.%M)"."
 echo ""
 echo "Press Ctrl+C to finish capturing now."
 
 
 nice -n -10 entrans -s cut-time -c 0-$laenge -m --dam -- --raw \
 v4l2src queue-size=16 do-timestamp=true device=$VIDEO_DEVICE norm=PAL-BG num-buffers=-1 ! stamp sync-margin=2 sync-interval=5 silent=false progress=0 ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! dam ! \
    cogcolorspace ! videorate ! \
    'video/x-raw-yuv,width=720,height=576,framerate=25/1,interlaced=true,aspect-ratio=4/3' ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! \
    ffenc_mpeg2video rc-buffer-size=1500000 rc-max-rate=7000000 rc-min-rate=3500000 bitrate=4000000 max-key-interval=15 pass=pass1 ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! mux. \
 pulsesrc buffer-time=2000000 do-timestamp=true ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! dam ! \
    audioconvert ! audiorate ! \
    audio/x-raw-int,rate=48000,channels=2,depth=16 ! \
    queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! \
    ffenc_mp2 bitrate=192000 ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! mux. \
 ffmux_mpeg name=mux ! filesink location=\"$targetdirectory/$name.mpg\" >& /dev/null
 
 echo "Finished Capturing"
 rm ~/.lock_shutdown.digitalisieren

The script uses a command line similar to the entrans example above to produce a DVD-compliant MPEG2 file.

  • The script aborts if another instance is already running.
  • Otherwise it asks for the length of the tape and its description.
  • It records to description.mpg or, if this file already exists, to description.0.mpg and so on, for the given time plus 10 minutes. The target directory has to be specified at the beginning of the script.
  • As the inputs and settings of the capture device can only partly be set via GStreamer, other tools (amixer and v4l2-ctl) are used.
  • Adjust the settings to match your input sources, the recording volume, capturing saturation and so on.
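The filename and duration handling described in this list can be tried out on its own, without any capture hardware. The sketch below restates that logic as three small functions; the function names are made up for this example and it is written for a plain POSIX shell rather than copied from the script.

```shell
#!/bin/sh
# Replace characters that cause trouble in filenames with underscores
sanitise_name() {
    printf '%s\n' "$1" | tr '/":' '___'
}

# Print a name that does not collide with an existing recording:
# description, then description.0, description.1 and so on
unique_name() {
    dir=$1
    name=$2
    if [ -e "$dir/$name.mpg" ]; then
        n=0
        while [ -e "$dir/$name.$n.mpg" ]; do
            n=$(( n + 1 ))
        done
        name=$name.$n
    fi
    echo "$name"
}

# Capture duration in seconds: tape length in minutes plus a 10-minute margin
capture_seconds() {
    echo $(( ($1 + 10) * 60 ))
}

demo_dir=$(mktemp -d)
name=$(sanitise_name 'Holiday 1994: part 1/2')
touch "$demo_dir/$name.mpg"
echo "next free name: $(unique_name "$demo_dir" "$name").mpg"
echo "capture for $(capture_seconds 180) seconds"
rm -rf "$demo_dir"
```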

Further documentation resources

External Links