V4L capturing

From LinuxTVWiki

This page discusses how to capture analogue video for offline consumption (especially digitising old VHS tapes). For information about streaming live video (e.g. webcams), see the streaming page. For information about streaming digital video (DVB), see TV-related software.

== Overview ==

Analogue video technology was largely designed before the advent of computers, so accurately digitising a video is a difficult problem. For example, software often assumes a constant frame rate throughout a video, but analogue technologies can deliver different numbers of frames from second to second for various reasons. This page will discuss some of the problems you will encounter digitising video and some of the techniques and programs you can use to solve them.
You should be able to find similar functionality in whatever program you use.


=== Recommended process ===


Your workflow should look something like this:


# '''Set your system up''' - understand the quirks of your TV card, VCR etc.
# '''Encode an accurate copy of the source video''' - handle issues with the analogue half of the system here. Do as little digital processing as possible
# '''Transcode a usable copy of the video''' - convert the previous file to something pleasing to use
# '''Try the video and transcode again''' - check whether the video works how you want, then transcode again


Converting analogue input to a digital format is hard - VCRs overheat and damage tapes, computers use too much CPU and drop frames, disk drives fill up, etc. Creating a ''good'' digital video is also hard - not all software supports all formats, overscan and background hiss distract the viewer, videos need to be split into useful chunks, and so on. It's much easier to learn the process and produce a quality result if you tackle one problem at a time.


=== Choosing formats ===


When you create a video, you need to choose your ''video format'' (e.g. XviD or MPEG-2), ''audio format'' (e.g. WAV or MP3) and ''container format'' (e.g. AVI or MP4). There's constant work to improve the ''codecs'' that create audio/video and the ''muxers'' that create containers, and whole new formats are invented fairly regularly, so this page can't recommend any specific formats. For example, as of late 2015 [https://en.wikipedia.org/wiki/MPEG-2 MPEG-2] was the recommended video codec for backwards compatibility because it was supported by older DVD players, [https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC H.264] was becoming popular because support was starting to land in recent web browsers, and [https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding HEVC] wasn't yet widely supported because people were waiting to see if patent claims would be made against it. Each solution is better for different use cases, and better solutions will most likely have been created within a year.

You'll need to do some research to find the currently-recommended formats. Wikipedia's comparisons of [https://en.wikipedia.org/wiki/Comparison_of_audio_coding_formats audio], [https://en.wikipedia.org/wiki/Comparison_of_video_codecs video] and [https://en.wikipedia.org/wiki/Comparison_of_container_formats container] formats are a good place to start. Here are some important things to look for:


* '''encoding speed''' - during the encoding stage, too much CPU load will cause frame-drops as the computer tries to keep up
* '''compatibility''' - newer formats usually produce better results but can't be played by older software


Remember that you can use different formats in the ''encode'' and ''transcode'' stages. Speed and accuracy are most important when encoding, so you should use a modern, fast, low-loss format to create your initial accurate copy of the source video. But size and compatibility are most important for playback, so you should transcode to a format that produces a smaller or more compatible file. For example, as of late 2015 you might encode FLAC audio and x264 video into a Matroska file, then transcode MP3 audio and MPEG-2 video into an AVI file. You can examine the result and transcode again from the original if the file is too large or your grandmother's DVD player won't play it.


== Setting up ==


Before you can record a video, you need to set your system up and identify the following information:

* connector type (RF, composite or S-video)
* TV norm (some variant of PAL, NTSC or SECAM)
* video device (<code>/dev/video''<number>''</code>)
* audio device (<code>hw:CARD=''<id>'',DEV=''<number>''</code>)
* video capabilities (<code>video/x-raw, format=''<string>'', framerate=''<fraction>'', width=''<int>'', height=''<int>'', interlace-mode=''<string>'', pixel-aspect-ratio=''<fraction>''</code>)
* audio capabilities (<code>audio/x-raw, format=''<string>'', layout=''<string>'', rate=''<int>'', channels=''<int>''</code>)
* colour settings (optional - hue, saturation, brightness and contrast)

This section will explain how to find these.


=== Connecting your video ===
{|
| style="text-align:right;" |[[File:Rf-connector.png]]
| style="text-align:center; font-weight: bold; color:#FF0000" | avoid
|[https://en.wikipedia.org/wiki/RF_connector RF Connector]
| tends to create more noise than the alternatives. Usually input #0, shows snow when there's no input
|-
| style="text-align:right;" |[[File:Composite-video-connector.png]]
| style="text-align:center; font-weight: bold; color:#00AA00" | use
|[https://en.wikipedia.org/wiki/Composite_video Composite video connector]
| widely supported and produces a good signal. Usually input #1, shows blackness when there's no input
|-
| style="text-align:right;" |[[File:S-video-connector.png]]
| style="text-align:center; font-weight: bold; color:#777700" | use if available
|[https://en.wikipedia.org/wiki/S-Video S-video connector]
| should produce a good video signal but most hardware needs a converter. Usually input #2, shows blackness when there's no input
|}


Connect your video source (TV or VCR) to your computer however you can. Each type of connector has slightly different properties - try whatever you can and see what works. If you have a TV card that supports multiple inputs, you will need to specify the input number when you come to record. You can cut the recording into pieces during the transcoding stage, so snow/blackness won't appear in the final video.

=== Finding your TV norm ===

Most TV cards only support the TV norm of the country they were sold in (e.g. PAL-I in the UK or NTSC-M in the Americas), but it's best to confirm this just in case. Wikipedia has [https://en.wikipedia.org/wiki/File:PAL-NTSC-SECAM.svg an image of colour systems by country] and [https://en.wikipedia.org/wiki/Broadcast_television_systems#ITU_standards a complete list of standards] with countries they're used in.

If you like, you can store your TV norm in an environment variable:

TV_NORM=<norm>

For example, if your norm was <code>PAL-I</code>, you might type <code>TV_NORM=PAL-I</code> into your terminal. This guide will use <code>$TV_NORM</code> to refer to your video norm - if you choose not to set an environment variable, you will need to replace instances of <code>$TV_NORM</code> with your TV norm.


=== Determining your video device ===
mpv --tv-device=<device> tv:///<whichever-input-number-you-connected>


If your source is a VCR, remember to play a video so you know the right one when you see it. If you see snow when you were expecting blackness (or vice versa), double-check your input number with the output of <code>v4l2-ctl</code> above.


If you like, you can store your device and input number in environment variables:


VIDEO_DEVICE=<device>
VIDEO_INPUT=<whichever-input-number-you-connected>


Further examples on this page will use <code>$VIDEO_DEVICE</code> and <code>$VIDEO_INPUT</code> - you will need to replace these if you don't set environment variables.


=== Determining your audio device ===
If you're not sure which one you want, try each in turn:


gst-launch-0.10 alsasrc do-timestamp=true device=hw:<device> ! autoaudiosink


Again, you should hear your tape playing when you get the right one. Note: always use an ALSA ''hw'' device, as they are closest to the hardware. PulseAudio devices and ALSA's ''plughw'' devices add extra layers that, while more convenient for most uses, only cause headaches for us.
Line 96: Line 117:
AUDIO_DEVICE=<device>


Further examples on this page will use <code>$AUDIO_DEVICE</code> in place of an actual audio device - you will need to replace this if you don't set environment variables.


=== Getting your device capabilities ===


To find the capabilities of your video device, do:
gst-launch-1.0 --gst-debug=v4l2src:5 v4l2src device=$VIDEO_DEVICE ! fakesink 2>&1 | sed -une '/caps of src/ s/[:;] /\n/gp'


To find the capabilities of your audio device, do:
gst-launch-1.0 --gst-debug=alsa:5 alsasrc device=$AUDIO_DEVICE ! fakesink 2>&1 | sed -une '/returning caps/ s/[s;] /\n/gp'


You will need to press <kbd>ctrl+c</kbd> to close each of these programs when they've printed some output. When you record your video, you will need to specify capabilities based on the ranges displayed here.

For options where you have a choice, you should usually just pick the highest number with the following exceptions:

* audio <code>format</code> is optional (your software can decide this automatically)
* video <code>format</code> should be optional, but as of 2015 a bug means you need to specify <code>format=UYVY</code>
* video <code>height</code> (discussed below) should be the appropriate height for your TV norm
* video <code>framerate</code> (discussed below) should be the appropriate value for your TV norm, but may need to be tweaked for your hardware
* <code>pixel-aspect-ratio</code> should be ignored (it will be set later)

For example, if your TV norm was some variant of PAL and your video card showed these results:

<nowiki>$ gst-launch-1.0 --gst-debug=v4l2src:5 v4l2src device=$VIDEO_DEVICE ! fakesink 2>&1 | sed -une '/caps of src/ s/[:;] /\n/gp'
0:00:00.052071821 29657 0x139fc50 DEBUG v4l2src gstv4l2src.c:306:gst_v4l2src_negotiate:<v4l2src0> caps of src
video/x-raw, format=(string)YUY2, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)UYVY, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)Y42B, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)I420, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)YV12, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)xRGB, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)BGRx, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)RGB, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)BGR, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)RGB16, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)RGB15, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)GRAY8, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59</nowiki>

<nowiki>$ gst-launch-1.0 --gst-debug=alsa:5 alsasrc device=$AUDIO_DEVICE ! fakesink 2>&1 | sed -une '/returning caps/ s/[s;] /\n/gp'
0:00:00.039231863 30898 0x25fcde0 INFO alsa gstalsasrc.c:318:gst_alsasrc_getcaps:<alsasrc0> returning cap
audio/x-raw, format=(string){ S16LE, U16LE }, layout=(string)interleaved, rate=(int)32000, channels=(int)2, channel-mask=(bitmask)0x0000000000000003
audio/x-raw, format=(string){ S16LE, U16LE }, layout=(string)interleaved, rate=(int)32000, channels=(int)1</nowiki>

Then you would select <code>video/x-raw, format=UYVY, framerate=25/1, width=720, height=576, interlace-mode=mixed, pixel-aspect-ratio=1</code> and <code>audio/x-raw, layout=interleaved, rate=32000, channels=2, channel-mask=0x0000000000000003</code>
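If you want to script this selection, the upper bounds of the ranges can be pulled out of a caps line with standard text tools. A minimal sketch, using one caps line copied from the sample output above:

```shell
# One caps line copied from the sample v4l2src output above
CAPS='video/x-raw, format=(string)UYVY, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59'

# Pick the upper bound of each [ min, max ] range, per the "highest number" rule
# (remember the height/framerate caveats above before using these directly)
MAX_WIDTH=$(printf '%s\n' "$CAPS" | sed -n 's/.*width=(int)\[ *[0-9]*, *\([0-9]*\) *\].*/\1/p')
MAX_HEIGHT=$(printf '%s\n' "$CAPS" | sed -n 's/.*height=(int)\[ *[0-9]*, *\([0-9]*\) *\].*/\1/p')
echo "width=$MAX_WIDTH height=$MAX_HEIGHT"
```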

Once again, you can set your capabilities in an environment variable:

VIDEO_CAPABILITIES=<capabilities>
AUDIO_CAPABILITIES=<capabilities>

Further examples on this page will use <code>$VIDEO_CAPABILITIES</code> and <code>$AUDIO_CAPABILITIES</code> in place of actual capabilities - you will need to replace these if you don't set environment variables.


==== Video heights ====


gst-launch-1.0 -q v4l2src device=$VIDEO_DEVICE \
! $VIDEO_CAPABILITIES, height=578 \
! imagefreeze \
! autovideosink


gst-launch-1.0 -q v4l2src device=$VIDEO_DEVICE \
! $VIDEO_CAPABILITIES, height=<appropriate-height> \
! imagefreeze \
! autovideosink
You may want to test this yourself and set your height to whatever looks best.


==== Video framerates ====


Due to hardware issues, some V4L devices produce slightly too many (or too few) frames per second. To check your system's actual frame rate, start your video source (e.g. a VCR or webcam) then run this command:


gst-launch-1.0 v4l2src device=$VIDEO_DEVICE \
! $VIDEO_CAPABILITIES \
! fpsdisplaysink fps-update-interval=100000
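GStreamer caps express the frame rate as an integer fraction, so once you've measured your hardware's real rate you can convert it. A sketch, assuming a hypothetical measured rate of 25.03 frames per second:

```shell
# Hypothetical measured average from fpsdisplaysink (substitute your own reading)
MEASURED_FPS=25.03

# Express the decimal rate as the integer fraction GStreamer expects,
# rounding to guard against floating-point error (25.03 -> 2503/100)
FRAMERATE=$(awk -v fps="$MEASURED_FPS" 'BEGIN { printf "%d/100", fps * 100 + 0.5 }')
echo "framerate=(fraction)$FRAMERATE"
```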


Most TV cards have correct colour settings by default, but if your picture looks wrong (or you just want to check), first capture an image that has a good range of colours:


mpv --tv-device=$VIDEO_DEVICE tv:///$VIDEO_INPUT


Press "s" to take screenshots, then open them in an image editor and alter the hue, saturation, brightness and contrast until it looks right. If possible, print a [https://en.wikipedia.org/wiki/Testcard testcard], capture a screenshot of it in good lighting conditions, then compare the captured image to the original. When you find the settings that look right, you can set your TV card.
== Encoding an accurate video ==


Your first step should be to record an accurate copy of your source video. A good quality encoding can use anything up to 30 gigabytes per hour, so figure out how long your video is and make sure you have enough space. Most software isn't optimised for analogue video encoding, causing audio and video to desynchronise in some circumstances.


[[GStreamer]] is the best program for inputting video on Linux (see [[GStreamer|the GStreamer page]] for details), but is poorly-documented and hard to use. [http://ffmpeg.org/ FFmpeg] has much better documentation, but can't handle the quirks of analogue video. To get the best results, you'll need to use GStreamer as an FFmpeg source. Assuming you have set the environment variables from the previous section, and also set <code>$VIDEO_FORMAT</code>, <code>$VIDEO_FORMAT_OPTIONS</code>, <code>$AUDIO_FORMAT</code>, <code>$AUDIO_FORMAT_OPTIONS</code> and <code>$MUXER_FORMAT</code> based on your preferences, you can record video with a command like this:

<nowiki>ffmpeg \
-i <(
gst-launch-1.0 -q \
v4l2src device=$VIDEO_DEVICE do-timestamp=true pixel-aspect-ratio=1 norm=$TV_NORM ! $VIDEO_CAPABILITIES ! mux. \
alsasrc device=$AUDIO_DEVICE do-timestamp=true ! $AUDIO_CAPABILITIES ! mux. \
matroskamux name=mux ! fdsink fd=1
) \
-c:v $VIDEO_FORMAT $VIDEO_FORMAT_OPTIONS \
-c:a $AUDIO_FORMAT $AUDIO_FORMAT_OPTIONS \
"accurate-video.$MUXER_FORMAT"</nowiki>

This command does two things:
* tells GStreamer to record raw audio and video and mux it into a [http://www.matroska.org/ Matroska media container], which represents the input in a form that's accurate and which FFmpeg can handle
* tells FFmpeg to accept the video from GStreamer and convert it to a more usable format

If you have enough free disk space, you could just save the raw video as your accurate copy - see [[GStreamer]] for details.
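The format variables used above might be set like this - the specific codecs here are illustrative assumptions, not recommendations (see ''Choosing formats'' below for how to pick your own):

```shell
# Hypothetical format choices for the encoding stage (assumptions, not recommendations)
VIDEO_FORMAT=libx264
VIDEO_FORMAT_OPTIONS="-preset ultrafast -qp 0"   # lossless x264, fast enough for real-time on most machines
AUDIO_FORMAT=flac
AUDIO_FORMAT_OPTIONS=""
MUXER_FORMAT=mkv   # Matroska supports variable frame rates

echo "encoding to accurate-video.$MUXER_FORMAT with $VIDEO_FORMAT video and $AUDIO_FORMAT audio"
```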


=== Handling desynchronised audio and video ===


[[GStreamer]] should be able to synchronise your audio and video automatically using the <code>do-timestamp</code> setting, but some hardware doesn't support timestamps so you'll have to fix it during transcoding. Before following these instructions, make sure you're using a raw <code>hw</code> audio device - <code>plughw</code> devices can cause synchronisation issues on any hardware.


If possible, create [https://en.wikipedia.org/wiki/Clapperboard clapperboard] effects at the start of your videos - hook up a camcorder, run your capture command, then clap your hands in front of the camera before pressing play on your VCR. Failing that, make note of moments in videos where an obvious visual element occurred at the same moment as an obvious audio moment.


Once you've recorded your video, you'll need to calculate your desired A/V offset. For the best result, play your video with precise timestamps (e.g. <code>mpv --osd-fractions your-file.$MUXER_FORMAT</code>) and open your audio in an audio editor (e.g. [http://audacityteam.org/ Audacity]), then find the exact frame when your clapperboard video/audio occurred and subtract one from the other. To confirm your result, run <code>mpv --audio-delay=<result> accurate-video.$MUXER_FORMAT</code> and make sure it looks right.
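The subtraction itself is trivial - a sketch with hypothetical clapperboard timestamps (substitute the ones you measured):

```shell
# Hypothetical timestamps: clap visible at 12.480s, audible at 12.250s (assumptions)
VIDEO_CLAP=12.480
AUDIO_CLAP=12.250

# If the clap is heard before it is seen, delaying the audio by the
# difference lines them up; feed the result to mpv's --audio-delay
OFFSET=$(awk -v v="$VIDEO_CLAP" -v a="$AUDIO_CLAP" 'BEGIN { printf "%.3f", v - a }')
echo "mpv --audio-delay=$OFFSET accurate-video.mkv"
```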


=== Measuring audio noise ===


Your hardware will create a small amount of audio noise in your recording. If you want to remove this later, you'll need to measure it for every hardware configuration you use - S-video vs. composite, laptop charging vs. unplugged, and so on.


You'll need a recording of about half a second of your system in a resting state, which you will use later to remove noise. This can be a silent TV channel or paused tape, but if you're using composite or S-video connectors, the easiest thing is probably just to record a few moments of blackness before pressing play.


=== Choosing formats ===


Your encoding formats need to encode in real-time and lose as little information as possible. Even if you plan to throw that information away during transcoding, an accurate initial recording will give you more freedom when the time comes. For example, your muxer format should support ''variable frame rates'' so you can measure your video's frame rate. Once you have that information, you could use it to calculate an accurate transcoding frame rate or to cut out sections where your VCR delivered the wrong number of frames - either way the information is useful even though it was lost from the final video.


== Transcoding a usable video ==


The video you recorded should accurately represent your source video, but will probably be a large file, be a noisy experience, and might not even play in some programs. You need to ''transcode'' it to a more usable format. You can use any program(s) to do this, but it's probably easiest to continue using [http://ffmpeg.org/ FFmpeg]:

<nowiki>ffmpeg -i "accurate-video.$MUXER_FORMAT" \
-c:v <transcoded-video-format> <transcoded-video-options> \
-c:a <transcoded-audio-format> <transcoded-audio-options> \
usable-video.<usable-muxer></nowiki>

If you're happy with the result, you can stop here. But you might want to improve the video, for example:

* [https://trac.ffmpeg.org/wiki/Seeking Cut sections out of your video with the -ss and -t options]
* [https://trac.ffmpeg.org/wiki/FilteringGuide Add complex filters to clean up the video and audio]
* [http://stackoverflow.com/questions/20254846/how-to-add-an-external-audio-track-to-a-video-file-using-vlc-or-ffmpeg-command-l Edit the audio with Audacity then copy the new version back in]

This section will discuss some of the high-level issues you'll face if you choose to improve your video.


=== Cleaning audio ===
Some programs need video to have a specified aspect ratio. If you simply crop out the ugly overscan lines at the bottom of your video, some programs may refuse to play your video. Instead you should ''mask'' the area with blackness. In <code>ffmpeg</code>, you would use a <code>crop</code> filter followed by a <code>pad</code> filter to create the appropriate result.
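For example, masking the bottom 16 lines of a 720×576 frame might build a filter string like this - the geometry here is an assumption, so measure your own overscan first:

```shell
# Hypothetical geometry: full PAL frame, with 16 lines of overscan noise at the bottom
WIDTH=720 HEIGHT=576 OVERSCAN=16
KEEP=$((HEIGHT - OVERSCAN))

# Crop away the noisy lines, then pad back to the original size with black,
# so the aspect ratio (and thus player compatibility) is preserved
FILTER="crop=${WIDTH}:${KEEP}:0:0,pad=${WIDTH}:${HEIGHT}:0:0"
echo "ffmpeg -i accurate-video.mkv -vf \"$FILTER\" ..."
```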


Analogue video is [https://en.wikipedia.org/wiki/Interlaced_video interlaced], essentially interleaving two consecutive video frames within each image. This confuses video filters that compare neighbouring pixels (e.g. to look for bright grains in dark areas of the screen), so you should ''deinterleave'' the frames before using such filters, then ''interleave'' them again afterwards. For example, an <code>ffmpeg</code> filter chain might start with <code>il=d:d:d</code> and end with <code>il=i:i:i</code>. If you skip the trailing <code>il=i:i:i</code>, you can see that de-interleaving works by putting each image in a different half of the frame to trick other filters into doing the right thing.


=== Choosing formats ===
=== Choosing formats ===

Revision as of 00:53, 4 September 2015

This page discusses how to capture analogue video for offline consumption (especially digitising old VHS tapes). For information about streaming live video (e.g. webcams), see the streaming page. For information about streaming digital video (DVB), see TV-related software.

Overview

Analogue video technology was largely designed before the advent of computers, so accurately digitising a video is a difficult problem. For example, software often assumes a constant frame rate throughout a video, but analogue technologies can deliver different numbers of frames from second to second for various reasons. This page will discuss some of the problems you will encounter digitising video and some of the techniques and programs you can use to solve them.

Recommended process

Your workflow should look something like this:

  1. Set your system up - understand the quirks of your TV card, VCR etc.
  2. Encode an accurate copy of the source video - handle issues with the analogue half of the system here. Do as little digital processing as possible
  3. Transcode a usable copy of the video - convert the previous file to something pleasing to use
  4. Try the video and transcode again - check whether the video works how you want, then transcode again

Converting analogue input to a digital format is hard - VCRs overheat and damage tapes, computers use too much CPU and drop frames, disk drives fill up, etc. Creating a good digital video is also hard - not all software supports all formats, overscan and background hiss distract the viewer, videos need to be split into useful chunks, and so on. It's much easier to learn the process and produce a quality result if you tackle one problem at a time.

Choosing formats

When you create a video, you need to choose your video format (e.g. XviD or MPEG-2), audio format (e.g. WAV or MP3) and container format (e.g. AVI or MP4). There's constant work to improve the codecs that create audio/video and the muxers that create containers, and whole new formats are invented fairly regularly, so this page can't recommend any specific formats. For example, as of late 2015 MPEG-2 was the recommended video codec for backwards compatibility because it was supported by older DVD players, H.264 was becoming popular because support was starting to land in recent web browsers, and HEVC wasn't yet widely supported because people were waiting to see if patent claims would be made against it. Each solution is better for different use cases, and better solutions will most likely have been created within a year.

You'll need to do some research to find the currently-recommended formats. Wikipedia's comparisons of audio, video and container formats are a good place to start. Here are some important things to look for:

  • encoding speed - during the encoding stage, a codec that uses too much CPU will cause dropped frames as the computer struggles to keep up
  • accuracy - some formats are lossless, others throw away information to improve speed and/or reduce file size
  • file size - different formats use different amounts of disk space, even with the same accuracy
  • compatibility - newer formats usually produce better results but can't be played by older software

Remember that you can use different formats in the encode and transcode stages. Speed and accuracy are most important when encoding, so you should use a modern, fast, low-loss format to create your initial accurate copy of the source video. But size and compatibility are most important for playback, so you should transcode to a format that produces a smaller or more compatible file. For example, as of late 2015 you might encode FLAC audio and x264 video into a Matroska file, then transcode MP3 audio and MPEG-2 video into an AVI file. You can examine the result and transcode again from the original if the file is too large or your grandmother's DVD player won't play it.
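To make the two-stage idea concrete, here's a sketch using the late-2015 example above - the codecs, options and filenames are illustrative, not recommendations:

```shell
# Stage 1: turn the capture into an accurate copy - near-lossless x264
# video (low CRF, fast preset to keep CPU free) plus lossless FLAC audio
# in a Matroska container. "raw-capture.mkv" is a hypothetical name for
# your captured stream.
ACCURATE="accurate-video.mkv"
ffmpeg -i raw-capture.mkv \
    -c:v libx264 -preset ultrafast -crf 5 \
    -c:a flac \
    "$ACCURATE"

# Stage 2: transcode a smaller, more compatible copy for playback.
ffmpeg -i "$ACCURATE" \
    -c:v mpeg2video -b:v 4M \
    -c:a libmp3lame -b:a 192k \
    usable-video.avi
```

Because stage 2 always starts from the accurate copy, you can rerun it with different options as often as you like without touching the tape again.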

Setting up

Before you can record a video, you need to set your system up and identify the following information:

  • connector type (RF, composite or S-video)
  • TV norm (some variant of PAL, NTSC or SECAM)
  • video device (/dev/video<number>)
  • audio device (hw:CARD=<id>,DEV=<number>)
  • video capabilities (video/x-raw, format=<string>, framerate=<fraction>, width=<int>, height=<int>, interlace-mode=<string>, pixel-aspect-ratio=<fraction>)
  • audio capabilities (audio/x-raw, format=<string>, layout=<string>, rate=<int>, channels=<int>)
  • colour settings (optional - hue, saturation, brightness and contrast)

This section will explain how to find these.

Connecting your video

  • RF connector (avoid) - tends to create more noise than the alternatives. Usually input #0, shows snow when there's no input
  • Composite video connector (use) - widely supported and produces a good signal. Usually input #1, shows blackness when there's no input
  • S-video connector (use if available) - should produce a good video signal but most hardware needs a converter. Usually input #2, shows blackness when there's no input

Connect your video source (TV or VCR) to your computer however you can. Each type of connector has slightly different properties - try whatever you can and see what works. If you have a TV card that supports multiple inputs, you will need to specify the input number when you come to record. You can cut the recording into pieces during the transcoding stage, so snow/blackness won't appear in the final video.

Finding your TV norm

Most TV cards only support the TV norm of the country they were sold in (e.g. PAL-I in the UK or NTSC-M in the Americas), but it's best to confirm this just in case. Wikipedia has an image of colour systems by country and a complete list of standards with countries they're used in.

If you like, you can store your TV norm in an environment variable:

TV_NORM=<norm>

For example, if your norm was PAL-I, you might type TV_NORM=PAL-I into your terminal. This guide will use $TV_NORM to refer to your video norm - if you choose not to set an environment variable, you will need to replace instances of $TV_NORM with your TV norm.

Determining your video device

Once you have connected your input, you need to determine the name Linux gives it. See all your video devices by doing:

ls /dev/video*

One of these is the device you want. Most people only have one, or can figure it out by disconnecting devices and rerunning the above command. Otherwise, check the capabilities of each device:

for VIDEO_DEVICE in /dev/video* ; do echo ; echo ; echo $VIDEO_DEVICE ; echo ; v4l2-ctl --device=$VIDEO_DEVICE --list-inputs ; done

Usually you will see e.g. a webcam with a single input and a TV card with multiple inputs. If you're still not sure which one you want, try each one in turn:

mpv --tv-device=<device> tv:///<whichever-input-number-you-connected>

If your source is a VCR, remember to play a video so you know the right one when you see it. If you see snow when you were expecting blackness (or vice versa), double-check your input number with the output of v4l2-ctl above.

If you like, you can store your device and input number in environment variables:

VIDEO_DEVICE=<device>
VIDEO_INPUT=<whichever-input-number-you-connected>

Further examples on this page will use $VIDEO_DEVICE and $VIDEO_INPUT - you will need to replace these if you don't set environment variables.

Determining your audio device

See all of your audio devices by doing:

arecord -l

Again, it should be fairly obvious which of these is the right one. Get the device names by doing:

arecord -L | grep ^hw:

If you're not sure which one you want, try each in turn:

mpv --tv-device=$VIDEO_DEVICE --tv-adevice=<device> tv:///$VIDEO_INPUT

Again, you should hear your tape playing when you get the right one. Note: always use an ALSA hw device, as they are closest to the hardware. PulseAudio devices and ALSA's plughw devices add extra layers that, while more convenient for most uses, only cause headaches for us.

Optionally set your device in an environment variable:

AUDIO_DEVICE=<device>

Further examples on this page will use $AUDIO_DEVICE in place of an actual audio device - you will need to replace this if you don't set environment variables.

Getting your device capabilities

To find the capabilities of your video device, do:

gst-launch-1.0 --gst-debug=v4l2src:5 v4l2src device=$VIDEO_DEVICE ! fakesink 2>&1 | sed -une '/caps of src/ s/[:;] /\n/gp'

To find the capabilities of your audio device, do:

gst-launch-1.0 --gst-debug=alsa:5 alsasrc device=$AUDIO_DEVICE ! fakesink 2>&1 | sed -une '/returning caps/  s/[s;] /\n/gp'

You will need to press ctrl+c to close each of these programs when they've printed some output. When you record your video, you will need to specify capabilities based on the ranges displayed here.

For options where you have a choice, you should usually just pick the highest number with the following exceptions:

  • audio format is optional (your software can decide this automatically)
  • video format should be optional, but as of 2015 a bug means you need to specify format=UYVY
  • video height (discussed below) should be the appropriate height for your TV norm
  • video framerate (discussed below) should be the appropriate value for your TV norm, but may need to be tweaked for your hardware
  • pixel-aspect-ratio should be ignored (it will be set later)

For example, if your TV norm was some variant of PAL and your video card showed these results:

$ gst-launch-1.0 --gst-debug=v4l2src:5 v4l2src device=$VIDEO_DEVICE ! fakesink 2>&1 | sed -une '/caps of src/ s/[:;] /\n/gp'
0:00:00.052071821 29657      0x139fc50 DEBUG                v4l2src gstv4l2src.c:306:gst_v4l2src_negotiate:<v4l2src0> caps of src
video/x-raw, format=(string)YUY2, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)UYVY, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)Y42B, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)I420, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)YV12, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)xRGB, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)BGRx, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)RGB, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)BGR, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)RGB16, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)RGB15, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
video/x-raw, format=(string)GRAY8, framerate=(fraction)25/1, width=(int)[ 48, 720 ], height=(int)[ 32, 578 ], interlace-mode=(string)mixed, pixel-aspect-ratio=(fraction)54/59
$ gst-launch-1.0 --gst-debug=alsa:5 alsasrc device=$AUDIO_DEVICE ! fakesink 2>&1 | sed -une '/returning caps/  s/[s;] /\n/gp'
0:00:00.039231863 30898      0x25fcde0 INFO                    alsa gstalsasrc.c:318:gst_alsasrc_getcaps:<alsasrc0> returning cap
audio/x-raw, format=(string){ S16LE, U16LE }, layout=(string)interleaved, rate=(int)32000, channels=(int)2, channel-mask=(bitmask)0x0000000000000003
audio/x-raw, format=(string){ S16LE, U16LE }, layout=(string)interleaved, rate=(int)32000, channels=(int)1

Then you would select video/x-raw, format=UYVY, framerate=25/1, width=720, height=576, interlace-mode=mixed, pixel-aspect-ratio=1 and audio/x-raw, layout=interleaved, rate=32000, channels=2, channel-mask=0x0000000000000003

Once again, you can set your capabilities in an environment variable:

VIDEO_CAPABILITIES=<capabilities>
AUDIO_CAPABILITIES=<capabilities>

Further examples on this page will use $VIDEO_CAPABILITIES and $AUDIO_CAPABILITIES in place of the actual capabilities - you will need to replace these if you don't set environment variables.

Video heights

Some devices report a maximum height of 578. A PAL TV signal is 576 lines tall and an NTSC signal is 486 lines, so height=578 won't give you the best picture quality. To confirm this, tune to a non-existent TV channel then take a screenshot of the snow:

gst-launch-1.0 -q v4l2src device=$VIDEO_DEVICE \
    ! $VIDEO_CAPABILITIES, height=578 \
    ! imagefreeze \
    ! autovideosink

If your device is padding the signal, you'll notice blurring in the middle of the picture. Now take a screenshot with the appropriate height for your TV norm:

gst-launch-1.0 -q v4l2src device=$VIDEO_DEVICE \
    ! $VIDEO_CAPABILITIES, height=<appropriate-height> \
    ! imagefreeze \
    ! autovideosink

With the correct height (e.g. height=576 for PAL), the middle of the picture should be nice and crisp.

You may want to test this yourself and set your height to whatever looks best.

Video framerates

Due to hardware issues, some V4L devices produce slightly too many (or too few) frames per second. To check your system's actual frame rate, start your video source (e.g. a VCR or webcam) then run this command:

gst-launch-1.0 v4l2src device=$VIDEO_DEVICE \
    ! $VIDEO_CAPABILITIES \
    ! fpsdisplaysink fps-update-interval=100000
  1. Let it run for 100 seconds to get a large enough sample. It should print some statistics in the bottom of the window - write down the number of frames dropped
  2. Let it run for another 100 seconds, then write down the new number of frames dropped
  3. Calculate (second number) - (first number) - 1 (e.g. 5007 - 2504 - 1 == 2502)
    • You need to subtract one because fpsdisplaysink drops one frame every time it displays the counter
  4. That number is exactly one hundred times your framerate, so you should tell your software e.g. framerate=2502/100
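The arithmetic from the steps above, using the example numbers, as a quick shell check:

```shell
FIRST=2504   # frames dropped after the first 100 seconds
SECOND=5007  # frames dropped after the second 100 seconds

# fpsdisplaysink drops one extra frame each time it repaints its
# counter, so subtract 1; the result is 100x the real framerate.
FRAMES=$(( SECOND - FIRST - 1 ))
echo "framerate=${FRAMES}/100"   # framerate=2502/100
```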

Note: VHS framerates can vary within the same file. To get an accurate measure of a VHS recording's framerate, encode to a format that supports variable framerates then retrieve the video's duration and total number of frames. You can then transcode a new file with your desired frame rate.
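One way to get those numbers is ffprobe - this sketch counts the decoded frames and reads the duration, assuming your accurate copy is a Matroska file:

```shell
# Count every decoded video frame (slow - it decodes the whole file)...
FRAMES=$(ffprobe -v error -count_frames -select_streams v:0 \
    -show_entries stream=nb_read_frames -of csv=p=0 accurate-video.mkv)

# ...and read the container duration in seconds.
DURATION=$(ffprobe -v error \
    -show_entries format=duration -of csv=p=0 accurate-video.mkv)

# The average framerate is frames/duration, e.g. 150120 / 6000 = 25.02fps.
echo "$FRAMES frames over $DURATION seconds"
```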

Correcting your colour settings

Most TV cards have correct colour settings by default, but if your picture looks wrong (or you just want to check), first capture an image that has a good range of colours:

mpv --tv-device=$VIDEO_DEVICE tv:///$VIDEO_INPUT

Press "s" to take screenshots, then open them in an image editor and alter the hue, saturation, brightness and contrast until it looks right. If possible, print a testcard, capture a screenshot of it in good lighting conditions, then compare the captured image to the original. When you find the settings that look right, you can set your TV card.

First, make a backup of the current settings:

v4l2-ctl --device=$VIDEO_DEVICE --list-ctrls | tee tv-card-settings-$( date --iso-8601=seconds ).txt

Then input the new settings:

v4l2-ctl --device=$VIDEO_DEVICE --set-ctrl=hue=0          # set this to your preferred value
v4l2-ctl --device=$VIDEO_DEVICE --set-ctrl=saturation=64  # set this to your preferred value
v4l2-ctl --device=$VIDEO_DEVICE --set-ctrl=brightness=128 # set this to your preferred value
v4l2-ctl --device=$VIDEO_DEVICE --set-ctrl=contrast=68    # set this to your preferred value

Note: you can update these while a video is playing. If your settings are too far off, or if you're able to record a testcard, you might want to change the settings by eye before you bother with screenshots.

Encoding an accurate video

Your first step should be to record an accurate copy of your source video. A good quality encoding can use anything up to 30 gigabytes per hour, so figure out how long your video is and make sure you have enough space. Most software isn't optimised for analogue video encoding, causing audio and video to desynchronise in some circumstances.

GStreamer is the best program for inputting video on Linux (see the GStreamer page for details), but is poorly-documented and hard to use. FFmpeg has much better documentation, but can't handle the quirks of analogue video. To get the best results, you'll need to use GStreamer as an FFmpeg source. Assuming you have set the environment variables from the previous section, and also set $VIDEO_FORMAT, $VIDEO_FORMAT_OPTIONS, $AUDIO_FORMAT, $AUDIO_FORMAT_OPTIONS and $MUXER_FORMAT based on your preferences, you can record video with a command like this:

ffmpeg \
    -i <(
        gst-launch-1.0 -q \
            v4l2src device=$VIDEO_DEVICE do-timestamp=true pixel-aspect-ratio=1 norm=$TV_NORM ! $VIDEO_CAPABILITIES ! mux. \
            alsasrc device=$AUDIO_DEVICE do-timestamp=true                                    ! $AUDIO_CAPABILITIES ! mux. \
            matroskamux name=mux ! fdsink fd=1
    ) \
    -c:v $VIDEO_FORMAT $VIDEO_FORMAT_OPTIONS \
    -c:a $AUDIO_FORMAT $AUDIO_FORMAT_OPTIONS \
    "accurate-video.$MUXER_FORMAT"

This command does two things:

  • tells GStreamer to record raw audio and video and mux it into a Matroska media container, which represents the input in a form that's accurate and which FFmpeg can handle
  • tells FFmpeg to accept the video from GStreamer and convert it to a more usable format

If you have enough free disk space, you could just save the raw video as your accurate copy - see GStreamer for details.
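For example, replacing fdsink with filesink in the GStreamer half of the command above writes the raw stream straight to disk - a sketch assuming the same environment variables:

```shell
# Mux raw video and audio into a Matroska file on disk - expect up to
# 30 gigabytes per hour at full quality.
RAW_FILE="raw-video.mkv"
gst-launch-1.0 -q \
    v4l2src device=$VIDEO_DEVICE do-timestamp=true pixel-aspect-ratio=1 norm=$TV_NORM ! $VIDEO_CAPABILITIES ! mux. \
    alsasrc device=$AUDIO_DEVICE do-timestamp=true ! $AUDIO_CAPABILITIES ! mux. \
    matroskamux name=mux ! filesink location="$RAW_FILE"
```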

Handling desynchronised audio and video

GStreamer should be able to synchronise your audio and video automatically using the do-timestamp setting, but some hardware doesn't support timestamps so you'll have to fix it during transcoding. Before following these instructions, make sure you're using a raw hw audio device - plughw devices can cause synchronisation issues on any hardware.

If possible, create clapperboard effects at the start of your videos - hook up a camcorder, run your capture command, then clap your hands in front of the camera before pressing play on your VCR. Failing that, make a note of moments in the video where an obvious visual event coincides with an obvious sound.

Once you've recorded your video, you'll need to calculate your desired A/V offset. For the best result, play your video with precise timestamps (e.g. mpv --osd-fractions your-file.$MUXER_FORMAT) and open your audio in an audio editor (e.g. Audacity), then find the exact frame when your clapperboard video/audio occurred and subtract one from the other. To confirm your result, run mpv --audio-delay=<result> accurate-video.$MUXER_FORMAT and make sure it looks right.
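Once you know the offset, you can bake it in during transcoding rather than passing --audio-delay on every playback - a sketch where 0.3 seconds is a hypothetical offset:

```shell
OFFSET=0.3  # hypothetical A/V offset in seconds, found as described above

# Open the same file twice, shift the second copy's timestamps by
# $OFFSET, then take video from the first input and audio from the
# delayed second input, copying both streams without re-encoding.
ffmpeg -i accurate-video.mkv -itsoffset $OFFSET -i accurate-video.mkv \
    -map 0:v -map 1:a -c copy synced-video.mkv
```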

Measuring audio noise

Your hardware will create a small amount of audio noise in your recording. If you want to remove this later, you'll need to measure it for every hardware configuration you use - S-video vs. composite, laptop charging vs. unplugged, and so on.

You'll need a recording of about half a second of your system in a resting state, which you will use later to remove noise. This can be a silent TV channel or paused tape, but if you're using composite or S-video connectors, the easiest thing is probably just to record a few moments of blackness before pressing play.
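For example, you could capture a one-second sample with arecord - the sample format and rate here are illustrative, so match them to your audio capabilities:

```shell
# Record one second (-d 1) of the resting state - keep the tape
# paused or the input on blackness while this runs.
arecord -D $AUDIO_DEVICE -f S16_LE -r 48000 -c 2 -d 1 noise-sample.wav
```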

Choosing formats

Your encoding formats need to encode in real-time and lose as little information as possible. Even if you plan to throw that information away during transcoding, an accurate initial recording will give you more freedom when the time comes. For example, your muxer format should support variable frame rates so you can measure your video's frame rate. Once you have that information, you could use it to calculate an accurate transcoding frame rate or to cut out sections where your VCR delivered the wrong number of frames - either way the information is useful even though it was lost from the final video.

Transcoding a usable video

The video you recorded should accurately represent your source video, but it will probably be a large, noisy file, and might not even play in some programs. You need to transcode it to a more usable format. You can use any program(s) to do this, but it's probably easiest to continue using FFmpeg:

ffmpeg -i "accurate-video.$MUXER_FORMAT" \
    -c:v <transcoded-video-format> <transcoded-video-options> \
    -c:a <transcoded-audio-format> <transcoded-audio-options> \
    usable-video.<usable-muxer>

If you're happy with the result, you can stop here. But you might want to improve the video, for example:

  • cut sections out of your video with ffmpeg's -ss and -t options
  • add complex filters to clean up the video and audio
  • edit the audio with Audacity then copy the new version back in

This section will discuss some of the high-level issues you'll face if you choose to improve your video.

Cleaning audio

Any analogue recording will contain a certain amount of background noise. Cleaning noise is optional, and you'll always be able to produce a slightly better result if you spend a little longer on it, so this section will just introduce enough theory to get you started. Audacity's equalizer and noise reduction effect are good places to start experimenting.

The major noise sources are:

  • your audio codec might throw away sound it thinks you won't hear in order to reduce file size
  • your recording system will produce a small, consistent amount of noise based on its various electrical and mechanical components
  • VHS format limitations cause static at high and low frequencies, depending on the VCR's settings
  • imperfections in tape recording and playback produce noise that differs between recordings and even between scenes

A lossless audio format (e.g. WAV or FLAC) should ensure your original encoding doesn't produce any extra noise. Even if you transcode to a format like MP3 that throws information away, a lossless original ensures there's only one lot of noise in the result.

The primary means of reducing noise is the frequency-based noise gate, which blocks some frequencies and passes others. High-pass and low-pass filters pass audio above or below a certain frequency, and can be combined into band-pass or even multi-band filters. The rest of this section discusses how to build a series of noise gates for your audio.

Identify noise from your recording system by recording the sound of a paused tape or silent television channel for a few seconds. If possible, use the near-silence at the start of your recording so you can guarantee your sample matches your current hardware configuration. Use this baseline recording as a noise profile which your software uses to build a multi-band noise gate. You can apply that noise gate to the whole recording, and to other recordings with the same hardware that don't have a usable sample.
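With sox, for example, the profile-then-gate step looks like this - the 0.21 reduction amount is just a common starting point to tune by ear:

```shell
# Build a noise profile from the baseline recording...
sox noise-sample.wav -n noiseprof hardware.noise-profile

# ...then use that profile to gate the full recording.
sox noisy-audio.wav cleaned-audio.wav noisered hardware.noise-profile 0.21
```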

Identify VHS format limitations by searching online for information based on your TV norm (NTSC, PAL or SECAM), your recording quality (normal or Hi-Fi) and your VHS play mode (short- or long-play). Wikipedia's discussion of VHS audio recording is a good place to start. If you're able to find the information, gate your recordings with high-pass and low-pass filters that only allow frequencies within the range your tape actually records. For example, a long-play recording of a PAL tape will produce static below 100Hz and above 4kHz so you should gate your recording to only pass audio in the 100Hz-4000Hz range. If you can't find the information, you can determine it experimentally by trying out different filters to see what sounds right - your system probably produces static below about 10Hz or 100Hz and above about 4kHz or 12kHz, so try high- and low-pass filters in those ranges until you stop hearing background noise. If you don't remove this noise source, the next step will do a reasonable job of guessing it for you anyway.
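Continuing the PAL long-play example, the band could be enforced with sox's highpass and lowpass effects (ffmpeg has equivalent audio filters):

```shell
# Keep only the 100Hz-4kHz band that a PAL long-play tape can
# actually record; everything outside that range is static.
sox usable-audio.wav gated-audio.wav highpass 100 lowpass 4000
```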

Identify imperfections in recording and playback by watching the video and looking for periods of silence. You only need half a second of background noise to generate a profile, but the number of profiles is up to you. Some people grab one profile for a whole recording, others combine clips into averaged noise profiles, others cut audio into scenes and de-noise each in turn. At a minimum, tapes with multiple recordings should be split up and each one de-noised separately - a tape containing a TV program recorded in LP mode in one VCR followed by a home video recorded in SP in another VCR will produce two very different noise profiles, even if played back all in one go.

It's good to apply filters in the right order (system profile, then VHS limits, then recording profiles), but beyond that noise reduction is very subjective. For example, intelligent noise reduction tends to remove more noise in quiet periods but less when it would risk losing signal, which can sound like a snare drum being brushed whenever someone speaks. But dumb filters silence the same frequencies at all times, which can make everything sound muffled.

You can run your audio through as many gates as you like, and even repeat the same filter several times. If you use a noise reduction profile, you can even get different results from different programs (see for example this comparison of sox and Audacity's algorithms). There's no right answer but there's always a better result if you spend a bit more time, so you'll need to decide for yourself when the result is good enough.

Cleaning video

Much like audio, you can spend as long as you like cleaning your video. But whereas audio cleaning tends to be about doing one thing really well (separating out frequencies of signal and noise), video cleaning tends to be about getting decent results in different circumstances. For example, you might want to just remove the overscan lines at the bottom of a VHS recording, denoise a video slightly to reduce file size, or aggressively remove grains to make a low-quality recording watchable. FFmpeg's video filter list is a good place to start, but here are a few things you should know.

Some programs need video to have a specified aspect ratio. If you simply crop out the ugly overscan lines at the bottom of your video, some programs may refuse to play your video. Instead you should mask the area with blackness. In ffmpeg, you would use a crop filter followed by a pad filter to create the appropriate result.
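For example, masking 8 overscan lines at the bottom of a 720x576 PAL frame might look like this - the line count is hypothetical, so measure your own recording:

```shell
# crop keeps the top 568 lines, then pad restores the 720x576 frame
# with black at the bottom so the aspect ratio is unchanged.
ffmpeg -i usable-video.avi \
    -vf "crop=720:568:0:0,pad=720:576:0:0:black" \
    masked-video.avi
```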

Analogue video is interlaced, essentially interleaving two consecutive video frames within each image. This confuses video filters that compare neighbouring pixels (e.g. to look for bright grains in dark areas of the screen), so you should deinterleave the frames before using such filters, then interleave them again afterwards. For example, an ffmpeg filter chain might start with il=d:d:d and end with il=i:i:i. If you skip the trailing il=i:i:i, you can see that de-interleaving works by putting each image in a different half of the frame to trick other filters into doing the right thing.
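For example, wrapping a denoiser in the deinterleave/interleave pair - hqdn3d is just one denoise filter you might choose:

```shell
# il=d:d:d moves each field into its own half of the frame, hqdn3d
# denoises without mixing the fields, then il=i:i:i weaves them back.
FILTERS="il=d:d:d,hqdn3d,il=i:i:i"
ffmpeg -i usable-video.avi -vf "$FILTERS" denoised-video.avi
```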

Choosing formats

Your transcoding format needs to be small and compatible with whatever software you will use to play it back. If you can't find accurate information about your players, create a short test video and try it on your system. Your video codec may well have options to reduce file size at the cost of encoding time, so you may want to leave your computer transcoding overnight to get the best file size.
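Rather than transcoding a whole tape just to find out it won't play, cut a short test clip first - the codecs below are illustrative:

```shell
START=60   # seconds to skip into the recording
LENGTH=30  # seconds of video to keep

# Transcode a short excerpt to try on your target player.
ffmpeg -ss $START -i "accurate-video.$MUXER_FORMAT" -t $LENGTH \
    -c:v mpeg2video -c:a libmp3lame \
    test-clip.avi
```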