How do I preserve digital media, like CD, DVD, DAT and all the different kinds of digital videotape?
This is an answer to one of four related FAQs:
- What standard(s) should I follow?
- How should audio and video be encoded for preservation?
- What file format(s) should I use?
- How do I preserve digital media, like CD, DVD, DAT and all the different kinds of digital videotape?
The first three answers concentrated on digitisation of analogue content, and on getting it into the file-based world, sitting on mass storage in IT systems. However, there is a lot of content that is technically digital (the audio and video are represented by numbers, not analogue signals) but is sitting on shelves, and so also needs to be brought into the file-based world. To avoid confusion, moving content off digital media and into files should NOT be called digitisation. The word that originated with moving music tracks off audio CDs is ripping, and that is the term used here.
The problem: it isn’t only analogue carriers that face obsolescence, degradation, damage and loss of playback equipment and expertise. There is also a range of dedicated digital media, beginning with the audio CD in the early 1980s and continuing right through to the latest Blu-Ray disc. The media basically include:
- Audio: CD disc, DAT tape
- Video: DVD disc, Blu-Ray disc, a range of digital videotapes (D1, D2, D3, D5, DV, DVC, DVC-PRO (D7), Digital-S (D9), DigiBeta, mini-DV, HD-CAM, HD-CAM SR, DVC-PRO HD, HDV)
- Film: cinema “film” is now distributed digitally as DCP files with all sorts of rights protection, on similarly locked-down hard drives. If they can be cracked open, the sound and images are already in files. What makes dealing with DCP similar to the video and audio formats above is that a DCP doesn’t easily allow access to the files.
The good news: the starting point is already digital, so the ingest (or migration or transfer) process should be capable of more automation (and lower cost) than working with analogue originals.
The bad news: it gets complicated. See Digital Tape Preservation Strategy: Preserving Data or Video? By David Rice and Chris Lacinak – December 2, 2009
The basic complication is that there is no way to know what the bits are on the actual tape (there can be similar problems on CD and DVD, depending upon the playback device). For tape playback equipment (such as a mini-DV camera in playback mode) there is built-in correction of read errors, and the correction gets more sophisticated with more professional playback devices. But often these devices have two outputs: one for a digital video signal that is as corrected as the device can make it, and an additional ‘digital data’ output that can carry extra information to show where correction has been applied, making it easier to know what is going on, and what data really was on the original media.
So: here’s a two-headed monster. Two digital versions coming off one playback device — at the same time. Which is best for archiving?
The obvious answer is to save the digital data version because it has more information — but that simple answer ignores the complications of a full workflow for a transfer. There may be quality analysis software that runs on a reconstructed Rec 601 video signal, for instance.
As a generality: save the data version. Viewing and any further analysis should then be done from that data version. It is only legacy equipment that was originally designed for digitisation that has any real problems with a workflow based on digital data instead of digital video.
A= Save the original bits (a basic principle)
B= Make an “archive preservation standard” version – if you want one master digital format for your archive
and/or Make something useful: make a version that runs in all the applications you need (editor, player).
If you can’t make one version that suits all your purposes, you may need to also make a ‘mezzanine’ version: a single starting point for the production of any other formats that you need.
Files sit on mass storage; moving between coding types and file types can be automated. So it is not usually a major problem to have to change formats, or to produce a useful version on demand rather than at time of digitisation. The only caveat is to always go upward in quality when archiving: making an uncompressed master from DV doesn’t lose anything (except space) — but making a DV master from an uncompressed digital original would be definitely wrong (because DV is encoded at an 8:1 lossy compression).
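The size of the gap between DV and an uncompressed original can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the 625-line Rec 601 active picture (720 x 576 at 25 frames per second, 4:2:2 sampling) and the nominal 25 Mb/s DV video data rate. None of these figures are stated in this FAQ, and the function names are made up for the example.

```python
# Assumed figures (not from this FAQ): 625-line Rec 601 active picture
# and the nominal DV video data rate.
ACTIVE_WIDTH = 720
ACTIVE_HEIGHT = 576
FRAMES_PER_SECOND = 25
SAMPLES_PER_PIXEL = 2  # 4:2:2 averages one luma and one chroma sample per pixel
DV_VIDEO_BITS_PER_SECOND = 25_000_000


def uncompressed_bits_per_second(bits_per_sample: int) -> int:
    """Data rate of the uncompressed active picture, in bits per second."""
    pixels_per_second = ACTIVE_WIDTH * ACTIVE_HEIGHT * FRAMES_PER_SECOND
    return pixels_per_second * SAMPLES_PER_PIXEL * bits_per_sample


def dv_compression_ratio(bits_per_sample: int) -> float:
    """How many times smaller the DV video stream is than the uncompressed picture."""
    return uncompressed_bits_per_second(bits_per_sample) / DV_VIDEO_BITS_PER_SECOND


if __name__ == "__main__":
    for depth in (8, 10):
        rate_mbps = uncompressed_bits_per_second(depth) / 1_000_000
        print(f"{depth}-bit 4:2:2: {rate_mbps:.1f} Mb/s, "
              f"about {dv_compression_ratio(depth):.1f}:1 against DV")
```

Under these assumptions a 10-bit uncompressed original is roughly eight times the size of its DV equivalent, which is why going from DV to uncompressed costs only space, while going the other way throws picture information away.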
Audio: ripping the data straight from audio CDs is standard, because usually the CDs are played in a computer CD player rather than in a dedicated device like a CD Walkman. The Walkman will have circuits to keep the audio from skipping while you jog. The computer doesn’t expect you to be computing while you jog, so no de-skipping circuits — and no problem about what to save because there is only the ‘digital data’ version.
DAT tapes are different. These can only be played in dedicated equipment, which may have extensive processing to keep the signal steady despite read errors. The AES/EBU output on professional equipment is thus a digital audio output, not raw data, but it will be the only output! High-end professional equipment (eg Sony 7030) has a separate output to indicate uncorrected errors. There is a range of error types, which gets us into ‘known unknowns’ and ‘unknown unknowns’ territory that is beyond this one-pager.
Minidisc could be ripped to get ‘the bits on the disc’, but that takes special software, and more special software to play the result (because minidisc encoding isn’t any standard file encoding recognised by any standard playout software). So the PrestoPRIME recommendation is to archive minidisc content by capturing the uncompressed digital output of a minidisc player. This is the S/PDIF output, which uses an optical cable. Even portable players generally have an optical output, which can be captured with USB sound cards costing as little as US$20. So in this case the recommendation is to capture the ‘digital audio’, not the ‘digital data’.
Video: we’ve said ‘save the original bits’ but there will still be two cases. For the DV format, the original bits make a DV file that can be easily worked with by most video applications, so the DV file is ‘something sensible and useful’. But if you have anything else in the collection besides DV tapes, DV is not a good choice for an overall ‘archive preservation standard’ version.
Many digital storage media dedicated to video can be cloned (an exact copy can be made on a computer): DVDs and any other optical media, memory cards, hard drives and the entire family of DV-related videotape. The clone (the original bits) should be saved. However, these formats use a wide range of encoding methods; many are proprietary and all use lossy compression. The best route to a ‘master format’, if one is needed, will depend on the particular type of video on the particular type of carrier, so technical advice will be needed! We hope PrestoCentre can develop a simple roadmap, but the terrain is complicated!
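One way to confirm that a clone really is an exact copy is to compare checksums of the source and the copy. The sketch below is only an illustration. It assumes the carrier's content is already readable as an ordinary file (for example a disc image or a transferred DV stream). The use of SHA-256 and the function names `sha256_of` and `clone_and_verify` are choices made for this example, not something this FAQ prescribes.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clone_and_verify(source: Path, destination: Path) -> str:
    """Copy source to destination and check that both have the same digest.

    Returns the digest so it can be stored alongside the clone.
    Raises OSError if the copy does not match the source.
    """
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise OSError(f"Clone of {source} does not match: {actual} != {expected}")
    return actual
```

Keeping the returned digest with the clone lets later copies or migrations be checked against the same value.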
For many other formats (eg Digibeta, HD-CAM SR) the original bits simply cannot be accessed. There will be a Rec 601 or Rec 709 digital output, and that should be saved.
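The per-carrier advice given so far can be collected into a small lookup table. This is a rough summary written for illustration, not an official PrestoCentre roadmap. The names `Capture`, `RECOMMENDED` and `recommended_capture` are hypothetical, and any carrier not listed still needs technical advice.

```python
from enum import Enum


class Capture(Enum):
    RIP_DATA = "rip the digital data"
    CLONE = "clone the original bits"
    DIGITAL_AUDIO_OUT = "capture the digital audio output"
    DIGITAL_VIDEO_OUT = "capture the Rec 601 or Rec 709 digital video output"


# Summary of the recommendations in this FAQ; spellings follow the text.
RECOMMENDED = {
    "audio CD": Capture.RIP_DATA,
    "DAT": Capture.DIGITAL_AUDIO_OUT,       # AES/EBU output
    "minidisc": Capture.DIGITAL_AUDIO_OUT,  # optical S/PDIF output
    "DV": Capture.CLONE,
    "DVD": Capture.CLONE,
    "memory card": Capture.CLONE,
    "hard drive": Capture.CLONE,
    "Digibeta": Capture.DIGITAL_VIDEO_OUT,
    "HD-CAM SR": Capture.DIGITAL_VIDEO_OUT,
}


def recommended_capture(carrier: str) -> Capture:
    """Look up the suggested capture route for a carrier named in this FAQ."""
    try:
        return RECOMMENDED[carrier]
    except KeyError:
        raise ValueError(
            f"No recommendation recorded for {carrier!r}; seek technical advice"
        ) from None


if __name__ == "__main__":
    for name in ("audio CD", "minidisc", "DV", "Digibeta"):
        print(f"{name}: {recommended_capture(name).value}")
```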
HD formats are just beginning to hit the archive, and there are real problems. As with DVDs, there is a range of native formats (from tapes, memory cards and hard drives) that all should be cloned (where possible), and all will probably be problems in the future. Because of the data sizes, saving HD as uncompressed is a hard case to sell in 2013. HD-CAM SR has a native datarate of 400 Mb/s, but many broadcasters are pushing the archive to save at the production format of 100 Mb/s. This violates two basic archive principles: save the original; don’t reduce the quality. The solution to archiving HD will have to be the subject of a future FAQ, when a solution emerges.
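To put the two data rates quoted above in storage terms, the sketch below converts a rate in Mb/s into gigabytes per hour of material. It is a rough estimate under stated assumptions: decimal units (1 GB = 1,000,000,000 bytes), the quoted rates taken at face value, and no allowance for audio, metadata or file wrapper overhead. The function name is made up for this example.

```python
def gigabytes_per_hour(megabits_per_second: float) -> float:
    """Storage needed for one hour at the given data rate, in decimal gigabytes."""
    bits_per_hour = megabits_per_second * 1_000_000 * 3600
    bytes_per_hour = bits_per_hour / 8
    return bytes_per_hour / 1_000_000_000


if __name__ == "__main__":
    for rate in (400, 100):
        print(f"{rate} Mb/s needs about {gigabytes_per_hour(rate):.0f} GB per hour")
```

On these figures, keeping the native rate costs four times the storage of the production format for every hour archived, which is the trade-off broadcasters are weighing against the archive principles.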
Film: the basic issue is unlocking a protected DCP (Digital Cinema Package), but once unlocked, the content is compressed in a lossy fashion. The workflow for acquisition of digital cinema content for true archiving and preservation (not just for access copies) needs to find a way to acquire the original digital materials, not just the DCP version. Further information on DCP packages, their problems and solutions, will have to be the subject of a separate FAQ.