Trial and error: Digitising my tango CD collection


DJ set at a milonga at Brussels’ Royal Yacht Club, August 2011


When I started digitising my existing tango CD collection, I thought that OGG Vorbis would be a good choice as a compression format. The El Corte DJ manual in particular stresses that most older tango recordings, up to the wider spread of stereo recordings in the late 50s, rarely contain more than 100 kbit/s of information, so a low-bitrate MP3 compression should be perfectly suitable. I went for OGG Vorbis as the digital audio format for my collection because I have been a Linux user for more than 15 years and OGG Vorbis is an open-source codec that is very common on all major Linux distributions. OGG Vorbis is also known to provide better sound quality than MP3 at lower bitrates. Another important reason why I didn’t use MP3 was the patent and licence issues involved with this codec.

After some weeks of hard encoding work I came across more and more up-to-date information about the encoding chain, which I will explain later, and, most importantly, about the existence of lossless audio codecs like FLAC. I also read some test results which I was able to reproduce partially in my home studio: OGG Vorbis encoding creates artifacts, especially at lower bitrates and with partially damaged source material such as old tango recordings. For example, the short pops of a vinyl recording can end up sounding like water drops after passing through a lossy codec.

I therefore decided to drop everything and to re-encode the whole collection in a lossless audiophile format. The advantages are that I now have a CD-quality digital collection on my computer with a very nice sound timbre, every song can be tagged just as with Vorbis or MP3, and you can even embed the cover art (front, back, inlay, etc.) and several pages of scanned CD booklets into each song file! FLAC is also released under an open-source license and there are no patent issues. Among audiophiles it is currently regarded as the best lossless codec, with the highest audio quality and interoperability. Once the collection is in a lossless form it can be transcoded into any lossy format (MP3, OGG Vorbis, etc.) or into other lossless formats, and one can even go back to WAV to restore the initial audio CD. One of FLAC’s disadvantages is that it needs more disk space than lossy formats. Typically it reduces the original audio material to around 42% of its initial size, which is a fair compromise as hard disks are cheap and large today.
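To get a feeling for what the 42% figure means in disk space, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the 42% ratio is the figure quoted above, while the 60 minutes per CD and the collection of 500 CDs are made-up values for illustration.

```python
# CD audio is 44,100 samples per second, 16 bits per sample, 2 channels.
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbit/s


def flac_size_mb(minutes: float, ratio: float = 0.42) -> float:
    """Estimated FLAC size in MB (10^6 bytes) for `minutes` of CD audio.

    `ratio` is the assumed compressed/uncompressed size ratio.
    """
    raw_bytes = CD_BITRATE_KBPS * 1000 / 8 * minutes * 60
    return raw_bytes * ratio / 1_000_000


def collection_size_gb(n_cds: int, minutes_per_cd: float = 60.0,
                       ratio: float = 0.42) -> float:
    """Estimated size in GB (10^9 bytes) of a whole collection."""
    return n_cds * flac_size_mb(minutes_per_cd, ratio) / 1000


if __name__ == "__main__":
    # Hypothetical collection: 500 CDs of 60 minutes each.
    print(f"one CD:  {flac_size_mb(60):.1f} MB")
    print(f"500 CDs: {collection_size_gb(500):.1f} GB")
```

Under these assumptions one CD takes roughly 270 MB and the whole hypothetical collection stays well below 150 GB.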

My current encoding chain is as follows: I use an external, USB-connected LG GE20-LU10 Multidrive to read the audio CDs. The extraction of the audio stream and the FLAC encoding are done via the graphical frontend K3b, which uses the excellent command line tool Cdparanoia. Cdparanoia reads audio from the CD-ROM directly as data, with no analog step in between, and writes the data to a file or pipe in WAV, AIFC or raw 16-bit linear PCM. The data is then encoded to FLAC on the fly and ends up in my Music folder for further processing. Not to forget: before inserting the CDs into the drive I clean them. I recognised the importance of this step when one CD didn’t pass the encoding process and the encoder exited after a few tracks. After cleaning it and repairing the scratches, it passed without errors. Also, when the CD is dirty, certain sectors might not be readable and the ripper could apply some undesired interpolation.
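For readers who prefer the command line, the same two steps (rip with Cdparanoia, encode with FLAC) can be sketched in a few lines of Python. This is not what K3b does internally, only a minimal equivalent under some assumptions: the `cdparanoia` and `flac` tools are installed, the CD is in the default drive, and the helper names `flac_command` and `rip_and_encode` are my own.

```python
import subprocess
from pathlib import Path


def flac_command(wav: Path) -> list[str]:
    """Build the flac call for one ripped WAV file.

    --best   : highest compression level
    --verify : decode while encoding and compare against the input
    -o       : name of the output file
    """
    return ["flac", "--best", "--verify",
            "-o", str(wav.with_suffix(".flac")), str(wav)]


def rip_and_encode(workdir: Path) -> None:
    """Rip the CD in the default drive into `workdir`, then encode to FLAC."""
    # -B is cdparanoia's batch mode: one WAV file per track.
    subprocess.run(["cdparanoia", "-B"], cwd=workdir, check=True)
    for wav in sorted(workdir.glob("*.wav")):
        # check=True stops the loop if flac reports an error.
        subprocess.run(flac_command(wav), check=True)
        wav.unlink()  # remove the WAV once its FLAC copy is verified


if __name__ == "__main__":
    rip_and_encode(Path.cwd())
```

Because of `check=True`, a CD that cannot be read cleanly aborts the run instead of silently producing a damaged file, which is the behaviour I want after my experience with the dirty CD.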

Egon Ludwig's Tango Dictionary


The next step is tagging the song titles, as the info provided via CDDB is often incomplete and contains a lot of typing errors. For the moment I’m happiest with Easytag: it has a very interesting scanning function which permits batch editing of all song titles at once (also across several albums), and it can add cover art to the FLAC metadata fields. I mostly use the database to crosscheck all titles and correct errors, adding the recording dates and singer names. (I haven’t yet worked out how to integrate the database directly into my tagger, as their tools seem to be crafted mainly for Windows, but I will certainly dig deeper into this soon as it could save some time.) As the database is still incomplete, Bernhard Gehberger’s tracklistings are a very good source for crosschecking; he did an excellent job by providing his tracklistings and other resources online. I also use Egon Ludwig’s Tango Lexikon (in German, ISBN 3-89602-294-6). It contains a vast discography of 78 RPM records, LPs and CDs. It seems to be out of print but there might still be some copies around. To my mind the tagging part is really important and it needs to be done in a very precise manner. It’s simple: when the song titles cannot be found later, they cannot be played ;-).
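Corrections of this kind can also be scripted. The following Python sketch builds a call to `metaflac`, the tagging tool that ships with FLAC, to overwrite a few fields of one file. It is not how Easytag works; the helper names and all tag values are made-up placeholders, and using the PERFORMER field for the singer is my own assumption.

```python
import subprocess


def clean_title(raw: str) -> str:
    """Trim the title and collapse runs of whitespace into single spaces."""
    return " ".join(raw.split())


def metaflac_command(flac_file: str, tags: dict[str, str]) -> list[str]:
    """Build a metaflac call that replaces the given Vorbis comment fields.

    Each field is removed first, because --set-tag alone would add a
    second value instead of overwriting the existing one.
    """
    cmd = ["metaflac"]
    cmd += [f"--remove-tag={name}" for name in tags]
    cmd += [f"--set-tag={name}={value}" for name, value in tags.items()]
    cmd.append(flac_file)
    return cmd


if __name__ == "__main__":
    # Placeholder values, not taken from a real discography.
    tags = {
        "TITLE": clean_title("  Example   title "),
        "ARTIST": "Example orchestra",
        "PERFORMER": "Example singer",
        "DATE": "1941",
    }
    subprocess.run(metaflac_command("track01.flac", tags), check=True)
```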

For my first DJ sets I used the Rhythmbox player software, which follows the iTunes way of working. I still use Rhythmbox for easily browsing my collection and for my private playback, but I’m not happy with these playlist-based programs for a live set, as I want to present it in a more improvised way. So I migrated to Mixxx, which enables me to see the waveform of each song and to organise my collection into crates. It also has BPM detection and writes the collected info back into the FLAC metadata. With the multichannel sound card support, I can connect headphones to a second sound card and pre-listen while the live stream is running. It also works on 64-bit systems and takes advantage of multi-core CPUs. Mixxx runs on a variety of operating systems such as Linux, Mac OS X and different Windows versions. It cuts off all sound effects and other programs using the sound card (I still remember my worst dancing experience, when the DJ at my local milonga, no names, received a Skype call during his Pugliese tanda, dring, dring, dring … :-)). The Mixxx software runs rock solid on my Ubuntu Linux laptop and there seem to be more features and design improvements to come after this summer. There is one interesting feature I haven’t used yet: live streaming via a streaming server. I will test this soon.

I still use my iPod classic for mobile playback and to dig into my collection. I’m not happy with it though, as it doesn’t support anything other than MP3 and some Apple codecs, but on the other hand it has a big hard drive. So I just transcoded the whole collection from FLAC to MP3 to upload the copies to the iPod (it takes a night to run through the whole collection, only for using the iPod!). Meanwhile I’m dreaming of a mobile FLAC player with a big hard drive. After some searching I found the SoundKonverter graphical interface (which can also rip CDs via Cdparanoia) to do the batch transcoding for the iPod while preserving most of the metadata tags from the FLAC files. I end up with two music folders, one with the ripped CDs in FLAC format for the live DJ sets and one with the transcoded MP3 files for my iPod.
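The two-folder setup can also be kept in sync with a small script. The sketch below mirrors a FLAC tree into an MP3 tree with `ffmpeg` instead of SoundKonverter, so it is a substitute technique and not my actual workflow. It assumes `ffmpeg` is installed with MP3 (libmp3lame) support; the folder names and helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def mp3_target(flac_file: Path, flac_root: Path, mp3_root: Path) -> Path:
    """Map a file in the FLAC tree to the same relative place in the MP3 tree."""
    return (mp3_root / flac_file.relative_to(flac_root)).with_suffix(".mp3")


def ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Build the ffmpeg call for one file.

    -vn             : skip embedded cover art to keep the sketch simple
    -codec:a        : encode the audio with LAME
    -q:a 2          : LAME VBR quality level 2
    -map_metadata 0 : copy the tags from the input file
    """
    return ["ffmpeg", "-i", str(src), "-vn",
            "-codec:a", "libmp3lame", "-q:a", "2",
            "-map_metadata", "0", str(dst)]


def transcode_collection(flac_root: Path, mp3_root: Path) -> None:
    """Transcode every FLAC file that has no MP3 counterpart yet."""
    for src in sorted(flac_root.rglob("*.flac")):
        dst = mp3_target(src, flac_root, mp3_root)
        if dst.exists():
            continue  # already transcoded in an earlier run
        dst.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(ffmpeg_command(src, dst), check=True)


if __name__ == "__main__":
    # Hypothetical folder layout.
    transcode_collection(Path("Music/flac"), Path("Music/mp3"))
```

Skipping files that already exist means only newly ripped CDs need to be transcoded, so the run no longer has to take a whole night each time.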

Tascam US-100 DAC

My main sound card for the PA system is an external Tascam US-100. It also has a connector for a turntable and could be used later to record LPs and 78 RPM records from the late 50s onwards, as it integrates a RIAA preamplifier and a ground connector (earlier 78 RPM records use different equalisation curves and you need to map these curves accordingly, see this interesting article from Tangovia). I like the volume knob on the device, which lets me control the loudness by hand during the live set, as I don’t want to use replay gain. This is mainly because I prefer to adapt the volume to the audience and to trust my own sound perception. For some difficult playbacks of poor-quality recordings I sometimes use the equalizer function of Mixxx to adapt the high, mid or low ranges. Most of the time the raw stream of my encoded music provides the best music experience.

