Checking for corrupt FLAC files

Recently I ran into strange behaviour during playback of certain FLAC files in my digital tango collection. At first I thought there must be a bug in my player, but the player’s logs showed that the problem was within the FLAC stream: some of the files must have gotten corrupted. This can have different causes, like power failures and unclean shutdowns, an error on the storage media, or an incomplete copy task. Once a byte gets changed in a FLAC file, in the worst case the file might not be playable anymore. In my case, I saw two different symptoms. In one, the file would start playing and then the music output stopped after a certain point. In the other, the corrupted FLAC file triggered an exception in the decoder, which crashed the whole player application.

As a matter of fact, detecting this kind of corruption is a feature of the FLAC audio format: “Suitable for archiving: FLAC is an open format, and there is no generation loss if you need to convert your data to another format in the future. In addition to the frame CRCs and MD5 signature, FLAC has a verify option that decodes the encoded stream in parallel with the encoding process and compares the result to the original, aborting with an error if there is a mismatch.”
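The verify option mentioned in the quote is used at encoding time. Here is a minimal sketch, assuming the flac command line tool is installed and the input is a .wav file; encode_verified is a hypothetical helper name, not part of flac:

```shell
#!/bin/bash
# encode_verified is a hypothetical helper name (not part of the flac tool).
# It encodes a .wav file to .flac with verification enabled: flac decodes the
# stream while encoding and aborts with an error if the result does not match.
encode_verified() {
    local input=$1
    # ${input%.wav}.flac swaps the .wav extension for .flac
    flac --verify --silent -o "${input%.wav}.flac" "$input"
}

# Example call (song.wav is a placeholder file name):
# encode_verified song.wav
```

A non-zero exit code from the helper means that flac reported a problem during encoding or verification.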

So the good news is that the FLAC format has an MD5 checksum mechanism integrated, which provides for an easy integrity check. There is also a command line program to check a FLAC file:

flac -wst flacfile.flac

Here are the command options explained:
-w, --warnings-as-errors Treat all warnings as errors (which cause flac to terminate with a non-zero exit code).
-s, --silent Silent: do not show encoding/decoding statistics.
-t, --test Test (same as -d except no decoded file is written). The exit codes are the same as in decode mode.
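Because a failed test ends with a non-zero exit code, the result is easy to act on in a script. A small sketch, assuming flac is installed; check_flac is a hypothetical helper name chosen for illustration:

```shell
#!/bin/bash
# check_flac is a hypothetical helper name (not part of the flac tool).
# It tests one file and prints a result line based on flac's exit code.
check_flac() {
    local file=$1
    if flac -wst "$file" 2>/dev/null; then
        printf 'OK      %s\n' "$file"
    else
        # In the else branch, $? still holds the exit code of the flac call.
        local status=$?
        printf 'CORRUPT %s (exit code %d)\n' "$file" "$status"
        return 1
    fi
}

# Example call (flacfile.flac is the placeholder name used above):
# check_flac flacfile.flac
```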

Now this is useful to check an individual FLAC file, but when you have to scan several thousand files it is more practical to put the command into a shell script and run it against the whole music folder. I found this script, which I saved as flactest.sh in my home folder:

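#!/bin/bash
# Stop early if flac is not installed; otherwise every file would be logged as an error.
command -v flac >/dev/null || { echo "flac not found" >&2; exit 1; }
# Stop if the music folder is missing, so nothing is scanned in the wrong place.
cd ~/Musique || exit 1
# Start every run with an empty error list.
if [[ -f flac-errors.txt ]]; then
    rm flac-errors.txt
fi
touch flac-errors.txt
# Let ** match files in all subfolders.
shopt -s globstar
for file in ./**/*.flac; do
    # If the test fails, log the exit code and the path of the file.
    flac -wst "$file" 2>/dev/null || printf '%3d %s\n' "$?" "$file" >> flac-errors.txt
done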

The script changes into the Musique folder in my home directory and creates a text file called flac-errors.txt there. Since the script may be run repeatedly, it first tests whether this text file already exists and, if so, deletes it, then recreates it as an empty file before proceeding.

shopt -s globstar enables recursive globbing in Bash: ** then matches all directories and files below the current position in the filesystem, rather than only the current level.
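The difference between a plain * and ** is easy to see by counting matches. A small sketch, assuming Bash 4 or later; count_flac_matches is a hypothetical helper name:

```shell
#!/bin/bash
# count_flac_matches is a hypothetical helper name for this demonstration.
# The body runs in a subshell ( ... ), so the shopt and cd calls do not
# affect the calling shell.
count_flac_matches() (
    shopt -s globstar nullglob
    cd "$1" || exit 1
    one_level=( ./*.flac )       # only files directly in the folder
    recursive=( ./**/*.flac )    # files in the folder and all subfolders
    printf 'one level: %d, recursive: %d\n' "${#one_level[@]}" "${#recursive[@]}"
)

# Example call:
# count_flac_matches ~/Musique
```

The nullglob option is an addition here: it makes a pattern without matches expand to nothing instead of staying in place as literal text, so an empty folder is counted as zero.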

The for loop goes through all FLAC files in the Musique folder and its subfolders and performs the integrity test on each one. The messages from flac itself are sent to /dev/null, which means they are discarded. If the FLAC file is OK, nothing is recorded. If the test fails, meaning that there is corruption, the exit code and the path of the file are appended to flac-errors.txt with a little formatting. So you will have the path of each corrupt FLAC file on its own line in the text file for later analysis and eventual restoration of the damaged files.
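For the later analysis, the paths can be pulled back out of the error file. A minimal sketch, assuming the file was written with the '%3d %s\n' format from the script; list_damaged is a hypothetical helper name:

```shell
#!/bin/bash
# list_damaged is a hypothetical helper name.
# It prints only the paths from an error file written by the script.
list_damaged() {
    local report=${1:-flac-errors.txt}
    # Each line holds a three-character exit code, one space, then the path,
    # so the path starts at character 5.
    cut -c5- "$report"
}

# Example call, run from the folder that contains flac-errors.txt:
# list_damaged
```

Cutting at a fixed column keeps paths that contain spaces intact.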

The script will take quite some time to loop through all files. To watch the progress, I open flac-errors.txt for continuous reading in another console, like this:

tail -f /home/jens/Musique/flac-errors.txt

So running such a test every once in a while is a good idea to check whether all the music files in the collection are still OK. This helps to avoid bad surprises during playback!

By the way, the foobar2000 player has such an integrity test in its interface, and Mixxx will write FLAC stream errors into its log file.

The test script is also useful to run on newly added folders in the music collection, to check that all FLAC files are in OK condition and that the encoding worked out well, like a last test before playback or archival. The built-in integrity check of the FLAC audio format is actually a big advantage compared to other formats which don’t have such a mechanism!
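To test only a newly added folder, the same loop can be wrapped in a function that takes the folder as an argument. This is a sketch under a few assumptions: flac is installed, Bash 4 or later is used, and scan_flac_dir as well as the example paths are hypothetical names:

```shell
#!/bin/bash
# scan_flac_dir is a hypothetical helper name.
# Usage: scan_flac_dir FOLDER REPORT
# Tests every FLAC file below FOLDER and writes failures to REPORT.
# The body runs in a subshell ( ... ), so cd and shopt stay local to it.
scan_flac_dir() (
    shopt -s globstar nullglob
    # Open (and empty) the report before changing folders, so a relative
    # report path still points to where the caller expects it.
    exec 3> "$2" || exit 1
    cd "$1" || exit 1
    for file in ./**/*.flac; do
        # On failure, write the exit code and the path to the report.
        flac -wst "$file" 2>/dev/null || printf '%3d %s\n' "$?" "$file" >&3
    done
)

# Example call (both paths are placeholders):
# scan_flac_dir ~/Musique/new-album ~/new-album-errors.txt
```

An empty report after the run means that no file failed the test.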

3 thoughts on “Checking for corrupt FLAC files”

  1. Hi. I don’t know much about Tango (even though I’m from Argentina). Long live Astor Piazzolla anyway!
    I came across this page because I have a corrupted FLAC file, one that is impossible to get from any other source.
    It seems that one or a few bytes at the beginning of the data stream were changed (the header is OK). When I decode the FLAC into WAV, all I get is random values which result in nothing but noise. The frequency analysis reveals that the decoded samples are evenly distributed across all frequencies (this implies “noise”).
    Maybe one of the first bytes of the data stream has been changed, and the FLAC decoding algorithm makes wrong calculations after this corrupted byte.
    I guess there must be an application that is capable of fixing this. Do you know of any?
    Thanks in advance!
    Fabian.

  2. It’s worth mentioning that you need to install FLAC first, otherwise every file in your collection will generate an error and be added to the list. Ask me how I know 🙂
