I've written a program called "ripcheck" which runs a variety of tests on a WAV file, to see if there are potential mistakes that occurred in converting a CD to a WAV file.
I've made this program available as free & open source, and it can be downloaded from SourceForge at https://sourceforge.net/projects/ripcheck/ or from GitHub at https://github.com/panzi/ripcheck
You can use this program on your own WAV files to see if there are potential defects in your CD rips.
.
WHAT?
ripcheckc.c - source code
ripcheckc - Mac OS command line binary
ripcheckc.exe - Windows command line binary
ripcheck.tcl - Slower version, also creates zoomed graphics of problem areas
Released under GPL v3 license - http://www.gnu.org/licenses/gpl.html
.
WHY?
This program was written because we'd received some complaints of occasional "pops" at the beginning of some albums at Magnatune. Further research found that most of the albums we released in 2007 had various CD ripping problems.
At that time, we were using the open source Windows program CDEX, and on many albums from that time this program would introduce a very short "click" sound at the beginning of each WAV file, which would then trickle down to all the file formats we offered at Magnatune. Most people didn't notice it, probably assuming it was an audio streaming glitch, since it was so short.
In order to discover just how bad the problem might be, I wrote a program called "ripcheck", because, to my surprise, no program currently exists to try to determine if there might be CD ripping errors in an audio file.
.
WHAT KINDS OF PROBLEMS?
Besides the common problem of a short click at the beginning of a track, this program also discovered:
1) the occasional dropped sample (i.e. a sample value of zero) in the middle of a song. This might be audible as a click as well.
2) the occasional repeated sample value, for about 1/1000th of a second. This would be audible as a very short tone.
3) the occasional empty spot in the middle of a song.
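To make the repeated-sample case (item 2) concrete, here is a minimal sketch in C of finding the longest run of identical sample values in one channel. It is an illustration only, not the code from ripcheckc.c: the function name longest_dupe_run is hypothetical, and ignoring runs of zeros (so that ordinary silence isn't counted) is my assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the length of the longest run of identical, non-zero samples in one
 * channel, and store where that run starts in *start (if start is not NULL).
 * Runs of zeros are skipped on the assumption that they are plain silence. */
static size_t longest_dupe_run(const int16_t *samples, size_t n, size_t *start)
{
    size_t best_len = 0, best_start = 0;
    size_t run_start = 0;

    for (size_t i = 1; i <= n; i++) {
        /* A run ends at the end of the buffer or when the value changes. */
        if (i == n || samples[i] != samples[run_start]) {
            size_t len = i - run_start;
            if (samples[run_start] != 0 && len > best_len) {
                best_len = len;
                best_start = run_start;
            }
            run_start = i;
        }
    }
    if (start != NULL)
        *start = best_start;
    return best_len;
}
```

A caller would compare the returned length against a threshold of its own choosing. At 44,100 samples per second, 1/1000th of a second is about 44 samples.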
All in all, my program found that 124 albums at Magnatune had some sort of defect in them.
Most of the defects could be attributed to our early years, when we were running our ripping on Windows with software that was in its early versions (remember, we've been doing this since 2003, when CD ripping was pretty new).
Occasional problems also crept in from CDs burnt too quickly or on cheap CDR media, that were then sent to us as "masters".
What's amazing to me is that all these albums were sent to Amazon and iTunes, and those services never noticed anything. Obviously, those companies don't have any sort of "ripcheck" program to do automatic quality control on the audio they distribute.
All our releases are now automatically tested with "ripcheck" before we make them available. And all 124 albums that had defects in them have been repaired, in most cases with an audio editor. To remove the introductory pops (usually two samples long), I used Fission because it's a quick tool to use and the pops were in the silent part before the song starts. To fill in the occasional zero sample, I used Sequoia, a professional grade audio program, and actually drew in the missing sample. Where there was more than one sample missing, we reripped the CD master with modern software. If the CD master had defects in it (which was the case for about 20 releases), we contacted the musician and obtained new CD masters, which we insisted be burned slowly (at 4x speed). ripcheck was then rerun on all new audio to make sure that the defects were in fact removed.
.
HOW TO OBTAIN:
If you'd like to run this program yourself, on your own music library, please download it from SourceForge at:
https://sourceforge.net/projects/ripcheck/
It's a command line program that works on Windows, Mac and Unix/Linux. There is a fast version, which shows potential problems as text, and a much slower version which generates graphics of the WAV file defects, so you can easily see where the problem is.
In all cases, the location of the glitch is given in samples and milliseconds. You can use an audio editing program to type this number directly in and jump to that point.
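As a rough illustration of how a sample position maps to milliseconds, here is a small C sketch. The helper name sample_to_ms is hypothetical, and truncating rather than rounding is an assumption on my part; it happens to agree with the example output further down (sample 1012 reported as 22 millisecs), but I haven't confirmed that this is exactly what the program does.

```c
#include <stdint.h>

/* Convert a sample offset to whole milliseconds, truncating any fraction.
 * For CD audio the sample rate is 44100 samples per second per channel. */
static uint64_t sample_to_ms(uint64_t sample_index, uint32_t sample_rate)
{
    return sample_index * 1000u / sample_rate;
}
```

For example, sample_to_ms(6360820, 44100) gives 144236, i.e. a little over two minutes and 24 seconds into the song.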
Here is a typical warning, which detected 545 duplicate samples at two minutes into a song, which will sound like a very short (.012 seconds long) high pitched tone:
The ripcheck.tcl version of the program made this GIF for this WAV file, showing the context of the potential problem, and highlighting in red what it thinks the bad data is:
This particular audio file had two real problems. The repeated sample above is definitely a glitch on the CD (in this case, on the CDR master).
There was also a two sample long pop at the beginning of the recording, which ripcheck detects by looking for samples in the first five seconds of the song that have a zero value (silence), followed by a very short few samples that have a high value, followed by more silence. Here is how the ripcheck.tcl graphic represents this pop at the beginning of the song:
All the errors in a song are displayed as one GIF file, with the same name as the WAV file but with a .gif filename extension. The image zooms in on each error. Here is what the complete GIF file looks like for this song, showing the two problem areas:
A text file is created for each WAV file, summarizing problems found, which in this case were:
- possible pop found at sample count 1014 (20 millisecs) values: '0, 0, 0, 0, 24932, 26464, 0'
- 1090 duplicate samples found (value: 18614) in the left channel at sample '6,360,822' (144230 milliseconds)
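The pop check described above (silence, then a very short loud burst, then silence again, all within the first five seconds) can be sketched in C roughly as follows. This is a simplified illustration under stated assumptions, not the logic of ripcheckc.c: the name find_pop and all three thresholds are hypothetical, and the sketch only requires a single zero sample on each side of the burst.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical thresholds; the real program's limits may differ. */
#define POP_WINDOW_SAMPLES ((size_t)5 * 44100) /* first five seconds at 44.1 kHz */
#define POP_MIN_LEVEL      10000               /* assumed cutoff for a "high value" */
#define POP_MAX_LENGTH     4                   /* assumed longest burst counted as a pop */

/* Return the index of the first loud sample of a pop, or -1 if none is found. */
static long find_pop(const int16_t *samples, size_t n)
{
    size_t limit = n < POP_WINDOW_SAMPLES ? n : POP_WINDOW_SAMPLES;

    for (size_t i = 1; i < limit; i++) {
        /* A pop must start right after a silent (zero) sample. */
        if (samples[i - 1] != 0 || abs(samples[i]) < POP_MIN_LEVEL)
            continue;

        /* Measure the loud burst, giving up once it is too long to be a pop. */
        size_t j = i;
        while (j < limit && j - i < POP_MAX_LENGTH &&
               abs(samples[j]) >= POP_MIN_LEVEL)
            j++;

        /* It only counts as a pop if silence follows the short burst. */
        if (j < limit && samples[j] == 0)
            return (long)i;
    }
    return -1;
}
```

With the values from the warning above ('0, 0, 0, 0, 24932, 26464, 0'), this sketch reports the pop at index 4, the first loud sample. A looser level threshold catches more real pops but also produces more false positives.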
The program is intentionally over broad in detecting problems, and will often yield false positives (about half of the time, in my experience). For example, if you have electronic music which toys with audio glitches, perhaps creating pure tones with the same sample value, you might get a lot of false positives on that kind of music.
My goal was to always detect potential problems and never overlook real problems. On the Magnatune music collection, I personally listened to every audio file that had a warning, and viewed the samples in an audio editor, before marking a warning as a false positive.
The source code is pretty simple to read, and I'd love for any programmers who find it useful to devise other tests and add them to the list of things to detect.
.
HOW TO RUN:
If you download the .exe file for Windows, you will also need to install Cygwin.
To run it yourself, type:
ripcheckc filename.wav
Only WAV files are handled, and they must be 44.1 kHz, 16-bit.
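If you want to check in advance whether a file fits that restriction, the sketch below inspects the start of a WAV header in C. It assumes the common layout in which the "fmt " chunk immediately follows the 12-byte RIFF/WAVE header; files with extra chunks before "fmt " would need a proper chunk walk. The function name is_cd_quality_wav is hypothetical and not part of ripcheck.

```c
#include <stdint.h>
#include <string.h>

/* Decode little-endian integers from a byte buffer. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if hdr looks like the header of a 44.1 kHz, 16-bit PCM WAV file.
 * Assumes the "fmt " chunk starts at byte 12 (the usual layout). */
static int is_cd_quality_wav(const unsigned char *hdr, size_t len)
{
    if (len < 36)
        return 0;
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0)
        return 0;
    if (memcmp(hdr + 12, "fmt ", 4) != 0)
        return 0;
    return le16(hdr + 20) == 1        /* audio format 1 = uncompressed PCM */
        && le32(hdr + 24) == 44100    /* sample rate */
        && le16(hdr + 34) == 16;      /* bits per sample */
}
```

To use it, read the first 36 or more bytes of the file into a buffer and pass them in.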
If you want to use this program on Unix/Linux, you will need to compile the program with this command line:
gcc ripcheckc.c -o ripcheckc
The ripcheck.tcl program requires Tcl/Tk to be on your system, and ImageMagick as well. It runs about 3x slower, but creates attractive graphics of the WAV file problem areas. This can be a great time saver if you have a lot of problematic WAV files, aiding you in quickly reviewing the problems visually to decide if they are false positives.
If you want a WAV with defects to test, you can download:
http://magnatune.com/p/ripcheck_example.zip
This file has both a pop at the beginning, and a rip error in the middle of the song.
- possible pop found at sample count 1012 (22 millisecs) values: '0, 0, 0, 0, 24932, 26464, 0'
- 545 dupes found at sample count 6360820 (144236 millisecs) value='5703'
-john
You said "A" (which is very good news) but I'd also expect to hear "B".
Namely, are you planning to replace incorrectly ripped music in Magnatune library? What if master CD is already lost?
Can those of us who downloaded corrupted albums count on free replacement? What if I'm no longer Magnatune subscriber? (I remember reporting the issue in the past, most likely via web feedback form, but got no reply.)
P.S.
I'm surprised that the artists did not double-check the quality of their distributed songs... Otherwise, they would have immediately noticed the glitches.
Posted by: Linulin | September 28, 2013 at 09:17 AM
Thanks for this, John.
While I don't know of any program that does after the fact checking,
like yours, either, cdparanoia gives you all this information (and more
in its verbose mode) while you're ripping. I use cdparanoia for all my
rips - or I should say I used it when I still had a CD drive :-)
Posted by: Ian Z. | September 28, 2013 at 10:07 AM
I agree that cdparanoia is a great CD ripping program.
However, ripcheck is for people who need to test other people's CD rips, or who need to test CD rips they did themselves, years ago, with less reliable software than cdparanoia.
-john
Posted by: John from Magnatune | September 28, 2013 at 10:50 AM
An addendum --
Below is a list of 90 albums that had defects in them.
If you downloaded these albums before September 2013, these all have small audio defects that have now been repaired. Magnatune members can, of course, redownload any of these albums.
If you bought these albums before Magnatune had a membership plan, you can get a redownload URL to any of your purchases by putting your email address on this page, and you'll receive an email with private download URLs to each album you purchased: http://magnatune.com/info/redownload
Altri Stromenti - Italian Music of the 17th Century
Ambient Teknology - The All Seeing Eye Project
Ambient Teknology - The Devils Toxin - Atek Rebirth Vol 1
American Bach Soloists - JS Bach - CD1 - Mass in B Minor
American Bach Soloists - JS Bach - CD2 - Mass in B Minor
American Bach Soloists - JS Bach Cantatas I - Solo Cantatas
American Bach Soloists - JS Bach Cantatas II - Trauerode
American Bach Soloists - JS Bach Cantatas III - Cantatas from Muehlhausen and Weimar
American Bach Soloists - JS Bach Cantatas IV - Early Cantatas for Holy Week
American Bach Soloists - JS Bach Cantatas V - More Cantatas from Muehlhausen and Weimar
American Bach Soloists - JS Bach Cantatas VI - Favorite Cantatas
Arthur Yoria - Handshake Smiles
Arthur Yoria - Suerte Something
Big Bad Sun - Big Bad Sun
Big Bad Sun - Strange Phenomena
Brad Sucks - CD1 Mixter Two - I Dont Know What Im Doing
Brad Sucks - CD2 Mixter Two - I Dont Know What Im Doing
Claire Fitch - Celocity
DAC Crowell - Sferica
DAC Crowell - Within This Space
DJ Cary - Downtempo Chill 2
DJ Cary - Electromelange
DJ Cary - Power Synths
DJ Cary - Sonic Chill
DJ Markitos - Sequences of Life
Etherfysh - Stasis
Falling You - Human
Five Star Fall - Automatic Ordinary
Four Stones - Chronic Dreams 2
Hans Christian - Cinema of Dreams
Heir to Madness - The Citadel
Human Response - Immortal
Ion - Future Forever
Ivan Ilic - Vitality and Virtuosity - Sonatas by Haydn and Beethoven
Janine Johnson - German Keyboard Masters
Janine Johnson - Suites - Op 22 24 and 29
Jay Kishor - CD2 The Sowebo Concert
Jeffrey Luck Lucas - Hell Then Divine
Jeffrey Luck Lucas - What We Whisper
Jeni Melia - Sister Awake - music only
Jeni Melia - Sister Awake
Jenraytor - A Steady Stream
John Jackson - Bad Things Happen All The Time
Kim Ribeiro - Majestic
LVX Nova - LVX Nova
La Nuova Musica - Il Circolo Di Giulio Caccini
Lara St John - CD1 - Bach - The Six Sonatas and Partitas for Solo Violin
Lara St John - CD2 - Bach - The Six Sonatas and Partitas for Solo Violin
Loops For Licensing - Electronica Loops 1
Loops For Licensing - Rock Loops 2
Loops For Licensing - Rock Loops 3
Magnatune Compilation - Christmas Music
Magnatune Compilation - Electronica
Magnatune Compilation - Magnatune At The CC Salon
Magnatune Compilation - Relaxing Classical
Magnatune Compilation - The 2007 Magnatune Records Sampler
Magnatune Compilation - The Art of Persuasion
Mediva - Viva Mediva
Mountain Mirrors - Dreadnought
Musica Franca - Corrette - Le Phenix - Les Delices de la Solitude
Nova Casa - Leclair - Blow - Matteis - Brescianello
Plunkett - 14 Days
Ralph Rousseau Meulenbroeks - Back with Bach
Ralph Rousseau Meulenbroeks - Christmas Carols
Ralph Rousseau Meulenbroeks - Gambomania
Ralph Rousseau Meulenbroeks - Moved By Marais
Rejuvenescence - Dreamscapes
Richard Savino - Legrenzi - Venice Before Vivaldi
Richard Savino - Murcia - Danza y Diferencias
Robert Rich - Open Window
Robert Rich - Temple of the Invisible
Rocket City Riot - Last Of The Pleasure Seekers
Roots of Rebellion - Surfacing
Seth Carlin - Mozart in the Age of Enlightenment
Shira Kammen - Mistral
SkinMechanix - The Secret Life of Angels
Solace - Moon Moth Mixes
Solace - Nagari
Solar Cycle - Silver Lights
Stephane Potvin and the Con Brio Choir - Goode Christemas Musicke
Strojovna 07 - Switch On - Switch Off
Telemann Trio Berlin - Telemann Trio Berlin
The Headroom Project - Apa Ya
The Sarasa Ensemble - A Baroque Mosaic
The Seldon Plan - Making Circles
The Seldon Plan - The Collective Now
Trip Wamsley - Curve
Yongen - Moonrise
Yongen - Yello Haus
Zilla - Egg
There are another 34 albums that were fixed some time ago, that I'm trying to dig up a list of. Nonetheless, if you suspect you hear a pop at the beginning of an album, you can always redownload if you want to make sure, since all the defects that I have been able to detect have now been repaired.
Posted by: John from Magnatune | September 28, 2013 at 10:59 AM
re: Namely, are you planning to replace incorrectly ripped music in Magnatune library? What if master CD is already lost?
Hi Linulin, I updated this blog entry this morning, to indicate that yes, all CDs with defects at Magnatune have been fixed, either by hand or by obtaining new, better master CDs.
re: Can those of us who downloaded corrupted albums count on free replacement? What if I'm no longer Magnatune subscriber? (I remember reporting the issue in the past, most likely via web feedback form, but got no reply.)
If you're a member, yes. If you're not a member, simplest is to sign up for a 7 day free trial, download the replacements and then cancel.
My apologies for the fact that you didn't get a reply when you reported it. Chances are we knew about the problem, and I was in the process of writing a program to automate finding problems (rather than just reripping CDs from user complaints, which wouldn't be as complete and accurate).
re: P.S.
I'm surprised that the artists did not double-check the quality of their distributed songs... Otherwise, they would have immediately noticed the glitches.
Yes, that'd be nice, but not all musicians are technical, some are poor and buy really cheap CDR media, and many are quite rushed with the rest of their lives and burn CDs at the highest speed.
-john
Posted by: John from Magnatune | September 28, 2013 at 11:04 AM
Hi John,
good that you investigated and fixed this problem. While re-downloading some albums i think i found a problem: The album "Future Forever" by Ion now only has 3 songs, it used to be 11 songs long. Maybe a problem with the new re-encoding?
Posted by: Marc | September 30, 2013 at 11:35 AM
Awesome! But why do you use Sourceforge? That page really hurts to use. I've taken the liberty and made a GitHub mirror.[1] Will add cmake files, Linux and cross compiled Windows binaries later. Doing it right one can compile this simple software without depending on cygwin.
I also removed several warnings (unused and uninitialized variables - I initialized them with 0) and converted the README to markdown.
[1] https://github.com/panzi/ripcheck
Posted by: panzi | October 03, 2013 at 09:38 PM
Thanks Panzi, for running with the source and improving it.
Removing the Cygwin dependency for Windows users would be a good thing.
And your nicely formatted version of my documentation is very pretty!
-john
Posted by: John from Magnatune | October 04, 2013 at 12:48 AM
Btw. I fixed the GitHub repo so it shows you (John) as the initial committer, making blame work correctly: https://github.com/panzi/ripcheck/blame/master/ripcheckc.c
Posted by: panzi | October 04, 2013 at 02:07 PM
Here you read 4 bytes:
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L146
But then you only use 2 of them:
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L152
Is this correct or a bug?
Posted by: panzi | October 04, 2013 at 03:49 PM
And this debug code moves the file pointer, so if debug == 1 the program behaves differently:
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L115
Posted by: panzi | October 04, 2013 at 03:55 PM
Also fgetc returns int, not char. It returns bytes as unsigned bytes and EOF (-1) on error/file end. In several places in the source the result of fgetc is written into a char.
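To illustrate the point, a minimal C sketch of reading one 16-bit little-endian sample while keeping the fgetc results in int variables might look like this. The helper name read_sample16 is hypothetical and this is not the code in the repository.

```c
#include <stdint.h>
#include <stdio.h>

/* Read one little-endian signed 16-bit sample from fp.
 * Returns 1 on success, 0 on end of file or read error. */
static int read_sample16(FILE *fp, int16_t *out)
{
    int lo = fgetc(fp);   /* int, so EOF stays distinct from the byte 0xFF */
    int hi = fgetc(fp);
    if (lo == EOF || hi == EOF)
        return 0;

    int v = (hi << 8) | lo;      /* 0..65535 */
    if (v >= 0x8000)
        v -= 0x10000;            /* map to the signed range -32768..32767 */
    *out = (int16_t)v;
    return 1;
}
```

Had lo and hi been declared as char, a legitimate 0xFF byte could compare equal to EOF (where char is signed), or EOF could never be detected at all (where char is unsigned).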
Posted by: panzi | October 04, 2013 at 04:23 PM
And is the duration even correctly calculated? I think that can't be right. It ignores the sample rate and actually just uses the data size of the "RIFF" block. If anything, shouldn't it use the size of the "data" block and use sampling_rate instead of the hard coded 44100.0? Also why use floating point math here and not 44100L?
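A sketch of the calculation suggested here, in integer math and using the "data" chunk size together with the format fields, could look like the following. The function name wav_duration_ms and its parameters are hypothetical; the values are assumed to have been read from the "fmt " and "data" chunks already.

```c
#include <stdint.h>

/* Duration in whole milliseconds, computed from the size of the "data" chunk
 * and the format fields. Returns 0 if the format fields are unusable. */
static uint64_t wav_duration_ms(uint32_t data_bytes, uint32_t sample_rate,
                                uint16_t channels, uint16_t bits_per_sample)
{
    uint64_t bytes_per_second =
        (uint64_t)sample_rate * channels * (bits_per_sample / 8u);
    if (bytes_per_second == 0)
        return 0;
    return (uint64_t)data_bytes * 1000u / bytes_per_second;
}
```

For 44.1 kHz, 16-bit stereo, bytes_per_second is 176400, so a data chunk of 176400 bytes comes out as exactly 1000 milliseconds.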
Posted by: panzi | October 04, 2013 at 04:42 PM
Also these lines are wrong:
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L163
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L169
It should be this in both cases:
poploc = count;
Posted by: panzi | October 04, 2013 at 09:32 PM
Hi Panzi -- in response to your comments
1) currently the app is hard coded to 44.1k WAV files at 16 bits
2) the app reads 4 bytes, which represent left and right channels at 16 bits each
3) the app currently only scans the left channel, and ignores the right channel. It'd be better to be checking both channels, but for my uses (CD rip errors) I found that rip errors showed up in both left and right channels. However, if you want to improve the program to test both left and right, that'd be great.
4) it sounds like you've identified several possible bugs with fgetc
5) I believe the duration is correct for the kind of data I'm looking at, which is plain WAV files that don't have multiple blocks, but it's also likely what you're proposing is the right way to do it.
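For anyone who wants to extend the check to both channels, points 2 and 3 can be illustrated with a short C sketch that splits the interleaved 4-byte frames into separate left and right arrays. The names le_s16 and deinterleave_stereo16 are hypothetical, and this is not how ripcheckc.c is currently structured.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian signed 16-bit value from two bytes. */
static int16_t le_s16(const unsigned char *p)
{
    int v = p[0] | (p[1] << 8);
    return (int16_t)(v >= 0x8000 ? v - 0x10000 : v);
}

/* Split interleaved 16-bit stereo frames (L, R, L, R, ...) into two arrays.
 * Each frame is 4 bytes: 2 for the left sample, then 2 for the right. */
static void deinterleave_stereo16(const unsigned char *bytes, size_t frames,
                                  int16_t *left, int16_t *right)
{
    for (size_t i = 0; i < frames; i++) {
        left[i]  = le_s16(bytes + 4 * i);
        right[i] = le_s16(bytes + 4 * i + 2);
    }
}
```

Once the channels are separated, the same tests can be run on each array in turn.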
Which is why I open sourced this all... so that clever people like you could improve what I'd started!
-john
Posted by: John from Magnatune | October 05, 2013 at 01:44 AM
re: good that you investigated and fixed this problem. While re-downloading some albums i think i found a problem: The album "Future Forever" by Ion now only has 3 songs, it used to be 11 songs long. Maybe a problem with the new re-encoding?
Hi Marc, sorry for the delay, but I wanted to get down to the root cause of what caused the Ion album to lose 8 tracks, and both fix the bug and fix any other albums that might have the problem.
Well, it turned out 3 albums at Magnatune lost some songs when they had their pops removed. I've fixed those 3 albums (which includes the Ion album you mentioned). I found those missing tracks by comparing all the songs on Magnatune post-defect-fixing to a list of songs pre-defect-fixing.
And I found the bug that caused the problem, which happens when I redo an already existing album, but with slightly different file names for some tracks. Some "tidying up" occurs which removes files that look like they aren't part of the album, and that accidentally deleted the new mp3/wavs that were similar-but-different from the old ones.
-john
Posted by: John from Magnatune | October 05, 2013 at 05:11 AM
I'm currently basically rewriting the C version so that it scans all channels, supports any sample and bit rate and I make all the limits configurable (what is silence, min. value of a pop etc.). Every now and then I like to do low-level stuff. To check if what I do is correct, I'd need some example test files that have the kinds of problems it should find. Can you provide any such files? It won't be before tomorrow that I can test it anyway. Maybe even next week.
Posted by: panzi | October 05, 2013 at 08:49 AM
Ah and I guess here it should be x1 != 0 instead of x != 0:
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L176
If it's x1 != 0 it will ignore patches of total silence, if it's x != 0 it ignores dupes before a drop. Which one did you mean?
Hmm, and the same for these lines?
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L177
https://github.com/panzi/ripcheck/blob/master/ripcheckc.c#L179
These two lines overrule the first line anyway.
Posted by: panzi | October 05, 2013 at 10:11 PM
I should have linked a certain commit, because I just pushed tiny formatting changes and now all the referenced lines are moved down by 3 or so.
Posted by: panzi | October 05, 2013 at 10:35 PM
I picked the first album on your list of previously corrupted albums (Altri Stromenti's Italian Music) and pulled down the flac archive again. I then compared it with my previous copy, extracting the wav files from the flac files to eliminate any differences in the compression settings or metadata. To my surprise no two .wav files were identical, or even had the same length; all were slightly different. Did this particular album have that many errors, or did you recreate the new version from an entirely different source? Just curious.
Posted by: Phil | November 09, 2013 at 07:32 PM
Phil, with that album, there was a pop in the silent part before each song started, and it appeared in the CDROM master that we have. I used an audio editor to trim out the dozen or so samples that were the pop. That's why every song now has a slightly different length.
This seemed like the best approach, since the defect was in the master, and this edit made no change to the musical portion of each piece.
-john
Posted by: John from Magnatune | November 10, 2013 at 03:43 AM
Thanks. BTW, one way I've introduced pops at the beginnings of songs is giving wav files to software expecting raw PCM; the 'pop' is the wav file header. You might look for that as a special case.
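A check for that special case could be as simple as the C sketch below, which looks for the RIFF/WAVE signature at the start of what is supposed to be raw PCM. The function name starts_with_wav_header is hypothetical; nothing like it exists in ripcheck as far as I know.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if a supposedly raw PCM buffer actually begins with a RIFF/WAVE
 * header, i.e. "RIFF" at offset 0 and "WAVE" at offset 8. */
static int starts_with_wav_header(const unsigned char *pcm, size_t len)
{
    return len >= 12
        && memcmp(pcm, "RIFF", 4) == 0
        && memcmp(pcm + 8, "WAVE", 4) == 0;
}
```

Read as 16-bit little-endian samples, the ASCII bytes 'R' and 'I' alone decode to 18770, which is loud enough to be heard as a pop.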
Posted by: Phil | November 12, 2013 at 01:41 AM
I can see this is a really hard problem. You can use a few heuristics, but without true error detection it'll never be 100% reliable.
I just ran your program on a ripped CD track in my library (The Doobie Brothers' Minute by Minute) with a very audible CD skip. It reported the track was fine.
If a skip is a repeat you might be able to detect it by autocorrelation, but not a skip forward. A program run by an ordinary user could compare the number of samples to the fingerprint in the various metadata databases, but that won't work for a new album you're releasing yourself!
This is really hard.
Posted by: Phil | November 12, 2013 at 04:37 PM
A bit off-topic.
Just re-downloaded Magnatune Compilation: The art of persuasion (which I highly recommend) and found out something funny in the artwork. The compilation is sub-titled "a sultry maGAnatune mix" X-DD
That said, I feel like leaving it as it is now ;-)
Posted by: Martà | March 29, 2014 at 07:47 AM