BedroomLAN

Tilde~A: Alexios' Homepage

Renaming MP3s

Let's say you have a large collection of incorrectly labelled, or downright unlabelled MP3 files arranged in directories by artist. Each artist directory contains one subdirectory per album, with untagged files under them. You need these files ID3-tagged with some basic information, namely the artist and album.

Solution

Here's a quick and dirty solution. You can put this in a script, or you can paste it on the command line. The original version was much more condensed — this one has one statement per line for clarity.

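find SOME_DIRECTORY -type f -name '*mp3' |
gawk -- '
BEGIN {
    FS = "/";                        # split each path on slashes
    print "#!/bin/sh"
}
{
    gsub ("\"", "\\\"", $0);         # escape double quotes
    gsub ("\140", "\\\140", $0);     # escape back-ticks
    file = $0;                       # keep the real path, underscores intact
    gsub ("_", " ", $0);             # underscores become spaces in the tags
    printf ("id3v2 -a \"%s\" -A \"%s\" \"%s\"\n", $(NF-2), $(NF-1), file);
}' > /tmp/mp3_script.sh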

This assumes you store your MP3s under directory SOME_DIRECTORY as SOME_DIRECTORY/.../Artist/Album/. Change the placeholder ‘SOME_DIRECTORY’ accordingly.

If you invoke find SOME_DIRECTORY/ -type f -name '*mp3', you should get lines with at least two slashes (three path components) each. Based on the original problem, this solution assumes this layout. Only the last two directory levels are considered, so if SOME_DIRECTORY is actually /media/mp3 and you try to process a file /media/mp3/file.mp3, it'll get tagged with artist ‘media’ and album ‘mp3’. Probably not what you want.
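To spot such misplaced files before tagging anything, you can preview the artist and album pairs the script would derive. This is only a sketch: the temporary tree and the names in it are made up for the demonstration, and plain awk is enough here since nothing gawk-specific is used.

```shell
# Hypothetical demo tree; point "$root" at your own SOME_DIRECTORY instead.
root=$(mktemp -d)
mkdir -p "$root/Some_Artist/First_Album"
touch "$root/Some_Artist/First_Album/01_intro.mp3" "$root/stray.mp3"

# Print each distinct "artist / album" pair, taken from the last two
# directory levels of every path, exactly as the tagging script reads them.
preview=$(find "$root" -type f -name '*mp3' |
    awk -F/ 'NF >= 3 { print $(NF-2) " / " $(NF-1) }' | sort -u)
printf '%s\n' "$preview"

rm -r "$root"    # remove the demo tree again
```

The stray file shows up with a bogus pair built from the temporary directory's own path, which is the kind of line to look for in the output for your real collection.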

The script ends up in /tmp/mp3_script.sh. Go through it (back up your MP3 files first!), and if all seems well, do:

chmod +x /tmp/mp3_script.sh
/tmp/mp3_script.sh

...and pray.
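For a little less praying, the shell can at least check the generated script for syntax errors without running any of it. A minimal sketch, using a made-up one-line script in place of /tmp/mp3_script.sh:

```shell
# Hypothetical stand-in for the generated script.
script=$(mktemp)
printf '%s\n' '#!/bin/sh' \
    'id3v2 -a "Artist" -A "Album" "Artist/Album/track.mp3"' > "$script"

# sh -n parses the script but executes nothing, so no file is touched.
if sh -n "$script"; then check=ok; else check=broken; fi
echo "$check"

rm -f "$script"
```

This catches things like an unbalanced quote that slipped through. It says nothing about whether the tags themselves are right, so reading through the script is still worth it.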

Operation

The command prints out a new script made up of invocations of the MP3 tagging tool id3v2, one per file. Each invocation sets the album (with -A) and artist (with -a) using the name of the directory containing the MP3 (album name) and the name of the directory containing that (artist name).

You may, of course, modify the printf() invocation to generate a different command (this script uses id3v2) with appropriate command line arguments.
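As one possible variation, the printf() could also set the song title from the file name. This sketch assumes your id3v2 accepts -t for the title, and the path is invented for the example; quoting and underscore handling are left out to keep it short.

```shell
# Hypothetical path used only for illustration.
path='music/Some Artist/Some Album/Opening Song.mp3'

cmd=$(printf '%s\n' "$path" | awk -F/ '{
    title = $NF;                     # last field is the file name
    sub (/\.[^.]*$/, "", title);     # drop the extension
    printf ("id3v2 -a \"%s\" -A \"%s\" -t \"%s\" \"%s\"\n", $(NF-2), $(NF-1), title, $0);
}')
echo "$cmd"
```

This only prints the command. As with the main script, nothing is tagged until you run the generated output yourself.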

The embedded awk script that does most of the work splits paths into components by using / as the field separator. It then outputs an id3v2 invocation, setting the album to the value of the penultimate field (path component), and the artist to the field before that.
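To see the splitting on its own, you can feed awk a single path and print the pieces it picks out. The path below is made up for the example.

```shell
# Hypothetical path used only for illustration.
path='music/Some Artist/Some Album/01 track.mp3'

fields=$(printf '%s\n' "$path" |
    awk -F/ '{ printf "NF=%d artist=%s album=%s\n", NF, $(NF-2), $(NF-1) }')
echo "$fields"
# NF=4 artist=Some Artist album=Some Album
```

NF counts the components, so $(NF-1) is always the directory holding the file and $(NF-2) the one above it, however deep the path is.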

The command escapes double quotes and back-ticks to make them safe for shell commands and takes the liberty of changing underscores to spaces in tags. If you don't like that last one, omit the statement gsub("_"," ",$0); (line 11).
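The escaping can be tried in isolation on a sample string. This sketch reuses the two gsub() calls on an invented name containing both kinds of character.

```shell
# Hypothetical name containing double quotes and back-ticks.
name='Say "Hi" `now`'

escaped=$(printf '%s\n' "$name" | awk '{
    gsub ("\"", "\\\"", $0);         # put a backslash before each double quote
    gsub ("\140", "\\\140", $0);     # same for each back-tick (octal 140)
    print
}')
echo "$escaped"
# Say \"Hi\" \`now\`
```

Other characters that are special inside double quotes, such as $ and the backslash itself, are not covered by these two calls, so file names containing them still need a look by hand.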

Originally published here as a Livejournal comment.