Revisiting the get_tags() function

The get_tags() function follows the same logic we used for our EXIF plugin. Like any good programmer, we copied that script and made a few modifications to fit ID3 metadata. In the get_tags() function, we first create our CSV headers on line 69. These headers list every key our dictionary might contain, in the order we want the columns to appear in our CSV output:

059 def get_tags(filename):
060     """
061     The get_tags function extracts the ID3 metadata from the data
062     object.
063     :param filename: the path and name to the data object.
064     :return: tags and headers, tags is a dictionary containing ID3
065     metadata and headers are the order of keys for the CSV output.
066     """
067
068     # Set up CSV headers
069     header = ['Path', 'Name', 'Size', 'Filesystem CTime',
070         'Filesystem MTime', 'Title', 'Subtitle', 'Artist', 'Album',
071         'Album/Artist', 'Length (Sec)', 'Year', 'Category',
072         'Track Number', 'Comments', 'Publisher', 'Bitrate',
073         'Sample Rate', 'Encoding', 'Channels', 'Audio Layer']
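To see why the order of this list matters, here is a minimal sketch of how such a list could drive the column order of a CSV file. It assumes the output is written with the standard library's csv.DictWriter, which this listing does not show, and the sample tags values are invented for illustration:

```python
import csv
import io

# Shortened header list for illustration; the full list is on lines 69-73.
header = ['Path', 'Name', 'Title', 'Artist']

# Hypothetical tags dictionary: keys arrive in any order, and 'Title' is missing.
tags = {'Artist': 'The Artist', 'Name': 'song.mp3', 'Path': '/music/song.mp3'}

buffer = io.StringIO()
# fieldnames fixes the column order; restval fills in keys the file lacked.
writer = csv.DictWriter(buffer, fieldnames=header, restval='')
writer.writeheader()
writer.writerow(tags)

csv_text = buffer.getvalue()
print(csv_text)
```

Because the writer takes its column order from the list rather than from the dictionary, a file that lacks a given tag simply produces an empty cell in that column.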

On line 74, we create our tags dictionary and populate it with some filesystem metadata in the same manner as the EXIF plugin, as follows:

074     tags = {}
075     tags['Path'] = filename
076     tags['Name'] = os.path.basename(filename)
077     tags['Size'] = processors.utility.convert_size(
078         os.path.getsize(filename))
079     tags['Filesystem CTime'] = strftime('%m/%d/%Y %H:%M:%S',
080         gmtime(os.path.getctime(filename)))
081     tags['Filesystem MTime'] = strftime('%m/%d/%Y %H:%M:%S',
082         gmtime(os.path.getmtime(filename)))
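The timestamp handling can be tried on its own. The following sketch wraps the strftime() and gmtime() calls from lines 79 through 82 in a helper named format_timestamp, a hypothetical name introduced here, and runs it against a throwaway temporary file. Note that gmtime() yields UTC, and that os.path.getctime() reports the metadata change time on Unix-like systems but the creation time on Windows:

```python
import os
import tempfile
from time import gmtime, strftime


def format_timestamp(epoch_seconds):
    """Format seconds since the epoch as a UTC string, as lines 79-82 do."""
    return strftime('%m/%d/%Y %H:%M:%S', gmtime(epoch_seconds))


# Create a throwaway file so the example has something to inspect.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b'placeholder')
    sample_path = handle.name

try:
    ctime_text = format_timestamp(os.path.getctime(sample_path))
    mtime_text = format_timestamp(os.path.getmtime(sample_path))
    size_bytes = os.path.getsize(sample_path)
    print(ctime_text, mtime_text, size_bytes)
finally:
    # Clean up the temporary file whether or not the calls above succeeded.
    os.remove(sample_path)
```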

Mutagen has two classes that we can use to extract metadata from MP3 files. The first class, MP3, exposes standard stream metadata stored in MP3 files, such as the bitrate, channels, and length in seconds, as attributes of its info object. First, we need to create an MP3 object, which is accomplished on line 85 by calling mp3.MP3() with the filename. On lines 86 and 87, we record the TENC (encoded by) frame as the encoding value when the file carries one. Next, we can read the info.bitrate attribute, for example, to get the bitrate of the MP3 file. We store these values in our tags dictionary in lines 88 through 92, as follows:

084     # MP3 Specific metadata
085     audio = mp3.MP3(filename)
086     if 'TENC' in audio.keys():
087         tags['Encoding'] = audio['TENC'][0]
088     tags['Bitrate'] = audio.info.bitrate
089     tags['Channels'] = audio.info.channels
090     tags['Audio Layer'] = audio.info.layer
091     tags['Length (Sec)'] = audio.info.length
092     tags['Sample Rate'] = audio.info.sample_rate

The second class, ID3, extracts ID3 tags from an MP3 file. We first need to create an ID3 object by calling id3.ID3() with the filename. This returns a dictionary-like object whose keys are ID3 frame names. Sound familiar? This is what we were presented with in the previous plugin. The only difference is that the values in the dictionary are stored in a slightly different format:

{'TPE1': TPE1(encoding=0, text=[u'The Artist']),...} 

To access the value, The Artist, we index the frame as if it were a list and take the element at index zero, as line 97 does with id['TPE1'][0].
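The indexing can be tried without an MP3 file. The sketch below uses a small stand-in class, FakeTextFrame, which is hypothetical and mimics only the one behavior that matters here: indexing the frame returns an item from its text list. It is not Mutagen's implementation:

```python
class FakeTextFrame:
    """Hypothetical stand-in for a text frame such as TPE1 (not Mutagen code)."""

    def __init__(self, encoding, text):
        self.encoding = encoding
        self.text = text

    def __getitem__(self, index):
        # Indexing the frame indexes its list of text values.
        return self.text[index]


# Dictionary shaped like the example above, built from the stand-in class.
frames = {'TPE1': FakeTextFrame(encoding=0, text=['The Artist'])}

tags = {}
if 'TPE1' in frames:
    # The zeroth element is the first (and here only) text value.
    tags['Artist'] = frames['TPE1'][0]

print(tags)
```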

In a similar manner, we look for each of our tags of interest and, when one is present, store the first element of its value in the tags dictionary. At the end of this process, we return the tags and header objects to id3_parser(), which in turn returns them to the metadata_parser.py script:

094     # ID3 embedded metadata tags
095     id = id3.ID3(filename)
096     if 'TPE1' in id.keys():
097         tags['Artist'] = id['TPE1'][0]
098     if 'TRCK' in id.keys():
099         tags['Track Number'] = id['TRCK'][0]
100     if 'TIT3' in id.keys():
101         tags['Subtitle'] = id['TIT3'][0]
102     if 'COMM::eng' in id.keys():
103         tags['Comments'] = id['COMM::eng'][0]
104     if 'TDRC' in id.keys():
105         tags['Year'] = id['TDRC'][0]
106     if 'TALB' in id.keys():
107         tags['Album'] = id['TALB'][0]
108     if 'TIT2' in id.keys():
109         tags['Title'] = id['TIT2'][0]
110     if 'TCON' in id.keys():
111         tags['Category'] = id['TCON'][0]
112     if 'TPE2' in id.keys():
113         tags['Album/Artist'] = id['TPE2'][0]
114     if 'TPUB' in id.keys():
115         tags['Publisher'] = id['TPUB'][0]
116
117     return tags, header
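The chain of if statements repeats one pattern ten times, so it could also be driven by a lookup table. The following is a sketch of that alternative, not the approach the listing takes. FRAME_TO_COLUMN and copy_frames are hypothetical names, and a plain dictionary of lists stands in for the ID3 object so that the example runs without an MP3 file:

```python
# Hypothetical lookup table: ID3 frame name -> CSV column, from lines 96-115.
FRAME_TO_COLUMN = {
    'TPE1': 'Artist',
    'TRCK': 'Track Number',
    'TIT3': 'Subtitle',
    'COMM::eng': 'Comments',
    'TDRC': 'Year',
    'TALB': 'Album',
    'TIT2': 'Title',
    'TCON': 'Category',
    'TPE2': 'Album/Artist',
    'TPUB': 'Publisher',
}


def copy_frames(frames, tags):
    """Copy the first value of each known frame into tags; skip absent frames."""
    for frame_name, column in FRAME_TO_COLUMN.items():
        if frame_name in frames:
            tags[column] = frames[frame_name][0]
    return tags


# Stand-in for the ID3 object: each frame is reduced to its list of values.
sample_frames = {
    'TPE1': ['The Artist'],
    'TALB': ['The Album'],
    'TXXX': ['ignored'],  # not in the table, so it is not copied
}
result = copy_frames(sample_frames, {})
print(result)
```

With this layout, supporting another frame means adding one entry to the table instead of another if block.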