Audiometa

This documentation is currently available in English only.

Metadata Field Guide: Support and Handling

This document consolidates comprehensive metadata field handling across all supported audio formats (ID3v1, ID3v2, Vorbis, RIFF). It merges documentation on multiple values, genre handling, rating handling, track number handling, release date validation rules, and lyrics support into a single authoritative reference.

Note: For detailed information about each metadata format (history, structure, advantages, disadvantages), see the Metadata Formats Guide.

Note: For audio information (duration, bitrate, sample rate, channels, file size, format info, MD5 checksum), see the Audio Technical Info Guide.

Unified field schema and full-metadata API

The library exposes a stable, JSON-friendly description of every unified field for apps, CLIs, and documentation generators:

  • get_unified_metadata_field_schema() (exported from audiometa) returns a list of dicts, one per UnifiedMetadataKey, with:

    • id: UnifiedMetadataKey.value (e.g. "title", "album_artists")
    • label: Short English label (not localized)
    • multiple: Whether the field can hold multiple values
    • value_type: Coarse hint ("string", "strings", "integer", "number", "string_or_integer", etc.)
    • optional_value: Whether a scalar may be omitted (e.g. disc total)

    Implementation: audiometa.utils.unified_metadata_field_schema. Per-field detail: describe_unified_metadata_field(key).

  • get_supported_unified_metadata_field_ids(file) returns sorted id strings for unified fields that have a non-None write mapping in the file’s primary metadata format (the first format in the extension’s priority list). Fields with no writer for that format are omitted.

  • get_full_metadata(file) always includes:

    • unified_metadata_field_schema: Same list as get_unified_metadata_field_schema()
    • supported_unified_metadata_field_ids: Same list as get_supported_unified_metadata_field_ids(file)

The audiometa read command includes these keys in JSON and YAML output; table output does not print them (it only shows unified, technical, and format sections).
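
As a rough illustration of how an application might consume the schema, the sketch below filters a hand-written sample that mirrors the documented entry shape. The sample entries, the FieldSchemaEntry type, and the helper multi_value_field_ids are assumptions made for this example and are not part of the library; in real code the list would come from get_unified_metadata_field_schema().

```python
from typing import TypedDict


class FieldSchemaEntry(TypedDict):
    """Shape of one schema entry, following the keys documented above."""

    id: str
    label: str
    multiple: bool
    value_type: str
    optional_value: bool


# Hand-written sample for illustration only; not actual library output.
SAMPLE_SCHEMA: list[FieldSchemaEntry] = [
    {"id": "title", "label": "Title", "multiple": False,
     "value_type": "string", "optional_value": False},
    {"id": "album_artists", "label": "Album artists", "multiple": True,
     "value_type": "strings", "optional_value": False},
]


def multi_value_field_ids(schema: list[FieldSchemaEntry]) -> list[str]:
    """Return the sorted ids of fields that can hold multiple values."""
    return sorted(entry["id"] for entry in schema if entry["multiple"])


print(multi_value_field_ids(SAMPLE_SCHEMA))
```

A form generator could use the same idea to decide whether to render a single text input or a repeatable one for each field.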

Metadata Support by Format

The library supports a comprehensive set of metadata fields across different audio formats. The table below shows which fields are supported by each format:

| Field | ID3v1 | ID3v2 | Vorbis | RIFF | AudioMeta | CLI |
| --- | --- | --- | --- | --- | --- | --- |
| Text Encoding | ASCII | UTF-16/ISO (v2.3) + UTF-8 (v2.4) | UTF-8 | ASCII/UTF-8 | UTF-8 | |
| Max Text Length | 30 chars | ~8M chars | ~8M chars | ~1M chars | Format limit | |
| Date Time Formats | YYYY | YYYY+DDMM (v2.3), YYYY-MM-DD (v2.4) | YYYY-MM-DD | YYYY-MM-DD | ISO 8601 | |
| Title | TITLE (30) | TIT2 | TITLE | INAM | TITLE | --title |
| Artists | ARTIST (30) | TPE1 | ARTIST | IART | ARTISTS | --artist, multiple |
| Album | ALBUM (30) | TALB | ALBUM | IPRD | ALBUM | --album |
| Album Artists | | TPE2 | ALBUMARTIST | IAAR* | ALBUM_ARTISTS | --album-artist, multiple |
| Genres | GENRE (1#) | TCON | GENRE | IGNR | GENRES_NAMES | --genre, multiple |
| Release Date | YEAR (4) | TYER (4) + TDAT (4) (v2.3), TDRC (10) (v2.4) | DATE (10) | ICRD (10) | RELEASE_DATE | --year or --release-date |
| Track Number | TRACK (1#) (v1.1) | TRCK (0-255#) | TRACKNUMBER | IPRT* | TRACK_NUMBER | --track-number |
| Disc Number | | TPOS (0-255#) | DISCNUMBER | | DISC_NUMBER | --disc-number |
| Disc Total | | TPOS (0-255#) | DISCTOTAL | | DISC_TOTAL | --disc-total |
| Rating | | POPM (0-255#) | RATING (0-100#) | IRTD* (0-100#) | RATING | --rating |
| BPM | | TBPM (0-65535#) | BPM (0-65535#) | IBPM* | BPM | --bpm |
| Language | | TLAN (3) | LANGUAGE (3) | ILNG* (3) | LANGUAGE | --language |
| Composers | | TCOM | COMPOSER | ICMP | COMPOSERS | --composer, multiple |
| Publisher | | TPUB | ORGANIZATION | | PUBLISHER | --publisher |
| Copyright | | TCOP | COPYRIGHT | ICOP | COPYRIGHT | --copyright |
| Lyrics | | USLT | LYRICS* | | UNSYNCHRONIZED_LYRICS | --lyrics |
| Synchronized Lyrics | | SYLT | | | | |
| Comment | COMMENT (28) | COMM | COMMENT | ICMT | COMMENT | --comment |
| ReplayGain | | | REPLAYGAIN | | REPLAYGAIN | --replaygain |
| Archival Location | | TXXX | ARCHIVAL_LOCATION | | ARCHIVAL_LOCATION | --archival-location |
| ISRC | | TSRC | ISRC | ** (ISRC) | ISRC | --isrc |
| MusicBrainz Track ID | | UFID (owner=http://musicbrainz.org) | MUSICBRAINZ_TRACKID | MBID* | MUSICBRAINZ_TRACKID | --musicbrainz-trackid |
| MusicBrainz Artist ID | | TXXX (MusicBrainz Artist Id) | MUSICBRAINZ_ARTISTID | MBAR* | MUSICBRAINZ_ARTISTID | --musicbrainz-artistids, multiple |
| Description | | | DESCRIPTION* | ** (Description) | DESCRIPTION | --description |
| Originator | | | | ** (Originator) | ORIGINATOR | --originator |
| Originator Reference | | | | ** (OriginatorReference) | | |
| Origination Date | | | | ** (OriginationDate) | | |
| Origination Time | | | | ** (OriginationTime) | | |
| Time Reference | | | | ** (TimeReference) | | |
| Version | | | | ** (Version) | | |
| UMID | | | | ** (UMID) | | |
| Coding History | | | | ** (CodingHistory) | | |
| Loudness Value | | | | *** (LoudnessValue) | | |
| Loudness Range | | | | *** (LoudnessRange) | | |
| Max True Peak Level | | | | *** (MaxTruePeakLevel) | | |
| Max Momentary Loudness | | | | *** (MaxMomentaryLoudness) | | |
| Max Short Term Loudness | | | | *** (MaxShortTermLoudness) | | |

* Fields marked with asterisk (*) are supported in non-standard implementations.

** Fields marked with double asterisk (**) are Broadcast Wave Format (BWF) fields via the bext chunk. Currently only ISRC is exposed in unified metadata. Other BWF fields are available via raw metadata access. See the Metadata Formats Guide for details.

*** Fields marked with triple asterisk (***) are Broadcast Wave Format (BWF) v2 loudness metadata fields via the bext chunk. These fields are currently available via raw metadata access only, not through unified metadata. See the Metadata Formats Guide for details.

Multiple Values Handling

The library reads and writes multi-value fields using whichever representation each metadata format supports: separate field instances where the format allows them, and separator-based concatenation otherwise.

Semantic Classification

Fields are classified based on their intended use:

  • Semantically Multi-Value Fields: Fields that can logically contain multiple values (e.g., ARTISTS, GENRES_NAMES). They can be stored as multiple entries or concatenated values.
  • Semantically Single-Value Fields: Fields that are intended to hold a single value (e.g., TITLE, ALBUM). The library always returns only the first value for these fields.

Semantically Multi-Value Fields

The following fields are treated as semantically multi-value:

  • ARTISTS - Multiple artist names for the track
  • ALBUM_ARTISTS - Multiple album artist names
  • GENRES_NAMES - Multiple genre classifications
  • COMPOSERS - Multiple composer names
  • MUSICIANS - Multiple musician credits
  • CONDUCTORS - Multiple conductor names
  • ARRANGERS - Multiple arranger names
  • LYRICISTS - Multiple lyricist names
  • INVOLVED_PEOPLE - Multiple involved people credits
  • PERFORMERS - Multiple performer names

Ways to Handle Multiple Values

Metadata formats can represent multi-value fields in two ways:

Multiple Field Instances (Multi-Frame/Multi-Key)

Each value is stored as a separate instance of the same field or frame. Vorbis comments natively support this; ID3v2 and RIFF may contain multiple instances. ID3v1 does not.

Single Field with Separated Values (Separator-Based)

All values are stored in one field, separated by a character or delimiter (e.g., ;, /, ,, or a null byte for ID3v2.4).

Reading Semantically Multiple Values

AudioMeta follows a two-step process:

  1. Extract all field instances as found in the file for each format.
  2. If null-separator (ID3v2.4) is present, split on null bytes. Otherwise:
    • If multiple instances exist: return them as-is.
    • If a single instance exists: apply smart separator parsing using a priority list: //, \\, ;, \, /, ,.
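
The two-step reading process above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name split_multi_value is hypothetical, and trimming whitespace and dropping empty pieces are assumptions that the text above does not specify.

```python
# Separators in the priority order listed above: //, \\, ;, \, /, ,
SEPARATOR_PRIORITY = ["//", "\\\\", ";", "\\", "/", ","]


def split_multi_value(instances: list[str]) -> list[str]:
    """Turn the raw field instances of one format into a list of values."""
    # ID3v2.4 style: a null byte inside any instance means null-separated values.
    if any("\x00" in item for item in instances):
        parts = [p for item in instances for p in item.split("\x00")]
        return [p for p in parts if p]
    # Several instances (or none) are returned as found, without separator parsing.
    if len(instances) != 1:
        return list(instances)
    # One instance: split on the first separator from the priority list it contains.
    text = instances[0]
    for sep in SEPARATOR_PRIORITY:
        if sep in text:
            # Assumption: surrounding whitespace is trimmed and empty pieces dropped.
            return [p.strip() for p in text.split(sep) if p.strip()]
    return [text]
```

Because only the highest-priority separator present is used, "A // B / C" yields "A" and "B / C" in this sketch. A single-instance name that itself contains a separator, such as "AC/DC", would be split; storing values as multiple instances avoids that.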

Writing Semantically Multiple Values

Writing adapts to format capabilities:

| Format | Multi-value Writing Method |
| --- | --- |
| ID3v1 | Concatenated with chosen single-char separator |
| ID3v2.3 | Separator-based concatenation |
| ID3v2.4 | Null-separated values |
| RIFF | Separator-based concatenation |
| Vorbis | Multiple entries (one per value) |

Duplicate values are de-duplicated before writing. Empty strings and None values are filtered out; if all values are removed, the field is deleted.
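
The per-format writing strategies can be illustrated with a small sketch. The function name serialize_multi_value, the lowercase format labels, and the default ";" separator are assumptions made for this example; the library chooses its own separator and writes through its format-specific writers.

```python
from typing import Union


def serialize_multi_value(
    values: list[str], fmt: str, separator: str = ";"
) -> Union[str, list[str]]:
    """Return what would be stored for a multi-value field in the given format."""
    if fmt == "vorbis":
        # One entry per value.
        return list(values)
    if fmt == "id3v2.4":
        # A single text value with null-separated parts.
        return "\x00".join(values)
    if fmt in ("id3v1", "id3v2.3", "riff"):
        # A single concatenated value; the separator choice is an assumption here.
        return separator.join(values)
    raise ValueError(f"unknown format: {fmt}")
```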

Automatic Empty Value Filtering

The library automatically filters out empty strings and None values from list-type metadata fields before writing. If all values in a list are filtered out, the field is removed entirely (set to None).
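
A minimal sketch of the filtering and de-duplication described in this section follows. The helper clean_values is hypothetical; keeping the first occurrence of each duplicate and leaving whitespace-only strings untouched are assumptions, since the text does not state how the library treats them.

```python
from typing import Optional


def clean_values(values: list[Optional[str]]) -> Optional[list[str]]:
    """Drop None and empty strings, de-duplicate, and return None if nothing is left."""
    kept = [v for v in values if v is not None and v != ""]
    # dict.fromkeys keeps the first occurrence of each value, in order.
    unique = list(dict.fromkeys(kept))
    # An empty result means the field should be removed.
    return unique or None
```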

Genre Handling

AudioMeta provides comprehensive genre support across all audio formats, with intelligent handling of genre codes, multiple genres, and format-specific limitations. See the Genre Handling Guide.

Rating Handling

Rating is supported across multiple audio formats, with normalization in unified metadata. See the Rating Handling Guide.

Release Date Validation Rules

The RELEASE_DATE field accepts two date formats, plus the empty string:

Valid Formats:

  1. YYYY format (4 digits) - for year-only dates

    • Examples: "2024", "1900", "1970", "0000", "9999"
    • Use when you only know the year
  2. YYYY-MM-DD format (ISO-like format) - for full dates

    • Examples: "2024-01-01", "2024-12-31", "1900-01-01", "1970-06-15"
    • Month and day must be zero-padded (2 digits each)
    • Use when you have the complete date
  3. Empty string - allowed and represents no date

    • Example: ""

Invalid Formats:

The following formats will raise InvalidMetadataFieldFormatError:

  • Wrong separator: "2024/01/01", "2024.01.01", "2024_01_01", "2024 01 01"
  • Incomplete date: "2024-1-1", "2024-1-01", "2024-01-1"
  • Short year: "24", "202", "20"
  • Long year: "20245", "20245-01-01"
  • Non-numeric: "not-a-date", "2024-abc-01", "abcd-01-01"
  • Incomplete format: "2024-01", "2024-", "-01-01", "2024-01-"

Examples:

from audiometa import update_metadata
from audiometa import UnifiedMetadataKey

# Valid: YYYY format
update_metadata("song.mp3", {UnifiedMetadataKey.RELEASE_DATE: "2024"})

# Valid: YYYY-MM-DD format
update_metadata("song.mp3", {UnifiedMetadataKey.RELEASE_DATE: "2024-01-01"})

# Valid: empty string
update_metadata("song.mp3", {UnifiedMetadataKey.RELEASE_DATE: ""})

# Invalid: wrong separator
update_metadata("song.mp3", {UnifiedMetadataKey.RELEASE_DATE: "2024/01/01"})
# Raises: InvalidMetadataFieldFormatError
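
To pre-check user input before calling update_metadata, the format rules above can be approximated with a regular expression. This sketch checks only the shape of the string; whether the library also rejects impossible months or days is not stated here, so treat that as an open question. The names RELEASE_DATE_PATTERN and is_valid_release_date are hypothetical.

```python
import re

# Four digits, optionally followed by -MM-DD with two digits each.
RELEASE_DATE_PATTERN = re.compile(r"[0-9]{4}(-[0-9]{2}-[0-9]{2})?")


def is_valid_release_date(value: str) -> bool:
    """Check only the shape of the string; month and day ranges are not checked."""
    return value == "" or RELEASE_DATE_PATTERN.fullmatch(value) is not None
```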

ISRC Validation Rules

The ISRC (International Standard Recording Code) field accepts two formats based on ISO 3901, plus the empty string:

Valid Formats:

  1. 12 alphanumeric characters (without hyphens) - compact format

    • Format: CCXXXYYNNNNN
    • Examples: "USRC17607839", "GBAYE0000001", "JPAB01234567"
    • CC = Country code (2 letters)
    • XXX = Registrant code (3 alphanumeric)
    • YY = Year of reference (2 digits)
    • NNNNN = Unique designation code (5 digits)
  2. 15 characters with hyphens - human-readable format

    • Format: CC-XXX-YY-NNNNN
    • Examples: "US-RC1-76-07839", "GB-AYE-00-00001", "JP-AB0-12-34567"
  3. Empty string - allowed and represents no ISRC

    • Example: ""

Invalid Formats:

The following formats will raise InvalidMetadataFieldFormatError:

  • Too short: "USRC1760783", "ABC", "U"
  • Too long: "USRC176078390", "USRC1760783901234"
  • Wrong hyphen positions: "USRC-17607839", "US-RC17607839"
  • Special characters: "USRC1760783!", "USRC@7607839", "USRC 7607839"
  • Wrong segment lengths in hyphenated format: "US-R-76-07839", "USA-RC1-76-07839"

Examples:

from audiometa import update_metadata
from audiometa import UnifiedMetadataKey

# Valid: 12-character format
update_metadata("song.mp3", {UnifiedMetadataKey.ISRC: "USRC17607839"})

# Valid: hyphenated format
update_metadata("song.mp3", {UnifiedMetadataKey.ISRC: "US-RC1-76-07839"})

# Valid: empty string
update_metadata("song.mp3", {UnifiedMetadataKey.ISRC: ""})

# Invalid: too short
update_metadata("song.mp3", {UnifiedMetadataKey.ISRC: "ABC"})
# Raises: InvalidMetadataFieldFormatError
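
The segment layout above can be expressed as two regular expressions. This is a sketch under stated assumptions: it enforces the per-segment character classes (letters for CC, digits for YY and NNNNN) and accepts uppercase only, which may be stricter than the library's own check. The names ISRC_COMPACT, ISRC_HYPHENATED, and is_valid_isrc are hypothetical.

```python
import re

# CC (2 letters), XXX (3 alphanumeric), YY (2 digits), NNNNN (5 digits).
ISRC_COMPACT = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}")
ISRC_HYPHENATED = re.compile(r"[A-Z]{2}-[A-Z0-9]{3}-[0-9]{2}-[0-9]{5}")


def is_valid_isrc(value: str) -> bool:
    """Accept the empty string, the 12-character form, or the hyphenated form."""
    if value == "":
        return True
    return bool(ISRC_COMPACT.fullmatch(value) or ISRC_HYPHENATED.fullmatch(value))
```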

MusicBrainz Track ID

The MusicBrainz Track ID (also known as the recording id in MusicBrainz) is a UUID that uniquely identifies the recording in MusicBrainz. See the dedicated guide: MusicBrainz Track ID

MusicBrainz Artist ID

The MusicBrainz Artist ID is a UUID that uniquely identifies the artist in MusicBrainz. See the dedicated guide: MusicBrainz Artist ID

Track Number Handling

The library handles different track number formats across audio metadata standards.

Track Number Formats by Format

  • ID3v1: Simple numeric string (stored in comment field since ID3v1.1), e.g., "5", "12"
  • ID3v2: Supports "track/total" format (e.g., "5/12", "99/99") or simple "track" format (e.g., "5")
  • Vorbis: Simple numeric string (or track/total where used), e.g., "5", "12"
  • RIFF: Simple numeric string, e.g., "5", "12"

Reading Track Number

The library returns track numbers as strings. Edge cases:

  • "5/" → Track number: "5/" (trailing slash preserved)
  • "/12" → Track number: None (no track number before slash)
  • "abc/def" → Track number: None (non-numeric values)
  • "" → Track number: None (empty string)
  • "5/12/15" → Track number: None (multiple slashes, invalid format)
  • "5-12" → Track number: "5-12" (different separator preserved)
  • "01" → Track number: "01" (leading zeros preserved)

Writing Track Number

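| Input Value | ID3v1 | ID3v2 | Vorbis | RIFF |
| --- | --- | --- | --- | --- |
| 5 (int) | "5" | "5" | "5" | "5" |
| "5" (str) | "5" | "5" | "5" | "5" |
| "5/12" | "5" | "5/12" | "5/12" | "5/12" |
| "99/99" | "99" | "99/99" | "99/99" | "99/99" |
| "1" | "1" | "1" | "1" | "1" |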

Notes:

  • ID3v1: Only supports track numbers (1-255), extracts the track number from formats like "5/12" and ignores the total
  • ID3v2: Supports full track/total format (e.g., "5/12") as per ID3v2 specification
  • Vorbis: Supports full track/total format through TRACKNUMBER field
  • RIFF: Track number written to INFO IPRT (see Track and disc numbers)

For detailed information, see Track and disc numbers.
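
The writing table can be reproduced with a short sketch. The function name track_number_for_format and the lowercase format labels are hypothetical, and raising ValueError for an out-of-range ID3v1 track is an assumption; the library may report such input differently.

```python
from typing import Union


def track_number_for_format(value: Union[int, str], fmt: str) -> str:
    """Return the string that would be written for a track number."""
    text = str(value)
    if fmt != "id3v1":
        # ID3v2, Vorbis and RIFF keep the value as given, including "track/total".
        return text
    # ID3v1 stores only the track number, so the total after "/" is dropped.
    track = int(text.split("/", 1)[0])
    if not 1 <= track <= 255:
        raise ValueError("ID3v1 track numbers must be in the range 1-255")
    return str(track)
```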

Disc Number Handling

The library provides two separate unified metadata fields for disc number:

  • DISC_NUMBER: Integer representing the current disc number (required)
  • DISC_TOTAL: Integer representing the total number of discs, or None if unknown (optional)

Format Support:

  • ID3v1: ✗ Not supported (format limitation)
  • ID3v2: TPOS frame - maps "disc/total" format to/from DISC_NUMBER and DISC_TOTAL; writer enforces 0–255; reader uses the same n / n/m / n-m rules as documented (see Track and disc numbers)
  • Vorbis: Separate DISCNUMBER and DISCTOTAL on write (unlimited range); read parses combined DISCNUMBER like ID3v2 TPOS, with explicit DISCTOTAL overriding the embedded total when valid (see Track and disc numbers)
  • RIFF: ✗ Not supported (format limitation)

For detailed information on disc number formats, limitations, reading/writing behavior, and examples, see Track and disc numbers.
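
As a rough sketch of the reading rules summarized above, the code below parses the n, n/m, and n-m forms and applies the Vorbis DISCTOTAL override. The function names are hypothetical, and returning None for unparseable input is an assumption; the authoritative rules are in Track and disc numbers.

```python
import re
from typing import Optional, Tuple

# "n", "n/m" or "n-m", digits only.
DISC_PATTERN = re.compile(r"([0-9]+)(?:[/-]([0-9]+))?")


def parse_disc(text: str) -> Optional[Tuple[int, Optional[int]]]:
    """Split a combined disc value into (disc number, disc total or None)."""
    match = DISC_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    number = int(match.group(1))
    total = int(match.group(2)) if match.group(2) is not None else None
    return number, total


def read_vorbis_disc(
    discnumber: str, disctotal: Optional[str] = None
) -> Optional[Tuple[int, Optional[int]]]:
    """Parse DISCNUMBER, letting a valid explicit DISCTOTAL override the embedded total."""
    parsed = parse_disc(discnumber)
    if parsed is None:
        return None
    number, total = parsed
    if disctotal is not None and re.fullmatch(r"[0-9]+", disctotal.strip()):
        total = int(disctotal.strip())
    return number, total
```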

Lyrics Support

Two types of lyrics are supported: synchronized lyrics (synchronized with music, for karaoke) and unsynchronized lyrics (plain text).

Synchronized Lyrics

Synchronized lyrics (SYLT frames in ID3v2) are not currently supported by the library. This is planned for future versions.

Unsynchronized Lyrics

Unsynchronized lyrics are supported differently across formats:

ID3v1 Unsynchronized Lyrics

ID3v1 does not support unsynchronized lyrics due to its limited structure.

ID3v2 Unsynchronized Lyrics

ID3v2 supports unsynchronized lyrics through the USLT (Unsynchronized Lyrics/Text transcription) frame. The library currently writes only a single USLT frame with default language code eng. Multi-language support is planned for future versions.

RIFF Unsynchronized Lyrics

RIFF INFO chunks support storing unsynchronized lyrics in the UNSYNCHRONIZED_LYRICS chunk. Language codes are not supported due to lack of standardization.

Vorbis Unsynchronized Lyrics

Vorbis comments support lyrics through the UNSYNCHRONIZED_LYRICS field. Language codes are not supported due to lack of standardization.

None vs Empty String Handling

The library handles None and empty string values differently across audio formats:

| Format | Setting to None | Setting to "" (empty string) | Read Back Result |
| --- | --- | --- | --- |
| ID3v2 (MP3) | Removes field completely | Removes field completely | None / None |
| Vorbis (FLAC) | Removes field completely | Creates field with empty content | None / "" |
| RIFF (WAV) | Removes field completely | Removes field completely | None / None |
| ID3v1 (MP3) | Supported | Supported | Legacy format with limitations |

Example

from audiometa import update_metadata, get_unified_metadata_field

# MP3 file - same behavior for None and empty string
update_metadata("song.mp3", {"title": None})
title = get_unified_metadata_field("song.mp3", "title")
print(title)  # Output: None (field removed)

# FLAC file - different behavior for None vs empty string
update_metadata("song.flac", {"title": None})
title = get_unified_metadata_field("song.flac", "title")
print(title)  # Output: None (field removed)

update_metadata("song.flac", {"title": ""})
title = get_unified_metadata_field("song.flac", "title")
print(title)  # Output: "" (field exists but empty)
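
Code that should behave the same for MP3, FLAC, and WAV files may want to treat an empty string read back from Vorbis like a missing field. The helper below is a hypothetical convenience for callers, not part of the library.

```python
from typing import Optional


def normalize_blank(value: Optional[str]) -> Optional[str]:
    """Map an empty string to None so all formats read back the same way."""
    return None if value == "" else value
```

Applied to the value returned by get_unified_metadata_field, this gives None for both the removed-field and the empty-field cases.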