Digital audio

A digital audio file intended for preservation should preferably be encoded and stored in an uncompressed format to ensure the highest possible sound quality.

Below, we list several file formats, but many other formats may exist. Contact your local research data support service for advice on which file formats are appropriate for long-term preservation and sharing of the type of research data you are working with.

Also, consider the rights associated with the recording, especially in interview situations where an individual's privacy may be protected by various laws. For example, see the page on GDPR and personal data at SND.se.

Important characteristics

Uncompressed audio files are often very large, so compression may be necessary. To preserve the highest quality, it is important to use a lossless compression codec.
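To give a sense of scale, the size of uncompressed audio can be estimated from its sample rate, bit depth, number of channels, and duration. The following is a minimal sketch; the function name is hypothetical, and the estimate covers only the raw audio data, not file headers or embedded metadata.

```python
def uncompressed_size_bytes(sample_rate_hz: int, bit_depth: int,
                            channels: int, duration_s: float) -> int:
    """Estimate the size of raw (uncompressed PCM) audio data in bytes.

    File headers and embedded metadata are not included.
    """
    bytes_per_sample = bit_depth // 8
    return int(sample_rate_hz * bytes_per_sample * channels * duration_s)


# One hour of stereo audio at 44.1 kHz and 16 bits per sample.
size = uncompressed_size_bytes(44_100, 16, 2, 3600)
print(f"{size / 1_000_000:.0f} MB")  # 635 MB
```

A lossless codec reduces this size without discarding audio data; how much it saves depends on the content of the recording.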

If you need to convert a file from one format to another, there are several key characteristics of the audio file you should be aware of to minimize the risk of information loss. It is also essential that technical metadata that describe the data file are accessible.

Duration: The length of the audio file, expressed in Timecoded Character Format (TCF).
Bit Depth: Indicates how many bits of information are stored per sample (e.g., 16 or 24 bits). A higher bit depth gives a greater dynamic range and is therefore a measure of audio quality.
Sample Rate: Indicates the number of samples per second. Sample rate is typically expressed in hertz; 44.1 kHz, for example, is a common sample rate. Like Bit Depth, it is an indicator of file quality.
Channels: Describes the number of audio channels used to carry the sound to the speakers: Mono (one channel), Stereo (two channels), or Surround (more than two channels, typically 5.1 or 7.1). If each channel has been assigned a channel number, the data creator can specify where the sound from different channels should come from, known as channel assignment. For example, in a stereo recording of an interview, the track with the interviewer can be assigned to channel 1 and heard in the left speaker, while the interviewee’s track can be assigned to channel 2 and heard in the right speaker.
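The characteristics listed above can often be read directly from the file, which makes it possible to compare them before and after a conversion. The following is a minimal sketch using Python's standard-library wave module, which handles only uncompressed PCM WAV files; other formats, such as FLAC, require a third-party tool. The function names and dictionary keys are hypothetical, and the hours:minutes:seconds.milliseconds notation is an assumption about how the duration should be written out.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic technical metadata from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def format_timecode(seconds: float) -> str:
    """Format a duration as HH:MM:SS.mmm (assumed notation)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"
```

Running describe_wav on both the source file and the converted file, and checking that the values match, is one simple way to detect unintended changes such as a reduced bit depth or a stereo recording mixed down to mono.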

What to preserve?

The original files should be preserved in their unprocessed state, but the final product, i.e., the processed audio file, should also be preserved. The reason for preserving both the original and the processed file is that the latter is likely to lack audio data that is present in the original file.

It is also important to preserve metadata and documentation of how the file was created and processed, such as the recording location, date, and the individuals involved. Metadata can be embedded in the file or stored separately.
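As an illustration of storing metadata separately, the sketch below writes a JSON file next to the audio file. This is only one possible approach under stated assumptions: the function name, the ".json" naming convention, and all field names and values are hypothetical, and a repository may require a specific metadata standard instead.

```python
import json
from pathlib import Path


def write_sidecar(audio_path: str, metadata: dict) -> Path:
    """Write metadata to a JSON file stored alongside the audio file.

    The sidecar keeps the full audio file name (e.g. interview01.wav.json)
    so that files differing only in extension do not collide.
    """
    sidecar = Path(f"{audio_path}.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar


# Illustrative values only; the field names are not a required schema.
example_metadata = {
    "recording_location": "Example City",
    "recording_date": "2024-01-15",
    "participants": ["Interviewer A", "Interviewee B"],
    "processing_notes": "Noise reduction applied to the processed copy.",
}
```

Calling write_sidecar("interview01.wav", example_metadata) would create interview01.wav.json in the same directory as the audio file.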

Recommended file formats for sharing

Since many of the most common audio recording file formats are widely supported, it is rarely necessary to convert recorded material to another format. However, it is worth considering how the files should be best preserved for the long term.

Waveform Audio (.wav)
Broadcast Wave Format (.bwf)
Audio Interchange File Format (.aif, .aiff)
Free Lossless Audio Codec (.flac)
Matroska (.mka)
MPEG-1, MPEG-2 (.mpg, .mpeg, …)

MPEG-1 Audio Layer III (.mp3), Advanced Audio Coding (.aac), and Ogg Vorbis (.ogg) are file formats that use destructive (lossy) compression, which should be avoided when creating audio files if possible. However, if the research material was originally created in one of these formats, the files can still be shared and are easily reused by other researchers.

Apple Lossless Audio Codec (ALAC) enables lossless compression but is mainly supported within Apple's ecosystem, with more limited support elsewhere, and should therefore be avoided for sharing and long-term preservation.

For more information on file formats for audio, see the ARIADNE guide Digital Audio: A guide to good practice. The guides have been developed by SND and translated into English in cooperation with the EU-funded infrastructure ARIADNE. ARIADNE is responsible for updating the English guides and keeping them accessible.