
Audio Codecs

Codecs for Encoding/Decoding Music Formats

This page describes various audio codecs and provides links to resources that would be useful to a developer wanting to add support for that format to Rockbox.

Overview of Audio CODECs

The basic format of an audio file on a computer is a Wave (.WAV) file. This contains uncompressed PCM audio, so a 4-minute song at CD quality (16-bit stereo at 44.1kHz) will be about 40MB in size. Audio CODECs are programs that reduce this filesize and can be split into two main categories - "lossy" and "lossless".
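The 40MB figure follows directly from the PCM parameters. As a rough sketch (the function name and defaults are illustrative only, and the small WAV header is ignored):

```python
def pcm_size_bytes(duration_s, sample_rate_hz=44100, channels=2, bits_per_sample=16):
    """Size of the raw PCM payload in bytes, ignoring the WAV header."""
    bytes_per_frame = channels * bits_per_sample // 8  # one sample for every channel
    return duration_s * sample_rate_hz * bytes_per_frame


size = pcm_size_bytes(4 * 60)  # a 4-minute song at CD quality
print(size, "bytes =", round(size / 2**20, 1), "MiB")  # 42336000 bytes = 40.4 MiB
```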

The Hydrogenaudio wiki contains more information about the codecs discussed here.

Lossy CODECs

A "lossy" codec (e.g. MP3, OGG Vorbis, AAC) uses knowledge of human hearing to discard as much of the original audio signal as possible, whilst keeping the result sounding as close as possible to the original. These codecs typically achieve a filesize of 10%-20% of the original.

Sound Codecs in Rockbox

NOTE: The following list is currently incomplete - the intention is to add a wiki page for each codec supported in Rockbox.

Development discussion

Format Decoder(s) Encoder(s)
MP3 MAD - Helix

From what I can tell from the website, the Helix decoder is MP3 only (i.e. no layer-I or layer-II support), is written in C++, and is licensed under the Real Networks Public Source License (RPSL). For those reasons, and the fact that MAD is tried and tested, I think we should stick with MAD -- DaveChapman

The fixed int Helix MP3 decoder is not written in C++; it's written in C. RPSL is an OSI license and GPL compatible. It is tried and tested - Motorola use it in their phones. -- AlastairS

The RPSL is most definitely not GPL compatible. They do list it as a compatible license, but that more or less just means that the GPL is RPSL compatible, rather than the other way round. Which is of course a huge joke. The license even has a note to that effect. -- JonasHaeggqvist

Another option is Stephane TAVENARD's MPEGDEC library as ported to the Coldfire here.
I have set this up to work with Rockbox, but the gains weren't as great as we hoped. I have no further time/desire to work on it, but if anyone feels like picking it up, it's available here. -- ThomJohansen

Moved to EncoderDiscussionMP3
OGG/Vorbis Vorbis is a free format and is supported by the original iriver and iAudio firmwares. The Rockbox codec is based on Tremor, which is a BSD-licensed integer implementation of the decoder. More information is available on xiph.org, and a snapshot of the Subversion repository can be downloaded here. There's also a lowmem branch of Tremor which may be useful on devices with little memory; see this mailing list post for details. Encoder: also freely available under the BSD license, but not in integer form.
MPC libmusepack: portable Musepack decoder library (including a fixed-point mode) by Peter Pawlowski (the foobar2000 developer, AKA "zZzZzZz" on http://www.hydrogenaudio.org ), Kuniklo and others. The library is under the Modified BSD license; here are the library's C sources. For further information see the dedicated HA thread or the musepack.net forum. Some other details are on HA (HERE). #MPC on irc.musepack.net is the place for "support". "you will find that there are problems with seeking" Open Source (LGPLd), but floating point only.
AAC (and HE-AAC???) FAAD2 (Free Advanced Audio Decoder version 2, with HE-AAC support) from http://www.audiocoding.com is licensed under the GPL (unlike the outdated FAAD, which is under the LGPL and doesn't support HE-AAC decoding). FAAD(2) is both an MPEG-2 and an MPEG-4 AAC decoder.

NOTE: There is currently some controversy with an apparent GPL-incompatible change to the FAAD2 license - see this thread at HA for details (especially comment #35)

RealNetworks released an open source fixed point HE-AAC decoder. Note: the license doesn't appear to be GPL compatible - any opinions?

DanHollis: The Podzilla guys seem to think its GPL compatible. More details HERE

RichardOBrien: The RCSL/RPSL (RealNetworks Public Source License) under which the AAC decoder was released, at Helix Community, is compatible with the GPL but not the other way around.

PatrickSchuetz: I agree with RichardOBrien, for more details: HERE

For MP4 parsing, FAAD's mp4ff library would be nice, but it uses malloc rather excessively and the code in CVS for ALAC seems to work, with some minor changes for AAC support. Probably best to roll our own.

Another tricky issue is AAC gapless playback - a discussion of the subject is here

FAAC (Free Advanced Audio Coder) from http://www.audiocoding.com is under the "Lesser GPL". FAAC doesn't provide HE-AAC encoding, but it is both an MPEG-2 and an MPEG-4 AAC encoder. Note that FAAC is also the name of the whole SourceForge project (including FAAC, FAAD and FAAD2), so don't confuse the two (more details HERE).

No integer version available.

WMA FFmpeg contains a WMA decoder, and Rockbox currently uses a version of this modified to use fixed point math.

Real and Microsoft signed a deal to release open source code to play Windows Media in Helix later this year. No word yet on if this will be GPL compatible or even integer. We will just have to wait and see. -- JonathanHull

As of March 09, 2007, the FFmpeg project has a WMA encoder. -- BlakeJohnson - 19 Oct 2007
A/52 (aka AC3) liba52 is a GPL'ed implementation with an integer-only mode that would run without problems on the iRiver's hardware.

AC3 is the most common audio format for DVDs, so support for this format would allow you to rip the audio from a DVD and play it directly on your DAP without re-encoding. An obvious next step (if technically possible) would be "AC3 pass-thru" via the optical digital output to a standalone AC3 surround decoder.

ffmpeg contains a GPL'd ac3 encoder (ac3enc.c)

Quality is very, very suboptimal - RobertoAmorim

Speex Speex is a codec geared towards speech compression. Rockbox uses libspeex, which does decoding completely in fixed point. Almost all of the encoder is now available in fixed point too, so encoding in Rockbox might be feasible at some quality/complexity settings.
RealAudio (Cook) RealAudio is a container format and a range of different codecs can be used. However, the most common is the "cook" codec. An open source (LGPL'd) implementation exists in ffmpeg CVS. But it's floating point. ???
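Many of the entries above turn on whether a decoder is available in integer (fixed-point) form or only in floating point. As a rough illustration of what "fixed point" means, the sketch below represents fractional values as scaled integers and multiplies them using only integer operations. The Q15 format and all function names are illustrative assumptions, not taken from any of the libraries mentioned.

```python
Q = 15  # number of fractional bits (Q15); an illustrative choice


def to_fixed(x):
    """Convert a float to a Q15 integer."""
    return int(round(x * (1 << Q)))


def to_float(a):
    """Convert a Q15 integer back to a float (for inspection only)."""
    return a / (1 << Q)


def fixed_mul(a, b):
    """Multiply two Q15 values using integer arithmetic only.

    The raw product carries 2*Q fractional bits, so shift right by Q.
    """
    return (a * b) >> Q


def apply_gain(samples, gain_q15):
    """Scale integer PCM samples by a Q15 gain without using floats."""
    return [(s * gain_q15) >> Q for s in samples]


half = to_fixed(0.5)
print(to_float(fixed_mul(half, half)))  # 0.25
print(apply_gain([1000, -1000], half))  # [500, -500]
```

A real decoder would also have to deal with overflow and rounding, which this sketch leaves out.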

Lossless CODECs

A "lossless" codec (e.g. FLAC) performs the same function as "winzip" - i.e. it compresses an audio file without discarding any of the information. These codecs typically achieve a filesize of 50%-60% of the original filesize, but the audio playback will be bit-for-bit identical to the original file.
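The "winzip" comparison can be tried directly. The sketch below compresses a synthetic PCM signal with zlib and checks that decompression returns exactly the same bytes. zlib is a general-purpose compressor standing in for an audio codec here, so the achieved ratio says nothing about FLAC and friends; the test signal and function names are made up for illustration.

```python
import math
import struct
import zlib


def make_sine_pcm(seconds=1, rate=44100, freq=440.0, amplitude=10000):
    """Generate 16-bit little-endian mono PCM for a sine tone (synthetic test data)."""
    count = seconds * rate
    samples = [int(amplitude * math.sin(2 * math.pi * freq * n / rate))
               for n in range(count)]
    return struct.pack("<%dh" % count, *samples)


def lossless_roundtrip(pcm):
    """Compress and decompress; a lossless scheme must return identical bytes."""
    packed = zlib.compress(pcm, 9)
    return packed, zlib.decompress(packed)


pcm = make_sine_pcm()
packed, restored = lossless_roundtrip(pcm)
print("original:", len(pcm), "bytes, compressed:", len(packed), "bytes")
print("bit-for-bit identical:", restored == pcm)
```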

Format Decoder(s) Encoder(s)
WAVE (.wav) ".wav" is the de-facto container format for storing uncompressed PCM audio, but may also contain compressed data (the most common formats are pcm, adpcm, alaw, mulaw, and dvi_adpcm). Obviously, no encoding or decoding needs to be done on PCM data (although byte-swapping may be needed, since WAV files are little-endian), but the other formats use various compression methods, hence the need for a codec. Details, Official specs, Sample WAV files. Encoder: as decoder.
AIFF Audio Interchange File Format (AIFF) is the Apple version of WAV (developed jointly by Apple and SGI), and is the standard container for uncompressed PCM audio on a Mac. AIFF-C is a similar format containing compressed audio. A description of the format is available from Apple. Encoder: as decoder.
FLAC FLAC - integer only FLAC

According to this topic and the changelog, FLAC 1.1.2 has a compiler define (FLAC__INTEGER_ONLY_LIBRARY) that builds the whole library (decoder, encoder, etc.) using integer arithmetic only. The remaining problem is the use of 64-bit ints in the code, but it appears that someone is working on that.

Shorten (.shn) Shorten source is available from this website, but the license isn't GPL compatible and including it would be a major hassle. See discussion in the irc logs from Feb 13th 2005.

FFmpeg has a LGPL decoder.

To be completed
Wavpack (.wv) Licensed under the Modified BSD license. Does both lossy and lossless encoding.
Monkey's Audio (.ape) Monkey's Audio 3.99 SDK. For further information look at the Monkey's Audio developer page. The Monkey's Audio license is claimed to be "open source compatible" (DanielStenberg: I fail to see how it is GPL compatible. It adds further restrictions and the GPL doesn't allow that.)

jmac, a java implementation of Monkey's Audio is also an option (under LGPL).

The codec is heavily x86-centric with lots of x86 assembly to speed up parts of the code - particularly a neural network. Unless it's very heavily optimized for 68K and PortalPlayer, it won't run real-time. And some compression modes (Extra high, Insane) probably won't run no matter how much you optimize it - RobertoAmorim

Monkey's Audio 3.99 SDK.
ALAC Apple Lossless Audio Codec. http://craz.net/programs/itunes/alac.html missing
TTA True Audio: http://tta.sf.net/ and http://en.wikipedia.org/wiki/TTA_%28codec%29 GPL
MPEG-4 Lossless MPEG-4 Audio Lossless Coding (ALS) was standardised (Press Release) in December 2005. The specification does not yet appear to be publicly available, but a reference encoder and decoder are available. The reference source is C++, but could relatively easily be converted to C. However, the license is not GPL compatible. See decoder
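The WAVE entry above notes that PCM data needs no decoding but may need byte-swapping, because WAV samples are stored little-endian. A minimal sketch of both steps for 16-bit samples follows; the function names are hypothetical and the input is assumed to contain a whole number of 16-bit samples.

```python
import struct


def le16_to_samples(data):
    """Interpret bytes as signed 16-bit little-endian samples."""
    count = len(data) // 2
    return list(struct.unpack("<%dh" % count, data[:count * 2]))


def swap16(data):
    """Swap the two bytes of every 16-bit sample (little-endian <-> big-endian).

    Assumes len(data) is even.
    """
    out = bytearray(len(data))
    out[0::2] = data[1::2]  # high bytes move to the front of each pair
    out[1::2] = data[0::2]  # low bytes move to the back
    return bytes(out)


raw = b"\x01\x00\xff\xff"                 # samples 1 and -1, little-endian
print(le16_to_samples(raw))               # [1, -1]
print(struct.unpack(">2h", swap16(raw)))  # (1, -1)
```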

Other CODECs

There are other audio "CODECs" that don't fit into the "lossy" or "lossless" categories above. These are different because the source for the files was not a .WAV file, but rather they are formats used by music composers to store computer-generated music.

Format Decoder(s) Encoder(s)
SID The sid format is music from Commodore 64 games and other productions. I don't know much about the format, but a good starting point would probably be sidplay2 which includes a standalone library. I don't know whether or not this is floating point.

A thing to note is that sid files have no notion of playing time. They are simply programs that go on forever, either with music that loops or silence. Traditionally this has been fixed by looking up playtimes in a database of file-md5 hashes which includes more or less every song available on the net.

Furthermore, each file may contain many subtunes.

DanHollis: sidplay2 appears to be completely integerized.

MartinArver: Unfortunately, libsidplay, which is the lib for sidplay2, is written in C++. I have been looking at an older version of libsidplay (1.36.59), which has fixed-point math. But as it is written in C++, it seems we would have to build the toolchain with C++ enabled for this to compile.

Encoding not possible.
SPC The SPC format is music from Super Nintendo (Super Famicom in Japan) games and other productions. There are a multitude of players for the SPC format, most of them listed here. As with most sound emulator formats the sound is not absolutely accurate, but the SNESamp Winamp plugin is generally regarded as the one which represents the sound best. I'm not sure if SNESamp is floating point (ChrisRobinson: or if the nature of the Winamp plugin is indeed useful for converting - I'm no programmer), but if not there are multiple other players.

SNESamp actually uses the SNESapu emulator for SPC decoding. It is indeed the most accurate SPC700 APU emulator publicly available. Unfortunately, all emulation code is written in x86 assembly (NASM) - RobertoAmorim

Unlike the sid format SPCs are single songs, which have ID666 tag support, and every SPC file is 64kb in size. However, the more popular RSN format acts more like a sid file and contains all songs from a game (RSNs are simply RAR files which contain SPCs [which have a 90% average compression rate] named to RSN). Some players support RSN, others I believe do not.

Encoding not possible.
PSF The PSF format is music from Sony Playstation 1 games. The decoders are fixed-point. There is C source code from a GPL Linux XMMS plugin known as SexyPSF. I've looked at the code but am totally clueless as to how it works. It's available there for anyone willing to tinker with it. Neill Corlett's PSF Central also has a lot of PSF-related information as well as the actual PSF spec paper. I would love to see this format make it into Rockbox in the future. There are also a few other decoders out there, but SexyPSF is the only one I could find a source link to. I suppose some emulator decoders could work somehow too? Encoding not possible.
Tracker formats This is actually a whole range of more or less similar sequencer formats. There are a few opensource libraries available, which all support a lot of formats.

Mikmod is for some reason often used, but I've been told that it lacks support for many features in some formats. This is a C library.

Modplug (the project also releases libmodplug) should have better support and it seems that it has very limited use of floats. However, libmodplug is in C++. People talking about Mikmod vs. Modplug.

Another option might be XMP which is C.

There's also DUMB which is said to be an accurate module player library (also C).

Any thoughts on which has the best support? Modplug is probably not feasible.

Encoding not possible.
NSF, NSFe This is for the sound tracks from the original Nintendo Entertainment System (NES). The NSF file is actually an NES ROM, with all data not related to audio stripped out. The sound track for pretty much any 8-bit Nintendo (not Super Nintendo) game is available online. The NSFE file format is a bit better. Both are playable. Encoding not possible.
MIDI Trying out MIDI, and it says you have no patchset? Download here.

I have written a simple MIDI player for Linux in C++, and am now rewriting it in C in hopes of making it work with Rockbox. Currently the C port is able to allocate memory for the file, decode it, sequence events, and interpret them enough to play the file using the sound card and a very simple sinewave-based synthesizer I threw together. I am hoping to find an adlib emulation engine that I can import into this thing, and possibly a wavetable engine if I can find good patches.

It plays sound and at this state, it may actually work on the iRiver if someone ports the sound output routine in pctest.c to write to the iRiver DSP.

I guess the MIDI codec would have to be more of a plug-in, as it loads the entire file at once and then plays it from memory...

You may want to look for the Gravis UltraSound patches; they are floating around and used by software MIDI players such as TiMidity++

I have looked at TiMidity++ as well as the music engine used by ScummVM... Does anyone know of a good description of the Gravis UltraSound patch format? Those patches store a good deal of information, such as waveform looping, envelope, etc.. but I cannot find a guide that explains the fields in the file. Any ideas?

There seems to be downloadable Gravis patch info here.

All right.. playback, looping, interpolation (No more ghetto lowpass filter!), drums, panning, pitch wheel and all that work fine. I have just added envelope support. It works but could probably use some more exhaustive testing. I don't know how well envelope stuff will work on the target, given the amount of extra work it puts on the processor compared with the difference it makes to the output. I guess at this point the code needs to be built for and tested on the target.. but I don't really know what kind of functions it uses for file I/O, etc. Maybe someone can help me with that.. someone who actually has an iriver, etc.

Updated synth sample here. This plugin needs a separate soundset to work. This is available here. Extract its contents into the /.rockbox directory. Warning: file is around 22MB in size.

The plugin can play back MIDI at 22kHz in realtime on Coldfire-based targets, but is still not realtime on PP-based targets.

Encoding not possible.
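The SID entry above mentions that playing times are traditionally looked up in a database keyed by an MD5 hash of the file. A rough sketch of that idea is below. The database is a plain dictionary invented for illustration, and hashing the whole file is an assumption; a real song-length database may hash a different part of the file and store one length per subtune.

```python
import hashlib


def song_length_seconds(file_bytes, length_db, default=180):
    """Look up a tune's playing time by the MD5 digest of its file contents.

    length_db is a hypothetical mapping of hex digest -> length in seconds.
    Tunes missing from the database fall back to a default length.
    """
    digest = hashlib.md5(file_bytes).hexdigest()
    return length_db.get(digest, default)


# Made-up data standing in for a real .sid file and a real database
fake_sid = b"not a real SID file"
db = {hashlib.md5(fake_sid).hexdigest(): 95}

print(song_length_seconds(fake_sid, db))        # 95
print(song_length_seconds(b"unknown tune", db))  # 180
```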

Current status

Last updated: 19 October 2007

You might want to check CodecPerformanceComparison for detailed testing of codec performance on various platforms.

| Codec | Status in Rockbox | Plays | Seeks | Realtime on iriver and Cowon | Realtime on iPod (4th Gen and later), Sansa and H10 | Realtime on iPod (1st, 2nd, 3rd Gen) | Realtime on Gigabeat |

Lossy CODECs
| Mpeg-audio | Works very efficiently, but there's still room for more optimisation on ARM. | DONE | DONE | DONE | DONE | DONE | DONE |
| Ogg/vorbis | Fairly efficient code is in SVN. | DONE | DONE | DONE | DONE | DONE | DONE |
| MPC | Works very efficiently. | DONE | DONE | DONE | DONE | DONE | DONE |
| A/52 (AC3) | Works very efficiently. Supports downmixing for playback of 5.1 streams in stereo. | DONE | DONE | DONE | DONE | NO | DONE |
| AAC (MP4) | Decoder using libfaad (from FAAD2) checked into SVN. Now running in realtime on the h120 and ipod targets, but could still do with further optimisation to reduce the boost level. On PP5002 the margin is really thin; even sw tone controls may kill realtime. | DONE | DONE | DONE | DONE | DONE | DONE |
| WMA | Runs fairly efficiently with more potential for optimization. | DONE | DONE | DONE | DONE | DONE | DONE |
| ADX | Added on 25 Sep 2006. | DONE |  | DONE | DONE | DONE | DONE |
| Speex | Runs efficiently with more potential for optimization. | DONE | DONE | DONE | DONE | DONE | DONE |

Lossless CODECs
| WAV | Improved WAV codec added in September 2005 supporting most types of WAV files. | DONE | DONE | DONE | DONE | DONE | DONE |
| AIFF | Added on 1 Feb 2006. Some limitations - see the forum post. | DONE | DONE | DONE | DONE | DONE | DONE |
| FLAC | A new decoder based on the ffmpeg FLAC decoder was committed on 26 October 2005, replacing the previous libFLAC based implementation. This decoder has now been optimised for use in Rockbox and tests have shown that it decodes faster than realtime with the iriver's processor clocked at a constant 34MHz. Seeking is now fully implemented using routines adapted from libFLAC. | DONE | DONE | DONE | DONE | DONE | DONE |
| ALAC | Runs fairly efficiently, but more optimisations (e.g. using EMAC) are possible. | DONE | DONE | DONE | DONE |  | DONE |
| Wavpack | Code in SVN runs very efficiently, both encoding and decoding. | DONE | DONE | DONE | DONE | DONE | DONE |
| Shorten | Decoder from ffmpeg project in SVN. No seeking support. | DONE |  | DONE | DONE |  | DONE |
| Monkey's Audio | Support was added on 05 June 2007. -c1000 to -c3000 decode in realtime on Gigabeat, only -c1000 decodes in realtime on Coldfire (H1x0/H3x0 and X5/M5/M3) targets; still too slow on PortalPlayer (iPod/Sansa/H10). | DONE | DONE | DONE | NO | NO | DONE |

Other CODECs
| SID | Added on 18 Jul 2006. Works very well on all supported targets. | DONE |  | DONE | DONE | DONE | DONE |
| MOD | A working patch is in the tracker. FS#8680 | DONE |  | DONE | DONE | DONE | DONE |
| NSF,NSFE | Added to SVN on 25 Jan 2007. Works very well on all supported targets. | DONE |  | DONE | DONE | Unknown | DONE |
| SPC | Added to SVN on 14 Feb 2007. Plays well on all supported targets. | DONE |  | DONE | DONE | DONE | DONE |
| MIDI | Code with some optimizations in SVN, currently in a plugin. | DONE |  | DONE |  |  | DONE |
Note: "Realtime" means that the codec is able to decode a file at least as fast as it needs to be played (i.e. a one-minute file is decoded in at most one minute). Codecs should be a good deal faster than this, though, to allow for buffering, crossfading etc.
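The realtime criterion can be written as a simple ratio of audio duration to decode time. The sketch below is only an illustration of the definition; the function names and the headroom parameter are assumptions and are not how CodecPerformanceComparison measures things.

```python
def realtime_factor(audio_seconds, decode_seconds):
    """Seconds of audio decoded per second of decoding time (1.0 = exactly realtime)."""
    return audio_seconds / decode_seconds


def is_realtime(audio_seconds, decode_seconds, headroom=1.0):
    """True if decoding keeps up with playback.

    A headroom above 1.0 demands spare capacity for buffering,
    crossfading and similar work.
    """
    return realtime_factor(audio_seconds, decode_seconds) >= headroom


print(is_realtime(60, 60))                # True: a one-minute file decoded in one minute
print(is_realtime(60, 75))                # False: decoding is slower than playback
print(is_realtime(60, 50, headroom=1.5))  # False: realtime, but without the requested margin
```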

r148 - 18 Apr 2008 - 19:42:11 - JensArnold

Copyright © 1999-2008 by the contributing authors.