
Audio Codecs


Introduction

The basic format for an audio file on a computer is a Wave (.WAV) file. It contains uncompressed PCM audio, so a 4-minute song at CD quality is about 40MB in size. Audio codecs (encoder-decoders) are programs that reduce this file size, and they fall into two main categories: "lossy" and "lossless". See the Hydrogenaudio wiki for more information about these and related terms.
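As a rough check of the 40MB figure, the size of uncompressed PCM is just sample rate x bytes per sample x channels x duration. The sketch below assumes the usual CD parameters (44.1 kHz, 16-bit, stereo); the function name is made up for illustration.

```python
# Uncompressed PCM size = sample rate x bytes per sample x channels x seconds.
SAMPLE_RATE_HZ = 44_100   # CD audio sample rate (assumed)
BYTES_PER_SAMPLE = 2      # 16-bit samples (assumed)
CHANNELS = 2              # stereo (assumed)


def pcm_size_bytes(seconds: float) -> int:
    """Return the size in bytes of raw CD-quality PCM of the given length."""
    return int(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * seconds)


size = pcm_size_bytes(4 * 60)  # a 4-minute song
print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

This gives 42,336,000 bytes of audio data, which is where the "about 40MB" estimate comes from (the WAV header adds only a few dozen bytes).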

This page describes various audio codecs, links to resources that would be useful to a developer wanting to add support for a format to Rockbox, and provides a chart detailing each codec's current support status in Rockbox.


Current Status

| Codec | Status in Rockbox | Plays | Seeks | Realtime on Coldfire (H100/H300/iAudio) | Realtime on PP502x (iPod 4G and later, Sansa, H10, m:robe 100) | Realtime on PP5002 (iPod 1st, 2nd, 3rd Gen) | Realtime on S3C2440 (Gigabeat F/X) | Realtime on i.MX31 (Gigabeat S) |
| Lossy Codecs | | | | | | | | |
| Mpeg-audio | Works very efficiently on Coldfire, and on both cores on PP but there's room for more optimization on ARM. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| Ogg/vorbis | Very efficient code is in Git. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| MPC | Supports SV7 and SV8. Works very efficiently. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| A/52 (AC3) | Works very efficiently. Supports downmixing for playback of 5.1 streams in stereo. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| AAC (MP4) | Decoder using libfaad (from FAAD2) checked into Git. Could still do with further optimization. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| HE-AAC (MP4) | Decoder using libfaad (from FAAD2) checked into Git. Could still do with further optimization. | DONE | DONE | | | | DONE | DONE |
| WMA | Runs very efficiently, although problems exist with a handful of files. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| WMA Pro | Runs very efficiently. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| ADX | Added on 25 Sep 2006 | DONE | | DONE | DONE | DONE | DONE | DONE |
| Speex | Runs efficiently with more potential for optimization | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| Cook (RealAudio) | Runs efficiently. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| ATRAC3 (RealAudio) | Runs efficiently, with optimizations for both ARM and Coldfire. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| AC3 (RealAudio) | Runs efficiently. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| AAC (RealAudio) | Runs efficiently. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| Opus | Runs on some targets like AMSv2 | | | | | | | |
| Lossless Codecs | | | | | | | | |
| WAV | Improved WAV codec added in September 2005 supporting most types of WAV files. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| AIFF | Added on 1 Feb 2006. Some limitations - see the forum post | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| FLAC | Decoder is based on the ffmpeg FLAC decoder. Seeking is implemented using routines adapted from libFLAC. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| ALAC | Runs fairly efficiently, but more optimisations (e.g. using EMAC) are possible. | DONE | DONE | DONE | DONE | | DONE | DONE |
| Wavpack | Code in Git runs very efficiently, both encoding and decoding. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| Shorten | Decoder from ffmpeg project in Git. No seeking support. | DONE | | DONE | DONE | | DONE | DONE |
| TTA | Could use more optimization. | DONE | DONE | DONE | DONE | ? | DONE | DONE |
| Monkey's Audio | Support was added on 05 June 2007. -c1000 to -c4000 decode in realtime on Gigabeat S, -c1000 to -c3000 decode in realtime on Gigabeat F and Coldfire, only -c1000 and -c2000 decode in realtime on PortalPlayer targets. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| Other Codecs | | | | | | | | |
| SID | Added on 18 Jul 2006. Works very well on all supported targets. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| MOD | Added to repository on 21 May 2008. Works very well on all supported targets. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| NSF, NSFE | Added to repository on 25 Jan 2007. Works very well on all supported targets. | DONE | DONE | DONE | DONE | Unknown | DONE | DONE |
| SPC | Added to repository on 14 Feb 2007. Plays well on all supported targets. Some reports of missing instrument patches. | DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| MIDI | Code with some optimizations in Git, currently in a plugin. Still not implemented as a proper codec. | DONE | ??? | DONE | ??? | ??? | DONE | DONE |
| GBS | There is a patch for GBS playback in the tracker; see FS #11906. Appears to run quite well on PP502x-based targets and the Gigabeats; needs testing on Coldfire and PP5002. | DONE | DONE | ??? | DONE | ??? | DONE | DONE |
| HES | There's a patch for HES playback in the tracker; see FS #11945. Needs additional testing before it can be committed. | DONE | DONE | ??? | ??? | ??? | ??? | ??? |
| SAP | Added on 26 Jul 2008. Tested on PortalPlayer 502X, Gigabeat and H120. | DONE | DONE | DONE | DONE | | DONE | DONE |
  • SID and NSF seeking uses subtracks instead of seconds. Each subtrack equals a second.
  • Realtime means that the codec is able to decode a file as fast as it needs to be played (i.e. a one-minute file is decoded in one minute). Codecs should be a good deal faster than this, though, to allow for buffering, crossfading etc.
  • You might want to check CodecPerformanceComparison for detailed testing of codec performance on various platforms.
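The "realtime" definition above boils down to a ratio of track length to decode time. A minimal sketch, with hypothetical function names and an arbitrary headroom parameter that is not a Rockbox setting:

```python
def realtime_factor(track_seconds: float, decode_seconds: float) -> float:
    """How many times faster than playback the codec decodes (1.0 = exactly realtime)."""
    if decode_seconds <= 0:
        raise ValueError("decode_seconds must be positive")
    return track_seconds / decode_seconds


def is_realtime(track_seconds: float, decode_seconds: float,
                headroom: float = 1.0) -> bool:
    """True if decoding keeps up with playback, with an optional safety margin.

    headroom is a made-up knob for this sketch: 1.0 means "barely realtime",
    larger values leave time for buffering, crossfading and so on.
    """
    return realtime_factor(track_seconds, decode_seconds) >= headroom


# A one-minute file decoded in 40 seconds runs at 1.5x realtime.
print(realtime_factor(60, 40))
```

A codec that only just reaches a factor of 1.0 plays back, but leaves no CPU time for anything else, which is why the note asks for a good deal more.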

Specific Sound Codec Pages

NOTE: The following list is currently incomplete - the intention is to add a wiki page for each codec supported by Rockbox.


Development discussion

Lossy Codecs

A "lossy" codec (e.g. MP3, OGG Vorbis, AAC) uses knowledge of human hearing to try and discard as much of the original audio signal as possible, whilst attempting to make the audio sound as close as possible to the original. These codecs typically achieve a filesize of 10%-20% of the original.
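To see where a figure like 10%-20% comes from, compare a lossy stream's bitrate with the bitrate of CD-quality PCM. The sketch assumes constant-bitrate files and ignores container overhead; the bitrates in the loop are just common examples, not values from this page.

```python
# CD-quality PCM: 44100 samples/s x 16 bits x 2 channels = 1411.2 kbit/s.
CD_PCM_KBPS = 44_100 * 16 * 2 / 1000


def size_fraction(bitrate_kbps: float) -> float:
    """Size of a constant-bitrate lossy file relative to the PCM original."""
    return bitrate_kbps / CD_PCM_KBPS


for kbps in (128, 192, 256):  # example bitrates (assumed)
    print(f"{kbps} kbit/s -> {size_fraction(kbps):.1%} of the PCM size")
```

These example bitrates come out at roughly 9% to 18% of the original size.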

Format | Decoder(s) | Encoder(s)
MP3 MAD - Helix

From what I can tell from the website, the Helix decoder is MP3 only (i.e. no layer-I or layer-II support), is written in C++, and is licensed under the Real Networks Public Source License (RPSL). For those reasons, and the fact that MAD is tried and tested, I think we should stick with MAD -- DaveChapman

The fixed-point Helix MP3 decoder is written in C, not C++. RPSL is an OSI license and GPL compatible. It is tried and tested - Motorola use it in their phones. -- AlastairS

The RPSL is most definitely not GPL compatible. They do list it as a compatible license, but that more or less just means that the GPL is RPSL compatible, rather than the other way round. Which is of course a huge joke. The license even has a note to that effect. -- JonasHaeggqvist

Another option is Stephane TAVENARD's MPEGDEC library as ported to the Coldfire here.
I have set this up to work with Rockbox, but the gains weren't as great as we hoped. I have no further time/desire to work on it, but if anyone feels like picking it up, it's available here. -- ThomJohansen

Moved to EncoderDiscussionMP3
OGG/Vorbis Vorbis is a free format and is supported by the original iriver and iAudio firmwares. The Rockbox codec is based on Tremor, which is a BSD-licensed integer implementation of the decoder. More information is available on xiph.org. A snapshot of the Subversion repository can be downloaded here. There's a lowmem branch of Tremor which may be good on devices with low memory. See this mailing list post for details. The encoder is also freely available under the BSD license, but not in integer form.
MPC

libmusepack: Portable Musepack decoder library v1.3.0 with SV7 and SV8 support. The library is under the Modified BSD license; here are the library's C sources.

For further information, see the dedicated HA thread (http://www.hydrogenaudio.org/forums/index.php?showtopic=21775&hl=mpc/) or the musepack.net forum (http://www.musepack.net/forum/viewtopic.php?t=137). You can find some other details on HA (http://www.hydrogenaudio.org/forums/index.php?showtopic=21775&st=25&p=259696&#).

Open Source (LGPLd), but floating point only
AAC (and HE-AAC???) FAAD2 (Free Advanced Audio Decoder version 2, with HE-AAC support) from http://www.audiocoding.com is licensed under the GPL (unlike the outdated FAAD version, which is under the LGPL and doesn't allow HE-AAC decoding). FAAD(2) decodes both MPEG-2 and MPEG-4 AAC.

NOTE: There is currently some controversy over an apparently GPL-incompatible change to the FAAD2 license - see this thread at HA (http://www.hydrogenaudio.org/forums/index.php?showtopic=35535) for details, especially comment #35 (http://www.hydrogenaudio.org/forums/index.php?showtopic=35535&view=findpost&p=314491).

RealNetworks released an open source fixed point HE-AAC decoder (http://www.hydrogenaudio.org/forums/index.php?showtopic=32051). Note: the license doesn't appear to be GPL compatible - any opinions?

DanHollis: The Podzilla guys seem to think it's GPL compatible. More details here: http://www.ipodlinux.org/blog/index.php?p=15

RichardOBrien: The RCSL/RPSL (RealNetworks Public Source License) under which the AAC decoder was released, at Helix Community, is compatible with the GPL, but not the other way around.

PatrickSchuetz: I agree with RichardOBrien, for more details: HERE

For MP4 parsing, FAAD's mp4ff library would be nice, but it uses malloc rather excessively and the code in CVS for ALAC seems to work, with some minor changes for AAC support. Probably best to roll our own.

Another tricky issue is AAC gapless playback - a discussion of the subject is here: http://www.hydrogenaudio.org/forums/index.php?showtopic=34989

DanLenski: The FAAD2 license no longer seems to have this issue. See this discussion on a Debian bug tracker (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=419339#65), which shows that as of the 2.6 release of FAAD2, the license text is clarified to show that FAAD2 falls fully under the GPL.

FAAC (Free Advanced Audio Coder) from http://www.audiocoding.com is under the "Lesser GPL". FAAC doesn't provide HE-AAC encoding, but it also encodes both MPEG-2 and MPEG-4 AAC. Note that FAAC is also the name of the whole SourceForge project (including FAAC, FAAD and FAAD2), so don't confuse the two (more details here: http://www.audiocoding.com/modules/wiki/?page=FAAC).

No integer version available.

WMA FFmpeg contains a WMA decoder (http://www1.mplayerhq.hu/cgi-bin/cvsweb.cgi/ffmpeg/libavcodec/wmadec.c?cvsroot=FFMpeg), and Rockbox currently uses a version of this modified to use fixed point math. As of March 09, 2007, the FFmpeg project has a WMA encoder. -- BlakeJohnson - 19 Oct 2007
A/52 (aka AC3) liba52 is a GPL'ed implementation with an integer-only mode that would run without problems on the iRiver's hardware.

AC3 is the most common audio format for DVDs, so support for this format would allow you to rip the audio from a DVD and play it directly on your DAP without re-encoding. An obvious next step (if technically possible) would be "AC3 pass-thru" via the optical digital output to a standalone AC3 surround decoder.

ffmpeg contains a GPL'd ac3 encoder (ac3enc.c)

Quality is very, very suboptimal - RobertoAmorim

Speex Speex is a codec geared towards speech compression. Rockbox uses libspeex, which does decoding completely in fixed point. Almost all of the encoder is now available in fixed point, and might be feasible for encoding in Rockbox at some quality/complexity settings.
RealAudio

RealAudio is a container format and a range of different codecs can be used. However, the most common is the "cook" codec.

Currently Rockbox supports playback of RealAudio files with any of the following codecs: cook, AAC, AC3 and ATRAC3.

???
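Several entries above (WMA, A/52, Speex) mention integer-only or fixed-point decoders, which matters on targets without floating-point hardware. The sketch below shows the general idea with a Q16.16 format; the format choice and helper names are assumptions for illustration and are not taken from any of the codecs listed.

```python
FRAC_BITS = 16            # Q16.16: 16 integer bits, 16 fractional bits (assumed)
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16 (only needed when preparing constants)."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert Q16.16 back to a float, for inspection only."""
    return q / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 numbers using integer operations only.

    The raw product carries 32 fractional bits, so it is shifted back by 16.
    In C the intermediate product would need a 64-bit type.
    """
    return (a * b) >> FRAC_BITS


gain = to_fixed(0.5)
sample = to_fixed(0.5)
print(to_float(fixed_mul(gain, sample)))
```

Additions and subtractions work directly on the integer values; only multiplications need the extra shift.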

Lossless Codecs

A "lossless" codec (e.g. FLAC) performs the same function as "winzip" - i.e. it compresses an audio file without discarding any of the information. These codecs typically achieve a filesize of 50%-60% of the original filesize, but the audio playback will be bit-for-bit identical to the original file.
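The "winzip" comparison can be tried directly: a general-purpose compressor round-trips PCM data bit-for-bit, just as a lossless audio codec does. This sketch uses zlib on a synthetic tone purely as a stand-in; it is not an audio codec, and the compression it achieves on real music will differ.

```python
import math
import struct
import zlib


def make_pcm(seconds: float = 1.0, rate: int = 8000, freq: float = 440.0) -> bytes:
    """Generate 16-bit little-endian mono PCM for a sine tone (test signal only)."""
    n = int(seconds * rate)
    samples = (int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate))
               for i in range(n))
    return b"".join(struct.pack("<h", s) for s in samples)


pcm = make_pcm()
packed = zlib.compress(pcm, 9)
restored = zlib.decompress(packed)

# The defining property of a lossless codec: output is identical to the input.
assert restored == pcm
print(f"original {len(pcm)} bytes, compressed {len(packed)} bytes")
```

Dedicated lossless audio codecs exploit the structure of audio signals, which is how they reach the ratios quoted above on real recordings.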

Format | Decoder(s) | Encoder(s)
WAVE (.wav) ".wav" is the de-facto container format for storing uncompressed PCM audio, but may contain compressed data (most common formats are pcm, adpcm, alaw, mulaw, and dvi_adpcm). Obviously, no encoding or decoding needs to be done on PCM data (but byte-swapping may be needed - WAV files are little-endian), but other formats use various compression methods - thus the need for a codec. Details, Official specs, Sample WAV files. As decoder.
AIFF Audio Interchange File Format (AIFF) is the Apple version of WAV (developed jointly by Apple and SGI), and is the standard container for uncompressed PCM audio on a Mac. AIFF-C is a similar format containing compressed audio. A description of the format is available from Apple As decoder.
FLAC FLAC - integer only FLAC

According to this topic (http://www.hydrogenaudio.org/forums/index.php?s=acf4dfb6550103bfab92e279bfbe379f&showtopic=31471) and the changelog, FLAC 1.1.2 has a compiler define (FLAC__INTEGER_ONLY_LIBRARY) that builds the whole library (decoder, encoder, etc.) using integers only. The remaining problem is the use of 64-bit ints in the code, but it appears that someone is working on that (http://www.neurosaudio.com/community/forum/topic.asp?TOPIC_ID=2212&whichpage=3).

Shorten (.shn) Shorten source is available from this website, but the license isn't GPL compatible and including it would be a major hassle. See discussion in the irc logs from Feb 13th 2005.

FFmpeg has a LGPL decoder.

To be completed
Wavpack (.wv) Licensed under the Modified BSD license. Does both lossy and lossless encoding.
Monkey's Audio (.ape) Monkey's Audio 3.99 SDK. For further information look at the Monkey's Audio developer page. The Monkey's Audio license is claimed to be "open source compatible" (DanielStenberg: I fail to see how it is GPL compatible. It adds further restrictions and the GPL doesn't allow that.)

jmac, a java implementation of Monkey's Audio is also an option (under LGPL).

The codec is heavily x86-centric with lots of x86 assembly to speed up parts of the code - particularly a neural network. Unless it's very heavily optimized for 68K and PortalPlayer, it won't run real-time. And some compression modes (Extra high, Insane) probably won't run no matter how much you optimize it - RobertoAmorim

Monkey's Audio 3.99 SDK.
ALAC Apple Lossless Audio Codec. http://craz.net/programs/itunes/alac.html ffmpeg, since revision 14849 (2008)
TTA True Audio: http://tta.sf.net/ and http://en.wikipedia.org/wiki/TTA_%28codec%29 The "hardware" decoder isn't GPL compatible, but the XMMS plugin appears to be nearly identical and is under the GPL. Code is already all integer. GPL
MPEG-4 Lossless MPEG-4 Audio Lossless Coding (ALS) was standardised (press release: http://www.acnnewswire.net/article.asp?Art_ID=31029&lang=) in December 2005. The specification does not yet appear to be publicly available, but a reference encoder and decoder are available. The reference source is C++, but could relatively easily be converted to C. However, the license is not GPL compatible. See decoder
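The WAVE entry above notes that .wav is a container with little-endian fields and that the data may be PCM or one of several compressed formats. A minimal sketch of reading the "fmt " chunk to find out which one a file holds; the helper names are hypothetical, and a real parser would handle more chunk types and malformed input.

```python
import io
import struct
import wave


def make_wav(frames: bytes, rate: int = 44100, channels: int = 2,
             width: int = 2) -> bytes:
    """Build a small PCM WAV file in memory with the standard library."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(width)
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


def parse_fmt(data: bytes) -> dict:
    """Walk the RIFF chunks and decode the 'fmt ' chunk (all fields little-endian)."""
    riff, _size, wave_id = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8)
            return {"format_tag": tag, "channels": channels,
                    "sample_rate": rate, "bits_per_sample": bits}
        pos += 8 + chunk_size + (chunk_size & 1)  # chunks are padded to even sizes
    raise ValueError("no fmt chunk found")


info = parse_fmt(make_wav(b"\x00" * 8))
print(info)  # format_tag 1 means uncompressed PCM
```

On a big-endian target such as Coldfire, the same fields (and 16-bit PCM samples) would need byte-swapping after being read.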

Other Codecs

There are other audio "Codecs" that don't fit into the "lossy" or "lossless" categories above. These are different because the source for the files was not a .WAV file, but rather they are formats used by music composers to store computer-generated music.

Format | Decoder(s) | Encoder(s)
SID The sid format is music from Commodore 64 games and other productions. I don't know much about the format, but a good starting point would probably be sidplay2 which includes a standalone library. I don't know whether or not this is floating point.

A thing to note is that sid files have no notion of playing time. They are simply programs that go on forever, either with music that loops or silence. Traditionally this has been fixed by looking up playtimes in a database of file-md5 hashes which includes more or less every song available on the net.

Furthermore, each file may contain many subtunes.

DanHollis: sidplay2 appears to be completely integerized.

MartinArver: Unfortunately, libsidplay, which is the lib for sidplay2, is written in C++. I have been looking at an older version of libsidplay (1.36.59), which uses fixed-point math. But, as it is written in C++, it seems we have to build the toolchain with C++ enabled for this to compile.

Encoding not possible.
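The playtime lookup described for SID files above can be sketched as a dictionary keyed by an MD5 digest. Everything here is illustrative: the database contents are invented, the default length is arbitrary, and hashing the whole file is an assumption (an actual song-length database may hash only parts of the file).

```python
import hashlib

DEFAULT_SECONDS = 180  # arbitrary fallback when a tune is not in the database


def file_md5(data: bytes) -> str:
    """MD5 hex digest of the file contents (assumed hashing scheme)."""
    return hashlib.md5(data).hexdigest()


def lookup_lengths(data: bytes, db: dict) -> list:
    """Return play times in seconds, one per subtune, or a default guess."""
    return db.get(file_md5(data), [DEFAULT_SECONDS])


# Hypothetical database entry: one file containing two subtunes.
tune = b"not a real SID file"
db = {file_md5(tune): [95, 120]}

print(lookup_lengths(tune, db))
print(lookup_lengths(b"unknown tune", db))
```

Returning a list covers the point that a single file may contain many subtunes, each with its own play time.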
SPC The SPC format is music from Super Nintendo (Super Famicom in Japan) games and other productions. There are a multitude of players for the SPC format, most of them listed here. As with most sound emulator formats the sound is not absolutely accurate, but the SNESamp Winamp plugin is generally regarded as the one which represents the sound best. I'm not sure if SNESamp is floating point (ChrisRobinson: or if the nature of the Winamp plugin is indeed useful for converting - I'm no programmer), but if not there are multiple other players.

SNESamp actually uses the SNESapu emulator for SPC decoding. It is indeed the most accurate SPC700 APU emulator publicly available. Unfortunately, all emulation code is written in x86 assembly (NASM) - RobertoAmorim

Unlike the sid format, SPCs are single songs, which have ID666 tag support, and every SPC file is 64kb in size. However, the more popular RSN format acts more like a sid file and contains all songs from a game (RSNs are simply RAR files containing SPCs [which have a 90% average compression rate], renamed to RSN). Some players support RSN; others, I believe, do not.

Encoding not possible.
PSF The PSF format is music from Sony Playstation 1 games. The decoders are fixed-point. There is source code in C from a GPL Linux XMMS plugin known as SexyPSF. I've looked at the code but am totally clueless as to how it works. It's available there for anyone willing to tinker with it. Neill Corlett's PSF Central also has a lot of PSF-related information as well as the actual PSF spec paper. I would love to see this format make it into Rockbox in the future. There are also a few other decoders out there, but SexyPSF is the only one I could find a source link to. I suppose some emulator decoders could work somehow too? Encoding not possible.
GSF GSF files are Game Boy Advance music files. Perhaps due to how new it is, there are almost no programs around that can play GSFs, let alone open source ones. I did find one, on http://www.caitsith2.com/gsf/ . However, it is a Winamp plugin. Encoding not possible.
Tracker formats This is actually a whole range of more or less similar sequencer formats. There are a few opensource libraries available, which all support a lot of formats.

Mikmod is for some reason often used, but I've been told that it lacks support for many features in some formats. This is a C library.

Modplug (the project also releases libmodplug) should have better support and it seems that it has very limited use of floats. However, libmodplug is in C++. People talking about Mikmod vs. Modplug: http://google.com/search?q=mikmod+modplug

Another option might be XMP which is C.

There's also DUMB which is said to be an accurate module player library (also C).

Any thoughts on which has the best support? Modplug is probably not feasible.

FS#8806 has a fair amount of work done towards playing these formats.

Encoding not possible.
NSF, NSFe This is for the sound tracks from the original Nintendo Entertainment System (NES). The NSF file is actually an NES ROM, with all data not related to audio stripped out. The sound track for pretty much any 8-bit Nintendo (not Super Nintendo) game is available online. The NSFE file format is a bit better. Both are playable. Encoding not possible.
GBS This is a chiptune format for Game Boy games. These can be played with special players, or directly through a Game Boy emulator. It is similar to the NSF format in that a GBS file is nothing but a Game Boy ROM with all data not related to audio stripped out. Encoding not possible.
HES This is a chiptune format for the NEC PC Engine/TurboGrafx-16. Similar to the NSF and GBS format, it is merely the audio portion of a TG16 ROM. Encoding not possible.
SAP This is a chiptune format for old Atari 8-bit computers. It is the data played by the POKEY chip. The codec is based on the code for ASAP, a GPL'd Atari music player. More info about the player can be seen here. Encoding not possible.
MIDI Trying out MIDI, and it says you have no patchset? Download here.

I have written a simple MIDI player for Linux in C++, and am now rewriting it in C in hopes of making it work with Rockbox. Currently the C port is able to allocate memory for the file, decode it, sequence events, and interpret them enough to play the file using the sound card and a very simple sinewave-based synthesizer I threw together. I am hoping to find an adlib emulation engine that I can import into this thing, and possibly a wavetable engine if I can find good patches.

It plays sound and at this state, it may actually work on the iRiver if someone ports the sound output routine in pctest.c to write to the iRiver DSP.

I guess the MIDI codec would have to be more of a plug-in, as it loads the entire file at once and then plays it from memory...

You may want to look for the Gravis UltraSound patches; they are floating around and used by software MIDI players such as TiMidity++

I have looked at TiMidity++ as well as the music engine used by ScummVM... Does anyone know of a good description of the Gravis UltraSound patch format? Those patches store a good deal of information, such as waveform looping, envelope, etc.. but I cannot find a guide that explains the fields in the file. Any ideas?

There seems to be downloadable Gravis patch info here: http://myfileformats.com/download_info.php?id=2480

All right.. playback, looping, interpolation (No more ghetto lowpass filter!), drums, panning, pitch wheel and all that work fine. I have just added envelope support. It works but could probably use some more exhaustive testing. I don't know how well envelope stuff will work on the target, given the amount of extra work it puts on the processor compared with the difference it makes to the output. I guess at this point the code needs to be built for and tested on the target.. but I don't really know what kind of functions it uses for file I/O, etc. Maybe someone can help me with that.. someone who actually has an iriver, etc.

Updated synth sample here. This plugin needs a separate soundset to work. This is available here. Extract its contents into the /.rockbox directory. Warning: file is around 22MB in size.

The plugin can play back MIDI at 22kHz in realtime on Coldfire-based targets, but still not in realtime on PP-based targets.

Encoding not possible.
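The "very simple sinewave-based synthesizer" mentioned in the MIDI notes above can be approximated in a few lines: map the MIDI note number to a frequency using the standard equal-temperament formula (A4 = note 69 = 440 Hz), then generate samples. This is an illustrative sketch, not the plugin's code; the function names, the amplitude, and the use of floating point are all assumptions.

```python
import math


def note_to_hz(note: int) -> float:
    """Equal-temperament frequency of a MIDI note number (69 -> 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12)


def sine_samples(note: int, seconds: float, rate: int = 22050,
                 amplitude: float = 0.5) -> list:
    """Render one note as signed 16-bit sample values at the given sample rate."""
    n = int(seconds * rate)
    step = 2 * math.pi * note_to_hz(note) / rate  # phase advance per sample
    return [int(32767 * amplitude * math.sin(step * i)) for i in range(n)]


samples = sine_samples(60, 0.5)  # middle C for half a second at 22.05 kHz
print(len(samples), note_to_hz(69))
```

A player for an actual target would use integer math and lookup tables rather than calling a floating-point sine per sample.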

Faster MDCT Experiment

Work has started on an experiment aimed at writing a faster MDCT for the codec library. For details, see FasterMDCT.


r182 - 23 Mar 2013 - 10:16:44 - BertrikSikken

Revision r181 - 05 Mar 2013 - 17:20 - GuillaumeCocatreZilgien
Copyright by the contributing authors.