Difference: RockboxAudioAPIProposal (r27 vs. r26)
See also: AudioAPIEnhancement
The in-progress port of Rockbox to the iRiver and other devices requires both software audio decoding and an abstraction of the audio hardware and playback features of the different target devices, neither of which is present in the current Archos-oriented code. The aim of this document is to:
Alain Berguerand has thought about a design proposal of his own.
We need a dual-buffer system with a filter in between:
Basic data flow:
Questions: How does the above architecture deal with different sampling frequencies, mono/stereo and (possibly) sample sizes? This shouldn't be a problem in buf1 (the compressed data will contain the relevant metadata), but it is an issue for buf2. Do we want to attempt cross-fading between a 44.1 kHz file and a 48 kHz file? Is a "gapless" change in playback frequency possible on the iRiver?
Keep in mind we need to make this work in reverse too, for recording:
For devices with hardware codecs, the chain is short-circuited between the loader and the feeder:
NOTE: the possibility of implementing a dual-buffer approach for devices with hardware codecs was discussed on IRC (2005-02-16 - very start of the day) - for the MAS devices, buf1 would contain the MP3 data as read from the disk, and the "codec" would be a "swap-copy" routine to bitswap the data in preparation for sending to the hwcodec. The existing architecture bitswaps the data in-place, right after reading it from the disk. Implementing a dual-buffer scheme here will sacrifice some RAM.
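The dual-buffer idea, including the "swap-copy" variant from the note above, can be sketched in C as follows. This is only an illustration under stated assumptions, not existing Rockbox code: `struct ring`, `ring_write`, `ring_read` and `run_filter` are hypothetical names, the buffer size is arbitrary, and "bitswap" is assumed to mean reversing the bit order within each byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring buffer; size must be a power of two in this sketch. */
#define RING_SIZE 16

struct ring {
    uint8_t data[RING_SIZE];
    size_t head;   /* total number of bytes written so far */
    size_t tail;   /* total number of bytes read so far */
};

static size_t ring_used(const struct ring *r) { return r->head - r->tail; }
static size_t ring_free(const struct ring *r) { return RING_SIZE - ring_used(r); }

/* Copies up to n bytes into the ring; returns the number actually stored. */
static size_t ring_write(struct ring *r, const uint8_t *src, size_t n)
{
    size_t i;
    if (n > ring_free(r))
        n = ring_free(r);
    for (i = 0; i < n; i++)
        r->data[(r->head + i) % RING_SIZE] = src[i];
    r->head += n;
    return n;
}

/* Copies up to n bytes out of the ring; returns the number actually read. */
static size_t ring_read(struct ring *r, uint8_t *dst, size_t n)
{
    size_t i;
    if (n > ring_used(r))
        n = ring_used(r);
    for (i = 0; i < n; i++)
        dst[i] = r->data[(r->tail + i) % RING_SIZE];
    r->tail += n;
    return n;
}

/* The "filter" sitting between buf1 and buf2, applied byte by byte here. */
typedef uint8_t (*filter_fn)(uint8_t);

/* Assumed meaning of "bitswap": reverse the bit order within one byte. */
static uint8_t bitswap_byte(uint8_t b)
{
    b = (uint8_t)((b >> 4) | (b << 4));
    b = (uint8_t)(((b & 0xCC) >> 2) | ((b & 0x33) << 2));
    b = (uint8_t)(((b & 0xAA) >> 1) | ((b & 0x55) << 1));
    return b;
}

/* Moves as much data as possible from buf1 to buf2 through the filter. */
static size_t run_filter(struct ring *buf1, struct ring *buf2, filter_fn f)
{
    size_t moved = 0;
    uint8_t b;
    while (ring_used(buf1) > 0 && ring_free(buf2) > 0) {
        ring_read(buf1, &b, 1);
        b = f(b);
        ring_write(buf2, &b, 1);
        moved++;
    }
    return moved;
}
```

In this sketch the loader would call `ring_write` on buf1, the feeder would call `ring_read` on buf2, and a software decoder would replace the byte-wise filter with a routine that consumes compressed frames and produces PCM.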
In addition, the audio API needs to support instant playback of short audio clips from memory or files - for Talkbox support, key beeps etc.
[Can someone who is familiar with the current playback and recording systems write a high-level description here?]
This section describes the highest level of the API - namely that between the rockbox application and the rockbox firmware.
The first problem when playing an audio file is to determine its format. A simple approach, which is probably good enough, is to use the file extension as a first guess at whether the file is supported. This guess needs to be confirmed by the actual codec code: for example, a ".WAV" file could theoretically contain one of many types of data, not just uncompressed PCM (e.g. GSM 6.10). So the codec code itself needs the ability to double-check that the file is supported.
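A first-guess lookup of this kind might look like the following sketch. It uses a few of the AFMT_* values proposed in the list that follows; `AFMT_UNKNOWN`, `ext_equal` and `afmt_guess_from_filename` are hypothetical names introduced only for illustration, and the extension table is an assumption, not an agreed mapping.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A few of the AFMT_* values proposed in this document. */
#define AFMT_MPA_L3     0x0004
#define AFMT_PCM_WAV    0x0008
#define AFMT_OGG_VORBIS 0x0010
#define AFMT_FLAC       0x0020
/* Hypothetical: "no guess possible" is not part of the proposal. */
#define AFMT_UNKNOWN    0x0000

/* Case-insensitive string comparison; returns 1 when the strings match. */
static int ext_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* First guess only: the codec must still confirm the real file contents. */
static unsigned int afmt_guess_from_filename(const char *filename)
{
    static const struct { const char *ext; unsigned int afmt; } table[] = {
        { "mp3",  AFMT_MPA_L3 },
        { "wav",  AFMT_PCM_WAV },
        { "ogg",  AFMT_OGG_VORBIS },
        { "flac", AFMT_FLAC },
    };
    const char *dot = strrchr(filename, '.');
    size_t i;

    if (dot == NULL)
        return AFMT_UNKNOWN;
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (ext_equal(dot + 1, table[i].ext))
            return table[i].afmt;
    return AFMT_UNKNOWN;
}
```

A ".WAV" file holding GSM 6.10 data would still be reported as `AFMT_PCM_WAV` here, which is exactly why the codec needs its own check.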
The following is a list of proposed file formats to be recognised (but maybe not playable - that depends on the hardware) by Rockbox. I propose that the definition of a particular hardware device includes a
/* ROCKBOX Audio Formats */
#define AFMT_MPA_L1      0x0001  // MPEG Audio layer 1
#define AFMT_MPA_L2      0x0002  // MPEG Audio layer 2
#define AFMT_MPA_L3      0x0004  // MPEG Audio layer 3
                                 // (MPEG-1, 2, 2.5 layers 1, 2 and 3)
#define AFMT_PCM_WAV     0x0008  // Uncompressed PCM in a WAV file
#define AFMT_OGG_VORBIS  0x0010  // Ogg Vorbis
#define AFMT_FLAC        0x0020  // FLAC
#define AFMT_MPC         0x0040  // Musepack
#define AFMT_AAC         0x0080  // AAC
#define AFMT_APE         0x0100  // Monkey's Audio
#define AFMT_WMA         0x0200  // Windows Media Audio
#define AFMT_A52         0x0400  // A/52 (aka AC3) audio
#define AFMT_REAL        0x0800  // Realaudio
So the hardware definition for the Archos Jukeboxes would be:
and the iRiver H120/H140 might be defined as:
[Please propose and discuss.... ]
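Because the AFMT_* values are distinct bits, one possibility is for each target to describe its playable formats as a bitmask. The sketch below assumes that approach; the two `EXAMPLE_*` masks are made-up illustrations and do not describe the capabilities of any real device, and `afmt_is_playable` is a hypothetical helper.

```c
/* Format flags from the proposed list (subset). */
#define AFMT_MPA_L2     0x0002
#define AFMT_MPA_L3     0x0004
#define AFMT_PCM_WAV    0x0008
#define AFMT_OGG_VORBIS 0x0010
#define AFMT_FLAC       0x0020

/* Hypothetical masks for two imaginary targets (not real device data). */
#define EXAMPLE_HWCODEC_TARGET_AFMT (AFMT_MPA_L2 | AFMT_MPA_L3)
#define EXAMPLE_SWCODEC_TARGET_AFMT \
    (AFMT_MPA_L3 | AFMT_PCM_WAV | AFMT_OGG_VORBIS | AFMT_FLAC)

/* Returns non-zero when the target's mask contains the given format. */
static int afmt_is_playable(unsigned int target_mask, unsigned int afmt)
{
    return (target_mask & afmt) != 0;
}
```

With such a mask, a file browser could recognise every format in the list but only offer playback for those the current target supports.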
Three tasks that require knowledge of the codec's file format must be carried out during the playback of a track:
This section describes how the audio hardware of the various devices can be abstracted.
[Please propose and discuss.... ]
This section describes the API for providing decoding and encoding of the audio codecs to be supported in Rockbox. Metadata (e.g. ID3 tags) are also a feature of the codecs and so the codec API needs to include the appropriate functions to read (and write?) the metadata in a file.
We need to remember to give credit to the codec authors and information about decoder versions in the "Info" menu screen and other relevant places.
A main design goal of Rockbox is to minimise battery usage by keeping the hard-disk powered down as much as possible, and performing as few power-hungry spin-ups as possible.
It is proposed that codecs be dynamically loadable, using a specialised version of the general-purpose plugin architecture already in Rockbox. This will remove any limitations on the number of codecs Rockbox can support. However, both the number of "codec slots" (the destinations for loadable codecs) and the number of codecs compiled into Rockbox should be configurable.
In order to allow Rockbox to support many different types of codecs (such as "non-streaming" codecs like SID/MOD, or codecs that offer "hybrid" compression like wavpack where two input files are needed to produce one output stream), it is proposed that the codecs themselves manage the memory buffer for the tracks that they are playing.
When Rockbox is initially started, no codecs will be activated. The user will add some songs to the playlist and the first codec will need to be loaded into a codec slot and initialised.
In order to allow the codec to make full use of the disk spin-up, it should start loading the data from disk in a separate thread. This should not be a CPU-intensive task, but it is desirable for the codec to remove any redundant data during this loading process in order to make the best use of the available memory. The codec should be able to peek into the playlist in order to load multiple files during the same load operation, subject to the available memory.
As soon as possible after the codec has loaded a small amount of the first file into memory, the codec should start decoding that data either into the cross-fade buffer or directly into the low-level audio buffer. Codec implementations should aim to minimise the amount of copying of data between buffers.
When a change of codecs is necessary, the audio system will need to load the second codec and initialise the decoding of the next file before the first one has finished.
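The slot handling described above could be sketched as follows. This is a simplified illustration under assumptions: `NUM_CODEC_SLOTS`, `struct codec_slot`, `codec_slot_acquire` and `codec_slot_release` are hypothetical names, and the actual loading of a codec binary is reduced to a comment.

```c
#include <stddef.h>

/* Hypothetical build-time setting for the number of codec slots. */
#define NUM_CODEC_SLOTS 2

struct codec_slot {
    unsigned int afmt;  /* AFMT_* value of the loaded codec, 0 = empty */
    int in_use;         /* non-zero while a track is being decoded */
};

/* Zero-initialised: no codecs are activated at start-up. */
static struct codec_slot slots[NUM_CODEC_SLOTS];

/*
 * Returns the index of a slot holding a codec for afmt, loading it into a
 * slot that is not currently decoding if necessary. Returns -1 when every
 * slot is busy with a different codec.
 */
static int codec_slot_acquire(unsigned int afmt)
{
    int i;

    /* Prefer a codec that is already loaded for this format. */
    for (i = 0; i < NUM_CODEC_SLOTS; i++) {
        if (slots[i].afmt == afmt) {
            slots[i].in_use = 1;
            return i;
        }
    }
    /* Otherwise replace the codec in any slot that is not decoding. */
    for (i = 0; i < NUM_CODEC_SLOTS; i++) {
        if (!slots[i].in_use) {
            /* A real implementation would load the codec binary here. */
            slots[i].afmt = afmt;
            slots[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/* Marks the slot as idle; the codec stays loaded for possible reuse. */
static void codec_slot_release(int index)
{
    slots[index].in_use = 0;
}
```

With two slots, the codec for the next track can be acquired while the first is still decoding, which is the situation a change of codecs requires.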
[this section is now out-dated by the above changes to the API overview]
This is the general initialisation call to the codec - so the codec can allocate memory and perform any other housekeeping tasks before it is ready to actually load and decode a file. Return codes would include:
Return codes would include:
This function returns ID3-tag type information from the file. We may want to call it either before or after a file is opened, e.g. to read the metadata from a track we will be playing in the future, but without initialising a full decoder instance.
This function would, with the help of the "read" callback, decode "size" bytes of PCM data from the input stream (in the format specified in the
CODEC_OK
CODEC_READ_ERROR
CODEC_END_OF_FILE
CODEC_RECOVERED_FROM_ERROR  // e.g. sync was lost, but decoding continued. The audio system could feed this back to the user.
CODEC_INTERNAL_ERROR        // An unexpected internal error from the codec
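As a sketch of how the audio system might react to these return codes, the codes can be expressed as a C enum and mapped to a playback decision. The numeric values, the `playback_action` type and the policy in `action_for_status` are all assumptions made for illustration; only the CODEC_* names come from the list above.

```c
/* Return codes named in the proposal; the numeric values are assumed. */
enum codec_status {
    CODEC_OK = 0,
    CODEC_READ_ERROR,
    CODEC_END_OF_FILE,
    CODEC_RECOVERED_FROM_ERROR,
    CODEC_INTERNAL_ERROR
};

/* Hypothetical decisions the audio system could take after a decode call. */
enum playback_action {
    PLAYBACK_CONTINUE,
    PLAYBACK_NEXT_TRACK,
    PLAYBACK_STOP
};

/* One possible policy; the real behaviour is still open for discussion. */
static enum playback_action action_for_status(enum codec_status status)
{
    switch (status) {
    case CODEC_OK:
    case CODEC_RECOVERED_FROM_ERROR:  /* decoding carried on after an error */
        return PLAYBACK_CONTINUE;
    case CODEC_END_OF_FILE:
        return PLAYBACK_NEXT_TRACK;
    case CODEC_READ_ERROR:
    case CODEC_INTERNAL_ERROR:
    default:
        return PLAYBACK_STOP;
    }
}
```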
This function would seek on a sample-accurate basis in the file. For some codecs this could be an expensive operation, in which case we may want to allow the codec to "guess" at the appropriate seek point.
NOTE: Seeking is a complicated issue and possible seeking strategies for each supported codec need to be discussed before deciding on the semantics of this function. But some codecs (e.g. WAV, FLAC) are designed to allow sample-accurate seeking, so this should be the benchmark.
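For uncompressed PCM in a WAV file, sample-accurate seeking reduces to simple arithmetic, which the following sketch illustrates. `struct pcm_wav_info` and `pcm_wav_seek_offset` are hypothetical names, and the sketch assumes the header has already been parsed and that the file is smaller than 4 GB.

```c
#include <stdint.h>

/* Hypothetical summary of an already-parsed PCM WAV header. */
struct pcm_wav_info {
    uint32_t data_offset;      /* byte offset of the first sample frame */
    uint32_t num_frames;       /* total number of sample frames */
    uint16_t channels;         /* 1 = mono, 2 = stereo */
    uint16_t bits_per_sample;  /* e.g. 8 or 16 */
};

/*
 * Returns the file offset of the requested sample frame. Requests past the
 * end of the track are clamped to the end of the audio data.
 */
static uint32_t pcm_wav_seek_offset(const struct pcm_wav_info *info,
                                    uint32_t frame)
{
    uint32_t bytes_per_frame =
        (uint32_t)info->channels * (info->bits_per_sample / 8u);

    if (frame > info->num_frames)
        frame = info->num_frames;
    return info->data_offset + frame * bytes_per_frame;
}
```

Compressed formats cannot compute the offset this directly, which is where the codec may need to "guess" at a nearby seek point.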
This function is called when the audio system is finished with a file - either when the end of the file has been reached, or the user has cancelled playback. It cannot fail. The codec is returned to the same state as that just following a call to
This function causes the codec to release any memory. It then cannot be used again until a call to
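The lifecycle implied by these descriptions can be pictured as a small state machine. The sketch below is an assumption-laden illustration: the function names (`example_codec_init`, `example_codec_open`, `example_codec_close`, `example_codec_exit`) and the three states are hypothetical and are not the names used by the proposal, and it assumes that closing a file returns the codec to its freshly initialised state.

```c
/* Hypothetical codec states. */
enum codec_state {
    CODEC_STATE_UNINITIALISED,  /* no memory allocated */
    CODEC_STATE_READY,          /* initialised, no file open */
    CODEC_STATE_FILE_OPEN       /* ready to decode or seek */
};

struct codec_instance {
    enum codec_state state;
};

/* General initialisation; returns 0 on success, -1 if already initialised. */
static int example_codec_init(struct codec_instance *c)
{
    if (c->state != CODEC_STATE_UNINITIALISED)
        return -1;
    c->state = CODEC_STATE_READY;
    return 0;
}

/* Opens a file for decoding; only valid on an initialised, idle codec. */
static int example_codec_open(struct codec_instance *c)
{
    if (c->state != CODEC_STATE_READY)
        return -1;
    c->state = CODEC_STATE_FILE_OPEN;
    return 0;
}

/* Finishes with the current file; cannot fail, so it returns nothing. */
static void example_codec_close(struct codec_instance *c)
{
    if (c->state == CODEC_STATE_FILE_OPEN)
        c->state = CODEC_STATE_READY;
}

/* Releases all memory; the codec must be initialised again before reuse. */
static void example_codec_exit(struct codec_instance *c)
{
    c->state = CODEC_STATE_UNINITIALISED;
}
```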
[Please propose and discuss.... ]
Now that code can be tested on the iRiver itself, it would be useful to see example implementations of simple "viewer" plugins which decode a track from a compressed format and write to a WAV file. These can be used for testing decoding speed, so that optimisation work can begin before the full audio API is developed and implemented.
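The WAV side of such a test plugin is small. As a rough sketch, the canonical 44-byte header for uncompressed PCM can be built like this; `wav_build_header`, `put_le16` and `put_le32` are hypothetical helper names and this is not code taken from any existing plugin.

```c
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44

/* Store 16-bit and 32-bit values in little-endian byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Fills hdr with a RIFF/WAVE header describing data_bytes of PCM audio. */
static void wav_build_header(uint8_t hdr[WAV_HEADER_SIZE],
                             uint32_t sample_rate, uint16_t channels,
                             uint16_t bits_per_sample, uint32_t data_bytes)
{
    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));

    memcpy(hdr, "RIFF", 4);
    put_le32(hdr + 4, 36 + data_bytes);   /* size of the rest of the file */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16);               /* fmt chunk size for plain PCM */
    put_le16(hdr + 20, 1);                /* format tag 1 = PCM */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, sample_rate);
    put_le32(hdr + 28, sample_rate * block_align);  /* bytes per second */
    put_le16(hdr + 32, block_align);
    put_le16(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, data_bytes);
}
```

Since the total size is only known once decoding has finished, a test plugin would typically write a placeholder header first and rewrite it at the end.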
Library source code and "codec2wav" test plugins are now in CVS for MPEG Audio (libmad), FLAC (libFLAC), AC-3/A-52 (liba52) and Ogg Vorbis (Tremor). If you are actively working on such an implementation for a different codec, add the name of the codec (and your name) to the following list. We are especially in need of someone to investigate the implementation of "non-streaming" codecs such as the various sequencer formats.
These decoders will be used as the basis for the Rockbox implementations of the decoders.
None of the initial codec implementations runs fast enough for real-time decoding. Effort is therefore needed to optimise the libraries for the iRiver environment.
Some profiling information on libmad and libFLAC can be found here: http://ipodlinux.org/forums/viewtopic.php?t=850
Revision r27 - 23 Jan 2006 - 21:35:09 - BrandonLow
Revision r26 - 08 Apr 2005 - 05:55 - LinusNielsenFeltzing
Copyright © by the contributing authors.