Goals of playback reworking
- Simpler (and more reliable) code
- Fewer structures to keep in sync
- Reduced thread dependency and swapping
- Metadata to be buffered - supporting album art and chapter info (e.g. cuesheets)
- Pause to act on playback only, not on all PCM output (so voice can play while playback is paused, and there is no need to fade the old track if a new playlist is started while paused) - may need lower PCM latency
- Voice not dependent on playback at all (no voice codec switching)
- Unification of hardware and software codecs
- Integration of audio playback for video
- Video playback based on codecs
Changes and resolved bugs since the commit of Metadata on Buffer:
- The initial implementation relied heavily on the spinning status of the disk to maintain filling activity across API calls. Changed to use a stateful filling flag.
- The codec yield depended on the amount of allocated data on the buffer instead of the amount of useful data on the buffer. This was initially resolved by removing the loop and was later improved with a new check based on useful data.
- Codecs went through a multi-stage process to get from the buffer to the codec RAM. This was changed by adding codec_load_buf, which closely mirrors codec_load_file. This once again allows codecs to wrap around the end of the buffer, and is more elegant.
- Both the old buffering code and the original MoB code kept leading tags on the buffer. These are now stripped by using the first_frame_offset calculated in get_metadata as the earliest data stored on the buffer. Codecs needed to be updated to work with this change; MP3 in particular needed adjustment, and the change uncovered an old, long-standing bug with resume.
- The initial MoB code didn't make use of any preseek rebuffer prevention. A partial implementation of this, only for resume, was added. I believe this same code will also improve backward seek off buffer performance once FS#8092 is resolved and a full rebuffer is triggered for a backward seek off buffer.
- A bug in the flash player implementation of conf_watermark would lead to flash players not triggering buffer fills. Resolved this by not setting the watermark to anything other than the minimal/default value on flash targets.
- The low buffer callback, used to ask buffering clients to clear unused handles and provide new handles to buffer, was not consistently called. I modified the semantics slightly so that the client (playback) only resets the callback once it has completed its work based on the callback. This makes it safe to call the call_buffer_low_callbacks function from buffering more frequently.
- Buffer shrinkage used two passes to compact the buffer as much as possible. This didn't allow for optimal compacting when playing small codecs such as SPC or NSF that are relocatable and atomically read. A recursive compacting algorithm is used instead. There is a risk of running out of stack space, so the number of handles is capped at a safe value.
- The old code allowed codecs to set the file chunk size in order to attempt to optimize a reverse seek off buffer. This should be unnecessary now, and it was causing potential problems where the buffering code would change the size of reads during the course of buffering, making it unresponsive. This setting was dropped.
- Initially move_handle and add_handle did not sufficiently protect from buffer wrap conditions. Correct this with rigorous byte math and alignment.
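The wrap handling described in the last item can be illustrated with a minimal sketch of ring-buffer offset arithmetic. The helper names (`ring_add`, `ring_sub`, `ring_align_up`, `ring_crosses_end`), the 4-byte alignment and the 64-byte buffer size are all assumptions made for illustration; they are not the actual move_handle or add_handle code.

```c
#include <stddef.h>

#define BUF_LEN 64u   /* hypothetical buffer size, for illustration only */

/* Advance offset p by v bytes, wrapping at the end of the buffer.
 * Assumes p < BUF_LEN and v <= BUF_LEN. */
size_t ring_add(size_t p, size_t v)
{
    size_t res = p + v;
    return (res >= BUF_LEN) ? res - BUF_LEN : res;
}

/* Move offset p back by v bytes, wrapping at the start of the buffer.
 * Assumes p < BUF_LEN and v <= BUF_LEN. */
size_t ring_sub(size_t p, size_t v)
{
    return (p >= v) ? p - v : p + BUF_LEN - v;
}

/* Round an offset up to the next multiple of 4, then wrap.
 * BUF_LEN is itself a multiple of 4, so the result stays aligned. */
size_t ring_align_up(size_t p)
{
    size_t aligned = (p + 3u) & ~(size_t)3u;
    return (aligned >= BUF_LEN) ? aligned - BUF_LEN : aligned;
}

/* Nonzero if a block of len bytes starting at p would cross the buffer end,
 * which is the case a non-wrappable handle has to avoid. */
int ring_crosses_end(size_t p, size_t len)
{
    return p + len > BUF_LEN;
}
```

Keeping every offset computation behind helpers like these means the wrap check is written once, instead of being repeated (and occasionally forgotten) at each call site.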
Flyspray bugs in Music Playback:
(Note that reported data abort/crashing bugs may be related to Portal Player CPU frequency scaling, not playback problems.)
The following bugs have been reported and need putting on Flyspray (if they are not already on there, and have not been resolved recently):
- Seeking within the last few seconds of a track seeks in the next song (FS#2687).
- Repeatedly skipping back and forth between the same tracks can cause a filebuf counter overflow (appears as negative filebuf in audio thread view).
- Very long rewinds are still not handled quite right (FS#6665).
- AB repeat off buffer seems to cause some kind of a problem on the seek backwards immediately after setting the B point.
- Music with a sample rate other than 44100 Hz seems to cause some confusion with seeking, and possibly with playback position reporting.
Rockbox software codec proposed audio output structure
Three independent processes -
- BUFFERING (performs all file reading to make best use of the audio buffer, based on the best guess of future required files, e.g. the current playlist)
- PLAYBACK (manages the current playback position, reads from the audio buffer) -> PCMPLAYBACK (mixes multiple input streams, including crossfade/voice)
- VOICE (decodes main voice file and .talk clips, via dedicated codec) -> PCMPLAYBACK
Rockbox software codec current audio playback functional description
The primary control thread of the system is the audio thread. This thread reads audio files and codecs into the large compressed audio buffer. When playback is requested, the audio thread reads the metadata of the first audio file in the playlist, and reads the required codec into memory, then executes it on the codec thread.
The codec thread is where all audio decoding and writes to the PCM decompressed audio buffer take place. Once the codec is initialized, it calls back (on thread) for compressed audio data using what look to it like standard file-like operations. It then decodes this compressed audio, and again calls back with a decompressed buffer to be written to the PCM buffer. After inserting the decompressed data into the PCM buffer, the codec thread updates the position indicators with two more callbacks.
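The callback flow described above can be sketched as a small table of function pointers. This is only an illustration: the struct, the callback names and the in-memory `demo_*` stand-ins are assumptions, not the actual codec API, and the "decoder" simply copies bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified callback table handed to a codec. */
struct codec_callbacks {
    size_t (*read_filebuf)(void *dst, size_t size);     /* file-like read of compressed data */
    void   (*pcm_insert)(const void *src, size_t size); /* hand decoded PCM back */
    void   (*set_elapsed)(unsigned long ms);            /* position indicator 1 */
    void   (*set_offset)(size_t bytes);                 /* position indicator 2 */
};

/* A stand-in "codec": copies input to output in 4-byte chunks.
 * Returns the total number of bytes consumed. */
size_t codec_run(const struct codec_callbacks *cb)
{
    unsigned char chunk[4];
    size_t consumed = 0, n;

    while ((n = cb->read_filebuf(chunk, sizeof chunk)) > 0) {
        cb->pcm_insert(chunk, n);                  /* "decoded" data goes to the PCM buffer */
        consumed += n;
        cb->set_elapsed((unsigned long)consumed);  /* pretend 1 byte == 1 ms */
        cb->set_offset(consumed);
    }
    return consumed;
}

/* Minimal in-memory stand-ins for the audio thread side. */
static const unsigned char demo_in[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
static size_t demo_in_pos;
static unsigned char demo_pcm[16];
static size_t demo_pcm_len;
static unsigned long demo_elapsed;
static size_t demo_offset;

static size_t demo_read(void *dst, size_t size)
{
    size_t left = sizeof demo_in - demo_in_pos;
    if (size > left)
        size = left;
    memcpy(dst, demo_in + demo_in_pos, size);
    demo_in_pos += size;
    return size;
}

static void demo_insert(const void *src, size_t size)
{
    memcpy(demo_pcm + demo_pcm_len, src, size);
    demo_pcm_len += size;
}

static void demo_set_elapsed(unsigned long ms) { demo_elapsed = ms; }
static void demo_set_offset(size_t bytes)      { demo_offset = bytes; }

const struct codec_callbacks demo_cb = {
    demo_read, demo_insert, demo_set_elapsed, demo_set_offset
};
```

Calling `codec_run(&demo_cb)` drains the ten input bytes into `demo_pcm`. The codec never sees where the data comes from or how full either buffer is, which matches the separation of concerns described in the text.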
The codec thread doesn’t have any knowledge of the state of the compressed audio buffer, and simply requests and decompresses data as fast as it is able. Buffer levels are managed by the audio thread when there is contention for buffer fill resources. This is done by yielding and/or sleeping on the audio thread to allow the codec thread to fill the PCM buffer, or to fill the compressed audio buffer as needed.
On each write to the PCM buffer, the buffer level is checked, and if the buffer is below the watermark level, the CPU is boosted to try to recover the buffer level. This works because the PCM buffer is the more heavily CPU limited of the two buffers.
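A minimal sketch of the watermark check follows. The struct and function names are hypothetical, and the unboost branch is an assumption: the text only states that the CPU is boosted when the level is below the watermark.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical PCM buffer bookkeeping. */
struct pcm_state {
    size_t level;      /* bytes currently queued for output */
    size_t watermark;  /* boost threshold in bytes */
    bool   boosted;    /* whether a CPU boost is currently requested */
};

/* Called after each write of `written` bytes to the PCM buffer.
 * Returns the resulting boost state. */
bool pcm_write_done(struct pcm_state *s, size_t written)
{
    s->level += written;
    if (s->level < s->watermark)
        s->boosted = true;    /* below watermark: ask for more CPU */
    else
        s->boosted = false;   /* assumed policy: drop the boost once recovered */
    return s->boosted;
}
```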
User-triggered events are handled primarily on the audio thread. Most events are handled entirely on this thread; however, track skips are handled partially on the requesting thread for better interactive performance. Events are primarily converted to linear execution by the use of the audio event queue.
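The way a queue turns events posted from several places into linear execution can be sketched with a small FIFO. The event ids, the capacity and the function names are hypothetical, and the sketch omits the locking that a queue shared between threads would need.

```c
#include <stdbool.h>

#define QUEUE_LEN 8u  /* hypothetical capacity; a power of two so counter wraparound is harmless */

enum audio_event { EV_PLAY, EV_STOP, EV_SEEK, EV_SKIP };  /* hypothetical event ids */

struct event_queue {
    enum audio_event ev[QUEUE_LEN];
    unsigned read, write;  /* free-running counters; slot index is counter % QUEUE_LEN */
};

/* Post an event; returns false if the queue is full. */
bool queue_post(struct event_queue *q, enum audio_event e)
{
    if (q->write - q->read == QUEUE_LEN)
        return false;
    q->ev[q->write % QUEUE_LEN] = e;
    q->write++;
    return true;
}

/* Fetch the oldest event; returns false if the queue is empty. */
bool queue_get(struct event_queue *q, enum audio_event *out)
{
    if (q->read == q->write)
        return false;
    *out = q->ev[q->read % QUEUE_LEN];
    q->read++;
    return true;
}
```

The audio thread would loop on `queue_get` and handle one event at a time, so handlers never run concurrently with each other regardless of which thread posted the event.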
Most events will eventually require a change in behavior on the codec thread.
For a seek, the codec thread is notified by simply setting the seek time on the codec API to the desired seek location. The codec thread picks this up on the next loop and moves to the desired location, calling the seek complete callback when this is done.
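The seek handshake can be sketched as follows. The struct layout, the use of -1 as a "no seek pending" marker and the 20 ms frame length are assumptions made for illustration; only the idea of setting a seek time that the codec loop picks up, followed by a seek complete callback, comes from the description above.

```c
/* Hypothetical slice of the state shared between audio and codec threads. */
struct codec_seek_state {
    long          seek_time;        /* requested position in ms, or -1 if none (assumed sentinel) */
    unsigned long position_ms;      /* current decode position */
    int           seeks_completed;  /* bumped by the seek complete callback */
};

/* Stand-in for the seek complete callback. */
void seek_complete(struct codec_seek_state *ci)
{
    ci->seeks_completed++;
}

/* One iteration of the codec's decode loop. */
void codec_loop_once(struct codec_seek_state *ci)
{
    if (ci->seek_time >= 0) {
        /* A seek was requested since the last iteration: move, then acknowledge. */
        ci->position_ms = (unsigned long)ci->seek_time;
        ci->seek_time = -1;
        seek_complete(ci);
    }
    ci->position_ms += 20;  /* pretend each decoded frame covers 20 ms */
}
```

Because the request is a single variable polled once per loop, the requesting thread never blocks on the codec; the cost is that the seek takes effect only at the next loop iteration.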
For track changes that do not require a change of codec, the reload codec variable on the codec API is set. This tells the codec to immediately terminate decoding of the current track and request a new track through the request next track callback. The request next track callback is responsible for moving the buffer pointers, and ensuring that the new track is at least partially buffered before allowing the codec thread to continue.
For track changes which do require a change of codec, the stop codec variable on the codec API is set, telling the current codec to exit immediately. The codec thread then sleeps until it receives a codec load event on its event queue, at which point it executes the new codec and audio resumes.
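The two track-change paths can be sketched as flags polled inside the decode loop. The struct, the enum and the precedence of the stop flag over the reload flag are assumptions; in the design described above, the reload path would call the request next track callback rather than return.

```c
#include <stdbool.h>

/* Why the decode loop ended (hypothetical). */
enum codec_exit { CODEC_TRACK_DONE, CODEC_NEXT_TRACK, CODEC_STOPPED };

/* Hypothetical flags set by the audio thread and polled by the codec. */
struct codec_flags {
    bool reload_codec;  /* same codec, new track */
    bool stop_codec;    /* different codec needed: exit now */
    int  frames_left;   /* stand-in for remaining audio in the current track */
};

enum codec_exit decode_track(struct codec_flags *ci)
{
    while (ci->frames_left > 0) {
        if (ci->stop_codec)
            return CODEC_STOPPED;      /* thread then waits for a codec load event */
        if (ci->reload_codec) {
            ci->reload_codec = false;  /* acknowledge the request */
            return CODEC_NEXT_TRACK;   /* real code would request the next track here */
        }
        ci->frames_left--;             /* "decode" one frame */
    }
    return CODEC_TRACK_DONE;
}
```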
The voice UI operates on a separate thread from the codec thread; however, only one of the two decode threads is awake at any given time (enforced by a mutex). When the voice codec has decoded a section of audio, it calls the same callback as the codec thread to insert that data into the PCM buffer. If audio is not playing, the callback functions exactly as it does for audio playback. If audio is playing, the callback inserts the audio into a special mix buffer and instructs the PCM buffer to mix the data into the audio stream as soon as possible. The PCM buffer inserts any mix data (beeps or voice) as close as possible to the currently playing buffer (1/8 second past the end of the chunk that the hardware is currently reading). In the insertion callback, priority is balanced between the two codecs, swapping as necessary in an attempt to ensure uninterrupted playback and smooth voice insertion.
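The 1/8 second lead can be made concrete with a short calculation. This sketch assumes 44100 Hz, 16-bit stereo PCM and a circular PCM buffer whose size is a multiple of the frame size; the function name and parameters are hypothetical.

```c
#include <stddef.h>

#define SAMPLE_RATE 44100u  /* assumed output rate in Hz */
#define FRAME_BYTES 4u      /* assumed format: 16-bit stereo, 2 bytes x 2 channels */

/* Byte offset at which mix data would be inserted: 1/8 second past the end
 * of the chunk the hardware is reading, wrapped to the buffer size.
 * Assumes chunk_end is frame-aligned and pcmbuf_size is a nonzero multiple
 * of FRAME_BYTES. */
size_t mix_insert_pos(size_t chunk_end, size_t pcmbuf_size)
{
    /* 44100 / 8 = 5512 whole frames, i.e. 22048 bytes of lead. */
    size_t lead = (SAMPLE_RATE / 8u) * FRAME_BYTES;
    return (chunk_end + lead) % pcmbuf_size;
}
```

Computing the lead in whole frames first keeps the result frame-aligned, so mixed samples do not land between the left and right channels of an existing frame.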