Rockbox

  • Status Closed
  • Percent Complete
    100%
  • Task Type Patches
  • Category Music playback
  • Assigned To No-one
  • Operating System Another
  • Severity Low
  • Priority Very Low
  • Reported Version Version 3.3
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by matsch - 2009-09-13
Last edited by bertrik - 2010-09-26

FS#10605 - Patch for stable playback for clip_v1

Here are two patches. Patch 1 disables buffering of codec file.
In lowmem targets like clip the codec is moving around in the
ringbuffer causing instability. This is not the root cause. Pls look at the patch.
Patch 2 deals in addition with buffering and allows a larger movement of a handle
within the ringbuffer. Please try out the patches and have a look at buffering. Perhaps you
find more improvements for buffering.c.
Codec and ID3 handle move more in a lowmem target than in a highmem target before they are deleted. Maybe that is the reason for the playback instability of clip.


Closed by  bertrik
2010-09-26 21:24
Reason for closing:  Out of Date
Additional comments about closing:   Warning: Undefined array key "typography" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 371 Warning: Undefined array key "camelcase" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 407

Clip v1 is already a stable target now.

Contrary to what I said before, theres definitely something going on with patch 1. Something about low memory forces rebuffering of the codec file even when its in memory already, and this contributes to the instability we see on the Clip. IMO we need to figure out why its being reloaded so often.

i've been using the first patch on clip for some days and i am very happy with the performance

Someone reported a log here:

http://forums.rockbox.org/index.php?topic=22845.msg156699#msg156699

That verifies a crash in the second patch (buffering.c). So both of these changes address real (and different) bugs. I'm going to investigate this further.

Matthias:

Can I ask your reasoning on the second patch (buffering.c)? I think you have something there but I'm struggling to understand why your fix works. And I don't want to commit it without fully understanding it.

My reasoning is:
buffering.c defines a ringbuffer. The ringbuffer has data, called handle, moving around in the ring buffer. The biggest handle is the audio handle with mp3-data which is filled up
at certain times. The audio handle has read and write pointers indicating remaining mp3 data for the codec. When remaining data is too small, the function shrink_buffer is called.
This function moves the audio handle struct before the read address. The other handle (ID3 struct) with its size (size_to_move) has to move (to newpos) to free space for the next filling of the ringbuffer. This is done with move_buffer.
The hande can go over the ringbuffer. This is calculated with the variable overlap. Overlap > 0 means that the handle goes beyond the ringbuffer limit. Overlap > size_to_move is OK.
No measures are required. I added this in the patch in order to avoid the following. There are some if conditions to avoid general wrapping of the handle or wrapping of the handle struct. Especially the last case leads to a broken ringbuffer. There is the variable correction. Why is newpos corrected with correction = overlap - data_size ? The handle would not move a lot if it is at the end of the ringbuffer. It will certainly stick at the end of the ringbuffer if !can_wrap is true.
I hope this explanation will help to understand buffering.c which is quite complex. What is the size of the ID3 struct?

I must correct my statement before. I looked at move_handle. overlap > size_to_move should not happen.
When I look at the macros RINGBUF_ADD and RINGBUF_ADD_CROSS I think that overlap >0 cannot happen either. Exept the handle is
moving out of the ringbuffer. With RINGBUF_ADD there is always newpos < buffer_len. newpos + size_to_move > buffer_len is possible, but then
according to RINGBUF_ADD_CROSS newpos + size_to_move - ( buffer_len -1) -buffer_len is < 0
The ID3 handle can wrap around the ringbuffer, which is not detected. ID3 Handle + data is small. So the probability is low that this occurs.
Could you please analyze this.

Again correction. The macros seem to be correct. Patch2 is not an improvement. First case for RINGBUF_ADD_CROSS is valid

I have quite a few log results where overlap > size_to_move. For instance:

overlap = 9653344
size_to_move= 304,
newpos= 10223311
final_delta= 10223616

Taken on my clip using logf.

I'm not sure what these mean, although they do result in playback ending very shortly afterwards.

          

did you log src and buffer ?

Converted from RINGBUF macros the code is:

newpos = buffer + final_delta;
if (newpos >= buffer_len)

  newpos -= buffer_len;

overlap = newpos + size_to_move - (buffer_len-1);
if (newpos >= (buffer_len-1)) /* if (newpos > buffer_len) */

  overlap -= buffer_len;

Here is a new patch which only adresses buffering.c .
I do not think that the memmove instructions in move_handle work with
a handle which is already wrapped. Therefore the CODEC handle attribute is
changed to !canwrap. The ID3 handle has the attribute !canwrap, but in move_hande
there is a small probability that it is wrapped because of else if (!canwrap). A handle
which fulfills the previous if condition is not checked for !canwrap.
A long MP3 file plays without working with the CODEC handle moving around in the ringbuffer.
The ringbuffer size is about 180kB. A smaller ringbuffer may lead to playbrack crash.
But the CODEC handle has a size of 50k and is not allowed to wrap. Therefore the
handle can block 100k of the ringbuffer for the audio handle.
Pls check this patch.

Why do you think memmove instructions can't handle wrapped handles ?

The first hunk of the patch looks ok
The second just inverts 2 memmove instruction and change indentation ⇒ no effect

Also I'm not sure why codec can't be wrapped.. I don't even understand why codec is loaded in the audiobuffer, it should be loaded in codecbuf ?

Codecs can be buffered on the audio buffer if theres a change between codec formats coming up. That way you don't have the spin up the disk to switch between Ogg and MP3 tracks for instance. Unfortunately theres also some bug that causes codecs to be buffered improperly on low memory targets even when a playlist contains all the same format.

Memmove
DESCRIPTION

This function moves <[length]> characters from the block of
memory starting at <<*<[src]>>> to the memory starting at
<<*<[dst]>>>. <<memmove>> reproduces the characters correctly
at <<*<[dst]>>> even if the two areas overlap

When moving two neigboured memory blocks with memmove it is better
to move first the block in the direction of move then the other one. Otherwise
overwriting is possible if &src and &dst are close together.
After changing this I realized that the memmove instructions in buffering.c
cannot manage moving a wrapped handle.
Imagine a wrapped handle. Next time move_handle calculates overlap < 0.
There is only one instruction to move two splitted memory blocks. One at the end
and the other at the beginning of the ringbuffer. Memmove takes the memory block at the end plus data
beyond the ringbuffer up to size_to_move and moves it to dest.
what happens when overlap > 0 instead of <0? This gets complicated.
Therfore in the patch move_handle does not make a wrapped CODEC handle.
Attached is a patch where add_handle may not create a wrapped CODEC handle.

The code has to be changed.

Matthias:

First, very interesting work.

Second, we (on IRC) are struggling to understand what you have changed. I see that you have disabled wrapping codec files. Is this because it needs to be disabled, or simply because it seems to help (i.e. is a workaround)? It seems to me that wrapping codec file should be possible.

On a side note, you might consider stopping by #rockbox during evenings (in Europe) and talking about your work. People are interested in understanding what you have changed.

Michael:

the patch avoids a wrapped CODEC handle in the ringbuffer. This is a workaround for move_handle which has trouble
with wrapped handles.
The patch has some changes in move_handle which should improve its function, but this is not sufficient if
the CODEC handle is allowed to wrap.

change one is that if (!canwrap) comes first. The entire moved !canwrap handle (struct + data_size) ends before the end of the ringbuffer.
(Only one memmove because overlap is corrected to ⇐ 0). This is important for the workaround.

change two is overlap = overlap - data_size to overlap = overlap - sizeof(struct_memory_handle)
( data_size may wrap, but not the struct)

change three is reverting the memmove instructions. (helps, but not sufficient).
How do you get an unwrapped handle out of a wrapped one?

There are two handles which are allowed to wrap in svn. CODEC and AUDIO. (I hope)
Moving CODEC handle means moving the memory_handle struct and CODEC data.
Moving AUDIO handle (not ATOMIC AUDIO) means moving only the memory_handle struct before the data position
for the codec. You can see this in shrink_handle, where move_handle has data_size 0.
Though AUDIO handle has CANWRAP there is only one memmove, because of overlap ⇐ 0.
In principle AUDIO handle can also get !CANWRAP, at least for move_handle.

I hope the patch gives a good result in your test build. Nevertheless the ringbuffer must not be too small.
My short test was with 200kB.
Is it necessary to have the CODEC in the ringbuffer?
Is is necessary to rewrite move_handle?
Finally, buffering.c is complex. Maybe there is another issue regarding low memory playback.

Im not used to IRC and my time is limited, but if it helps it try to join in.

the patch avoids a wrapped CODEC handle in the ringbuffer.

Thanks, your description is helpful.

Is it necessary to have the CODEC in the ringbuffer?

It is not necessary to have codecs on the ring buffer but strongly preferred, and if possible they should be able to wrap. We would strongly prefer to avoid workarounds like disabling wrapping because it would mean leaving some potentially serious bugs still in the buffering code. If at all possible we would like to know why wrapping causes problems. Disabling wrapping for codecs might be accepted if all else fails as it is relatively minor, but I think many people would not like doing it.

Finally, buffering.c is complex. Maybe there is another issue regarding low memory playback.

We can tackle these one problem at a time. I haven't had time to review all your changes but if they look good most of them can go in even if they don't completely fix the problem. The complexity of buffering/playback is a problem for me as well :)

In the meantime, it would be nice if you could separate your fixes (such as the memmove stuff) from the work arounds (such as disabling codec wrapping) so that we can more easily review your code. Workarounds are certainly welcome, and maybe very useful to understand the problem, but ideally should not be combined with fixes so that people know which are which when trying to understand code. I will try to find time to test your code as soon as I can.

Im not used to IRC and my time is limited, but if it helps it try to join in.

If you can find time it might be helpful to talk to other people who have looked at the buffering code, and it improves the odds that you might eventually get SVN access for yourself (we give accounts to people we know). We have a webclient on the site, so it might be worth logging in while working and seeing if anyone else has comments. But it is not required, we can discuss things on the tracker if you prefer that.

The memmove() can be problematic if the data to be moved is bigger than half of the buffer.

Example if we want to move this data (The data is represented by A & B, where the B is the part of data which will overlap)

–AAAAABBB—

We want it to be moved as

BBB—–AAAAA

If we start by the part which doesn't overlap we will get

–AAAAAAAAAAA

And we destroy the part which does overlap.

If we start by the part which does overlap we will get

BBBAAAABBB—

And we destroy the (start of the) part which doesn't overlap.

So your fix to make codecs not wrappable could mean that the memmove() do destroy some data.

Note that even if the buffer is at least twice as big than the largest data to be moved, we would need some checks in buffering.c to determine the correct order for the 2 memmove() calls and they aren't there at the moment.

Using a wrap aware version of memove() should take care of the problem. I haven't tested it but the code looks ok (and the ascii art was worth it even if it doesn't work).

Edit: 4th argument points to the first byte of the wrapped buffer, 5th argument points to the first byte after the wrapped buffer.

Thanks for your patch ! This is a bit complex but I'm we can make sure it works if we are enough eyes on it.

The first part looks ok, the 2nd is too complex for me to read it now, but we could easily use small tests (with 16 char arrays for example)

However I see 1 problem : the memory move is not optimized (simple byte loop)

Inside the 1st case we could make 3 cases:
no wrap : memmove(dst,src,n);
dst < src : src wraps, dst might wrap : first=end-src; memove(dst, src, first); memmove(dst+first; src+first; until_dst_wraps); [memmove(start, src+first+until_dst_wraps, data_remaining);]
dst >= src : dst wraps, src might wrap : can't think about it now but it must be doable

Ssince the 2nd case does no memory move, then it should be enough.

Yeah, it would definitively be worth it to use memmove() where possible. I just didn't feel like figuring that bit out :-)

Michael:

This is the workaround which does not create a wrapped CODEC handle.
move_handle is only modified by giving the if (!notwrap) command priority.
A 54 min MP3 file worked fine with this patch (after skipping and CODEC handle in buffer) except some seconds before its end.
I limited buffer size to 200kB (with general setting → system → limit → max entries). Well, the CODEC data is a ~50kB block moving back and forward to free space for the audio buffer.
Just checking it with 250kB.

A fix would be a modification of move_handle so that it can deal properly with
a wrapped CODEC data handle.
If I have time during weekend I try to work on it and look at the previous comments.

Antoine: I started looking at using memmove() but stopped in the middle when I found this:
"size_t offset = srcend-d; " d is undeclared

What is this offset ?

It's supposed to be dst. This offset is the number of bytes you have to skip so that your copy destination doesn't overlap a source you'll have to use later on.

Edit: I've started testing this and it doesn't work correctly when we're moving more than half of the buffer. I'll post a fixed version once I have it.

Use memmove() for the first case (not finished)

I tested this case:
* |ss—–sssssssss|
* |-ddddddddddd—-|

And it seemed to work (this is not a proof however ..)

EDIT: if your current algo is right, we could use memmove() and not memmove_wrap() in the 2nd case

This is a patch for move_handle. It adapts the (slow) copying routine
of memmove_wrap. The CODEC handle is always moving in one direction to
memory which is freed. There is no memory occupied by another handle.
Therefore I think it is sufficient to copy src to dest from
head to tail. The head of dest will never overlap with src.
I am running this patch with a ringbuffer size of about 150kB.
No crash of the ringbuffer up to now.

I had a crash (not deadlock) with that patch after a couple hours. I was logging but it didn't catch much of interest. The last line (and only atypical one) was:

"buffer_handle(-1)" (from buffer_handle in buffering.c)

Followed by a hard reboot. I'll try it again without logging enabled to see how stable it is without all the overhead.

There were no issues with two mp3-files each having about 50min playing time and patch without log.
The buffer size was 125kB and the CODEC handle was in the ring buffer.

MP3 playing stops after track skip with svn 23334 + patch. Do not know why.
I tested with svn 23172 + patch, no issues. svn 23209 (buffering.c was changed) + patch seems to be OK.

I've seen a few data aborts with this patch, but no deadlocks so far. Unfortunately, the build I used for testing has all the buffering.c functions static, so I can't say where they happened, other then its in that file. I'm going to test a build today without the static functions and see where it crashes.

Edit: Recompiled an identical build without static functions and the crash address is move_handle (unsuprisingly).

Edit2: Tested some more and got crashes in fill_buffer and find_handle. So either the data aborts on AMS are inaccurate or theres some corrupted pointer being passed around.

you can disassemble the rockbox.elf file to see where the crashes happen

svn 23261 + patch crash about 2 sec after track skip
svn 23260 + patch OK
Apart from this I got a data abort when trying out other patches.
Pls look for the changes in svn 23258, 23209, 23301 on buffering and playback. Perhaps
they are related to the issues above.

I think those patches don't relate to buffering, perhaps the crashes are more frequent because the audiobuffer size is modified

I increased the size of the ringbuffer. No crash after skipping.
I agree to Rafaels statement. The new patches decreased the ringbuffer size for a
fixed limit setting. People with a high limit setting will get a
fast crash of the ringbuffer. I tested with 125kB buffer size which
seems to be the minimum buffer size for the MP3 codec.
Does anybody know the maxium codec size ?
Does a special limit setting for low mem targets make sense? I think so.

I tested with 125kB buffer size which
>seems to be the minimum buffer size for the MP3 codec.
>Does anybody know the maxium codec size ?

It should just be the size of the .codec file on the disk:

5636 aiff.codec
9448 wav.codec
10272 adx.codec
10372 mod.codec
11676 shorten.codec
13304 alac.codec
15756 flac.codec
16668 spc.codec
25920 ape.codec
27188 wavpack.codec
28848 atrac3_rm.codec
32036 cook.codec
33864 a52.codec
34692 a52_rm.codec
36576 mpc.codec
39748 sid.codec
48388 speex.codec
68496 nsf.codec
72312 asap.codec
79960 mpa.codec
81956 raac.codec
84332 aac.codec
97872 wma.codec
115724 vorbis.codec

Individual codecs are limited to 1MB, but a lot of that space is allocated during runtime, so the .codec files are fairly small. 125KB is probably big enough for the .codec file and 1 32KB chunk of audio.

Sync to current SVN.

I'm going run this for a while on my e200v1. If theres no problems, I think the patch should be committed. Even if it doesn't fix the clip issues its clearly an improvement over what is in SVN.

I also made a sync yesterday with the following changes:

1. moved 'align correction' to the if (correction) block.
I do not know the background of this alignment, but it should apply to any correction.
Maybe the alignment is good for a fast memcopy routine.

2. commented code which should avoid reading in the next handle. I do not think
this can happen. At least I did not notice any change with the comment.

3. When the codec is in the buffer, the size of the codec data is added to the useful data of the

  audio data buffer. A filling of the audio buffer is not initiated by the buffering thread, because it looks at useful data
  which is not correct due to a wrong base handle ID (-1).  (see update_data_counters)
  I do not know what forces the buffer refill. Perhaps the codec threat. The change in playback.c
  corrects the base handle ID value to the right handle.
  Seems to be OK when I look at 'view buffering thread' in the debug menu.
   

adding a small patch in playback.c and put a comment to some code in buffering.c
which is not correct. I think it has no effect because the case

I don't know if this is useful, but while testing the previous patch on my large memory target, I observed a data abort in update_data_counters due to a corrupted memory_handle on the list (most entries were 0).

I'll try your latest patch and debug any aborts I get. On the upside, it does seem like these changes make deadlocks much less common, and its a lot easier to debug data aborts . . .

Edit: Interestingly, while I can easily reproduce the data abort on the previous patch by resuming a specific playlist, I cannot do so on this new patch.

This patch is working great for me. I ran all night without a crash and its still going. It also seems to cause no problems on my e200v1.

One optimization though, the overlap safe copy is used even when theres no overlap, which is probably quite slow. Instead we can do a memmove when the source and dest files are to completely different parts of the buffer. I've added that, but not had time to test it yet.

I think it should be memmove(dest, src, size_to_move) like in the svn version, but I have to check this.
Here is a patch with a faster copying routine. The distance of the handles is a multiple of four, the ringbuffer size too.
Therefore it uses long pointers for copying. The patch works, but you never know what can happen in the complex world
of buffering.c

IMO we can commit this and then release rockbox officially for the ClipV1 when:

1) We are sure this works reasonably well. I think it does but more people trying more settings and such would be good.
2) We have the common case on other targets (no overlap of target and dest elements) running with no performance penalty.
3) Its been tested on a few other targets just to make sure it doesn't cause any odd effects on other devices.

Matthias: Is there anything else you think needs to be addressed? So far the v1a patch seems to work quite well for me.

Michael:

That's good news. I tried analyzing buffer.c as well as possible and did not find another
bug. The V1a patch offers speed advantages, but has to be tested thoroughly.
It is possible to increase move_handle speed by copying two long words or more in a loop.
Memmove can copy four long words at a time. There is still a speed difference of factor 4.
I do not know if this speed difference is relevant for lowspeed targets.

I think this is fast enough for overlapped moves. However, memmove would still be better for the safe case where there is no overlap, as this will be the only case likely to encountered on large memory tagets. IMO other targets should still be able to use optimized move routines whenever safe since it would be unfair to slow them down to fix the clip.

So far several people have used the 1a patch for many hours without any reports of problems.

This patch combines the copying routine for a wrapped handle (old or moved one)
with memmove for an unwrapped hande (old and moved one).
It is working on my clip after a short test.

I will give this a few days testing to make sure no issues occur on other targets and then commit it.

The long copy throws up warnings on 64 bit sims, so I switched it over to an int32_t copy. Unless you strongly prefer the 2 word copy I would rather do it a word at a time since we don't know for sure what the alignment is, and since I think the difference in performance isn't all that important now that you've optimized the no overlapped case.

One word at a time is OK. Most important is a stable playback. Clipv1 people are waiting for it for a long time.


The copying loop is not correct

  if ( overlap_old > 0 || overlap > 0 ) {
  /* Old or moved handle wraps */
      while (n--) {
          if (here < begin)
              here =  end;
          if (there < begin)
              there = end;
         *there-- = *here--;
      }

Consider this 8 bytes buffer, data_size = 6

old_pos = 4, new_pos = 7
here = 1, there = 4

56XX1234 - start (S = Source)
23456XX1 - final (D = Destination)

First iteration of the loop:
*(4) = *(1)
56XX6234 ⇒ the first byte of source buffer has been overwritten, so the final buffer will look like:
23456XX6 ⇒ corruption

After a slight delay I finally came up with a perfect^Wworking memmove_wrap() implementation. If it's still useful here it goes (along with a test routine to help debugging changes to the algorithm). Happy hacking!

Edit: basically, as long as you're moving less than 1/2 the buffer size, everything should be doable with a few optimized memmove calls. When you're moving more than 1/2 the buffer size it can't be optimized easily as far as I can tell … except maybe if you allocate a buffer of size gcd to only keep one of the two loops.

Attached is a new version using memmove() where applicable. The quintuple branch if could maybe be simplified.

quickly and dirtlily adapted to buffering.c

With plain svn 23741 I'm seeing problems with playback of some mp3s and oggs not even starting on c200v2. funman's latest patch doesn't seem to fix that unfortunately.

Hum, in fact my routine copies the buffer correctly but changes some of the other data too … I'm updating my test code here to check that too and I'll post an update really soon. Sorry :/ I shouldn't have used zeros to fill the rest of the buffer.

So here it comes. I changed the test code to fill the initial buffer with random values. I've also attached an updated diff to the buffering code. I hope that it'll work this time.

I patched my clipv1 with this new patch and got a complete lockup after some hours playing

Here is a patch which has a faster moving routine for handles with
data. Now 16 bytes are copied in a loop. The distance from handle to hande
is changed to 16bytes instead of 4 bytes. Speed should be comparable to
memmove now.
The copying loop is correct for the ringbuffer. The first byte of source
buffer cannot be overwritten due to ringbuffer architecture.
The audio handle has the memmove routine.
playback.c has a test on small ringbuffer with NO_FILEBUFCODEC.
If its length is less than 150kB the codec will not be copied into the ringbuffer.
I do not know the background of the playback issues with c200V2. Maybe this helps.

If its length is less than 150kB the codec will not be copied into the ringbuffer.

I think all codecs are less then 150KB, so that would just disable buffering codecs completely.

NO_FILEFUFCODEC refers to the size of the ringbuffer.
If the ringbuffer size filebuflen is smaller than 150kB the codec is not copied
into the ringbuffer.

This patch deals with the buffering thread. Within this thread the
available data (data_counters_useful) for the codec thread is checked more often with this patch.
I noticed that playback stops if there is low data for the codec thread and
a codec data handle in the ringbuffer.
The buffering thread checks more often (HZ/10 instead of HZ/2 for timeout) for
low data. If this is the case space is freed and the buffer is filled. New track(s) are added to the ringbuffer if there is space
available for them.
I do not know if there is an improvement for low mem targets, because my clip wotks fine with the unstable release.
Maybe other targets like C200V2 can benefit.
MP3-playback (192kbit) is fine with 140kB ringbuffer size.

There is also some problems with voices, very probably related to this (closing  FS#10881  as duplicate)

Just noticed that r24968 essentially reverted the playback.c changes in r23680, by adding a buf_set_base_handle call to audio_check_new_track again. This call is needed for the watermark check in buffering.c to work properly (at least when more than one track is buffered). Shouldn't break playback for clip_v1 I think…

Is this patch still relevant?

Loading...

Available keyboard shortcuts

Tasklist

Task Details

Task Editing