Rockbox mail archive
Subject: Re: Red freeze
From: Mike Holden (rockbox_at_mikeholden.uklinux.net)
Mike Holden said:
> Björn Stenberg said:
>> However, the single most important problem right now is the "red led
>> dead". This appears to be a condition where the drive never recovers
>> from failure and thus reports errors forever. Hence my attempt with
>> using a hard reset.
> Another idea -
> we recently added a patch that updated last_disk_activity on each loop
> within ata_read_sectors(). This was to resolve the problem where the
> disk could spindown during an interrupted read. Since wait_for_rdy() and
> wait_for_bsy() have a built in maximum timeout of 10 seconds (10*HZ), we
> should do the same here as well. If either of these functions takes a
> few seconds, we could be spinning the disk down (default disk spindown
> is I think 5 seconds). We could either update it each time round the
> while() loop, or we could pre-empt it and set it to the function timeout
> value on entry and set it again on exit (I assume that nothing will get
> "confused" by having last_disk_activity in the future? It seems to be a
> simple integer arithmetic comparison to current_tick-timeout or similar,
> so should be ok).
Here's another scenario that will cause red freeze:
We hit a disk error and the mpeg thread skips to the next track. It then
tries to start loading the next track, and gets as far as the
perform_soft_reset() at the top of ata_read_sectors(). This calls
wait_for_rdy() which calls wait_for_busy(). Both of these function calls
will wait for 10 seconds each before timing out and returning. On top of
this, the perform_soft_reset() then loops round calling wait_for_rdy() 9
times (the count is 8, but it loops 9 times!), thus giving 9 * 20 seconds
= 3 ___minutes___ (the comment says "up to 30 seconds")!!!!
perform_soft_reset() thus fails, and we return -1 from ata_read_sectors().
This failure takes at least 3 minutes. The mpeg code then skips to the
next track and the whole process starts again.
A few solutions to this:
1. Update the last_disk_activity during wait_for_bsy and _rdy, as stated
earlier, to avoid disk spindown.
2. Call ata_hard_reset() on starting ata_read_sectors() instead of the
soft reset. It seems obvious that the soft reset doesn't clear errors, and
the hard reset doesn't cause any problems to prevent us calling it
3. Is a 10 second timeout appropriate in wait_for_bsy/_rdy? We have only a
5 second timeout on ata_read_sectors, so 10 + 10 seems like a bit of
4. Should we handle exit from ata_read_sectors differently? At the moment,
we call led(true) at the top, but if we exit early, then led(false)
doesn't get called at all. It would require some restructuring of the code
to set a return variable and use break;, if() etc to force skip execution
to the bottom of the function so that the led is turned off. Another
option could be use of a forward goto to skip to the bottom of the
function on error (can you have a forward goto?). Much as I dislike the
use of a goto, we have one already so another one doesn't make much
5. I would suggest changing perform_soft_reset() so that if we are still
waiting after a couple of loops, we do a hard reset instead. The current
delay is just too much!
Much food for thought - Bon Appetite!
Page was last modified "Jan 10 2012" The Rockbox Crew