FS#12184 - Fuze V1 locking when transferring files Rockbox 3.9

Attached to Project: Rockbox
Opened by Kent Williams (williamskg6) - Saturday, 09 July 2011, 00:13 GMT
Task Type Bugs
Category Operating System/Drivers
Status Unconfirmed
Assigned To No-one
Operating System Sansa AMSv1
Severity Low
Priority Normal
Reported Version Release 3.8.1
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 0
Private No

Details

This is a post that another user put in the forums that pretty much describes the problems I'm having exactly:

" I have a v1 sansa fuze and been using rockbox since 3.8.1 first came out.I just upgraded to 3.9,did not mess with ''a-n-y'' setting at all,pc is windows vista (latest sp).Whenever i drag and drop folders with mp3s in them(be it 58 mb or 300 mb or 600 mb) it starts to copy but halfway through,the process just freezes,clicking cancel freezes the windows explorer for 10 seconds then cancels it.Even worse,when i go check out my sansa fuze at my computer,it's recognized but when i click on it,shows as empty!? so i disconnect safely,theeeen....device needs a hard reset.So i say ok this has happened once or twice before but guess what? it happened again...5 TIMES each on 3 different usb ports.Uninstalled rockbox,went back to official firmware and it's all normal...

So i changed the screen to be always on during plugged in and this is the error i get

data abort at 30054264
FSR 0x8
( domain 0, fault 8 )
address 0xA00000BF "

I am running Windows 7 64-bit and experiencing the same issues. If the player doesn't lock outright, it gives the same error white screen described above. This seems to happen more frequently when copying folders that contain folders themselves, but I've had the problem occur on copying just one folder with no subfolders.
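
The "FSR 0x8" and "( domain 0, fault 8 )" lines in the quoted error appear to be the same value shown twice. A minimal sketch of that split, assuming the usual ARM fault status layout (status code in bits [3:0], domain in bits [7:4]); the function name is hypothetical, and the meaning of status 8 is not decoded here and would have to be looked up in the CPU's reference manual:

```python
def decode_fsr(fsr):
    """Split a fault status value into (domain, status).

    Assumption: status lives in bits [3:0] and domain in bits [7:4],
    which matches the "( domain 0, fault 8 )" line in the report.
    """
    domain = (fsr >> 4) & 0xF  # bits [7:4]
    status = fsr & 0xF         # bits [3:0]
    return domain, status


domain, status = decode_fsr(0x8)
print(f"domain {domain}, fault {status}")  # prints "domain 0, fault 8"
```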

Comment by Ben Leggett (phanboy4) - Saturday, 09 July 2011, 01:42 GMT
Yep, V1 Fuze here, same issue. I also get *really* slow speeds when transferring to the MicroSD via Rockbox: ~680 KB/s through Rockbox, 2.67 MB/s through OF. Same issue with several different cards I tried.

Happens in 3.9 and latest archived build as of today.
Comment by Reimu Hakurei (ReimuHakurei) - Monday, 11 July 2011, 18:28 GMT
This also affects me on a Sansa e250v2
Comment by jon pipitone (pipitone) - Wednesday, 13 July 2011, 14:38 GMT
Ditto for me. I have a V1 Fuze with 3.9 freshly installed and I'm running Ubuntu.
Comment by Reimu Hakurei (ReimuHakurei) - Friday, 15 July 2011, 23:27 GMT
It's probably just completely random.
Comment by Remek (pirx) - Wednesday, 27 July 2011, 14:07 GMT
Unfortunately, I have to confirm this issue. With the new firmware, write operations to the memory card in my Fuze render it inoperable until I reset it...
Comment by MichaelGiacomelli (saratoga) - Friday, 29 July 2011, 14:44 GMT
3.8 was branched from r29345, while 3.9 was branched at r30085. If someone who has this problem could figure out which revision caused it, we could probably just revert it.
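
For a sense of the effort involved, here is a rough estimate of how many builds a binary search between those two revisions would take. It assumes r29345 behaves like 3.8 (good), r30085 behaves like 3.9 (bad), and that every build gives a reliable verdict; the function name is hypothetical.

```python
import math


def builds_to_bisect(good_rev, bad_rev):
    """Worst-case number of builds a binary search needs to find the
    first bad revision, given one known-good and one known-bad revision."""
    candidates = bad_rev - good_rev  # revisions that could be the first bad one
    if candidates <= 0:
        raise ValueError("bad_rev must be newer than good_rev")
    return math.ceil(math.log2(candidates))


# 30085 - 29345 = 740 candidate revisions
print(builds_to_bisect(29345, 30085))  # prints 10
```

So roughly ten build-and-test cycles would be enough, provided each test result can be trusted.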
Comment by Remek (pirx) - Monday, 01 August 2011, 07:03 GMT
oh, all that I know at the moment is that this issue is 3.9-specific. I am positive it did not happen in 3.8 (and 3.8.1).

is there anything a user (not a developer) can do for this to get corrected? The description of http://www.rockbox.org/tracker/task/11870 appears dangerously related to this problem...
Comment by Ice Qbe (IceQbe) - Wednesday, 17 August 2011, 07:38 GMT
Same issue here (Fuze v1 with 8GB internal storage, 3.9 installed). The player locks up while transferring files from a windows XP SP3 laptop in windows explorer, but it doesn't seem to crash when copying file by file using a perl script.
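
The Perl script is not included in the report. The following is a Python sketch of the same idea (one file at a time, each flushed to the device before the next starts), not the reporter's script; the function name and the optional pause are assumptions.

```python
import os
import shutil
import time


def copy_file_by_file(src_dir, dst_dir, pause_s=0.0):
    """Copy a directory tree one file at a time.

    Each file is flushed and fsync'ed before the next one is opened,
    so the device never has more than one file in flight.
    """
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target_root = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in sorted(files):
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            with open(src, "rb") as fin, open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
                fout.flush()
                os.fsync(fout.fileno())  # force the data out to the device
            copied.append(dst)
            time.sleep(pause_s)  # optional breathing room between files
    return copied
```

A call such as copy_file_by_file("Music/Album", "/media/FUZE/Music/Album", pause_s=0.5) would exercise it; both paths are placeholders.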
Comment by Frantisek Sindelar (franta) - Friday, 07 October 2011, 09:23 GMT
Same here ... fresh installation of 3.9.1 on brand new Fuze v1 - this also happens with 3.8.1
Comment by orestes naraj (orestes) - Sunday, 06 November 2011, 15:07 GMT
Same issue here. Tested with 3.8 and 3.9.1 on a FuzeV1
Had to revert back to 3.7.1 to be able to transfer files.
Comment by Richard C (rc123) - Thursday, 07 June 2012, 23:54 GMT
Also affects me. Windows 7 64-bit, Rockbox 3.11.2 on a v1 Fuze. Whenever I try to transfer files via USB to the Micro SD card, the connection seems to be dropped partway through the transfer and the Fuze needs to be reset.
Comment by MichaelGiacomelli (saratoga) - Saturday, 14 July 2012, 21:30 GMT
I realize it's been a year since I asked, but I'd still like to see this bug fixed. If someone who has a player with USB stability issues could test and figure out which revision between 3.8 and 3.9 causes this problem, it would greatly improve the chance of this getting fixed.
Comment by steve (stevodevo) - Tuesday, 24 July 2012, 14:18 GMT
+1 on this issue. I have a Fuze v1 and RB 3.11.2 - windows XP SP3. I haven't noticed the issue when I copy files to the internalSD but if I copy files to the external microSD via Windows Explorer then the copy command times out and the unit locks up and shuts itself down. At first I thought something internally was timing out so I tried copying files individually and scrolling the wheel to keep the screen on but after 6 files it still locked up.

This is kind of a critical bug since it affects some major functionality of the unit in my opinion. If there are log files or screenshots I can send please advise. Thanks.
Comment by MichaelGiacomelli (saratoga) - Tuesday, 24 July 2012, 14:54 GMT
>This is kind of a critical bug since it affects some major functionality of the unit in my opinion. If there are log files or screenshots I can send please advise.

There is. See my post immediately above yours.
Comment by orestes naraj (orestes) - Wednesday, 01 August 2012, 03:26 GMT
>I realize its been a year since I asked, but I'd still like to see this bug fixed. If someone who has a player that has USB stability issues could test and figure out which between 3.8 and 3.9 causes this problem, it would greatly improve the chance of this getting fixed.

I've been experiencing this bug since upgrading to 3.8.
It still exists in the current version (3.11.2)
Comment by steve (stevodevo) - Thursday, 02 August 2012, 18:50 GMT
I just downgraded my wife's sansa fuze v1 to RB 3.8.1 from the archives and it seems much better. I was able to copy several large groups of files as well as a fairly large folder (MB-wise) without a crash. It did lockup once but that was before a reboot from the rollback. On a side note, where do you go to get old daily builds? I found the archives but they seem to only be release versions (3.7, 3.8, 3.8.1, etc.)...
Comment by MichaelGiacomelli (saratoga) - Thursday, 02 August 2012, 19:55 GMT
> On a side note, where do you go to get old daily builds? I found the archives but they seem to only be release versions (3.7, 3.8, 3.8.1, etc.)...

I doubt you'll find any binaries from that long ago; most are only stored for a month or so given the sheer size they take up. Instead, you'll have to compile them from source:

http://www.rockbox.org/wiki/DevelopmentGuide
Comment by Andreas Klein (Tyrell76) - Wednesday, 19 September 2012, 07:51 GMT
I have the same problem. If you don't want to downgrade, you can use the following workaround to copy files to the Fuze v1:

1. Boot into Recovery Mode by holding down << and then powering on. The Player has to be connected with your computer.
2. After the Fuze has started it will mount as USB Drive and you can access the drive.
3. Copy your new files to the drive as usual. The funny thing is: when I started the Fuze with Rockbox I had a transfer rate of about 2.5 MB/sec; when I booted the Fuze into recovery mode I had about 3.7 MB/sec.
4. After you have copied the files, unplug the cable and wait until it has rebuilt the Fuze's database; I guess that's the old Sansa DB ...
5. Power down and power on again
6. RB 3.11.2 will boot and you can access the new files.

So I guess the problem definitely comes from Rockbox itself; my OS is Win7 64-bit...

cheers
Comment by Alex (ChiefAlex) - Wednesday, 19 September 2012, 08:27 GMT
@Tyrell76:
This is not a recovery mode but the normal SanDisk firmware ;)

Also, you can abort the database rebuild: just hold the OFF switch for 15 seconds. The Fuze will turn off, and Rockbox will load when you power it on again.
Comment by Jon Dowland (jmtd) - Wednesday, 03 October 2012, 13:21 GMT
Hi - I've begun to experience this issue. I've started a git bisect, starting with v3.8.1-final.
Comment by Jon Dowland (jmtd) - Wednesday, 03 October 2012, 13:56 GMT
OK I've hit this issue with v3.8.1-final. However I did not reflash the bootloader, and I wonder if the bug is hiding in there.
OK, I've hit it with the archived 3.8.1 build, too. At least I know I'm building the right ones :)
Comment by Jon Dowland (jmtd) - Wednesday, 03 October 2012, 14:14 GMT
OK 3.8 seems good, my 711M test write which failed consistently with later versions worked. I'll try a lot more writing before I 'git bisect good', but it looks like something between 3.8 and 3.8.1, and d1fdb485b12676dc99db29ee9f9900c9c752c124 looks particularly suspicious ("Backport several AMSv2 sd fixes/cleanups for the 3.8.1 branch"): despite AMSv2 in the commit message it touches firmware/target/arm/as3525/sd-as3525v2.c which I think is relevant for v1 too.
Comment by MichaelGiacomelli (saratoga) - Wednesday, 03 October 2012, 14:52 GMT
3.8.1 is actually just 3.8 with a few bug fixes, and I don't think any of them involve files related to the Fuze V1. It's not from the development branch. Basically, Rockbox has numbered release branches and a development branch. 3.x is forked from the development branch, and then any point releases 3.x.x are made in that branch. If 3.8 works, 3.8.1 should too, since they are nearly the same thing. If you have trouble with 3.8.1, I would double check that 3.8 is really OK.

I don't think the changes to sd-as3525v2.c will have any impact on the Fuze v1. That file isn't compiled for AMSv1 targets, so changes to it shouldn't do anything (you can check which files are built for which target in firmware/SOURCES).

How sure are you that 3.8 works and 3.8.1 does not?
Comment by Jon Dowland (jmtd) - Wednesday, 03 October 2012, 16:37 GMT
OK, now running d1fdb485b12676d^ (i.e. parent of the commit I suspect is responsible) and have managed to put a few 100M on OK. I'll do some more intensive I/O to be sure.
Comment by Jon Dowland (jmtd) - Thursday, 04 October 2012, 06:36 GMT
Hi Michael, thanks for your reply. I jumped to conclusions about sd-as3525v2, and indeed d1fdb485b12676d^ suffers this problem after all. Re testing 3.8: I wrote on the order of 700-800M of data via USB over a period of about 30 minutes, on a Linux host via nautilus. The same write load triggered the problem on 3.8.1, d1fdb485b12 and d1fdb485b12^ very reliably. I'll do more testing on 3.8 to build confidence that it's OK.
Comment by Jon Dowland (jmtd) - Thursday, 04 October 2012, 16:18 GMT
Over the course of today I've had the fuzev1 running 3.8 connected, played several hours of music and wrote/re-organised several 100Ms (moving stuff between the internal and SD storage as well as my laptop), absolutely no problems. I'll keep doing that over the weekend.
Comment by Eric Shattow (lucent) - Thursday, 22 November 2012, 01:33 GMT
A big "me too" with Fuze v1 8GB and Fuze v1 4GB. Tested internal storage only. Works with no errors in 3.8.1; breaks in 3.9 / 3.12 / daily build 3a39f77.
Test Procedure:
1. Sansa firmware (V01.02.31A) format function, USB Mode to MSC
2. Connect to Ubuntu 12.10 host, mount filesystem
3. unzip rockbox.zip -d /mount/point
4. sync, unmount, disconnect and wait for Firmware upgrade in progress to finish and power off
5. Connect to Ubuntu 12.10 host, device powers on and boots Rockbox in USB mode, mount filesystem
6. Initiate file copy from host to device
Problem: After some minutes the host dmesg says
[754691.152259] usb 2-4: reset high-speed USB device number 52 using ehci_hcd
[754706.264258] usb 2-4: device descriptor read/64, error -110
and transfer fails. Device resets on button press, or shows garbled / frozen display. Device internal filesystem contains errors like .rockbox/config sharing data with transferred file copy data.
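
To spot this pattern across many test runs, the kernel log can be scanned for USB resets and error codes. This is a small sketch that only handles lines shaped like the two quoted above; the regular expressions and field names are assumptions, not a general dmesg parser.

```python
import re

# Matches lines such as "[754691.152259] usb 2-4: <message>"
USB_EVENT = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+usb (?P<port>[\d.-]+): (?P<message>.*)$"
)
ERROR_CODE = re.compile(r"error (-\d+)")


def parse_usb_events(log_text):
    """Extract timestamp, port, reset flag and error code from USB log lines."""
    events = []
    for line in log_text.splitlines():
        match = USB_EVENT.match(line.strip())
        if not match:
            continue
        message = match.group("message")
        code = ERROR_CODE.search(message)
        events.append({
            "time": float(match.group("time")),
            "port": match.group("port"),
            "reset": "reset" in message,
            # On Linux, errno 110 is ETIMEDOUT, so -110 indicates a timeout.
            "error": int(code.group(1)) if code else None,
        })
    return events


SAMPLE = """\
[754691.152259] usb 2-4: reset high-speed USB device number 52 using ehci_hcd
[754706.264258] usb 2-4: device descriptor read/64, error -110
"""

for event in parse_usb_events(SAMPLE):
    print(event)
```

In the sample, the reset and the timeout are about 15 seconds apart on the same port.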
Comment by Eric Shattow (lucent) - Friday, 23 November 2012, 03:53 GMT
After dozens of tests, I have a 100% hit rate on my file copy data to trigger the bug.
I did git bisect (initial good is ef3ec72 and bad is 0e8e166). Here's what I found w/ git bisect:
The merge base 48b1a2d39d1678c0dfa7b2271c29c52b6c8169d0 is bad.
This means the bug has been fixed between 48b1a2d39d1678c0dfa7b2271c29c52b6c8169d0 and [ef3ec72bfbbb5fa98ed07cdb6666a546b7fd3be0].
Comment by Eric Shattow (lucent) - Friday, 23 November 2012, 07:02 GMT
Updated: Now that I sort of know what I'm doing with git bisect, I have a better answer.
9cc0dab3ced1f0c9205f6cba4933096eca915157 is the first bad commit
commit 9cc0dab3ced1f0c9205f6cba4933096eca915157
Author: Amaury Pouly <pamaury@rockbox.org>
Date: Mon Jul 4 21:55:56 2011 +0000
elftosb: remove duplicate code, merge two redundant fields
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@30123 a1c6a512-1295-4272-9138-f99709370657
:040000 040000 d2dc67c2279954d462de75de05e521e17cd1ad90 3c52d3e8be5e5af0d698039c97d83b122836e974 M utils

I did git bisect this time from git master with 3a39f77 bad ; f9a6bde good. After a lot of compiling and testing, and my 100% hit rate on the bug with my file copy data, I'm confident that this is where to start looking.
Comment by Eric Shattow (lucent) - Friday, 23 November 2012, 07:58 GMT
Diff against git master today 3a39f77ed60df55abf1cf069a903d4665fb17cbc to revert old commit 9cc0dab3ced1f0c9205f6cba4933096eca915157.

Tested on a Fuze V1 4GB, and it fixes the issue for me. Please test. Why would this have any effect on USB mode?
Comment by Frank Gevaerts (fg) - Friday, 23 November 2012, 09:56 GMT
That shouldn't make any difference. I just built 3a39f77 with and without that patch (with "make VERSION=test" to make sure the compiled-in version numbers are the same), and the resulting binaries are identical. Something else must be going on.
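
A check like this can be repeated by hashing the two build outputs and comparing the digests. A minimal sketch; the function names and the example file names in the note below are assumptions, and both builds are expected to use the same VERSION string as described above.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(path_a, path_b):
    """True if both files have identical contents (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, same_binary("build-with-patch/rockbox.sansa", "build-without/rockbox.sansa") would return True for byte-identical firmware images; those paths are placeholders.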
Comment by MichaelGiacomelli (saratoga) - Saturday, 24 November 2012, 02:27 GMT
Perhaps you're just off by a few? There are some AMS commits right around 9cc0dab3ced1f0c9205f6cba4933096eca915157.
Comment by Eric Shattow (lucent) - Saturday, 01 December 2012, 05:05 GMT
Agreed, this git bisect process does not work if I am triggering the failure by random chance.

How to reliably detect if there is a problem?
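
One way to reason about this is to ask how many clean test runs are needed before a build can be called good. The sketch below assumes that a bad build fails any single run with a fixed, independent probability and that a good build never fails; the probabilities used are illustrative, not measured.

```python
import math


def passes_needed(p_fail_per_run, max_false_good):
    """Clean runs required so that a bad build slips through as 'good'
    with probability at most max_false_good.

    A bad build survives n runs with probability (1 - p)^n, so we need
    the smallest n with (1 - p)^n <= max_false_good.
    """
    if not (0.0 < p_fail_per_run < 1.0 and 0.0 < max_false_good < 1.0):
        raise ValueError("both probabilities must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1.0 - p_fail_per_run))


def bisect_trust(steps, false_good_per_step):
    """Lower bound on the chance that every step of a bisect was judged correctly."""
    return (1.0 - false_good_per_step) ** steps


# If a bad build only fails half of the time, 7 clean runs are needed
# to get the false-good risk under 1% for a single commit.
print(passes_needed(0.5, 0.01))         # prints 7
print(round(bisect_trust(10, 0.01), 3))  # prints 0.904
```

With a single run per commit and a failure that only shows up some of the time, a ten-step bisect is quite likely to end on the wrong commit.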
Comment by Eric Shattow (lucent) - Sunday, 02 December 2012, 10:11 GMT
more extensive process to induce failure mode:
1. format with OF
2. use OF MSC mode usb to mount and unzip rockbox.zip
3. sudo badblocks -c 1 -n -o /tmp/sansascan.txt -s /dev/sdX

In my case I tried this, and it seems to induce the failure repeatably with Rockbox 3.8. However, Rockbox 3.7 (449838b) works fine, with no errors. I can't git bisect from 3.7 to 3.8, though.

So... it might be total chance that 3.8 ever works, but no problems with 3.7 yet.
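
The badblocks step can be repeated several times per build so that a single lucky pass does not count as "good". This is a sketch under assumptions: it reuses the flags from the command above, treats a nonzero exit status or a non-empty bad-block list as a failure, and the function names and the injectable run parameter are hypothetical.

```python
import os
import subprocess
import tempfile


def badblocks_command(device, out_path):
    """Same flags as the command in the procedure above."""
    return ["sudo", "badblocks", "-c", "1", "-n", "-o", out_path, "-s", device]


def survives_badblocks(device, passes=3, run=subprocess.run):
    """Run badblocks `passes` times; return False on the first failed pass.

    Assumption: a pass failed if badblocks exits nonzero or writes
    anything to its bad-block list file.
    """
    for _ in range(passes):
        with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
            out_path = tmp.name
        try:
            result = run(badblocks_command(device, out_path))
            failed = result.returncode != 0 or os.path.getsize(out_path) > 0
        finally:
            os.unlink(out_path)
        if failed:
            return False
    return True
```

survives_badblocks("/dev/sdX", passes=3) would then give one verdict per build; /dev/sdX is the same placeholder as above and must be replaced with the player's actual device node.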
Comment by Eric Shattow (lucent) - Sunday, 02 December 2012, 23:51 GMT
latest attempt to git bisect between 42b7d84 (good), and 149fb18 (bad). Used badblocks as noted above to trigger failure mode.

935b6c63d774bddb01d33b38985eb091b8232ebe is the first bad commit
commit 935b6c63d774bddb01d33b38985eb091b8232ebe
Author: Magnus Holmgren <magnushol@gmail.com>
Date: Sun Nov 21 15:27:36 2010 +0000
Backport fix for  FS#11696  to 3.7 branch: Scrollwheel doesn't respond in some cases.

Any chance of this being related?
Comment by Jon Dowland (jmtd) - Friday, 04 January 2013, 12:01 GMT
Hi - I'm picking this up again after ignoring it for a while. My problem was that I was using the fuzev1 as my primary PMP with my main music store on a 64G microsd, so testing this was causing me grief, and I was worried about causing corruption to my music library, which would be a hassle to resolve. Anyhow, I've moved my main music to a fuzev2 for the time being and put a dedicated test 2G microsd into my fuzev1, so it can corrupt to its heart's content.

I've done some preliminary re-testing with the new microsd and badblocks: One pass of badblocks on v3.8-final succeeded without problems. I tested a known-bad commit from my previous bisect (57e272cf5db) and one pass failed. Last night I tested git HEAD (c500f4efe521) out of curiosity and one pass of badblocks succeeded… but a second pass just failed! However, the first pass was connected to a USB port direct, and the second via a hub. I wonder if it could be a USB1 vs USB2 issue.

I'm going to reflash v3.8-final and run a load more badblocks tests, then some file tests if they all come out ok, with local USB and via a hub to try and ensure that it's fine for my HW at least.
Comment by Jon Dowland (jmtd) - Monday, 07 January 2013, 10:21 GMT
OK, I've hit this a couple of times now with v3.8-final. I haven't actually ever run a Rockbox build on this unit that has not suffered the problem, but I'll test OF and 3.7.1 based on other people's comments.
Comment by Jon Dowland (jmtd) - Thursday, 10 January 2013, 16:41 GMT
OF is fine, and 3.7-final is fine so far (it has passed 3 badblocks tests). Interestingly, the first common parent commit of v3.7-final and 3.8 and later fails for me, which suggests that the problem was fixed in the 3.7 branch but not in trunk. There are 28 commits in v3.7-final's history that aren't in master's and are worth looking at. I haven't started to look at any yet, but 4c6425437f8d068618f57789cd276f267ea57bdc popped out at me:

"Revert r28000 on the 3.7 release branch, as for as yet unknow reasons it
causes playback issues on a small number of players."

In particular it was never reverted on the trunk nor in the 3.8 branch.

I'm going to continue bisecting between v3.7-final and v3.8 for now.
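
The set of commits that are in v3.7-final's history but not in master's is what "git log master..v3.7-final" lists. The toy model below shows the underlying set difference on a made-up five-commit graph; the commit names and the graph are hypothetical and only illustrate the idea.

```python
def ancestors(graph, tip):
    """All commits reachable from `tip` (including itself) via parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(graph.get(commit, ()))
    return seen


def only_in(graph, branch_tip, other_tip):
    """Commits reachable from branch_tip but not from other_tip."""
    return ancestors(graph, branch_tip) - ancestors(graph, other_tip)


# Hypothetical history: "base" is the common parent, m* are on master,
# r* are on the release branch. Each commit maps to its list of parents.
TOY_GRAPH = {
    "base": [],
    "m1": ["base"],
    "m2": ["m1"],
    "r1": ["base"],
    "r2": ["r1"],
}

print(sorted(only_in(TOY_GRAPH, "r2", "m2")))  # prints ['r1', 'r2']
```

Any fix that exists only on the release branch has to be among those commits, which is why they are the ones worth testing.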
Comment by Jon Dowland (jmtd) - Tuesday, 22 January 2013, 11:16 GMT
In case anyone is curious, I'm keeping track of my testing in the following google docs spreadsheet: https://docs.google.com/spreadsheet/ccc?key=0Al0NtsRKNZQ1dFpVMDZUbG9TRUdrY0NZOGN1R0J4cUE
Comment by Orbidia (orbidia) - Sunday, 27 January 2013, 01:25 GMT
""Revert r28000 on the 3.7 release branch, as for as yet unknow reasons it
causes playback issues on a small number of players."
In particular it was never reverted on the trunk nor in the 3.8 branch."

So if that is the problem, can that particular piece of code still be reverted in the latest builds and retested?
Comment by Jon Dowland (jmtd) - Monday, 28 January 2013, 13:46 GMT
Hi Orbidia,

Yes, if we had identified that commit as being the problem, then trying to undo it on the master branch might be a thing to try.

Unfortunately that isn't the "bad" commit. This isn't a normal regression: In effect we're looking for a "good" commit; the main branch has never worked and the 3.7 branch started to work at some point, so we're looking for a single commit that fixed the problem rather than caused it. Normal regression techniques don't work as a result.

I've kept my spreadsheet updated with my testing, but here's an alternative view: this one is the output of "git log" for v3.7-final, descending, coloured according to my test results. https://docs.google.com/spreadsheet/ccc?key=0Al0NtsRKNZQ1dG5FMUxfTUUzaGx2NFdtN3NNWkluaWc

The problem is there's no clear reason why the first three commits are good and the rest bad. The implication is that 90fafbea fixed the problem, but that patch *was* applied in master.

At this point I'm going to try more tests on 90fafbea to increase the confidence that it is indeed good; then, when it's indisputable, hopefully we can start to figure out why.
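
Searching for the commit that fixed a problem is the same binary search as a normal bisect with the labels swapped; recent Git versions allow this with "git bisect start --term-old=broken --term-new=fixed". Below is a sketch of the search itself on a linear list of commits. It assumes the oldest commit is broken, the newest is fixed, and the state changes exactly once; the commit names are made up.

```python
def first_fixed(commits, is_fixed):
    """Find the first fixed commit in a list ordered oldest to newest.

    Assumptions: commits[0] is broken, commits[-1] is fixed, and once a
    commit is fixed every later commit is fixed too.
    Returns (commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1  # lo: known broken, hi: known fixed
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_fixed(commits[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid  # the fix is after mid
    return commits[hi], tests


# Hypothetical history a..h in which everything from "f" onwards works.
history = list("abcdefgh")
print(first_fixed(history, lambda commit: commit >= "f"))  # prints ('f', 3)
```

The search only helps if each verdict is reliable, so every tested commit needs enough clean runs before it is labelled as fixed.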
Comment by Orbidia (orbidia) - Saturday, 02 February 2013, 19:21 GMT
Thanks for the response. It sounds like you're narrowing it down. Hopefully, it gets straightened out at some point.

I really appreciate the effort and I'm sure everyone else that responded here does too!
Comment by will f (willf) - Saturday, 16 November 2013, 09:55 GMT
So, am I right in thinking that the problem hasn't been seen since commit 5375025a05b0a0fea988f6f0de9531dd (2/27/2012)?
Comment by Eric Shattow (lucent) - Saturday, 16 November 2013, 17:21 GMT
> So, am I right in thinking that the problem hasn't been seen since commit 5375025a05b0a0fea988f6f0de9531dd (2/27/2012)?

No. Filesystem corruption and device lock-up would happen eventually even on "good" commits discussed in this report.

This bug needs to be looked at by someone with reverse engineering skills to compare what the OF does against what the compiled output of the current Rockbox code does.
