FS#11876 - Playlists Choke on non-ASCII Characters

Attached to Project: Rockbox
Opened by Martin Gallant (martyg) - Thursday, 13 January 2011, 19:20 GMT
Last edited by Nils Wallménius (nls) - Saturday, 12 February 2011, 17:26 GMT
Task Type Bugs
Category Battery/Charging
Status Closed
Assigned To No-one
Operating System All players
Severity Low
Priority Normal
Reported Version Release 3.7.1
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No


One of my podcast subscriptions occasionally embeds non-ASCII (Unicode?) characters in its filenames.

This is causing playlists created from those files to skip the affected files.

I am enclosing an example m3u file, created from MediaMonkey v1297.
(Notice the apostrophes on the first two file names)

Closed by  Nils Wallménius (nls)
Saturday, 12 February 2011, 17:26 GMT
Reason for closing:  Not a Bug
Additional comments about closing:  This is not a bug. When using m3u playlists with non-ASCII characters, the correct code page must be set; this is not something Rockbox can do itself by magic.
Comment by Nils Wallménius (nls) - Thursday, 13 January 2011, 20:37 GMT
It looks to me like that playlist was saved with a code page other than Unicode, so Rockbox will not find the matching filenames, since they differ.
Look for an option in your application for saving Unicode playlists; they are often named .m3u8.
Comment by Martin Gallant (martyg) - Friday, 14 January 2011, 19:08 GMT
I will take this up with my app vendor and the author of this feed. Thanks for looking at this. Please close.
Comment by Martin Gallant (martyg) - Saturday, 15 January 2011, 14:34 GMT
After further research, I believe this is a legitimate issue with how Rockbox handles Unicode embedded in file names.
(I apologize in advance for not being a Unicode expert)

I have uploaded a test case directory so we are all on the same page (10 MB).

If you play Playlists/Unicode.m3u (3 files)
you will see the middle file gets skipped in playback
This is because of the UTF-8 apostrophe (0xE28099) in the filename

Also, if you list the playlist from the WPS, you will see the second entry prefixed with (ERR)

Interestingly, if you save this playlist (e.g. as ./dynamic.m3u8),
you will see the apostrophe was translated to the UTF-8 byte sequence 0xC292 (code point U+0092)
But this playlist still doesn't work, presumably because the referenced file was not changed
(I think this is a second, separate issue from the above)
Comment by Magnus Holmgren (learman) - Sunday, 16 January 2011, 14:32 GMT
Try setting the default code page (in general settings, display settings) to "Central European (CP1250)" (or one of the other CP125x code pages, if they are more suitable for you). I suspect it is set to "Latin1 (ISO-8859-1)", which would explain the contents of dynamic.m3u8.
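The effect of the two code pages can be checked outside Rockbox. The following is a minimal Python sketch, not Rockbox code. It assumes the apostrophe is stored in the .m3u file as the single byte 0x92, which the thread does not state explicitly but which would be consistent with the 0xC292 seen in dynamic.m3u8.

```python
# Assumption: the playlist stores the apostrophe as the single byte 0x92,
# which is the right single quotation mark in the Windows CP125x code pages.
raw = b"\x92"

# Decoded as CP1250, the byte becomes U+2019 (right single quotation mark).
as_cp1250 = raw.decode("cp1250")

# Decoded as Latin1 (ISO-8859-1), the same byte becomes U+0092, a C1 control character.
as_latin1 = raw.decode("latin-1")

# Re-encode both results as UTF-8 and show the bytes in hex.
print(as_cp1250.encode("utf-8").hex())  # e28099, matches the filename on disk
print(as_latin1.encode("utf-8").hex())  # c292, matches what dynamic.m3u8 contains
```

Under that assumption, only the CP125x interpretation yields the same bytes as the filename, which would explain why the track is found after changing the setting.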
Comment by Martin Gallant (martyg) - Sunday, 16 January 2011, 18:49 GMT
Setting the codepage to CP1250 prevents the track from being skipped. Thanks for the workaround.

Note the characters used in the filename and playlist are consistent, so I don't think any user intervention should have been required.
I still think this is a legitimate bug, as there is no way even a sophisticated user could have figured out how to get this to work.

This embedded Unicode character was completely transparent to the other players I use.
I did not realize something was amiss until I tried playing this file through Rockbox.
Comment by Magnus Holmgren (learman) - Sunday, 16 January 2011, 21:07 GMT
This is the problem with code pages. You need to have a matching one, or things like this will happen. Having Rockbox try different code pages until there is a match doesn't seem like a very nice solution (and could possibly pick the wrong file, even if that is unlikely). Maybe the default code page should be different though (e.g., a Windows code page). The documentation could probably be better here too.

If the other players are Windows applications, they can use the current OS code page, and things will just work. Copy the playlist to a different computer or device, and things can fail. This is one important reason for the M3U8 format (where there isn't any code page to worry about). It seems like a better solution than adding special UTF-8 filename comments in the M3U file.
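If the application cannot save .m3u8 directly, an existing playlist can be re-encoded on the computer before copying it to the player. The sketch below is one possible way to do that in Python. The function name `m3u_to_m3u8` and the default of `cp1252` are assumptions for illustration; the correct source code page depends on the system that wrote the playlist.

```python
from pathlib import Path


def m3u_to_m3u8(src, source_codepage="cp1252"):
    """Re-encode a code-page .m3u playlist as a UTF-8 .m3u8 file next to it.

    Hypothetical helper: source_codepage must match whatever the
    application used when it saved the playlist.
    """
    src = Path(src)
    dst = src.with_suffix(".m3u8")
    # Work on raw bytes so that line endings are preserved exactly.
    text = src.read_bytes().decode(source_codepage)
    dst.write_bytes(text.encode("utf-8"))
    return dst
```

For example, `m3u_to_m3u8("Playlists/Unicode.m3u")` would write `Playlists/Unicode.m3u8`. Decoding raises `UnicodeDecodeError` if the file contains bytes that are undefined in the chosen code page, which is a useful hint that the wrong code page was picked.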