FS#5089 - Accented Characters in Playlists Don't Work?

Attached to Project: Rockbox
Opened by Jeremy (FlyingSaucrDude) - Tuesday, 11 April 2006, 00:32 GMT
Task Type: Bugs
Category: Playlists
Status: Closed
Assigned To: No-one
Operating System: Iriver H100 series
Severity: Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

I recently got Rockbox and put it on my H120. I copied a bunch of files over, including some playlists I have for an artist whose name has an accented "e" in it (google "Bela Fleck" and you'll see the proper spelling with the accent). Anyway, when I tried to load up a playlist that included a file with an accented "e" in it, it looked as if Rockbox couldn't find the files with the accents. (All other files were fine.) For example, Rockbox won't play any of the files in the attached "WorksOnPC_NotOnPlayer.m3u" playlist.

However, if I generate the playlist using Rockbox on the H120 itself, everything works fine -- and the playlist looks like the attached file "WorksOnPlayer_NotOnPC.m3u". However, that playlist doesn't work on my PC.

If you open them both up in notepad, they look identical. However, opening them up in XEmacs (or any other more sophisticated text editor, I suppose) showed the difference -- in the playlist that works on the PC, XEmacs shows the accented "e" properly. However, in the playlist that works on the player, the accented "e" shows up as some weird character (an A with a tilde on top, followed by a copyright symbol).

Since I just got Rockbox and have no idea how it works, I have no idea what's causing this...all I know is, it would be nice if Rockbox treated accented characters the same way as my PC, so I wouldn't have to hand-tune my playlists that have accented characters.

P.S. Thanks for an awesome piece of software! I'm gonna stick with Rockbox anyway...and if somebody can point me in the right direction, I'd be happy to lend a hand trying to fix this.

Closed by  Magnus Holmgren (learman)
Thursday, 30 November 2006, 19:24 GMT
Reason for closing:  Fixed
Additional comments about closing:  This should now be fixed. .m3u playlists are read using the default code page. However, if the file does begin with a BOM, it will be read as UTF-8, regardless of extension.
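
The rule in the closing note can be sketched as a small decision function. This is an illustrative Python sketch, not Rockbox's actual C code: the name `decode_playlist` and the `default_codepage` parameter are hypothetical, and treating every non-.m3u extension (such as .m3u8) as UTF-8 is an assumption drawn from the discussion in this task.

```python
import codecs

def decode_playlist(data: bytes, filename: str, default_codepage: str = "latin-1") -> str:
    """Decode playlist bytes following the rule from the closing note (sketch)."""
    # A leading BOM (EF BB BF) means UTF-8, regardless of the extension.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # Plain .m3u files are read using the default code page.
    if filename.lower().endswith(".m3u"):
        return data.decode(default_codepage)
    # Assumption: anything else (for example .m3u8) is UTF-8.
    return data.decode("utf-8")
```

With this sketch, a Latin-1 `.m3u`, a `.m3u` that starts with a BOM, and a BOM-less `.m3u8` all decode to the same file name.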
Comment by Jeremy (FlyingSaucrDude) - Tuesday, 11 April 2006, 00:36 GMT
Forgot to give more info...I'm using Rockbox version 060410-0517, if it helps.
Comment by Dominik Riebeling (bluebrother) - Tuesday, 11 April 2006, 07:55 GMT
The cause is really simple: Rockbox uses utf-8 as the character encoding when generating the playlist, while your PC uses latin1. When opening the Rockbox playlist with vim and switching to utf-8 everything looks fine.
So the simple solution would be to avoid extended characters (like your accented e) for filenames. This doesn't affect your files' metadata (Vorbis comments / ID3 tags), so the wps will show it correctly (assuming your files have proper metadata).

I haven't tried this myself but IMO the "default codepage" setting should be applied when generating playlists as well. I don't know if it already works like this but maybe changing this setting helps.
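
The mismatch described above can be reproduced in a few lines. This is a small Python illustration (the variable names are made up for the example): the accented "e" is one byte in latin1 but two bytes in utf-8, and reading those two bytes as latin1 gives the "A with a tilde followed by a copyright symbol" seen in the original report.

```python
# The artist name with the accented "e" written as an escape (U+00E9).
name = "B\u00e9la Fleck"

latin1_bytes = name.encode("latin-1")  # accented e is the single byte 0xE9
utf8_bytes = name.encode("utf-8")      # accented e is the two bytes 0xC3 0xA9

# Reading the UTF-8 bytes as Latin-1 turns 0xC3 0xA9 into two characters:
# U+00C3 (A with tilde) and U+00A9 (copyright sign).
misread = utf8_bytes.decode("latin-1")

# Going the other way fails outright: a lone 0xE9 is not valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```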
Comment by Jeremy (FlyingSaucrDude) - Tuesday, 11 April 2006, 09:48 GMT
What you said makes sense. I tried the "Default Codepage" setting, but it doesn't look like it's being applied when generating (or reading) playlists. I tried changing the "Default Codepage" setting to UTF-8 and to Latin 1, but in both cases it seemed to be reading and writing playlists in UTF-8.

I'm not sure whether getting playlist reading/writing to use the "Default Codepage" setting constitutes a bug or a feature request, but either way I agree -- it should be used. Plus, I've never been one for simple solutions...

I'm pretty busy at the moment, but perhaps when I get some free time I'll look into this...of course, if somebody who actually knows the code could find a quick fix, that would be great too.
Comment by Jeremy (FlyingSaucrDude) - Tuesday, 11 April 2006, 11:19 GMT
I tried delving into the code a little bit, but I couldn't make sense of most of it...it looks like it would take me weeks just to figure it out. If anybody has any pointers on where I can find the code that reads in the filenames from a playlist, that would be great. (Also, it probably wouldn't hurt to fix the text_viewer plugin as well -- it also seems to suffer from this problem.)
Comment by PY (py) - Wednesday, 19 April 2006, 06:49 GMT
Hi all!
I discovered Rockbox a week ago, now that it runs on my iPod mini 2g (20060416), and find it such a great project that I took some time testing this feature. I hope you developers can find it of interest and come up with something 'cause I can't wait to convince my friends here in Shanghai to use it, and international support is a priority for them.

The file includes all playlists tested and a text file with results and conclusions of all the tests made (almost 60). But for simpler reading, I've copied my suggestions for improvement below:

S1) Support for non-utf playlist using Default Codepage
S2) Choice of utf use [utf / codepage] when saving playlist and for all file operations
S3) Support for .m3u8 extension utf playlist (simple, for foobar2000 compatibility)
S4) Support for non-UNIX files by handling properly the first character / control code ??
S5) Choice of [Relative / Full path] when saving playlist
S6) Choice of text file format [DOS / UNIX] when saving playlist and for all file operations
Comment by PY (py) - Wednesday, 19 April 2006, 06:52 GMT
With the attachment...
Comment by PY (py) - Friday, 21 April 2006, 00:17 GMT
Comment on S4:

"UTF-8 files can sometimes start with what is called a byte-order mark (BOM for short), which is three bytes that are used to indicate that the file is indeed an UTF-8 file. Rockbox currently doesn't check for or remove any BOM, so if one is present, the first track would likely be skipped."
Thanks to Lear

http://forums.rockbox.org/index.php?topic=3444.15
Comment by Dominik Riebeling (bluebrother) - Friday, 21 April 2006, 07:05 GMT
To S3:
what exactly is .m3u8? m3u playlists in utf8 with or without BOM? If it's always without BOM and without a specified line end coding it really should be fairly simple.

To S4:
The BOM isn't specific to *nix, you can find it in "Windows utf8-files" as well. But as Windows usually uses an 8 bit encoding (like latin9 in Europe) you'll hardly find a) utf8 files, b) files with BOM.

To S6:
Why do we need DOS text file type for playlists? Can't Windows players (like foobar and winamp) handle the Unix text file type? The only program I know that doesn't handle unix text files is notepad.
Comment by PY (py) - Friday, 21 April 2006, 07:42 GMT
.m3u8 is the extension given by foobar2000 to a playlist saved in utf-8. It is otherwise identical to .m3u. Since foobar2000 seems popular with windows users, it would be convenient if Rockbox recognized .m3u8 as a playlist. UTF-8 files generated under windows always seem to insert the BOM (foobar2000, notepad, UltraEdit...) which currently has to be removed with a hex editor in order to avoid Rockbox skipping the 1st track.

UltraEdit was messing up the BOM (my mistake). After using a decent hex editor, I can say that UNIX and DOS playlists are handled similarly by all media players tested including Rockbox. However, without a BOM, most windows players fail to recognize UTF and playlists are often garbage. Only VLC reads BOM-less files fine. So S6 could be forgotten, but BOM not.


Comment by PY (py) - Friday, 21 April 2006, 08:10 GMT
Edit:
Actually, most windows players also fail to recognize UTF with BOM. Only foobar2000 and Media Player Classic don't complain. So even more important than Rockbox writing BOM or not, is to be able to write in ANSI (cf S2). As far as reading is concerned, support for playlists with BOM would still be very much appreciated.
Comment by Rabid Teddybear (Mr_Rabid_Teddybear) - Sunday, 23 April 2006, 14:34 GMT
To sum it up: m3u8 are m3u playlist files in UTF-8 encoding created by foobar2000. The files use DOS (CR/LF) line endings and contain a byte order mark (BOM) EF BB BF. Rockbox doesn't currently handle the BOM and skips the first line (track) in the file. A patch that tells Rockbox how to handle the BOM should be done.
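
A short Python illustration of why an unhandled BOM plausibly costs the first track. It is a sketch with made-up file names, not a description of Rockbox's parser: decoded as plain UTF-8, the three bytes EF BB BF become the invisible character U+FEFF glued to the front of the first entry, so that path no longer matches any file.

```python
import codecs

# A UTF-8 playlist as a Windows tool might save it: BOM (EF BB BF) plus CR/LF line endings.
raw = codecs.BOM_UTF8 + "B\u00e9la/track1.mp3\r\nB\u00e9la/track2.mp3\r\n".encode("utf-8")

# Plain UTF-8 decoding keeps the BOM as U+FEFF at the start of the first entry.
naive = raw.decode("utf-8").splitlines()

# Python's "utf-8-sig" codec drops a leading BOM if present and is harmless otherwise.
clean = raw.decode("utf-8-sig").splitlines()
```

Only the first entry differs between `naive` and `clean`, which matches the symptom that only the first track is affected.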
Comment by Dominik Riebeling (bluebrother) - Friday, 04 August 2006, 20:04 GMT
I've created a patch which skips a BOM on a playlist if any is present. See FS#5770.
Comment by Rani Hod (RaeNye) - Thursday, 28 September 2006, 16:21 GMT
Anything left here now that FS#5770 is in CVS?
Comment by Dominik Riebeling (bluebrother) - Thursday, 28 September 2006, 18:59 GMT
Yes: playlists are always treated as utf8 by Rockbox. IMO we should use the chosen encoding for playlists and move to using m3u8 as extension for our (utf8) playlists.
Comment by Michael Sevakis (MikeS) - Friday, 27 October 2006, 10:39 GMT
I think Rockbox should use UTF-8 exclusively internally and convert any non-Unicode material to UTF-8 and back if needed.