FS#5923 - Ignore "The " in front of Band/Artist names in id3-database - and more

(Using CVS from Sept 29th on iAudio X5:) In my database I have 30 entries starting with "The ". That's quite annoying, and there is a standard of ignoring "The " when sorting.
Note that there is a band called "The The" which may result in a bug, so make sure it just applies to the first "The" if there are several.
I would also like an option to manually add other "prefixes" that will be ignored, so that e.g. "Neil Young" is set to ignore "Neil", and shows up at "Y".
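To make the request concrete, here is a minimal sketch of such a sort key, assuming a user-editable prefix list. The function name `sort_key` and the `prefixes` parameter are made up for illustration and are not part of Rockbox. Only one leading prefix is removed, so "The The" still sorts under "T".

```python
def sort_key(name, prefixes=("The ",)):
    """Return a sort key with at most one leading prefix removed."""
    for prefix in prefixes:
        # Match case-insensitively, strip only the first matching prefix,
        # and only when something is left over afterwards.
        if name.lower().startswith(prefix.lower()) and len(name) > len(prefix):
            return name[len(prefix):].casefold()
    return name.casefold()


artists = ["The The", "The Band", "Neil Young", "Beck"]
# Hypothetical user-defined prefix list: ignore "The " and "Neil ".
ordered = sorted(artists, key=lambda n: sort_key(n, prefixes=("The ", "Neil ")))
print(ordered)
```

The displayed name is untouched; only the key used for ordering changes.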
Furthermore, it would be nice if users could set an option to group the database by first letter: a list showing "A - B - C" and so on, where clicking "C" shows a list containing only bands that start with C. With 300 artists, this could make selecting artists more efficient. (Especially on a big HDD-player)
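The grouping could look roughly like the sketch below. The function name `group_by_initial` and the "#" bucket for names that do not start with a letter are assumptions for illustration, not existing database behaviour.

```python
from collections import defaultdict


def group_by_initial(artists):
    """Map each first letter to the alphabetically sorted artists under it."""
    groups = defaultdict(list)
    for name in sorted(artists, key=str.casefold):
        first = name[:1].upper()
        # Anything not starting with a letter goes into a shared "#" bucket.
        bucket = first if first.isalpha() else "#"
        groups[bucket].append(name)
    return dict(groups)


index = group_by_initial(["Cake", "Beck", "Can", "3 Doors Down", "Air"])
print(sorted(index))   # the top-level "A - B - C" list
print(index["C"])      # what selecting "C" would show
```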
Comment by chris baldwin (putty182) - Sunday, 03 September 2006, 11:36 GMT
I like the 'the' thing - just don't delete the 'the' altogether.
But as for the A-B-C thing, can't you do this already by making your own views? E.g. make one for each letter of the alphabet that only shows artists starting with that letter? I thought it was possible, I just never could be bothered to do it myself.

I would check, except my iPod no longer exists. Maybe I'll buy an iAudio instead, when I can afford it.
Comment by Christoph Päper (crissov) - Sunday, 03 September 2006, 22:23 GMT
I commented on that issue on a now closed previous request: http://www.rockbox.org/tracker/task/2554

Besides “The The” don’t forget to consider bands like “A” and “Da Muttz” (and, while we’re at the issue of collation, “3 Doors Down”), and internationalisation. Automating i18n is problematic, though, because for example one of the German articles to be ignored is “die”, which is an English word, too; compare the band names “Die Happy” and “Die Ärzte”. That, and people who want to sort real-name solo artists by their surnames, is the reason for the inclusion of special fields for sorting purposes in several tag standards. They are still undersupported, sadly.
Comment by Ketil W Aanensen (kwaanens) - Monday, 04 September 2006, 09:31 GMT
1: No, "The " shouldn't be deleted. It can either be moved to the end, as in "Band, The", or "The Band" can simply be listed under "B".
2: No, you can't. Not when you're using the id3tag-database. At least I have no idea how.
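For point 1, the "Band, The" form could be produced along these lines. This is only a sketch; `display_name` is a hypothetical helper, not something in the Rockbox sources.

```python
def display_name(name, article="The"):
    """Move a leading article to the end: "The Band" -> "Band, The"."""
    head, sep, rest = name.partition(" ")
    # Only rewrite when the first word is the article and more text follows.
    if sep and rest and head.lower() == article.lower():
        return f"{rest}, {head}"
    return name


print(display_name("The Band"))
print(display_name("The The"))
print(display_name("Beck"))
```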

Christoph: I agree there are a lot of possible prefixes ("...And you will know us by the trail of dead" is another example), that's why I suggested having the option to manually make a list of prefixes, so that users can choose for themselves. (Getting sort of off-topic, if you mean that "3 doors down" should be sorted on "D", I disagree. There has long been a standard that numbers are sorted, and go before letters. Where would you otherwise put a band called "35007"? Come to think of it, where do we place a band called "!!!"?)

- Ketil
Comment by Christoph Päper (crissov) - Monday, 04 September 2006, 14:40 GMT
IMHO “3 Doors Down” should be sorted after “Threat Level 5” and before “The Thrills”, because history shows that numbers are sometimes spelt out and sometimes they are not, e.g. on album covers, even if this particular band’s marketing is consistent. “35007” might be put before “A”, though, or sorted as “Loose”, but no maintainable automatism would do that.
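A rough sketch of that ordering, assuming only single digits are spelt out and only English "the" is dropped. The names `DIGIT_WORDS` and `collation_key` are invented for this example. As said, "35007" is not handled specially and simply ends up before the letters.

```python
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def collation_key(name):
    words = name.casefold().split()
    # Drop one leading "the", but keep "The The" sortable as "the".
    if len(words) > 1 and words[0] == "the":
        words = words[1:]
    # Spell out words that are a single digit: "3" -> "three".
    words = [DIGIT_WORDS.get(word, word) for word in words]
    return " ".join(words)


bands = ["The Thrills", "3 Doors Down", "Threat Level 5", "35007"]
ordered = sorted(bands, key=collation_key)
print(ordered)
```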

I still think tags are better than a manually edited list o’ magic, because that “Die” example actually occurs in my audio library.

Concerning “!!!”, what does http://www.unicode.org/unicode/reports/tr10/ say? Does the UCA define a language-independent “simply ignore diacritics” mode (i.e. Ä=A, not Ä=Ae nor Ä>A or Ä>Z)? Is it implemented/-able in Rockbox at all?
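For the Ä=A case, one language-independent approximation is to decompose each character and drop the combining marks. The sketch below uses Python's standard `unicodedata` module. It is not an implementation of the UCA, and `strip_diacritics` is just an illustrative name.

```python
import unicodedata


def strip_diacritics(text):
    """Fold accented letters to their base letter, e.g. "Ä" -> "A"."""
    # NFD splits "Ä" into "A" plus a combining diaeresis (U+0308).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


names = ["Zebra", "Ärzte", "Abba"]
ordered = sorted(names, key=lambda s: strip_diacritics(s).casefold())
print(ordered)
```

Letters that have no decomposition, such as "ø" or "ß", pass through unchanged, so this covers only part of what a real collation table would do.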
Comment by Ketil W Aanensen (kwaanens) - Tuesday, 05 September 2006, 08:49 GMT
In order to do what I propose it could be worth checking out http://amarok.kde.org/. Amarok ignores "The" and handles id3tags very nicely.
Of course there will be trouble with "Die " and things, but (and no offence to German-speaking people) the "Die"-problem is marginal for most users. (And not sorting "Die" is hardly an argument against sorting "The".)
On another note, Amarok also sorts albums featuring various artists under "Various artists", even though the tracks are all tagged with different band names. I would like to be able to select a various-artists album both from "Artist" > "Various artists" *as well as* from the specific band names.

Concerning "!!!" it is currently sorted before numbers and letters on Rockbox, and that's probably where it should be, as it was merely a joke :)

- Ketil
Comment by Christoph Päper (crissov) - Tuesday, 05 September 2006, 22:59 GMT
I’m not saying the “die” case wasn’t marginal for many users (and I don’t know any Latin-script language which has the word “the”, besides English of course), but it shows that voodoo only works sometimes. Where’s my rubber chicken?
In conclusion the ID3v2.4 tags (TSOP, TSOT, TSOA) should be supported, too, and preferred when existing. According to http://age.hobba.nl/audio/tag_frame_reference.html Matroska uses a different approach for this, but APEv2 and Ogg Vorbis don’t feature a separate sort key. (JFTR, I don’t know which non-MPEG-1 audio formats and tags thereof Rockbox currently supports, because I’m an AJR user.)
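The "preferred when existing" rule might be sketched as follows. The tags are modelled as a plain dict from ID3v2 frame ID to text, which is an assumption for illustration and not how Rockbox stores metadata. TPE1 is the regular performer frame, TSOP its sort-order counterpart.

```python
def artist_sort_key(tags):
    """Use TSOP when present and non-empty, otherwise fall back to TPE1."""
    value = tags.get("TSOP") or tags.get("TPE1") or ""
    return value.casefold()


tracks = [
    {"TPE1": "Neil Young", "TSOP": "Young, Neil"},
    {"TPE1": "Die Happy"},                    # no sort tag: stays under "D"
    {"TPE1": "The Band", "TSOP": "Band"},
]
ordered = [t["TPE1"] for t in sorted(tracks, key=artist_sort_key)]
print(ordered)
```

No guessing about "die" or "the" is needed here; a file without a sort tag is simply sorted by the name as written.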

Concerning VA, I like the Foobar2000 community’s way: http://wiki.hydrogenaudio.org/?title=Foobar2000:Encouraged_Tag_Standards I don’t know how exactly Amarok works here.