FS#9538 - tagnavi search condition operators: support for non-Latin alphabets

Attached to Project: Rockbox
Opened by Yoshihisa Uchida (Uchida) - Sunday, 09 November 2008, 07:54 GMT
Task Type Patches
Category Database
Status Unconfirmed
Assigned To No-one
Operating System All players
Severity Low
Priority Normal
Reported Version Daily build (which?)
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 0
Private No


The tagnavi search condition operators (e.g. =, >, ^, ...) do not return correct search results for characters outside the basic Latin alphabet.

For example:
1) artist ^ "A"
Neither "Ándre" nor "ándre" appears in the search results.

2) artist = "Ándre"
Neither "Andre" nor "andre" appears in the search results.

My patch file solves this problem.

It has not been tested enough yet, so the search results may still be
incorrect for some characters.
Please report any problems you find.

About performance
Searching is slower than before.
I will improve this in the future.

About patch
  Please run make zip (or make fullzip) after applying
the patch to the source.

  There is in .rockbox/codepages folder
when it is unzipped, and copy this file onto your player's
(The search result is not correct if there is no

About search result
1) Characters that differ only in case (uppercase, titlecase, lowercase)
   are treated as the same character.
eg. A = a, Ω = ω

2) A character with a diacritical mark (accent, umlaut, etc.)
is treated as the same as the character without the mark.
eg. A = Á

3) A ligature is treated as the sequence of its component characters.
  eg. Œ = OE

4) Japanese only: Hiragana and halfwidth Katakana are treated as the same characters as fullwidth Katakana.
eg. あ = ア, ｱ = ア
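
The four rules above can be sketched as a single "fold" function applied to both the tag and the search string before comparing. This is only an illustration in Python, not the code from the patch: fold_key, LIGATURES and the other helper names are hypothetical, and the patch itself uses a table file in .rockbox/codepages rather than Python's unicodedata module. Note that NFKC normalization also folds other compatibility characters (for example fullwidth Latin letters), which goes beyond the four rules.

```python
import unicodedata

# Hypothetical table for rule 3: ligatures that Unicode normalization
# does not split on its own.
LIGATURES = {"Œ": "OE", "œ": "oe", "Æ": "AE", "æ": "ae"}


def hiragana_to_katakana(ch):
    # Rule 4: Hiragana U+3041..U+3096 sit exactly 0x60 below Katakana.
    cp = ord(ch)
    if 0x3041 <= cp <= 0x3096:
        return chr(cp + 0x60)
    return ch


def strip_marks(ch):
    # Rule 2: drop combining marks, but leave kana alone so that
    # voiced kana such as ガ are not turned into カ.
    if "\u3040" <= ch <= "\u30ff":
        return ch
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fold_key(text):
    # Rule 3: split ligatures into their component letters.
    text = "".join(LIGATURES.get(ch, ch) for ch in text)
    # Rule 4: NFKC turns halfwidth Katakana into fullwidth Katakana.
    text = unicodedata.normalize("NFKC", text)
    text = "".join(hiragana_to_katakana(ch) for ch in text)
    # Rule 2: remove accents, umlauts and other diacritical marks.
    text = "".join(strip_marks(ch) for ch in text)
    # Rule 1: ignore case differences.
    return text.casefold()


def starts_with(tag, prefix):
    # Sketch of the tagnavi "^" operator on folded strings.
    return fold_key(tag).startswith(fold_key(prefix))


def equals(tag, value):
    # Sketch of the tagnavi "=" operator on folded strings.
    return fold_key(tag) == fold_key(value)
```

With this sketch, starts_with("Ándre", "A") and equals("andre", "Ándre") both hold, which corresponds to the two examples at the top of this task.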

Comment by Frank Gevaerts (fg) - Sunday, 09 November 2008, 15:39 GMT
Some comments:
- It's not clear to me that 1) is a problem, and I think that 2) is actually desired behaviour (i.e. if I explicitly ask for an accent, I want to get it)
- Such tables need to be language dependent for them to work properly in all cases (Have a look at e.g. I,i in Turkish, but I think that the problem is likely to be more general)
- How do you handle things like ue vs ü? Again, this is language dependent (I'm not called Gevärts)
- This is going to add a significant chunk to the core RAM usage

One complication with all the language dependence is that it's also not always clear whether it depends on the UI language or the tag language.
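
To illustrate the Turkish I,i point, here is a minimal Python sketch of language-dependent lowercasing. The CASE_OVERRIDES table and the lower_for_language function are hypothetical names used only for this example; they are not part of the patch or of Rockbox.

```python
# Hypothetical per-language overrides, applied before generic lowercasing.
# In Turkish, "I" lowercases to dotless "ı" and dotted "İ" lowercases to "i".
CASE_OVERRIDES = {
    "tr": {"I": "ı", "İ": "i"},
}


def lower_for_language(text, lang):
    # Languages without an entry fall back to the generic mapping.
    overrides = CASE_OVERRIDES.get(lang, {})
    return "".join(overrides.get(ch, ch) for ch in text).lower()
```

The same input gives different results: lower_for_language("ISPARTA", "tr") yields "ısparta", while any language without an override yields "isparta". A single language-independent table cannot produce both.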
Comment by Yoshihisa Uchida (Uchida) - Wednesday, 10 December 2008, 11:20 GMT
I am sorry for the very late answer.

I have updated my patch file.
- Memory is now used more efficiently.
- The user can now select the mapping rule.
Settings > General Settings > Database
A new menu, "Tag string mapping rule", has been added.
- "Ignore diacritial mark"
If you select "Yes", then à is treated as equal to a (à = a). (default)
If you select "No", then à != a.

- "Umlaut mapping mode"
If you select "ä -> ä", then it is considered that ä != a, ae, ë != e, ee, ï != i, ie, ü != u, ue, ö != o, oe.
If you select "ä -> a", then it is considered that ä = a, ë = e, ï = i, ü = u, ö = o. (default)
If you select "ä -> ae", then it is considered that ä = ae, ë = ee, ï = ie, ü = ue, ö = oe.
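
The three umlaut modes can be sketched as follows. This is a Python illustration only, not the patch code: map_umlauts, UMLAUT_BASE and the mode names "keep", "base" and "expand" are hypothetical stand-ins for the menu entries "ä -> ä", "ä -> a" and "ä -> ae".

```python
import unicodedata

# Base letters for the five umlaut characters listed above.
UMLAUT_BASE = {"ä": "a", "ë": "e", "ï": "i", "ü": "u", "ö": "o"}


def map_umlauts(text, mode):
    """Apply one mapping mode: 'keep' (ä -> ä), 'base' (ä -> a), 'expand' (ä -> ae)."""
    if mode not in ("keep", "base", "expand"):
        raise ValueError("unknown mode: " + mode)
    if mode == "keep":
        return text
    suffix = "e" if mode == "expand" else ""
    # Compose first so that "a" + combining diaeresis becomes a single "ä".
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in UMLAUT_BASE:
            repl = UMLAUT_BASE[lower] + suffix
            # Keep an uppercase umlaut uppercase: "Ü" -> "U" or "Ue".
            out.append(repl.capitalize() if ch != lower else repl)
        else:
            out.append(ch)
    return "".join(out)
```

For example, map_umlauts("Gevärts", "expand") gives "Gevaerts" and map_umlauts("Gevärts", "base") gives "Gevarts", so the two modes match different spellings of the same tag.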

I have not prepared a separate mapping rule for each language.
This is because users obtain music files tagged in the languages of various countries and may play all of them with Rockbox.

Per-language rules could grow into a large number of mapping rules. If that becomes necessary, I will think about another method.

Also, please let me know if you notice anything.
Comment by Yoshihisa Uchida (Uchida) - Friday, 10 July 2009, 14:52 GMT
Synced to r21743.
Comment by Yoshihisa Uchida (Uchida) - Friday, 10 July 2009, 14:54 GMT
Sorry, I sent the wrong patch file. Here is the correct patch file.