Rockbox mail archive
Subject: Re: how is strnatcmp aka "Interpret numbers while sorting" supposed to sort?
From: Bryan VanDyke <bryan.vandyke_at_gmail.com>
Date: Thu, 19 Mar 2009 14:24:54 -0400

codemonkey wrote:
> Are you guys aware that there's a quasi-standard regarding this in
> the GNU libraries? See the following excerpt from Fedora "info ls"
> and "man strverscmp".
>
> ~ray
>
> PS: I've found that "ls -v" works well for sorting MP3s with track
> numbering, etc. I don't know if it handles all of the cases described in
> this thread though. Maybe GNU's implementation is worth borrowing for
> rockbox?
>
> ------
>
> $ info ls
>
> (...excerpt...)
>
> 10.1.4 More details about version sort
> --------------------------------------
>
> The version sort takes into account the fact that file names frequently
> include indices or version numbers. Standard sorting functions usually
> do not produce the ordering that people expect because comparisons are
> made on a character-by-character basis. The version sort addresses
> this problem, and is especially useful when browsing directories that
> contain many files with indices/version numbers in their names:
>
>     $ ls -1            $ ls -1v
>     foo.zml-1.gz       foo.zml-1.gz
>     foo.zml-100.gz     foo.zml-2.gz
>     foo.zml-12.gz      foo.zml-6.gz
>     foo.zml-13.gz      foo.zml-12.gz
>     foo.zml-2.gz       foo.zml-13.gz
>     foo.zml-25.gz      foo.zml-25.gz
>     foo.zml-6.gz       foo.zml-100.gz
>
> Note also that numeric parts with leading zeros are considered as
> fractional one:
>
>     $ ls -1            $ ls -1v
>     abc-1.007.tgz      abc-1.007.tgz
>     abc-1.012b.tgz     abc-1.01a.tgz
>     abc-1.01a.tgz      abc-1.012b.tgz
>
> This functionality is implemented using the `strverscmp' function.
>
> ------
>
> $ man strverscmp
>
> STRVERSCMP(3)        Linux Programmer’s Manual        STRVERSCMP(3)
>
> NAME
>     strverscmp - compare two version strings
>
> SYNOPSIS
>     #define _GNU_SOURCE
>     #include <string.h>
>
>     int strverscmp(const char *s1, const char *s2);
>
> DESCRIPTION
>     Often one has files jan1, jan2, ..., jan9, jan10, ... and it feels
>     wrong when ls(1) orders them jan1, jan10, ..., jan2, ..., jan9. In
>     order to rectify this, GNU introduced the -v option to ls(1), which
>     is implemented using versionsort(3), which again uses strverscmp().
>
>     Thus, the task of strverscmp() is to compare two strings and find
>     the "right" order, while strcmp(3) only finds the lexicographic
>     order. This function does not use the locale category LC_COLLATE,
>     so is meant mostly for situations where the strings are expected
>     to be in ASCII.
>
>     What this function does is the following. If both strings are
>     equal, return 0. Otherwise find the position between two bytes
>     with the property that before it both strings are equal, while
>     directly after it there is a difference. Find the largest
>     consecutive digit strings containing (or starting at, or ending
>     at) this position. If one or both of these is empty, then return
>     what strcmp(3) would have returned (numerical ordering of byte
>     values). Otherwise, compare both digit strings numerically, where
>     digit strings with one or more leading zeroes are interpreted as
>     if they have a decimal point in front (so that in particular digit
>     strings with more leading zeroes come before digit strings with
>     fewer leading zeroes). Thus, the ordering is 000, 00, 01, 010, 09,
>     0, 1, 9, 10.
>
> RETURN VALUE
>     The strverscmp() function returns an integer less than, equal to,
>     or greater than zero if s1 is found, respectively, to be earlier
>     than, equal to, or later than s2.
>
> CONFORMING TO
>     This function is a GNU extension.
>
> SEE ALSO
>     rename(1), strcasecmp(3), strcmp(3), strcoll(3),
>     feature_test_macros(7)
>
> GNU                          2001-12-19                STRVERSCMP(3)

Seems very close. My understanding is natural sort would interpret as:
000, 00, 0, 01, 1, 09, 9, 010, 10.

Received on 2009-03-19
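
To make the difference between the two orderings concrete, here is a rough sketch of a comparator that gives the "natural" order listed in the reply (000, 00, 0, 01, 1, 09, 9, 010, 10) rather than the strverscmp() order from the man page. The names `natcmp_sketch` and `natcmp_qsort` are made up for this illustration; this is not Rockbox's strnatcmp and not the GNU code. It assumes digit runs are compared by integer value, and that when two runs have the same value the one with more leading zeros sorts first, which is only inferred from the order quoted in the reply.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compare two strings, treating every run of
 * digits as an integer instead of comparing it byte by byte. */
int natcmp_sketch(const char *a, const char *b)
{
    while (*a && *b) {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            const char *a0 = a, *b0 = b;      /* start of each digit run */

            /* Leading zeros do not change the integer value. */
            while (*a == '0') a++;
            while (*b == '0') b++;
            const char *as = a, *bs = b;      /* first significant digit */

            while (isdigit((unsigned char)*a)) a++;
            while (isdigit((unsigned char)*b)) b++;
            size_t alen = (size_t)(a - as), blen = (size_t)(b - bs);

            /* More significant digits means a larger number. */
            if (alen != blen)
                return alen < blen ? -1 : 1;

            /* Same length: the digits compare like ordinary text. */
            int d = memcmp(as, bs, alen);
            if (d != 0)
                return d < 0 ? -1 : 1;

            /* Same value. Assumption: more leading zeros sorts first. */
            size_t az = (size_t)(as - a0), bz = (size_t)(bs - b0);
            if (az != bz)
                return az > bz ? -1 : 1;

            continue;                         /* runs identical, move on */
        }

        /* Outside digit runs, fall back to plain byte comparison. */
        if (*a != *b)
            return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
        a++;
        b++;
    }

    /* The shorter string sorts first if one is a prefix of the other. */
    if (*a) return 1;
    if (*b) return -1;
    return 0;
}

/* Adapter so an array of char pointers can be sorted with qsort(). */
int natcmp_qsort(const void *pa, const void *pb)
{
    return natcmp_sketch(*(const char *const *)pa,
                         *(const char *const *)pb);
}
```

Calling `qsort(names, n, sizeof names[0], natcmp_qsort)` on the nine strings from the man page should give the order from the reply, whereas strverscmp() puts the zero-prefixed strings ahead of a bare 0, as the man page describes. The comparison is byte-based and ignores locale, like strverscmp().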