Rockbox mail archive
Subject: Re: how is strnatcmp aka "Interpret numbers while sorting" supposed to sort?
From: Bryan VanDyke <bryan.vandyke_at_gmail.com>
Date: Thu, 19 Mar 2009 14:24:54 -0400
> Are you guys aware that there's a quasi-standard regarding this in
> the GNU libraries? See the following excerpt from Fedora "info ls"
> and "man strverscmp".
> PS: I've found that "ls -v" works well for sorting MP3s with track
> numbering, etc. I don't know if it handles all of the cases described in
> this thread though. Maybe GNU's implementation is worth borrowing for
> $ info ls
> 10.1.4 More details about version sort
> The version sort takes into account the fact that file names frequently
> include indices or version numbers. Standard sorting functions usually
> do not produce the ordering that people expect because comparisons are
> made on a character-by-character basis. The version sort addresses
> this problem, and is especially useful when browsing directories that
> contain many files with indices/version numbers in their names:
> $ ls -1            $ ls -1v
> foo.zml-1.gz       foo.zml-1.gz
> foo.zml-100.gz     foo.zml-2.gz
> foo.zml-12.gz      foo.zml-6.gz
> foo.zml-13.gz      foo.zml-12.gz
> foo.zml-2.gz       foo.zml-13.gz
> foo.zml-25.gz      foo.zml-25.gz
> foo.zml-6.gz       foo.zml-100.gz
> Note also that numeric parts with leading zeros are considered as
> fractional one:
> $ ls -1            $ ls -1v
> abc-1.007.tgz      abc-1.007.tgz
> abc-1.012b.tgz     abc-1.01a.tgz
> abc-1.01a.tgz      abc-1.012b.tgz
> This functionality is implemented using the `strverscmp' function.
> $ man strverscmp
> STRVERSCMP(3) Linux Programmer’s Manual
> NAME
> strverscmp - compare two version strings
> SYNOPSIS
> #define _GNU_SOURCE
> #include <string.h>
> int strverscmp(const char *s1, const char *s2);
> DESCRIPTION
> Often one has files jan1, jan2, ..., jan9, jan10, ... and it feels
> wrong when ls(1) orders them jan1, jan10, ..., jan2, ..., jan9. In
> order to rectify this, GNU introduced the -v option to ls(1), which is
> implemented using versionsort(3), which again uses strverscmp().
> Thus, the task of strverscmp() is to compare two strings and find
> the "right" order, while strcmp(3) only finds the lexicographic order.
> This function does not use the locale category LC_COLLATE, so is meant
> mostly for situations where the strings are expected to be in ASCII.
> What this function does is the following. If both strings are equal,
> return 0. Otherwise find the position between two bytes with the
> property that before it both strings are equal, while directly after it
> there is a difference. Find the largest consecutive digit strings
> containing (or starting at, or ending at) this position. If one or both
> of these is empty, then return what strcmp(3) would have returned
> (numerical ordering of byte values). Otherwise, compare both digit
> strings numerically, where digit strings with one or more leading zeroes
> are interpreted as if they have a decimal point in front (so that in
> particular digit strings with more leading zeroes come before digit
> strings with fewer leading zeroes). Thus, the ordering is 000, 00,
> 01, 010, 09, 0, 1, 9, 10.
> RETURN VALUE
> The strverscmp() function returns an integer less than, equal to, or
> greater than zero if s1 is found, respectively, to be earlier than,
> equal to, or later than s2.
> CONFORMING TO
> This function is a GNU extension.
> SEE ALSO
> rename(1), strcasecmp(3), strcmp(3), strcoll(3),
> GNU 2001-12-19
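The ordering listed in the man page can be checked on a glibc system with a few lines of C. This is only a minimal sketch, assuming glibc's strverscmp() is available; version_cmp and sort_versions are hypothetical helper names, not part of glibc or Rockbox.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* strverscmp() is a GNU extension */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* glibc only declares strverscmp() when _GNU_SOURCE is defined before the
 * first include; repeating the prototype keeps the sketch compiling if
 * that is not the case. */
int strverscmp(const char *s1, const char *s2);

/* qsort() hands us pointers to the array elements, so unwrap one level. */
static int version_cmp(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strverscmp(sa, sb);
}

/* Hypothetical helper: sort an array of strings in "version" order. */
void sort_versions(const char **names, size_t count)
{
    assert(names != NULL || count == 0);
    qsort(names, count, sizeof names[0], version_cmp);
}
```

Calling sort_versions() on the nine strings from the man page (in any starting order) should give 000, 00, 01, 010, 09, 0, 1, 9, 10 with glibc, and jan10, jan2, jan9, jan1 should come out as jan1, jan2, jan9, jan10.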
Seems very close. My understanding is that a natural sort would order the same strings as:
000, 00, 0, 01, 1, 09, 9, 010, 10.
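For comparison, here is a rough sketch of a comparator that produces the ordering above. It assumes digit runs are compared by numeric value and that, when two runs have the same value, the one with more leading zeros sorts first; that tie-break is inferred from the list above. natural_cmp and natural_cmp_qsort are hypothetical names, and this is not Rockbox's strnatcmp.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical natural-order comparison: digit runs compare by value,
 * everything else compares byte by byte. */
int natural_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    int zero_tiebreak = 0;   /* set by the first equal-valued run whose zero padding differs */

    while (*a != '\0' && *b != '\0') {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            /* Skip and count leading zeros in both runs. */
            const char *za = a, *zb = b;
            while (*a == '0') a++;
            while (*b == '0') b++;
            size_t zeros_a = (size_t)(a - za), zeros_b = (size_t)(b - zb);

            /* Measure the remaining significant digits. */
            const char *ea = a, *eb = b;
            while (isdigit((unsigned char)*ea)) ea++;
            while (isdigit((unsigned char)*eb)) eb++;
            size_t len_a = (size_t)(ea - a), len_b = (size_t)(eb - b);

            /* More significant digits means a larger number. */
            if (len_a != len_b)
                return len_a < len_b ? -1 : 1;
            /* Same length: the first differing digit decides. */
            for (; a < ea; a++, b++)
                if (*a != *b)
                    return *a < *b ? -1 : 1;
            /* Same value: remember that more leading zeros sorts first. */
            if (zero_tiebreak == 0 && zeros_a != zeros_b)
                zero_tiebreak = zeros_a > zeros_b ? -1 : 1;
        } else {
            if (*a != *b)
                return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
            a++;
            b++;
        }
    }
    if (*a != '\0') return 1;    /* a has extra trailing text */
    if (*b != '\0') return -1;   /* b has extra trailing text */
    return zero_tiebreak;
}

/* Adapter so natural_cmp() can be passed to qsort() on a char* array. */
static int natural_cmp_qsort(const void *x, const void *y)
{
    return natural_cmp(*(const char *const *)x, *(const char *const *)y);
}
```

With this rule the only difference from the man page's list is where the zero-padded entries land: 0, 01 and 010 are placed by value instead of being treated as fractions.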
Received on 2009-03-19