Rockbox mail archive

Subject: Re: how is strnatcmp aka "Interpret numbers while sorting" supposed to sort?

From: Bryan VanDyke <bryan.vandyke_at_gmail.com>
Date: Thu, 19 Mar 2009 14:24:54 -0400

codemonkey wrote:
> Are you guys aware that there's a quasi-standard regarding this in
> the GNU libraries? See the following excerpt from Fedora "info ls"
> and "man strverscmp".
>
> ~ray
>
> PS: I've found that "ls -v" works well for sorting MP3s with track
> numbering, etc. I don't know if it handles all of the cases described in
> this thread though. Maybe GNU's implementation is worth borrowing for
> rockbox?
>
> ------
>
> $ info ls
>
> (...excerpt...)
>
> 10.1.4 More details about version sort
> --------------------------------------
>
> The version sort takes into account the fact that file names frequently
> include indices or version numbers. Standard sorting functions usually
> do not produce the ordering that people expect because comparisons are
> made on a character-by-character basis. The version sort addresses
> this problem, and is especially useful when browsing directories that
> contain many files with indices/version numbers in their names:
>
> $ ls -1            $ ls -1v
> foo.zml-1.gz       foo.zml-1.gz
> foo.zml-100.gz     foo.zml-2.gz
> foo.zml-12.gz      foo.zml-6.gz
> foo.zml-13.gz      foo.zml-12.gz
> foo.zml-2.gz       foo.zml-13.gz
> foo.zml-25.gz      foo.zml-25.gz
> foo.zml-6.gz       foo.zml-100.gz
>
> Note also that numeric parts with leading zeros are considered as
> fractional one:
>
> $ ls -1            $ ls -1v
> abc-1.007.tgz      abc-1.007.tgz
> abc-1.012b.tgz     abc-1.01a.tgz
> abc-1.01a.tgz      abc-1.012b.tgz
>
> This functionality is implemented using the `strverscmp' function.
>
> ------
>
> $ man strverscmp
>
> STRVERSCMP(3)        Linux Programmer’s Manual        STRVERSCMP(3)
>
> NAME
> strverscmp - compare two version strings
>
> SYNOPSIS
> #define _GNU_SOURCE
> #include <string.h>
>
> int strverscmp(const char *s1, const char *s2);
>
> DESCRIPTION
> Often one has files jan1, jan2, ..., jan9, jan10, ... and it feels
> wrong when ls(1) orders them jan1, jan10, ..., jan2, ..., jan9. In
> order to rectify this, GNU introduced the -v option to ls(1), which is
> implemented using versionsort(3), which again uses strverscmp().
>
> Thus, the task of strverscmp() is to compare two strings and find the
> "right" order, while strcmp(3) only finds the lexicographic order.
> This function does not use the locale category LC_COLLATE, so is meant
> mostly for situations where the strings are expected to be in ASCII.
>
> What this function does is the following. If both strings are equal,
> return 0. Otherwise find the position between two bytes with the
> property that before it both strings are equal, while directly after it
> there is a difference. Find the largest consecutive digit strings
> containing (or starting at, or ending at) this position. If one or both
> of these is empty, then return what strcmp(3) would have returned
> (numerical ordering of byte values). Otherwise, compare both digit
> strings numerically, where digit strings with one or more leading zeroes
> are interpreted as if they have a decimal point in front (so that in
> particular digit strings with more leading zeroes come before digit
> strings with fewer leading zeroes). Thus, the ordering is 000, 00, 01,
> 010, 09, 0, 1, 9, 10.
>
> RETURN VALUE
> The strverscmp() function returns an integer less than, equal to, or
> greater than zero if s1 is found, respectively, to be earlier than,
> equal to, or later than s2.
>
> CONFORMING TO
> This function is a GNU extension.
>
> SEE ALSO
> rename(1), strcasecmp(3), strcmp(3), strcoll(3),
> feature_test_macros(7)
>
> GNU                          2001-12-19                 STRVERSCMP(3)
>
>

Seems very close. My understanding is that natural sort would order the
same strings as:
000, 00, 0, 01, 1, 09, 9, 010, 10.
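
To make that ordering concrete, here is a minimal sketch of a comparison
function that produces it. This is not Rockbox's strnatcmp and not GNU's
strverscmp(); the names natcmp and natcmp_qsort are made up for
illustration. It assumes ASCII digits, compares digit runs by numeric
value, and (as an assumption about what "natural" means here) only uses
the number of leading zeros as a tie-breaker when everything else is
equal, with more leading zeros sorting first.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical natural-order compare: returns <0, 0 or >0 like strcmp(). */
int natcmp(const char *a, const char *b)
{
    int tie = 0; /* first leading-zero difference; used only as a last resort */

    while (*a && *b) {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            const char *ra = a, *rb = b;        /* start of each digit run */

            while (*a == '0') a++;              /* skip leading zeros */
            while (*b == '0') b++;
            size_t za = (size_t)(a - ra), zb = (size_t)(b - rb);

            const char *da = a, *db = b;        /* significant digits */
            while (isdigit((unsigned char)*a)) a++;
            while (isdigit((unsigned char)*b)) b++;
            size_t la = (size_t)(a - da), lb = (size_t)(b - db);

            /* More significant digits means a larger number. */
            if (la != lb)
                return la < lb ? -1 : 1;

            /* Same length: byte order of the digits is numeric order. */
            int c = memcmp(da, db, la);
            if (c != 0)
                return c < 0 ? -1 : 1;

            /* Equal values: remember which side had more leading zeros. */
            if (tie == 0 && za != zb)
                tie = za > zb ? -1 : 1;
        } else {
            if (*a != *b)
                return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
            a++;
            b++;
        }
    }

    if (*a) return 1;    /* a has characters left, so it sorts later */
    if (*b) return -1;
    return tie;
}

/* qsort() adapter: the array elements are pointers to C strings. */
int natcmp_qsort(const void *x, const void *y)
{
    const char *const *sx = x;
    const char *const *sy = y;
    return natcmp(*sx, *sy);
}
```

Sorting an array of string pointers with qsort(names, n, sizeof names[0],
natcmp_qsort) gives the list above for those nine strings. The difference
from strverscmp() is that a leading zero never turns a number into a
"fraction": 09 sorts next to 9, not before 0.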
Received on 2009-03-19


Page was last modified "Jan 10 2012" The Rockbox Crew