sort performs two fundamentally different kinds of sorting operations:
alphabetic sorts and numeric sorts. An alphabetic sort follows the traditional
"dictionary order," using the ASCII collating sequence. Digits come before
letters, and uppercase letters come before lowercase letters (unless you
specify the -f option, which "folds" uppercase and lowercase together), with
punctuation interspersed among them. The -l (lowercase L) option, in versions
that support it, sorts by the current locale instead of the default US/ASCII
order.
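Here is a minimal sketch of the case-folding difference. The four sample words are invented for illustration, and LC_ALL=C is set on the assumption that your sort honors it and falls back to plain ASCII order:

```shell
# Four made-up sample words, one per line.
words='banana
Apple
Cherry
date'

# Plain alphabetic sort in the C locale: uppercase sorts before lowercase.
ascii_order=$(printf '%s\n' "$words" | LC_ALL=C sort | paste -sd' ' -)

# -f folds case, so capitalization no longer affects the order.
folded_order=$(printf '%s\n' "$words" | LC_ALL=C sort -f | paste -sd' ' -)

echo "ASCII:  $ascii_order"
echo "folded: $folded_order"
```

Without -f, the capitalized words should all land ahead of the lowercase ones; with -f, the list comes out in the order you'd find in a dictionary.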
This is all fairly trivial and common sense. However, it's worth belaboring the difference, because it's a frequent source of bugs in shell scripts. Say you sort the numbers 1 through 12. A numeric sort gives you these numbers "in order," just like you'd expect. An alphabetic sort gives you:
1 10 11 12 2 ...
Of course, this is how you'd sort the numbers if you applied dictionary rules
to the list. Numeric sorts can handle decimal numbers (for example, numbers
like 123.44565778); they can't handle numbers written in exponent (scientific)
notation (for example, 1.2344565778E+02). GNU sort does provide the -g flag
for sorting numbers in scientific notation. Unfortunately, it is significantly
slower than plain old decimal sorting.
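The difference is easy to see for yourself. The following sketch sorts the numbers 1 through 12 both ways, then sorts three made-up values in exponent notation with -n and with -g. It assumes a sort that supports -g (GNU sort does; others may not) and sets LC_ALL=C so that the period is treated as the decimal point:

```shell
# The numbers 1 through 12, one per line.
nums=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12)

# Alphabetic (default) sort versus numeric sort.
alpha=$(printf '%s\n' "$nums" | LC_ALL=C sort | paste -sd' ' -)
numeric=$(printf '%s\n' "$nums" | LC_ALL=C sort -n | paste -sd' ' -)

echo "alphabetic: $alpha"
echo "numeric:    $numeric"

# Made-up values in exponent notation: 150, 20, and 3.
sci='1.5E+02
2.0E+01
3'

# -n reads only the leading decimal part (1.5, 2.0, 3) and ignores the exponent.
plain_n=$(printf '%s\n' "$sci" | LC_ALL=C sort -n | paste -sd' ' -)

# -g (assumed available, as in GNU sort) parses the whole number.
general=$(printf '%s\n' "$sci" | LC_ALL=C sort -g | paste -sd' ' -)

echo "sort -n: $plain_n"
echo "sort -g: $general"
```

If -n and -g give you the same order on the last three values, your sort probably doesn't implement -g the way GNU sort does.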
What happens if you include alphabetic characters in a numeric sort? Although the results are predictable, I would prefer to say that they're "undefined." Including alphabetic characters in a numeric sort is a mistake, and there's no guarantee that different versions of sort will handle them the same way. As far as I know, there is no provision for sorting hexadecimal numbers.
One final note: your version of numeric sort may treat initial blanks as
significant, sorting numbers with additional spaces before them ahead of numbers
without the additional spaces. This is an incredibly stupid misfeature. There is
a workaround: use the -b (ignore leading blanks) option and always specify a
sort field.[2] That is, sort -nb +0 (or sort -nb -k1 in the newer POSIX key
syntax, where fields are numbered from 1 rather than 0) will do what you
expect; sort -n won't.
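As a quick check of the workaround, this sketch feeds three made-up numbers with varying amounts of leading space to sort, using -k1 as the modern spelling of +0. Many current implementations (GNU sort among them) already skip leading blanks under -n, so on those systems -b and the explicit field are harmless but redundant:

```shell
# Made-up input: the values 10, 2, and 1 with different leading blanks.
# -b ignores leading blanks, -n sorts numerically, -k1 names the sort field.
sorted=$(printf '  10\n2\n 1\n' | LC_ALL=C sort -b -n -k1)

# Strip the blanks and join the lines so the order is easy to read.
order=$(printf '%s\n' "$sorted" | tr -d ' ' | paste -sd' ' -)

echo "$order"
```

The values should come out in numeric order regardless of how much space precedes each one. If they don't, try the older +0 form on your system.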
— ML
[2] Stupid misfeature number 2: -b doesn't work unless you specify a sort field
explicitly, with a +n option.