-
- sort - sort and/or merge files
-
- sort [ options ] [ file ... ]
-
- sort sorts lines of all the files together and writes the result on the standard output. The file
name - means the standard input. If no files are named, the standard input is sorted.
- The default sort key is an entire line. Default ordering is lexicographic by bytes in machine collating sequence.
The ordering is affected globally by the following options, one or more of which may appear. See recsort(3)
for details.
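A minimal sketch of the default byte ordering, compared with the -f option described below. It assumes a POSIX shell and sets LC_ALL=C, an assumption made here so that a locale-aware sort implementation also compares by bytes.

```shell
# LC_ALL=C is an assumption: it forces byte-wise collation on locale-aware sorts.
# Default ordering: upper case letters precede lower case ones by byte value.
bytes=$(printf 'b\nB\na\n' | LC_ALL=C sort)

# With -f, lower case is folded onto upper case before comparison.
folded=$(printf 'b\nB\na\n' | LC_ALL=C sort -f)

printf 'default:\n%s\nfolded:\n%s\n' "$bytes" "$folded"
```

With the default ordering B sorts ahead of a; with -f the line a comes first.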
- For backwards compatibility the -o option is allowed in any file operand position when neither the -c
nor the -- options are specified.
-
- -k, --key=pos1[,pos2]|.reclen|.position.length
- Restrict the sort key to a string
beginning at pos1 and ending at pos2. pos1 and pos2 each have the form m.n, counting
from 1, optionally followed by one or more of the flags CMbdfginprZ; m counts fields from the beginning
of the line and n counts characters from the beginning of the field. If any flags are present they override all
the global ordering options for this key. If .n is missing from pos1, it is taken to be 1; if missing
from pos2, it is taken to be the end of the field. If pos2 is missing, it is taken to be end of line. The
second form specifies a fixed record length reclen, and the last form specifies a fixed field at byte position
position (counting from 1) of length bytes. The obsolescent reclen:fieldlen:offset (byte offset
from 0) is also accepted.
- -K, --oldkey=pos
- Specified in pairs: -K pos1 -K pos2, where
positions count from 0.
- -R, --record|recfmt=format
- Sets the record format to format; newlines will be
treated as normal characters. The formats are:
- d[terminator]
- Variable length with record terminator character,
\n by default.
- [f]reclen
- Fixed record length reclen.
- v[op...]
- Variable length. The default is h4o0z2bi (4 byte IBM V format descriptor) if op
is omitted. op may be a combination of:
- hn
- Header size is n bytes (default 4).
- on
- Size offset in header is n bytes (default 0).
- zn
- Size length is n bytes (default min(h-o,2)).
- b
- Size is big-endian (default).
- l
- Size is little-endian (default b).
- i
- Record length includes header (default).
- n
- Record length does not include header (default i).
- %
- If the record format is not otherwise specified and any input file name, taken from left
to right, ends with %format or %format.*, then the record format is set to format.
In addition, the -o path, if specified, is renamed to contain the input %format, provided
that it does not already contain % and that it names a regular file.
- -
- The first block of the first input file is sampled to check for v variable
length and f fixed length format records. Not all formats are detected. sort exits with an error
diagnostic if the record format cannot be determined from the sample.
- -b, --ignorespace
- Ignore leading white space (spaces and tabs) in field comparisons.
- -d, --dictionary
- `Phone directory' order: only letters, digits and white space are significant in
string comparisons.
- -C, --codeset|convert=codeset|from:to
- The field data codeset is codeset or the field
data must be converted from the from codeset to the to codeset. The codesets are:
- ascii
- 8 bit ascii
- ebcdic
- X/Open ebcdic
- o|ebcdic-o
- mvs OpenEdition ebcdic
- h|ebcdic-h
- ibm OS/400 AS/400 ebcdic
- s|ebcdic-s
- siemens posix-bc ebcdic
- i|ebcdic-i
- X/Open ibm ebcdic (not idempotent)
- m|ebcdic-m
- mvs ebcdic
- u|ebcdic-u
- microfocus cobol ebcdic
- native
- native code set
- -f, --fold|ignorecase
- Fold lower case letters onto upper case.
- -i, --ignorecontrol
- Ignore characters outside the ASCII range 040-0176 in string comparisons.
- -J, --shuffle|jumble=seed
- Do a random shuffle of the sort keys. seed specifies a
pseudo random number generator seed. A seed of 0 generates a seed based on time and pid.
- -n, --numeric
- An initial numeric string, consisting of optional white space, optional sign, and a
nonempty string of digits with optional decimal point, is sorted by value.
- -g, --floating
- Numeric, like -n, with e-style exponents allowed.
- -p, --bcd|packed-decimal
- Compare packed decimal (bcd) numbers with trailing sign.
- -M, --months
- Compare as month names. The first three characters after optional white space are
folded to lower case and compared. Invalid fields compare low to jan.
- -r, --reverse|invert
- Reverse the sense of comparisons.
- -t, --tabs=tab-char
- `Tab character' separating fields is tab-char.
- -c, --check
- Check that the single input file is sorted according to the ordering rules; give no
output unless the file is out of sort.
- -j, --processes|nproc|jobs=processes
- Use up to processes separate processes to sort the
input. The current implementation still uses one process for the final merge phase; improvements are planned.
- -m, --merge
- Merge; the input files are already sorted.
- -u, --unique
- Unique. Keep only the first of two lines that compare equal on all keys. Implies -s.
- -s, --stable
- Stable sort. When all keys compare equal, preserve input order.
- -S, --unstable
- Unstable sort. When all keys compare equal, break the tie by using the entire
record, ignoring all but the -r option. This is the default.
- -o, --output=output
- Place output in the designated file instead of on the standard
output. This file may be the same as one of the inputs. The file - names the standard output. The option
may appear among the file arguments, except after --.
- -l, --library=library[,name=value...]
- Load the external sort discipline library with
optional comma separated name=value arguments. Libraries are loaded, in left to right order, after the sort
method has been initialized.
- -T, --tempdir=tempdir
- Put temporary files in tempdir. The default value is /usr/tmp.
- -L, --list
- List the available sort methods. See the -x option.
- -x, --method=method
- Specify the sort method to apply:
- rasp
- Initial radix split into a forest of splay trees.
- radix
- Radix sort.
- splay
- Splay tree sort.
- verify
- Verify that the input is sorted.
- copy
- Copy (no sort).
- The default value is rasp.
- -v, --verbose
- Trace the sort progress on the standard error.
- -Z, --zd|zoned-decimal
- Compare zoned decimal (ZD) numbers with embedded trailing sign.
- -z, --size|zip=type[size]
- Suggest using the specified number of bytes of internal store to
tune performance. type is a single character and may be one of:
- a
- Buffer alignment.
- b
- Input reserve buffer size.
- c
- Input chunk size; sort chunks of this size and disable merge.
- i
- Input buffer size.
- m
- Maximum number of intermediate merge files.
- p
- Input sort size; sort chunks of this size before merge.
- o
- Output buffer size.
- r
- Maximum record size.
- I
- Decompress the input if it is compressed.
- O
- gzip(1) compress the output.
- -y, --size=size
- Equivalent to -zisize.
- -X, --test=test
- Enables implementation defined test code. Some or all of these may be
disabled.
- dump
- List detailed information on the option settings.
- io
- List io file paths.
- keys
- List the canonical key for each record.
- read
- Force input file read by disabling memory mapping.
- show
- Show setup information and exit before sorting.
- test
- Immediately exit with status 0; used to verify this implementation.
- -D, --debug=level
- Sets the debug trace level. Higher levels produce more output.
- +pos1 -pos2 is the classical alternative to -k, with counting from 0 instead of 1, and pos2
designating next-after-last instead of last character of the key. A missing character count in pos2 means 0,
which in turn excludes any -t tab character from the end of the key. Thus +1 -1.3 is the same as -k 2,2.3
and +1r -3 is the same as -k 2r,3.
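The arithmetic behind this equivalence can be sketched as a small shell function. The name old_to_k and its four numeric arguments are hypothetical helpers introduced for illustration; flags such as r are not handled, and a pos2 field count of 0 is not considered.

```shell
# Hypothetical helper: convert classical "+m1.n1 -m2.n2" (counting from 0,
# pos2 naming the position after the key) into the equivalent -k form
# (counting from 1, pos2 naming the last character of the key).
old_to_k() {
    m1=$1 n1=$2 m2=$3 n2=$4

    # pos1: both the field and the character count shift up by one.
    start=$((m1 + 1))
    if [ "$n1" -gt 0 ]; then
        start="$start.$((n1 + 1))"
    fi

    # pos2: a character count of 0 means the key ends with field m2;
    # otherwise it ends at character n2 of field m2+1.
    if [ "$n2" -eq 0 ]; then
        end=$m2
    else
        end="$((m2 + 1)).$n2"
    fi

    printf '%s\n' "-k $start,$end"
}

old_to_k 1 0 1 3   # +1 -1.3
old_to_k 1 0 3 0   # +1 -3
```

The two calls print -k 2,2.3 and -k 2,3, matching the examples in the text apart from the r flag.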
- Under option -tx fields are strings separated by x; otherwise fields are non-empty strings
separated by white space. White space before a field is part of the field, except under option -b. A b
flag may be attached independently to pos1 and pos2.
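The field rules above can be illustrated with a short sketch. The sample data is made up, and LC_ALL=C is an assumption used to get byte ordering from locale-aware implementations; only options listed as Posix and/or X/Open are used.

```shell
# Made-up colon-separated sample data: name:age:team
people='carol:30:ops
alice:25:dev
bob:25:ops'

# -t: makes ':' the field separator; sort on field 2 numerically, then field 1.
by_age=$(printf '%s\n' "$people" | LC_ALL=C sort -t: -k2,2n -k1,1)

# Without -t, leading white space belongs to the field, so the second key
# "  b" (two blanks) compares lower than " a" (one blank).
with_blanks=$(printf 'x  b\ny a\n' | LC_ALL=C sort -k2,2)

# A b flag on pos1 skips the leading blanks, so "a" sorts before "b".
skip_blanks=$(printf 'x  b\ny a\n' | LC_ALL=C sort -k2b,2)

printf '%s\n---\n%s\n---\n%s\n' "$by_age" "$with_blanks" "$skip_blanks"
```

The second and third commands differ only in the b flag, which reverses the order of the two lines.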
- When there are multiple sort keys, later keys are compared only after all earlier keys compare equal. Except under
option -s, lines with all keys equal are ordered with all bytes significant. -S turns off -s; the
last occurrence, left-to-right, takes effect.
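A minimal sketch of the difference between the default tie-break and -s, using made-up data and assuming LC_ALL=C for byte ordering. Both commands sort on the first field only.

```shell
data='b 2
a 2
b 1
a 1'

# Default: lines whose keys compare equal are ordered by the whole record.
unstable=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1)

# -s: lines whose keys compare equal keep their input order.
stable=$(printf '%s\n' "$data" | LC_ALL=C sort -s -k1,1)

printf 'default:\n%s\nstable:\n%s\n' "$unstable" "$stable"
```

In the default case the two a lines come out as a 1, a 2; with -s they stay in input order, a 2, a 1.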
- Sorting is done by a method determined by the -x option. -L lists the available methods. rasp
(radix+splay-tree) is the default and currently the best all-around method.
- Single-letter options may be combined into a single string, such as -cnrt:. The option combination -di
and the combination of -n with any of -diM are improper. Posix argument conventions are supported.
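As an illustration of combining single-letter options, the following sketch runs the same sort with a combined option string and with separate options. The sample data is made up; the key carries no flags of its own, so it inherits the global -n and -r.

```shell
data='a:3
b:10
c:1'

# Combined form: -n, -r and -t with its argument ':' in one string.
combined=$(printf '%s\n' "$data" | LC_ALL=C sort -nrt: -k2,2)

# The same options written separately.
separate=$(printf '%s\n' "$data" | LC_ALL=C sort -n -r -t : -k 2,2)

printf '%s\n' "$combined"
```

Both forms order the lines by descending numeric value of the second field.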
- Options -b, -c, -d, -f, -i, -k, -m, -n, -o, -r,
-t, and -u are in the Posix and/or X/Open standards.
-
- sort comments and exits with non-zero status for various trouble conditions and for disorder discovered
under option -c.
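A sketch of using the -c exit status in a script. The function name is_sorted is a hypothetical wrapper; it discards any diagnostic and only reports the status. LC_ALL=C is assumed for byte ordering.

```shell
# Hypothetical wrapper: checks standard input; any arguments are passed
# through to sort as ordering options. Returns the exit status of sort -c.
is_sorted() {
    LC_ALL=C sort -c "$@" >/dev/null 2>&1
}

# In numeric order under -n, but out of order by bytes ("10" sorts before "2").
if printf '1\n2\n10\n' | is_sorted -n; then numeric_ok=yes; else numeric_ok=no; fi
if printf '1\n2\n10\n' | is_sorted; then bytes_ok=yes; else bytes_ok=no; fi

printf 'numeric: %s, bytes: %s\n' "$numeric_ok" "$bytes_ok"
```

The same input passes the check under -n and fails it under the default ordering, so the ordering options given with -c matter.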
-
- comm(1), join(1),
uniq(1), recsort(3)
-
- The never-documented default pos1=0 for cases such as sort -1 has been abolished. An input file
overwritten by -o is not replaced until the entire output file is generated in the same directory as the input,
at which point the input is renamed.
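A minimal sketch of sorting a file in place by naming it as both the input and the -o output. The temporary file comes from mktemp, which is assumed to be available; the data is made up.

```shell
# Create a scratch file with unsorted contents.
f=$(mktemp)
printf 'pear\napple\nfig\n' > "$f"

# The output file may be the same as an input file.
LC_ALL=C sort -o "$f" "$f"

in_place=$(cat "$f")
rm -f "$f"
printf '%s\n' "$in_place"
```

Redirecting with > "$f" instead of -o would truncate the file before it is read, which is why -o is used here.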
-
- version
- sort (AT&T Research) 2008-04-24
- author
- Glenn Fowler <gsf@research.att.com>
- author
- Phong Vo <kpv@research.att.com>
- author
- Doug McIlroy <doug@research.bell-labs.com>
- copyright
- Copyright © 1996-2009 AT&T Intellectual Property
- license
- http://www.opensource.org/licenses/cpl1.0.txt