An important part of working with the Unix shell is processing
text directly from the command line. We've done a bit of this, but
Chapter 20 introduces a few more interesting tools.
The text tools can be used to create reports, generate HTML, or
even send email messages, frequently using temp files to
hold intermediate results.
- cat (again)
- The -A option to show unprintable characters.
- -n prints line numbers.
- -s squeezes runs of repeated blank lines down to a single blank line.
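A quick way to see these options at work is a small sample file. This is a sketch that assumes GNU cat (the -A option is not available in every cat implementation); the file name and contents are made up for illustration.

```shell
# Hypothetical sample: one tab, then three consecutive blank lines.
tmpdir=$(mktemp -d)
printf 'one\ttwo\n\n\n\nthree\n' > "$tmpdir/sample.txt"

# -A marks each tab as ^I and each line end as $.
visible=$(cat -A "$tmpdir/sample.txt")
printf '%s\n' "$visible"

# -s squeezes the run of blank lines to one; -n numbers the lines that remain.
squeezed=$(cat -s "$tmpdir/sample.txt")
cat -sn "$tmpdir/sample.txt"

rm -r "$tmpdir"
```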
- sort offers much control over how sorting is performed.
- -f ignore case.
- -r reverse.
- -t to specify a character (other than spaces) to separate fields.
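- -k to specify a sort field (key).
- Specify a range of field numbers which form the key. One-based.
A single number such as -k 2 runs from field 2 to the end of the line;
use -k 2,2 to sort on field 2 alone.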
- May specify multiple -k's to sort on multiple keys.
- -u remove duplicate lines.
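The sort options above combine naturally. The following sketch uses made-up colon-separated records; the n and r letters attached to the first key are per-key modifiers (numeric, reversed), and LC_ALL=C is set only to make the ordering predictable.

```shell
# Hypothetical records: name:age:city, with one duplicated line.
data='carol:30:nyc
alice:25:sf
bob:30:la
alice:25:sf'

# Field 2 numeric and descending, then field 1 ascending; -u drops the duplicate.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t : -k 2,2nr -k 1,1 -u)
printf '%s\n' "$sorted"
```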
- uniq filters out adjacent duplicate lines, so on a sorted stream
it acts much like using the -u option to sort.
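Why the sorting matters can be seen with a tiny made-up input: uniq only compares each line with the one before it. A minimal sketch:

```shell
# Hypothetical input with duplicates that are not next to each other.
words='b
a
b
a'

# Without sorting, no two equal lines are adjacent, so nothing is removed.
unsorted=$(printf '%s\n' "$words" | uniq)

# After sorting, equal lines are adjacent and uniq collapses them.
deduped=$(printf '%s\n' "$words" | LC_ALL=C sort | uniq)

printf '%s\n' "$deduped"
```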
- cut extracts selected parts of each line.
- Use -c to select a list of character ranges.
- Use -f to select a list of fields.
- Fields separated by tab, or use -d.
- Often useful with pipes to perform separate cuts of different types.
- last | egrep -v '^wtmp begins|^reboot' | cut -c 1-9,23-37 | sort -u
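For a version that does not depend on the output of last, here is a small sketch on a made-up passwd-style line, showing a field cut, a character cut, and the two chained with a pipe.

```shell
# Hypothetical passwd-style record.
line='root:x:0:0:root:/root:/bin/bash'

# Fields 1 and 7, using : as the delimiter instead of tab.
fields=$(printf '%s\n' "$line" | cut -d : -f 1,7)

# Characters 1 through 4.
chars=$(printf '%s\n' "$line" | cut -c 1-4)

# Two cuts of different types in one pipeline: field 7, then characters 2-4.
piped=$(printf '%s\n' "$line" | cut -d : -f 7 | cut -c 2-4)

printf '%s\n%s\n%s\n' "$fields" "$chars" "$piped"
```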
- paste will combine lines from two files in pairs.
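A minimal sketch of the pairing, using two made-up files. The -d option (to choose a delimiter other than tab) is not mentioned above; it appears here only to make the output easier to read.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/letters"
printf '1\n2\n3\n' > "$tmpdir/numbers"

# Line N of the first file is joined with line N of the second, tab-separated.
paste "$tmpdir/letters" "$tmpdir/numbers"

# Same pairing with a comma as the separator.
pasted=$(paste -d , "$tmpdir/letters" "$tmpdir/numbers")
printf '%s\n' "$pasted"

rm -r "$tmpdir"
```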
- join will combine lines based on a common field, modeled on
a database join.
- Fields specified in ways similar to sort and cut.
- Joins on first field by default.
- Both files must be sorted on the join field (so maybe it's more
of a merge).
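A small self-contained sketch of the default behavior, using two made-up files that are already sorted on their first field. Only keys present in both files appear in the output.

```shell
tmpdir=$(mktemp -d)
# Hypothetical data, both files sorted on field 1.
printf 'alice 25\nbob 30\ncarol 41\n' > "$tmpdir/ages"
printf 'alice sf\ncarol nyc\ndave la\n' > "$tmpdir/cities"

# Join on the first field; bob and dave have no partner and are dropped.
joined=$(LC_ALL=C join "$tmpdir/ages" "$tmpdir/cities")
printf '%s\n' "$joined"

rm -r "$tmpdir"
```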
- tr translates characters (character substitution).
- Lowercase letters: tr A-Z a-z < file.txt
- -d option deletes characters instead of replacing them.
- -s option squeezes each run of a repeated character down to a single occurrence.
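The three uses of tr can be sketched on made-up strings as follows. The character sets are quoted so the shell cannot expand them as filename patterns.

```shell
# Replace: map uppercase letters to lowercase.
lower=$(printf 'Hello World\n' | tr 'A-Z' 'a-z')

# Delete: remove every digit.
nodigits=$(printf 'a1b22c333\n' | tr -d '0-9')

# Squeeze: collapse each run of spaces to one space.
squeezed=$(printf 'too    many   spaces\n' | tr -s ' ')

printf '%s\n%s\n%s\n' "$lower" "$nodigits" "$squeezed"
```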
last | egrep -v '^wtmp begins|^reboot' | cut -c 1-9,23-37 | sort -u > tmp1
sort -t : -k 1,1 /etc/passwd | cut -d : -f 1,5 | tr : ' ' > tmp2
join tmp1 tmp2
- comm compares the lines of two sorted files. Makes three columns
for those lines appearing in one, the other, or both.
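A sketch with two made-up sorted files. The -1, -2 and -3 options, not covered above, each suppress one of the three columns, which is the usual way to pull out just one set.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/first"
printf 'b\nc\nd\n' > "$tmpdir/second"

# All three columns: only in first, only in second, in both.
LC_ALL=C comm "$tmpdir/first" "$tmpdir/second"

# Suppress columns 1 and 2 to keep only the lines common to both files.
both=$(LC_ALL=C comm -12 "$tmpdir/first" "$tmpdir/second")

# Suppress columns 2 and 3 to keep only the lines unique to the first file.
only_first=$(LC_ALL=C comm -23 "$tmpdir/first" "$tmpdir/second")

rm -r "$tmpdir"
```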
- diff compares two (usually similar) files and summarizes the differences.
- Most often used with the -u option to
display unified format.
- Often used to inspect two different versions of a program.
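A minimal sketch of diff -u on two made-up versions of a file. Note that diff exits with status 1 when the files differ, which matters in scripts that stop on any non-zero status.

```shell
tmpdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$tmpdir/old.txt"
printf 'line one\nline 2\nline three\n' > "$tmpdir/new.txt"

# Removed lines are prefixed with -, added lines with +.
changes=$(diff -u "$tmpdir/old.txt" "$tmpdir/new.txt")
status=$?    # 1 here, because the files differ

printf '%s\n' "$changes"
rm -r "$tmpdir"
```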
- patch applies a diff to one version to reproduce the other.
- A way to store or transmit changes.
- Software source updates are sometimes distributed as diffs.
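The round trip can be sketched as follows, with made-up file names and contents. Since patch is not installed on every system, the sketch checks for it first and reports "skipped" if it is missing.

```shell
tmpdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$tmpdir/v1.txt"
printf 'line one\nline 2\nline three\nline four\n' > "$tmpdir/v2.txt"

# Record the change from v1 to v2 (diff exits 1 because the files differ).
diff -u "$tmpdir/v1.txt" "$tmpdir/v2.txt" > "$tmpdir/change.diff"

# Apply the recorded change to a copy of v1; the copy should then match v2.
cp "$tmpdir/v1.txt" "$tmpdir/work.txt"
if ! command -v patch >/dev/null 2>&1; then
    result=skipped
elif patch "$tmpdir/work.txt" < "$tmpdir/change.diff" >/dev/null &&
     cmp -s "$tmpdir/work.txt" "$tmpdir/v2.txt"; then
    result=match
else
    result=mismatch
fi
echo "$result"

rm -r "$tmpdir"
```

Only change.diff needs to be stored or sent; anyone holding v1 can rebuild v2 from it.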
- sed is a stream editor. It applies editing commands (substitute, delete, print, and others) to each line of its input.
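Three of the most common sed commands, sketched on made-up input: s substitutes, p prints (used with -n, which turns off the default printing), and d deletes.

```shell
text='alpha
beta
gamma'

# s/old/new/g replaces every match on each line.
replaced=$(printf '%s\n' "$text" | sed 's/a/A/g')

# -n suppresses automatic output; 2p prints only line 2.
second=$(printf '%s\n' "$text" | sed -n '2p')

# /pattern/d deletes the lines that match the pattern.
without_beta=$(printf '%s\n' "$text" | sed '/^beta$/d')

printf '%s\n' "$replaced"
```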