Text Processing in Linux
awk – Pattern scanning and processing
Purpose:
awk reads a file line by line, splits each line into fields, and processes them using patterns and actions.
Syntax:
awk 'pattern { action }' filename
Key Concepts:
- Fields are referenced by $1, $2, …, $NF (last field)
- The whole line is $0
Examples:
awk '{print $1}' file.txt # Print first column
awk '/error/ {print $0}' log # Print lines containing "error"
awk '{sum += $3} END {print sum}' data.txt # Sum of 3rd column
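The field and pattern concepts above combine naturally: `-F` sets a custom field separator, a condition on a field filters records, and an `END` block reports a total. A small sketch, using a hypothetical colon-separated `employees.txt`:

```shell
# Hypothetical data file: name:dept:salary, one record per line.
printf 'alice:eng:90\nbob:ops:70\ncarol:eng:80\n' > employees.txt

# -F: splits fields on ":"; the pattern $2 == "eng" selects matching
# records, and END runs once after all lines are processed.
awk -F: '$2 == "eng" { total += $3 } END { print "eng total:", total }' employees.txt
# prints: eng total: 170
```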
sed – Stream editor
Purpose:
sed is used to search for, replace, insert, or delete text in files or streams.
Syntax:
sed [options] 'command' file
Common Commands:
s/pattern/replacement/ – substitute
d – delete
p – print
i – insert
a – append
Examples:
sed 's/foo/bar/' file.txt # Replace first "foo" with "bar" in each line
sed 's/foo/bar/g' file.txt # Replace all "foo"
sed -n '/error/p' logfile # Print only lines with "error"
sed '2d' file.txt # Delete line 2
sed '/debug/d' file.txt # Delete lines containing "debug"
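The examples above write the result to standard output. To modify a file directly, GNU sed's `-i` option edits in place; adding a suffix such as `-i.bak` keeps a backup copy first. A minimal sketch with a scratch file (the filename is illustrative):

```shell
# Create a scratch file to edit.
printf 'foo one\nfoo two\n' > demo.txt

# -i.bak edits demo.txt in place and saves the original as demo.txt.bak.
# (On BSD/macOS sed, -i requires an explicit suffix argument, e.g. sed -i '' ...)
sed -i.bak 's/foo/bar/' demo.txt

cat demo.txt       # now contains "bar one" / "bar two"
cat demo.txt.bak   # original content preserved
```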
sort – Sort lines of text files
Purpose:
Sorts lines in a text file or stream.
Syntax:
sort [options] file
Useful Options:
| Option | Description |
|---|---|
| -r | Reverse sort |
| -n | Numeric sort |
| -k | Sort by a specific key (column) |
| -u | Remove duplicate lines |
| -t | Set a custom field delimiter |
Examples:
sort file.txt # Default (alphabetical) sort
sort -r file.txt # Reverse order
sort -n numbers.txt # Numeric sort
sort -k2 file.txt # Sort by 2nd column
sort -t: -k3 data.txt # Sort by 3rd field using ":" as delimiter
sort file.txt | uniq # Remove duplicate lines
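These tools are typically chained in pipelines. A sketch combining awk and sort, using a hypothetical `actions.txt` of "user action count" records: awk aggregates per-user totals, then `sort -k2 -n -r` orders them numerically, highest first.

```shell
# Hypothetical log: user, action, count per line.
printf 'bob login 3\nalice login 5\nbob logout 1\nalice login 2\n' > actions.txt

# awk sums column 3 per user, then sort ranks users by total (column 2),
# numerically (-n) and in descending order (-r).
awk '{ count[$1] += $3 } END { for (u in count) print u, count[u] }' actions.txt \
  | sort -k2 -n -r
# alice 7 is listed before bob 4
```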
Summary Table
| Command | Purpose | Typical Use Case |
|---|---|---|
| awk | Pattern-based text processing | Extract columns, compute sums, filter by field |
| sed | Stream editing (search, replace, delete) | Modify text inline or filter lines |
| sort | Sort lines of text or file content | Order lists, remove duplicates, custom sorting |