Viewing, searching, and manipulating files using sed, awk, cut, and grep

 

grep

grep is a command-line utility used for searching plain-text data for lines that match a specified pattern. It is particularly useful for filtering through large files or outputs to quickly find relevant information, such as error messages in log files or specific entries in data files. By using various options, users can perform case-insensitive searches, display line numbers, or invert matches, making it an essential tool for text processing and analysis.

sed

sed (stream editor) is a powerful command-line tool for parsing and transforming text in a file or input stream. It allows users to perform basic text manipulations, such as substitution, deletion, and insertion, through simple commands. sed is invaluable for automating repetitive text editing tasks, enabling users to modify files without opening them in an editor, which is especially useful in scripting and batch processing scenarios.

awk

awk is a versatile programming language and command-line tool designed for text processing and data extraction from structured data formats, such as CSV or TSV files. It allows users to analyze and manipulate data based on specified patterns and conditions, making it ideal for generating reports or performing calculations on columns of data. With its built-in support for variables and functions, awk can handle complex data processing tasks efficiently.

cut

cut is a command-line utility used to extract sections from each line of input data, typically based on delimiters or character positions. It is particularly useful for retrieving specific columns from files or outputs, such as extracting usernames from a list or fields from a CSV file. By using cut, users can quickly isolate and manipulate relevant portions of text data, making it an essential tool for basic data processing tasks.

Here’s a detailed guide on viewing, searching, and manipulating files using sed, awk, cut, and grep, with step-by-step examples.

1. Viewing Files

1.1. Using cat

Purpose: Display the entire content of a file. Example:
cat filename.txt
Step-by-step:
  • This command outputs all lines of filename.txt to the terminal.
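cat can also number output lines, which helps when cross-referencing with grep -n or an editor. A minimal sketch (sample.txt is a file created inline purely for illustration):

```shell
# Create a small sample file for the demonstration
printf 'first line\nsecond line\n' > sample.txt

# -n prefixes each output line with its line number
cat -n sample.txt
```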

1.2. Using less

Purpose: View a file one page at a time. Example:
less filename.txt
Step-by-step:
  • Opens filename.txt in a paginated format.
  • Use SPACE to go forward and b to go back. Type /pattern to search forward and n to jump to the next match. Press q to exit.

1.3. Using head

Purpose: View the first few lines of a file. Example:
head filename.txt
Step-by-step:
  • Displays the first 10 lines of filename.txt.
Customizing the number of lines:
head -n 5 filename.txt
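head and tail can be chained to print an arbitrary range of lines. A sketch, assuming a hypothetical ten-line sample.txt built with seq:

```shell
# Build a ten-line sample file, one number per line
seq 1 10 > sample.txt

# head keeps lines 1-5, then tail keeps the last 3 of those: lines 3-5
head -n 5 sample.txt | tail -n 3   # prints 3, 4, 5
```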

1.4. Using tail

Purpose: View the last few lines of a file. Example:
tail filename.txt
Step-by-step:
  • Shows the last 10 lines of filename.txt.
Real-time monitoring:
tail -f logfile.txt
  • This command follows the file, updating in real-time as new lines are added.
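As with head, the -n option controls how many trailing lines tail prints. A sketch over a sample file created inline:

```shell
# Build a ten-line sample file
seq 1 10 > sample.txt

# Print only the last 3 lines instead of the default 10
tail -n 3 sample.txt   # prints 8, 9, 10
```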

2. Searching for Patterns

2.1. Using grep

Purpose: Search for lines matching a specific pattern in a file. Example:
grep 'error' logfile.txt
Step-by-step:
  • This command searches logfile.txt for lines containing the string “error” and prints each matching line.
Case-insensitive search:
grep -i 'error' logfile.txt
Search with line numbers:
grep -n 'error' logfile.txt
Invert match (show lines without “error”):
grep -v 'error' logfile.txt
Search recursively in a directory:
grep -r 'error' .
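Two further grep options often come in handy: -c counts matching lines instead of printing them, and -E enables extended regular expressions. A sketch with a small logfile.txt created inline for illustration:

```shell
# Sample log with mixed-case levels
printf 'error: disk full\nok: started\nERROR: timeout\n' > logfile.txt

# Count matching lines, case-insensitively, instead of printing them
grep -ci 'error' logfile.txt   # prints 2

# Extended regex: match either pattern via alternation
grep -E 'error|timeout' logfile.txt
```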

3. Manipulating Text

3.1. Using sed

Purpose: Stream editor for transforming text. Example: Substitution
sed 's/foo/bar/' file.txt
Step-by-step:
  • Replaces the first occurrence of “foo” with “bar” in each line of file.txt.
Global substitution:
sed 's/foo/bar/g' file.txt
In-place editing:
sed -i.bak 's/foo/bar/g' file.txt
  • This modifies file.txt directly and creates a backup with a .bak extension.
Delete lines containing “delete this”:
sed '/delete this/d' file.txt
Print specific lines (e.g., first 5 lines):
sed -n '1,5p' file.txt
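sed can also insert and append whole lines. The one-line i/a form below is a GNU sed convenience (strictly portable scripts write i\ with the text on the following line); file.txt is created inline:

```shell
printf 'alpha\nbeta\n' > file.txt

# '1i' inserts a line before line 1
sed '1i header' file.txt

# '$a' appends a line after the last line
sed '$a footer' file.txt
```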

3.2. Using awk

Purpose: Powerful text processing tool. Example: Print specific columns
awk '{print $1, $3}' data.txt
Step-by-step:
  • Prints the first and third columns from data.txt.
Conditional printing:
awk '$2 > 100' data.txt
  • Only prints lines where the second column is greater than 100.
Sum a column:
awk '{sum += $2} END {print sum}' data.txt
  • Calculates and prints the sum of the second column.
Using a custom field separator:
awk -F, '{print $1, $2}' data.csv
  • This uses a comma as the delimiter in data.csv.
Modify columns:
awk '{$2 += 10; print}' data.txt
  • Adds 10 to the second field of each line and prints the modified line. Note that when awk rewrites a field, it rejoins the line using its output field separator (a single space by default), so the original spacing is normalized.
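Building on the sum example, awk's built-in NR variable (the number of records read so far) makes averages a one-liner. A sketch over a data.txt created inline for illustration:

```shell
# Two columns: label and value
printf 'a 10\nb 20\nc 30\n' > data.txt

# Sum the second column, then divide by the line count in the END block
awk '{sum += $2} END {print sum / NR}' data.txt   # prints 20
```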

3.3. Using cut

Purpose: Extract specific sections from each line of input. Example: Cut by delimiter
cut -d ',' -f 1 data.csv
Step-by-step:
  • Extracts the first column from a CSV file, where columns are separated by commas.
Cut by character position:
cut -c 1-5 filename.txt
  • Extracts characters from position 1 to 5 from each line.
Multiple fields extraction:
cut -f 1,3 data.tsv
  • Extracts the first and third columns from a tab-separated file. Tab is cut's default field delimiter, so -d can be omitted here; to pass a tab explicitly in bash, use -d $'\t'.
Save output to a new file:
cut -d ',' -f 1 data.csv > first_column.txt
  • This command saves the extracted first column to first_column.txt.
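cut also pairs naturally with sort and uniq for quick frequency counts. A sketch over a hypothetical data.csv created inline (name,role per line):

```shell
printf 'alice,admin\nbob,user\ncarol,admin\n' > data.csv

# Extract the role column, sort it, and count how often each role occurs
cut -d ',' -f 2 data.csv | sort | uniq -c
```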

4. Combining Commands

Often, you may need to combine these commands using pipes (|) to create complex workflows.

Example 1: Combine grep and awk

Find lines containing “error” and print the second column.
grep 'error' logfile.txt | awk '{print $2}'

Example 2: Use cut with grep

Extract usernames from a user data file.
grep 'username' user_data.txt | cut -d ':' -f 1

Example 3: Use sed and cut

Replace “foo” with “bar” and then extract the first column.
sed 's/foo/bar/g' file.txt | cut -d ' ' -f 1
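All four tools can also be chained in one pipeline. The sketch below keeps error lines, normalizes the level with sed, extracts two fields with cut, and totals the byte counts with awk; the log is created inline, and its field layout (level, user, bytes) is an assumption for illustration:

```shell
# Sample log: level, user, bytes
printf 'error alice 120\ninfo bob 80\nerror carol 45\n' > logfile.txt

# Keep error lines, rewrite the level, keep user and bytes, then sum the bytes
grep 'error' logfile.txt \
  | sed 's/error/ERR/' \
  | cut -d ' ' -f 2,3 \
  | awk '{total += $2} END {print total}'   # prints 165
```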

Summary of Commands

Command  Description                 Example Command
cat      Display file contents       cat filename.txt
less     View file with pagination   less filename.txt
head     Show first lines of a file  head filename.txt
tail     Show last lines of a file   tail filename.txt
grep     Search for patterns         grep 'error' logfile.txt
sed      Edit/transform text         sed 's/foo/bar/g' file.txt
awk      Process structured text     awk '{print $1, $3}' data.txt
cut      Extract sections of text    cut -d ',' -f 1 data.csv