Viewing, Searching, and Manipulating Files Using sed, awk, cut, and grep
grep
grep is a command-line utility used for searching plain-text data for lines that match a specified pattern. It is particularly useful for filtering through large files or outputs to quickly find relevant information, such as error messages in log files or specific entries in data files. By using various options, users can perform case-insensitive searches, display line numbers, or invert matches, making it an essential tool for text processing and analysis.
sed
sed (stream editor) is a powerful command-line tool for parsing and transforming text in a file or input stream. It allows users to perform basic text manipulations, such as substitution, deletion, and insertion, through simple commands. sed is invaluable for automating repetitive text editing tasks, enabling users to modify files without opening them in an editor, which is especially useful in scripting and batch processing scenarios.
awk
awk is a small programming language and command-line tool for processing text that is organized into fields, such as whitespace-, comma-, or tab-delimited files. It runs actions on lines that match given patterns or conditions, which makes it well suited to generating reports or performing calculations on columns of data. Note that simple field splitting does not understand quoted CSV fields that contain the delimiter. With built-in variables, arithmetic, and functions, awk can handle fairly complex data processing in a single command.
cut
cut is a command-line utility used to extract sections from each line of input data, typically based on delimiters or character positions. It is particularly useful for retrieving specific columns from files or outputs, such as extracting usernames from a list or fields from a CSV file. By using cut, users can quickly isolate and manipulate relevant portions of text data, making it an essential tool for basic data processing tasks.
Here’s a detailed guide on viewing, searching, and manipulating files using sed, awk, cut, and grep, with step-by-step examples.
1. Viewing Files
1.1. Using cat
Purpose: Display the entire content of a file.
Example:
cat filename.txt
Step-by-step:
This command outputs all lines of filename.txt to the terminal.
1.2. Using less
Purpose: View a file one page at a time.
Example:
less filename.txt
Step-by-step:
Opens filename.txt in a paginated format.
Use SPACE to go forward and b to go back. Press q to exit.
1.3. Using head
Purpose: View the first few lines of a file.
Example:
head filename.txt
Step-by-step:
Displays the first 10 lines of filename.txt.
Customizing the number of lines:
head -n 5 filename.txt
1.4. Using tail
Purpose: View the last few lines of a file.
Example:
tail filename.txt
Step-by-step:
Shows the last 10 lines of filename.txt.
Real-time monitoring:
tail -f logfile.txt
This command follows the file, printing new lines as they are appended. Press Ctrl+C to stop following.
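head and tail can also be chained to show a range from the middle of a file. The following is a minimal sketch, assuming a hypothetical 10-line sample file created just for the demonstration:

```shell
# Create a hypothetical 10-line sample file (printf repeats its format per argument).
sample=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > "$sample"

# First three lines.
first_three=$(head -n 3 "$sample")
echo "$first_three"

# Last two lines.
last_two=$(tail -n 2 "$sample")
echo "$last_two"

# Lines 4 to 5: take the first five lines, then keep the last two of those.
middle=$(head -n 5 "$sample" | tail -n 2)
echo "$middle"
```

The last command prints "line 4" and "line 5".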
2. Searching for Patterns
2.1. Using grep
Purpose: Search for lines matching a specific pattern in a file.
Example:
grep 'error' logfile.txt
Step-by-step:
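This command looks for the string “error” anywhere in a line, including inside longer words such as “errors”, and prints the matching lines. Add -w to match “error” only as a whole word.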
Case-insensitive search:
grep -i 'error' logfile.txt
Search with line numbers:
grep -n 'error' logfile.txt
Invert match (show lines without “error”):
grep -v 'error' logfile.txt
Search recursively in a directory:
grep -r 'error' .
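To see how these options differ, it can help to run them against a small file. This is a sketch only; the log lines are made up for illustration:

```shell
# Hypothetical four-line log file.
log=$(mktemp)
printf '%s\n' \
  'INFO service started' \
  'ERROR disk full' \
  'info heartbeat ok' \
  'error: retry failed' > "$log"

# Case-sensitive: only the lowercase "error" line matches.
exact=$(grep 'error' "$log")

# Case-insensitive: matches both "ERROR" and "error".
any_case=$(grep -i 'error' "$log")

# With line numbers: each match is prefixed with "N:".
numbered=$(grep -in 'error' "$log")

# Inverted: lines that do not contain "error" in any letter case.
others=$(grep -iv 'error' "$log")

echo "$numbered"
```

The final echo prints "2:ERROR disk full" and "4:error: retry failed".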
3. Manipulating Text
3.1. Using sed
Purpose: Stream editor for transforming text.
Example: Substitution
sed 's/foo/bar/' file.txt
Step-by-step:
Replaces the first occurrence of “foo” with “bar” in each line of file.txt.
Global substitution:
sed 's/foo/bar/g' file.txt
In-place editing:
sed -i.bak 's/foo/bar/g' file.txt
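This modifies file.txt directly and keeps the original as file.txt.bak. The -i option is not part of POSIX sed; attaching the suffix directly to the option, as here, works with both GNU and BSD sed.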
Delete lines containing “delete this”:
sed '/delete this/d' file.txt
Print specific lines (e.g., first 5 lines):
sed -n '1,5p' file.txt
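The sed commands above can be tried safely on a throwaway file. A minimal sketch, assuming a hypothetical three-line sample:

```shell
# Hypothetical sample file.
f=$(mktemp)
printf '%s\n' 'foo foo' 'delete this line' 'keep foo' > "$f"

# Replace only the first match on each line.
first_only=$(sed 's/foo/bar/' "$f")

# Replace every match on every line.
all_matches=$(sed 's/foo/bar/g' "$f")

# Drop lines containing "delete this".
without_deleted=$(sed '/delete this/d' "$f")

# Print only lines 1 to 2; -n suppresses the default output.
first_two=$(sed -n '1,2p' "$f")

# Edit in place; the untouched original is kept as "$f.bak".
sed -i.bak 's/foo/bar/g' "$f"
edited=$(cat "$f")
backup=$(cat "$f.bak")

echo "$first_only"
```

The echo shows that the first line becomes "bar foo": without the g flag, the second "foo" on that line is left alone.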
3.2. Using awk
Purpose: Process text field by field, using patterns, conditions, and arithmetic.
Example: Print specific columns
awk '{print $1, $3}' data.txt
Step-by-step:
Prints the first and third columns from data.txt.
Conditional printing:
awk '$2 > 100' data.txt
Only prints lines where the second column is greater than 100.
Sum a column:
awk '{sum += $2} END {print sum}' data.txt
Calculates and prints the sum of the second column.
Using a custom field separator:
awk -F, '{print $1, $2}' data.csv
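This uses a comma as the input field separator for data.csv. The printed fields are still joined by a space, because the output separator (OFS) defaults to a space; add -v OFS=, to keep commas in the output.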
Modify columns:
awk '{$2 += 10; print}' data.txt
Adds 10 to every value in the second column and prints the result.
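The following sketch runs the awk commands above on a small made-up data set, so the results can be checked by hand. The names and numbers are assumptions for illustration:

```shell
# Hypothetical whitespace-separated data: name, amount, city.
data=$(mktemp)
printf '%s\n' 'alice 120 paris' 'bob 80 rome' 'carol 150 oslo' > "$data"

# Columns 1 and 3.
cols=$(awk '{print $1, $3}' "$data")

# Rows where column 2 is greater than 100.
big=$(awk '$2 > 100' "$data")

# Sum of column 2: 120 + 80 + 150.
total=$(awk '{sum += $2} END {print sum}' "$data")

# Add 10 to column 2 and print the rebuilt line.
bumped=$(awk '{$2 += 10; print}' "$data")

# Comma-separated input: output uses a space unless OFS is set.
csv_space=$(printf 'alice,120,paris\n' | awk -F, '{print $1, $2}')
csv_comma=$(printf 'alice,120,paris\n' | awk -F, -v OFS=, '{print $1, $2}')

echo "$total"   # prints 350
```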
3.3. Using cut
Purpose: Extract specific sections from each line of input.
Example: Cut by delimiter
cut -d ',' -f 1 data.csv
Step-by-step:
Extracts the first column from a CSV file, where columns are separated by commas.
Cut by character position:
cut -c 1-5 filename.txt
Extracts characters from position 1 to 5 from each line.
Multiple fields extraction:
cut -f 1,3 -d $'\t' data.tsv
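Extracts the first and third columns from a tab-separated file. The $'\t' syntax is understood by bash, ksh, and zsh but not by every shell; because tab is cut's default delimiter, the -d option can simply be omitted here.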
Save output to a new file:
cut -d ',' -f 1 data.csv > first_column.txt
This command saves the extracted first column to first_column.txt.
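As a quick check of the cut options above, here is a sketch using two hypothetical sample files, one comma-separated and one tab-separated:

```shell
# Hypothetical CSV: name, age, city.
csv=$(mktemp)
printf '%s\n' 'alice,30,paris' 'bob,25,rome' > "$csv"

# First comma-separated field.
names=$(cut -d ',' -f 1 "$csv")

# Fields 1 and 3; cut keeps the input delimiter between them.
name_city=$(cut -d ',' -f 1,3 "$csv")

# Characters 1 to 3 of each line, regardless of delimiters.
prefix=$(cut -c 1-3 "$csv")

# Tab-separated input: tab is the default delimiter, so -d is not needed.
tsv=$(mktemp)
printf 'alice\t30\tparis\n' > "$tsv"
tab_fields=$(cut -f 1,3 "$tsv")

echo "$name_city"
```

The echo prints "alice,paris" and "bob,rome".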
4. Combining Commands
Often, you may need to combine these commands using pipes (|) to create complex workflows.
Example 1: Combine grep and awk
Find lines containing “error” and print the second column.
grep 'error' logfile.txt | awk '{print $2}'
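Because awk can match patterns itself, the same result is possible without grep. The sketch below compares both forms on a made-up log whose second column is a time stamp (an assumption for illustration):

```shell
# Hypothetical log: date, time, level, message.
log=$(mktemp)
printf '%s\n' \
  '2024-01-01 10:00 error disk' \
  '2024-01-01 10:05 info ok' \
  '2024-01-02 11:30 error net' > "$log"

# grep filters the lines, awk prints the second column.
piped=$(grep 'error' "$log" | awk '{print $2}')

# awk alone: the /error/ pattern selects the lines.
awk_only=$(awk '/error/ {print $2}' "$log")

echo "$piped"
```

Both variables hold "10:00" and "11:30". The piped form is often easier to read; the single awk command avoids starting a second process.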
Example 2: Use cut with grep
Print the first colon-separated field of every line in a user data file that contains “username”.
grep 'username' user_data.txt | cut -d ':' -f 1
Example 3: Use sed and cut
Replace “foo” with “bar” and then extract the first column.
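One way to write this pipeline is sketched below. The comma delimiter, the global (g) substitution, and the sample data are assumptions, since the input format is not specified:

```shell
# Hypothetical comma-separated input.
input=$(mktemp)
printf '%s\n' 'foo,1,x' 'baz,2,foo' > "$input"

# sed rewrites every "foo" to "bar", then cut keeps the first field.
result=$(sed 's/foo/bar/g' "$input" | cut -d ',' -f 1)

echo "$result"
```

This prints "bar" and "baz". The substitution runs on the whole line before cut discards the other fields, so reversing the order (cut first, then sed) would give the same output here while doing less work.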