This content originally appeared on Level Up Coding - Medium and was authored by Afnan Mostafa
SED (stream editor) is a successor of ED and QED (quick editor) on Unix-like systems and is used for editing plain text. We often find ourselves needing to extract some data from a huge file, or to pull one or two columns out of a big data file. We take our coffee and sit down to write some code, or start searching the internet for code that already exists. I’m not calling anyone out, but how would you feel if I told you that a one-liner can do the heavy lifting for you? Yes, SED can do that with very simple one-liners.
In this article, I will summarize some basic file formatting, and then in the latter part of this article, I will try to show how to use SED to get data from a LAMMPS log file (LAMMPS is a molecular dynamics simulation software package).
Let’s say we have a text file named ‘foo.data’ and it has contents like this:
Syntax:
sed -option 'command' textfile > outputfile
Let’s break it down:
- sed = invokes SED
- option = any optional argument (-n turns off automatic printing, -i edits the file in place, -e adds an expression)
- 'command' = what you want to do (regular expression, pattern matching, deletion)
- textfile = file you want to work on
- outputfile = if you want to save the output to a file (optional)
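As a minimal sketch of this syntax in practice, the snippet below builds a small stand-in for foo.data and redirects the output of one command to a file. The article's actual file is not reproduced here: only lines 4 to 6 follow what the text describes, and the other lines, the temporary directory, and the name out.data are hypothetical placeholders.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for foo.data (7 lines; line 5 is blank).
printf '%s\n' 'first line' 'second line' 'third line' 'simulator' '' \
  'LAMMPS is a classical molecular dynamics simulation code designed to' \
  'last line' > foo.data

# sed -option 'command' textfile > outputfile
sed -n '2p' foo.data > out.data

cat out.data   # prints: second line
```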
Printing lines with SED:
If we want to print the 6th line of foo.data, then the command
sed '6p' foo.data
would give us the whole file content, with the 6th line (LAMMPS is a classical molecular dynamics simulation code designed to) printed twice: once by the automatic printing that happens when -n is not used, and once more by 6p.
But we can turn the automatic printing off by adding -n before the 'command':
sed -n '6p' foo.data
Now, the output will have only the 6th line:
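The difference between the two commands can be checked by counting output lines. This is only a sketch: the file below is a hypothetical 7-line stand-in for foo.data (lines 4 to 6 follow the article, the rest are placeholders).

```shell
workdir=$(mktemp -d)
file="$workdir/foo.data"

# Hypothetical stand-in for foo.data (7 lines; line 5 is blank).
printf '%s\n' 'first line' 'second line' 'third line' 'simulator' '' \
  'LAMMPS is a classical molecular dynamics simulation code designed to' \
  'last line' > "$file"

# Without -n: all 7 lines are auto-printed, plus line 6 once more from 'p'.
with_autoprint=$(sed '6p' "$file" | wc -l)

# With -n: only what 'p' prints explicitly.
only_sixth=$(sed -n '6p' "$file")

echo "lines printed without -n: $with_autoprint"   # 8
echo "printed with -n: $only_sixth"
```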
Similarly, for the 2nd line, we write '2p'. SED also counts blank lines: if you write '5p', you will get the 5th line (a blank line) as output. To print the 3rd through 5th lines, we use:
sed -n '3,5p' foo.data
Or, using the "start line, plus N more lines" address form (a GNU sed extension):
sed -n '3,+2p' foo.data
To get the final line, we use $:
sed -n '$p' foo.data
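A short sketch comparing the two range forms and the $ address, assuming GNU sed (the 3,+2 form is not in POSIX sed). The file is again a hypothetical stand-in for foo.data.

```shell
workdir=$(mktemp -d)
file="$workdir/foo.data"

# Hypothetical stand-in for foo.data (7 lines; line 5 is blank).
printf '%s\n' 'first line' 'second line' 'third line' 'simulator' '' \
  'LAMMPS is a classical molecular dynamics simulation code designed to' \
  'last line' > "$file"

range_a=$(sed -n '3,5p' "$file")    # lines 3 through 5
range_b=$(sed -n '3,+2p' "$file")   # line 3 plus the next 2 lines (GNU extension)
last=$(sed -n '$p' "$file")         # $ addresses the final line

[ "$range_a" = "$range_b" ] && echo "both range forms select the same lines"
echo "last line: $last"
```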
Deletion:
sed '6d' foo.data
This command prints the file without its 6th line (LAMMPS is a classical molecular dynamics simulation code designed to), but the original text file is unchanged. To modify the file itself, we have to use -i (in-place) before the command.
sed -i '6d' foo.data
Now, to see the file, we can use either of these tools: cat, less, or more.
less foo.data
more foo.data
cat foo.data
For less foo.data, the output buffer will look like this:
You can see that the line we deleted is missing: the -i option modified the original text file. Now, let’s put this line back:
sed -i '5 a LAMMPS is a classical molecular dynamics simulation code designed to' foo.data
This will put the chosen line after the 5th line. Let’s see what each part means:
- any number (here, 5) = go to the 5th line.
- a = append after the 5th line (so the new text becomes the 6th line)
- LAMMPS is …. = the line to be appended
Now, if we view the file again using less, we will see that the original content has been restored.
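The delete-and-restore round trip can be sketched as follows. It assumes GNU sed: the bare -i and the one-line form of a are GNU behavior (BSD/macOS sed expects a backup suffix after -i, for example -i ''). The file contents and the .orig copy are hypothetical stand-ins.

```shell
workdir=$(mktemp -d)
file="$workdir/foo.data"

# Hypothetical stand-in for foo.data (7 lines; line 5 is blank).
printf '%s\n' 'first line' 'second line' 'third line' 'simulator' '' \
  'LAMMPS is a classical molecular dynamics simulation code designed to' \
  'last line' > "$file"
cp "$file" "$file.orig"   # reference copy for the final comparison

# Without -i, the result only goes to standard output; the file is untouched.
sed '6d' "$file" > /dev/null
lines_before=$(wc -l < "$file")   # still 7

# With -i, the file itself is rewritten without line 6.
sed -i '6d' "$file"
lines_after=$(wc -l < "$file")    # now 6

# Append the deleted text after line 5, making it line 6 again.
sed -i '5 a LAMMPS is a classical molecular dynamics simulation code designed to' "$file"

cmp -s "$file" "$file.orig" && echo "file restored"
```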
Printing odd/even lines:
ODD:
sed -n '1~2p' foo.data
This will print the odd lines (1, 3, 5, …).
EVEN:
sed -n '2~2p' foo.data
This will print the even lines (2, 4, 6, …).
- 1~2 = start from the 1st line, then step by 2 lines: 1+2 = 3, 3+2 = 5, …
- 2~2 = start from the 2nd line, then step by 2 lines: 2+2 = 4, 4+2 = 6, …
Similarly, 4~3 will get you lines 4, 7, 10, …. Note that the first~step address form is a GNU sed extension.
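To see which lines a first~step address selects, a quick sketch is to feed sed the numbers 1 to 10, so that each line's content equals its line number. This assumes GNU sed and the seq and paste utilities; the input is illustrative, not the article's file.

```shell
# Each input line holds its own line number, so the output shows the selection.
odd=$(seq 1 10 | sed -n '1~2p' | paste -s -d ' ' -)
even=$(seq 1 10 | sed -n '2~2p' | paste -s -d ' ' -)
every_third_from_4=$(seq 1 10 | sed -n '4~3p' | paste -s -d ' ' -)

echo "1~2 -> $odd"                  # 1 3 5 7 9
echo "2~2 -> $even"                 # 2 4 6 8 10
echo "4~3 -> $every_third_from_4"   # 4 7 10
```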
Insertion:
What if we want to insert a line at the 4th line?
sed -i '4 i Inserted line here' foo.data
cat foo.data
Change line contents:
sed -i '5 c simulation package' foo.data
cat foo.data
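The insert and change steps can be sketched on a stand-in file like this. It assumes GNU sed (one-line i and c, bare -i); apart from lines 4 to 6, the file contents are hypothetical.

```shell
workdir=$(mktemp -d)
file="$workdir/foo.data"

# Hypothetical stand-in for foo.data (7 lines; line 4 is 'simulator').
printf '%s\n' 'first line' 'second line' 'third line' 'simulator' '' \
  'LAMMPS is a classical molecular dynamics simulation code designed to' \
  'last line' > "$file"

# 'i' inserts before line 4, so the new text becomes line 4
# and the old line 4 ('simulator') moves down to line 5.
sed -i '4 i Inserted line here' "$file"

# 'c' replaces the whole of line 5 with the given text.
sed -i '5 c simulation package' "$file"

fourth=$(sed -n '4p' "$file")
fifth=$(sed -n '5p' "$file")
total=$(sed -n '$=' "$file")

echo "line 4: $fourth"
echo "line 5: $fifth"
echo "total lines: $total"   # 8
```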
Now let’s get back to where we were before:
sed -i '4d; 5 c simulator' foo.data
This will delete the 4th line (‘Inserted line here’) and change the 5th line back to ‘simulator’ from ‘simulation package’. Both addresses refer to line numbers in the file as it is read, so deleting line 4 does not shift the 5 in the second command. A semicolon (;) separates the two commands. This makes the file look exactly the same as the file we started with.
Another way:
sed -i -e '4d' -e '5 c simulator' foo.data
- each -e adds one expression (command) to the script, and the expressions run in the order given.
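A sketch showing that the semicolon form and the -e form give the same result, assuming GNU sed. The input recreates the modified state described above using hypothetical surrounding lines, and -i is left out so the file can be reused for both runs.

```shell
workdir=$(mktemp -d)
file="$workdir/modified.data"

# Hypothetical file in its modified state: inserted line at 4, changed line at 5.
printf '%s\n' 'first line' 'second line' 'third line' 'Inserted line here' \
  'simulation package' '' \
  'LAMMPS is a classical molecular dynamics simulation code designed to' \
  'last line' > "$file"

# Form 1: commands separated by a semicolon. 'c' is placed last because,
# in GNU sed's one-line form, its text runs to the end of the line.
with_semicolon=$(sed '4d; 5 c simulator' "$file")

# Form 2: one -e per command.
with_e=$(sed -e '4d' -e '5 c simulator' "$file")

[ "$with_semicolon" = "$with_e" ] && echo "both forms agree"
printf '%s\n' "$with_e" | sed -n '4p'   # prints: simulator
```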
Miscellaneous:
sed '=' foo.data
- This will print each line’s number on its own line before that line, a good way to see individual lines.
sed 'G' foo.data
- This will add a blank line after each line.
sed '=;G' foo.data
or, sed -e '=' -e 'G' foo.data
- This will print the line numbers and also add a blank line after each line.
sed '1=' foo.data
- This will print the line number (1) before the 1st line; the rest of the file is printed as usual.
sed '$=' foo.data
- This will print the last line’s number just before the last line.
sed -n '$=' foo.data
- This will print only the last line’s number, which is the total number of lines in the file (no content is printed).
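A brief sketch of the = command on a hypothetical 7-line stand-in for foo.data, showing the numbered output and the line count:

```shell
workdir=$(mktemp -d)
file="$workdir/foo.data"

# Hypothetical stand-in for foo.data (7 lines; line 5 is blank).
printf '%s\n' 'first line' 'second line' 'third line' 'simulator' '' \
  'LAMMPS is a classical molecular dynamics simulation code designed to' \
  'last line' > "$file"

# '=' puts each line number on its own line, ahead of the line it numbers.
numbered_head=$(sed '=' "$file" | sed -n '1,2p')

# With -n and the $ address, only the last line's number is printed.
total=$(sed -n '$=' "$file")

printf '%s\n' "$numbered_head"   # prints "1", then "first line"
echo "total lines: $total"       # 7
```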
sed '1~2 w oddlines.data' foo.data
- This will write only the odd lines into a file named oddlines.data. The whole file is still printed to the terminal unless -n is added.
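The behavior of w with and without -n can be sketched as below, assuming GNU sed for the 1~2 address. The numeric input and the file names screen.txt, quiet.txt, and odd_only.data are illustrative only.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Illustrative input: six lines holding the numbers 1 to 6.
seq 1 6 > foo.data

# Without -n: odd lines go to oddlines.data, and all lines still go to stdout.
sed '1~2 w oddlines.data' foo.data > screen.txt

# With -n: the odd lines are written to the file and nothing is printed.
sed -n '1~2 w odd_only.data' foo.data > quiet.txt

paste -s -d ' ' oddlines.data               # 1 3 5
echo "stdout lines without -n: $(wc -l < screen.txt)"
echo "stdout lines with -n: $(wc -l < quiet.txt)"
```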
I will go through how SED handles pattern matching in part 2 and show you how you can use SED to manipulate large data files using only a line or two. It’s that simple!
I am no Unix prodigy, but I love using SED and similar tools for convenience. I am sharing what I know, and this also serves as my digital go-to cheat sheet, so that’s two birds with one stone! Thank you for sticking with this long article. Have a nice day!
Afnan Mostafa | Sciencx (2022-06-17T14:56:48+00:00) How to use SED to manipulate text files (Part 1: print, delete, append). Retrieved from https://www.scien.cx/2022/06/17/how-to-use-sed-to-manipulate-text-files-part-1-print-delete-append/