Tuesday, March 27, 2012

sed - Replace or substitute file contents - Part 2



In one of our earlier articles, we saw how to replace and substitute text using sed. Continuing from there, we will look at a few more search-and-replace operations that are frequently done on files using sed.

Let us consider a file with the following contents:
$ cat file
RE01:EMP1:25:2500
RE02:EMP2:26:2650
RE03:EMP3:24:3500
RE04:EMP4:27:2900

1. To replace the first two characters of a string or a line with, say, "XX":
$ sed 's/^../XX/' file
XX01:EMP1:25:2500
XX02:EMP2:26:2650
XX03:EMP3:24:3500
XX04:EMP4:27:2900
     The "^" symbol anchors the match at the beginning of the line. The two dots match any 2 characters.

     The same thing can also be achieved without the caret (^) symbol, as shown below. This works because sed scans each line from left to right and, without the g flag, replaces only the first match, which here is the first two characters of the line.
$ sed 's/../XX/' file
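The caret starts to matter as soon as the g flag (replace every match on the line) is added. The snippet below is only a small sketch of that difference; it feeds one sample record through printf instead of reading the file, and the variable names are made up for illustration.

```shell
# One sample record, supplied inline so the snippet does not depend on any file.
line='RE01:EMP1:25:2500'

# Without "^" and with the g flag, every successive pair of characters is replaced.
unanchored=$(printf '%s\n' "$line" | sed 's/../XX/g')

# With "^", only the pair at the very beginning of the line can match, even with g.
anchored=$(printf '%s\n' "$line" | sed 's/^../XX/g')

printf '%s\n' "$unanchored" "$anchored"
```

The first result is a run of X characters followed by the one leftover character, while the second changes only the leading "RE".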
2.  Along the same lines, to remove or delete the first two characters of a string or a line:
$ sed 's/^..//' file
01:EMP1:25:2500
02:EMP2:26:2650
03:EMP3:24:3500
04:EMP4:27:2900
  Here the replacement string is empty, and hence the matched characters get deleted.

3. Similarly, to remove/delete the last two characters in the string:
$ sed 's/..$//' file
RE01:EMP1:25:25
RE02:EMP2:26:26
RE03:EMP3:24:35
RE04:EMP4:27:29
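When more than a couple of characters have to be removed, typing one dot per character gets clumsy. A possible alternative, sketched here on a single inline record, is the interval expression \{n\} of basic regular expressions, which repeats the preceding item exactly n times. The count of 4 is just an example value.

```shell
line='RE01:EMP1:25:2500'

# ".\{4\}" means exactly four of any character.
# Anchored with "$", it removes the last four characters of the line.
trimmed=$(printf '%s\n' "$line" | sed 's/.\{4\}$//')

printf '%s\n' "$trimmed"
```

The same idea works at the beginning of the line with 's/^.\{4\}//'.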
4. To add a string to the end of a line:
$ sed 's/$/.Rs/' file
RE01:EMP1:25:2500.Rs
RE02:EMP2:26:2650.Rs
RE03:EMP3:24:3500.Rs
RE04:EMP4:27:2900.Rs
     Here the string ".Rs" is being added to the end of the line.

5.  To add empty spaces to the beginning of every line in a file:
$ sed 's/^/   /' file
   RE01:EMP1:25:2500
   RE02:EMP2:26:2650
   RE03:EMP3:24:3500
   RE04:EMP4:27:2900
    To make the change done by any sed command permanent, or in other words, to save the changes in the same file, use the option "-i":
$ sed -i 's/^/   /' file
$ cat file
   RE01:EMP1:25:2500
   RE02:EMP2:26:2650
   RE03:EMP3:24:3500
   RE04:EMP4:27:2900
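Since "-i" overwrites the file, it can be safer to keep a copy of the original. The sketch below assumes a sed that accepts a backup suffix attached to -i (GNU sed and BSD sed both do), and it works on a throwaway file from mktemp so that nothing real is touched. The ".bak" suffix and the variable names are arbitrary choices.

```shell
# Work on a temporary copy so that no real file is modified.
tmp=$(mktemp)
printf 'RE01:EMP1:25:2500\nRE02:EMP2:26:2650\n' > "$tmp"

# "-i.bak" edits the file in place and keeps the original as "<name>.bak".
sed -i.bak 's/^/   /' "$tmp"

edited=$(cat "$tmp")
backup=$(cat "$tmp.bak")
printf '%s\n' "$edited"

# Clean up the temporary file and its backup.
rm -f "$tmp" "$tmp.bak"
```

If the result is not what was expected, the untouched contents can be restored from the ".bak" file.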
6. To remove empty spaces from the beginning of a line:
$ sed 's/^ *//' file
RE01:EMP1:25:2500
RE02:EMP2:26:2650
RE03:EMP3:24:3500
RE04:EMP4:27:2900
      "^ *" (a caret, followed by a space and a *) matches a sequence of zero or more spaces at the beginning of the line.

7. To remove empty spaces from the beginning and end of a string:
$ sed 's/^ *//; s/ *$//' file
RE01:EMP1:25:2500
RE02:EMP2:26:2650
RE03:EMP3:24:3500
RE04:EMP4:27:2900
    This example also shows how to run multiple sed substitutions as part of the same command.

     The same command can also be written as:
$ sed -e 's/^ *//' -e 's/ *$//' file
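The patterns used so far strip only space characters. If the padding may also contain tabs, one option is the POSIX character class [[:space:]], which matches any whitespace character. The following is a rough sketch on an inline sample; the padded test string is made up for illustration.

```shell
# A record padded with a tab and spaces on both sides.
padded=$(printf '\t  RE01:EMP1:25:2500 \t')

# " *" matches spaces only, so the tabs at both ends block the match
# and the line comes out unchanged.
spaces_only=$(printf '%s\n' "$padded" | sed 's/^ *//; s/ *$//')

# "[[:space:]]*" matches spaces and tabs alike, so both ends are trimmed.
trimmed=$(printf '%s\n' "$padded" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')

printf '[%s]\n' "$spaces_only" "$trimmed"
```

The brackets in the printf are there only to make the remaining whitespace visible.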

8. To add a character before and after a string, or in other words, to enclose the string in something:
$ sed 's/.*/"&"/' file
"RE01:EMP1:25:2500"
"RE02:EMP2:26:2650"
"RE03:EMP3:24:3500"
"RE04:EMP4:27:2900"
     ".*" matches the entire line. '&' denotes the text matched by the pattern. The replacement '"&"' therefore puts a double-quote at the beginning and at the end of the string.

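The '&' always stands for exactly the text that the pattern matched, which need not be the whole line. As a small sketch on an inline record, the first command below wraps only the trailing number, and the second shows that a backslash turns '&' into a literal ampersand. The angle brackets and the " & Co" text are arbitrary.

```shell
line='RE01:EMP1:25:2500'

# "&" expands to whatever the pattern matched; here only the trailing digits.
wrapped=$(printf '%s\n' "$line" | sed 's/[0-9]*$/<&>/')

# A backslash makes "&" literal, so an actual ampersand is appended.
literal=$(printf '%s\n' "$line" | sed 's/$/ \& Co/')

printf '%s\n' "$wrapped" "$literal"
```
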
9. To remove the first and last character of a string:
$ sed 's/^.//;s/.$//' file
E01:EMP1:25:250
E02:EMP2:26:265
E03:EMP3:24:350
E04:EMP4:27:290
10. To remove everything up to the first digit:
$ sed 's/^[^0-9]*//' file
01:EMP1:25:2500
02:EMP2:26:2650
03:EMP3:24:3500
04:EMP4:27:2900
    Similarly, to remove everything up to the first letter:
$ sed 's/^[^a-zA-Z]*//' file
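On the sample file the second command changes nothing, since every line already begins with a letter. The two patterns are easier to compare on a line that starts with neither a digit nor a letter. The input below is a made-up string used purely for illustration.

```shell
line='--- 42 apples'

# "[^0-9]*" consumes everything that is not a digit, stopping at "4".
from_digit=$(printf '%s\n' "$line" | sed 's/^[^0-9]*//')

# "[^a-zA-Z]*" consumes everything that is not a letter, stopping at "a".
from_letter=$(printf '%s\n' "$line" | sed 's/^[^a-zA-Z]*//')

printf '%s\n' "$from_digit" "$from_letter"
```
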
11. To remove a numerical word from the end of the string:
$ sed 's/[0-9]*$//' file
RE01:EMP1:25:
RE02:EMP2:26:
RE03:EMP3:24:
RE04:EMP4:27:
12. To get the last column of a file with a delimiter. The delimiter in this case is ":".
$ sed 's/.*://' file
2500
2650
3500
2900
    At first glance, one might expect the above command to print the same contents without the first column and the delimiter. However, regular expression matching in sed is greedy: '.*:' matches as much as possible, that is, everything up to and including the last colon. Hence, we get only the content after the last colon.
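To remove only the first column instead, the match has to be prevented from running past the first colon. One common way, sketched here on an inline record, is to replace the dot with the negated class [^:], which cannot cross a delimiter.

```shell
line='RE01:EMP1:25:2500'

# Greedy ".*:" runs up to the LAST colon, leaving only the last column.
last_col=$(printf '%s\n' "$line" | sed 's/.*://')

# "[^:]*:" cannot cross a colon, so it stops at the FIRST one and
# removes only the first column and its delimiter.
without_first=$(printf '%s\n' "$line" | sed 's/^[^:]*://')

printf '%s\n' "$last_col" "$without_first"
```
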

13. To convert the entire line into lower case:
$ sed 's/.*/\L&/' file
re01:emp1:25:2500
re02:emp2:26:2650
re03:emp3:24:3500
re04:emp4:27:2900
      \L is the sed escape (a GNU sed extension) to convert to lower case. The text following \L in the replacement gets converted. Since & (the text matched, which is the entire line in this case) follows \L, the entire line gets converted to lower case.

14. To convert the entire line or a string to uppercase :
$ sed 's/.*/\U&/' file
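RE01:EMP1:25:2500
RE02:EMP2:26:2650
RE03:EMP3:24:3500
RE04:EMP4:27:2900
       Same as above, with \U instead of \L. Since the letters in the sample file are already in upper case, the output here looks the same as the input.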
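The case conversion need not cover the whole line. In GNU sed, \E ends a conversion started by \L or \U, so it can be limited to one part of the replacement. The sketch below assumes GNU sed for the first command and uses tr in the second as a portable way to convert a whole line. The variable names are illustrative only.

```shell
line='RE01:EMP1:25:2500'

# \L lower-cases the first column (\1); \E stops the conversion
# so that the second column (\2) is left as it is. GNU sed only.
first_lower=$(printf '%s\n' "$line" | sed 's/^\([^:]*\):\([^:]*\)/\L\1\E:\2/')

# Where \L is not available, tr can lower-case the entire line.
portable=$(printf '%s\n' "$line" | tr '[:upper:]' '[:lower:]')

printf '%s\n' "$first_lower" "$portable"
```
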
