Thursday, March 17, 2011

Different ways to delete the ^M character in a file



 Control-M (^M) is the carriage return character, usually found at the end of each line in files transferred from Windows. Before processing these files in UNIX, we need to remove the ^M characters. In this article, let us see the different ways to delete Control-M from a file:

 Consider a file, file1, which contains Control-M characters:
$ cat -v file1
a^M
b^M
c^M
d^M
e^M
f^M


  The -v option of the cat command displays non-printable characters in a visible notation. We have used it here to display the Control-M characters, which would not have been displayed otherwise.
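  To try the commands in this article without a Windows file at hand, a sample can be built with printf. The sketch below is only an illustration: the scratch directory and the names workdir, cr and crlf_lines are assumptions, not part of the original example. It also counts the affected lines with grep, which can be handy on large files.

```shell
# Build a small sample with DOS line endings in a scratch directory.
workdir=$(mktemp -d)
printf 'a\r\nb\r\nc\r\n' > "$workdir/file1"

# cat -v renders each carriage return as ^M.
cat -v "$workdir/file1"

# Hold a real carriage return in a variable, then count the lines ending in one.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$workdir/file1")
echo "lines ending in CR: $crlf_lines"
```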

1. dos2unix The simplest of all is the dos2unix command. Some Unix flavors provide this command to handle DOS files, and it removes the Control-M character automatically. This command updates the source file itself.
$ dos2unix file1
2. tr  Using tr, we can do this in 2 ways:
$ tr -d '^M' <file1
  Note: When typing the above command, the Control-M should be entered as Control-V followed by Control-M, and not as Shift-6 followed by M (a caret and the letter M).
  The "-d" option of tr deletes the occurrences of Control-M. To confirm the above output does not contain Control-M, pipe the output of the above command to "cat -v".
$ tr -d '^M' <file1 | cat -v
  The above command does not update the source file; it just displays the contents of file1 without Control-M. To update the source file, redirect the output to a temporary file, and rename it to the source file.
$ tr -d '\015' <file1 >file2
$ mv file2 file1
 The other way to get the same thing done using tr is to use the octal escape sequence for Control-M, which is \015.
$ tr -d "\015" <file1
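  The temporary-file step can be wrapped in a small function. This is a minimal sketch, not a standard utility: the name strip_cr_tr and the scratch directory are made up for illustration. It uses '\r', which tr treats the same as '\015', and copies the result back with cat instead of mv so that the original file keeps its permissions.

```shell
# strip_cr_tr FILE: delete every carriage return from FILE.
# (Hypothetical helper name, used only for this illustration.)
strip_cr_tr() {
    tmp=$(mktemp) || return 1
    # Write the cleaned copy first; touch the original only if tr succeeded.
    if tr -d '\r' < "$1" > "$tmp"; then
        cat "$tmp" > "$1"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

workdir=$(mktemp -d)
printf 'a\r\nb\r\n' > "$workdir/file1"
strip_cr_tr "$workdir/file1"
cat -v "$workdir/file1"
```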
3. sed The removal of Control-M using sed can be done as shown below:
$ sed 's/^M//g' file1
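  The above command replaces all the occurrences of Control-M with nothing. In other words, it simply removes the occurrences of Control-M. As with tr, the ^M must be typed as Control-V followed by Control-M; a literal caret followed by M would instead delete the letter M at the beginning of each line. If your sed has the '-i' option, you can remove the Control-M directly in the file itself: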
$ sed -i 's/^M//g' file1
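  A way to avoid typing the control character altogether is to keep a real carriage return in a shell variable and anchor the match to the end of the line. The sketch below assumes a scratch directory and the illustrative names workdir, cr and wrong; it also shows what goes wrong when ^M is typed as a caret followed by the letter M.

```shell
workdir=$(mktemp -d)
printf 'Mango\r\nMelon\r\n' > "$workdir/file1"

# Hold a real carriage return in a variable instead of typing Control-V Control-M.
cr=$(printf '\r')

# Remove a carriage return only where it ends a line; leading "M" letters are untouched.
sed "s/${cr}\$//" "$workdir/file1" > "$workdir/file1.unix"
cat -v "$workdir/file1.unix"

# For contrast: a literal caret plus M is the regex "M at start of line",
# so the letter M is deleted and the carriage return stays.
wrong=$(printf 'Mango\r\n' | sed 's/^M//' | cat -v)
echo "$wrong"
```

  With GNU sed, the same substitution can also be written as 's/\r$//', since GNU sed understands the \r escape.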
4. vi In vi, we can substitute all the ^M characters with nothing. Open the source file in vi, and give the below command in command mode (press Escape first), typing the ^M as Control-V followed by Control-M:
 :%s/^M//
5. awk The sub function in awk can be used for the substitution. The sub call is placed inside an action block, and the 1 is used to print the lines, so that each line is printed exactly once.
$ awk '{sub(/^M/,"")}1' file1
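  awk also accepts the \r escape inside a regular expression, so the control character need not be typed at all. The following is a small sketch under that assumption, with a scratch directory and file names chosen only for illustration. The substitution sits in an action block, and the trailing 1 prints every line once.

```shell
workdir=$(mktemp -d)
printf 'a\r\nb\r\n' > "$workdir/file1"

# The action block strips a trailing carriage return; the pattern 1 prints each line once.
awk '{ sub(/\r$/, "") } 1' "$workdir/file1" > "$workdir/file1.unix"
cat -v "$workdir/file1.unix"
```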
6. perl Perl too provides substitution, as in sed.
$ perl -p -e 's/^M//g' file1
 Perl also has the '-i' option which edits the file in place.
$ perl -pi -e 's/^M//g' file1

6 comments:

  1. Excellent post, great job.

  2. Most of these commands will delete the letters 'M' at the beginning of the line. Awful article.

    Replies
    1. Please read the instructions carefully before using it.

  3. Thank you. Have been trying to do this from half an hour.
