Monday, December 3, 2012

How to remove the leading and trailing spaces in a file?



How to remove / delete the leading and trailing spaces in a file? How to replace a group of spaces with a single space?
Consider a file with the following content:
$ cat file
 Linux  25
Fedora   40
Suse 36
    CentOS 50
LinuxMint 15
Using the -e option of cat, the trailing spaces can be noticed easily (the $ symbol marks the end of each line):
$ cat -e file
 Linux  25$
Fedora   40$
Suse 36    $
    CentOS 50$
LinuxMint 15$
Here are the different ways to remove these spaces:
1. awk command:
$ awk '$1=$1' file
Linux 25
Fedora 40
Suse 36
CentOS 50
LinuxMint 15
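awk rebuilds the whole record whenever a field is assigned: the fields are rejoined with the output field separator (a single space by default), which drops the leading and trailing whitespace and squeezes every run of whitespace between fields into one space. Assigning $1 to itself ('$1=$1') leaves the field value unchanged, but this dummy edit is enough to trigger the rebuild. The assignment is used here as the pattern, and since its value is non-empty for every line of this file, awk prints each rebuilt line.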
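One caveat of using the assignment as the pattern can be sketched as follows: awk prints a line only when the pattern is true, so a blank line, or a line whose first field is 0, is silently dropped. Moving the assignment into an action and adding a trailing 1 prints every line. The sample input and the variable names below are made up for illustration.

```shell
# Hypothetical input: a normal line, a blank line, and a line whose first field is 0.
sample=$(printf ' Linux  25\n\n0   40 \n')

# Pattern form: the line is printed only if the assigned value is non-empty and non-zero.
pattern_form=$(printf '%s\n' "$sample" | awk '$1=$1')

# Action form: the assignment still rebuilds the record; the pattern 1 prints every line.
action_form=$(printf '%s\n' "$sample" | awk '{$1=$1}1')

printf 'pattern form:\n%s\n' "$pattern_form"
printf 'action form:\n%s\n' "$action_form"
```

With this input, the pattern form keeps only the first line, while the action form keeps all three.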

2. sed command:
$ sed 's/^ *//;s/ *$//;s/  */ /g' file
Linux 25
Fedora 40
Suse 36
CentOS 50
LinuxMint 15
The spaces are removed using three substitutions in sed. The first removes the leading spaces, the second removes the trailing spaces, and the third replaces every group of spaces with a single space (the g flag applies it to all groups on the line, not just the first). The source file itself can be updated by using the -i option of sed.
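As a hedged sketch of the in-place update, the following runs the substitutions on a scratch file (the variable demo and its contents are made up for illustration). It assumes a sed that accepts -i with an attached backup suffix, as GNU and BSD sed do; -i is not part of POSIX sed.

```shell
# Work on a scratch copy so the sketch does not touch real data.
demo=$(mktemp)
printf ' Linux  25\nSuse 36    \n    CentOS 50\n' > "$demo"

# -i.bak edits the file in place and keeps the original as "$demo.bak".
# The g flag squeezes every group of spaces on a line, not only the first one.
sed -i.bak 's/^ *//;s/ *$//;s/  */ /g' "$demo"

cat "$demo"
```

Keeping the .bak copy makes it possible to compare or restore the original file if the result is not what was expected.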

3. Perl solution:
$ perl -lpe 's/^\s+//;s/\s+$//;s/\s+/ /g' file
Linux 25
Fedora 40
Suse 36
CentOS 50
LinuxMint 15
This is almost the same as the sed solution; here \s matches tabs as well as spaces. Like sed, the source file itself can be updated by adding the -i option to the above command.

4. Bash solution:
$ while read -r f1 f2
> do
>  echo "$f1 $f2"
> done < file
Linux 25
Fedora 40
Suse 36
CentOS 50
LinuxMint 15
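The while loop reads the 2 columns into the variables f1 and f2. read splits each line on whitespace and discards the leading and trailing whitespace, so echoing the two variables back with a single space between them gives the cleaned-up line.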
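The loop above reads exactly two named columns. A sketch that does not depend on the number of columns is shown below; the function name squeeze_spaces is made up for illustration. It relies on the shell's own word splitting with the default IFS and turns off filename expansion while splitting, so a field such as * is kept as is.

```shell
# Hypothetical helper: reads lines on stdin, writes them with the leading and
# trailing whitespace removed and every run of whitespace squeezed to one space.
squeeze_spaces() {
    # IFS= keeps read from trimming; the || test handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        set -f              # turn off filename expansion during the split
        set -- $line        # unquoted on purpose: split the line into words
        set +f
        printf '%s\n' "$*"  # join the words with a single space
    done
}

printf ' Linux  25\nSuse 36    \n  a   *    c \n' | squeeze_spaces
```

For the file used in this article, the call would be squeeze_spaces < file.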
