Tuesday, September 4, 2012

8 examples to change the delimiter of a file in Linux



How to change the delimiter of a file from comma to colon?

  Let us consider a file with the following contents:
$ cat file
Unix,10,A
Linux,30,B
Solaris,40,C
HPUX,20,D
Ubuntu,50,E
1. sed solution:
$ sed 's/,/:/g' file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
Using the sed substitution (s) command, all (g) commas are replaced with colons.
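To see what the g flag contributes, here is a small sketch that runs the same substitution with and without it. The variable names (line, first_only, all) are only for illustration and are not part of the original example.

```shell
# Sample line taken from the file above.
line='Unix,10,A'

# Without the g flag, only the first comma on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/,/:/')

# With the g flag, every comma on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/,/:/g')

printf '%s\n' "$first_only"   # Unix:10,A
printf '%s\n' "$all"          # Unix:10:A
```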

2. awk solution:
$ awk '$1=$1' FS="," OFS=":" file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
FS and OFS are awk special variables which hold the input field separator and the output field separator respectively. Here, FS is set to comma and OFS is set to colon. awk applies OFS only when it rebuilds the record, which happens when a field is modified. The assignment $1=$1 leaves the data unchanged, but forces that rebuild. Since the assignment is also used as the pattern, a line is printed only when its first field is non-empty and not zero, which is true for every line of this file.
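One caveat with using $1=$1 as the pattern: a line whose first field is empty evaluates to false and gets dropped. The sketch below uses a made-up two-line input (not the sample file) to show this, along with a variant that does the assignment inside an action and uses 1 as an always-true pattern. The -F and -v options are standard awk ways of setting FS and OFS.

```shell
# Hypothetical input: the second line has an empty first field.
input=$(printf '%s\n' 'Unix,10,A' ',30,B')

# As a pattern, $1=$1 is false for the empty first field, so that line is skipped.
as_pattern=$(printf '%s\n' "$input" | awk -F, -v OFS=: '$1=$1')

# As an action followed by the pattern 1, every line is rebuilt and printed.
as_action=$(printf '%s\n' "$input" | awk -F, -v OFS=: '{ $1 = $1 } 1')

printf '%s\n' "$as_pattern"   # Unix:10:A
printf '%s\n' "$as_action"    # Unix:10:A, then :30:B
```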

3. awk using gsub function:
$ awk 'gsub(",",":")' file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
The gsub function in awk is for global substitution, that is, it substitutes all occurrences. awk provides one more function for substitution: sub. The difference between the two is that sub replaces only the first occurrence, whereas gsub replaces all occurrences. gsub returns the number of substitutions made, so when it is used as the pattern as done here, only the lines in which at least one comma got replaced are printed.
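The difference between sub and gsub can be checked quickly on a single line. This is only a sketch; the variable names are illustrative, and the functions are called inside an action so that the line is printed regardless of the return value.

```shell
line='Unix,10,A'

# sub replaces only the first comma.
with_sub=$(printf '%s\n' "$line" | awk '{ sub(",", ":"); print }')

# gsub replaces every comma.
with_gsub=$(printf '%s\n' "$line" | awk '{ gsub(",", ":"); print }')

# gsub returns the number of replacements it made.
count=$(printf '%s\n' "$line" | awk '{ print gsub(",", ":") }')

printf '%s\n' "$with_sub"    # Unix:10,A
printf '%s\n' "$with_gsub"   # Unix:10:A
printf '%s\n' "$count"       # 2
```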

4. tr solution:
$ tr ',' ':' < file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
tr can be used for multiple things: to delete, squeeze or replace specific characters. In this case, it is used to replace the commas with colons.
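For completeness, the other two uses of tr mentioned above (delete and squeeze) might look like the following. The input strings with repeated commas are made up for illustration.

```shell
# -d deletes every comma.
deleted=$(printf '%s\n' 'Unix,10,A' | tr -d ',')

# -s squeezes each run of repeated commas into a single comma.
squeezed=$(printf '%s\n' 'Unix,,,10,A' | tr -s ',')

# With two sets, -s translates first and then squeezes the repeated colons.
translated=$(printf '%s\n' 'Unix,,,10,A' | tr -s ',' ':')

printf '%s\n' "$deleted"      # Unix10A
printf '%s\n' "$squeezed"     # Unix,10,A
printf '%s\n' "$translated"   # Unix:10:A
```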

5. Perl solution to change the delimiter:
$ perl -pe 's/,/:/g' file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
Same explanation as the sed solution above.

6. One more perl way:
$ perl -F, -ane 'print join ":",@F;' file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
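In this, the elements of the line are autosplit (a) on the comma (-F,) and stored in the default array (@F). Using join, the array elements are joined using colon and printed. The last element still carries the newline of the input line, which is why no newline is added while printing.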

7. Shell script to change the delimiter of a file:
$ while IFS= read -r line
> do
>   echo "${line//,/:}"
> done < file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
   Using the shell parameter expansion, all the commas are replaced with colons. '${line/,/:}' will replace only the 1st match. The extra slash in '${line//,/:}' makes it replace all the matches.
Note: This method works in bash and in ksh93 or higher, but not in all shells, since this form of substitution is not part of the POSIX shell.
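For shells that lack the // form, one possible workaround is a small loop built only on the POSIX prefix and suffix removal expansions. The function name comma_to_colon and its variables are hypothetical, and this is just a sketch of one way to do it, not a standard utility.

```shell
# Replace every comma in $1 with a colon using only POSIX parameter expansion.
comma_to_colon() {
    rest=$1
    out=
    while :; do
        case $rest in
            *,*)
                # Append the text before the first comma, followed by a colon.
                out=$out${rest%%,*}:
                # Drop everything up to and including the first comma.
                rest=${rest#*,}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$out$rest"
}

comma_to_colon 'Unix,10,A'   # Unix:10:A
```

It can be dropped into the same while loop in place of the echo, e.g. comma_to_colon "$line".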

8. Shell script using IFS to change the delimiter of file:
$ while IFS=, read f1 f2 f3
> do
>  echo "$f1:$f2:$f3"
> done < file
Unix:10:A
Linux:30:B
Solaris:40:C
HPUX:20:D
Ubuntu:50:E
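   IFS (Internal Field Separator) is a shell variable which holds the characters used to split the input into fields. The default is whitespace (space, tab and newline). By setting IFS to comma for the read in the while loop, the individual columns are read into separate variables, and while printing, they are joined using the colon.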
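One thing worth knowing with this approach is what read does when there are fewer variables than columns: the last variable receives the rest of the line, commas included. A short sketch, using a here-document in place of the file and illustrative variable names:

```shell
# Three variables for three columns: each column lands in its own variable.
IFS=, read -r f1 f2 f3 <<EOF
Unix,10,A
EOF
joined="$f1:$f2:$f3"

# Two variables for three columns: the last one gets the remainder as-is.
IFS=, read -r first rest <<EOF
Unix,10,A
EOF

printf '%s\n' "$joined"   # Unix:10:A
printf '%s\n' "$first"    # Unix
printf '%s\n' "$rest"     # 10,A
```

So this method suits files where the number of columns is known beforehand.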
