cut is a frequently used command for parsing files. It is useful for extracting columns from files, with or without a delimiter. In this article, we will see how to use the cut command on files that have a delimiter.
Let us consider a sample file, say file1, with a comma as delimiter as shown below:
$ cat file1
Rakesh,Father,35,Manager
Niti,Mother,30,Group Lead
Shlok,Son,5,Student
The first column is the name, the second the relationship, the third the age, and the last the profession.
The cut command has two main options for working on files with delimiters:
-f - To indicate which field(s) to cut.
-d - To indicate the delimiter on the basis of which the cut command will cut the fields.
Let us now try to work with this command with a few examples:
1. To get the list of names alone from the file, which is the first column:
$ cut -d, -f 1 file1
Rakesh
Niti
Shlok
The option "-d" followed by a comma tells cut to split each line on commas. "-f" followed by 1 retrieves field 1 from file1, and hence we get the names alone.
2. To get the relationship alone, i.e., the 2nd field:
$ cut -d, -f 2 file1
Father
Mother
Son
3. To get 2 fields, say name and age:
$ cut -d, -f 1,3 file1
Rakesh,35
Niti,30
Shlok,5
Giving 1,3 retrieves the first and third fields, which happen to be the name and age respectively.
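One detail worth knowing about field lists: cut writes the selected fields in the order they occur in the input, not in the order they are listed after -f. A minimal sketch, assuming the sample rows are recreated in a temporary file (the mktemp path and the variable names are illustrative, not part of the original example):

```shell
# Recreate the comma-delimited sample in a temporary file (illustrative path).
tmp=$(mktemp)
printf '%s\n' 'Rakesh,Father,35,Manager' 'Niti,Mother,30,Group Lead' 'Shlok,Son,5,Student' > "$tmp"

# Ask for the fields in input order, then in reverse order.
in_order=$(cut -d, -f 1,3 "$tmp")
reversed=$(cut -d, -f 3,1 "$tmp")

# Both requests print the name first and the age second.
printf '%s\n' "$reversed"

rm -f "$tmp"
```

If the columns really need to be swapped, a tool such as awk is the usual choice.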
4. To get the name, relationship and age, excluding the profession, i.e., the 1st to 3rd fields:
$ cut -d, -f 1-3 file1
Rakesh,Father,35
Niti,Mother,30
Shlok,Son,5
The list 1-3 means from the first field through the third field. Whenever we need a range of fields, we join the first and last field numbers with a hyphen.
The same result can also be retrieved in other ways, for example by listing the fields individually as 1,2,3, or by leaving out the start of the range:
$ cut -d, -f -3 file1
Rakesh,Father,35
Niti,Mother,30
Shlok,Son,5
Of the three forms (1-3, 1,2,3 and -3), this is the most compact. "-3" means from the beginning, i.e., the first field, through the third field, and hence we get fields 1, 2 and 3.
5. To retrieve all the fields except the name field, i.e., fields 2 to 4:
$ cut -d, -f 2- file1
Father,35,Manager
Mother,30,Group Lead
Son,5,Student
Similar to the last result, "2-" means from the second field through the end, which here is the 4th field. Whenever the beginning of a range is not specified, it defaults to 1; similarly, when the end of a range is not given, it defaults to the last field. The same result could have been achieved with "2-4" as well.
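The range forms above can be checked against each other. A minimal sketch, assuming the sample rows are written to a temporary file (the file and variable names are illustrative):

```shell
# Recreate the comma-delimited sample in a temporary file (illustrative path).
tmp=$(mktemp)
printf '%s\n' 'Rakesh,Father,35,Manager' 'Niti,Mother,30,Group Lead' 'Shlok,Son,5,Student' > "$tmp"

# An omitted start defaults to field 1, so -3 and 1-3 select the same fields.
open_start=$(cut -d, -f -3 "$tmp")
closed_start=$(cut -d, -f 1-3 "$tmp")

# An omitted end defaults to the last field, so 2- and 2-4 match on this 4-field file.
open_end=$(cut -d, -f 2- "$tmp")
closed_end=$(cut -d, -f 2-4 "$tmp")

[ "$open_start" = "$closed_start" ] && echo "-3 and 1-3 agree"
[ "$open_end" = "$closed_end" ] && echo "2- and 2-4 agree"

rm -f "$tmp"
```

The open-ended form 2- keeps working if more columns are added to the file later, whereas 2-4 would stop at the fourth field.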
Let us consider the same input file with a space as the delimiter:
$ cat file1
Rakesh Father 35 Manager
Niti Mother 30 GL
Shlok Son 5 Student
The same options and commands used above still apply; only the delimiter changes. When a comma is the delimiter, we can give it directly after the -d option. A space, however, must be quoted so that the shell passes it on to cut, as shown below. In fact, we can always quote the delimiter to be on the safe side.
6. To retrieve the first field from a space delimited file:
$ cut -d" " -f 1 file1
Rakesh
Niti
Shlok
Let us consider the same file separated by tabs:
$ cat file1
Rakesh Father 35 Manager
Niti Mother 30 GL
Shlok Son 5 Student
To confirm that the file is indeed tab-separated, use the "-t" option of the cat command:
$ cat -t file1
Rakesh^IFather^I35^IManager
Niti^IMother^I30^IGL
Shlok^ISon^I5^IStudent
Each ^I indicates a tab character.
7. To retrieve the first field from this tab-separated file. How do we specify the tab with the "-d" option?
$ cut -f 1 file1
Rakesh
Niti
Shlok
Surprised? The default delimiter of the cut command is the tab, and hence when a file is tab-separated, we need not specify the "-d" option at all. The "-f" option can be used directly to retrieve the fields.
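If you would rather state the tab delimiter explicitly, a literal tab character has to reach cut as the argument of -d. A minimal sketch, assuming a POSIX shell and a temporary file standing in for file1 (the variable names are illustrative). printf is used to produce the tab so that the example does not rely on quoting such as $'\t', which not every shell supports:

```shell
tmp=$(mktemp)
# \t in the printf format string expands to a real tab; the format is reused for each group of four arguments.
printf '%s\t%s\t%s\t%s\n' Rakesh Father 35 Manager Niti Mother 30 GL Shlok Son 5 Student > "$tmp"

# Hold a literal tab character in a variable.
tab=$(printf '\t')

# Rely on the default delimiter, then pass the tab explicitly.
default_delim=$(cut -f 1 "$tmp")
explicit_delim=$(cut -d "$tab" -f 1 "$tmp")

printf '%s\n' "$explicit_delim"

rm -f "$tmp"
```

Both commands select the same first column; the explicit form mainly documents the intent for whoever reads the script later.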