In this article, we will see the different ways in which we can extract and print the first 3 characters of every line in a file. Assume a file with sample contents as below:
$ cat file
Linux
Unix
Solaris

1. Using the shell itself. This is a good choice since it is purely internal to the shell: no external command is run. The file is read in a while loop and the first 3 characters of each line are retrieved using the shell.
$ while read -r line
> do
>   echo "${line:0:3}"
> done < file
Lin
Uni
Sol

${x:0:3} means to extract 3 characters starting at position 0 of the variable x. In this way, the shell can also be used to extract a sub-string from a variable. This substring expansion is available in shells such as bash, ksh93 and zsh, but it is not part of POSIX sh.
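The offset and length in the expansion can be varied independently. The following is a small sketch, assuming bash; the variable names first3, middle and rest are made up for illustration.

```shell
#!/usr/bin/env bash
# Substring expansion: ${var:offset:length}; offsets count from 0.
x="Solaris"

first3=${x:0:3}    # 3 characters starting at offset 0
middle=${x:2:4}    # 4 characters starting at offset 2
rest=${x:3}        # length omitted: everything from offset 3 onward

echo "$first3"     # Sol
echo "$middle"     # lari
echo "$rest"       # aris
```

Leaving out the length, as in the last case, returns the remainder of the string.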
2. Using cut, we can cut the first 3 characters.
$ cut -c -3 file
Lin
Uni
Sol

The same can also be achieved by giving 1-3 in place of -3.
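The two range forms can be compared side by side. This is a rough sketch that feeds the sample lines through a pipe instead of reading the file; the open-ended range 4- is an extra shown only for contrast, and the variable names are illustrative.

```shell
# Character ranges accepted by cut -c, applied to the sample lines.
sample=$(printf '%s\n' Linux Unix Solaris)

open_start=$(printf '%s\n' "$sample" | cut -c -3)   # from the start through character 3
explicit=$(printf '%s\n' "$sample" | cut -c 1-3)    # same range, written out in full
open_end=$(printf '%s\n' "$sample" | cut -c 4-)     # character 4 through the end of the line

printf '%s\n' "$open_start"
printf '%s\n' "$open_end"
```

Here open_start and explicit hold the same three lines, while open_end holds whatever is left of each line after the first 3 characters.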
3. grep is used to print the lines of a file that match a given pattern. grep prints the entire matching line by default; the -o option makes it print only the part that matched. The dot (.) matches any single character, so 3 dots match 3 characters, and the caret (^) anchors the match to the beginning of the line.
$ grep -o '^...' file
Lin
Uni
Sol

4. For a small number of characters, we can provide that many dots. What if it is some 20 characters? Giving that many dots will look clumsy. Regular expressions provide an interval, {n}, which means the preceding item should match exactly n times. Here, a single character is matched using the dot (.), and the number 3 tells it to match 3 times. The backslash (\) is needed because grep's default basic regular expressions treat a bare '{' as a literal character.
$ grep -o '^.\{3\}' file
Lin
Uni
Sol

5. The substr function of awk does it easily. The substr syntax is: substr(string, starting position, length). The string is the entire line, which is $0. awk numbers character positions from 1, so 1 is the starting position and 3 is the number of characters to extract. A starting position of 0 is not portable: some implementations, gawk among them, then return only the first 2 characters.

$ awk '{print substr($0,1,3);}' file
Lin
Uni
Sol

6. The sed solution is almost the same as the earlier grep one. The entire line is split into two parts: the first 3 characters and the rest. The first 3 characters are captured in a group with \( and \), and the whole line is replaced with \1, the content of that group.
$ sed 's/\(.\{3\}\).*/\1/' file
Lin
Uni
Sol

7. Perl also has a substr function. $_ holds the line just read. Unlike awk, Perl's substr counts positions from 0.
$ perl -lne 'print substr($_,0,3);' file
Lin
Uni
Sol
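As a quick sanity check, the commands above can be run against the same sample data and compared. The sketch below is only one way to do that: the helper check and the variables expected and mismatch are hypothetical names, the sample file is recreated in a temporary location, the awk call uses starting position 1, and the perl line is skipped when perl is not installed.

```shell
# Recreate the sample file in a temporary location (the path is arbitrary).
file=$(mktemp)
printf '%s\n' Linux Unix Solaris > "$file"

# Use the cut result as the reference output.
expected=$(cut -c 1-3 "$file")
mismatch=0

check() {
    # $1: label, $2: output captured from the command under test
    if [ "$2" = "$expected" ]; then
        echo "ok      $1"
    else
        echo "differs $1"
        mismatch=1
    fi
}

check "grep dots"     "$(grep -o '^...' "$file")"
check "grep interval" "$(grep -o '^.\{3\}' "$file")"
check "awk"           "$(awk '{ print substr($0, 1, 3) }' "$file")"
check "sed"           "$(sed 's/\(.\{3\}\).*/\1/' "$file")"
if command -v perl >/dev/null 2>&1; then
    check "perl"      "$(perl -lne 'print substr($_, 0, 3)' "$file")"
fi

rm -f "$file"
```

Each command's output is captured with command substitution and passed to check as an argument, so the mismatch flag is set in the current shell and can be inspected once all checks have run.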