Tuesday, April 28, 2020

bash - 10 examples to find / replace in a string



Many times when we want to replace or extract something, we immediately reach for an awk or a sed one-liner. Keep in mind that the first choice should always be a built-in shell feature; only in its absence should we resort to an external command. In this article, we will see 10 different examples where, instead of awk or sed, a bash-specific built-in feature is very beneficial:

1. Capitalize a string:  Though there are many ways to convert a string to uppercase, one tends to use awk's toupper function to get it done. Instead, bash can handle it directly.
$ x='hello'
$ echo ${x^^}
HELLO
$ echo "$x" | awk '{print toupper($0)}'
HELLO
   This feature of bash (^^) is available only in bash version 4 and later. The double caret (^^) converts the entire string to uppercase.
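Since the feature depends on the bash version, a script that may run on an older bash could check the version first. The following is only a minimal sketch: to_upper is a hypothetical helper name, and it assumes that an older bash complains about ^^ only when the expansion is actually evaluated, so the tr fallback branch is reached safely.

```shell
#!/usr/bin/env bash
# to_upper is a hypothetical helper, not something bash provides.
to_upper() {
    if (( BASH_VERSINFO[0] >= 4 )); then
        # Bash 4 and later: built-in case modification, no external process.
        printf '%s\n' "${1^^}"
    else
        # Older bash: fall back to the external tr command.
        printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
    fi
}

upper=$(to_upper 'hello')
echo "$upper"    # prints HELLO
```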

2. Capitalize only first character:     One common way to do this is with sed, where we match a single character (.) and replace it with its uppercase form (the \u escape is a GNU sed extension). In bash 4, a single caret (^) capitalizes only the first character.
$ x='hello'
$ echo ${x^}
Hello
$ echo "$x" | sed -e "s/./\u&/"
Hello
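The single caret also works on every element of an array, which gives one possible way to capitalize each word of a sentence. This is a sketch, assuming the words are separated by plain whitespace; the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
sentence='rama and sita'

# Split the sentence on whitespace into an array.
read -r -a words <<< "$sentence"

# ${words[@]^} applies the first-character conversion to each element in turn.
capitalized=("${words[@]^}")

# Join the elements back with single spaces.
title="${capitalized[*]}"
echo "$title"    # prints Rama And Sita
```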
3. Convert string to lowercase:
    Bash has another case modification operator, the double comma (,,), which converts the string to lowercase.  Generally, folks use awk's tolower function or the tr command for this.
$ x='HELLO'
$ echo ${x,,}
hello
$ echo "$x" | awk '{print tolower($0);}'
hello
4. Convert first character to lowercase:
  When the case modification pattern used is a single comma(,), only the first character is converted.
$ x='HELLO'
$ echo ${x,}
hELLO
$ echo "$x" | sed -e "s/./\l&/"
hELLO
5. Convert a set of characters to lowercase:
   Bash's case modification can take a pattern after the double comma or the double caret. When a pattern is given, the case conversion is done only on the characters that match it. In this example, x is still 'HELLO' from the previous example, and we want to change all occurrences of E and O to lowercase.
$ echo ${x,,[EO]}
HeLLo
$ echo "$x" | sed 's/[EO]/\l&/g'
HeLLo
Note that in example 3 no pattern was given, which means the action applies to every character, and hence the entire string was converted to lowercase.
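The same kind of pattern can follow the caret operators. A short sketch, using a made-up sample string, that uppercases only the vowels and shows that a single caret with a pattern looks at the first character alone:

```shell
#!/usr/bin/env bash
x='hello world'

# Double caret with a pattern: every matching character is uppercased.
vowels_up=${x^^[aeiou]}
echo "$vowels_up"      # prints hEllO wOrld

# Single caret with a pattern: only the first character is considered,
# and 'h' is not a vowel, so the string comes back unchanged.
first_only=${x^[aeiou]}
echo "$first_only"     # prints hello world
```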

6. Replace a pattern with another pattern:
     Bash has a built-in substitution feature with which we can replace a pattern with another. In this case, to replace 'a' with 'A', we use it as shown below. Most of the time we go for sed by default when it comes to a string replacement like this:
$ x='Rama and Sita'
$ echo ${x/a/A}
RAma and Sita
$ echo "$x" | sed 's/a/A/'
RAma and Sita
The above replaced only the first instance of the match. To replace all instances, the pattern should start with an additional / (that is, use // instead of /):
$ echo ${x//a/A}
RAmA And SitA
$ echo "$x" | sed 's/a/A/g'
RAmA And SitA
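Besides / and //, bash's substitution also accepts a pattern starting with # or %, which anchors the match to the beginning or the end of the string, and an empty replacement simply deletes the match. A small sketch on the same string (the variable names are only for illustration):

```shell
#!/usr/bin/env bash
x='Rama and Sita'

underscored=${x// /_}      # replace every space with an underscore
at_start=${x/#Rama/Ram}    # match only at the beginning of the string
at_end=${x/%a/A}           # match only at the end of the string
deleted=${x//a/}           # empty replacement: delete every 'a'

echo "$underscored"   # prints Rama_and_Sita
echo "$at_start"      # prints Ram and Sita
echo "$at_end"        # prints Rama and SitA
echo "$deleted"       # prints Rm nd Sit
```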
7. Replace a pattern using wildcards:
   Bash allows wildcards in the pattern match. In the example below, we want to replace everything starting from the first digit with the string "old".
$ x='Ram is 24 years old'
$ echo ${x/[0-9]*/old}
Ram is old
$ echo "$x" | sed 's/[0-9].*/old/'
Ram is old
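One point worth keeping in mind is that bash patterns are shell wildcards (globs), not regular expressions: * on its own means "any string", where sed needs .* for the same effect. A brief sketch comparing the two, with variable names chosen only for illustration:

```shell
#!/usr/bin/env bash
x='Ram is 24 years old'

# Bash glob: [0-9]* is a digit followed by any string (longest match is replaced).
bash_result=${x/[0-9]*/old}

# sed regular expression: [0-9].* is a digit followed by any characters.
sed_result=$(printf '%s\n' "$x" | sed 's/[0-9].*/old/')

# Without the *, only the two digits themselves are replaced.
two_digits=${x/[0-9][0-9]/NN}

echo "$bash_result"   # prints Ram is old
echo "$sed_result"    # prints Ram is old
echo "$two_digits"    # prints Ram is NN years old
```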
8. Extract filename from absolute path:
   Bash's remove-matching-prefix operator, ##, removes the longest matching pattern from the beginning. The pattern we have given is */, meaning any set of characters followed by a /. Since we have asked for the longest match, it removes everything up to the last /.
$ x='/home/guru/test.csv'
$ echo ${x##*/}
test.csv
$ echo "$x" | awk -F'/' '{print $NF}'
test.csv
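The external basename command is another common choice here. The sketch below compares it with the expansion and shows one case where the two differ: a path with a trailing slash (the second path is a made-up example).

```shell
#!/usr/bin/env bash
x='/home/guru/test.csv'
via_expansion=${x##*/}
via_basename=$(basename "$x")
echo "$via_expansion $via_basename"        # prints test.csv test.csv

# With a trailing slash, the expansion removes everything and leaves an
# empty string, while basename still reports the last path component.
d='/home/guru/'
dir_expansion=${d##*/}
dir_basename=$(basename "$d")
echo "[$dir_expansion] [$dir_basename]"    # prints [] [guru]
```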
9. Extract directory from absolute path:
   Bash's remove-matching-suffix operator, %, removes the shortest matching pattern from the end. In this case, the pattern /* indicates a slash followed by any set of characters. Since we have asked for the shortest match, it removes /test.csv, and as a result we get the directory.
$ x='/home/guru/test.csv'
$ echo ${x%/*}/
/home/guru/
$ echo ${x%/*}
/home/guru
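The prefix and suffix operators combine naturally to split a path into its parts. A minimal sketch, assuming the path contains at least one slash and the file name contains a dot; the variable names are only illustrative:

```shell
#!/usr/bin/env bash
x='/home/guru/test.csv'

dir=${x%/*}        # shortest suffix matching /* removed
file=${x##*/}      # longest prefix matching */ removed
stem=${file%.*}    # shortest suffix matching .* removed
ext=${file##*.}    # longest prefix matching *. removed

echo "$dir"     # prints /home/guru
echo "$file"    # prints test.csv
echo "$stem"    # prints test
echo "$ext"     # prints csv
```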
10. Remove a pattern from the string:
    Bash's remove-matching-suffix operator, %%, removes the longest matching pattern from the end. In this case, the pattern [0-9]* indicates a digit followed by any set of characters. Since we have asked for the longest match, it removes everything from the first digit, 2, all the way to the end.
$ x='Ram is 24 and Lax is 50 years old'
$ echo ${x%%[0-9]*}
Ram is
$ echo "$x"  | sed 's/[0-9].*//'
Ram is 
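To see how the four removal operators (#, ##, % and %%) differ, it can help to apply all of them to the same string. A sketch with illustrative variable names; the values are quoted so that leading and trailing spaces stay visible:

```shell
#!/usr/bin/env bash
x='Ram is 24 and Lax is 50 years old'

short_suffix=${x%[0-9]*}    # shortest suffix starting with a digit
long_suffix=${x%%[0-9]*}    # longest suffix starting with a digit
short_prefix=${x#*[0-9]}    # shortest prefix ending with a digit
long_prefix=${x##*[0-9]}    # longest prefix ending with a digit

echo "[$short_suffix]"   # prints [Ram is 24 and Lax is 5]
echo "[$long_suffix]"    # prints [Ram is ]
echo "[$short_prefix]"   # prints [4 and Lax is 50 years old]
echo "[$long_prefix]"    # prints [ years old]
```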
