How to fetch the tag value for a given tag from a simple XML file?
Let us consider a simple XML file, cust.xml, with customer details as below:
<?xml version="1.0" encoding="ISO-8859-1"?>
<CustDetails>
<CustName>Unix</CustName>
<CustomerId>999</CustomerId>
<Age>28</Age>
</CustDetails>

The requirement is to retrieve the tag value of "CustName" from the XML file. Let us see how to get this done in different ways:
1. Using the sed command:
$ sed -n '/CustName/{s/.*<CustName>//;s/<\/CustName.*//;p;}' cust.xml
Unix

The first sed substitution deletes everything up to and including the opening tag. The second substitution deletes everything from the closing tag till the end of the line. After this, only the tag value is left in the pattern space, and the p command prints it.
2. Another sed way:
$ sed -n '/CustName/{s/.*<CustName>\(.*\)<\/CustName>.*/\1/;p}' cust.xml
Unix

The entire line is matched by the regular expression, but only the value part is grouped. Hence, by using the backreference (\1) in the replacement part, only the value is obtained.
Using a variable:
$ x="CustName"
$ sed -n "/$x/{s/.*<$x>\(.*\)<\/$x>.*/\1/;p}" cust.xml
Unix

The only difference is the use of double quotes. Since a shell variable is used, double quotes are needed for the variable to get expanded.
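The variable form above can be wrapped in a small shell function so that the tag name and the file are passed as arguments. This is only a sketch: the function name get_tag_value is made up for illustration, and it assumes the opening and closing tags are on the same line and that the tag name contains no characters special to sed (such as / or .).

```shell
# get_tag_value TAG FILE  (hypothetical helper, not a standard command)
# Prints the text between <TAG> and </TAG> for every line containing both.
# $1 is the tag name, $2 is the XML file.
get_tag_value() {
    # -n suppresses automatic printing; the p flag prints only the lines
    # on which the substitution actually happened.
    sed -n "s/.*<$1>\(.*\)<\/$1>.*/\1/p" "$2"
}
```

With the sample file, running get_tag_value CustName cust.xml should print Unix. Because the p flag is attached to the substitution itself, lines that merely mention the tag name without the full tag pair are not printed.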
3. Using awk:
$ awk -F "[><]" '/CustName/{print $3}' cust.xml
Unix

With both < and > set as field delimiters in awk, the tag name becomes the second field and the value becomes the third, so $3 contains the value in our example. By filtering only the lines containing CustName, the required tag value is retrieved.
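As with sed, the tag name can be passed in from the shell instead of being hard-coded. The sketch below uses awk's -v option for this; the function name awk_tag_value and the awk variable tag are made up for illustration. It assumes the tag has no attributes and that the value is on the same line as the tag.

```shell
# awk_tag_value TAG FILE  (hypothetical helper)
# Splits each line on < and >, so for "<CustName>Unix</CustName>"
# $2 is the tag name and $3 is the value.
awk_tag_value() {
    # Comparing $2 exactly against the tag avoids printing lines where
    # the tag name only appears as part of another tag or inside a value.
    awk -F '[><]' -v tag="$1" '$2 == tag { print $3 }' "$2"
}
```

With the sample file, awk_tag_value CustomerId cust.xml should print 999.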
4. Using Perl:
$ perl -ne 'if (/CustName/){ s/.*?>//; s/<.*//;print;}' cust.xml
Unix

This Perl solution is similar to the first sed one. Using Perl substitution, everything till the first ">" is removed, and then everything from the "<" till the end of the line is removed. With this, only the tag value is left.
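A variation on the Perl approach is to capture the value with a group instead of stripping the text around it. The sketch below is one possible way to do that, with the tag name passed through an environment variable; the function name perl_tag_value and the variable XML_TAG are made up for illustration, and the tag pair is assumed to be on a single line.

```shell
# perl_tag_value TAG FILE  (hypothetical helper)
# Passes the tag name to perl through the environment variable XML_TAG.
perl_tag_value() {
    # quotemeta escapes any regex metacharacters in the tag name;
    # (.*?) captures the shortest text between the opening and closing tag.
    XML_TAG=$1 perl -ne 'my $t = quotemeta $ENV{XML_TAG}; print "$1\n" if /<$t>(.*?)<\/$t>/' "$2"
}
```

With the sample file, perl_tag_value Age cust.xml should print 28.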
5. Using the GNU grep command:
$ grep -oP '(?<=<CustName>).*(?=</CustName)' cust.xml
Unix

The -o option prints only the matched part instead of the entire line. The -P option enables Perl-like regular expressions. Using -P, we can do the Perl kind of lookahead and lookbehind pattern matching in grep. This regular expression means: print the text which is preceded (?<=) by "<CustName>" and followed (?=) by "</CustName".
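One thing to keep in mind with the grep pattern is that .* is greedy: if the same tag appeared twice on one line, the match would run from the first opening tag to the last closing tag. A possible way around this, assuming a GNU grep built with -P support, is the non-greedy .*? form sketched below. The function name grep_tag_value is made up for illustration.

```shell
# grep_tag_value TAG FILE  (hypothetical helper; needs GNU grep with -P)
# Prints each value enclosed in <TAG>...</TAG>, one per output line.
grep_tag_value() {
    # Double quotes let the shell expand $1 inside the pattern.
    # .*? stops at the nearest closing tag rather than the last one.
    grep -oP "(?<=<$1>).*?(?=</$1>)" "$2"
}
```

For a line such as <CustName>A</CustName><CustName>B</CustName>, this prints A and B on separate lines.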