Wednesday, January 23, 2013

gawk - Calculate date / time difference between timestamps



How to find the time difference between timestamps using gawk?
  Let us consider a file where the
        1st column is the Process name,
        2nd is the start time of the process, and
        3rd column is the end time of the process.

The requirement is to find the time consumed by the process which is the difference between the start and the end times.

1. File in which the date and time component are separated by a space:
$ cat file
P1,2012 12 4 21 36 48,2012 12 4 22 26 53
P2,2012 12 4 20 36 48,2012 12 4 21 21 23
P3,2012 12 4 18 36 48,2012 12 4 20 12 35
Time difference in seconds:
$ awk -F, '{d2=mktime($3);d1=mktime($2);print $1","d2-d1,"secs";}' file
P1,3005 secs
P2,2675 secs
P3,5747 secs
  Using mktime function, the Unix time is calculated for the date time strings, and their difference gives us the time elapsed in seconds.

2. File with the different date format :
$ cat file
P1,2012-12-4 21:36:48,2012-12-4 22:26:53
P2,2012-12-4 20:36:48,2012-12-4 21:21:23
P3,2012-12-4 18:36:48,2012-12-4 20:12:35
Note: This file has the start time and end time in different formats

Difference in seconds:
$ awk -F, '{gsub(/[-:]/," ",$2);gsub(/[-:]/," ",$3);d2=mktime($3);d1=mktime($2);print $1","d2-d1,"secs";}' file
P1,3005 secs
P2,2675 secs
P3,5747 secs
 Using gsub function, the '-' and ':' are replaced with a space. This is done because the mktime function arguments should be space separated.
Difference in minutes:
$ awk -F, '{gsub(/[-:]/," ",$2);gsub(/[-:]/," ",$3);d2=mktime($3);d1=mktime($2);print $1","(d2-d1)/60,"mins";}' file
P1,50.0833 mins
P2,44.5833 mins
P3,95.7833 mins
Just by dividing the seconds difference by 60 gives us the difference in minutes.

3. File with only date, without time part:
$ cat file
P1,2012-12-4,2012-12-6
P2,2012-12-4,2012-12-8
P3,2012-12-4,2012-12-5
Note: The start and end time has only the date components, no time components

Difference in seconds:
$ awk -F, '{gsub(/-/," ",$2);gsub(/-/," ",$3);$2=$2" 0 0 0";$3=$3" 0 0 0";d2=mktime($3);d1=mktime($2);print $1","d2-d1,"secs";}' file
P1,172800 secs
P2,345600 secs
P3,86400 secs
In addition to replacing the '-' and ':' with spaces, 0's are appended to the date field since the mktime requires the date in 6 column format.

Difference in days:
$ awk -F, '{gsub(/-/," ",$2);gsub(/-/," ",$3);$2=$2" 0 0 0";$3=$3" 0 0 0";d2=mktime($3);d1=mktime($2);print $1","(d2-d1)/86400,"days";}' file
P1,2 days
P2,4 days
P3,1 days
   A day has 86400(24*60*60) seconds, and hence by dividing the duration in seconds by 86400, the duration in days can be obtained.

2 comments: