Friday, May 25, 2012

Linux / UNIX Shell: Sort Date

Q. How do I sort date type of data stored in a text file which is in following sample format:
Code Date TIME
HOS05 23/12/2008 10AM
HOS06 15/12/2008 2PM
HOS62 29/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
ZFT01 06/12/2008 10AM
HOS73 11/12/2008 2PM
MHOS051 05/12/2008 10AM
MHOS041 20/12/2008 2PM
MHOS042 27/12/2008 2PM
MHOS053 11/12/2008 10AM
MHOS043 30/12/2008 2PM
How do I sort data date and time wise?

A. Use sort command to displays the lines of its input listed in sorted order. It has special option callede -M to sort months. There are total 3 fields. First one is some sort of code. Second is actual date in DD/MM/YYYY format and third one is time. You need to sort 2nd and 3rd column using the -k option of sort command as follows:
$ sort -k 2,2n -k 3 data.file.txt
Sample output:
MHOS051 05/12/2008 10AM
ZFT01 06/12/2008 10AM
MHOS053 11/12/2008 10AM
HOS73 11/12/2008 2PM
HOS06 15/12/2008 2PM
MHOS041 20/12/2008 2PM
HOS05 23/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
MHOS042 27/12/2008 2PM
HOS62 29/12/2008 10AM
MHOS043 30/12/2008 2PM
Where,
  • -k 2,2n -k 3 : Sort data using the given column number. The option -k 2,2n -k 3 sorts each column. First, it will sort 2nd column (date dd field) and then 3rd column (time).
  • data.file.txt : Input file

Multilevel sort

Now consider data as follows:
MHOS051 05/12/2008 10AM
ZFT01 06/12/2008 10AM
MHOS053 11/12/2008 10AM
MHOS051 10/01/2009 10AM
HOS73 11/12/2008 2PM
HOS06 15/12/2008 2PM
MHOS041 20/12/2008 2PM
HOS05 23/12/2008 10AM
HOS64 23/12/2008 2PM
MHOS051 12/04/2009 10AM
HOS70 26/12/2008 10AM
MHOS042 27/12/2008 2PM
HOS62 29/12/2008 10AM
MHOS051 27/01/2009 10AM
MHOS043 30/12/2008 2PM
Now, you need to first sort out year and then date using F[.C][OPTS] format, where F is the number of the field to use, and C is the number of the first character from the beginning of the field. You need to use sort command as follows:
$ sort -n -k 2.9 -k 2.5 -k 2 data.file.txt
Sample output:
MHOS051 05/12/2008 10AM
ZFT01 06/12/2008 10AM
HOS73 11/12/2008 2PM
MHOS053 11/12/2008 10AM
HOS06 15/12/2008 2PM
MHOS041 20/12/2008 2PM
HOS05 23/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
MHOS042 27/12/2008 2PM
HOS62 29/12/2008 10AM
MHOS043 30/12/2008 2PM
MHOS051 10/01/2009 10AM
MHOS051 27/01/2009 10AM
MHOS051 12/04/2009 10AM
Where,
  • -n : sort numeric data
  • -k 2.9 : Select 2nd filed and 9th character for sorting (i.e sort on last digit of year)
  • -k 2.5 : Select 2nd field and 5th character for sorting (i.e. sort on last digit of month)
  • -k 2 : Select 2nd field and sort it out
  • data.file.txt : Input file

No comments:

Post a Comment