Monday, April 23, 2012

HowTo: Convert Between Unix and Windows text files

How can I convert newlines (line breaks, or end-of-line (EOL) characters) between Unix and Windows text files?

A newline marks the end of a line in a text file. It is a special character, and its encoding differs between Windows and UNIX operating systems: UNIX ends each line with a single line feed (LF), while MS-Windows uses a carriage return followed by a line feed (CR LF). This difference shows up as follows:
  1. Many Unix commands and text editors display a Ctrl-m ( ^M ) character at the end of each line of a text file created on an MS-Windows operating system.
  2. MS-Windows programs may not show line breaks at all in text files created on UNIX operating systems.
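
To make the difference concrete, here is a small sketch that writes the same two lines in both formats with printf and compares the results. The file names unix.txt and dos.txt are placeholders chosen for this illustration only.

```shell
# Build the same two-line text with Unix (LF) and MS-Windows (CR LF) endings.
# unix.txt and dos.txt are placeholder names used only in this sketch.
printf 'one\ntwo\n' > unix.txt
printf 'one\r\ntwo\r\n' > dos.txt

# The MS-Windows file is one byte longer per line because of the extra CR.
wc -c < unix.txt
wc -c < dos.txt

# od -c prints every byte, so each \r \n pair becomes visible.
od -c dos.txt
```

The two byte counts should differ by exactly the number of lines in the file.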

Option #1: dos2unix and unix2dos Commands

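You can use the dos2unix and unix2dos commands as follows. The -n option (new-file mode, available in the dos2unix package shipped with most Linux distributions) writes the result to output.txt and leaves input.txt untouched; without it, that version converts every file named on the command line in place. To convert the newlines of a UNIX file to MS-Windows format, type:
$ cat -v input.txt
$ unix2dos -n input.txt output.txt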
$ cat -v output.txt
$ vi output.txt

To convert the newlines of an MS-Windows file to Unix format, type the following (with -n so that the result goes to output.txt):
$ cat -v input.txt
$ dos2unix -n input.txt output.txt
$ cat -v output.txt

Option #2: awk command

You can use the awk command to convert an MS-Windows file to Unix format. Type:
$ cat -v input.txt
$ awk '{ sub("\r$", ""); print }' input.txt > output.txt
$ cat -v output.txt

To convert the newlines of a Unix file to MS-Windows format, enter:
$ cat -v input.txt
$ awk 'sub("$", "\r")' input.txt > output.txt
$ cat -v output.txt
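
As a quick sanity check, the two awk one-liners can be chained into a round trip: a Unix file converted to MS-Windows format and back should come out unchanged. This is only a sketch; sample.txt, sample.dos and sample.unix are placeholder file names.

```shell
# Round-trip sketch: Unix -> MS-Windows -> Unix with the awk commands.
# sample.txt, sample.dos and sample.unix are placeholder file names.
printf 'alpha\nbeta\n' > sample.txt

# Append a CR to every line (Unix to MS-Windows format).
awk 'sub("$", "\r")' sample.txt > sample.dos

# Strip the trailing CR from every line (MS-Windows to Unix format).
awk '{ sub("\r$", ""); print }' sample.dos > sample.unix

# cmp is silent and returns 0 when both files are byte-for-byte identical.
cmp sample.txt sample.unix && echo "round trip OK"
```

Running the Unix to MS-Windows command on a file that already has CR LF endings would add a second CR to each line, so it is worth checking the input with cat -v first.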

Note that cat with the -v option displays non-printing characters using ^ and M- notation, except for LFD (line feed) and TAB. A carriage return therefore shows up as ^M at the end of a line.
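
For example, the following sketch creates a small file with MS-Windows line endings and inspects it with cat -v. The file name crlf.txt is a placeholder.

```shell
# Create a two-line file with CR LF endings (crlf.txt is a placeholder name).
printf 'one\r\ntwo\r\n' > crlf.txt

# Each carriage return is shown as ^M; the line feeds still end the lines.
cat -v crlf.txt

# Count the lines that end in a carriage return.
cat -v crlf.txt | grep -c '\^M$'
```

A count of 0 from the last command would mean the file already uses Unix line endings.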
