Small shell tools for text editing

Converting Character Sets

Many different character sets are in use, which can make exchanging data difficult. With the recode program, you can convert a text file to the character set you need. Probably the most important option is -l, which lists all the known source and destination character sets. With -f, you force the conversion to go ahead even when it cannot be reversed, so characters without an equivalent in the target character set may be lost. The -v option provides a verbose summary of the conversion process.

Further options are described on the man page.

The safe approach is to work on a copy of the file rather than the original, because recode overwrites the file you pass to it. Use recode -l to look up the names of the character sets, then apply the conversion to the copy.

The syntax is as follows:

recode [OLD_CHARACTER_SET]..[NEW_CHARACTER_SET] [FILENAME]

Where the conversion calls for it, recode also adapts end-of-line control characters such as CR or LF. As an example, you would do the following:

cp a.txt export.txt
recode -v UTF-8..ISO-8859-15 export.txt

to convert a file from its original UTF-8 character set to ISO-8859-15.
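
The copy-then-convert workflow can be tried safely in a scratch directory. The following is only a minimal sketch: the file names sample.txt and export.txt and the sample text are placeholders, and the script skips the conversion if recode is not installed.

```shell
#!/bin/sh
# Sketch: convert a copy of a UTF-8 file to ISO-8859-15 and leave the
# original untouched. File names and sample text are placeholders.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Create a small UTF-8 test file; \342\202\254 is the euro sign in UTF-8.
printf 'Preis: 5 \342\202\254\n' > sample.txt

# Work on a copy, because recode rewrites the file it is given.
cp sample.txt export.txt

if command -v recode >/dev/null 2>&1; then
    recode -v UTF-8..ISO-8859-15 export.txt
    # The euro sign takes three bytes in UTF-8 but one in ISO-8859-15,
    # so the converted copy should be smaller than the original.
    echo "original:  $(wc -c < sample.txt) bytes"
    echo "converted: $(wc -c < export.txt) bytes"
else
    echo "recode is not installed; skipping the conversion" >&2
fi
```

Comparing the byte counts is a quick way to confirm that multibyte characters really were converted and that the original file was left alone.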

Replacing Tabs

The expand and unexpand programs convert tabs within texts (in files as well as pipes): expand replaces tabs with spaces, and unexpand replaces spaces with tabs. Both accept the -t NUM option, which sets the tab width, that is, how many columns apart the tab stops are (normally eight).

With unexpand, you can use the -a option to convert all runs of spaces (not just those at the start of a line) into tab characters. The tr program can also replace spaces with tabs, but it swaps characters one for one and does not take tab stops into account.
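
A short experiment shows the difference between the tab-stop-aware tools and tr. This is a sketch with a made-up input line and a tab width of four chosen purely for illustration.

```shell
#!/bin/sh
# Sketch: expand and unexpand work with tab stops, tr does not.
# The input text and the tab width of 4 are illustrative choices.
tab=$(printf '\t')

# expand pads with spaces up to the next tab stop (here column 4).
expanded=$(printf 'a\tb\n' | expand -t 4)

# unexpand -a turns the run of spaces that reaches a tab stop back
# into a single tab, restoring the original line.
roundtrip=$(printf '%s\n' "$expanded" | unexpand -a -t 4)

# tr replaces every single space with a tab, ignoring tab stops.
with_tr=$(printf '%s\n' "$expanded" | tr ' ' '\t')

# Show the tabs as <TAB> to make the results readable.
printf '%s\n' "$expanded" "$roundtrip" "$with_tr" | sed "s/$tab/<TAB>/g"
```

The round trip through expand and unexpand gives back a line with one tab, while tr produces one tab for every space.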
