Other coding systems with emacs
Sunday, 28 June 2009 08:00

This page was made for emacs version 22.

Suppose you have just pasted characters from another language into your file. You have saved the file, but now you want to save it again in another coding system, for example "utf-8". I ran into this when editing a JavaScript file and adding accented characters such as é. When the script wrote these characters into the title tag of a new HTML page, my browser did not display them correctly. Saving the JavaScript file as UTF-8 fixed the display problem.

To save a file with UTF-8 encoding, first load the file into an emacs buffer. Then invoke the command mm-set-buffer-file-coding-system: "Alt-x mm-set-buffer-file-coding-system ENTER utf-8". This command comes with the mail and news libraries and may not be loaded. If emacs cannot find it, simply use the standard command set-buffer-file-coding-system instead, which is also bound to "Control-x ENTER f".

Finally save the buffer.
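
The two steps above can also be wrapped in a single command. This is only a minimal sketch: the name my-save-buffer-as-utf-8 is made up for illustration, and it relies on the standard command set-buffer-file-coding-system rather than the mm- variant.

```elisp
;; Hypothetical helper, not part of emacs: the name is an assumption.
(defun my-save-buffer-as-utf-8 ()
  "Switch the current buffer's file coding system to UTF-8 and save it."
  (interactive)
  ;; Changing the coding system marks the buffer as modified,
  ;; so the following save really rewrites the file.
  (set-buffer-file-coding-system 'utf-8)
  (save-buffer))
```

After evaluating this definition you can run it with "Alt-x my-save-buffer-as-utf-8 ENTER" in the buffer you want to convert.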

It would be nice if emacs recognized the coding system of the saved file the next time you visit it. I have been experimenting with putting the following text anywhere on the first line of my files: "-*- coding: utf-8 -*-" (without the quotes). In a source file, put it inside a comment so that the language itself ignores it.

But I would like every file to be utf-8 so I put the following in my .emacs file instead:

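;; Prefer UTF-8 whenever emacs has to guess or choose a coding system
(prefer-coding-system 'utf-8)
;; Send UTF-8 to the terminal
(set-terminal-coding-system 'utf-8)
;; Save new buffers as UTF-8 (obsolete since emacs 23.2: use (setq-default buffer-file-coding-system 'utf-8))
(setq default-buffer-file-coding-system 'utf-8)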

With these settings I can still enter accented characters using "Control-x 8". For example, type "Control-x 8 ' e" if you need to type é. These key sequences appear to break on Windows XP if you also put the following in your .emacs init file:

(set-keyboard-coding-system 'utf-8)
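
If you share one .emacs file between several machines, one possible way around this is to skip the setting on Windows only. The sketch below assumes the breakage is limited to MS-Windows builds of emacs, which is a guess based on the Windows XP observation above and has not been verified elsewhere.

```elisp
;; Assumption: the "Control-x 8" problem only shows up on MS-Windows builds.
;; system-type is the symbol windows-nt on native Windows emacs.
(unless (eq system-type 'windows-nt)
  (set-keyboard-coding-system 'utf-8))
```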

Setting the coding system in emacs to utf-8 breaks cutting and pasting text from emacs to other applications. Add the following to your .emacs file to make pasting from emacs work correctly again (tested on Windows XP):

(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))
;; The next line is only needed for the MS-Windows clipboard
(set-clipboard-coding-system 'utf-16le-dos)
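
Since the clipboard line is only meant for MS-Windows, it can be guarded so that the same .emacs file stays harmless on other systems. This is a sketch of that idea, not a tested configuration; it uses the same two settings as above.

```elisp
;; Selection types to request from other applications, most capable first.
(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))

;; Only touch the clipboard coding system on native MS-Windows emacs.
(when (eq system-type 'windows-nt)
  (set-clipboard-coding-system 'utf-16le-dos))
```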

A note on a bug/problem in emacs 23 (23.1 and 23.2): despite the internationalization effort in emacs 21.x and later, you might still have difficulty viewing certain characters. In particular, some very common Japanese characters are not displayed on certain Windows computers, e.g. 64-bit Windows 7. Instead, these characters are displayed as empty squares. When you check such a character with the command describe-char, the "display" line of the output reads: no font available.
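
A quicker check than reading the describe-char output is to ask emacs directly whether it can display the character after the cursor. The command below is a small sketch built on the standard function char-displayable-p; the command name is invented for this example.

```elisp
;; Hypothetical helper: the name my-char-at-point-displayable-p is an assumption.
(defun my-char-at-point-displayable-p ()
  "Report in the echo area whether the character after point can be displayed."
  (interactive)
  (let ((ch (char-after)))
    (cond ((null ch)
           (message "No character after point"))
          ((char-displayable-p ch)
           (message "Character %c can be displayed" ch))
          (t
           (message "Character code %d cannot be displayed here" ch)))))
```

Place the cursor on one of the empty squares and run "Alt-x my-char-at-point-displayable-p ENTER".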

If you still want to see these characters, you can select a different default font: type "Meta-x set-default-font RET -outline-MS Gothic-normal-normal-normal-mono-*-*-*-*-c-*-jisx0208-sjis", or type (set-default-font "MS Gothic 10") followed by Control-j in your *scratch* buffer. (In emacs 23, set-default-font is an obsolete alias for set-frame-font.) Note that this command changes the font in all open buffers, which is often not what you want.

I am not a Japanese language expert, nor can I speak or read Chinese. But a similar trick seems to work for simplified and traditional Chinese as well: "Meta-x set-default-font RET -outline-SimSun-normal-normal-normal-*-16-*-*-*-p-*-iso8859-1". Moreover, Japanese characters seem to be displayed with this font as well.
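
Because changing the default font affects all text, an alternative worth trying is to assign a font only to the scripts that fail to display and leave the default font alone. The following is a sketch under several assumptions: a graphical emacs 23 or later, the MS Gothic font installed, and the untested guess that this works around the problem described above.

```elisp
;; Assumptions: graphical emacs 23+, and the "MS Gothic" font is installed.
;; The first argument t selects the default fontset.
(when (display-graphic-p)
  ;; Japanese kana (hiragana and katakana)
  (set-fontset-font t 'kana (font-spec :family "MS Gothic"))
  ;; Han characters (kanji); for Chinese text "SimSun" could be used here instead
  (set-fontset-font t 'han (font-spec :family "MS Gothic")))
```

Put this in your .emacs file or evaluate it in the *scratch* buffer, then check an affected character again with describe-char.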