GNU Emacs Manual. Node: Enabling Multibyte

16.2: Enabling Multibyte Characters

You can enable or disable multibyte character support, either for Emacs as a whole or for a single buffer. When multibyte characters are disabled in a buffer, each byte in that buffer represents one character, even the codes 0200 through 0377 (octal). In that case, the old features for supporting the European character sets, ISO Latin-1 and ISO Latin-2, work as they did in Emacs 19, and they also work for the other ISO 8859 character sets.

However, there is no need to turn off multibyte character support to use ISO Latin; the Emacs multibyte character set includes all the characters in these character sets, and Emacs can translate automatically to and from the ISO codes.

To edit a particular file in unibyte representation, visit it using find-file-literally. See Visiting. To convert a buffer in multibyte representation into a single-byte representation of the same characters, the easiest way is to save the contents in a file, kill the buffer, and find the file again with find-file-literally. You can also use C-x RET c (universal-coding-system-argument) and specify `raw-text' as the coding system with which to find or save a file. See Specify Coding. Unlike find-file-literally, finding a file as `raw-text' does not disable format conversion, uncompression, or auto mode selection.
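The two ways of visiting a file described above can be sketched in Emacs Lisp roughly as follows. This is only an illustration: the file names are hypothetical, and binding coding-system-for-read is assumed here as the Lisp-level counterpart of typing C-x RET c raw-text RET before the find-file command.

```elisp
;; Visit a file with no conversion of any kind: the buffer is unibyte,
;; and format conversion, uncompression and auto mode selection are
;; all skipped.  "/tmp/sample-a.dat" is a hypothetical file name.
(find-file-literally "/tmp/sample-a.dat")

;; Visit a file as `raw-text' instead.  The binding applies only to
;; this one read; format conversion, uncompression and auto mode
;; selection still take place.  "/tmp/sample-b.dat" is hypothetical.
(let ((coding-system-for-read 'raw-text))
  (find-file "/tmp/sample-b.dat"))
```

Two different file names are used because visiting a file that already has a buffer simply selects that buffer instead of reading the file again.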

To turn off multibyte character support by default, start Emacs with the `--unibyte' option (see Initial Options), or set the environment variable `EMACS_UNIBYTE'. You can also customize enable-multibyte-characters or, equivalently, set the variable default-enable-multibyte-characters directly in your init file; either has essentially the same effect as `--unibyte'.
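As a minimal sketch of the init-file approach, the variable named above can be set as shown below. This assumes the Emacs version this manual describes; the variable may be obsolete or unavailable in other releases.

```elisp
;; In the init file: make unibyte the default for buffers created
;; from now on.  This is the variable named in the text above, and it
;; is assumed to exist in the Emacs version being used.
(setq default-enable-multibyte-characters nil)
```

Buffers that already exist when this form is evaluated keep their current representation, which is one reason the effect is only approximately that of `--unibyte'.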

Multibyte strings are not created during initialization from the values of environment variables, `/etc/passwd' entries, and similar sources that contain non-ASCII 8-bit characters. However, the initialization file is normally read as multibyte (like Lisp files in general), even with `--unibyte'. To avoid multibyte strings being generated by non-ASCII characters in it, put `-*-unibyte: t;-*-' in a comment on the first line. Do the same for initialization files for packages like Gnus.
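For illustration, the top of an initialization file carrying that marker might look like the sketch below. Only the first line comes from the text above; the remaining comments are explanatory and not required.

```elisp
;; -*-unibyte: t;-*-
;;
;; The marker above has to appear in a comment on the first line of the
;; file.  With it in place, string constants in this file that contain
;; non-ASCII 8-bit characters are read as unibyte strings.
;;
;; The rest of the initialization file follows as usual.
```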

The mode line indicates whether multibyte character support is enabled in the current buffer. If it is, there are two or more characters (most often two dashes) before the colon near the beginning of the mode line. When multibyte characters are not enabled, just one dash precedes the colon.
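Besides reading the mode line, the same information can be checked from Lisp. The sketch below assumes you evaluate it (for example with M-:) while the buffer of interest is current; the message wording is just an example.

```elisp
;; `enable-multibyte-characters' is non-nil in a buffer that uses
;; multibyte representation and nil in a unibyte buffer.
(message "Multibyte character support is %s in buffer %s"
         (if enable-multibyte-characters "enabled" "disabled")
         (buffer-name))
```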
