Text and Binary

GNU Emacs Manual. Node: Text and Binary


28.4: Text Files and Binary Files

GNU Emacs uses newline characters to separate text lines. This is the convention used on Unix, on which GNU Emacs was developed, and on GNU systems since they are modeled on Unix.

MS-DOS and MS-Windows normally use carriage-return linefeed, a two-character sequence, to separate text lines. (Linefeed is the same character as newline.) Therefore, convenient editing of typical files with Emacs requires conversion of these end-of-line (EOL) sequences. Emacs normally performs this conversion: it converts carriage-return linefeed into newline when reading files, and converts newline into carriage-return linefeed when writing files. The same mechanism that handles conversion of international character codes also handles this conversion (see Coding Systems).

One consequence of this conversion is that character positions as reported by Emacs (see Position Info) do not agree with the file size information known to the operating system: each two-character carriage-return linefeed sequence in the file counts as a single newline character in the buffer.
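The discrepancy can be observed directly. The following is a minimal sketch, assuming a file named `foobar.txt' (a placeholder) that contains plain ASCII text with carriage-return linefeed EOLs; it returns the number of characters in the buffer together with the size in bytes reported by the operating system.

```elisp
;; Sketch: compare the buffer's character count with the on-disk size.
;; "foobar.txt" is a placeholder name, assumed to be an ASCII file
;; that uses carriage-return linefeed as its line separator.
(let ((file "foobar.txt"))
  (with-current-buffer (find-file-noselect file)
    (list (buffer-size)                       ; characters in the buffer
          (nth 7 (file-attributes file)))))   ; size in bytes on disk
```

For such a file, the second number should exceed the first by one for each converted EOL sequence. Files containing non-ASCII text can differ for other reasons as well, since character-code conversion also changes the count.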

Some kinds of files, such as executable programs and compressed archives, should not be converted, because their contents are not really text. Therefore, Emacs on MS-DOS distinguishes certain files as binary files, and reads and writes them verbatim. (This distinction is not part of MS-DOS; it is made by Emacs only.) Emacs uses the file name to decide whether to treat a file as binary: the variable file-name-buffer-file-type-alist defines the file-name patterns that indicate binary files. Note that if a file name matches one of the patterns for binary files in file-name-buffer-file-type-alist, Emacs uses the no-conversion coding system (see Coding Systems), which turns off all coding-system conversions, not only the EOL conversion.
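As an illustration, a new pattern might be added to this variable from an init file. This is only a sketch: it assumes that each element of file-name-buffer-file-type-alist has the form (REGEXP . TYPE), with a TYPE of t meaning binary, and the `.dat' extension is a hypothetical choice. Check the variable's documentation with C-h v in your version of Emacs before relying on this format.

```elisp
;; Assumption: entries have the form (REGEXP . TYPE), where a TYPE
;; of t marks files whose names match REGEXP as binary.
;; The ".dat" extension is purely illustrative.
(add-to-list 'file-name-buffer-file-type-alist '("\\.dat$" . t))
```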

In addition, if Emacs recognizes from a file's contents that it uses newline rather than carriage-return linefeed as its line separator, it does not perform conversion when reading or writing that file. Thus, you can read and edit files from Unix or GNU systems on MS-DOS with no special effort, and they will be left with their Unix-style EOLs.

When you visit a file, you can specify whether to treat it as text or binary by using the commands find-file-text and find-file-binary. End-of-line conversion is part of the general coding system conversion mechanism, so another way to control whether a file is treated as text or binary is with the commands for specifying a coding system (see Specify Coding). For example, C-x RET c undecided-unix RET C-x C-f foobar.txt visits the file `foobar.txt' without converting the EOLs.
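From Lisp, roughly the same effect as the key sequence above can be obtained by binding the variable coding-system-for-read around the call that visits the file. This is a sketch, not the only way to do it; `foobar.txt' is the same example file name as above.

```elisp
;; Visit foobar.txt without EOL conversion by forcing the coding
;; system used for reading, for the duration of this one call.
(let ((coding-system-for-read 'undecided-unix))
  (find-file "foobar.txt"))
```

Because the binding is local to the let form, files visited afterward are decoded in the usual way.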

The mode line indicates whether end-of-line translation was used for the current buffer. Normally a colon appears after the coding system letter near the beginning of the mode line. If MS-DOS end-of-line translation is in use for the buffer, this character changes to a backslash.
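The same information can also be queried from Lisp, for example with M-:. The sketch below asks for the EOL type of the current buffer's coding system; the result is 0 for newline, 1 for carriage-return linefeed, and 2 for carriage-return alone, or a vector of coding systems if the EOL type has not been determined.

```elisp
;; Report the EOL convention of the current buffer's coding system.
;; 0 means newline (Unix), 1 means carriage-return linefeed (DOS),
;; 2 means carriage-return (Mac); a vector means "not yet decided".
(coding-system-eol-type buffer-file-coding-system)
```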

When you use NFS or Samba to access file systems that reside on computers using Unix or GNU systems, Emacs should not perform end-of-line translation on any files in these file systems--not even when you create a new file. To request this, designate these file systems as untranslated file systems by calling the function add-untranslated-filesystem. It takes one argument: the file system name, including a drive letter and optionally a directory. For example,

(add-untranslated-filesystem "Z:")

designates drive Z as an untranslated file system, and

(add-untranslated-filesystem "Z:\\foo")

designates directory `\foo' on drive Z as an untranslated file system.

Most often you would use add-untranslated-filesystem in your `_emacs' file, or in `site-start.el' so that all the users at your site get the benefit of it.
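If the same init file is shared with systems where this function might not be defined, the call can be guarded. The following sketch assumes that is the case; the drive letter is the one from the example above.

```elisp
;; Call the function only where it is defined, so the same init
;; file also loads cleanly on systems that lack it (an assumption
;; about how the init file is shared, not a requirement).
(if (fboundp 'add-untranslated-filesystem)
    (add-untranslated-filesystem "Z:"))
```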

To countermand the effect of add-untranslated-filesystem, use the function remove-untranslated-filesystem. This function takes one argument, which should be a string just like the one that was used previously with add-untranslated-filesystem.
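For example, to undo the earlier designation of directory `\foo' on drive Z, pass the same string again:

```elisp
;; The argument matches the string given to add-untranslated-filesystem.
(remove-untranslated-filesystem "Z:\\foo")
```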
