MultiEnc.vim : Autodetect multiple encodings (more than 'fileencodings')


created by
Yongwei Wu
 
script type
utility
 
description
------------------------------------------------------------------------
IMPORTANT NOTICE: This script has been merged with Ming Bai's FencView.vim. The content below is obsolete. Please see vimscript#1708 instead.
------------------------------------------------------------------------

The Vim option 'fileencodings' has some limitations: for example, it cannot autodetect GBK and Big5 files at the same time. This was my original motivation for writing this script (and its support program, tellenc).

This script does these things to decide the encoding of a file:

- If a file has a modeline containing fileencoding=..., that value is used as the encoding to open the file.
- If a file is an HTML file and specifies its encoding with an HTTP-EQUIV meta tag, that encoding is used to open the file. The file patterns that identify HTML files can be customized with the global variable multienc_html_patterns.
- If the encoding cannot be determined by the steps above, tellenc may be used to detect it. This applies to HTML files without a suitable HTTP-EQUIV meta tag and to any additional files matching the patterns in the global variable multienc_auto_patterns.
- Autodetection can be triggered manually with the command EditAutoEncoding (without a file name to re-detect the current buffer, or with a file name to edit a new file).
- Autodetection can be overridden with the command EditManualEncoding ("e ++enc=" may not work in some cases now).
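The order of the checks above can be sketched in Vim script. This is only an illustration of the decision order, not the code in multienc.vim: the function name SketchDetectEncoding, its arguments, the 50-line read limit, and the regular expressions are all hypothetical, and it assumes that tellenc prints just the encoding name on standard output.

```vim
" Hypothetical sketch of the decision order; NOT the actual multienc.vim code.
" Returns an encoding name for a:fname, or '' if none could be decided.
function! SketchDetectEncoding(fname, is_html, use_tellenc)
  " Only inspect the start of the file (the 50-line limit is an assumption).
  let l:head = readfile(a:fname, '', 50)

  " Step 1: a modeline such as 'vim: set fileencoding=gbk :' wins.
  for l:line in l:head
    let l:enc = matchstr(l:line,
          \ '\c\%(vim\=\|ex\):.*\%(fileencoding\|fenc\)=\zs[-A-Za-z0-9_]\+')
    if l:enc != ''
      return tolower(l:enc)
    endif
  endfor

  " Step 2: for HTML files, use the charset from an HTTP-EQUIV meta tag.
  if a:is_html
    for l:line in l:head
      if l:line =~? '<meta\s.*http-equiv'
        let l:enc = matchstr(l:line, '\ccharset=\zs[-A-Za-z0-9_]\+')
        if l:enc != ''
          return tolower(l:enc)
        endif
      endif
    endfor
  endif

  " Step 3: fall back to the external program (assumed to print only the
  " encoding name; special results such as an 'unknown' answer are not handled).
  if a:use_tellenc
    let l:prog = $MULTIENC_TELLENC != '' ? $MULTIENC_TELLENC : 'tellenc'
    let l:out = system(l:prog . ' ' . shellescape(a:fname))
    if v:shell_error == 0
      return tolower(substitute(l:out, '\n\+$', '', ''))
    endif
  endif
  return ''
endfunction
```

For example, ":echo SketchDetectEncoding(expand('%'), 0, 1)" would report what the sketch decides for the current file. The real commands are used as described in the list: ":EditAutoEncoding" re-detects the current buffer, and ":EditAutoEncoding" followed by a file name opens that file with autodetection.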

The program used to detect the encoding of a file is "tellenc" by default. A different program can be selected with the environment variable MULTIENC_TELLENC. My current tellenc, available at http://wyw.dcweb.cn/download.asp?path=&file=tellenc.zip, supports ASCII, UTF-8, UTF-16, Latin1, Windows-1252, GB2312, GBK, Big5, and any Unicode encoding with a BOM.
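If tellenc is not on your path, one way to point the script at it is to set the environment variable from your _vimrc. This is a minimal sketch: the path below is a placeholder, and it assumes the script reads MULTIENC_TELLENC from Vim's environment after your _vimrc has been processed.

```vim
" Placeholder path; replace it with the real location of your tellenc binary.
let $MULTIENC_TELLENC = 'C:\Tools\tellenc.exe'

" Optional sanity check: warn early if the program cannot be executed.
if !executable($MULTIENC_TELLENC)
  echohl WarningMsg
  echomsg 'tellenc not found at ' . $MULTIENC_TELLENC
  echohl None
endif
```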
 
install details
Put multienc.vim in your Vim plugin directory, and tellenc somewhere on your path. If needed, customize the patterns for HTML files and additional files with the global variables multienc_html_patterns and multienc_auto_patterns. Set the global variable legacy_encoding to the default legacy encoding of your system.

A simple _vimrc (for Windows) might look like this:

" Legacy encoding is the system default encoding
let g:legacy_encoding=&encoding

source $VIMRUNTIME/vimrc_example.vim
source $VIMRUNTIME/mswin.vim

if has('gui_running')
  set encoding=utf-8
else
  if &termencoding != '' && &termencoding != &encoding
    let &encoding=&termencoding
    let &fileencodings='ucs-bom,utf-8,' . &encoding
  endif
endif

" Set default file encoding(s) to the legacy encoding
exec 'set fileencoding=' . g:legacy_encoding
let &fileencodings=substitute
                  \(&fileencodings, '\<default\>', g:legacy_encoding, '')

" File patterns of files for automatic encoding detection
let multienc_auto_patterns='*.txt,*.tex'
let multienc_html_patterns='*.htm{l\=},*.asp'
 

script versions

package       script version  date        Vim version  user        release notes
multienc.vim  1.4             2007-02-27  7.0          Yongwei Wu  Normalize detected encoding names to make it work better with fileencodings.
multienc.vim  1.3             2007-02-25  7.0          Yongwei Wu  Enhance using modeline to set fileencoding.