==============================================================================
Locale *mbyte-locale*
The easiest setup is when your whole system uses the locale you want to work
in. But it's also possible to set the locale for one shell you are working
in, or just use a certain locale inside Vim.
WHAT IS A LOCALE? *locale*
There are many languages in the world. And there are different cultures and
environments at least as many as the number of languages. A linguistic
environment corresponding to an area is called "locale". This includes
information about the used language, the charset, collating order for sorting,
date format, currency format and so on. For Vim only the language and charset
really matter.
You can only use a locale if your system has support for it. Some systems
have only a few locales, especially in the USA. The language which you want
to use may not be on your system. In that case you might be able to install
it as an extra package. Check your system documentation for how to do that.
The location in which the locales are installed varies from system to system.
For example, "/usr/share/locale" or "/usr/lib/locale". See your system's
setlocale() man page.
Looking in these directories will show you the exact name of each locale.
Mostly upper/lowercase matters, thus "ja_JP.EUC" and "ja_jp.euc" are
different. Some systems have a locale.alias file, which allows translation
from a short name like "nl" to the full name "nl_NL.ISO_8859-1".
Note that X-windows has its own locale stuff. And unfortunately uses locale
names different from what is used elsewhere. This is confusing! For Vim it
matters what the setlocale() function uses, which is generally NOT the
X-windows stuff. You might have to do some experiments to find out what
really works.
*locale-name*
The (simplified) format of |locale| name is:
language
or language_territory
or language_territory.codeset
Territory means the country (or part of it), codeset means the |charset|. For
example, the locale name "ja_JP.eucJP" means:
ja the language is Japanese
JP the country is Japan
eucJP the codeset is EUC-JP
But it also could be "ja", "ja_JP.EUC", "ja_JP.ujis", etc. And unfortunately,
the locale name for a specific language, territory and codeset is not unified
and depends on your system.
Examples of locale name:
charset language locale name ~
GB2312 Chinese (simplified) zh_CN.EUC, zh_CN.GB2312
Big5 Chinese (traditional) zh_TW.BIG5, zh_TW.Big5
CNS-11643 Chinese (traditional) zh_TW
EUC-JP Japanese ja, ja_JP.EUC, ja_JP.ujis, ja_JP.eucJP
Shift_JIS Japanese ja_JP.SJIS, ja_JP.Shift_JIS
EUC-KR Korean ko, ko_KR.EUC
USING A LOCALE
To start using a locale for the whole system, see the documentation of your
system. Mostly you need to set it in a configuration file in "/etc".
To use a locale in a shell, set the $LANG environment value. When you want to
use Korean and the |locale| name is "ko", do this:
sh: export LANG=ko
csh: setenv LANG ko
You can put this in your ~/.profile or ~/.cshrc file to always use it.
To use a locale in Vim only, use the |:language| command: >
:language ko
Put this in your |init.vim| file to use it always.
Or specify $LANG when starting Vim:
sh: LANG=ko vim {vim-arguments}
csh: env LANG=ko vim {vim-arguments}
You could make a small shell script for this.
==============================================================================
Encoding *mbyte-encoding*
UTF-8 is always used internally to encode characters. This applies to all the
places where text is used, including buffers (files loaded into memory),
registers and variables.
*charset* *codeset*
Charset is another name for encoding. There are subtle differences, but these
don't matter when using Vim. "codeset" is another similar name.
Each character is encoded as one or more bytes. When all characters are
encoded with one byte,