Connection Character Sets and Collations

2016-10-31 17:58
MySQL 5.6 Reference Manual / ... / Connection Character Sets and Collations


10.1.4 Connection Character Sets and Collations

Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:

The server character set and collation are the values of the character_set_server and collation_server system variables.

The character set and collation of the default database are the values of the character_set_database and collation_database system variables. (Translator's note: if no character set is specified when a database is created, the database defaults to the character set given by character_set_server.)
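As a quick check of the four variables above, the following sketch reads them in one statement and then looks at what a newly created database inherits. The database name demo_charset_db is hypothetical and used only for illustration; the values returned depend on your server configuration.

```sql
-- Server-level and default-database settings as seen by the current session.
SELECT @@character_set_server, @@collation_server,
       @@character_set_database, @@collation_database;

-- Hypothetical scratch database, created without an explicit CHARACTER SET.
CREATE DATABASE demo_charset_db;

-- Its defaults are expected to match character_set_server / collation_server.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
  FROM information_schema.SCHEMATA
 WHERE SCHEMA_NAME = 'demo_charset_db';

-- Clean up the scratch database.
DROP DATABASE demo_charset_db;
```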

Additional character set and collation system variables are involved in handling traffic for the connection between a client and the server. Every client has connection-related character set and collation system variables.

A “connection” is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:

What character set is the statement in when it leaves the client?

The server takes the character_set_client system variable to be the character set in which statements are sent by the client.

What character set should the server translate a statement to after receiving it?

For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.
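The effect of introducers and of collation_connection on literal comparisons can be observed with a sketch like the one below. The choice of utf8 and its two collations is an assumption made for illustration; any character set with a case-insensitive and a binary collation would do.

```sql
-- Plain literals take character_set_connection and collation_connection.
SET NAMES 'utf8' COLLATE 'utf8_general_ci';
SELECT CHARSET('abc'), COLLATION('abc');

-- A literal with an introducer keeps the introducer's character set instead.
SELECT CHARSET(_latin1'abc'), COLLATION(_latin1'abc');

-- Literal-to-literal comparison uses collation_connection:
-- a case-insensitive collation should report the strings as equal (1).
SELECT 'a' = 'A';

-- With a binary collation the same comparison should report 0.
SET NAMES 'utf8' COLLATE 'utf8_bin';
SELECT 'a' = 'A';
```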

What character set should the server translate to before shipping result sets or error messages back to the client?

The character_set_results system variable indicates the character set in which the server returns query results to the client. This includes result data such as column values, and result metadata such as column names and error messages.

Clients can fine-tune the settings for these variables, or depend on the defaults (in which case, you can skip the rest of this section). If you do not use the defaults, you must change the character settings for each connection to the server.

Two statements affect the connection-related character set variables as a group:

SET NAMES 'charset_name' [COLLATE 'collation_name']


SET NAMES indicates what character set the client will use to send SQL statements to the server. Thus, SET NAMES 'cp1251' tells the server, “future incoming messages from this client are in character set cp1251.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)

A SET NAMES 'charset_name' statement is equivalent to these three statements:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET character_set_connection = charset_name;

Setting character_set_connection to charset_name also implicitly sets collation_connection to the default collation for charset_name. It is unnecessary to set that collation explicitly. To specify a particular collation, use the optional COLLATE clause:
SET NAMES 'charset_name' COLLATE 'collation_name'
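To see the grouped effect of SET NAMES, a session can set it and then read the variables back. This is a minimal sketch; latin1 and latin1_german2_ci are chosen here only as an illustrative character set and non-default collation.

```sql
-- Without COLLATE, collation_connection becomes the default collation of latin1.
SET NAMES 'latin1';
SELECT @@character_set_client, @@character_set_connection,
       @@character_set_results, @@collation_connection;

-- With COLLATE, the three character set variables are unchanged,
-- but collation_connection takes the collation named in the statement.
SET NAMES 'latin1' COLLATE 'latin1_german2_ci';
SELECT @@character_set_client, @@character_set_connection,
       @@character_set_results, @@collation_connection;
```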

SET CHARACTER SET charset_name


SET CHARACTER SET is similar to SET NAMES but sets character_set_connection and collation_connection to character_set_database and collation_database. A SET CHARACTER SET charset_name statement is equivalent to these three statements:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET collation_connection = @@collation_database;

Setting collation_connection also implicitly sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is unnecessary to set character_set_connection explicitly.
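The difference from SET NAMES shows up when the named character set differs from the default database's. A sketch, assuming koi8r is not the character set of the current default database:

```sql
SET CHARACTER SET koi8r;

-- character_set_client and character_set_results should now be koi8r,
-- while the connection character set and collation should mirror the
-- default database's values rather than koi8r.
SELECT @@character_set_client,     @@character_set_results,
       @@character_set_connection, @@character_set_database,
       @@collation_connection,     @@collation_database;
```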

Note

ucs2, utf16, utf16le, and utf32 cannot be used as a client character set, which means that they do not work for SET NAMES or SET CHARACTER SET.

The MySQL client programs mysql, mysqladmin, mysqlcheck, mysqlimport, and mysqlshow determine the default character set to use as follows:

In the absence of other information, the programs use the compiled-in default character set, usually latin1. (Translator's note: the compiled-in default can be chosen when building MySQL from source, for example:)
cmake . -DDEFAULT_CHARSET=utf8 \
        -DDEFAULT_COLLATION=utf8_general_ci

The programs can autodetect which character set to use based on the operating system setting, such as the value of the LANG or LC_ALL locale environment variable on Unix systems or the code page setting on Windows systems. For systems on which the locale is available from the OS, the client uses it to set the default character set rather than using the compiled-in default. For example, setting LANG to ru_RU.KOI8-R causes the koi8r character set to be used. Thus, users can configure the locale in their environment for use by MySQL clients. (Translator's note: the detected character set becomes the client's default for character_set_client, character_set_connection, and character_set_results. This rule does not apply to MySQL 5.1 clients.)

The OS character set is mapped to the closest MySQL character set if there is no exact match. If the client does not support the matching character set, it uses the compiled-in default. For example, ucs2 is not supported as a connection character set. (Translator's note: in that case the client falls back to the compiled-in default, typically latin1.)

C applications can use character set autodetection based on the OS setting by invoking mysql_options() as follows before connecting to the server:
mysql_options(mysql, MYSQL_SET_CHARSET_NAME, MYSQL_AUTODETECT_CHARSET_NAME);


The programs support a --default-character-set option, which enables users to specify the character set explicitly to override whatever default the client otherwise determines.

When a client connects to the server, it sends the name of the character set that it wants to use. The server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. In effect, the server performs a SET NAMES operation using the character set name.

With the mysql client, to use a character set different from the default, you could explicitly execute SET NAMES every time you start up. To accomplish the same result more easily, add the --default-character-set option setting to your mysql command line or in your option file. For example, the following option file setting changes the three connection-related character set variables to koi8r each time you invoke mysql:
[mysql]
default-character-set=koi8r


If you are using the mysql client with auto-reconnect enabled (which is not recommended), it is preferable to use the charset command rather than SET NAMES. For example:
mysql> charset utf8
Charset changed

The charset command issues a SET NAMES statement, and also changes the default character set that mysql uses when it reconnects after the connection has dropped.

Example: Suppose that column1 is defined as CHAR(5) CHARACTER SET latin2. If you do not say SET NAMES or SET CHARACTER SET, then for SELECT column1 FROM t, the server sends back all the values for column1 using the character set that the client specified when it connected. On the other hand, if you say SET NAMES 'latin1' or SET CHARACTER SET latin1 before issuing the SELECT statement, the server converts the latin2 values to latin1 just before sending results back. Conversion may be lossy if there are characters that are not in both character sets.
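The lossy case can be reproduced with a sketch along these lines. The table t and column1 follow the example above; the stored character (ł, byte 0xB3 in latin2, which has no counterpart in MySQL's latin1) is an assumption picked for illustration.

```sql
-- Table matching the example: a latin2 column.
CREATE TABLE t (column1 CHAR(5) CHARACTER SET latin2);

-- Insert the latin2 byte 0xB3 directly, independent of the client character set.
INSERT INTO t VALUES (_latin2 X'B3');

-- Ask for results in latin1. The stored bytes are untouched (HEX shows them),
-- but column1 itself is converted to latin1 on the way out, and a character
-- with no latin1 equivalent is typically replaced by a question mark.
SET NAMES 'latin1';
SELECT column1, HEX(column1) FROM t;

-- Ask for results in latin2 to get the value back without loss.
SET NAMES 'latin2';
SELECT column1, HEX(column1) FROM t;
```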

If you want the server to perform no conversion of result sets or error messages, set character_set_results to NULL or binary:
SET character_set_results = NULL;


To see the values of the character set and collation system variables that apply to your connection, use these statements:
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
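When comparing a connection's settings against the server-wide defaults, it can help to view both scopes side by side. A small sketch using the standard GLOBAL and SESSION modifiers and a WHERE filter:

```sql
-- Values in effect for this connection only.
SHOW SESSION VARIABLES
 WHERE Variable_name LIKE 'character\_set\_%'
    OR Variable_name LIKE 'collation%';

-- Server-wide defaults that new connections start from.
SHOW GLOBAL VARIABLES
 WHERE Variable_name LIKE 'character\_set\_%'
    OR Variable_name LIKE 'collation%';
```

A difference between the two outputs usually means the client requested its own character set at connect time or the session has run SET NAMES or SET CHARACTER SET.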


You must also consider the environment within which your MySQL applications execute. See Section 10.1.5, “Configuring Application Character Set and Collation”.

For more information about character sets and error messages, see Section 10.1.6, “Error Message Character Set”.