Connection Character Sets and Collations
2016-10-31 17:58
MySQL 5.6 Reference Manual / ... / Connection Character Sets and Collations
10.1.4 Connection Character Sets and Collations
Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:
The server character set and collation are the values of the character_set_server and collation_server system variables.
The character set and collation of the default database are the values of the character_set_database and collation_database system variables. >> If no character set is specified when a database is created, the database defaults to the character set given by character_set_server.
Additional character set and collation system variables are involved in handling traffic for the connection between a client and the server. Every client has connection-related character set and collation system variables.
A “connection” is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:
What character set is the statement in when it leaves the client?
The server takes the character_set_client system variable to be the character set in which statements are sent by the client.
What character set should the server translate a statement to after receiving it?
For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence. >> An example of a literal with an introducer is SELECT _utf8'abc';
What character set should the server translate to before shipping result sets or error messages back to the client?
The character_set_results system variable indicates the character set in which the server returns query results to the client. This includes result data such as column values, and result metadata such as column names and error messages.
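The three questions above describe a two-stage conversion: incoming statements go from character_set_client to character_set_connection, and outgoing results go to character_set_results. The following Python sketch only simulates that flow with Python's own codecs; it does not talk to a MySQL server. The PY_CODEC table and the function names server_receives and server_sends are hypothetical, introduced for illustration.

```python
# Simulation only: models the conversions described above with Python codecs.
# PY_CODEC is a hypothetical mapping from MySQL charset names to Python codec names.
PY_CODEC = {"cp1251": "cp1251", "utf8": "utf-8", "koi8r": "koi8_r", "latin1": "latin_1"}


def server_receives(statement_bytes, session):
    """Convert a statement from character_set_client to character_set_connection."""
    text = statement_bytes.decode(PY_CODEC[session["character_set_client"]])
    return text.encode(PY_CODEC[session["character_set_connection"]])


def server_sends(value_bytes, stored_charset, session):
    """Convert a stored value to character_set_results (None means no conversion)."""
    target = session["character_set_results"]
    if target is None:
        return value_bytes
    text = value_bytes.decode(PY_CODEC[stored_charset])
    return text.encode(PY_CODEC[target], errors="replace")


session = {
    "character_set_client": "cp1251",
    "character_set_connection": "utf8",
    "character_set_results": "koi8r",
}

# The client sends cp1251 bytes; the server works with utf8 internally.
statement = "SELECT 'привет'".encode("cp1251")
on_server = server_receives(statement, session)

# A value stored as utf8 is converted to koi8r before it is sent back.
stored = "привет".encode("utf-8")
reply = server_sends(stored, "utf8", session)

print(on_server.decode("utf-8"))
print(reply.decode("koi8_r"))
```

In practice the three variables usually hold the same character set; they differ here only to make each conversion step visible.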
Clients can fine-tune the settings for these variables, or depend on the defaults (in which case, you can skip the rest of this section). If you do not use the defaults, you must change the character settings for each connection to the server.
Two statements affect the connection-related character set variables as a group:
SET NAMES 'charset_name' [COLLATE 'collation_name']
SET NAMES indicates what character set the client will use to send SQL statements to the server. Thus, SET NAMES 'cp1251' tells the server, “future incoming messages from this client are in character set cp1251.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)
A SET NAMES 'charset_name' statement is equivalent to these three statements:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET character_set_connection = charset_name;
Setting character_set_connection to charset_name also implicitly sets collation_connection to the default collation for charset_name. It is unnecessary to set that collation explicitly. To specify a particular collation, use the optional COLLATE clause:
SET NAMES 'charset_name' COLLATE 'collation_name'
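As a rough illustration of the equivalence described above, the following Python sketch models the session variables as a dictionary. It is a simulation under stated assumptions, not server code: the set_names function is hypothetical, the DEFAULT_COLLATION table covers only a few character sets, and the name-prefix check is a simplification of how the server validates that a collation belongs to a character set.

```python
# Simulation only: a dictionary stands in for the session's system variables.
# Default collations for a few character sets, as shipped with MySQL 5.6.
DEFAULT_COLLATION = {
    "latin1": "latin1_swedish_ci",
    "utf8": "utf8_general_ci",
    "cp1251": "cp1251_general_ci",
    "koi8r": "koi8r_general_ci",
}


def set_names(session, charset, collation=None):
    """Model SET NAMES 'charset' [COLLATE 'collation'] on a session dictionary."""
    # Simplification: assume a collation belongs to a charset if its name
    # starts with "<charset>_".
    if collation is not None and not collation.startswith(charset + "_"):
        raise ValueError(f"collation {collation!r} is not valid for {charset!r}")
    session["character_set_client"] = charset
    session["character_set_results"] = charset
    session["character_set_connection"] = charset
    # Without a COLLATE clause, the charset's default collation applies.
    session["collation_connection"] = collation or DEFAULT_COLLATION[charset]
    return session


plain = set_names({}, "cp1251")
explicit = set_names({}, "utf8", "utf8_bin")
print(plain["collation_connection"])     # cp1251_general_ci
print(explicit["collation_connection"])  # utf8_bin
```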
SET CHARACTER SET charset_name
SET CHARACTER SET is similar to SET NAMES but sets character_set_connection and collation_connection to character_set_database and collation_database.
A SET CHARACTER SET charset_name statement is equivalent to these three statements:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET collation_connection = @@collation_database;
Setting collation_connection also implicitly sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is unnecessary to set character_set_connection explicitly.
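The difference from SET NAMES is easiest to see when the default database uses a different character set from the one requested. The Python sketch below simulates this with a dictionary of session variables; the set_character_set function is hypothetical and only mirrors the three equivalent statements listed above.

```python
# Simulation only: a dictionary stands in for the session's system variables.
def set_character_set(session, charset):
    """Model SET CHARACTER SET charset on a session dictionary."""
    session["character_set_client"] = charset
    session["character_set_results"] = charset
    # The connection variables follow the default database, not `charset`.
    session["collation_connection"] = session["collation_database"]
    session["character_set_connection"] = session["character_set_database"]
    return session


session = {
    "character_set_database": "latin1",
    "collation_database": "latin1_swedish_ci",
}
set_character_set(session, "utf8")

print(session["character_set_client"])      # utf8
print(session["character_set_connection"])  # latin1
```

With these values, statements arrive as utf8 but are converted to latin1 on the server, so characters outside latin1 would not survive in string literals.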
Note
ucs2, utf16, utf16le, and utf32 cannot be used as a client character set, which means that they do not work for SET NAMES or SET CHARACTER SET.
The MySQL client programs mysql, mysqladmin, mysqlcheck, mysqlimport, and mysqlshow determine the default character set to use as follows:
In the absence of other information, the programs use the compiled-in default character set, usually latin1.
>> The compiled-in default character set and collation can be chosen when building MySQL from source, for example:
cmake . -DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci
The programs can autodetect which character set to use based on the operating system setting, such as the value of the LANG or LC_ALL locale environment variable on Unix systems or the code page setting on Windows systems. For systems on which the locale is available from the OS, the client uses it to set the default character set rather than using the compiled-in default. For example, setting LANG to ru_RU.KOI8-R causes the koi8r character set to be used. Thus, users can configure the locale in their environment for use by MySQL clients. >> Note that this autodetection rule does not apply to MySQL 5.1 clients.
The OS character set is mapped to the closest MySQL character set if there is no exact match. If the client does not support the matching character set, it uses the compiled-in default. For example, ucs2 is not supported as a connection character set.
C applications can use character set autodetection based on the OS setting by invoking mysql_options() as follows before connecting to the server:
mysql_options(mysql, MYSQL_SET_CHARSET_NAME, MYSQL_AUTODETECT_CHARSET_NAME);
The programs support a --default-character-set option, which enables users to specify the character set explicitly to override whatever default the client otherwise determines.
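The precedence described above (explicit option first, then the OS locale, then the compiled-in default) can be sketched in Python as follows. This is an illustration under stated assumptions, not the clients' actual logic: the function names and the CODESET_TO_MYSQL table are hypothetical, the UCS-2 locale entry exists only to exercise the fallback, and LC_ALL is assumed to take precedence over LANG as in POSIX.

```python
# Simulation only: approximates how a client might pick its default charset.
COMPILED_IN_DEFAULT = "latin1"

# Hypothetical, deliberately small mapping from locale codesets to MySQL names.
CODESET_TO_MYSQL = {
    "KOI8-R": "koi8r",
    "UTF-8": "utf8",
    "ISO-8859-1": "latin1",
    "ISO-8859-2": "latin2",
    "UCS-2": "ucs2",  # included only to show the unsupported-charset fallback
}

# Character sets that cannot be used as a client character set.
UNSUPPORTED_AS_CLIENT = {"ucs2", "utf16", "utf16le", "utf32"}


def charset_from_locale(env):
    """Return the MySQL charset implied by LC_ALL or LANG, or None."""
    for name in ("LC_ALL", "LANG"):  # assumed POSIX precedence
        value = env.get(name)
        if value:
            if "." not in value:
                return None  # locale such as "C" carries no codeset
            codeset = value.split(".", 1)[1].split("@", 1)[0].upper()
            return CODESET_TO_MYSQL.get(codeset)
    return None


def resolve_client_charset(option=None, env=None):
    """Apply the precedence: explicit option, then OS locale, then default."""
    if option:
        return option
    detected = charset_from_locale(env or {})
    if detected and detected not in UNSUPPORTED_AS_CLIENT:
        return detected
    return COMPILED_IN_DEFAULT


print(resolve_client_charset(env={"LANG": "ru_RU.KOI8-R"}))                 # koi8r
print(resolve_client_charset(option="utf8", env={"LANG": "ru_RU.KOI8-R"}))  # utf8
print(resolve_client_charset(env={}))                                       # latin1
```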
When a client connects to the server, it sends the name of the character set that it wants to use. The server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. In effect, the server performs a SET NAMES operation using the character set name.
With the mysql client, to use a character set different from the default, you could explicitly execute SET NAMES every time you start up. To accomplish the same result more easily, add the --default-character-set option setting to your mysql command line or in your option file. For example, the following option file setting changes the three connection-related character set variables to koi8r each time you invoke mysql:
[mysql]
default-character-set=koi8r
If you are using the mysql client with auto-reconnect enabled (which is not recommended), it is preferable to use the charset command rather than SET NAMES. For example:
mysql> charset utf8
Charset changed
The charset command issues a SET NAMES statement, and also changes the default character set that mysql uses when it reconnects after the connection has dropped.
Example: Suppose that column1 is defined as CHAR(5) CHARACTER SET latin2. If you do not say SET NAMES or SET CHARACTER SET, then for SELECT column1 FROM t, the server sends back all the values for column1 using the character set that the client specified when it connected. On the other hand, if you say SET NAMES 'latin1' or SET CHARACTER SET latin1 before issuing the SELECT statement, the server converts the latin2 values to latin1 just before sending results back. Conversion may be lossy if there are characters that are not in both character sets.
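The lossy case can be reproduced without a server by using Python's codecs as a stand-in for the conversion. This is a sketch under the assumption that an unconvertible character is replaced with a question mark, which is what Python's errors="replace" does when encoding; the convert_result function is hypothetical.

```python
# Simulation only: Python codecs stand in for the server's result conversion.
def convert_result(value_bytes, column_charset, results_charset):
    """Re-encode a column value, replacing unconvertible characters with '?'."""
    text = value_bytes.decode(column_charset)
    return text.encode(results_charset, errors="replace")


# A latin2 value that fits in CHAR(5); only some of its letters exist in latin1.
stored = "Łódź".encode("iso8859_2")
sent = convert_result(stored, "iso8859_2", "latin_1")

print(sent.decode("latin_1"))  # ?ód?
```

The letters Ł and ź have no latin1 equivalent and are lost, while ó and d exist in both character sets and survive the conversion.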
If you want the server to perform no conversion of result sets or error messages, set character_set_results to NULL or binary:
SET character_set_results = NULL;
To see the values of the character set and collation system variables that apply to your connection, use these statements:
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
You must also consider the environment within which your MySQL applications execute. See Section 10.1.5,
“Configuring Application Character Set and Collation”.
For more information about character sets and error messages, see Section 10.1.6,
“Error Message Character Set”.