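There are already plenty of answers to this question, but Mathias Bynens points out that 'utf8mb4' should be used instead of 'utf8' for proper UTF-8 support ('utf8' does not support 4-byte characters; fields are truncated on insert). I consider that an important difference, so here is yet another answer on how to set the default character set and collation. One that lets you insert a pile of poo (💩).
This applies to MySQL 5.5.35.
Note that some of the settings may be optional. As I'm not entirely sure that I haven't forgotten anything, I'm making this answer a community wiki.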
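To make the 3-byte versus 4-byte distinction concrete, here is a minimal Python sketch (not part of the original answer; the helper names `utf8_length` and `needs_utf8mb4` are made up for illustration). It only inspects how many bytes a character needs in UTF-8 and does not talk to MySQL at all.

```python
def utf8_length(char: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """True if any character lies outside the Basic Multilingual Plane.

    Such characters take 4 bytes in UTF-8, which MySQL's 3-byte 'utf8'
    character set cannot store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


# Print the UTF-8 byte length of a 1-, 2-, 3- and 4-byte character.
for sample in ("a", "\u00e9", "\u20ac", "\U0001F4A9"):
    print(f"U+{ord(sample):04X}: {utf8_length(sample)} byte(s)")
```

A check like `needs_utf8mb4(user_input)` could be used to spot strings that a 'utf8' column would not store intact.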
Old settings
mysql> SHOW VARIABLES LIKE 'char%'; SHOW VARIABLES LIKE 'collation%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | utf8_general_ci   |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
Config
# UTF-8 should be used instead of Latin1. Obviously.
# NOTE "utf8" in MySQL is NOT full UTF-8: http://mathiasbynens.be/notes/mysql-utf8mb4
[client]
default-character-set = utf8mb4
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
[mysql]
default-character-set = utf8mb4
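As a rough way to confirm that a config file contains the four settings above, the following Python sketch parses the text with the standard `configparser` module. This is an assumption-laden illustration: `configparser` is not a real my.cnf parser (it does not understand `!include` directives, and MySQL also accepts underscores in option names, which this sketch does not normalize), and `EXPECTED` and `missing_settings` are hypothetical names.

```python
import configparser

# The settings from the config above, keyed by (section, option).
EXPECTED = {
    ("client", "default-character-set"): "utf8mb4",
    ("mysqld", "character-set-server"): "utf8mb4",
    ("mysqld", "collation-server"): "utf8mb4_unicode_ci",
    ("mysql", "default-character-set"): "utf8mb4",
}


def missing_settings(cnf_text: str) -> list:
    """Return (section, option, found) tuples for settings that differ from EXPECTED."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare options such as skip-networking
        strict=False,         # tolerate repeated sections
        interpolation=None,   # treat % literally
    )
    parser.read_string(cnf_text)
    problems = []
    for (section, option), wanted in EXPECTED.items():
        found = parser.get(section, option, fallback=None)
        if found != wanted:
            problems.append((section, option, found))
    return problems


SAMPLE = """\
[client]
default-character-set = utf8mb4
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
[mysql]
default-character-set = utf8mb4
"""

print(missing_settings(SAMPLE))
```

An empty list means all four settings were found with the expected values; anything else lists what is missing or different.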
New settings
mysql> SHOW VARIABLES LIKE 'char%'; SHOW VARIABLES LIKE 'collation%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database   | utf8mb4_unicode_ci |
| collation_server     | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0.00 sec)
character_set_system is always utf8.
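If you want to check such output mechanically, a small Python sketch along these lines could parse the boxed table printed by the mysql client and list the variables that are still not utf8mb4. The function names and the `IGNORED` set are assumptions made for this example; which variables to ignore is a judgment call based on the output shown above.

```python
def parse_variables(output: str) -> dict:
    """Parse the boxed table printed by the mysql client into {name: value}."""
    variables = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts and "N rows in set" lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            variables[cells[0]] = cells[1]
    return variables


# Variables that keep a non-utf8mb4 value even after the change.
IGNORED = {"character_set_system", "character_set_filesystem", "character_sets_dir"}


def not_utf8mb4(variables: dict) -> list:
    """Names of charset/collation variables whose value does not start with utf8mb4."""
    return sorted(
        name
        for name, value in variables.items()
        if name not in IGNORED and not value.startswith("utf8mb4")
    )
```

Fed the old settings, `not_utf8mb4(parse_variables(text))` would list every connection, database and server variable; fed the new settings, it should return an empty list.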
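This doesn't affect existing tables; it only sets the defaults (used for new tables).
The following ALTER statements can be used to convert an existing database and its tables (no dump-restore workaround needed):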
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
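With more than a handful of tables, typing these statements by hand gets tedious. The Python sketch below only builds the statement strings for a given list of table names; it does not connect to a server, and `quote_identifier` and `conversion_statements` are hypothetical helpers, not something the original answer provides. Review the generated SQL before running it.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(database, tables,
                          charset="utf8mb4", collation="utf8mb4_unicode_ci"):
    """Build the ALTER statements shown above for one database and its tables."""
    target = f"CHARACTER SET {charset} COLLATE {collation}"
    statements = [f"ALTER DATABASE {quote_identifier(database)} {target};"]
    for table in tables:
        statements.append(
            f"ALTER TABLE {quote_identifier(database)}.{quote_identifier(table)} "
            f"CONVERT TO {target};"
        )
    return statements


for statement in conversion_statements("databasename", ["tablename"]):
    print(statement)
```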
Edit:
On a MySQL 5.0 server, character_set_client, character_set_connection, character_set_results and collation_connection remain at latin1. Issuing SET NAMES utf8 (utf8mb4 is not available in that version) sets those to utf8 as well.
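Caveat:
If you have a utf8 table with an index on a column of type VARCHAR(255), it cannot be converted in some cases, because the maximum key length is exceeded (Specified key was too long; max key length is 767 bytes.). If possible, reduce the column size from 255 to 191 (because 191 * 4 = 764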