Bienvenido! - Willkommen! - Welcome!

Bitácora Técnica de Tux&Cía., Santa Cruz de la Sierra, BO
Bitácora Central: Tux&Cía.
Bitácora de Información Avanzada: Tux&Cía.-Información
May the source be with you!

Thursday, December 2, 2010

MySQL -globally, not session setted

Source
.net
In found out that it is not enough the set the variables on phpMyadmin. There are so-called global variables and session variables. I initially tried to set the global variables but it did not work. In most of the cases only the session variables were changed. But for MySql it does not make any difference to set the session variables in phpMyAdmin because it does not affect the session of the website when it connects. As you pointed out correctly, the only solution is to set the session variables by the client, immediately when the website connects to the database server. It is because of the unfortunate fact if no character set is specified, the connection defaults to latin1. (Defaulting to utf8 would support more characters and I would have had no problem).
The setting in .net takes place in the connection string (it also works for the ado.net data layer). By chance I have found the following description on the web:
If you do not specify the character set, the connection defaults to the latin1 charset. You can specify the character set as part of the connection string, for example:
MySqlConnection myConnection = new MySqlConnection("server=127.0.0.1;uid=root;" +
"pwd=12345;database=test;Charset=latin1;");
I simply had to enter Charset=utf8 and the problem has been solved.
Linux
1) The MySQL specific option of use_unicode is not needed anymore. 
The MySQL specific charset option also is deprecated. 
9.1.4. Connection Character Sets and Collations

Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:

    *  The server character set and collation can be determined from the values of the character_set_server and collation_server system variables.
    *      The character set and collation of the default database can be determined from the values of the character_set_database and collation_database system variables.

Additional character set and collation system variables are involved in handling traffic for the connection between a client and the server. Every client has connection-related character set and collation system variables.

Consider what a “connection” is: It is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:

    *   What character set is the statement in when it leaves the client?

      The server takes the character_set_client system variable to be the character set in which statements are sent by the client.
    *      What character set should the server translate a statement to after receiving it?

      For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.
    *      What character set should the server translate to before shipping result sets back to the client?

      The character_set_results system variable indicates the character set in which the server returns query results to the client. This includes result data such as column values, and result metadata such as column names.

Clients can fine-tune the settings for these variables, or depend on the defaults (in which case, you can skip the rest of this section). If you do not use the defaults, you must change the character settings for each connection to the server.

There are two statements that affect the connection-related character set variables as a group:

    *  SET NAMES 'charset_name' [COLLATE 'collation_name']

      SET NAMES indicates what character set the client will use to send SQL statements to the server. Thus, SET NAMES 'cp1251' tells the server, “future incoming messages from this client are in character set cp1251.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)

      A SET NAMES 'x' statement is equivalent to these three statements:
      SET character_set_client = x;
      SET character_set_results = x;
      SET character_set_connection = x;

      Setting each of these character set variables also sets its corresponding collation variable to the default correlation for the character set. For example, setting character_set_connection to x also sets collation_connection to the default collation for x. It is not necessary to set that collation explicitly. To specify a particular collation for the character sets, use the optional COLLATE clause:

      SET NAMES 'charset_name' COLLATE 'collation_name'

    * SET CHARACTER SET charset_name

      SET CHARACTER SET is similar to SET NAMES but sets character_set_connection and collation_connection to character_set_database and collation_database. A SET CHARACTER SET x statement is equivalent to these three statements:

      SET character_set_client = x;
      SET character_set_results = x;
      SET collation_connection = @@collation_database;

      Setting collation_connection also sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is not necessary to set character_set_connection explicitly.

Note

ucs2 cannot be used as a client character set, which means that it does not work for SET NAMES or SET CHARACTER SET.

The MySQL client programs mysql, mysqladmin, mysqlcheck, mysqlimport, and mysqlshow determine the default character set to use as follows:

    *  In the absence of other information, the programs use the compiled-in default character set, usually latin1.
    * The programs support a --default-character-set option, which enables users to specify the character set explicitly to override whatever default the client otherwise determines.

When a client connects to the server, it sends the name of the character set that it wants to use. The server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. In effect, the server performs a SET NAMES operation using the character set name.

With the mysql client, if you want to use a character set different from the default, you could explicitly execute SET NAMES every time you start up. However, to accomplish the same result more easily, you can add the --default-character-set option setting to your mysql command line or in your option file. For example, the following option file setting changes the three connection-related character set variables set to koi8r each time you invoke mysql:

[mysql]
default-character-set=latin1


If you are using the mysql client with auto-reconnect enabled (which is not recommended), it is preferable to use the charset command rather than SET NAMES. For example:

mysql> charset utf8
Charset changed

The charset command issues a SET NAMES statement, and also changes the default character set that mysql uses when it reconnects after the connection has dropped.

Example: Suppose that column1 is defined as CHAR(5) CHARACTER SET latin2. If you do not say SET NAMES or SET CHARACTER SET, then for SELECT column1 FROM t, the server sends back all the values for column1 using the character set that the client specified when it connected. On the other hand, if you say SET NAMES 'latin1' or SET CHARACTER SET latin1 before issuing the SELECT statement, the server converts the latin2 values to latin1 just before sending results back. Conversion may be lossy if there are characters that are not in both character sets.

If you do not want the server to perform any conversion of result sets or error messages, set character_set_results to NULL or binary:

SET character_set_results = NULL;

To see the values of the character set and collation system variables that apply to your connection, use these statements:

SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';

You must also consider the environment within which your MySQL applications execute. See Section 9.1.5, “Configuring the Character Set and Collation for Applications”.
===================================
Source 

With autoloaded tables you'll get unicode values if you use either
> use_unicode or convert_unicode, or str values if you don't.  With
> declared columns you'll also have to use the Unicode column type.
> 
> I had to use a my.cnf file to get the client and server charsets to
> match.  Otherwise I was getting:
> 
>  (u'character_set_client', u'latin1'),
>  (u'character_set_connection', u'latin1'),
>  (u'character_set_database', u'utf8'),
>  (u'character_set_filesystem', u'binary'),
>  (u'character_set_results', u'latin1'),
>  (u'character_set_server', u'utf8'),
>  (u'character_set_system', u'utf8'),
>  (u'character_sets_dir', u'/usr/share/mysql/charsets/'),
>  (u'collation_connection', u'latin1_swedish_ci'),
>  (u'collation_database', u'utf8_general_ci'),
>  (u'collation_server', u'utf8_general_ci'),
> 
> The my.cnf is a symlink to /etc/mysql/my.cnf and contains among other
> things:
> 
>     [client]
>     default-character-set  = utf8
> 
>     [mysqld]
>     character-set-server = utf8
>     default-character-set = utf8
> 
> The DSN looks like this:
> 
>     sqlalchemy.dbapi =
> mysql://...?read_default_file=%(here)s/my.cnf&use_unicode=1
> 
> Unfortunately MySQLdb does not raise an exception if the file is
> missing, so it's worth creating a unit test that checks whether all
> the character sets are correct.  I haven't tried that yet.
> 
> * * * *
> If you have existing records in the database which are in a different
> charset than what the column purports to be, and you want to keep the
> actual values as is, change the column type to BINARY in MySQL, then
> change it to the desired charset:
> 
> http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html

No comments: