UTF-8 Editing not possible

[expired user #4206]'s profile image [expired user #4206] posted 15 years ago in General Permalink
Hi Folks,

It would be great if i could use HeidiSQL for Editing Data in UTF-8.

This is a problem i've been having for a while with all the HeidiSQL-Verisons in the past and the current, too. Now it's getting quite critical because i have to edit loads of data by hand using HeidiSQL.

My MySQL DB (5.1.33) contains UTF-8 Data for a Web-Project. Most of the Data is beeing imported or created thru the web-interface of the project, so there usually is no problem.

Sometimes it comes handy to edit some data directly in the database - this is where the problem beginns.

Problem:
HeidiSQL won't display the stored UTF-8 data properly. Foreign Characters are messed up. When i try to edit a field and put some foreign characters into it, it looks fine in HeidiSQL, but is messed up on the website output.

Settings:

[u:df60b2830c]DB MySQL 5.1 [/u:df60b2830c]
Character set: Utf-8
Collation: utf9_general_ci

[u:df60b2830c]TABLE[/u:df60b2830c]
Default collation: utf8_general_ci

[u:df60b2830c]WEBSITE output[/u:df60b2830c]
content="text/html; charset=UTF-8"


I would like to unterstand, why this is happening and how to solve this.

Can anyone help me?

Regards, Peter
ansgar's profile image ansgar posted 15 years ago Permalink
Does more sound like your website has some problem, not HeidiSQL :)

To analyze that it would be helpful if you would post the table creation SQL - just click the table in HeidiSQL and copy the code from the "CREATE code" tab here.
[expired user #4206]'s profile image [expired user #4206] posted 15 years ago Permalink
Hi Anse,

here's the creation SQL:

CREATE TABLE `my_table` (
`id` INT(5) UNSIGNED NOT NULL AUTO_INCREMENT,
`firma` VARCHAR(255) NULL DEFAULT NULL,
`firma_bez` VARCHAR(255) NULL DEFAULT NULL,
`haendler` INT(1) UNSIGNED NULL DEFAULT NULL,
`vertretung` INT(1) UNSIGNED NULL DEFAULT NULL,
`prio` INT(1) UNSIGNED NULL DEFAULT NULL,
`anrede` VARCHAR(50) NULL DEFAULT NULL COLLATE utf8_general_ci,
`name_first` VARCHAR(255) NULL DEFAULT NULL,
`name_last` VARCHAR(255) NULL DEFAULT NULL,
`str` VARCHAR(255) NULL DEFAULT NULL,
`str_zusatz` VARCHAR(255) NULL DEFAULT NULL,
`land` VARCHAR(50) NULL DEFAULT NULL,
`plz` VARCHAR(10) NULL DEFAULT NULL,
`ort` VARCHAR(255) NULL DEFAULT NULL,
`tel_1` VARCHAR(255) NULL DEFAULT NULL,
`tel_2` VARCHAR(255) NULL DEFAULT NULL,
`tel_3` VARCHAR(255) NULL DEFAULT NULL,
`fax` VARCHAR(255) NULL DEFAULT NULL,
`email` VARCHAR(255) NULL DEFAULT NULL,
`internet` VARCHAR(255) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `id` (`id`),
INDEX `id_2` (`id`)
)
COMMENT='My Comment'
COLLATE=utf8_general_ci
ENGINE=MyISAM
ROW_FORMAT=DYNAMIC
AUTO_INCREMENT=1119
AVG_ROW_LENGTH=158


I tried to output the Data edited in HSql in a clean UTF-8 XHTML document, with the same result. Honestly i don't know why both, my CMS and a simple handcoded output produce broken foreign characters.

It always thought that straight forward output of MySQL-Data with PHP should reproduce the Characters stored in the DB.

regards, petameter
[expired user #1125]'s profile image [expired user #1125] posted 15 years ago Permalink
Various comments.

* Agreeing with #2, I think HeidiSQL works correctly, unless you hack it with something like a SET NAMES, so it's probably the web application that has a problem.

* Your browser will detect if the character set from Content-Type (I assume that's the name of the meta-header you pasted) seems incorrect. Meaning that your browser may display the page correctly even though the Content-Type header is wrong. The various heuristics used includes pretty advanced stuff such as looking at the count of various letters and guessing a language based on that, etc.

You can, in some browsers, temporarily turn auto-detection off via a main menu item.

* The PHP team officially stated that they will hold back on native Unicode support until version 6. Of course the demand is great, and so a lot of Unicode features has already creeped into PHP5. But when you get down to the basics, it's a non-Unicode product, and various core glue such as the MySQL driver probably still works in ANSI mode.

You can hack the MySQL server to send UTF-8 or latin1 into PHP with "SET NAMES utf8" and "SET NAMES latin1". The server does the conversion before data hits the network/whatever.

This could help if PHP does something silly, such as detecting your UTF-8 Content-Type header, thinking it's internal strings (from the mysql database) are in latin1, and that it better do a conversion from latin1 to utf8. Such automagic could break your output because the strings were already in utf8. Or whatever, lots of possibilities for breakage without native support.

Can you please provide:
SHOW VARIABLES LIKE 'character_set%' from within PHP.

Also, a HEX(<column>) of a piece of data and a HTTP link to the failing page showing said piece of data (or a "curl http://blah/ | od -tx1") would be helpful in finding out what the input versus output is.

* Input from browsers are normally encoded in the character set of the page holding the HTML form which was sent to the server. Depends on the browser, though.

Please login to leave a reply, or register at first.