How to get the character set right

User, date Message
Written by kwdavids1
2 years ago
Category: Import/Export
6 posts since Wed, 01 Feb 12
I'm exporting and importing with HeidiSQL. Both databases are UTF-8, but the tables are Latin - Swedish for some reason. On import, I've tried various combinations for the encoding. (This is a WordPress database.)

The problem is that non-Latin characters (Cyrillic, quotation marks, ellipses and dashes) are replaced with odd 3-character sequences. How do I do it right so the character sets line up?
Written by BubikolRamios
2 years ago
327 posts since Thu, 14 Jan 10
1. Did you try drop/create table on export?
2. Are you viewing the exported data with the same client?
3. Is the exported data on the same OS?
Written by kwdavids1
2 years ago
6 posts since Wed, 01 Feb 12
1. The tables are being created (did not exist before).
2. Yes I am viewing with WordPress set to UTF-8 on both ends.
3. It's Linux both ends.
Written by ansgar
2 years ago
4940 posts since Fri, 07 Apr 06
Where do you see those "wrong" 3-character sequences? In WordPress or HeidiSQL?

Which HeidiSQL version is it?

You could open the exported file with a text editor and check whether that file is already broken. If yes, I guess the data in the existing database is already broken.
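
A minimal sketch of that check in Python, assuming the typical damage is UTF-8 text that was misread as Windows-1252 and saved again. The function names are hypothetical, and the heuristic only flags candidate lines; it does not prove the file is broken:

```python
def looks_double_encoded(line):
    """Heuristic: True if the line can be read as UTF-8 bytes that were
    misdecoded as Windows-1252 (the usual source of 3-character garbage)."""
    if line.isascii():
        return False  # plain ASCII cannot show this kind of damage
    try:
        # Reverse the suspected mistake: back to bytes, then decode as UTF-8.
        repaired = line.encode("cp1252").decode("utf-8")
    except UnicodeError:
        return False  # the line does not fit the pattern
    return repaired != line


def suspicious_lines(sql_text):
    """Return the 1-based numbers of lines that look double-encoded."""
    return [
        number
        for number, line in enumerate(sql_text.splitlines(), start=1)
        if looks_double_encoded(line)
    ]
```

To use it, read the export as UTF-8 (for example `open(path, encoding="utf-8").read()`, where `path` is whatever file HeidiSQL wrote) and pass the text to `suspicious_lines`.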
Written by kalvaro
2 years ago
587 posts since Thu, 29 Nov 07
If "both databases are UTF-8, but the tables are Latin - Swedish for some reason", then your data is not using UTF-8: it's using Latin 1. The database encoding is just a default that is used, e.g., when you create a table and don't specify a charset. In the Latin 1 charset it's impossible to store Cyrillic characters.
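
To illustrate the last point, a small Python sketch. It uses Python's cp1252 codec as a stand-in for MySQL's latin1 (an approximation; the two are close but not identical), and the helper name is made up for this example:

```python
def fits_in_latin1(text):
    """Return True if every character of text has a byte in the
    Windows-1252 code page (used here to approximate MySQL latin1)."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


cyrillic = "\u041f\u0440\u0438\u0432\u0435\u0442"  # a Cyrillic word
accented = "caf\u00e9"                              # Western European text

print(fits_in_latin1(cyrillic))  # Cyrillic has no slot in this code page
print(fits_in_latin1(accented))  # accented Latin letters do
```

If Cyrillic nevertheless displays correctly from a latin1 table, the stored bytes are most likely UTF-8 that was written through a Latin 1 connection, which is the situation where export and import settings start to matter.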
Written by kwdavids1
2 years ago
6 posts since Wed, 01 Feb 12
I'm using the latest nightly build of HeidiSQL.

When I view the source data in WordPress, I see left quotation marks, ellipses and Cyrillic.

Looking at the HeidiSQL export SQL file I see: CREATE TABLE ... DEFAULT CHARSET=latin1

When I look at the HeidiSQL export SQL file with Notepad++, I see stuff like:

…

If you can't see that, it's a capital A with a tilde over it, a cent sign, a lowercase a with grave, a comma, a logical not sign, another tilde A and a vertical bar.

which is the same thing I see in the target WordPress blog.

Both WordPress installations were set up with UTF-8 as the character set.

Unfortunately, I cannot get at the original data any more.
Written by jfalch
2 years ago
382 posts since Sat, 17 Oct 09
This is probably the character '_', encoded twice as UTF-8; i.e., if you decode the above string from UTF-8 to (e.g.) Windows-1252, you get …, and decoding that again yields _ .
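
A short Python sketch of that round trip, assuming the character in question is a horizontal ellipsis (U+2026) and that the wrong charset involved is Windows-1252; both are guesses based on the description earlier in the thread, and the function names are only illustrative:

```python
def misread_as_cp1252(text):
    """Simulate one round of damage: UTF-8 bytes decoded as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def undo_misread(text):
    """Reverse one round of damage: re-encode as Windows-1252, decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


original = "\u2026"                          # one character: the ellipsis
once = misread_as_cp1252(original)           # three odd characters
twice = misread_as_cp1252(once)              # seven odd characters

print(len(original), len(once), len(twice))  # 1 3 7
print(undo_misread(undo_misread(twice)) == original)  # True
```

One round of damage gives the 3-character sequences from the first post; two rounds give a 7-character string like the one described from Notepad++.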
Written by jfalch
2 years ago
382 posts since Sat, 17 Oct 09
via utf8-decoder
Written by kwdavids1
2 years ago
6 posts since Wed, 01 Feb 12
This turned out to be a WordPress installation error and nothing to do with the import/export. Sorry to have troubled you.
Written by ansgar
2 years ago
4940 posts since Fri, 07 Apr 06
Thank you for the update!
 
