Bug: importing UTF-8 CSV

[expired user #5483]'s profile image [expired user #5483] posted 13 years ago in Import/Export Permalink
I've got some UTF-8 encoded CSV files (with field names in the first line) to import and noticed that when choosing
"ignore first [1] lines" not only the line containing the field names but also one line of data is not imported into the database table.

My workaround is to delete the first line (w/ the field names) in Vim text editor an to tell Heidi not to ignore any lines.

I use Heidi 6.0.0.3603 on Windows Server 2003 running MySQL 5.0 server.

cheers,
Brazil
ansgar's profile image ansgar posted 13 years ago Permalink
Could you try the latest build to see if that is fixed? I recall there was some changed stuff after the 6.0 release. Thanks!
ansgar's profile image ansgar posted 13 years ago Permalink
Did you select "Client parser" or "Server parser" ? Server would mean that this problem was caused by the LOAD DATA command.
[expired user #5483]'s profile image [expired user #5483] posted 13 years ago Permalink
I selected "Server Parser" ('cos is the default), ... with "Client parser" it works fine!

The behaviour of r_3659 is exactly the same as with 3603.
[expired user #5483]'s profile image [expired user #5483] posted 13 years ago Permalink
btw: heidi does not import if there's an Umlaut (e.g. 'Ü') in the pathname, i get an "file not found" error instead.
ansgar's profile image ansgar posted 13 years ago Permalink
Thanks for the infos. Will fix the path problem then. Server parsing is buggy then? Or is it some broken character in your CSV file?
[expired user #5483]'s profile image [expired user #5483] posted 13 years ago Permalink
I guess it's Server parsing as i can not see any errors in my CSV files (all created by Open Office Calc). I've just sent you an e-mail with the CREATE TABLE statement and two fitting (small) sample CSV files for testing.
ansgar's profile image ansgar posted 13 years ago Permalink
Just tested your CSV file with headers after creating the table. Import dialog generates this query and imports 3 rows - which is what the file contains:

LOAD DATA LOCAL INFILE 'C:\\Users\\beckera\\Downloads\\BlackMetabase\\BlackMetabase_w-headers.csv' INTO TABLE `test`.`BlackMetabase` CHARACTER SET utf8 FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (`DocID`, `Title_de`, `Title_it`, `Title_en`, `Title_fr`, `Title_es`, `Text`, `MenuImage`, `Publish`, `PublishStart`, `PublishEnd`, `PublishGroups`, `PublishLang`, `Portal`, `PortalStart`, `PortalEnd`, `Version`, `TimeCreate`, `TimeModify`, `UserCreate`, `UserModify`, `DocType`, `DocImage`, `Users`, `Groups`, `Content`, `System`, `Trash`, `TemplateID`);


Could you please compare the SQL to what your log panel says? Guess there is some different setting in place here.
[expired user #5483]'s profile image [expired user #5483] posted 13 years ago Permalink
Will check that in the office tomorrow.

Could it be relevant that the MySQL server (didn't see the wood for trees earlier;-) ist not the local one on the Windows Server? I've also connected HeidiSQL to a remote machine, running MySQL 5.x on debian linux -- an that's the one where i was importing the tables into.
[expired user #5483]'s profile image [expired user #5483] posted 13 years ago Permalink
When importing with "Server parsing" i get this:

LOAD DATA LOW_PRIORITY LOCAL INFILE 'D:\\turkei\\BlackMetabase_w-headers.csv' REPLACE INTO TABLE `groupe_tr`.`BlackMetabase` CHARACTER SET utf8 FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (`DocID`, `Title_de`, `Title_it`, `Title_en`, `Title_fr`, `Title_es`, `Text`, `MenuImage`, `Publish`, `PublishStart`, `PublishEnd`, `PublishGroups`, `PublishLang`, `Portal`, `PortalStart`, `PortalEnd`, `Version`, `TimeCreate`, `TimeModify`, `UserCreate`, `UserModify`, `DocType`, `DocImage`, `Users`, `Groups`, `Content`, `System`, `Trash`, `TemplateID`);
/* 2 rows imported in 0,062 seconds. */


switching to "Client parsing" produces:

REPLACE LOW_PRIORITY INTO `groupe_tr`.`BlackMetabase` (`DocID`, `Title_de`, `Title_it`, `Title_en`, `Title_fr`, `Title_es`, `Text`, `MenuImage`, `Publish`, `PublishStart`, `PublishEnd`, `PublishGroups`, `PublishLang`, `Portal`, `PortalStart`, `PortalEnd`, `Version`, `TimeCreate`, `TimeModify`, `UserCreate`, `UserModify`, `DocType`, `DocImage`, `Users`, `Groups`, `Content`, `System`, `Trash`, `TemplateID`) VALUES ('2', 'Quicklist', 'Hızlı Erişim', 'Quicklist', 'Quicklist', 'Quicklist', '', 'rs_tree.gif', '1', '0', '-1', '|3000|', '', '0', '0', '0', '1', '0', '0', '0', '0', 'fol', '', '|', '|3000,3|', '0', 'd', '0', '0'), ('3', 'Trash', 'Geri Dönüşüm Kutusu', 'Trash', 'Trash', 'Trash', '', 'trash.gif', '1', '0', '-1', '|3000|', '', '0', '0', '0', '1', '0', '0', '0', '0', 'trash', '', '|', '|3000,3|', '0', 't', '1', '0'), ('1', 'Infopool DE', 'Infopool TR', 'Infopool EN', 'Infopool FR', 'Infopool ES', '', 'rs_tree.gif', '1', '-1', '-1', '|3000|', '', '0', '-1', '-1', '1', '0', '0', '0', '0', 'fol', '', '|', '|3000,1|', '0', 's', '0', '0');
/* 3 rows imported in 0,234 seconds. */


... all other settings in the import dialog are left to default; tested with r_3659. It's MySQL 5.0.51 (with all current updates) running on the remote debian linux host.
ansgar's profile image ansgar posted 13 years ago Permalink
Hm you have
ESCAPED BY '"'
which is surely the cause here - leave that field blank and you will get 3 rows I'm nearly sure.
[expired user #5483]'s profile image [expired user #5483] posted 13 years ago Permalink
Right, when i clear the ESCAPED BY field, all three rows are imported in "Server parsing". I think a blank ESCAPED BY should be the default.

Please login to leave a reply, or register at first.