SQL Snippets in ANSI only (not UTF-8)

Chillbo posted 10 years ago in Running SQL scripts
I'm in the process of migrating from http://www.navicat.com to HeidiSQL and have noticed a small problem.

The Load SQL from Textfile function requires the text file to be in ANSI encoding. All my saved SQL queries are in UTF-8.

Is there a way around this?
ansgar posted 10 years ago
Not sure what actually requires ANSI here. However, this is the relevant code for that part:

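tmpstr, filecontent : String;
f : TextFile;

AssignFile( f, filename );
Reset( f );
while not eof( f ) do
begin
  // Readln strips the line break, so it is re-appended here
  Readln( f, tmpstr );
  filecontent := filecontent + tmpstr + CRLF;
end;
CloseFile( f );
SynMemoQuery.SelText := filecontent;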

I guess the string variables and the ReadLn function don't like UTF8. Could also be that SynEdit doesn't like it, or even both (the latter would be quite painful to fix).
ansgar posted 10 years ago
A first step for me could be to use WideString instead of String variables. Any suggestions, Delphi hackers?
rosenfield posted 10 years ago
For an immediate fix, a workaround would be to use 'iconv' to convert the file from UTF-8 to, for example, Latin-1 (ANSI). Iconv is a Unix tool, but it can be installed on Windows via MKS Toolkit, Cygwin, SFU or similar, or run in a virtual machine running Kubuntu or whatnot. There's also an abundance of text editors that can convert files, although probably not in a batch fashion.

I don't think using a WideString will magically cause the Delphi compiler to use a version of Readln() that is Unicode-aware (but I don't know).

There's a magic marker (a BOM) at the beginning of proper Unicode files; the code to read the file would have to identify that and read the file per the format indicated. See:

The most common Unicode formats are UTF-8, UTF-16 BE and UTF-16 LE. As far as the others go, HeidiSQL could get away with throwing an exception (ultimately ending in the showing of an error message). UTF-32 BE/LE would be nice to support too, just because it's the simplest universal encoding form available (encompassing all characters from all languages) and as such might be of use to developers. It's not in widespread use though, since it generally requires compression to achieve space efficiency.
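To make the BOM idea concrete, here is a minimal sketch of sniffing the first bytes of a file. It is not HeidiSQL's code: the names TDetectedEncoding and DetectBom are made up for illustration, and it only reports which BOM (if any) is present, leaving the actual decoding to the caller. It assumes Delphi or Free Pascal in Delphi mode.

```pascal
program BomSniff;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical type for this sketch; not taken from HeidiSQL
  TDetectedEncoding = (deUnknown, deUtf8, deUtf16LE, deUtf16BE,
                       deUtf32LE, deUtf32BE);

// Reads up to four bytes and compares them against the known BOM signatures.
// BomLength receives the number of bytes to skip before the real content.
function DetectBom(const FileName: string;
  out BomLength: Integer): TDetectedEncoding;
var
  Stream: TFileStream;
  Buf: array[0..3] of Byte;
  Count: Integer;
begin
  Result := deUnknown;
  BomLength := 0;
  FillChar(Buf, SizeOf(Buf), 0);
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Count := Stream.Read(Buf, SizeOf(Buf));
  finally
    Stream.Free;
  end;

  // UTF-32 LE is tested before UTF-16 LE because both start with FF FE.
  // A UTF-16 LE file whose first character is U+0000 would be misread here.
  if (Count >= 4) and (Buf[0] = $FF) and (Buf[1] = $FE)
    and (Buf[2] = $00) and (Buf[3] = $00) then
  begin
    Result := deUtf32LE;
    BomLength := 4;
  end
  else if (Count >= 4) and (Buf[0] = $00) and (Buf[1] = $00)
    and (Buf[2] = $FE) and (Buf[3] = $FF) then
  begin
    Result := deUtf32BE;
    BomLength := 4;
  end
  else if (Count >= 3) and (Buf[0] = $EF) and (Buf[1] = $BB)
    and (Buf[2] = $BF) then
  begin
    Result := deUtf8;
    BomLength := 3;
  end
  else if (Count >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE) then
  begin
    Result := deUtf16LE;
    BomLength := 2;
  end
  else if (Count >= 2) and (Buf[0] = $FE) and (Buf[1] = $FF) then
  begin
    Result := deUtf16BE;
    BomLength := 2;
  end;
end;

var
  Len: Integer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: BomSniff <file>');
    Halt(1);
  end;
  case DetectBom(ParamStr(1), Len) of
    deUtf8:    WriteLn('UTF-8 BOM, skip ', Len, ' bytes');
    deUtf16LE: WriteLn('UTF-16 LE BOM, skip ', Len, ' bytes');
    deUtf16BE: WriteLn('UTF-16 BE BOM, skip ', Len, ' bytes');
    deUtf32LE: WriteLn('UTF-32 LE BOM, skip ', Len, ' bytes');
    deUtf32BE: WriteLn('UTF-32 BE BOM, skip ', Len, ' bytes');
  else
    WriteLn('No BOM found; an encoding has to be assumed');
  end;
end.
```

A loader built on this would branch on the result, skip BomLength bytes and decode the rest accordingly; the deUnknown case is where an error message or an assumed default would come in.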

If you want to do it yourself, Googling...:

...yields results:

But they are far from perfect; the two above do not support complex encodings such as UTF-8, for example.

There are various components that will do the job for you. Here's one with an unspecified license and unclear maintainership, as is the norm for these things:

There are also open source components. SynEdit comes in a Unicode version, which has a LoadFromFile function that will load Unicode files. It's in the official SynEdit VCS by now, see:
ansgar posted 9 years ago
Btw, loading and saving SQL files and snippets should work in the latest build for any charset, including UTF8 and ANSI.
rosenfield posted 9 years ago
There is BOM detection code both in the WideStrings (iirc) Delphi unit and the TNT code recently added to HeidiSQL, so using that to detect Unicode files and load with the correct decoding if a UTF-8 BOM is found should be pretty straightforward.

But HeidiSQL does no such thing yet. File load does not work correctly; it assumes ANSI as of r1388. File save is ASCII, at least for SQL export.

There is also an issue with non-BOMmed files. Non-BOM text files generated by other apps could be in a variety of code pages. The most flexible approach would be to let the user choose, per file, at least between ASCII, the ANSI codepage and UTF-8, but that's incidentally also the most complex to implement. Perhaps an option in preferences to switch between ASCII, the local ANSI codepage and UTF-8 for undetectable stuff would do the trick.
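As a rough illustration of such a preference, the sketch below decodes BOM-less bytes according to a user-selected fallback. Everything named here (TFallbackEncoding, DecodeBomless) is hypothetical and not part of HeidiSQL; it assumes Delphi or Free Pascal in Delphi mode, and it relies on the RTL's UTF8Decode and on the default AnsiString to WideString conversion, which uses the system code page.

```pascal
program BomlessFallback;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical preference value for files without a BOM
  TFallbackEncoding = (feAscii, feAnsi, feUtf8);

// Turns raw file bytes into a WideString using the chosen fallback.
function DecodeBomless(const Raw: AnsiString;
  Fallback: TFallbackEncoding): WideString;
var
  i: Integer;
begin
  case Fallback of
    feUtf8:
      // RTL routine; its behavior on malformed input depends on the RTL
      Result := UTF8Decode(Raw);
    feAnsi:
      // Converts using the system's ANSI code page
      Result := WideString(Raw);
  else
    begin
      // ASCII: keep 7-bit characters, replace everything else with '?'
      SetLength(Result, Length(Raw));
      for i := 1 to Length(Raw) do
        if Ord(Raw[i]) < 128 then
          Result[i] := WideChar(Ord(Raw[i]))
        else
          Result[i] := '?';
    end;
  end;
end;

var
  Raw: AnsiString;
begin
  // Two bytes that form a single character (U+00E9) when read as UTF-8
  SetLength(Raw, 2);
  Raw[1] := AnsiChar($C3);
  Raw[2] := AnsiChar($A9);
  WriteLn('Length as UTF-8: ', Length(DecodeBomless(Raw, feUtf8)));
  WriteLn('Length as ASCII: ', Length(DecodeBomless(Raw, feAscii)));
end.
```

The same two bytes come out as a different number of characters depending on the setting, which is exactly why BOM-less files need either a user choice or a documented default.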
