distal-attribute
distal-attribute
distal-attribute
distal-attribute

Bypassing initial/intermittent database metadata requests (MySQL)

aphel posted 5 months ago in Feature discussion
I use HeidiSQL extensively to connect to large, remote MySQL database servers. HeidiSQL runs the following queries whenever switching databases or after a brief period of inactivity:

USE `database`;
SELECT `DEFAULT_COLLATION_NAME` FROM `information_schema`.`SCHEMATA` WHERE `SCHEMA_NAME`='database';
SHOW TABLE STATUS FROM `database`;
SHOW FUNCTION STATUS WHERE `Db`='database';
SHOW PROCEDURE STATUS WHERE `Db`='database';
SHOW TRIGGERS FROM `database`;
SHOW EVENTS FROM `database`;

These queries can lock up Heidi for a significant period of time (>10s due to network and size of databases), which can become frustrating.

It would be really helpful if Heidi had an option to automatically disable collecting this information or if the information were collected on background thread.
ansgar posted 5 months ago
Can you track down which of them actually takes long?
aphel posted 5 months ago
SHOW TABLE STATUS FROM appears to be causing the problem:

SHOW TABLE STATUS FROM `database1`;
/* Affected rows: 0 Found rows: 50 Warnings: 0 Duration for 1 query: 11.282 sec. */

SHOW TABLE STATUS FROM `database2`;
/* Affected rows: 0 Found rows: 52 Warnings: 3 Duration for 1 query: 13.143 sec. */
ansgar posted 5 months ago
* Any clue why SHOW TABLE STATUS is so slow on these both databases?
* 50/52 tables is not such a large amount of tables, but probably tables are very huge in size and the query gathers statistics about these every time?
* MyISAM tables or InnoDB?
* Which HeidiSQL version?
aphel posted 5 months ago
It seems to be due to size & possibly storage engine.

Database1 ~22 GiB, 50 tables: 50% InnoDB, 50% MyISAM
Database2 ~302GiB, 52 tables: 100% InnoDB

MySQL 5.5.16 (on OpenSUSE)

I ran this across other databases of varying size to give you an idea.

Database3 ~6.0 GiB, 100% InnoDB
SHOW TABLE STATUS FROM database3;
/* Affected rows: 0 Found rows: 41 Warnings: 0 Duration for 1 query: 5.212 sec. */

Database4 ~26 MiB, 100% InnoDB
SHOW TABLE STATUS FROM database4;
/* Affected rows: 0 Found rows: 9 Warnings: 0 Duration for 1 query: 0.125 sec. */

Database5 ~4 GiB, 70% MyISAM, 30% InnoDB
SHOW TABLE STATUS FROM database5;
/* Affected rows: 0 Found rows: 44 Warnings: 0 Duration for 1 query: 2.685 sec. */

Database6 ~2.5 GiB, 100% InnoDB
SHOW TABLE STATUS FROM database6;
/* Affected rows: 0 Found rows: 21 Warnings: 0 Duration for 1 query: 4.727 sec. */
ansgar posted 5 months ago
Thanks man, that shows a perfect picture of how query duration depends on size, and secondly storage engine. While InnoDB is significantly slower than MyISAM here, it looks like size in general matters.

I am thinking that I cannot know about size before the first SHOW TABLE STATUS has been fired. Probably I can measure the time of that query, remember it for later in that session and leave out the columns which trigger file access on the server:

SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE, `ENGINE`, VERSION, CREATE_OPTIONS, TABLE_COMMENT, TABLE_COLLATION
FROM information_schema.`TABLES`
WHERE
TABLE_SCHEMA='mydb'


Could you please test whether this query takes more than a few milliseconds on your largest databases? On my 6 to 10G databases they measure 0.00[something] seconds. Thanks!
aphel posted 5 months ago
The largest is Db2, which is 302GiB. Here's the result:

/* Affected rows: 0 Found rows: 52 Warnings: 0 Duration for 1 query: 0.016 sec. */

I'm not certain what all dependencies there are on the SHOW TABLE STATUS FROM db query but would it be somewhat simpler to have a setting that enables the collection of more or less information provided?
ansgar posted 5 months ago
Well, we already have a "Get full table status" setting per session (see "Advanced" tab). And the "Refresh" button has a "Full status refresh (Shift+F5)" in its pulldown menu. I'm thinking we don't need an additional setting here. The first SHOW TABLE STATUS in a database could really be executed normally, also to measure the time it takes. Later refreshes could depend on the duration it took. Let's say it was over 2 seconds, then the above faster query gets preferred.

This approach also makes sure you have at least once gotten the important columns such as table rows, data size and index size.

Does that makes sense to you? Or would you wonder why these values don't update on later refreshes?
aphel posted 5 months ago
Never saw the Advanced option. That's definitely helpful :)

I like your idea of the first SHOW TABLE STATUS as a way to measure the time and decide between whether to use the faster query or full query for subsequent requests.

Please login to leave a reply, or register at first.