distal-attribute
distal-attribute
distal-attribute
distal-attribute

Reconnection Hangs after Timeout

Harmit posted 4 years ago in General

I do a lot of different IP work, so starting Heidi along with many other applications and leaving them on is necessary.

The problem is, on some database servers I administrate, when I have not touched Heidi for a while and select a table or something, it needs to reconnect to the server. During this process it takes an extremely long time (60secs or more) to actually refresh. But if I use the Session Manager and double click on the connection there, it connects right away. So I know its not a connection problem.

I remember a "keep-alive" function that I can't find now. But it actively refreshed the program and was quite irratating when writing a query; it should do the keep-alive ping in the background.

Extra Note: The servers that I have problems with run a software-based firewall (iptables). But nothing is blocking my network so I can't see why this would be a problem.

Is there anything I can do to fix the reconnection timeout?

Has anyone else experienced this problem?

Thanks.
Harmit posted 4 years ago
Just wanted to add: I found the Auto-Refresh. It collapses the tree-view of database tables if you have the database selected. Pain in the butt when I need to see the table list on the left...
Harmit posted 4 years ago
Ugh and when refreshing, any field I am editing (like text) is reverted back to what it was before I started editing... Now I remember why I shut this off...
ansgar posted 4 years ago
There is no keep-alive feature in HeidiSQL, intentionally, as the reconnect logic should work. If it does not I'll fix that. Issue #2204 describes a problem in these cases, when the server is dead but this is not your problem. Could you examine the log window for what exact command takes such a long time, or post the log contents here so we can do that together here?
Harmit posted 4 years ago
There is the auto-refresh which is what I was referring to. I turned this on to prevent it from hanging but when it refreshes, anything that I didn't save gets wiped out and the layout of the directory tree (db/table list) gets reset to the default view.

As far as the command, all of them. Whatever I click on (table, data, refresh) takes a long time to do.
ansgar posted 4 years ago
So you are auto-refreshing your data grid while editing data in it or what?
Harmit posted 4 years ago
Exactly.

Just so we're clear, I'm identifying two problems.

The first one is the auto-refresh removing changes that I am making. Such as changing the content in a TEXT field or something.

The other is the hanging when it tries to refresh after timing out
ansgar posted 4 years ago
Ok, for the auto-refresh-changes-gone thingy: is it expected that when you have unsaved changes that the auto refresh is blocked or what?

The hanging thing is not reproducible here. Probably related to a special table you are in? Many rows, huge size, many columns, whatever, anything special there?
Harmit posted 4 years ago
Regarding "hanging" - Nothing identifiable. I'll try to find some common denominator.

Expectations on Auto-Refresh - I would expect that it does not change existing layout (see tree-view example described in 2nd post) or change anything that I am currently editing prior to Esc (Cancel) or running the query that is in queue.

Not sure if this is a "gotcha" or not, but, editing text fields can be confusing, because, the text-window that pops up has a green check-mark button to "Apply Changes". Because of the green check-mark and the context text, I thought that clicking this would UPDATE the record, but it doesn't.

To be clear, I would change the button and the context-text to a "Close Window" look and feel instead, since that is what it is doing.

If you end up fixing all this (except for the hanging thing) and you can count on a significant donation ;)

Harmit posted 4 years ago
I found something to do that doesn't hang it!

When it has been left along and it should give me the problem, if I right-click on the server in the tree-view before anything else, it refreshes all the databases very quickly. Then I can select any DB or Table without stalling.

So whatever the problem is that picked me as its buddy, a right-click-refresh on the server name at the top of the left-hand tree works around it. I can finally get around this thing!! Yay!


muzza4 posted 4 years ago
Hi Harmit

I have exactly the same problem, and boy it's annoying. I've raised this issue before but Anse cannot reproduce. Search for a topic called Keep Alive.

Your solution won't work for me. I just want a regular connection check under the bonnet to keep my sessions going.

Cheers
Muzza
Harmit posted 4 years ago
muzza: yeah, it only worked twice for me. So I think it was more of a coincidence than a fix. Sorry for misleading you.

There has to be something special with either A) the server or network we're connecting too, or, B) the client or network we are connecting from. I will eventually test on several different networks to see if there is a clue, but I've been too busy lately :(
muzza4 posted 4 years ago
Harmit

Probably a coincidence. Normally I get the delay we are discussing, but every 10 times or so...perfect.

Harmit posted 4 years ago
anse (ET AL): I found the problem.

I ran a packet sniffer and noticed that the application is sending a SLEEP() command. Some MySQL instances have a broken SLEEP command which will make it sleep for much longer than the time you define.

Heidi retransmitted the SLEEP command 8 times before continuing. This was the "hanging" issue.

Maybe an option to turn of persistent connections or in some way disconnect instead of sleep because it is not a trustworthy function in the MySQL library.

Anyway, that's the problem. If anyone has any temp-fixes, I'm all ears!
ansgar posted 4 years ago
HeidiSQL does not send any SLEEP command. Probably you mean the mysql_ping() call of the C-API which is done once before each query. Where did you get that this is broken in some MySQL instances?
Harmit posted 4 years ago
There have been quite a few SLEEP bugs identified in the bugs database of Mysql since back in 4.x. I remember SLEEP always being an issue from one degree to another, and the AB team couldn't duplicate a lot of them. Sound familiar? :)
I never had any problems with it, but I don't use it often either.

I'm using WireShark as a sniffer which has a built-in protocol translator. So it shows a short version of the command sent. I am the source of each request, and the output in WireShark's MySQL Protocol section is:

Command: SLEEP (0)
Payload: unknown/invalid command code

Each time there are 8 packets/commands sent before finally continuing. Packets 2 thru 8 are "TCP Retransmissions".

Harmit posted 4 years ago
I took another snap shot and to get a record of the "ping" you talked about. I see farther down that Wireshark displays it again. It surely didn't hang there, so I searched and found the setting so I can see the queries in the 'info' column. I think this is raw form which I wanted from the beginning, but I didn't wait long enough for the connection to timeout. I'll have to pick this up tomorrow; I'm exhausted.

Tomorrow I want to run promiscuous mode on the server to see what kind of traffic we're getting there during the sleep/ping anomaly.

But, I still don't think that Heidi is to blame, not yet anyway. I still think its either a MySQL server issue/bug/feature or a firewall issue and Heidi is not getting back that initial 'pong'. 3306 is open though, so its hard to imagine that being it.

Either way, it might be something Heidi can check for to alleviate potential issues in the future...
ansgar posted 4 years ago
Thanks a lot for taking a deeper look. And feel free to follow up here.
muzza4 posted 4 years ago
Hi Harmit

Since I have the same problem, let me know if there's anything you want me to test for - to verify the issue when identified.

Good work man (or woman).

Muzza
ansgar posted 4 years ago
Hehe. I'd appreciate any woman working on this or other Heidi related topic, only I guess that's not the case.

Harmit posted 4 years ago
I'm male, but I've been called a b|7ch before -- so one could get confused. ;)

I'll post back later today.
Harmit posted 4 years ago
Tested on server end with `tcpdump port 3306`

Absolutely no traffic is received at the server during the time it is hanging. After those 8 "ping/sleep" entries, it reconnects going through the MySQL server 'greetings' etc.

So, I'm left with either a Firewall issue or... I have no idea.

Let's look at this from a firewall perspective when the "ping" is sent. From what I can tell it uses the TCP protocol on the 3306 port. Our network uses a firewall NAT so internal servers are 192.168.*.* I also am a client who uses a NAT, so internally, I use 192.168.*.*.

I think, firewall NAT's can be finicky with timeouts, in other words if you wait a while before connecting, and try to re-establish a connection without the normal 3-way handshake, it thinks you might be trying to blind IP hijack.

This may be what is going on. The reconnect after a "certain time" is not handshaking first as a new connection.

That's the best I think I can do given the resources I have. If anyone can verify this, great. If I can think of anything else, I will post back.

muzza4 posted 4 years ago
Hi Anse

Harmit's done a great job, but this truly annoying issue that affects him and I and presumably other people is no closer to being resolved.

I think you agree that the problem is real. What's the chances of a process to retain connection i.e. 'if no connection in last x minutes then ...'?

Regards
Muzza
ansgar posted 4 years ago
What I will done to fix this server issue in mysql_ping()? Not much, I can't do anything I'm afraid. Doesn't matter if I would introduce a keep-alive feature which calls mysql_ping() from time to time, or as it is now with a call to mysql_ping() before each query.
Harmit posted 4 years ago
I would offer this as a possible feature addition:

Include a "Reconnect" button for connections that are suppose to be established. If I click the disconnect button, in order to reconnect, the entire application closes.

Also - Unless this is available somewhere (which I have not found), clicking on the disconnect button closes the application. Perhaps not closing the application when this button is pressed and have the user intentionally close it by clicking the [X] or File->Exit

At least with this, folks suffering from this problem can just hit the reconnect (not refresh!) button when to much time had passed.

Oh - remembering what was selected/viewed before reconnecting would be of great benefit. I know it would be aggregating to constantly expand the tree-view to the table I was looking at and then clicking the tab I was on (database, table, data, etc) and selecting the row or whatever I had selected before.

Harmit posted 4 years ago
.. awesome -- my browser dictionary replaced my typo's with the incorrect words. I'm sure you can understand what I was saying though. :)
ansgar posted 4 years ago
Harmit, this sounds more like a half baked workaround, not a solution. Could we please recapitulate what the exact problem is here? What exactly causes the long waiting time you are experiencing?
Harmit posted 4 years ago
Again - it seems to be associated to a user using a firewall between the client and server. Ours is a NAT firewall which could or could not be relevant. Being an admin of many different firewalls and packet analyzer applications, I know that they could time-out a legit connections if there has not been activity in XX seconds or minutes. It will just drop the packets instead of sending a RST. As a remote admin, I can't test this with the offending network, but as I've described in the previous posts, it all points at the Firewall being the only thing that could cause the problem.

I have seen this same issue when using the Filezilla FTP client -- when trying to refresh a directory after a long period of inactivity, it will reconnect if it doesn't receive a reply from the server right away (I guess they went through this before?). They may even apply a 'keep-alive', I'll have to look.

Bottom line, when this issue is happening:
Heidi sends ping packets to server but the server does not receive any packets at all. Heidi does not receive anything from the network in response.

Obviously, something is dropping these packets in between and the firewall is the only thing remaining.

Your options are to make available a "keep-alive" ping that runs in the background, or attempt a full reconnect when receiving a RST or no response during the ping process.

While what I suggested can be viewed as work-arounds, they are legit features that lends more options to connection management, regardless of this particular problem.

Example:
1) I don't want the application to close when I click on the "Disconnect" button. I don't even expect it to close -- I just want to disconnect from the server.

2) I do want to reconnect an existing connection for many different reasons. Right now, I can't do this. If I click the disconnect button, we're back to the problem listed on #1.

None of these are purposed to solve this particular problem, but this problem can be worked around by using these features if they were available. See?

ansgar posted 4 years ago
If a ping does not work for you, how would a keep-alive feature sending a ping help you in any way?

I'm not going to add work arounds for some firewall issues. These should be fixed in the firewall itself, not in HeidiSQL. Additionally you can increase the wait_timeout option in the MySQL server.
Harmit posted 4 years ago
anse - I'm not sure where the confusion is.

The firewall is correct in its configuration and operation. This is not the issue.

Keep-alive would keep the connection open instead of timing out on a legitimate NAT Firewall. So of course PING would go through if it is went every 30 or 60 seconds. Even the "Auto Refresh" option that exists now works. We would be able to use this except it refreshes the entire view and resets all the panes. That is not helpful when working in the application.

Again - what you are suggesting as "work-arounds" are not work-arounds at all. They are common features and functions found on many network service applications:

1) A button or option to 'reconnect' to an existing server
2) provide a connection refresh/keep-alive that pings in the background without affecting the user's view. Like the auto-refresh but it does not touch the UI content.
3) Keep the application running when the disconnect button is pressed on the last server. In-fact, closing the application was unexpected. I don't understand the point of closing the application all the way when disconnecting from a server.


So, I am suggesting that these same features can be used to overcome the problems a few of us have. But they are -not- work-arounds.

Oh, the wait_timeout option is irrelevant if the client packets do not reach the server ;)

Harmit posted 4 years ago
Oh yeah,

another option is to allow the user to set how many pings does it take before it decides to reconnect completely.

I recorded a ping 8 times before it reconnected. If I could set this to 2 times, it would reconnect faster.
ansgar posted 4 years ago
> the wait_timeout option is irrelevant if the client packets do not reach the server

Of course it would, as the server disconnects you, and increasing the wait interval before the server does this would help you, doesn't it?

I'm probably confused by too much text here in this thread. There are feature requests mixed with network problem descriptions, and much talking about firewalls, but nothing MySQL specific here. I'm lost.
Harmit posted 4 years ago
Well, for now, let's leave it as two feature requests:

1) A button or menu item to completely reconnect to a connected server

2) Keep the application open/running when clicking on the Disconnect button on the last server in the list.

ansgar posted 4 years ago
No. Please tell me again *in short* how to reproduce the above described slowness. Up to now we both did not consider this as a code-wise problem in Heidi which I should do probably. I'm not going to blindly add fixes for problems I never had reproduced here.
kipstafoo posted 3 years ago
Anse,

Were you able to reproduce this? I suffer from the same problem with one of my clients. Essentially, I need to VPN into my client's system. If I don't use Heidi for a while (15+ minutes? Don't know the exact time, to be honest), if I do anything in the UI it hangs. It will eventually come back... but by eventually I mean somewhere on the order of 30+ minutes.

I tend to just open a new window and do what I need to do there, but if I have a sql statement I've been executing for testing purposes, it's a bit frustrating to have it locked away in a hung app. :|

Anyway, if there is anything I can do to help provide info, please let me know.

-Cord Awtry
ansgar posted 3 years ago
There is still no short and precise reproduction recipe here in this ultra long topic. Sorry, no, I could not reproduce anything here yet.
kipstafoo posted 3 years ago
Anse,

I understand. Let me offer up what I see in 3 scenarios:

- Scenario #1 (local db) - I can leave the app open for days on end and every time I return to use it, simply reconnects and away we go within a second or two.

- Scenario #2 (remote db via vpn) - For the most part reacts the same way as the local db

- Scenario #3 (remote db via vpn tunnel) - This is a db where I have to connect to my client's VPN and then tunnel to the db server. This one will hang after 15+ minutes of not using. If I kill my putty session with my tunnel, the app will return after about 15 seconds. If I just let it run, it will return after 30+ minutes.

Not sure if any of this is actually useful info for you, but these are my observations.
ansgar posted 3 years ago
I'm confused. Are we talking about VPN or SSH tunnel?
kipstafoo posted 3 years ago
Both

I have to VPN into the customer's network then tunnel from inside the next to the db server.
ansgar posted 3 years ago
So, is scenario 3 the only problematic one?

kipstafoo posted 3 years ago
Aye... the other ones seem to work just fine.

Please login to leave a reply, or register at first.