Reconnection Hangs after Timeout
I do a lot of different IP work, so starting Heidi along with many other applications and leaving them on is necessary.
The problem is, on some database servers I administrate, when I have not touched Heidi for a while and select a table or something, it needs to reconnect to the server. During this process it takes an extremely long time (60secs or more) to actually refresh. But if I use the Session Manager and double click on the connection there, it connects right away. So I know its not a connection problem.
I remember a "keep-alive" function that I can't find now. But it actively refreshed the program and was quite irratating when writing a query; it should do the keep-alive ping in the background.
Extra Note: The servers that I have problems with run a software-based firewall (iptables). But nothing is blocking my network so I can't see why this would be a problem.
Is there anything I can do to fix the reconnection timeout?
Has anyone else experienced this problem?
Thanks.
There is no keep-alive feature in HeidiSQL, intentionally, as the reconnect logic should work. If it does not I'll fix that. Issue #2204 describes a problem in these cases, when the server is dead but this is not your problem. Could you examine the log window for what exact command takes such a long time, or post the log contents here so we can do that together here?
There is the auto-refresh which is what I was referring to. I turned this on to prevent it from hanging but when it refreshes, anything that I didn't save gets wiped out and the layout of the directory tree (db/table list) gets reset to the default view.
As far as the command, all of them. Whatever I click on (table, data, refresh) takes a long time to do.
As far as the command, all of them. Whatever I click on (table, data, refresh) takes a long time to do.
Ok, for the auto-refresh-changes-gone thingy: is it expected that when you have unsaved changes that the auto refresh is blocked or what?
The hanging thing is not reproducible here. Probably related to a special table you are in? Many rows, huge size, many columns, whatever, anything special there?
The hanging thing is not reproducible here. Probably related to a special table you are in? Many rows, huge size, many columns, whatever, anything special there?
Regarding "hanging" - Nothing identifiable. I'll try to find some common denominator.
Expectations on Auto-Refresh - I would expect that it does not change existing layout (see tree-view example described in 2nd post) or change anything that I am currently editing prior to Esc (Cancel) or running the query that is in queue.
Not sure if this is a "gotcha" or not, but, editing text fields can be confusing, because, the text-window that pops up has a green check-mark button to "Apply Changes". Because of the green check-mark and the context text, I thought that clicking this would UPDATE the record, but it doesn't.
To be clear, I would change the button and the context-text to a "Close Window" look and feel instead, since that is what it is doing.
If you end up fixing all this (except for the hanging thing) and you can count on a significant donation ;)
Expectations on Auto-Refresh - I would expect that it does not change existing layout (see tree-view example described in 2nd post) or change anything that I am currently editing prior to Esc (Cancel) or running the query that is in queue.
Not sure if this is a "gotcha" or not, but, editing text fields can be confusing, because, the text-window that pops up has a green check-mark button to "Apply Changes". Because of the green check-mark and the context text, I thought that clicking this would UPDATE the record, but it doesn't.
To be clear, I would change the button and the context-text to a "Close Window" look and feel instead, since that is what it is doing.
If you end up fixing all this (except for the hanging thing) and you can count on a significant donation ;)
I found something to do that doesn't hang it!
When it has been left along and it should give me the problem, if I right-click on the server in the tree-view before anything else, it refreshes all the databases very quickly. Then I can select any DB or Table without stalling.
So whatever the problem is that picked me as its buddy, a right-click-refresh on the server name at the top of the left-hand tree works around it. I can finally get around this thing!! Yay!
When it has been left along and it should give me the problem, if I right-click on the server in the tree-view before anything else, it refreshes all the databases very quickly. Then I can select any DB or Table without stalling.
So whatever the problem is that picked me as its buddy, a right-click-refresh on the server name at the top of the left-hand tree works around it. I can finally get around this thing!! Yay!
Hi Harmit
I have exactly the same problem, and boy it's annoying. I've raised this issue before but Anse cannot reproduce. Search for a topic called Keep Alive.
Your solution won't work for me. I just want a regular connection check under the bonnet to keep my sessions going.
Cheers
Muzza
I have exactly the same problem, and boy it's annoying. I've raised this issue before but Anse cannot reproduce. Search for a topic called Keep Alive.
Your solution won't work for me. I just want a regular connection check under the bonnet to keep my sessions going.
Cheers
Muzza
muzza: yeah, it only worked twice for me. So I think it was more of a coincidence than a fix. Sorry for misleading you.
There has to be something special with either A) the server or network we're connecting too, or, B) the client or network we are connecting from. I will eventually test on several different networks to see if there is a clue, but I've been too busy lately :(
There has to be something special with either A) the server or network we're connecting too, or, B) the client or network we are connecting from. I will eventually test on several different networks to see if there is a clue, but I've been too busy lately :(
anse (ET AL): I found the problem.
I ran a packet sniffer and noticed that the application is sending a SLEEP() command. Some MySQL instances have a broken SLEEP command which will make it sleep for much longer than the time you define.
Heidi retransmitted the SLEEP command 8 times before continuing. This was the "hanging" issue.
Maybe an option to turn of persistent connections or in some way disconnect instead of sleep because it is not a trustworthy function in the MySQL library.
Anyway, that's the problem. If anyone has any temp-fixes, I'm all ears!
I ran a packet sniffer and noticed that the application is sending a SLEEP() command. Some MySQL instances have a broken SLEEP command which will make it sleep for much longer than the time you define.
Heidi retransmitted the SLEEP command 8 times before continuing. This was the "hanging" issue.
Maybe an option to turn of persistent connections or in some way disconnect instead of sleep because it is not a trustworthy function in the MySQL library.
Anyway, that's the problem. If anyone has any temp-fixes, I'm all ears!
There have been quite a few SLEEP bugs identified in the bugs database of Mysql since back in 4.x. I remember SLEEP always being an issue from one degree to another, and the AB team couldn't duplicate a lot of them. Sound familiar? :)
I never had any problems with it, but I don't use it often either.
I'm using WireShark as a sniffer which has a built-in protocol translator. So it shows a short version of the command sent. I am the source of each request, and the output in WireShark's MySQL Protocol section is:
Command: SLEEP (0)
Payload: unknown/invalid command code
Each time there are 8 packets/commands sent before finally continuing. Packets 2 thru 8 are "TCP Retransmissions".
I never had any problems with it, but I don't use it often either.
I'm using WireShark as a sniffer which has a built-in protocol translator. So it shows a short version of the command sent. I am the source of each request, and the output in WireShark's MySQL Protocol section is:
Command: SLEEP (0)
Payload: unknown/invalid command code
Each time there are 8 packets/commands sent before finally continuing. Packets 2 thru 8 are "TCP Retransmissions".
I took another snap shot and to get a record of the "ping" you talked about. I see farther down that Wireshark displays it again. It surely didn't hang there, so I searched and found the setting so I can see the queries in the 'info' column. I think this is raw form which I wanted from the beginning, but I didn't wait long enough for the connection to timeout. I'll have to pick this up tomorrow; I'm exhausted.
Tomorrow I want to run promiscuous mode on the server to see what kind of traffic we're getting there during the sleep/ping anomaly.
But, I still don't think that Heidi is to blame, not yet anyway. I still think its either a MySQL server issue/bug/feature or a firewall issue and Heidi is not getting back that initial 'pong'. 3306 is open though, so its hard to imagine that being it.
Either way, it might be something Heidi can check for to alleviate potential issues in the future...
Tomorrow I want to run promiscuous mode on the server to see what kind of traffic we're getting there during the sleep/ping anomaly.
But, I still don't think that Heidi is to blame, not yet anyway. I still think its either a MySQL server issue/bug/feature or a firewall issue and Heidi is not getting back that initial 'pong'. 3306 is open though, so its hard to imagine that being it.
Either way, it might be something Heidi can check for to alleviate potential issues in the future...
Tested on server end with `tcpdump port 3306`
Absolutely no traffic is received at the server during the time it is hanging. After those 8 "ping/sleep" entries, it reconnects going through the MySQL server 'greetings' etc.
So, I'm left with either a Firewall issue or... I have no idea.
Let's look at this from a firewall perspective when the "ping" is sent. From what I can tell it uses the TCP protocol on the 3306 port. Our network uses a firewall NAT so internal servers are 192.168.*.* I also am a client who uses a NAT, so internally, I use 192.168.*.*.
I think, firewall NAT's can be finicky with timeouts, in other words if you wait a while before connecting, and try to re-establish a connection without the normal 3-way handshake, it thinks you might be trying to blind IP hijack.
This may be what is going on. The reconnect after a "certain time" is not handshaking first as a new connection.
That's the best I think I can do given the resources I have. If anyone can verify this, great. If I can think of anything else, I will post back.
Absolutely no traffic is received at the server during the time it is hanging. After those 8 "ping/sleep" entries, it reconnects going through the MySQL server 'greetings' etc.
So, I'm left with either a Firewall issue or... I have no idea.
Let's look at this from a firewall perspective when the "ping" is sent. From what I can tell it uses the TCP protocol on the 3306 port. Our network uses a firewall NAT so internal servers are 192.168.*.* I also am a client who uses a NAT, so internally, I use 192.168.*.*.
I think, firewall NAT's can be finicky with timeouts, in other words if you wait a while before connecting, and try to re-establish a connection without the normal 3-way handshake, it thinks you might be trying to blind IP hijack.
This may be what is going on. The reconnect after a "certain time" is not handshaking first as a new connection.
That's the best I think I can do given the resources I have. If anyone can verify this, great. If I can think of anything else, I will post back.
Hi Anse
Harmit's done a great job, but this truly annoying issue that affects him and I and presumably other people is no closer to being resolved.
I think you agree that the problem is real. What's the chances of a process to retain connection i.e. 'if no connection in last x minutes then ...'?
Regards
Muzza
Harmit's done a great job, but this truly annoying issue that affects him and I and presumably other people is no closer to being resolved.
I think you agree that the problem is real. What's the chances of a process to retain connection i.e. 'if no connection in last x minutes then ...'?
Regards
Muzza
I would offer this as a possible feature addition:
Include a "Reconnect" button for connections that are suppose to be established. If I click the disconnect button, in order to reconnect, the entire application closes.
Also - Unless this is available somewhere (which I have not found), clicking on the disconnect button closes the application. Perhaps not closing the application when this button is pressed and have the user intentionally close it by clicking the [X] or File->Exit
At least with this, folks suffering from this problem can just hit the reconnect (not refresh!) button when to much time had passed.
Oh - remembering what was selected/viewed before reconnecting would be of great benefit. I know it would be aggregating to constantly expand the tree-view to the table I was looking at and then clicking the tab I was on (database, table, data, etc) and selecting the row or whatever I had selected before.
Include a "Reconnect" button for connections that are suppose to be established. If I click the disconnect button, in order to reconnect, the entire application closes.
Also - Unless this is available somewhere (which I have not found), clicking on the disconnect button closes the application. Perhaps not closing the application when this button is pressed and have the user intentionally close it by clicking the [X] or File->Exit
At least with this, folks suffering from this problem can just hit the reconnect (not refresh!) button when to much time had passed.
Oh - remembering what was selected/viewed before reconnecting would be of great benefit. I know it would be aggregating to constantly expand the tree-view to the table I was looking at and then clicking the tab I was on (database, table, data, etc) and selecting the row or whatever I had selected before.
Again - it seems to be associated to a user using a firewall between the client and server. Ours is a NAT firewall which could or could not be relevant. Being an admin of many different firewalls and packet analyzer applications, I know that they could time-out a legit connections if there has not been activity in XX seconds or minutes. It will just drop the packets instead of sending a RST. As a remote admin, I can't test this with the offending network, but as I've described in the previous posts, it all points at the Firewall being the only thing that could cause the problem.
I have seen this same issue when using the Filezilla FTP client -- when trying to refresh a directory after a long period of inactivity, it will reconnect if it doesn't receive a reply from the server right away (I guess they went through this before?). They may even apply a 'keep-alive', I'll have to look.
Bottom line, when this issue is happening:
Heidi sends ping packets to server but the server does not receive any packets at all. Heidi does not receive anything from the network in response.
Obviously, something is dropping these packets in between and the firewall is the only thing remaining.
Your options are to make available a "keep-alive" ping that runs in the background, or attempt a full reconnect when receiving a RST or no response during the ping process.
While what I suggested can be viewed as work-arounds, they are legit features that lends more options to connection management, regardless of this particular problem.
Example:
1) I don't want the application to close when I click on the "Disconnect" button. I don't even expect it to close -- I just want to disconnect from the server.
2) I do want to reconnect an existing connection for many different reasons. Right now, I can't do this. If I click the disconnect button, we're back to the problem listed on #1.
None of these are purposed to solve this particular problem, but this problem can be worked around by using these features if they were available. See?
I have seen this same issue when using the Filezilla FTP client -- when trying to refresh a directory after a long period of inactivity, it will reconnect if it doesn't receive a reply from the server right away (I guess they went through this before?). They may even apply a 'keep-alive', I'll have to look.
Bottom line, when this issue is happening:
Heidi sends ping packets to server but the server does not receive any packets at all. Heidi does not receive anything from the network in response.
Obviously, something is dropping these packets in between and the firewall is the only thing remaining.
Your options are to make available a "keep-alive" ping that runs in the background, or attempt a full reconnect when receiving a RST or no response during the ping process.
While what I suggested can be viewed as work-arounds, they are legit features that lends more options to connection management, regardless of this particular problem.
Example:
1) I don't want the application to close when I click on the "Disconnect" button. I don't even expect it to close -- I just want to disconnect from the server.
2) I do want to reconnect an existing connection for many different reasons. Right now, I can't do this. If I click the disconnect button, we're back to the problem listed on #1.
None of these are purposed to solve this particular problem, but this problem can be worked around by using these features if they were available. See?
If a ping does not work for you, how would a keep-alive feature sending a ping help you in any way?
I'm not going to add work arounds for some firewall issues. These should be fixed in the firewall itself, not in HeidiSQL. Additionally you can increase the wait_timeout option in the MySQL server.
I'm not going to add work arounds for some firewall issues. These should be fixed in the firewall itself, not in HeidiSQL. Additionally you can increase the wait_timeout option in the MySQL server.
anse - I'm not sure where the confusion is.
The firewall is correct in its configuration and operation. This is not the issue.
Keep-alive would keep the connection open instead of timing out on a legitimate NAT Firewall. So of course PING would go through if it is went every 30 or 60 seconds. Even the "Auto Refresh" option that exists now works. We would be able to use this except it refreshes the entire view and resets all the panes. That is not helpful when working in the application.
Again - what you are suggesting as "work-arounds" are not work-arounds at all. They are common features and functions found on many network service applications:
1) A button or option to 'reconnect' to an existing server
2) provide a connection refresh/keep-alive that pings in the background without affecting the user's view. Like the auto-refresh but it does not touch the UI content.
3) Keep the application running when the disconnect button is pressed on the last server. In-fact, closing the application was unexpected. I don't understand the point of closing the application all the way when disconnecting from a server.
So, I am suggesting that these same features can be used to overcome the problems a few of us have. But they are -not- work-arounds.
Oh, the wait_timeout option is irrelevant if the client packets do not reach the server ;)
The firewall is correct in its configuration and operation. This is not the issue.
Keep-alive would keep the connection open instead of timing out on a legitimate NAT Firewall. So of course PING would go through if it is went every 30 or 60 seconds. Even the "Auto Refresh" option that exists now works. We would be able to use this except it refreshes the entire view and resets all the panes. That is not helpful when working in the application.
Again - what you are suggesting as "work-arounds" are not work-arounds at all. They are common features and functions found on many network service applications:
1) A button or option to 'reconnect' to an existing server
2) provide a connection refresh/keep-alive that pings in the background without affecting the user's view. Like the auto-refresh but it does not touch the UI content.
3) Keep the application running when the disconnect button is pressed on the last server. In-fact, closing the application was unexpected. I don't understand the point of closing the application all the way when disconnecting from a server.
So, I am suggesting that these same features can be used to overcome the problems a few of us have. But they are -not- work-arounds.
Oh, the wait_timeout option is irrelevant if the client packets do not reach the server ;)
> the wait_timeout option is irrelevant if the client packets do not reach the server
Of course it would, as the server disconnects you, and increasing the wait interval before the server does this would help you, doesn't it?
I'm probably confused by too much text here in this thread. There are feature requests mixed with network problem descriptions, and much talking about firewalls, but nothing MySQL specific here. I'm lost.
Of course it would, as the server disconnects you, and increasing the wait interval before the server does this would help you, doesn't it?
I'm probably confused by too much text here in this thread. There are feature requests mixed with network problem descriptions, and much talking about firewalls, but nothing MySQL specific here. I'm lost.
Anse,
Were you able to reproduce this? I suffer from the same problem with one of my clients. Essentially, I need to VPN into my client's system. If I don't use Heidi for a while (15+ minutes? Don't know the exact time, to be honest), if I do anything in the UI it hangs. It will eventually come back... but by eventually I mean somewhere on the order of 30+ minutes.
I tend to just open a new window and do what I need to do there, but if I have a sql statement I've been executing for testing purposes, it's a bit frustrating to have it locked away in a hung app. :|
Anyway, if there is anything I can do to help provide info, please let me know.
-Cord Awtry
Were you able to reproduce this? I suffer from the same problem with one of my clients. Essentially, I need to VPN into my client's system. If I don't use Heidi for a while (15+ minutes? Don't know the exact time, to be honest), if I do anything in the UI it hangs. It will eventually come back... but by eventually I mean somewhere on the order of 30+ minutes.
I tend to just open a new window and do what I need to do there, but if I have a sql statement I've been executing for testing purposes, it's a bit frustrating to have it locked away in a hung app. :|
Anyway, if there is anything I can do to help provide info, please let me know.
-Cord Awtry
Anse,
I understand. Let me offer up what I see in 3 scenarios:
- Scenario #1 (local db) - I can leave the app open for days on end and every time I return to use it, simply reconnects and away we go within a second or two.
- Scenario #2 (remote db via vpn) - For the most part reacts the same way as the local db
- Scenario #3 (remote db via vpn tunnel) - This is a db where I have to connect to my client's VPN and then tunnel to the db server. This one will hang after 15+ minutes of not using. If I kill my putty session with my tunnel, the app will return after about 15 seconds. If I just let it run, it will return after 30+ minutes.
Not sure if any of this is actually useful info for you, but these are my observations.
I understand. Let me offer up what I see in 3 scenarios:
- Scenario #1 (local db) - I can leave the app open for days on end and every time I return to use it, simply reconnects and away we go within a second or two.
- Scenario #2 (remote db via vpn) - For the most part reacts the same way as the local db
- Scenario #3 (remote db via vpn tunnel) - This is a db where I have to connect to my client's VPN and then tunnel to the db server. This one will hang after 15+ minutes of not using. If I kill my putty session with my tunnel, the app will return after about 15 seconds. If I just let it run, it will return after 30+ minutes.
Not sure if any of this is actually useful info for you, but these are my observations.
Please login to leave a reply, or register at first.