Server Timeout: after intermission and after warmup countdown


clan DIABOLIK

Hello,

 

These timeouts show "Connection interrupted" on the players' screens, and on the server side the server itself is in timeout.

 

Sometimes 3 seconds, sometimes 60 seconds!!!

 

Here is what HLSW, for example, shows:

http://wolfensmart.free.fr/divers/lag1.JPG

 

Of course I've opened a ticket with my VPS provider, and they cannot see any problem on the host side.

 

So they think it comes from a bad configuration of my server, but my settings have been the same for 2 months and the problem has only been here for 2 weeks!!!

 

In server.log, there is no strange message...

 

I've run !dboptimize => server.log says it is OK (and !dbcleanup is also OK)...

 

When I run "top" on the server, no extra process is eating CPU...

The RAM (512M) has been the same for many months, and I never had lags before...

 

 

We need an engineer !!!


  • Subscriber

I'm sorry, it used to happen to ours years ago, but I can't remember how it was fixed.

It sounds like the server is doing something before the map starts. Have you tried deleting the server logs, so they are not very big to load/read, and running with no skins or anything else, just a basic server with silent on it?

Are you still running PunkBuster?


  • Management

Problem solved by deleting these files in the silent/database folder:

  • userdb.db
  • userxdb.db
  • serverstat.cfg
  • mapvoteinfo.cfg

@JohnDory, thanks for your help.
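
For anyone who wants to try the same fix without losing the data, here is a minimal sketch that moves the four files into a timestamped backup folder instead of deleting them. The function name and the backup folder naming are my own assumptions, not something silent provides, and it is presumably safest to run it with the game server stopped.

```shell
# Hypothetical helper: move the four files named above out of the way instead of deleting them.
# $1 is the path to the silent/database folder (adjust to your own install).
backup_silent_db() {
  db_dir="$1"
  backup_dir="$db_dir/backup-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$backup_dir" || return 1
  for f in userdb.db userxdb.db serverstat.cfg mapvoteinfo.cfg; do
    # Skip files that do not exist so a partial set is not an error.
    if [ -f "$db_dir/$f" ]; then
      mv "$db_dir/$f" "$backup_dir/" || return 1
    fi
  done
  # Print where the files went so they can be restored later.
  printf '%s\n' "$backup_dir"
}
```

Restoring is then just a matter of moving the files back from the printed folder.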

 

V55

 

If that fixed the issue, then it looks like an HD I/O issue on your server. Try this command in your Linux shell:

dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync

Then post the output here. It might give an idea of whether something is wrong with the hard drives.


I'll join this topic since I have a similar problem. Unfortunately, in my case removing the users database is not an option.

 

HD test output:

dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 17.2295 s, 62.3 MB/s

  • Management

I have seen (and experienced myself) players being stuck in a "connection interrupted" state for a long time, from time to time. It never happens consistently, and it doesn't usually affect every player the same way. Having red bars in an external tool does suggest the server is stuck on something or the connection is being refused for a while. I don't think there is anything programmatic in the database handling that can take so long. Handling the records at the end of the game actually does less work than the !usersearch command, unless the command "/rcon dboptimize" is given. That one should perform exponentially worse as the database grows, but it also needs to be triggered with an rcon command every time.

 

I can think of three areas to look into further: some sort of file system issue when actually writing the data to a file, a networking/firewall issue, or the game for some reason resending the gamestate to player(s) for a while. The last one is what I have seen happen a few times, but I don't know what triggers it. The gamestate is automatically resent if the server thinks the player is still on the previous map. It could be caused by the mod somehow.


Thanks hellreturn, here is output:

 

wolf@lasmartbox:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync

16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 15.7585 s, 68.1 MB/s
 
Does it seem normal?

As for the CPU, I want to try the tool from this forum thread: http://www.webhostingtalk.com/showthread.php?s=2be42efe8e3f79996fe00fd7cfbbc469&t=924581 , which can also feed a reference database if I use the link in the 1st post.

  • Management

 


Those values seem fine for SATA 3 Gb/s devices or a VPS/VDS.
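
As a rough cross-check of the dd figures posted in this thread (a sketch only, not something dd provides): bs=64k count=16k writes 64 KiB x 16384 = 1073741824 bytes, and the MB/s value is that byte count divided by the elapsed seconds, in decimal megabytes. The function name below is made up for illustration.

```shell
# Hypothetical helper: recompute dd's MB/s figure from bytes written and elapsed seconds.
# dd reports decimal megabytes (1 MB = 1000000 bytes).
dd_throughput() {
  awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f MB/s\n", bytes / secs / 1000000 }'
}

# The two runs posted in this thread:
dd_throughput 1073741824 17.2295
dd_throughput 1073741824 15.7585
```

These two calls print 62.3 MB/s and 68.1 MB/s, matching what dd reported. Keep in mind this is sequential write throughput only; it says nothing about latency under load.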

 

I had the same issue on one of my Windows servers, and it resolved itself when I added new hard drives to the server.

 

Since you guys are on Linux, try ioping as well. It will tell you the latency of the hard drives at map start. Let's try this:

1. Install ioping.

2. Run ioping and monitor it during map start.

3. See if there is any increase in latency that would cause CPU halts.

4. Remove the DB and then start the server again.

5. If there are no red spikes and no hard drive latency spikes after removing the DB files, then it could be an HD I/O issue.

 

On Debian, just run: apt-get install ioping

 

Example:

4096 bytes from / (simfs /vz/private/3581): request=1 time=0.1 ms
4096 bytes from / (simfs /vz/private/3581): request=2 time=0.1 ms
4096 bytes from / (simfs /vz/private/3581): request=3 time=0.4 ms
4096 bytes from / (simfs /vz/private/3581): request=4 time=0.4 ms
4096 bytes from / (simfs /vz/private/3581): request=5 time=0.3 ms
4096 bytes from / (simfs /vz/private/3581): request=6 time=0.1 ms
4096 bytes from / (simfs /vz/private/3581): request=7 time=10.8 ms
4096 bytes from / (simfs /vz/private/3581): request=8 time=6.0 ms
4096 bytes from / (simfs /vz/private/3581): request=9 time=0.3 ms
4096 bytes from / (simfs /vz/private/3581): request=10 time=1.6 ms
4096 bytes from / (simfs /vz/private/3581): request=11 time=0.2 ms

--- / (simfs /vz/private/3581) ioping statistics ---
11 requests completed in 10530.4 ms, 547 iops, 2.1 mb/s
min/avg/max/mdev = 0.1/1.8/10.8/3.3 ms

During any map start, the game reads all pk3s, so if HD I/O at that time increases beyond what the server can handle, it could give red spikes to players. Once the map is loaded, everything is fine again. That was the case for me because my server's hard drive had gotten old.
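
To make latency spikes easier to spot while watching a map start, here is a small sketch that filters ioping output and keeps only the requests slower than a given limit. It assumes the "time=... ms" line format shown in the example above (other ioping versions may print different units or layouts), and the function name is my own.

```shell
# Hypothetical filter: read ioping output on stdin and print only the
# requests whose latency exceeds the limit in milliseconds (default 5).
# Assumes lines containing "time=<number> ms", as in the example above.
flag_slow_requests() {
  awk -v limit="${1:-5}" '
    match($0, /time=[0-9.]+ ms/) {
      # Skip the 5 characters of "time=" and drop the trailing " ms".
      ms = substr($0, RSTART + 5, RLENGTH - 8) + 0
      if (ms > limit) print
    }'
}

# Usage while a map is loading (run from the game server folder):
#   ioping -c 60 . | flag_slow_requests 5
```

With a limit of 5 ms, the example output above would be reduced to requests 7 and 8 (10.8 ms and 6.0 ms).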

 

Diabolik / mp_no, can I host your server files for testing purposes, to see whether you get the same red spikes? Just to make sure it's not an environment issue.

 

PS: I also use munin for monitoring server usage. If any of you are interested, here is the link: http://munin-monitoring.org/


OK for ioping, I'm going to try it.

I've run the benchmark from the link in my previous post and got this result:

 

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.2)

   System: lasmartbox: GNU/Linux
   OS: GNU/Linux -- 2.6.32-042stab084.20 -- #1 SMP Mon Jan 27 00:40:08 MSK 2014
   Machine: x86_64 (unknown)
   Language: en_US.utf8 (charmap="ANSI_X3.4-1968", collate="ANSI_X3.4-1968")
   CPU 0: AMD Opteron Processor 4386  (6200.0 bogomips)
          Hyper-Threading, x86-64, MMX, AMD MMX, Physical Address Ext, SYSENTER/SYSEXIT, AMD virtualization, SYSCALL/SYSRET
   14:57:43 up 34 days, 16:56,  3 users,  load average: 0.24, 0.18, 0.09; runlevel

------------------------------------------------------------------------
Benchmark Run: Wed Mar 26 2014 14:57:43 - 15:24:31
1 CPU in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       11590823.8 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3082.1 MWIPS (10.0 s, 7 samples)
Execl Throughput                               1887.5 lps   (29.9 s, 2 samples)
Pipe Throughput                              604574.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  92456.4 lps   (10.0 s, 7 samples)
Process Creation                               5076.1 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1976.9 lpm   (60.0 s, 2 samples)
Shell Scripts (16 concurrent)                   137.1 lpm   (60.2 s, 2 samples)
Shell Scripts (8 concurrent)                    256.9 lpm   (60.1 s, 2 samples)
System Call Overhead                         597336.3 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11590823.8    993.2
Double-Precision Whetstone                       55.0       3082.1    560.4
Execl Throughput                                 43.0       1887.5    438.9
Pipe Throughput                               12440.0     604574.4    486.0
Pipe-based Context Switching                   4000.0      92456.4    231.1
Process Creation                                126.0       5076.1    402.9
Shell Scripts (1 concurrent)                     42.4       1976.9    466.2
Shell Scripts (16 concurrent)                     ---        137.1      ---
Shell Scripts (8 concurrent)                      6.0        256.9    428.2
System Call Overhead                          15000.0     597336.3    398.2
                                                                   ========
System Benchmarks Index Score (Partial Only)                          457.6


  • Management


To be honest, a benchmark will not help you much unless you have your own dedicated server. If you do have your own server, I would suggest using munin or another lower-level monitoring tool to monitor your HD, CPU and RAM usage. Munin has good plugins too.


Thanks a lot for helping me, but I'm sure the problem is here:

 

wolf@lasmartbox:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/simfs             64G  3.9G   42G   9% /
tmpfs                 256M     0  256M   0% /lib/init/rw
tmpfs                 256M     0  256M   0% /dev/shm
 
(42G + 3.9G is supposed to equal 64G, so about 18G is unaccounted for... as you can see, the disk space is being eaten by something alien, lol)
 
So I opened a new ticket at http://www.pulseheberg.com   :(
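
For reference, the gap in the df output above can be computed rather than eyeballed. This is only a sketch with a made-up function name: it reads `df -P -k` style output (sizes in KiB) and prints, for each mount point, how much of the size is neither used nor available.

```shell
# Hypothetical helper: for each filesystem line of `df -P -k` output on stdin,
# print the mount point and (size - used - available) converted to GiB.
unaccounted_gib() {
  awk 'NR > 1 { printf "%s %.1f\n", $6, ($2 - $3 - $4) / 1048576 }'
}

# Usage:
#   df -P -k / | unaccounted_gib
```

Fed the values from the output above (64G size, 3.9G used, 42G available), it reports about 18.1 GiB unaccounted for on /.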

  • Management

 


That one is probably showing more usage because of an issue on the node where your VPS is hosted, probably an oversold OpenVZ node. I could also be wrong.


  • Management

Which files do you need?

 

Either only your DB files and the server.dat file inside your silent mod folder, or your full server files so that I can just start the server (remove your rcon password).

 

This way I can recreate a replica of your server on my machine and see if it lags or not. If it also lags on my machine, then we can investigate more; if it doesn't, then it's an issue with your machine/server.



I sent you a link by PM.
