Hi.
We currently have a number of cloud-based servers with EUKHost, and over the past few months one of them has been getting gradually worse, suffering from high load.
Over the past few weeks I've moved some of the bigger, more heavily used web sites off it to try to ease the burden (most of the remaining sites are test sites, there so clients can preview a site before it goes live, so they get very little traffic).
Over the past few days I've been trying to see why the server is struggling, so I ran top this morning:
top - 08:24:08 up 1 day, 14:24, 1 user, load average: 2.15, 3.34, 3.14
Tasks: 198 total, 1 running, 192 sleeping, 0 stopped, 5 zombie
Cpu(s): 0.1%us, 0.1%sy, 0.0%ni, 74.9%id, 25.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 5993032k total, 2465552k used, 3527480k free, 91832k buffers
Swap: 4128760k total, 2416k used, 4126344k free, 1079732k cached
To me this says the memory is fine and the CPU is about 75% idle, but 25% of CPU time is being spent waiting on I/O (%wa). I then ran iostat -x 2 5 and it shows this:
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.13 24.91 0.00 74.97
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 24.00 1.00 444.00 388.00 33.28 1.31 55.96 39.88 99.70
dm-0 0.00 0.00 24.50 0.00 448.00 0.00 18.29 10.43 574.45 40.71 99.75
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.25 43.98 0.00 55.76
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 83.50 36.00 25.00 588.00 864.00 23.80 2.14 34.74 16.36 99.80
dm-0 0.00 0.00 35.50 109.00 576.00 872.00 10.02 7.03 48.35 6.91 99.80
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.12 0.00 0.12 38.58 0.00 61.17
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 4.50 13.50 3.50 228.00 32.00 15.29 9.20 78.00 58.74 99.85
dm-0 0.00 0.00 13.50 27.50 236.00 220.00 11.12 11.10 35.34 24.37 99.90
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
6.13 0.00 1.25 33.29 0.00 59.32
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 6.00 53.00 25.00 1128.00 284.00 18.10 9.64 224.88 12.75 99.45
dm-0 0.00 0.00 53.50 11.00 1132.00 88.00 18.91 11.11 322.71 15.42 99.45
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
This suggests the bottleneck is the disks, as they are pretty much 100% utilised the whole time.
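In case it helps, this is roughly what I can run next to find out which processes are generating the I/O. iotop, or pidstat -d 2 from the sysstat package, give live per-process figures if they're installed; failing that, this sketch reads the cumulative counters in /proc/&lt;pid&gt;/io (Linux only, and it needs root to see other users' processes):

```shell
# Rank processes by cumulative bytes read/written, from /proc/<pid>/io.
# These counters are lifetime totals, so take two snapshots a few seconds
# apart and compare them to see which processes are busy *now*.
for pid in /proc/[0-9]*; do
  [ -r "$pid/io" ] || continue
  awk -v name="$(cat "$pid/comm" 2>/dev/null)" -v pid="${pid#/proc/}" '
    /^read_bytes/  { r = $2 }
    /^write_bytes/ { w = $2 }
    END { printf "%14d %14d  %6s %s\n", r, w, pid, name }
  ' "$pid/io"
done | sort -rn | head -15   # top 15 processes by bytes read
```

If sysstat is available, `pidstat -d 2` shows the same per-process read/write figures as live rates, which is probably the quickest way to spot the culprit.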
I'm having a hard time getting support to help with this. After I raised a ticket about it, I was told it is a network issue; when I asked whether a network issue would affect the disks like this, they said no. Arrgghhhh.
Can anyone offer any help?