Sunday, May 8, 2005

Potential server problems

Twice today, my server became unresponsive, and I had to power it off and back on.  While it was in this state, I noticed that the hard drive was thrashing. After the first reboot, I saw that Apache had died because it could not allocate memory.

I think that one of the processes is leaking memory so badly that it eventually uses up all of the swap space.  I don't think the problem is caused by anything external, since I can see a log of all of the connections through my firewall.  Now I just need to find out what is leaking the memory.
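
One way to look for the culprit is to list the processes using the most resident memory and watch how the list changes over time. The following is only a rough sketch, assuming a Linux system with /proc mounted; the function names parse_status and top_memory_users are made up for this example and are not part of any existing tool.

```python
import os


def parse_status(text):
    """Return (process name, resident memory in kB) from /proc/<pid>/status text."""
    name, rss_kb = "?", 0
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS":
            # The value looks like "   1234 kB"; kernel threads have no VmRSS line.
            rss_kb = int(value.split()[0])
    return name, rss_kb


def top_memory_users(limit=10, proc_dir="/proc"):
    """Return up to `limit` (rss_kb, pid, name) tuples, largest first."""
    if not os.path.isdir(proc_dir):
        return []  # not a Linux-style /proc, nothing to report
    rows = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_dir, entry, "status")) as f:
                name, rss_kb = parse_status(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for rss_kb, pid, name in top_memory_users():
        print(f"{rss_kb:>10} kB  pid {pid:<7} {name}")
```

Running this every few minutes and comparing the output should show whether one process keeps growing, or whether the number of copies of the same process keeps climbing.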

Update:  I found out what was causing the problem.  I had installed version 1.3.0 of munin.  I had a cron job that collected the statistics every 5 minutes, and it appears that some of the processes were not finishing, so eventually the computer would run out of memory.  I updated munin to 1.3.2, and it doesn't seem to have this problem.
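
Upgrading is what fixed it for me, but a general guard against this kind of pile-up is to have the cron job skip a run when the previous one is still going. Below is a minimal sketch of that idea using a lock file. It is not what munin does; the name run_exclusively, the lock path, and the collector path in the usage comment are all hypothetical.

```python
import fcntl
import subprocess


def run_exclusively(lock_path, command):
    """Run `command` unless another run still holds the lock on `lock_path`.

    Returns the command's exit code, or None if the run was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fails at once if someone else holds it.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None  # previous run has not finished, so do not start another
        # The lock is released when lock_file is closed at the end of the block.
        return subprocess.call(command)


# Hypothetical usage from a script that cron starts every 5 minutes:
#   run_exclusively("/tmp/stats-collector.lock", ["/path/to/stats-collector"])
```

With a guard like this, a collector that hangs would still leave one stuck process behind, but new copies would not keep stacking up every 5 minutes until memory runs out.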