Monday, October 13, 2008

Monitoring Load Average

Knowledge about databases is often not sufficient to effectively monitor the load that an application produces. You also need some basic knowledge about the host machine that runs your database.

When users experience performance problems with an application, one of the first things you should check on the host machine is the load average.

You can do that by running the uptime command.


12:05:13 up 227 days, 11:15, 2 users, load average: 1.93, 2.41, 2.30


This output shows the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.



You can also get the load average by running the w or top commands.
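If you want to use these numbers in a script, they can be pulled out of the uptime line. The following is only a minimal sketch: parse_loadavg is a hypothetical helper name, and it assumes the output uses a dot as the decimal separator, as in my example.

```shell
# parse_loadavg LINE: print the three load averages from an uptime-style line.
# Hypothetical helper; assumes values like "1.93, 2.41, 2.30" (dot decimals).
parse_loadavg() {
    # Drop everything up to "load average:" (or "load averages:"), then the commas.
    printf '%s\n' "$1" | sed -e 's/.*load average[s]*: *//' -e 's/,//g'
}

# Using the sample line from this post:
parse_loadavg '12:05:13 up 227 days, 11:15, 2 users, load average: 1.93, 2.41, 2.30'
# prints: 1.93 2.41 2.30
```

On a live system the same helper could be called as parse_loadavg "$(uptime)".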


In my example the three load average values are 1.93, 2.41 and 2.30, covering the last 1, 5 and 15 minutes respectively.

- 1.93 means that processes were using about 96.5% of my overall CPU capacity (1.93 / 2 CPUs). I can conclude from this that processes were served well during the last minute.
- 2.41 tells me that in the past 5 minutes processes requested about 20.5% more CPU power than the machine could serve (2.41 / 2 = 1.205). This means that some processes were forced to wait for CPU.
- 2.30 tells me that in the past 15 minutes processes requested about 15% more CPU power than the machine could serve (2.30 / 2 = 1.15).
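The percentages above are simply the load average divided by the number of CPUs. As a rough sketch of that arithmetic (load_pct is a hypothetical helper name, and the CPU count of 2 is the one from my machine):

```shell
# load_pct LOAD CPUS: print LOAD as a percentage of total CPU capacity.
# Hypothetical helper; awk is used because the shell has no floating point.
load_pct() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.1f\n", 100 * load / cpus }'
}

# The three values from the uptime output, on a 2-CPU machine:
for load in 1.93 2.41 2.30; do
    echo "$load -> $(load_pct "$load" 2)%"
done
# prints:
# 1.93 -> 96.5%
# 2.41 -> 120.5%
# 2.30 -> 115.0%
```

Anything above 100% means that, on average, more processes wanted to run than there were CPUs to run them.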

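It is important to note that my machine has 2 CPUs, so the ideal load average for it is below 2.0. If I had a machine with just 1 CPU, then a load average below 1.0 would be OK.

It is also important to understand that load isn't CPU usage. Load average counts the processes that are running or waiting to run (on Linux, also processes in uninterruptible sleep, typically waiting on disk I/O). CPU usage could be at 99% with one process or a hundred (most of them probably idle), and my load would still be acceptable as long as processes are not queuing for the CPU.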
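This rule of thumb is easy to turn into a quick check. The sketch below is only an illustration: load_status and its "ok" / "over capacity" labels are names I made up, and the live check assumes a Linux host where /proc/loadavg and getconf _NPROCESSORS_ONLN are available.

```shell
# load_status LOAD CPUS: print "ok" if LOAD is below CPUS, else "over capacity".
# Hypothetical helper implementing the "load below number of CPUs" rule.
load_status() {
    awk -v load="$1" -v cpus="$2" \
        'BEGIN { if (load + 0 < cpus + 0) print "ok"; else print "over capacity" }'
}

load_status 1.93 2   # prints: ok
load_status 2.41 2   # prints: over capacity

# On Linux, compare the live 1-minute load with the number of online CPUs.
if [ -r /proc/loadavg ]; then
    load_status "$(cut -d' ' -f1 /proc/loadavg)" "$(getconf _NPROCESSORS_ONLN)"
fi
```

A single reading above the limit is not necessarily a problem; the 5 and 15 minute values show whether it is sustained.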


To find out the number of CPUs on a machine you can run the following commands:
LINUX

[oracle@dibidus ~]$ grep -c '^processor' /proc/cpuinfo
2

This Linux machine has 2 CPUs (logical processors, as seen by the kernel).

SOLARIS

# psrinfo -p -v
The UltraSPARC-IV physical processor has 2 virtual processors (0, 16)
The UltraSPARC-IV physical processor has 2 virtual processors (2, 18)

This Solaris machine has 2 physical CPUs, each with 2 virtual processors (4 virtual processors in total).

1 comment:

  1. Load Average is a metric that indicates the level of load, or stress, that a server is under at a given point in time. In literal terms, the Load Average is a moving average of the number of processes that are using the CPU, waiting for CPU time, or waiting on I/O. By monitoring a server's Load Average over a period of time, we can produce a trend graph that provides insight into how efficiently the server is using its CPU resources.
