
Computer Performance Monitoring

This is my concise reference on analyzing and tuning Microsoft Windows and Unix/Linux machines using counters and other tools for performance profiling, testing, and tuning.



Topics this page:

  • Monitor Types
  • Windows Monitoring
  • Unix Monitoring:
  • Processes
  • Rstat Daemon
  • Measured Objects
  • Log Analysis
  • ARM Instrumentation
  • Quiz
  • Links
  • Your comments???


    Types of Monitoring

      Basically, there are two approaches to monitoring machines under test:

      1. Internal Probes/Agents that run within the server (as a process/daemon) and send counter values to an application like the Windows Task Manager or out to a Diagnostics Mediator/Server.
      2. External Monitors (such as the UNIX rstatd daemon) that respond to operator commands sent using the SSH (Secure Shell) protocol through the network to the server being monitored, which then responds with another transmission over the network.

      External Monitors

      The amount of resources used by each server in a J2EE or .NET application can be monitored externally by sending requests to Windows performance monitor or commands as a Unix Secure Shell (SSH) session user.

      Since all monitoring requests originate from outside the server being monitored, some call this a "black box" approach to testing.

      This is the approach used by Mercury SiteScope, LoadRunner, and Business Process Monitor.

      Reminder: Just because a monitor is "agent-less" doesn't mean that it is "non-intrusive": it still imposes overhead on the system being tested.

      Probes/Agents Within Servers

      This approach is classified as "white box" testing because probes are installed inside each app server under test (SUT).

        This is why the product ReliAgent calls its embedded probes "narks".

      After a probe is "instrumented" to recognize methods running on its app server, it makes an announcement whenever it detects invocations of servlets, JSPs, EJBs, JNDI, JDBC, JMS, and Struts.

      Probes report on JVM memory heap usage and the memory consumption of Java collections.


      Battle at Yamaki Mansion - Japanese Title: Yamaki yakatano Tsuki. 
	A clever Samurai uses his helmet as a decoy on the other side of a paper screen to catch his enemy off guard.


    Microsoft Windows System Monitors

      There are several ways to view performance data:

      • Task Manager lists major user-selected objects in real-time.
      • System Monitor, an MMC snap-in to the Performance console (which Windows 2000 renamed from the Windows NT4 Perfmon), graphs, in real-time, objects on remote computers as well as the local computer.
      • Network Monitor does the same exclusively for network activity on each NIC.
      • Performance Logs and Alerts are configured to create logs which can later be analyzed together using MS-Excel, SAS, or another statistical analysis application.
      • Servers configured with a Simple Network Management Protocol (SNMP) agent service send SNMP messages to a central SNMP sink machine which holds the messages for analysis by some 3rd party SNMP management system.
      • Enterprise infrastructure management software catches business transaction instrumentation messages issued from C and Java programs using the ARM (Application Response Measurement) API to measure application availability, application performance, application usage, and end-to-end transaction response time.


      MS Windows Task Manager

      There are several ways to invoke the Task Manager:
      • Press Ctrl-Shift-Esc keys at the same time.
      • Press Ctrl-Alt-Del, then select Task Manager.
      • Press the Windows key and R at the same time, or
        press the Windows key, then R for Run, and press Enter,
        then type taskmgr and click OK.

      Perfmon screen

      This sample screen shows both green User mode and red kernel mode CPU Usage because View, Show Kernel Times has been selected:

      Perfmon menu

      Reminder By default, the update speed is set to “Normal”, which means once per second.

      I prefer the "Low" setting when I see what is hogging up CPU cycles (by clicking the "CPU" heading):

      Perfmon menu

      Metrics shown in the Processes tab can be selected from View, Select Columns:

      The above are defaults. Session ID and User Name are new since Windows XP.



      MS Windows System Monitor

      Performance data is displayed immediately using the System Monitor node in the Performance console.

      System Monitor is an ActiveX control, so can be displayed in applications that support this object type. For example, System Monitor can be displayed in Internet Explorer or in a Microsoft Office application, such as Microsoft Word. The simplest way to export the OLE Custom eXtension (OCX) is to select a Counter log and on the Action menu, click Save Settings As. The file is then saved as a standard .html file and viewed in Internet Explorer. System Monitor is fully functional whether it is running as part of the Performance console or in another application.

      View data in System Monitor in one of three views: chart, histogram, or report. The chart and histogram views are typically used for analyzing real-time data. The report view is ideal for viewing a summary of data collected in counter logs. You can use all views to see real-time and logged data.

      typeperf.exe (a command-line variant of Perfmon) that comes with Win2k3 dumps perf data to CSV, TSV, or a database.
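
      As a rough illustration of analyzing such a CSV dump, the sketch below averages one counter column. The sample rows, the host name HOST1, and the function name average_counter are made up for illustration; real typeperf files may be laid out differently.

```python
import csv
import io

# Made-up sample resembling a typeperf CSV dump: a timestamp column
# followed by one column per counter. Real output may differ.
SAMPLE = r'''"(PDH-CSV 4.0)","\\HOST1\Processor(_Total)\% Processor Time"
"01/01/2010 10:00:01.000","12.5"
"01/01/2010 10:00:02.000","37.5"
"01/01/2010 10:00:03.000"," "
'''


def average_counter(csv_text, column=1):
    """Return the mean of one counter column, skipping blank samples."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows)  # skip the header row of counter paths
    values = []
    for row in rows:
        if len(row) <= column:
            continue
        try:
            values.append(float(row[column]))
        except ValueError:
            continue  # blank or non-numeric sample
    if not values:
        raise ValueError("no numeric samples found")
    return sum(values) / len(values)


print(average_counter(SAMPLE))
```

      Blank samples are skipped rather than counted as zero, so a missed collection interval does not drag the average down.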

      Ganglia's monitoring architecture takes fewer resources.

      Configuring Remote Windows Machine Monitoring by User Accounts

      According to Q158438: If you are using an account without administrator privileges on the NT/Win 2000 machine being monitored, you must first grant read permission to certain files and registry entries. The required steps are:

      1. Using Explorer or File Manager, give the user READ access to: %windir%\system32\PERFCxxx.DAT
        where xxx is the basic language ID for the system. For example, 009 for English. If these files are missing or corrupt, expand them from the installation CD.
      2. Using REGEDT32, give the user READ access to: HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Perflib and all sub keys of that key.
      3. Using REGEDT32, give the user at least READ access to: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\SecurePipeServers\winreg
      4. Give the user at least READ access to the following key and allow Read permission to propagate down to the Services subkeys: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services
      5. With Windows 2000, in addition to the access described above, the user must also have access granted by the following Group Policies: Computer Configuration\Windows Settings\Security Settings\Local Policies\User Rights Assignment
        • Profile System Performance
        • Profile Single Process
      6. If the user is neither a power user nor an administrator, additional permissions might be needed to access SysMonLog services. To grant full access to SysMonLog services, run the subinacl /service sysmonlog /grant=tester=f command, where tester is the user account.

      A system is described as "quiescent" (dormant; in a state of tranquil repose; at rest; resting; still; inactive; quiet;) when its CPU is running no active user tasks.

      A system is described as "pegged" when its CPU utilization remains at or near 100% -- the maximum.


    Unix Systems Monitoring

      This section discusses performance monitoring tools among variants of the UNIX operating system, which consists of the kernel, the shell, and the file system.

      Nagios is a popular open-source performance monitoring tool.
      Book: Pro Nagios 2.0 (Apress, 2006, 424 pages) by James Turnbull

      SiteScope tool

      Mercury SiteScope monitors almost all (50) aspects of running Windows and UNIX systems (networks, servers, services, databases, applications, etc.). Colorado-based Freshwater Software charged $2,495 for it before Mercury bought it (in April 2003). Mercury now sells it for $60 per unit (the number of servers multiplied by the number of counters monitored on each machine).

      SiteScope monitors can be viewed among other LoadRunner Controller System Resource Graphs:

      • CPU Utilization Monitor
      • DNS Monitor
      • Directory Monitor
      • Disk Space Monitor
      • Log File Monitor
      • Memory Monitor
      • Network Monitor
      • Ping Monitor
      • Port Monitor
      • Script Monitor
      • Service Monitor
      • URL Monitor
      • URL List Monitor
      • URL Sequence Monitor
      • Web Server Monitor
      • WebLogic Application Server Monitor

      SiteScope is called an "agent-less" technology because it sends native UNIX commands (defined in \SiteScope\templates.os) to the monitored server instead of relying on an agent installed there.

      SiteScope exerts about a 10% overhead on servers responding to remote queries. Moreover, it duplicates the same requests made by LoadRunner's UNIX monitors (hitting servers with twice as much monitoring traffic).

      Perhaps for this reason the SiteScope monitor has a default measurement update rate of once every 10 minutes. Unless you change this default to 15 seconds (the most frequent allowed), you won't see measurements in Controller graphs.

      1. Invoke SiteScope.
      2. Click on the name of the monitor.
      3. Click on "Edit" next to the Counter name.
      4. In the "Update Every" entry, change the value to 15 seconds.
      5. Click on the "Update" button to register the change.

      Linux Status Commands

    • uptime provides an instantaneous summary such as

      11:42pm up 18 days, 8:45, 5 users, load average: 0.01, 0.03, 0.07

      This shows the current time, the number of days up since last boot, the number of users currently logged in, and the load average for the last 1, 5, and 15 minute intervals.

      Reminder: The load average (LA) is the average number of processes that are either currently running or ready to run but waiting for access to a busy CPU (that is, the number of running jobs plus the run queue length). Averages from 0 to 1.0 are acceptable for a single CPU. As a general rule of thumb, a machine is being overworked if load averages consistently exceed three times the number of CPUs.

    • top provides load average with auto refresh and additional data (sorted by %CPU):

      68 processes: 67 sleeping, 1 running, 0 zombie, 0 stopped
      CPU states: 12.2% user, 1.6% system, 0.0% nice, 86.1% idle

      Solaris comes with the prstat command to provide this info. Graphical versions of this include gtop within Gnome and the KDE Process Monitor.

      Article: Linux Load Average Not Your Average Average by Neil J. Gunther

      Reminder: For memory usage, press M.
      For CPU info, press P.
      To stop display, enter q.

      The 'SIZE' field is the total virtual memory size of each process, including all code, data, stack, mapped files, libraries etc.

    • procinfo -fn30 is used to gather system data from the /proc directory every 30 seconds.

      • Last Boot time
      • Load Average
        • average number of jobs running
        • number of runnable processes
        • total number of processes
        • PID of the last process run (idem)
      • Swap info
      • Memory resources
      • Number of disks
      • IRQ info
      • Installed modules (with the -a or -m option)
      • File Systems (with the -a or -m option)

    • xos provides a constantly updated colorful summary view of various components.
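
      The load-average rule of thumb given for uptime above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and scaling the "0 to 1.0 is acceptable" range by the CPU count is an assumption (the text states that range for a single CPU only).

```python
import re


def parse_load_averages(uptime_line):
    """Extract the 1, 5, and 15 minute load averages from an uptime line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: " + uptime_line)
    return tuple(float(value) for value in match.groups())


def load_status(load_avg, cpu_count=1):
    """Classify a load average using the rule of thumb from the text."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    if load_avg <= 1.0 * cpu_count:  # assumption: acceptable range scales per CPU
        return "acceptable"
    if load_avg > 3 * cpu_count:     # more than three times the number of CPUs
        return "overworked"
    return "elevated"


line = "11:42pm up 18 days, 8:45, 5 users, load average: 0.01, 0.03, 0.07"
one, five, fifteen = parse_load_averages(line)
print(load_status(fifteen))
```

      Because the rule speaks of averages that "consistently" exceed the threshold, the 15-minute figure is usually the better one to classify than the 1-minute figure.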



      Process stats

    • ps -a lists by ID processes spawned by the current user from the current shell.

      Option -f shows child processes. For each Unique Process ID (PID):

        "SIZE" = Virtual image size; KB of text+data+stack Or "SZ" in Solaris, for memory consumption.
        "RSS" = Resident Set Size (kilobytes of program in memory).
        "SHARE" = Amount of shared memory used by the task.
        "TTY" = Controlling tty on which the process was started.
        "STAT" = Status of each process. In Solaris, "S":
          S = Sleeping
          R = Running (active)
          D = uninterruptible sleep
          T = traced (stopped)
          Z = zombie process
          Second field = "W" if the process has no resident pages.
          Third field = "N" if the process has a positive nice value. "NI" in Solaris for The nice value (priority) of the process.

      In Solaris, option -l (for long) shows these additional columns:

        "F" for any flags set by the process.
        "ADDR" for the number of memory addresses used by the process.
        "WCHAN" for the memory addresses for processes that are sleeping.
        "PPID" for the parent process ID.
        "PRI" for the process priority

      ps -Af displays a full list of all processes on the system, including additional details.

        "PID" of the process
        "PPID" of the process
        "C" for the scheduling class of the process.
        "STIME" for the date/time when the process was started.

      ps -a lists the most frequently requested processes.

      Other flags include:

        -d Lists all processes except session leaders
        -t Lists all processes associated with a specific terminal
        -u Lists all processes for a specific user

        -f Prints comprehensive process information
        -c Lists processes in scheduler format
        -g Prints process information on a group basis for a single group
        -G Prints process information on a group basis for a list of groups
        -j Includes SID and PGID in printout
        -l Prints complete process information
        -L Displays LWP details
        -p Lists process details for list of specified process
        -P Lists the CPU ID to which a process is bound
        -s Lists session leaders

      To filter only processes of the current user in Linux:

        ps aufx | grep $USER
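
        One caveat with the grep filter is that it matches the user name anywhere on the line, including inside command arguments. A minimal sketch of filtering on the USER column instead is shown below; the sample ps output and the function name processes_for_user are made up for illustration.

```python
# Made-up sample in the general shape of "ps aux" output (columns abbreviated).
SAMPLE_PS = """\
USER       PID %CPU %MEM COMMAND
root         1  0.0  0.1 init
alice      412  1.2  0.8 bash
bob        530  0.0  0.3 vi /home/alice/notes.txt
"""


def processes_for_user(ps_text, user):
    """Return the ps lines whose first column (USER) equals the given user."""
    matches = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and fields[0] == user:
            matches.append(line)
    return matches


for line in processes_for_user(SAMPLE_PS, "alice"):
    print(line)
```

        In this sample a plain substring match on "alice" would also return bob's vi process, because the name appears in the file path.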

      To scroll up and down in Linux:

        ps -a | less


      DOWNLOAD: Microsoft's Process Monitor(ProcMon.exe) (v2.8 released by Mark Russinovich and Bryce Cogswell Nov. 2009). It captures in real-time and combines in one GUI every file system, Registry, and process/thread activity, including those of low-level programs (such as lsass, svchost, etc.) and background apps such as anti-virus. Unlike the legacy Sysinternals utilities Filemon and Regmon it replaces, it provides rich and non-destructive filtering, simultaneous logging to a file, and comprehensive session IDs and user names event properties.

      It highlights the issues with system operations, such as "BUFFER OVERFLOW", "BUFFER TOO SMALL", "FAST IO DISABLED", "NAME NOT FOUND", "FILE LOCKED WITH ONLY READERS".

      Click on an activity for its full thread stack, with integrated symbol support for each operation.

      Idea Shell script commands, pipes and other commands.

      The Most Executed Code in Solaris ... the CPU Idle Loop by Bill Holler


      rpc.rstatd stats

      To obtain statistics from each UNIX machine (through port 111), use rpc.rstatd subserver daemon invoked by the inetd subsystem controlled by /etc/inet/inetd.conf

      1. Get it from SourceForge, then install it using Joel Griffth's instructions:
      2. Build and install rstatd:

        $ tar xvzf rstatd.tar.gz
        $ cd rpc.rstatd
        $ ./configure --prefix=/usr
        $ make
        $ sudo su
        # make install
      3. Add a line to the hosts.allow file within /etc/ to specify the subnet(s) allowed to make rstatd requests. For example:


        Alternately, if you want to live dangerously:

        rpc.rstatd: ALL
      4. Add rstatd entry in /etc/xinetd.d/rstatd:

        # default: off
        # description: An xinetd internal service which rstatd's characters back to clients.
        service rstatd
        {
            type            = RPC
            rpc_version     = 2-4
            socket_type     = dgram
            protocol        = udp
            wait            = yes
            user            = root
            only_from       =
            log_on_success  += USERID
            log_on_failure  += USERID
            server          = /usr/sbin/rpc.rstatd
            disable         = no
        }
      5. Restart xinetd:

          # /etc/rc.d/init.d/xinetd restart

        To start rpc.rstatd under Red Hat Linux, run as root

          /etc/rc.d/init.d/rstatd start

      Rstatd vs. SAR

        rstatd SAR

      • Collisions rate - Collisions per second detected on the Ethernet network wire.
      • Incoming packets errors rate - Errors per second while receiving Ethernet packets.
      • Incoming packets rate - Incoming Ethernet packets per second.
      • Outgoing packets rate - Outgoing Ethernet packets per second.

      • Paging rate - Number of pages read to physical memory or written to pagefile(s), per second.
      • Page-in rate - Number of pages read to physical memory, per second.
      • Page-out rate - Number of pages written to pagefile(s) and removed from physical memory, per second.

      • Average load - Average number of processes simultaneously in 'Ready' state during the last minute.
      • Context switches rate - Number of switches between processes or threads, per second.
      • Swap-in rate - Number of processes being swapped into memory, per second.
      • Swap-out rate - Number of processes being swapped out from memory, per second.

      • CPU utilization - Percent of time that the CPU is utilized.
      • System mode CPU utilization - Percent of time that the CPU is utilized in system mode.
      • User mode CPU utilization - Percent of time that the CPU is utilized in user mode.

      • Disk rate - Rate of disk transfers, per second.
      • Interrupts rate - Number of device interrupts per second.

      These statistics are queried and displayed using the perfmeter OpenWindows XView utility


      SAR (System Activity Reporter)

      sar can be run either as an interactive command or as a background service.

      To interactively collect all counters (-A) into a file (-o outfile), sampling every 5 seconds for 2000 samples:

        sar -A -o outfile 5 2000

        If the "5 2000" frequency is not provided, the default is to write one record.
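
        To judge how long such a capture runs before starting it, the interval and sample count can be multiplied out. The helper below is a hypothetical sketch, not part of sar:

```python
def capture_duration(interval_seconds, samples):
    """Return total seconds and a readable h/m/s string for a sar capture."""
    total = interval_seconds * samples
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return total, f"{hours}h {minutes:02d}m {seconds:02d}s"


# The "5 2000" arguments used in the command above.
print(capture_duration(5, 2000))
```

        For "5 2000" this comes to 10000 seconds, a little under three hours, which is worth knowing when sizing a load test run.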

      The background Report Package can be enabled by uncommenting the appropriate lines in the sys crontab or activated with: svcadm enable sar.

        /etc/init.d/perf writes the restart mark to the daily data file using the command entry:

        su sys -c "/usr/lib/sa/sadc /var/adm/sa/sa`date +%d`"

        This uses the sadc (data collector) in the /usr/lib/sa folder to write to dated files in the /var/adm/sa folder. On Solaris 10 the service is "svc:/system/sar:default", which reads the same kstat data that iostat uses.


    Other Monitors

      Quest SQL Server Dashboard

    • Quest Spotlight on SQL Server Enterprise offers this dashboard, which provides a visual approach to organizing metrics. Throughput rate metrics are shown with arrows. Critical issues are in orange.
    • vxstat utility in HP-UX
    • MeasureWare Agent in Hewlett-Packard's OpenView network management suite
    • Perform Agent, previously named BEST/1 Agent, for BMC Patrol Perform/Predict Performance assurance


    Measured Objects


      T: prefixes UNIX rstatd daemon counters.
      W: prefixes Windows Perfmon counters.
      S: prefixes SiteScope counters.
      R: prefixes Linux/Solaris SAR counters.

      Counters.chm file from the Windows 2000 Resource Kit.
      Counters.hlp file from the Windows NT Workstation 4.0 Resource Kit.

    Network (Ethernet) Interface

    Columns: Object, Potential, Subset, Total, Threshold for Action, Action

    • Collisions
        T: Collisions rate (of packets) per second detected on the wire.
        Threshold for action: > 1%
        Action: Reduce # of machines on subnet or use higher bandwidth network
    • Utilization Rate in Bytes/sec
        Potential: W: Current Bandwidth (theoretical bits per second) / 8 bits/byte
        Subset: W: Bytes Received/sec, W: Bytes Sent/sec
        Total: W: Bytes Total/sec (including framing characters)
    • Utilization & Error Rate in Packets/sec
        Incoming:
          T: Incoming packets rate per sec.
          W: Packets Received/sec
          sar -y canch/s = Input characters processed by canon (canonical queue)/sec
          sar -y rawch/s = Input characters (raw queue)/sec
          T: Incoming packets errors rate per sec. while receiving Ethernet packets.
          Subset: W: Packets Received Unicast/sec + W: Packets Received Non-Unicast/sec
          Total: Good Packets in/sec
        Outgoing:
          T: Outgoing packets rate per second.
          W: Packets Sent/sec
          sar -y outch/s = Output characters (output queue)/sec
          Subset: W: Packets Sent Unicast/sec + W: Packets Sent Non-Unicast/sec
          Total: Good Packets out/sec
    • Error Count
        W: Packets Received Errors
        W: Packets Received Discarded
        W: Packets Received Unknown
        Threshold for action: > 1
        Action: Adjust network buffers
        W: Packets Outbound Errors
        W: Packets Outbound Discarded
    • Interrupts
        sar -y [Terminal Activity]:
        xmtin/s = Transmitter hardware interrupts per second
        mdmin/s = Modem interrupts per second
        rcvin/s = Receiver hardware interrupts per second
    • Delay
        W: Output Queue Length
        Threshold for action: > 2


      SAR -y is Terminal Activity

      This is not available in SiteScope.

      netstat

      This Solaris utility normalizes values to per-interval rates over the sampling interval specified on the command line.

      Picking A Network Monitor

      To determine if the number of network buffers needs to be set higher, watch the number of error-free packets the system is dropping, which is the sum of Packets Outbound Discarded and Packets Received Discarded. Subtracting that sum from the total packet throughput gives the packets actually delivered.

      Counters Packets Outbound Errors and Packets Received Errors indicate network card hardware problems.

      Don't rely on the Current Bandwidth counter because it shows theoretical rather than actual bandwidth.

      A reasonable limit for an Ethernet network is %Network Use less than 30 percent. A higher value means you need to speed up the network or reduce the amount of traffic.
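
      Where the %Network Use counter is not available, utilization can be approximated from the Bytes Total/sec and Current Bandwidth counters. The sketch below assumes that approach; the function names are hypothetical, and since Current Bandwidth is a theoretical figure, the result is only an estimate.

```python
ETHERNET_UTILIZATION_LIMIT = 30.0  # percent, the limit quoted in the text


def network_utilization_percent(bytes_total_per_sec, bandwidth_bits_per_sec):
    """Estimate utilization as observed bits/sec over theoretical bits/sec."""
    if bandwidth_bits_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    # Bytes Total/sec is in bytes; Current Bandwidth is in bits per second.
    return bytes_total_per_sec * 8 * 100 / bandwidth_bits_per_sec


def is_over_limit(bytes_total_per_sec, bandwidth_bits_per_sec,
                  limit=ETHERNET_UTILIZATION_LIMIT):
    """True when estimated utilization exceeds the limit."""
    return network_utilization_percent(
        bytes_total_per_sec, bandwidth_bits_per_sec) > limit


# Example figures (made up): 5,000,000 bytes/sec on a 100 Mbit/s link.
print(network_utilization_percent(5_000_000, 100_000_000))
print(is_over_limit(5_000_000, 100_000_000))
```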

      Use the %Broadcast Frames and %Multicast Frames counters to view the percentages of broadcast and multicast traffic. Network cards pass broadcast and multicast frames to a higher-level software component before they act on or discard them. This extra activity results in additional CPU use.

      As the requesting computer connects to find the server computer's network address, it generates broadcast traffic. Frame traffic increases as the server transfers the files.

      Similarly, don't use the Output Queue Length counter because it's always zero, since transmission requests are not handled by the network card but by network device interface specification (NDIS) software.



      Network Monitor

      To add Network Segment counters, you must install the Network Monitor Agent.

      Microsoft provides two versions of NetMon. Install the "full" promiscuous version of Netmon from Microsoft's Systems Management Server (SMS) 1.2 and 2.0 product to capture packets on all NICs on remote network subnets.

      Otherwise, the network card typically rejects network traffic intended for other network cards.

      1. Install Network Monitor from Control Panel > Add/Remove Programs > Add/Remove Windows Components > Management And Monitoring Tools This puts netmon.exe and its dlls in the %windows%\system32\netmon folder.
      2. Apply patch from MS Security Bulletin MS00-83 to patch the buffer overrun vulnerability from malicious malformed data.
      3. Invoke Netmon from a command prompt or
        Start > Programs > Administrative Tools
      4. By default capture files are saved with the .cap file suffix in the My Captures folder under the My Documents folder of the current user.

      Netmon filter specifications are stored in the NetMon\Captures subdirectory. Netmon allows filtering by protocol, TCP/IP address, and data pattern.

      Caution! Because this activity drains the resources of the computer you're analyzing, limit Network Segment monitoring. As Network Segment counters process network traffic, they use additional CPU and other system resources.


    Columns: Object, Potential, Subset, Total, Threshold for Action


    W: webpage article
    RAM Installed (see Determining Memory) - W: Available K/M/Bytes < 300 MBytes Add memory
    W: Committed (Virtual Memory) Bytes (for paging)
    Memory Leakage W: Pool Paged Bytes Increases over a long period of time Java?
    Page Fault operations involving physical (hard) disk activity rather than just in (soft) memory Page fault recovery operations W: Page Reads (from disk)/Sec
    sar -p pgin/s requests/sec
    W: (Hard and Soft) Page Faults/Sec > 20
    W: Page Writes (to disk)/Sec
    sar -p atch/s (attaches per second) of page faults satisfied by reclaiming a page in memory.
    W: Write Copies/sec none
    Pages involved in Page Fault physical (hard) disk activity rather than just in (soft) memory U: Paging Rate
    W: Pages/Sec (Incidents of Hard Faults)
    U: Page-in rate
    W: Pages Input (from disk)/Sec
    sar -p ppgin/s
    W: (Hard and Soft) Page Faults/Sec > 5 Increase paging file size
    U: Page-out rate
    W: Pages Output (to disk)/Sec
    W: (Non-disk/soft Recoverable) Transition Faults/sec
    sar -p pflt/s page faults from protection errors per sec (illegal access to page) or "copy-on-writes".
    sar -p vflt/s valid page not in memory (address translation) page faults/sec.
    sar -p slock/s software lock requests requiring physical I/O faults/sec

      Is there enough swap space?

      This is the usual fix for "out of memory" messages.

      How much memory is each process really using (is Private)?

      To install RMCmem:

      # cd /tmp
      # zcat RMCmem3.8.2.tar.gz | tar xvf -
      # pkgadd -d .
      To obtain private memory totals by mapped file:

      # /opt/RMCmem/bin/pmem 361
      To obtain private memory by PID:

      # /opt/RMCmem/bin/memps

      Some of the "RSS" (Resident Set Size) real physical memory includes memory containing shared libraries which are shared with other processes.

      To determine the "Private" bytes that are NOT shared with other processes (excluding application binaries),
      download and install the RMCmem package of kernel modules for Solaris 2.5, which provides extra instrumentation.

      sar -r [unused memory pages and disk blocks]:

        freemem = average number of memory pages available to user processes;
        freeswap = number of 512-byte disk blocks available for page process swapping.
        vswap = virtual pages available to user processes [not in solaris Sar].
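
        Because freeswap is reported in 512-byte blocks rather than bytes, a quick conversion helps when comparing it with memory figures. A minimal sketch (the function name is hypothetical):

```python
BLOCK_SIZE_BYTES = 512  # sar -r reports freeswap in 512-byte disk blocks


def freeswap_mib(freeswap_blocks):
    """Convert a freeswap block count to mebibytes."""
    return freeswap_blocks * BLOCK_SIZE_BYTES / (1024 * 1024)


# Example figure (made up): 409600 free blocks.
print(freeswap_mib(409600))
```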

      sar -b [Buffer activity]:

        bread/s, bwrit/s = average number of basic blocks transferred per second between system buffers and disk or other block devices (submitted to the buffer cache from the disk);
        lread/s, lwrit/s = average number of logical blocks transferred from the buffer cache (system buffers to user memory);
        wcncl/s = pending writes in system buffers cancelled [not in SiteScope]
        %rcache, %wcache = Percentage of logical reads/writes that are satisfied by the buffer cache (the cache hit ratios, that is, (1 - bread/lread) and (1 - bwrit/lwrit) as percentages);
        pread/s, pwrit/s = Average number of physical read/write requests per second that use character device interfaces (basic block transfers via the raw (physical) device mechanism).
        The most important entries are the cache hit ratios %rcache and %wcache, which measure the effectiveness of system buffering. If %rcache falls below 90 percent, or if %wcache falls below 65 percent, it might be possible to improve performance by increasing the buffer space.
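
        The hit-ratio arithmetic and the 90/65 percent thresholds can be sketched as follows. The function names and example figures are made up for illustration; the formula is the (1 - physical/logical) ratio described for sar -b.

```python
READ_HIT_THRESHOLD = 90.0   # percent, from the text
WRITE_HIT_THRESHOLD = 65.0  # percent, from the text


def cache_hit_percent(physical_per_sec, logical_per_sec):
    """Return (1 - physical/logical) as a percentage, or None if no logical I/O."""
    if logical_per_sec == 0:
        return None  # ratio is undefined when there were no logical transfers
    return 100.0 * (logical_per_sec - physical_per_sec) / logical_per_sec


def buffer_cache_warnings(bread, lread, bwrit, lwrit):
    """List which hit ratios fall below the thresholds quoted in the text."""
    warnings = []
    rcache = cache_hit_percent(bread, lread)
    wcache = cache_hit_percent(bwrit, lwrit)
    if rcache is not None and rcache < READ_HIT_THRESHOLD:
        warnings.append("%rcache low")
    if wcache is not None and wcache < WRITE_HIT_THRESHOLD:
        warnings.append("%wcache low")
    return warnings


# Example figures (made up): reads hit the cache well, writes do not.
print(cache_hit_percent(5, 100))
print(buffer_cache_warnings(bread=5, lread=100, bwrit=40, lwrit=100))
```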

      sar -h [system heap statistics, not available in SiteScope]:

        heapmem = amount of memory currently allocated to all kernel dynamic heaps (block managed arenas, general zone heaps, and private zone heaps);
        overhd = block managed arena overhead;
        unused = block managed arena memory available for allocation;
        alloc/s = number of allocation requests per second;
        free/s = number of free requests per second.

      sar -p commands obtain paging activities stats from UNIX systems.
      Subsets of vflt/s = address translation page faults (valid page not in memory) [not in SiteScope]

        dfill/s = address translation fault on demand fill or demand zero page
        cache/s = address translation fault page reclaimed from page cache
        pgswp/s = address translation fault page reclaimed from swap space
        pgfil/s = address translation fault page reclaimed from filesystem
        rclm/s = pages reclaimed by paging daemon
      subsets of pflt/s = (hardware) protection faults -- including illegal access to page and writes to (software) writable pages [not available from SiteScope]
        cpyw/s = protection fault on shared copy-on-write page
        steal/s = protection fault on unshared writable page

      sar -k [kernel memory allocation (bytes)]

      • sml_mem = bytes the KMA has available in the small memory request pool (of less than 256 bytes).
        alloc = bytes the KMA has allocated from its small memory request pool to small memory requests.
        fail = number of requests for small amounts of memory that failed.
      • lg_mem = bytes the KMA has available in the large memory request pool (from 512 bytes to 4 Kbytes).
        alloc = bytes the KMA has allocated from its large memory request pool to large memory requests.
        fail = number of failed requests for large amounts of memory.
      • ovsz_alloc = bytes allocated for oversized requests (larger than 4 Kbytes). These requests are satisfied by the page allocator. Thus, there is no pool.
        fail = number of failed requests for oversized amounts of memory.


    Set screen Server

    W: MS IIS 6 counters of the WWW Service, its Web Service Cache, FTP Service, Internet Information Services Globals, SNMP, Active Server Pages, and ASP.NET.

    • Server -> Pool Nonpaged Failures shows the number of times allocations from nonpaged pool have failed, which indicates that the computer's physical memory is too small.
    • Server -> Pool Paged Failures indicates that either physical memory or a paging file is near capacity.
    • Server -> Pool Nonpaged Peak shows the maximum number of bytes in nonpaged pool the server has had in use at any one point. Indicates how much physical memory the computer should have.

    Set screen Potential   Subset   Total   Threshold for Action


    W: Windows 2003 webpage article
    W: Processor Queue Length (ready but non-running threads on all CPUs) 
	From “The Optimization and Tuning of Windows NT” white paper by Scott B. Suhy: The Windows Processor Queue Length counter is part of the Windows System object instead of the Processor object because there is only a single queue for processor time (even on multiprocessor computers). Unless the processor is running at very high sustained utilization, this counter is likely to return a result of 0 because it displays only the last observed value, not an average.

    sar -q [Queue length]: runq-sz (run queue size) = number of kernel threads in memory waiting for a CPU to run.
    > 2 - A sustained processor queue of more than two threads is an indication of a processor bottleneck.
    Consistently higher values mean that the system might be CPU-bound.
    More powerful (higher MHz) CPU capacity.
    Occupation sar -q [Queue length] %runocc (run queue occupied) = processes in memory and runnable. > 90% -
    Thrashing W: Context Switches/sec (all processors) This is a cumulative number, so it needs to be converted to a rate (per second) after subtracting the amount of thrashing overhead that occurs when a server is at rest. 
swap queue of processes swapped out but ready to run. mpstat csw (context switches) > 5% of total threads Less apps per CPU
    Latency prstat -mL LAT column (time waiting for CPU) mpstat csw (context switches) - -

      Note: swpq-sz & %swpocc (swap) are no longer reported by sar.
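      The note above that Context Switches/sec is cumulative and must be converted to a rate, less the overhead seen at rest, can be sketched as follows. This is only an illustration: the function name, the 15-second interval and the sample readings are assumptions, not values from any real server.

```python
def rate_per_second(prev_count, curr_count, interval_seconds, baseline_rate=0.0):
    """Convert two readings of a cumulative counter into a per-second rate.

    baseline_rate is the rate measured while the server is at rest;
    it is subtracted so that only load-induced switching remains.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    raw_rate = (curr_count - prev_count) / interval_seconds
    # Never report a negative rate if the load is below the at-rest baseline.
    return max(raw_rate - baseline_rate, 0.0)


# Hypothetical readings taken 15 seconds apart, with 800/sec seen at rest.
print(rate_per_second(1_200_000, 1_275_000, 15, baseline_rate=800.0))
```

      The same helper applies to any counter that only ever increases, provided the two readings are taken on the same machine without a counter reset in between.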

    Set screen Potential   Subset   Total   Threshold for Action


    W: webpage article
    (individual CPU)
    U: CPU utilization
    W: % Processor Time
    U: System mode CPU utilization
    W: % Privileged Time (in kernel-mode)
    sar -u %sys (system mode)
    W: Elapsed Time (100%) > 80% Scale CPUs
    U: User mode CPU utilization
    W: % User Time (for apps)
    sar -u %usr (mode)
    W: % Idle Time
    sar -u %idle (not waiting for I/O)
    Utilization amount W: Working Set Peak (bytes) W: Working Set (bytes) Difference ?

      Note: %wio (wait i/o) is not presented (always zero) on Solaris 10.

      Solaris vmstat presents memory, run-queue, and summarized processor utilization. It uses kstat which maintains CPU utilization for each CPU.

      Solaris mpstat presents per-processor stats and utilization.
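      As a rough sketch of applying the "> 80%" threshold above to sar -u output, the snippet below parses one data line and adds user and system time. It assumes the Solaris-style column order (time, %usr, %sys, %wio, %idle) and assumes the threshold applies to combined user plus system time; the sample line is made up for illustration.

```python
SCALE_THRESHOLD = 80.0  # the "> 80%" threshold for action listed above


def parse_sar_u(line):
    """Split one Solaris-style `sar -u` data line into named percentages."""
    timestamp, usr, sys_, wio, idle = line.split()
    return {
        "time": timestamp,
        "usr": float(usr),
        "sys": float(sys_),
        "wio": float(wio),
        "idle": float(idle),
    }


def cpu_busy(sample):
    """Busy time is user-mode plus system-mode utilization."""
    return sample["usr"] + sample["sys"]


# Hypothetical sample line, not captured from a real system.
sample = parse_sar_u("10:00:05 62 23 0 15")
print(cpu_busy(sample), cpu_busy(sample) > SCALE_THRESHOLD)
```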


    Set screen Potential   Subset   Total   Threshold for Action


    W: Windows 2003 webpage article
    Swapping count of LWP Transfers per sec sar -w swpin/s - > 1 More memory to swap
    sar -w swpot/s
    512 byte Blocks sar -w bswin/s - - -
    sar -w bswot/s
    Switching sar -w pswch/s = (kernel thread) switches sar -w pswpout/s = process swapouts/sec [not in SiteScope] - - -
    Size prstat -s RSS
    prstat -z (per zone)
    - - - -

      Solaris ps presents per-process stats.

      Solaris prstat presents thread-level microstate accounting (with high-resolution time stamps) and per-project stats for resource management.

      sar -v [entries/size for each table, evaluated once at sampling point, not available in SiteScope]:

        proc-sz = number of process entries (proc structures) that are currently being used, or allocated in the kernel.
        inod-sz = total number of inodes in memory versus the maximum number of inodes that are allocated in the kernel. This number is not a strict high water mark. The number can overflow.
        file-sz = size of the open system file table. The sz is given as 0, since space is allocated dynamically for the file table.
        ov = overflows that occur between sampling points for each table.
        lock-sz = number of shared memory record table entries currently being used or allocated in the kernel. The sz is given as 0 because space is allocated dynamically for the shared memory record table.

      sar -c [System calls]:

        scall/s = All types of system calls per second, which is generally about 30 per second on a system with 4 to 6 users.
        sread/s, swrit/s = read and write system calls per second.
        fork/s = fork system calls per second.
        exec/s = exec system calls per second. If exec/s divided by fork/s is greater than three, look for inefficient PATH variables.
        rchar/s, wchar/s = characters (bytes) transferred by read and write system calls per second.
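      The exec/s to fork/s rule of thumb above can be checked with a few lines of code. This is a minimal sketch; the function names are illustrative and the input values would come from your own sar -c output.

```python
def exec_fork_ratio(exec_per_sec, fork_per_sec):
    """Return exec/s divided by fork/s, or None when no forks occurred."""
    if fork_per_sec == 0:
        return None
    return exec_per_sec / fork_per_sec


def path_looks_inefficient(exec_per_sec, fork_per_sec, limit=3.0):
    """Apply the rule of thumb: a ratio above three suggests a poor PATH."""
    ratio = exec_fork_ratio(exec_per_sec, fork_per_sec)
    return ratio is not None and ratio > limit


# Hypothetical readings: 2.0 execs and 0.5 forks per second give a ratio of 4.
print(path_looks_inefficient(2.0, 0.5))
```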

      sar -m [Message and semaphore activities (for Interprocess Communication)]:

        msg/s = message primitives (send and receive operations) per second.
        sema/s = semaphore primitives per second.
        These figures will usually be zero (0.00), unless you are running applications that use messages or semaphores.

      sar -t [translation lookaside buffer (TLB) activities, not available in SiteScope]:

        tflt/s = user page table or kernel virtual address translation faults: address translation not resident in TLB;
        rflt/s = page reference faults (valid page in memory, but hardware valid bit disabled to emulate hardware reference bit);
        sync/s = TLB flushes on all processors;
        vmwrp/s = syncs caused by clean (with respect to TLB) kernel virtual memory depletion;
        flush/s = single processor TLB flushes;
        idwrp/s = flushes because TLB ids have been depleted;
        idget/s = new TLB ids issued;
        idprg/s = tlb ids purged from process;
        vmprg/s = individual TLB entries purged.

      sar -I [interrupt statistics, not available in SiteScope]:

        intr/s = non-vme interrupts per second;
        vmeintr/s = vme interrupts per second;

      sar -a [File access system routines]:

        iget/s = number of requests made for inodes that were not in the directory name look-up cache (DNLC).
        namei/s = number of file system path searches per second. If namei does not find a directory name in the DNLC, it calls iget to get the inode for either a file or directory. Hence, most igets are the result of DNLC misses.
        dirbk/s = number of directory block reads issued per second.

    Set screen Object   Potential   Subset   Total   Threshold for Action

    Physical Disk

    W: Windows 2003 Bottleneck-Detection Counters article
    Bottleneck W: Current Disk Queue Length W: Avg. Disk Read Queue Length W: Avg. Disk Queue Length

    sar -d avque
    >2 More hard drives.
    W: Avg. Disk Write Queue Length
    Utilization Percentage W: % Disk Time
    sar -d %busy = portion of time device was busy servicing transfer requests.
    W: % Disk Read Time Elapsed Time >90%
    W: % Disk Write Time
    W: % Idle Time n/a
    Utilization Per Incident W: Avg. Disk Bytes/Transfer W: Avg. Disk Bytes/Read < 20K may indicate an app. is accessing too little at a time.
    W: Avg. Disk Bytes/Write
    Utilization Incident Rate U: Disk Rate
    W: Disk Transfers/sec
    sar -d blks/s (of 512 bytes)
    sar -d r+w/s (read AND write I/O requests)
    W: Disk Writes/sec
    sar -d read/s
    Ratio of writes keeping up with reads? ?
    W: Disk Reads/sec
    sar -d write/s
    Speed (Latency) W: Avg. Disk sec/Transfer W: Avg. Disk sec/Read > 0.3 seconds may indicate disk controller needs too many retries. Faster drives.
    W: Avg. Disk sec/Write
    Delay sar -d
    avwait = average wait time in milliseconds.
    avserv = average service time in milliseconds.
    - -
    Fragmentation W: Split IO/sec ? Defragment

      sar -d activity for each block device (disk or tape drive, with the exception of XDC disks and tape drives): When data is displayed, the device specification dsk- is generally used to represent a disk drive. The device specification used to represent a tape drive is machine dependent.

      Disk I/O speeds are about 10-100 times slower than memory. Disk I/O speeds will be very fast when data is stored on filer disk arrays because such devices usually have a large amount of memory to cache data.

      A single spindle can generally handle 50 accesses per second.
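      Using the figure of 50 accesses per second per spindle, a back-of-the-envelope estimate of how many drives a workload needs might look like this. It is a sketch only: it ignores controller caches, RAID overhead and uneven access patterns, and the workload figure is hypothetical.

```python
import math


def spindles_needed(accesses_per_second, per_spindle=50):
    """Estimate the number of drives required to sustain an access rate."""
    if per_spindle <= 0:
        raise ValueError("per_spindle must be positive")
    # Round up: a fraction of a drive still means one more physical drive.
    return math.ceil(accesses_per_second / per_spindle)


# Hypothetical workload of 420 disk accesses per second.
print(spindles_needed(420))
```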


      Set screen Solaris/UNIX Hard Disk Monitoring

      One indication of whether a database server is "I/O-bound" comes from the Unix vmstat utility:

      vmstat 5 5

      	kthr     memory             page              faults        cpu
      ----- ----------- ------------------------ ------------ -----------
      r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
      0  0 217485   386  0   0   0   4   14   0 202  300 210 14 19 22 45

      In the example above, the "wa" (wait) column shows that 45% of CPU time is spent waiting for I/O (here, database I/O) to complete rather than doing work.
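      A short sketch of pulling the cpu columns out of the vmstat line shown above. The header and data line are copied from the example; the function names are illustrative. Because "sy" appears twice in the header (system calls under faults, system time under cpu), the columns are kept in order rather than put straight into a dictionary.

```python
HEADER = "r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa"
SAMPLE = "0  0 217485   386  0   0   0   4   14   0 202  300 210 14 19 22 45"


def parse_vmstat(header, line):
    """Pair each column name with its integer value, preserving order."""
    names = header.split()
    values = [int(v) for v in line.split()]
    if len(names) != len(values):
        raise ValueError("header and data line have different column counts")
    return list(zip(names, values))


def cpu_columns(pairs):
    """The last four columns are the cpu group: us, sy, id, wa."""
    return dict(pairs[-4:])


cpu = cpu_columns(parse_vmstat(HEADER, SAMPLE))
print(cpu["wa"])
```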

      To fully understand the nature of I/O, you must first understand Oracle's asynchronous writing mechanism. Then you need to look at the SAP applications and explore how SAP tables are populated and managed from within the SAP application. With this knowledge, you can then make intelligent decisions about the proper file placement.



      Set screen Windows Hard Disk Monitoring

      Action     Physical        Logical   Both
      Enable:    -yd (default)   -yv       -y
      Disable:   -nd             -nv       -n
      Reminder: By default, Windows 2000 only monitors “PhysicalDisk” performance counters such as “% Idle Time”, “Avg. Disk Queue Length”, and “Disk Bytes/sec”. To also get these counters for each “LogicalDisk” (plus “% Free Space” and “Free Megabytes”), issue this command:

        diskperf.exe -y \\computername

      This table presents diskperf parameters to specify the hard disk performance counters to start when the machine is restarted.

      Windows XP, however, displays this message:

        Both Logical and Physical Disk Performance counters on this system are automatically enabled on demand. For legacy applications using IOCTL_DISK_PERFORMANCE to retrieve raw counters, you can use -Y or -N to forcibly enable or disable. No reboot is required.


      Set screen DHCP Audit Alerts

      Reminder By default, Windows 2000 does NOT create DHCP audit logs. Enable DHCP audit logging by right-clicking the server for Properties, in the Server name Properties, General tab.

      Use this command to run the DHCP Server Locator Utility from the Windows 2000 Resource Kit. Every 6000 seconds, it detects active DHCP servers on the network and sends the IP addresses of any DHCP servers not listed in the file auth_dhcp_ip_list.txt to the email addresses in file A:\Admins.txt.

        dhcploc.exe -p -a:"A:\Admins.txt" -i:6000 auth_dhcp_ip_list.txt



    Set screen Other Measurements

      Object Metric Threshold for Action Potential
      Memory Pool Size handles
      Thread pool Context Switches/sec .
      Temp space Page Faults/sec .


    Set screen Log File Formats

      The Performance Logs and Alerts MMC snap-in configures and collects two types of logs for later viewing:
      • System events, such as processes created or deleted, are traced using Trace logs.
      • The status of data is continuously sampled at fixed intervals (such as every 15 seconds), whether events occur or not. They are saved into Counter logs.

      Counter Logs

      Counter log data can be stored in four formats, all viewable using System Monitor:
      1. Perfmon format
      2. binary (.blg) -- the default format used by the Windows 2000 Performance Monitor. This format provides all of the information contained in the Perfmon format, but in a more space-efficient manner.
      3. comma-separated-value (.CSV) containing fields delimited by double-quotes and commas.
      4. tab-separated-value (.TSV) containing fields delimited by double-quotes and tab characters.

      Both the Perfmon and binary log file formats are proprietary formats developed by Microsoft.

      Two types of binary log files exist: circular and linear.

      1. Circular log files record data until they reach a user-defined size, then overwrite the oldest records with new data.
      2. Linear log files can be limited to a user-defined size or be set to grow to 1 GB or to consume all remaining disk space, whichever comes first.

      The first line of CSV-format and TSV-format log files serves as the header, providing information about the format of the file, the version of the PDH (Microsoft Performance Data Helper) interface used to create the log file, and the names and paths of each of the counters to the PDH.
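      To illustrate how the header line and data rows of a CSV-format counter log fit together, here is a minimal reader. The sample text is hypothetical (the timestamps, values and format string are assumptions made up to resemble such a file), and the function name is illustrative.

```python
import csv
import io

# Hypothetical log content; real files may differ in format string and counters.
SAMPLE_LOG = (
    '"(PDH-CSV 4.0) (Pacific Standard Time)(480)","\\\\XPPRO\\Memory\\Available Bytes"\n'
    '"05/01/2006 10:00:00.000","512000000"\n'
    '"05/01/2006 10:00:15.000","498000000"\n'
)


def read_counter_log(text):
    """Return (format info, counter paths, samples) from CSV counter-log text."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    # First header field describes the file format; the rest are counter paths.
    format_info, counter_paths = header[0], header[1:]
    samples = []
    for row in rows:
        values = [float(v) for v in row[1:]]
        samples.append((row[0], dict(zip(counter_paths, values))))
    return format_info, counter_paths, samples


format_info, counters, samples = read_counter_log(SAMPLE_LOG)
print(counters[0], samples[-1][1][counters[0]])
```

      A production parser would also need to tolerate blank value fields, which this sketch does not handle.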

      The PDH library can open a log file in the Perfmon format only for reading.

      The following command-line tools are included in all versions of Microsoft Windows XP (excluding Windows XP Home Edition):

      relog.exe logfile.blg -f csv -o logfile.csv
      converts the .blg file to a .csv file. It can also resample a log file, and then create a new log file based on specified counters, a time period, or a sampling interval.
      Download it (429 KB) from the Microsoft Windows 2000 Resource Kit Tools for administrative tasks

      logman.exe start Sample_Log
      starts the Sample_Log data collection. Adding the -s option with a remote computer name starts the collection remotely from a central location.
      According to logman.exe /?, logman.exe can also:

      • Configure a data collection on one computer and then
      • copy that configuration to multiple computers — from a central location.
      • Query currently-running logs and traces.

      typeperf "Memory\Available Bytes" -s XPPRO -si 00:05
      outputs the Available Bytes Memory counter from a remote computer named "XPPRO" every 5 seconds. It can also:

      • Write performance data to either the command window or to a supported log file format.
      • Display the counters currently available on a particular local or remote computer.

      Trace Logs

      Trace logs generate binary .etl files. System Monitor CANNOT read these files.

      tool Use the TRACEDMP.EXE utility from the Windows 2000 Server and Professional Resource Kits to read .etl files to create DUMPFILE.CSV files for viewing by other applications. The utility also creates a SUMMARY.TXT file.



      Set screen Log Codes

      Each bit of an 8-bit word is used as a flag:

      Code   Type of Information
      0 Success
      1 Error
      2 Warning
      4 Information
      8 Audit_Success
      16 Audit_Failure
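      Since the codes above are individual bits, a value that combines several of them (for example in a filter) can be decoded by testing each bit. This is a sketch using only the codes listed in the table; the function name is illustrative.

```python
# Bit values and names taken from the table above; 0 means Success.
EVENT_TYPES = {
    1: "Error",
    2: "Warning",
    4: "Information",
    8: "Audit_Success",
    16: "Audit_Failure",
}


def decode_event_types(mask):
    """List the type names whose bits are set in mask."""
    if mask == 0:
        return ["Success"]
    return [name for bit, name in EVENT_TYPES.items() if mask & bit]


print(decode_event_types(3))
```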


    Set screen Thresholds Trigger Alerts

      The Performance Logs and Alerts snap-in contains an Alerts node used to configure thresholds for system events such as a disk partition reaching capacity. The thresholds fire triggers. The triggers are configured to perform any function that can run in Windows 2000, such as sending a network message or running a program.
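      The threshold-fires-trigger idea can be sketched in a few lines. This is not how the snap-in is implemented; it only illustrates the pattern, and the counter name, limit and action used here are hypothetical.

```python
def make_alert(counter_name, limit, action, over=True):
    """Build a checker that calls action when a value crosses the limit.

    over=True fires when the value rises above the limit;
    over=False fires when it drops below (e.g. free disk space).
    """
    def check(value):
        breached = value > limit if over else value < limit
        if breached:
            action(counter_name, value, limit)
        return breached
    return check


fired = []  # stands in for sending a network message or running a program
low_disk = make_alert(
    "% Free Space", 10.0,
    lambda name, value, limit: fired.append((name, value)),
    over=False,
)
low_disk(42.0)   # plenty of space, no trigger
low_disk(7.5)    # below the 10% limit, trigger fires
print(fired)
```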

      Statscout Performance Monitor from Enterprise Management Associates



    Set screen Latencies

      The total amount of latency is the sum of ("A" time) within application machines and ("N" time) within the Network.



    Set screen J2EE Monitoring

      I generally recommend a "big to small" approach because the more fine-grained the monitoring, the more costly it is:

      So here are the approaches, from the least costly to the most costly:

      1. First, find the average end-to-end response time by emulating end-user client exchanges with the web server. Identify the machine and service which consume the most CPU, network bandwidth, and other resources during stress runs which incrementally add users until a server reaches its maximum rate of processing (as measured by the hits/pages processed per second metric).

      2. Identify the average response time of key services by emulating calls directly to each service (XML calls to app servers, SQL calls to databases). Watch them during stress runs.

      3. Work with developers to add application code which displays key performance information along with user data (like the times that Google displays with each search result). This allows web (HTML) based client scripts to simply obtain the information.

      4. Work with developers to add application code which issues transaction-level performance information to a log. Most mature application packages allow administrators to control the verbosity of the application's logs.

      Ideally, the alerts are formatted to make it easy for logs to be combined with other logs for analysis after the run is finished. If not, logs may need to be run through a custom parser.

      5. Formatting alerts to the ARM (Application Response Measurement) standard allows the alerts to be issued (using the free API) and collected in real-time by business service management systems in production. See ARM Instrumentation on this page. This is the best approach, IMHO.

      6. Code LoadRunner scripts in Java to emulate the client. This complex approach is time-consuming because parts of the client application need to be rewritten in the test scripts (user authentication, file encryption, client session and cookie management, calls to servers, etc.). Such scripts need to precisely identify and specifically format calls to services.

      7. Install agents inside J2EE servers which send status to the Mercury Business Availability Center.

      8. The new version of WebLogic integrates with LoadRunner to provide performance monitoring that can be turned on or off dynamically on production machines.

      9. J2EE monitoring packages include:

      • SiteScope from Mercury
      • Symantec I3 (purchased from Veritas)
      • Compuware's Vantage Analyzer for J2EE (the JView product acquired from DevStream on October 4, 2004)
      • Dirig
      • Quest has added a J2EE monitor to its database monitoring product well-known to DBAs.

      • Borland's Optimizeit ServerTrace provides J2EE performance metrics during testing and deployment, with monitoring and analysis of a distributed environment.

        Borland's Optimizeit Enterprise Suite is used during development to provide individual developers with a focused view into performance issues in their code.

      10. Custom use of Java 2 MBeans, Java Virtual Machine Profiling Interface (JVMPI), or Java Virtual Machine Tool Interface (JVMTI).
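      For the first approach in the list above (adding users until the server reaches its maximum rate of processing), one way to spot that maximum in the results is to look for the step where throughput stops growing. The sketch below assumes a 5% minimum gain as the cut-off, which is an arbitrary choice, and the run data is made up.

```python
def saturation_point(steps, min_gain=0.05):
    """Find where adding users stops raising throughput.

    steps is a non-empty list of (users, hits_per_second) in increasing
    user order. Returns the last step whose successor gained less than
    min_gain (as a fraction); falls back to the final step if throughput
    never levelled off.
    """
    for (users, rate), (_, next_rate) in zip(steps, steps[1:]):
        if next_rate < rate * (1 + min_gain):
            return users, rate
    return steps[-1]


# Hypothetical stress run: throughput levels off after 30 users.
RUN = [(10, 95.0), (20, 188.0), (30, 270.0), (40, 276.0), (50, 271.0)]
print(saturation_point(RUN))
```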


    Set screen ARM Instrumentation

      The most "industrial strength" approach is to make application code issue messages formatted to the industry-standard ARM (Application Response Measurement) API calls so that Enterprise Business Service Management Applications can recognize them.

      To measure application availability, application performance, application usage, and end-to-end transaction response time in a vendor-neutral way, the ARM (Application Response Measurement) API (first released June 1996) was created by an Open Source Working Group based on their UMA (Universal Measurement Architecture), in which ARM API calls are received by ARM agents that feed Enterprise Management Applications.

      The ARM Technical Standard in July 1998 originally defined a set of six library procedure calls for programmers to call from within their source code to initialize the ARM subsystem and define the beginning and end of Business Transactions being measured:

        arm_init Names an application (with a handle) and initializes the ARM environment for the new application handle.
        arm_getid Assigns each named business transaction a unique transaction identifier to be monitored within the app.
        arm_start Starts the clock for a unique transaction instance.
        arm_update Update statistics for a long running transaction
        arm_stop Stops the clock for (register the end of) a transaction instance.
        arm_end Cleans up the ARM environment prior to shutdown for the app handle returned by a previous arm_init.
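      The order in which these calls are made can be shown with a stand-in class. To be clear, this is not a real ARM binding: StubArmAgent is a hypothetical Python mock whose method names merely mirror the calls listed above, its handles are plain integers, and arm_update is left out for brevity.

```python
import time


class StubArmAgent:
    """Hypothetical stand-in that mimics the ARM call order (not a real binding)."""

    def __init__(self):
        self._next_handle = 1
        self._names = {}      # handle -> application or transaction name
        self._started = {}    # start handle -> start time
        self.completed = []   # (transaction name, elapsed seconds)

    def _new_handle(self, name):
        handle = self._next_handle
        self._next_handle += 1
        self._names[handle] = name
        return handle

    def arm_init(self, app_name):
        return self._new_handle(app_name)

    def arm_getid(self, app_handle, tran_name):
        return self._new_handle(self._names[app_handle] + "/" + tran_name)

    def arm_start(self, tran_id):
        handle = self._new_handle(self._names[tran_id])
        self._started[handle] = time.perf_counter()  # start the clock
        return handle

    def arm_stop(self, start_handle):
        elapsed = time.perf_counter() - self._started.pop(start_handle)
        self.completed.append((self._names[start_handle], elapsed))

    def arm_end(self, app_handle):
        self._names.pop(app_handle, None)


agent = StubArmAgent()
app = agent.arm_init("OrderEntry")            # once per application
tran = agent.arm_getid(app, "SubmitOrder")    # once per transaction type
start = agent.arm_start(tran)                 # once per transaction instance
time.sleep(0.01)                              # the work being measured
agent.arm_stop(start)
agent.arm_end(app)
print(agent.completed[0][0])
```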

      The ARM 2.0 SDK dated 11/11/97 is offered in UNIX and Windows flavors, along with sample.c source code for each platform. ARM 2.0 added the ability to correlate parent and child transactions, and to collect other measurements associated with the transactions, such as the number of records processed. This SDK (explained in the User Guide) provides:

      • a libarm4 Microsoft linked dll (copied to the System32 folder) or Linux shared library.
      • the arm4.h header file for C apps. ARM version 4 added the capability to track the amount of time a transaction is blocked waiting for an external event.
      • for use in testing instrumentation, the logagent.c source code for a logging agent, which makes use of the armagent.h header file used by agents.
      • stubs for both C and (since ARM 3.0 in 2001) Java apps to use when an ARM agent is not installed.

      ARM 2.01 patched ARM 2.0 with new arm201.h files.
      ARM 3.0 SDK added Java bindings.
      ARM 4.0 (also confusingly called ARM Version 2 because ARM 4.0 is not backward compatible with ARM 2.0) published header files and bindings for C and Java in Oct. 2003, but no sample source.

      Business Service Management Applications

      ARM subsystems are provided by a number of Business Service Management application providers.

      The "Big Four":

      The also vendors:

      Emerging vendors:

        Hyperic (San Francisco open source startup focused on managing Apache, since founder Soltero came from Covalent)
        ZenOSS (Annapolis, MD) offers open-source software

      Integration vendors:


    Set screen Log Analysts

      Two tools from the Windows 2000 Resource Kit:

      • Seagate Crystal Reports 6 Event Log Viewer is a full-featured report writer that provides an easy way to extract, view, save, and publish information from the Windows 2000 system, application, and security event logs in a variety of formats. It integrates new web reporting technology.

      • CyberSafe Log Analyst is a Windows 2000 Security Event Log analysis tool designed as a snap-in to the Microsoft Management Console (MMC) used with Windows 2000. It organizes and interprets security event logs from Windows 2000, providing more effective, system-wide user activity analysis.



    Set screen Tuning Applications

      The base priority of a process determines the order in which a process is scheduled for processing, relative to other processes. The base priority is set by the process code, not the operating system. The OS sets and changes the dynamic priorities of threads in the process within the range of the base. The Processes tab allows you to change the Base Priority value of a process, but it does not monitor threads. Base priorities changed through Task Manager are effective only as long as the process runs. The change in priority is effective at the next Task Manager update; you do not need to restart the process.

      To move the paging file to another hard disk on your computer running Windows 2000 Professional, open the System Properties dialog box, click the Advanced tab, and then click the Performance Options button.

      Quality of Service

      The Windows 2000 Server has a performance governor: the Admission Control Service. It works in conjunction with the Quality of Service (QoS) standard, which allows network administrators to either restrict or guarantee network performance to specific applications.



    Set screen Quiz Questions

    • What usually has the greatest negative effect on processor performance?
      a. paging
      b. compression
      c. fragmentation
      d. network bandwidth

      Answer: b

    • Which command could you use to measure CPU load on a UNIX or Linux system?
      a. sar -q
      b. strace
      c. winstat 5
      d. netstat -m

      Answer: a

    • Where should the Linux and Windows swap partition be located to provide the best performance?
      a. at the end of the drive
      b. in the middle of the drive
      c. at the beginning of the drive
      d. do not use a swap partition; use a swap file instead

      Answer: c


    Set screen Websites on Perfmon



    Portions ©Copyright 1996-2014 Wilson Mar. All rights reserved. | Privacy Policy |
