with numerical Targets
supporting service levels
at optimal TCO
.
Study the System/Application Under Test to specify the Test Architecture and Test Behavior. Define Test Timing Constraints. Prepare Test Data.
in response to Actual Usage Patterns
by designing experiments
and
constructing and validating
LoadRunner
and Test Scenario parameters
for the types of performance defects
of concern.
created from results of Diagnosing load test runs
while conducting performance tuning
and
exploring Various Configurations for Scalability
in an analytical model
Conclusions are published along with recommendations for management inquiries and action.
so that Off-estimate Alerts
are issued, perhaps tripping planned Upgrade trigger thresholds
previously defined in
Predicted Performance Profiles for anticipated loads
The approach above was drawn from several capacity management frameworks:
In the electronics industry:
After prototyping, and after the product goes through the Design Refinement cycle, when engineers revise and improve the design to meet performance and design requirements and specifications, objective, comprehensive Design Verification Testing (DVT) is performed to verify all product specifications, interface standards, OEM requirements, and diagnostic commands.
Process (or Pilot) Verification Test (PVT) is a subset of Design Verification Tests (DVT) performed on pre-production or production units to verify that the design has been correctly implemented into production.
The MOF (Microsoft Operations Framework) defines this circular process flow of capacity management activities:
Oracle's Expert Services'
Software Performance Engineering (SPE)
Smith's Software Performance Engineering (SPE) approach begins with these detailed steps:
| Anti-Pattern Symptoms & Conditions | Critical Success Factor (CSF) |
|---|---|
| A. Runs are conducted with no agreement on their purpose. | Define requirements for the application. |
| | Define expected benefits, cost analysis, and justification for the measurement project. |
| | Define objectives. |
| B. Much discussion about what to do next. Frustration with intermediate steps to obtain results. | Pre-agreement on deliverables from distinct project phases |
| C. Provisions for testing | Accounting and Accountability for each role |
| D. Runs are conducted with scripts that are not ready/mature | Understanding of script maturity |
| E. People make mistakes after working 16 hours a day. | Planning and implementing an appropriate level of resources to match business needs |
| F. People getting into trouble with nothing productive to do. | |
| G. Actual production usage is different than anticipated. | Availability of accurate and timely business forecasts |
| | Possible impacts to performance are identified and measured. |
| H. Changes by others are discovered as the reason for anomalies in run results, after many hours are wasted in debugging. | Communication and interaction with other service management processes |
| I. Tests need to be rerun because actual hardware used in production is different than hardware tested. | Understanding of current and future technologies (for each application in the Service Catalog, a mapping of infrastructure dependencies). |
A. Speed Tests
conclusions |
During speed testing, the user response time (latency) of each user action is measured. The script for each action will look for some text on each resulting page to confirm that the intended result appears as designed. Since speed testing is usually the first performance test to be performed, issues from installation and configuration are identified during this step. Because this form of performance testing is performed for a single user (under no other load), it exposes issues with the adequacy of CPU, disk I/O access and data transfer speeds, and database access optimizations.
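As a rough illustration of a single-user speed check, the sketch below times one action and confirms that expected text appears on the resulting page. The `timed_check` helper and the `fake_login_page` stand-in are hypothetical names invented for this example; a real script would issue the request through the load testing tool in use.

```python
import time
from typing import Callable, Tuple


def timed_check(fetch: Callable[[], str], expected_text: str) -> Tuple[float, bool]:
    """Time one user action and report whether the expected text is on the page."""
    start = time.perf_counter()
    page = fetch()  # perform the single-user action
    elapsed = time.perf_counter() - start
    return elapsed, expected_text in page


def fake_login_page() -> str:
    """Hypothetical stand-in for a real page request."""
    return "<html><body>Welcome back, tester</body></html>"


elapsed, ok = timed_check(fake_login_page, "Welcome back")
print(f"login took {elapsed * 1000:.3f} ms, expected text found: {ok}")
```

Passing the fetch step in as a callable keeps the timing and verification logic separate from whatever transport the real script uses.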
The performance speed profile |
B. Contention Tests (for Robustness)
conclusions | This form of performance test aims to find performance bottlenecks (such as lock-outs, memory leaks, and thrashing) caused by a small number of Vusers contending for the same resources. Each run identifies the minimum, average, median, and maximum times for each action. This is done to make sure that data and processing of multiple users are appropriately segregated. Such tests identify the largest burst (spike) of transactions and requests that the application can handle without failing. Such loads are more like the arrival rate to web servers than constant loads. |
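The per-action statistics mentioned above can be summarized along these lines. This is a minimal sketch: the action names and millisecond timings are made-up sample data, not results from any actual run.

```python
from statistics import mean, median
from typing import Dict, List


def summarize(timings_ms: Dict[str, List[int]]) -> Dict[str, Dict[str, float]]:
    """Return min, average, median, and max response time for each action."""
    return {
        action: {
            "min": min(values),
            "avg": mean(values),
            "median": median(values),
            "max": max(values),
        }
        for action, values in timings_ms.items()
    }


# Hypothetical response times (milliseconds) collected during one contention run
sample = {
    "login": [800, 1000, 1200, 3000],
    "search": [500, 700, 600],
}
summary = summarize(sample)
for action, stats in summary.items():
    print(action, stats)
```

A maximum far above the median, as with the hypothetical `login` action here, is the kind of pattern that suggests a few users were blocked waiting on a shared resource.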
C. Volume Tests (for Extendability)
conclusions |
These test runs measure the pattern of response time as more data is added. These tests make sure there is enough disk space and provisions for handling that much data, such as backup and restore. |
D. Stress / Overload
conclusions |
This is done by gradually ramping up the number of Vusers until the system "chokes" at a breakpoint (when the number of connections flattens out, response time degrades or times out, and errors appear). During tests, the resources used by each server are measured to make sure there is enough transient memory space and adequate memory management techniques. This effort makes sure that admission control techniques limiting incoming work perform as intended. This includes detection of and response to Denial of Service (DoS) attacks. |
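One way to locate the breakpoint from ramp-up results is sketched below. The `Step` record, the threshold values, and the sample ramp data are all assumptions made for illustration; real limits would come from the agreed service levels.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    vusers: int            # number of virtual users at this ramp step
    avg_response_s: float  # average response time in seconds
    error_rate: float      # fraction of requests that failed


def find_breakpoint(steps: List[Step],
                    max_response_s: float = 5.0,
                    max_error_rate: float = 0.01) -> Optional[int]:
    """Return the first Vuser level where response time or errors exceed limits."""
    for step in steps:
        if step.avg_response_s > max_response_s or step.error_rate > max_error_rate:
            return step.vusers
    return None  # no breakpoint reached within the tested range


# Hypothetical ramp-up measurements
ramp = [
    Step(50, 1.1, 0.0),
    Step(100, 1.4, 0.0),
    Step(150, 2.9, 0.002),
    Step(200, 7.5, 0.04),
]
breakpoint_vusers = find_breakpoint(ramp)
print("breakpoint at", breakpoint_vusers, "Vusers")
```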
E. Fail-Over
conclusions |
For example, this form of performance testing ensures that when one computer of a cluster fails or is taken offline, other machines in the cluster are able to quickly and reliably take over the work being performed by the downed machine. This means this form of performance testing requires multiple identical servers configured with Virtual IP addresses and accessed through a load balancer device. |
F. Spike |
Such runs can involve a "rendezvous point" where all users line up to make a specific request at a single moment in time. Such runs enable the analysis of "wave" effects through all aspects of the system. Most importantly, these runs expose the efficacy of load balancing. |
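The rendezvous idea can be illustrated with a thread barrier: every simulated user waits until all have arrived, then all proceed together. This is only a sketch using Python's standard `threading.Barrier`; the `run_spike` function and the no-op action are hypothetical, and a load testing tool would provide its own rendezvous mechanism.

```python
import threading
import time
from typing import Callable, List


def run_spike(n_vusers: int, action: Callable[[], None]) -> List[float]:
    """Release n_vusers threads at the same moment and record release times."""
    barrier = threading.Barrier(n_vusers)
    lock = threading.Lock()
    release_times: List[float] = []

    def vuser() -> None:
        barrier.wait()  # rendezvous point: block until every Vuser has arrived
        released = time.perf_counter()
        action()        # the specific request made by all users at once
        with lock:
            release_times.append(released)

    threads = [threading.Thread(target=vuser) for _ in range(n_vusers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return release_times


times = run_spike(10, lambda: None)  # no-op stands in for a real request
spread = max(times) - min(times)
print(f"{len(times)} Vusers released within {spread * 1000:.3f} ms")
```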
G. Endurance
conclusions |
Because longer tests usually involve use of more disk space, these test runs also measure the pattern of build-up in "cruft" (obsolete logs, intermediate data structures, and statistical data that need to be periodically pruned). Longer runs allow for the detection and measurement of the impact of occasional events (such as Java Full GC and log truncations) and anomalies that occur infrequently. These tests verify provisions for managing space, such as log truncation "cron" jobs that normally sleep but awaken at predetermined intervals (such as in the middle of the night). |
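The build-up pattern can be quantified by fitting a straight line to disk usage sampled during the run. The sketch below uses an ordinary least-squares slope; the sample readings and the 100 GB capacity are invented for illustration, and real growth is often not linear once pruning jobs run.

```python
from typing import List, Optional


def growth_rate(hours: List[float], used_gb: List[float]) -> float:
    """Least-squares slope of disk usage over time, in GB per hour."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(used_gb) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, used_gb))
    sxx = sum((x - mean_x) ** 2 for x in hours)
    return sxy / sxx


def hours_until_full(hours: List[float], used_gb: List[float],
                     capacity_gb: float) -> Optional[float]:
    """Estimate hours left before the disk fills, or None if usage is not growing."""
    rate = growth_rate(hours, used_gb)
    if rate <= 0:
        return None
    return (capacity_gb - used_gb[-1]) / rate


# Hypothetical readings taken every six hours during an endurance run
sample_hours = [0, 6, 12, 18, 24]
sample_used_gb = [40, 43, 46, 49, 52]
rate = growth_rate(sample_hours, sample_used_gb)
remaining = hours_until_full(sample_hours, sample_used_gb, capacity_gb=100)
print(f"growth {rate} GB/hour, about {remaining} hours until full")
```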
H. Scalability |
The outcome of scalability efforts feeds a spreadsheet to calculate how many servers the application will need based on assumptions about demand. |
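The spreadsheet calculation might reduce to something like the following. The formula, the 25% headroom, and the single spare server are assumptions chosen for this sketch, not values taken from any particular capacity plan.

```python
import math


def servers_needed(peak_users: int, users_per_server: int,
                   headroom: float = 0.25, spares: int = 1) -> int:
    """Estimate server count from demand and measured per-server capacity.

    headroom is the fraction of each server's capacity held in reserve;
    spares are extra machines kept for fail-over.
    """
    usable_per_server = users_per_server * (1 - headroom)
    return math.ceil(peak_users / usable_per_server) + spares


# Hypothetical inputs: 5,000 peak users, 400 users per server from test runs
estimate = servers_needed(peak_users=5000, users_per_server=400)
print("servers needed:", estimate)
```

Keeping the demand assumption and the measured per-server capacity as separate inputs makes it easy to rerun the estimate when either one changes.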
I. Availability
conclusions |
These tests are run on applications in production mode. They provide alerts when thresholds are reached, and trends to gauge the average and variability of response times. |
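A minimal sketch of such monitoring is shown below: it flags samples above a threshold and reports the average and standard deviation as a measure of variability. The `monitor` function, the 5-second threshold, and the sample values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def monitor(samples_s: List[float], threshold_s: float) -> Dict[str, object]:
    """Summarize response-time samples and list those exceeding the threshold."""
    alerts = [(i, s) for i, s in enumerate(samples_s) if s > threshold_s]
    return {
        "average": mean(samples_s),
        "variability": stdev(samples_s),  # sample standard deviation
        "alerts": alerts,                 # (sample index, response time) pairs
    }


# Hypothetical response times (seconds) from periodic production probes
samples = [1.0, 2.0, 3.0, 2.0, 7.0]
report = monitor(samples, threshold_s=5.0)
print(report)
```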
Articles on performance of IBM zSeries S/390 mainframes at zJournal
discuss use of reports from IBM's RMF (Resource Management Facility)
being massaged by IBM's SSR (Spreadsheet Reporter) and
data reduction applications such as SAS, MXG, and MICS.
Also SMF and CMF reports.
The above makes use of concepts from
Six Sigma
|
|
Traditional "Six Sigma" projects aim to improve existing products and processes
using a methodology with the acronym DMAIC (commonly pronounced duh-may-ick,
for Define, Measure, Analyze, Improve, and Control).
|
|
Load Testing is a sub-process of the Capacity Management function
within the
Service Management standards |
|
The capacity plan is the consolidated output (deliverable) from the capacity management process.
| Person/Role | Responsibilities |
|---|---|
A. Project Financial Sponsor |
|
B. Functional Expert (Marketing) |
|
C. Project Manager |
|
D.1. Development Liaison |
|
D.2. AUI Developer |
|
D.3. Performance Basis Experts |
|
E.1. Database Performance Basis Expert |
|
E.2. Database Administrator |
|
F. Functional Testers |
|
G. Performance Engineer |
|
H. Production Operations (Service Management) |
|
|
|
The capacity manager role oversees the allocation and delivery of service capacity to users. The capacity manager is responsible for planning, monitoring, and reporting activities relating to system and solution capacity, performance measurement, and forecast in the IT organization.
Activities associated with the capacity manager often include:
The capacity manager is also responsible for managing the day-to-day capacity requirements of services, including:
A sample list of Essential Functions (Responsibilities) in a Capacity Manager job description:
|
|
This pseudo usecase diagram
summarizes the information (artifacts) flowing among people assuming certain
roles
involved in managing the performance of large applications:
|
| WHATs - Requirements Critical to Satisfaction | Average Weight (1-5) | Casual User | Exp. User | Sys Admin |
|---|---|---|---|---|
| 1.1 fast to load | 4.5 | 4 | 4 | 5 |
| 1.2 quick response after submit | 4.3 | 4 | 3 | 5 |
| 2.1 accepts batched transactions | 3.3 | 0 | 2 | 5 |
| 2.2 dependable | 3.1 | 4 | 4 | 5 |
| 3.1 Does not timeout | 1.9 | 1 | 3 | 2 |
| 4.1 Quick to Recover | 1.9 | 1 | 3 | 2 |
The column to the right of each requirement contains weight ratings that allow certain customer requirements to be weighted higher in priority than others in the list. The example shown here is the average of weights for different sub-groups. The "(1-5)" range in this example can optionally be replaced with ISO/IEC 14598-1 evaluation scales or advanced methods such as Thomas Saaty's "Analytic Hierarchy Process", which is used to establish precise scales:
Models, Methods, Concepts & Applications of the Analytic Hierarchy Process,
with Luis G. Vargas (Springer; November, 2000)
Decision Making for Leaders: The Analytic Hierarchy Process for Decisions in a Complex World
(3rd ed. May 1, 1999)
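To give a feel for how the Analytic Hierarchy Process turns pairwise judgments into weights, the sketch below uses the row geometric-mean approximation of the priority vector (Saaty's full method uses the principal eigenvector, which this approximates). The comparison matrix is hypothetical: it assumes the first requirement is judged twice as important as the second and four times as important as the third.

```python
import math
from typing import List


def ahp_weights(matrix: List[List[float]]) -> List[float]:
    """Approximate AHP priorities by normalizing the geometric mean of each row."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical pairwise comparisons among three requirements.
# Entry [i][j] states how much more important requirement i is than j.
comparisons = [
    [1,     2,     4],
    [1 / 2, 1,     2],
    [1 / 4, 1 / 2, 1],
]
weights = ahp_weights(comparisons)
print([round(w, 3) for w in weights])
```

The resulting weights sum to 1 and could replace the "(1-5)" ratings as relative priorities.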
The customer sub-groups shown in this example are for roles working with a computer application:
QFD graphic programs can add:
The International TechneGroup, Inc. (ITI) approach for Concurrent Product/Manufacturing Process Development breaks this "WHATs" of the "voice of the customer" (VOC) down further into User Wants, Must Haves, Business Wants, and Provider Wants.
The CMM (Capability Maturity Model developed at Carnegie Mellon University) has 7 measures:
Associated with |
| Quality Characteristic | Sub-characteristics | Definition: Attributes of software that bear on the ... |
|---|---|---|
| Functionality | Suitability | presence and appropriateness of a set of functions for specified tasks. |
| | Accurateness | provision of right or agreed results or effects. |
| | Interoperability | Attributes of software that bear on its ability to interact with specified systems. |
| | Compliance | Attributes of software that make the software adhere to application related standards or conventions or regulations in laws and similar prescriptions. |
| | Security | Attributes of software that bear on its ability to prevent unauthorized access, whether accidental or deliberate, to programs or data. |
| Reliability | Maturity | frequency of failure by faults in the software. |
| | Fault tolerance | Attributes of software that bear on its ability to maintain a specified level of performance in case of software faults or of infringement of its specified interface. |
| | Recoverability | capability to re-establish its level of performance and recover the data directly affected in case of a failure and on the time and effort needed for it. |
| Usability | Understandability | users' effort for recognizing the logical concept and its applicability. |
| | Learnability | users' effort for learning its application. |
| | Operability | users' effort for operation and operation control. |
| Efficiency | Time behaviour | Attributes of software that bear on response and processing times and on throughput rates in performing its function. |
| | Resource behaviour | amount of resource used and the duration of such use in performing its function. |
| Maintainability | Analyzability | effort needed for diagnosis of deficiencies or causes of failures, or for identification of parts to be modified. |
| | Changeability | effort needed for modification, fault removal or for environmental change. |
| | Stability | risk of unexpected effect of modifications. |
| | Testability | effort needed for validating the modified software. |
| Portability | Adaptability | opportunity for its adaptation to different specified environments without applying other actions or means than those provided for this purpose for the software considered. |
| | Installability | effort needed to install the software in a specified environment. |
| | Conformance | Attributes of software that make the software adhere to standards or conventions relating to portability. |
| | Replaceability | Attributes of software that bear on opportunity and effort using it in the place of specified other software in the environment of that software. |
ISO/IEC 14598 gives methods for measurements, assessment and evaluation of software product quality.
SPICE - Software Process Improvement and Capability dEtermination is a major international standard for Software Process Assessment. There is a thriving SPICE user group known as SUGar. The SPICE initiative is supported by both the Software Engineering Institute and the European Software Institute. The SPICE standard is currently in its field trial stage.
|
Balanced Scorecard Diagnostics: Maintaining Maximum Performance
(John Wiley & Sons © 2005, 224 pages)
by Paul R. Niven
presents a step-by-step methodology for analyzing the effectiveness of a company's balanced scorecard,
with tools to reevaluate measures for driving maximum organizational performance.
|
|
|
| Perspective & Core Measures | Sample Metrics | Relevant Metrics |
|---|---|---|
Customer: How do our customers see us? |
Satisfaction, retention, market, and account share
|
|
Financial (Results): How do we look to shareholders? |
Return on investment and economic value-added:
|
|
Internal (Efficiency): What must we excel at? |
Quality, response time, cost, and new product introductions:
|
|
Learning and Growth: How can we continue to improve and create value?
|
Employee satisfaction and information system availability:
|
|
These Balanced Scorecard metrics imply these business strategies:
| Test Type | Timing (When) |
|---|---|
| A. Speed | Parallel with coding construction, as this provides developers feedback on the impact of their choice of application architecture. |
| | |
| C. Data Volume | On each release when app components are being integrated. |
| D. Stress/Overload | Pre-Production for each new application version or hardware configuration. |
| E. Fail-over | Pre-Production for each new application version or hardware configuration. |
| G. Scalability | Pre-Production for each new application version or hardware configuration. |
| H. Availability | In Production for each new application version or hardware configuration. |
|
| Context | Components | Tuning Options | |
|---|---|---|---|
| A. | Business | | |
| B. | Applications | | |
| C. | Operating System | | |
| D. | Server Hardware Devices | | |
| E. | Telecommunications Infrastructure | | |
| F. | Data Center Operations | | |
Environments
Specific machines on the technology "stack"
Machines Specific to the Load Test Environment |
Component resources within each server
| Potential Obstacle / Risk | Likelihood | Avoidance / Mitigation |
|---|---|---|
| 1. Servers not available early during the project. | High (80%) | a. Use dev. environment to develop single-user scripts. |
| 2. Difficulty with Controller licensing, capacity, etc. | Medium (50%) | b. Identify issues early by beginning to use the controller as soon as the first small script (such as login only) is coded. |
| 3. Developers not available | Medium (50%) | c. Perform thorough system analysis to identify issues before scripting. d. Develop scripts with likely issues early. |
| 4. Not enough capacity in front-end (portal/login) servers. | Medium (40%) | e. Quantify capacity of front-end servers with login_only scripts. |
| 5. Change of personnel during the project. | Medium-High (60%) | f. Take notes. Conduct formal peer walk-throughs. g. Make assignments for skill development. |
| 6. Servers become unavailable late during the project | Low (20%) | h. Use production |