Software Tech News 2-2: Risk Management

Assessing Project Risk

Shari Lawrence Pfleeger - University of Maryland

Introduction

Many software project managers take steps to ensure that their projects are done on time and within effort and cost constraints. However, project management involves far more than tracking effort and schedule. Managers must determine whether any unwelcome events may occur during development or maintenance, and make plans to avoid these events or, if they are inevitable, minimize their negative consequences. A risk is an unwanted event that has negative consequences. Project managers must engage in risk management to understand and control the risks on their projects.

What is a Risk?

Many events occur during software development. We distinguish risks from other project events by looking for three things:

A loss associated with the event. The event must create a situation where something negative happens to the project: loss of time, quality, money, control, understanding, and so on. For example, if requirements change dramatically after the design is done, then the project can suffer from loss of control and understanding if the new requirements are for functions or features with which the design team is unfamiliar. A radical change in requirements is likely to lead to losses of time and money if the design is not flexible enough to be changed quickly and easily. The loss associated with a risk is called the risk impact.
The likelihood that the event will occur. We must have some idea of the probability that the event will occur. For example, suppose a project is being developed on one machine and will be ported to another when the system is fully tested. If the second machine is a new model to be delivered by the vendor, we must estimate the likelihood that it will not be ready on time. The likelihood of the risk, measured from 0 (impossible) to 1 (certainty), is called the risk probability. When the risk probability is 1, the risk is called a problem, since it is certain to happen.
The degree to which we can change the outcome. For each risk, we must determine what we can do to minimize or avoid the impact of the event. Risk control involves a set of actions taken to reduce or eliminate a risk. For example, if the requirements may change after design, we can minimize the impact of the change by creating a flexible design. If the second machine is not ready when the software is tested, we may be able to identify other models or brands that have the same functionality and performance and can run our new software until the new model is delivered.

We can quantify the effects of the risks we identify by multiplying the risk impact by the risk probability, to yield the risk exposure. For example, if the likelihood that the requirements will change after design is 0.3, and the cost to redesign to new requirements is $50,000, then the risk exposure is 0.3 x $50,000 = $15,000. Clearly, the risk probability can change over time, as can the impact, so part of a project manager’s job is to track these values over time, and plan for the events accordingly.
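The exposure calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `risk_exposure` is an assumption made for this example, and the figures are the ones quoted in the text.

```python
def risk_exposure(probability: float, impact: float) -> float:
    """Return the risk exposure: risk probability times risk impact."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("risk probability must lie between 0 and 1")
    return probability * impact


# Figures from the text: a 0.3 probability that requirements change
# after design, and a $50,000 cost to redesign.
exposure = risk_exposure(0.3, 50_000)
print(f"Risk exposure: ${exposure:,.0f}")  # prints "Risk exposure: $15,000"
```

Because both probability and impact can change over time, a manager tracking a risk would recompute this value whenever either estimate is revised.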

There are two major sources of risk: generic risks and project-specific risks.

Generic risks are those common to all software projects, such as misunderstanding the requirements, losing key personnel, or allowing insufficient time for testing.

Project-specific risks are threats that result from the particular vulnerabilities of the given project. For example, a vendor may be promising network software by a particular date, but there is some risk that the network software will not be ready on time.

Risk Management Activities

Risk management involves several important steps, each of which is illustrated in Figure 1. First, you assess the risks on your project, so that you understand what may occur during the course of development or maintenance. The assessment consists of three activities: identifying the risks, analyzing them, and assigning priorities to each of them. To identify them, you may use many different techniques.

If the system you are building is similar in some way to a system you have built before, you may have a checklist of problems that may occur; you can review the checklist to determine if your new project is likely to be subject to the risks listed. For systems that are new in some way, you may augment the checklist with an analysis of each of the activities in the development cycle. By decomposing the process into small pieces, you may be able to anticipate problems that may arise. For example, you may decide that there is a risk of your chief designer's leaving during the design process. Similarly, you may analyze the assumptions or decisions you are making about how the project will be done, who will do it, and with what resources. Then, each assumption is assessed to determine the risks involved.

Next, you analyze the risks you have identified, so that you can understand as much as possible about when, why, and where they might occur. There are many techniques you can use to enhance your understanding, including system dynamics models, cost models, performance models, network analysis, and more.

Now that you have itemized all risks, you must use your understanding to assign priorities to the risks. A priority scheme enables you to devote your limited resources only to the most threatening risks. Usually, priorities are based on the risk exposure, which takes into account not only likely impact but also the probability of occurrence.
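As a rough illustration of exposure-based prioritization, the sketch below ranks a small list of risks from highest to lowest exposure. The `Risk` class and `prioritize` function are names invented for this example. Only the first risk uses figures from the text; the other two carry hypothetical probabilities and impacts.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # risk probability, from 0 to 1
    impact: float       # risk impact, in dollars

    @property
    def exposure(self) -> float:
        # Risk exposure is the probability multiplied by the impact.
        return self.probability * self.impact


def prioritize(risks):
    """Return the risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


risks = [
    Risk("Requirements change after design", 0.3, 50_000),      # from the text
    Risk("Chief designer leaves during design", 0.1, 200_000),  # hypothetical
    Risk("Vendor delivers target machine late", 0.5, 20_000),   # hypothetical
]

for rank, risk in enumerate(prioritize(risks), start=1):
    print(f"{rank}. {risk.name}: ${risk.exposure:,.0f}")
```

With these made-up numbers, the departure of the chief designer ranks first even though it is the least likely event, because its impact is so large. That is the point of ranking by exposure rather than by probability or impact alone.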

The risk exposure is computed from the risk impact and the risk probability, so you must estimate each of these risk aspects. To see how the quantification is done, consider the analysis depicted in Figure 2. Suppose you have analyzed the system development process, and you know you are working under tight deadlines for delivery. You will be building the system in a series of releases, where each release has more functionality than the one that preceded it. Because the system is designed so that functions are relatively independent, you are considering testing only the new functions for a release, and assuming that the existing functions still work as they did before. Thus, you may decide that there are risks associated with not performing regression testing: the assurance that existing functionality still works correctly.

For each possible outcome, you estimate two quantities: the probability of an unwanted outcome, P(UO), and the loss associated with the unwanted outcome, L(UO). For instance, there are three possible consequences of performing regression testing: finding a critical fault if one exists, not finding the critical fault (even though it exists), or deciding (correctly) that there is no critical fault. As the figure illustrates, we have estimated the probability of the first case to be 0.75, of the second to be 0.05, and of the third to be 0.20. The loss associated with an unwanted outcome is estimated to be $0.5 million if a critical fault is found, so that the risk exposure for that branch is 0.75 x $0.5 million = $0.375 million. Similarly, we calculate the risk exposure for the other branches of this decision tree, and we find that our risk exposure if we perform regression testing is almost $2 million. However, the same kind of analysis shows us that the risk exposure if we do not perform regression testing is almost $17 million. Thus, we say (loosely) that more is at risk if we do not perform regression testing.
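The branch-by-branch calculation can be sketched as follows. The three probabilities and the $0.5 million loss for the first outcome come from the text. The losses for the other two outcomes are not stated here, so the values below are assumptions chosen only to be consistent with the reported total of almost $2 million. The function name `branch_exposure` is likewise invented for this example.

```python
def branch_exposure(outcomes):
    """Sum P(UO) * L(UO) over the outcomes of one decision branch.

    `outcomes` is a list of (probability, loss) pairs whose
    probabilities are expected to sum to 1.
    """
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * loss for p, loss in outcomes)


# (P(UO), L(UO) in millions of dollars) if regression testing is performed.
perform_regression_testing = [
    (0.75, 0.5),   # critical fault found: both values from the text
    (0.05, 30.0),  # critical fault missed: loss is an ASSUMED value
    (0.20, 0.5),   # no critical fault: loss is an ASSUMED value
]

total = branch_exposure(perform_regression_testing)
print(f"Exposure with regression testing: ${total:.3f} million")
```

Under these assumed losses the total comes to about $1.975 million, which matches the "almost $2 million" quoted above. The exposure for skipping regression testing would be computed the same way from that branch's own probabilities and losses.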

Risk exposure helps us to list the risks in priority order, with the risks of most concern given the highest priority.

Next, we must take steps to control the risks. The notion of control acknowledges that we may not be able to eliminate all risks. Instead, we may be able to minimize the risk, or mitigate it by taking action to handle the unwanted outcome in an acceptable way. Therefore, risk control involves risk reduction, risk planning, and risk resolution. There are three strategies for risk reduction:

Avoiding the risk, by changing requirements for performance or functionality
Transferring the risk, by allocating risks to other systems or by buying insurance to cover any financial loss should the risk become a reality
Assuming the risk, by accepting it and controlling it with the project’s resources

To aid decision-making about risk reduction, we must take into account the cost of reducing the risk. Risk leverage is the difference in risk exposure divided by the cost of reducing the risk. In other words, risk reduction leverage is (risk exposure before reduction - risk exposure after reduction) / (cost of risk reduction).
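A minimal sketch of the leverage calculation follows. The function name `risk_reduction_leverage` is an assumption made for this example. The $15,000 starting exposure is the requirements-change figure from earlier in the article, while the reduced exposure and the cost of the flexible design are hypothetical numbers.

```python
def risk_reduction_leverage(exposure_before: float,
                            exposure_after: float,
                            reduction_cost: float) -> float:
    """Return (exposure before - exposure after) / cost of reduction."""
    if reduction_cost <= 0:
        raise ValueError("cost of risk reduction must be positive")
    return (exposure_before - exposure_after) / reduction_cost


# Exposure before reduction: 0.3 x $50,000 = $15,000 (from the text).
# ASSUMED: a flexible design lowers the exposure to $5,000 and costs $4,000.
leverage = risk_reduction_leverage(15_000, 5_000, 4_000)
print(f"Risk reduction leverage: {leverage:.2f}")  # prints "Risk reduction leverage: 2.50"
```

A leverage above 1 means the reduction in exposure exceeds what the action costs. How high the value must be to justify the action remains a judgment for the project manager.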

If the leverage value is not high enough to justify the action, then we can look for other, less costly or more effective reduction techniques. In some cases, we can choose a development process to help reduce the risk. For example, prototyping can improve understanding of the requirements and design, so selecting a prototyping process can reduce many project risks. It is useful to record your decisions in a risk management plan, so that both customer and development team can review how problems are to be avoided, as well as how they are to be handled should they arise. Then, we should monitor the project as development progresses, periodically reevaluating the risks, their probability, and their likely impact.

Table 1 summarizes what Boehm has identified as the top ten risk items (Boehm, 1991), along with the risk management techniques he recommends to address each of them. When assessing risk on your own project, you can begin with this list and determine whether any of the items might apply. Then, you can expand your list, based on past history and your understanding of the project's goals and limitations.

**Table 1: Boehm’s top ten risk items**
Risk item: Risk management techniques
Personnel shortfalls: Staffing with top talent; job matching; team-building; morale-building; cross-training; prescheduling key people.
Unrealistic schedules and budgets: Detailed, multisource cost and schedule estimation; design to cost; incremental development; software reuse; requirements scrubbing.
Developing the wrong software functions: Organizational analysis; mission analysis; operational concept formulation; user surveys; prototyping; early users’ manuals.
Developing the wrong user interface: Prototyping; scenarios; task analysis.
Gold-plating: Requirements scrubbing; prototyping; cost-benefit analysis; design to cost.
Continuing stream of requirements changes: High change threshold; information-hiding; incremental development (defer changes to later increments).
Shortfalls in externally-performed tasks: Reference-checking; pre-award audits; award-fee contracts; competitive design or prototyping; team-building.
Shortfalls in externally-furnished components: Benchmarking; inspections; reference checking; compatibility analysis.
Real-time performance shortfalls: Simulation; benchmarking; modeling; prototyping; instrumentation; tuning.
Straining computer science capabilities: Technical analysis; cost-benefit analysis; prototyping; reference checking.

Contact Information

Shari Lawrence Pfleeger
4519 Davenport Street NW
Washington, DC 20016-4415
(301) 405-2707, Fax: (301) 405-3691

This article was adapted from Software Engineering: Theory and Practice, by Shari Lawrence Pfleeger with permission from Prentice-Hall.

References

Rook, Paul, “Risk Management for Software Development”, ESCOM Tutorial, 24 March 1993.
Boehm, Barry W., “Software Risk Management: Principles and Practices”, IEEE Software 8(1), pp. 32-41, January 1991.