
Friday, August 14, 2009

Costly Software Bugs


Software Bugs Cost U.S. Economy $59.5 Billion Annually, RTI Study Finds
Research Triangle Park, NC -- Software bugs are costly both to software producers and users.
Extrapolating from estimates of the costs in several software-intensive industries, bugs may be
costing the U.S. economy $59.5 billion a year, or about 0.6 percent of gross domestic product, says a
study conducted by RTI for the U.S. Department of Commerce's National Institute of Standards and
Technology (NIST).

"More than half of the costs are borne by software users, and the remainder by software developers
and vendors," NIST said in summarizing the findings. "More than a third of these costs … could be
eliminated by an improved testing infrastructure that enables earlier and more effective identification
and removal of software defects."


History's Worst Software Bugs

What seems certain is that bugs are here to stay. Here, in chronological order, is the Wired News list
of the 10 worst software bugs of all time … so far.

July 28, 1962 -- Mariner I space probe. A bug in the flight software for the Mariner 1 causes the
rocket to divert from its intended path on launch. Mission control destroys the rocket over the
Atlantic Ocean. The investigation into the accident discovers that a formula written on paper in
pencil was improperly transcribed into computer code, causing the computer to miscalculate the
rocket's trajectory.

1982 -- Soviet gas pipeline. Operatives working for the Central Intelligence Agency allegedly
plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline.
The Soviets had obtained the system as part of a wide-ranging effort to covertly purchase or steal
sensitive U.S. technology. The CIA reportedly found out about the program and decided to make it
backfire with equipment that would pass Soviet inspection and then fail once in operation. The
resulting event is reportedly the largest non-nuclear explosion in the planet's history.

1985-1987 -- Therac-25 medical accelerator. A radiation therapy device malfunctions and delivers
lethal radiation doses at several medical facilities. Based upon a previous design, the Therac-25 was
an "improved" therapy system that could deliver two different kinds of radiation: either a low-power
electron beam (beta particles) or X-rays. The Therac-25's X-rays were generated by smashing high-
power electrons into a metal target positioned between the electron gun and the patient. A second
"improvement" was the replacement of the older Therac-20's electromechanical safety interlocks with
software control, a decision made because software was perceived to be more reliable.

What engineers didn't know was that both the 20 and the 25 were built upon an operating system
that had been kludged together by a programmer with no formal training. Because of a subtle bug
called a "race condition," a quick-fingered typist could accidentally configure the Therac-25 so the
electron beam would fire in high-power mode but with the metal X-ray target out of position.
At least five patients die; others are seriously injured.



1988 -- Buffer overflow in Berkeley Unix finger daemon. The first internet worm (the so-called
Morris Worm) infects between 2,000 and 6,000 computers in less than a day by taking advantage
of a buffer overflow. The specific code is a function in the standard input/output library routine called
gets() designed to get a line of text over the network. Unfortunately, gets() has no provision to limit
its input, and an overly large input allows the worm to take over any machine to which it can connect.

Programmers respond by attempting to stamp out the gets() function in working code, but they
refuse to remove it from the C programming language's standard input/output library, where it remains
to this day.

1988-1996 -- Kerberos Random Number Generator. The authors of the Kerberos security system
neglect to properly "seed" the program's random number generator with a truly random seed. As a
result, for eight years it is possible to trivially break into any computer that relies on Kerberos for
authentication. It is unknown if this bug was ever actually exploited.

January 15, 1990 -- AT&T Network Outage. A bug in a new release of the software that controls
AT&T's #4ESS long distance switches causes these mammoth computers to crash when they receive
a specific message from one of their neighboring machines -- a message that the neighbors send out
when they recover from a crash.

One day a switch in New York crashes and reboots, causing its neighboring switches to crash, then
their neighbors' neighbors, and so on. Soon, 114 switches are crashing and rebooting every six
seconds, leaving an estimated 60 thousand people without long distance service for nine hours. The
fix: engineers load the previous software release.

1993 -- Intel Pentium floating point divide. A silicon error causes Intel's highly promoted Pentium
chip to make mistakes when dividing floating-point numbers that occur within a specific range. For
example, dividing 4195835.0/3145727.0 yields 1.33374 instead of 1.33382, an error of 0.006 percent.
Although the bug affects few users, it becomes a public relations nightmare. With an estimated 3
million to 5 million defective chips in circulation, at first Intel only offers to replace Pentium chips for
consumers who can prove that they need high accuracy; eventually the company relents and agrees
to replace the chips for anyone who complains. The bug ultimately costs Intel $475 million.


1995/1996 -- The Ping of Death. A lack of sanity checks and error handling in the IP fragmentation
reassembly code makes it possible to crash a wide variety of operating systems by sending a
malformed "ping" packet from anywhere on the internet. Most obviously affected are computers
running Windows, which lock up and display the so-called "blue screen of death" when they receive
these packets. The attack affects many Macintosh and Unix systems as well.

June 4, 1996 -- Ariane 5 Flight 501. Working code for the Ariane 4 rocket is reused in the Ariane 5,
but the Ariane 5's faster engines trigger a bug in an arithmetic routine inside the rocket's flight
computer. The error is in the code that converts a 64-bit floating-point number to a 16-bit signed
integer. The faster engines cause the 64-bit numbers to be larger in the Ariane 5 than in the Ariane
4, triggering an overflow condition that results in the flight computer crashing.

First, Flight 501's backup computer crashes, followed 0.05 seconds later by a crash of the primary
computer. As a result of these crashed computers, the rocket's primary processor overpowers the
rocket's engines and causes the rocket to disintegrate 40 seconds after launch.

November 2000 -- National Cancer Institute, Panama City. In a series of accidents, therapy
planning software created by Multidata Systems International, a U.S. firm, miscalculates the proper
dosage of radiation for patients undergoing radiation therapy.

Multidata's software allows a radiation therapist to draw on a computer screen the placement of
metal shields called "blocks" designed to protect healthy tissue from the radiation. But the software
will only allow technicians to use four shielding blocks, and the Panamanian doctors wish to use five.

The doctors discover that they can trick the software by drawing all five blocks as a single large
block with a hole in the middle. What the doctors don't realize is that the Multidata software gives
different answers in this configuration depending on how the hole is drawn: draw it in one direction
and the correct dose is calculated, draw it in another direction and the software recommends twice
the necessary exposure.

At least eight patients die, while another 20 receive overdoses likely to cause significant health
problems. The physicians, who were legally required to double-check the computer's calculations
by hand, are indicted for murder.

Monday, August 10, 2009

Code Coverage - Typical features

The feature sets, quality and usability of code coverage products vary significantly.
  • Ant integration

    Probably most Java projects today are using Ant (or Maven) to manage their build process, including running unit tests (or functional tests).

    Thus, Ant integration is one of those features a code coverage tool cannot afford to lack. However, there are subtle differences in how nicely the code coverage related targets fit into an existing build script; we will see some examples later on. Of course, most tools also provide an alternative, standalone way of running the analysis, either from the command line or via a GUI application.

  • Report formats

    Another obvious feature a code coverage tool must have is reports. Again, there are differences in the type and quality of the supported reports. Some tools provide only textual summaries in the Ant console output, others produce huge tables of names and numbers in HTML, others produce nice pictures, and still others offer to render all of this in PDF as well. We'll see examples of the main types of reports in the next section.

  • Source code linking

    Somewhat related to the previous item, source code linking is something one can't live without once having gotten a taste of it. In practice, source code linking means that, as part of the code coverage report, the tool generates annotated copies of the actual source code, highlighting the parts which are not covered by your tests. I wouldn't be surprised if this particular feature was the single biggest factor in code coverage tools reaching critical mass. Seeing the exact code block that is keeping your coverage away from the "green" is a lot more efficient than merely knowing that a particular method contains some code that isn't covered by the tests. All of our selected examples include source code linking in their feature set.

  • Checks

    It doesn't take long after someone introduces a rule before someone else introduces the idea of enforcing it. That has also happened with code coverage. Some tools provide a means to raise a red flag if code coverage drops below a given level. In the context of Ant integration, the build script would typically fail the build until the tests have been reinforced to cover the naked parts of the code (see the sketch after this list).

  • Historical reports

    A few tools provide a way to collect a history of coverage data and produce historical reports illustrating how your project's code coverage has fluctuated over time. This is also a nice feature, although some might consider it rather irrelevant.
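The snippet below is a minimal, tool-agnostic sketch of the "checks" idea mentioned above: fail the build when coverage drops below a required threshold. It is written in Python purely for illustration rather than tied to any particular Java coverage tool, and the report format (a plain "covered total" text file), the report path, and the 80 percent threshold are assumptions, not something a specific product prescribes.

```python
# Sketch: fail the build if line coverage drops below a required level.
# The "covered total" report format and the threshold are illustrative assumptions.
import sys

REQUIRED_COVERAGE = 0.80  # assumed minimum acceptable line coverage


def read_coverage(path):
    """Read a hypothetical report whose content is 'covered total', e.g. '812 1000'."""
    with open(path) as report:
        covered, total = (int(field) for field in report.read().split())
    return covered / total


if __name__ == "__main__":
    coverage = read_coverage(sys.argv[1])
    print(f"Line coverage: {coverage:.1%}")
    if coverage < REQUIRED_COVERAGE:
        print(f"Coverage check failed: below {REQUIRED_COVERAGE:.0%}")
        sys.exit(1)  # a non-zero exit code is what makes an Ant or CI build fail
```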

Thursday, August 6, 2009

Testing Estimation Process

Introduction

In my opinion, one of the most difficult and critical activities in IT is the estimation process. I believe this is because, when we say that a project will be accomplished in a certain time at a certain cost, it has to happen.

If it does not happen, several things may follow: from peers' comments and senior management's warnings to being fired, depending on the reasons for and the seriousness of the failure.

Before I even thought of moving to the systems test group at my organization, I kept hearing from members of the development group that the estimates made by the systems test group were too long and too expensive. Then, when I arrived in my new seat, I tried to understand the testing estimation process.

The testing estimation process in place was quite simple. The inputs for the process, provided by the development team, were: the size of the development team and the number of working days needed for building a solution before starting systems tests.

The testing estimation process said that the number of testing engineers would be half the number of development engineers, and the number of testing working days would be one third of the number of development working days.

A spreadsheet was created to produce the estimate, calculating the duration of the tests and the testing costs based on the following formulas:

Testing working days = (Development working days) / 3.

Testing engineers = (Development engineers) / 2.

Testing costs = Testing working days * Testing engineers * person daily costs.

As the process only played with numbers, it was not considered necessary to record anywhere how the estimate was obtained.

To exemplify how the process worked: if a development team said that it would need 4 engineers and 66 working days to deliver a solution for systems testing, then systems test would need 2 engineers (half) and 22 working days (one third). So the solution would be ready for delivery to the customer after 88 (66+22) working days.

To be clear, the testing time did not include the time for developing the test cases and preparing the testing environment; normally the testing team would need an extra 10 working days for that.
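As a quick illustration, here is the old process expressed in code. The 4 engineers and 66 working days come from the example above; the person daily cost is a placeholder value for the sketch, not a real figure from my organization.

```python
# Sketch of the old estimation process: testing effort derived purely
# from the development team's size and duration.
def old_estimate(dev_engineers, dev_days, person_daily_cost):
    testing_days = dev_days / 3            # one third of the development working days
    testing_engineers = dev_engineers / 2  # half of the development engineers
    testing_cost = testing_days * testing_engineers * person_daily_cost
    return testing_engineers, testing_days, testing_cost


engineers, days, cost = old_estimate(dev_engineers=4, dev_days=66,
                                     person_daily_cost=500)  # assumed daily cost
print(engineers, days, cost)  # 2 engineers, 22 testing days -> delivery after 66 + 22 = 88 days
```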


The Rules



1st Rule: Estimation shall be always based on the software requirements

All estimation should be based on what would be tested, i.e., the software requirements.

Normally, the software requirements were established by the development team alone, with little or no participation from the testing team. After the specification had been established and the project costs and duration had been estimated, the development team would ask how long it would take to test the solution, expecting the answer almost right away. Instead, the software requirements must be read and understood by the testing team, too. Without the testing team's participation, no estimate can be taken seriously.


2nd Rule: Estimation shall be based on expert judgment

Before estimating, the testing team classifies the requirements into the following categories:

  • Critical: the development team has little knowledge of how to implement it;

  • High: the development team has good knowledge of how to implement it, but it is not an easy task;

  • Normal: the development team has good knowledge of how to implement it.

The experts in each requirement should say how long it would take to test it. The categories help the experts estimate the testing effort for each requirement.


3rd Rule: Estimation shall be based on previous projects

All estimates should also take previous projects into account. If a new project has requirements similar to those of a previous one, the estimate is based on that project.



4th Rule: Estimation shall be based on metrics

My organization has created an OPD (Organization Process Database) where project metrics are recorded. We have been recording metrics for three years, obtained from dozens of projects.

The number of requirements is the basic input for estimating a testing project. From it, my organization has derived metrics that guide the estimate. The table below shows the metrics used to estimate a testing project; the values assume a team size of one testing engineer.


Metric                                                   Value

1. Number of test cases created for each requirement      4.53
2. Number of test cases developed per working day        14.47
3. Number of test cases executed per working day         10.20
4. Number of ARs per test case                             0.77
5. Number of ARs verified per working day                24.64

For instance, if we have a project with 70 functional requirements and a testing team size of 2 engineers, we reach the following estimates:


Estimate                                      Value

Number of test cases (based on metric 1)      317.10
Preparation phase (based on metric 2)         11 working days
Execution phase (based on metric 3)           16 working days
Number of ARs (based on metric 4)             244 ARs
Regression phase (based on metric 5)          6 working days

The testing duration is estimated at 22 (16+6) working days, plus 11 working days for the preparation phase.
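To make the arithmetic explicit, the sketch below reproduces the calculation for the 70-requirement, two-engineer example using the metrics above. Rounding each phase up to whole working days is my assumption about how the spreadsheet behaves; with that rule the regression phase comes out to 5 days rather than the 6 shown above, so the original spreadsheet evidently rounds slightly differently.

```python
import math

# Metrics from the OPD table above (values are per one testing engineer).
TESTCASES_PER_REQUIREMENT = 4.53
TESTCASES_DEVELOPED_PER_DAY = 14.47
TESTCASES_EXECUTED_PER_DAY = 10.20
ARS_PER_TESTCASE = 0.77
ARS_VERIFIED_PER_DAY = 24.64


def estimate(requirements, team_size):
    test_cases = requirements * TESTCASES_PER_REQUIREMENT
    ars = test_cases * ARS_PER_TESTCASE
    # Rounding each phase up to whole working days is an assumption.
    preparation = math.ceil(test_cases / TESTCASES_DEVELOPED_PER_DAY / team_size)
    execution = math.ceil(test_cases / TESTCASES_EXECUTED_PER_DAY / team_size)
    regression = math.ceil(ars / ARS_VERIFIED_PER_DAY / team_size)
    return test_cases, ars, preparation, execution, regression


test_cases, ars, prep, execution, regression = estimate(requirements=70, team_size=2)
print(f"{test_cases:.2f} test cases, {ars:.0f} ARs")   # 317.10 test cases, 244 ARs
print(f"preparation {prep}, execution {execution}, regression {regression} working days")
```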


5th Rule: Estimation shall never forget the past

The past has not been thrown away. The testing team continues to use the old process and the old spreadsheet: after an estimate is produced following the new rules, the team estimates again using the old process in order to compare the two results.

Normally, the results from the new estimation process are about 20 to 25 percent cheaper and faster than those from the old one. If the testing team gets a significantly different percentage, it goes back through the process to understand whether something was missed.


6th Rule: Estimation shall be recorded

All decisions should be recorded. This is very important because, if the requirements change for any reason, the records help the testing team estimate again without having to go back through all the steps and make the same decisions over. Sometimes it is also an opportunity to adjust the earlier estimate.


7th Rule: Estimation shall be supported by tools

A new spreadsheet has been created containing the metrics, which helps us reach an estimate quickly. The spreadsheet automatically calculates the costs and duration for each testing phase.

There is also a letter template with sections to be filled out, such as a cost table, risks, and free notes. This letter is sent to the customer. It also presents the different testing options, which helps the customer decide which kind of test he needs.


8th Rule: Estimation shall always be verified

Finally, All estimation should be verified. I've created another spreadsheet for recording the estimations. The estimation is compared to the previous ones recorded in a spreadsheet to see if they have similar trend. If the estimation has any deviation from the recorded ones, then a re-estimation should be made.