T E C H N I Q U E S

A piece of cake

By focusing on a selected set of variables, program slicing makes debugging a piece of cake. Mark Harman explains the technique.

Almost every programmer has endured the unpleasant task of sifting through hundreds of lines of code in order to find an error in just one. Tedium ad nauseum: there must be a better way. Enter program slicing.

Program slicing works by finding the parts of a program relevant to the value of a chosen set of variables at some chosen point in a program. A slice is constructed by deleting the parts of the program that are irrelevant to those values.

The point of interest is usually identified by annotating the program with line numbers which identify each primitive statement and each branch point. The term ‘slicing criterion’ is used for the point of interest together with the set of variables whose value the slice must preserve. In this article the point of interest will be indicated by adding a comment to the program. In general slices are constructed for a set of variables, but herein only slices constructed for a single variable will be considered. For a slicing criterion consisting of a variable v and a point of interest n, the slice is constructed for v at n.

Having picked a slicing criterion one of two forms of slice can be constructed: a backward slice or a forward slice. The former contains the statements of the program which can have some effect on the slicing criterion, whereas a forward slice contains those statements of the program which are affected by the slicing criterion. Backward slices can assist a developer by helping to locate the parts of the program which contain a bug. Forward slicing can be used to predict the parts of a program that will be affected by a modification.

Figure 1 – A program fragment to be backward sliced
x = 1;
y = 2;
z = y-2;
r = x;
z = x+y;
/* the slice point is the end of the program */.

Backward slicing and debugging

A backward slice – simply a version of the original program with some parts missing – can be compiled and executed. An important property of any backward slice is that it preserves the effect of the original program on the variable chosen at the selected point of interest within the program.

Consider the simple example fragment of C code in Figure 1. Suppose the only concern is the effect of the fragment on the variable z at the end of the program. A backward slice on z at the end of the program can be built to focus attention on this aspect of the fragment – such a slice is shown in Figure 2. It is easy to see that the line r = x; cannot affect the final value of z, so it is not included in the slice. Slightly less obvious is that the assignment z = y-2; also has no effect upon the final value of z. It too is not included in the slice because, although this statement updates the value of z in the middle of the program, the value assigned is overridden by the last line, thereby losing the effect of the first assignment.

Calculating the statements which should be deleted from a program is a tedious and intricate task, but fortunately slices can be calculated by a computer. This means that we can expect to see slicing technology in the next generation of case tools. A free prototype slicing tool, Unravel, is already available.

 Figure 2 – The backward slice of Figure 1
x = 1;
y = 2;
z = x+y;

In this article I shall consider only the way slicing can be used as a debugging and maintenance aid. It has, however, been suggested as a supporting technique for reuse, parallelisation of code, program comprehension, the calculation of metrics, reverse engineering, and program integration.

The program fragment in Figure 3 shows how slicing can help a programmer during the debugging phase of program development. The program is supposed to read in some exam marks and print the number of passes, the number of fails, the pass rate, and the average mark. Sadly, when the program is executed, the average mark always seems to be extremely low. This is a typical example of a unit test failure, and the developer would, at this point, be required to enter a debugging process to discover the cause of the bug. This process may take longer than it ought because the developer will have to consider a lot of lines of code which could not have caused the bug. Clearly any effort spent examining lines which cannot have an effect upon the value of average would be wasted effort.

 Figure 3 – An exam mark program
Pass = 0 ;
Fail = 0 ;
Count = 0 ;
while (!eof()) {
TotalMarks=0;
scanf("%d",Marks);
if (Marks >= 40)
Pass = Pass + 1;
if (Marks < 40)
Fail = Fail + 1;
Count = Count + 1;
TotalMarks = TotalMarks+Marks ;
}
printf("Out of %d, %d passed and %d failed\n",Count,Pass,Fail) ;
average = TotalMarks/Count;
/* This is the point of interest */
printf("The average was %d\n",average) ;
PassRate = Pass/Count*100 ;
printf("This is a pass rate of %d\n",PassRate) ;

Slicing to the rescue

A slicing tool can assist. The slice constructed for the variable average at the point at which it is printed out is given in Figure 4. It is about half the size of the original program, because it has to preserve only the effect of the original on the variable average. Running the slice will generate the same value for the average mark as the original. As the slice preserves the affect of the original program upon the variable average it is safe to ignore the other lines when tracking bugs which only affect the average mark. Slicing cannot compromise the debugging effort, but it can clearly assist it, as the slice is so much simpler than the original.

The slice in Figure 4 shows that the assignment of zero to the variable TotalMarks occurs inside the while loop. This is clearly wrong, as this sort of initialisation belongs outside the loop. It should be executed at the start of the program, or, at least, before the loop begins. If the original had been analysed the developer would probably have found himself saying ‘no, that line cannot be the one I’m looking for. It cannot affect the average value’, which would be a ‘virtual slice’ using the programmer’s mental arithmetic. The advantage of a slicing tool is that it relieves us of this burden.

 Figure 4 – The backward slice of the exam mark program
while (!eof()) {
TotalMarks=0;
scanf("%d",Marks);
Count = Count + 1;
TotalMarks = TotalMarks+Marks;
}
average = TotalMarks/Count;
printf("The average was %d\n",average) ;

I won’t pretend that slicing is some form of universal elixir. If, in the original program, the line that initialised the value of TotalMarks had been omitted, it would not make sense to say slicing could help locate the line which contained the bug! If a bug is caused by something that is missing from a program, this same something will be missing from any slice of the program. However, even in this situation, slicing may be of some help as the absence of the ‘missing something’ may be more noticeable in a smaller version of the program.

The pebble and the pond

As systems are developed and maintained, modifications to parts of the programs that make up the system are frequently needed. We are all painfully aware that these changes can often lead to unforeseen side effects, which come back to haunt us, often long after the original change was made. When part of a program is changed the effect of the change ‘ripples’ through the program. It can impact upon other parts of the code which ought to stay the same. Using forward slicing these ripples can be traced to establish which parts of the program may have been affected.

 Figure 5 – A program fragment to be forward sliced.
x = 1; /* considering changing this line */
y = 3;
p = x + y ;
z = y -2 ;
if(p==0)
r++ ;

See how forward slicing works in the simple program fragment in Figure 5. Suppose the first line of the program needs altering. As this line assigns a value to the variable x, any subsequent part of the program which ultimately depends upon the value of x may behave differently after the modification. The forward slice constructed for the variable x is given in Figure 6. It contains the lines of the program which are affected by a change to the first line. Notice that the line r++; has to be included in the forward slice, as its execution is controlled by the predicate p==0 which is affected by the slicing criterion.

 Figure 6 – The forward slice of Figure 5.
/* Change to first line will affect */
p = x + y ;
if(p==0)
r++ ;

Forward slicing can be used to assist a maintainer who has located a bug (perhaps assisted by backward slicing) and now wishes to modify the program to correct the bug. In so doing the maintainer also wants to be sure that account is taken of the ripple effects of the change. To illustrate this approach to maintenance, consider the example program fragment in Figure 7.

 Figure 7 – A program to be maintained
n = 0;
product = 1;
sum = 1;
scanf("%d",&x) ;
while (x >= 0){
sum = sum + x;
product = product * x ;
n = n + 1;
scanf("%d",&x);
}
average = (sum - 1) / n ;
printf("The total is %d\n",sum) ;
printf("The product is %d\n",product) ;
printf("The average is %d\n",average) ;

Suppose that the line which prints out the value of the variable sum has just been added to the program. This line cannot cause any ripple effects, because it does not change the value of any variables. It does, however, reveal that the program contains a bug which was hidden until the value of sum was inspected. The value stored in sum always seems to be slightly too big (in fact, it is always one more than it should be).

In order to locate the cause of this bug a backward slice can be constructed on the variable sum to find the lines which contribute to the calculation of its incorrect value. This slice is shown in Figure 8. It shows the initial value stored in sum is one. Since sum is a running total, it should be initialised to zero. The natural step is to replace the assignment sum = 1; with the assignment sum = 0; thereby correcting the bug. However, by changing an assignment like the one to sum, side effects may be unwittingly introduced because of the ripple effect.

 Figure 8 – The backward slice of Figure 7
sum = 1;
scanf("%d",&x) ;
while (x >= 0) {
sum = sum + x;
scanf("%d",&x);
}
printf("The total is %d\n",sum) ;

A forward slice on the variable sum will trace the ripples and show what other statements will be affected. The forward slice contains the lines indicated with the comment /* AFFECTED */ in Figure 9. The reassignment to sum inside the while loop and the line to print out its value at the end of the loop are affected by the change. However, the slice also reveals that the code to assign a value to average and the line which prints out its value are affected. A change to sum will have a ripple effect on the variable average.

Fixing the bug in the initialisation of sum would introduce a bug in the assignment to average, but since forward slicing has forewarned of this ripple effect the ugly ‘patch’ in the assignment to average can be replaced with the new assignment average = sum/n;.

 Figure 9 – The forward slice of Figure 7
n = 0;
product = 1;
sum = 0;
scanf("%d",&x) ;
while (x >= 0) {
sum = sum + x; /* AFFECTED */
product = product * x ;
n = n + 1;
scanf("%d",&x);
}
Average = (sum - 1) / n ; /* AFFECTED */
printf("The total is %d\n",sum) ; /* AFFECTED */
printf("The product is %d\n",product) ;
printf("The average is %d\n",average) ; /* AFFECTED */

Finally a forward slice should be constructed for the new assignment to average to ensure that no new ripple effects are caused by altering the assignment to average. Such a slice would reveal that the only statement affected by the correction is the one which prints out the average exam mark, so no further action is required. A slicing tool helps to identify the line of a program which causes a bug and to check that the modifications mad to fix the bug do not introduce any more errrors.

Unravel free slicing

Slicing was originally proposed by Mark Weiser in his seminal PhD thesis in 1979. Since then a great deal of research work involving the development, improvement, and extension of the algorithms for constructing program slices has been conducted. Until recently the only tools for constructing slices were academic play things which catered only for unrealistically small subsets of programming languages, but 1996 witnessed the production of a publicly available slicing tool for a real programming language. It was developed by Jim Lyle, Dolores Wallace, James Graham, Keith Gallagher, Joseph Poole, and David Binkley for the American National Institute of Standards and Technology. The Unravel team has, philanthropically, made its tool and its source code freely available on the Web.

Unravel currently runs on SUN/Sparc platforms and slices programs in ANSI C (although, programs which contain goto statements are not handled by Unravel). The authors have taken care to implement the slicing algorithms as efficiently as possible, so the slices constructed are delivered in a reasonable time. The tool has a pleasant user interface (written using X Windows) which allows the user to select the point of interest by clicking on points in the program text. The slice is displayed by highlighting the lines of the program which form the slice, so it is easy to see what is in the slice and what has been eliminated. Currently the tool produces only backward slices, but the effort required to add forward slicing is not too great.

Staff and students in my team at the University of North London are working on Unravel to provide a port to the PC and to investigate extensions to its functionality. I would like to acknowledge the major contribution made by the Unravel team to the public dissemination of slicing technology.

Other forms of slicing

The approaches to slicing I have described above are known as ‘static’ approaches, because the slices are constructed at compile time. In the ‘dynamic’ slicing paradigm, slices are constructed once the input to a program is known. Many researchers in the field believe that these slices will provide an even greater benefit to developers during the debugging phase as dynamic slices are constructed specifically for the test case that caused a bug to occur.

Some extremely exciting work is going on concerning a generalised slicing paradigm, called ‘conditioned’ or ‘quasi static slicing’. It acts as a bridge between the two extremes of static and dynamic slicing. A conditioned slice is constructed for a slicing criterion that includes the condition which cause the program to misbehave. My work on slicing indicates that it would be fruitful to set aside the restriction to command deletion as the only technique for simplifying the original program to construct the slice. The idea is to allow any program transformation which preserves the original program’s effect on the slicing criterion, thereby allowing the slicing tool to construct ‘thinner’ slices. Although this approach would not help much in bug location, in some applications the thinness of the slices produced is the overriding concern.

While we wait for these developments to come about, program slicing is already an invaluable tool for tracking down the pesky bugs that crawl into code. Try the technique – it will allow you much more time for better things. Perhaps a gateau in the lunchroom?

Mark Harman is director of research at the School of Informatics University of North London where he teaches programming in C++ and formal methods in Z. He can be contacted via email at m.harman@unl.ac.uk. Dr. Harman’s Web page http://www.unl.ac.uk/~mark/welcome.html contains more information about slicing, including a link to the Unravel page, from which a free copy of the Unravel ANSI C program slicing tool can be downloaded.