Software Engineering 1

Assignment Feedback






The following is some feedback about your Software Engineering assignments. They are general comments about the most frequently occurring faults and issues which I came across in testing the programs.

The comments can be split into the following main areas:-

  1. Constants
  2. Histogram - creating and initialising
  3. Initialising variables
  4. Structure of Program
  5. Files
  6. Arguments
  7. Miscellaneous

CONSTANTS

You should use constants whenever possible in your programs. For example, a constant should have been declared for number of grey levels (256). This could then be used to declare sizes of histogram arrays and when writing out contents of array. Therefore the value 256 should only appear once in program as a constant declaration.

The size of the image was also 256 rows and 256 columns and another constant should have been declared for this.

The same constant could have been used for each, but as each were representing separate entities (one the number of grey levels and one the size of the image) a separate constant should ideally have been declared and used for each.

Some of you declared constants but failed to use them fully.

Throughout the following discussion NUM_GREY has been used as constant to represent the number of grey levels and SIZE to represent the size of the image.

The constants should have been declared as global at the beginning of the program - no other variables should have been globals - see later note. It is OK to make constants global – since a constant by definition cannot be changed in the program. If the constants are globals all the functions can access and use them.


HISTOGRAM CREATING AND INITIALISING

When calculating histogram there is no need to process the image array 256 times, each time looking for grey level 1, then grey level 2 etc. You only need to process the image once:-

For example:
 

for (i=0; i< SIZE; i++) for (j=0; j< SIZE;j++)
{ n = image[i][j];
bin[n]++;
}
rather than the slower approach:-
  for (i = 0; i < NUM_GREY ; i ++)
{ for (j =0 ; j < SIZE; j++)
{ for (k = 0 ; k < SIZE ; k++)
{ if (image[j][k] == i) hist[i]++; }
}
}
This is still valid and will produce the same result, but is slower since it searches through the array 256 times, instead of only once. The 1st method looks at each element of the image in turn and then increments the appropriate element of the histogram array. The 2nd method searches through the image and finds all the pixels which are 0, then seraches for all the 1’s, then 2’s etc. One of the commonest faults encountered in the programs was the failure to initialise the histogram array to contain all zeros prior to starting to calculate the histogram. Both of the above methods assume that the array hist[] originally contains all zeros as it is incrementally added to the appropriate element depending on the grey level. When you declare a variable or an array, you cannot assume that it initially contains the value 0. In declaring a variable, you are simply requesting space in memory for it, that space in memory may have a value in it. The array could simply be initialised using   for (i=0; i < NUM_GREY; i++) hist[i] = 0; Many of you had obviously been lucky, in that the compiler you happened to use, and the past history of the memory resulted in the program working correctly without the initialisation. However, you should never assume a value that you are incrementally adding to will start at 0.

The array could also have been initialised when declared:-

int hist[NUM_GREY] = {0}; This initialises the first element to zero, and by default all subsequent elements which are not specified are initialised to 0. This is a special feature of initializing arrays and only works when setting all values to 0.


INITIALISING VARIABLES The initialisation of the array, is similar to the need to initialise any variable which you intend to use as a running total. For example, to calculate the mean, you incrementally add to a total, which must be initialised to 0 before use:- float total = 0;
for (i=0; i < NUM_GREY; i++) total += i*hist[i]; mean = total/(SIZE*SIZE);
.
Remember: when you declare a variable you cannot assume that it has the value of zero, unless you initialise it.

Many of you remembered to initialise the variable the 1st time you used it, ie when you first calculated the mean. You then went on to use the same variable for an incremental add later on. If the variable was used more than once in this manner, it MUST be reset before use every time. Otherwise the subsequent calculations will be wrong.
 


STRUCTURE

Many of you had given a lot of thought to the structure of their code and others hadn't and produced quite a repetitive program. For example. it was better not to calculate the histogram until the brighten or darken flags had been calculated – as these would change the histogram and result in the need to recalculate it. The mean, most frequently occurring and number of grey levels not used could then be calculated after the histogram. Again it was only necessary to calculate these once.

Another example was when a function was used to calculate the histogram but the array containing the histogram was not returned to the main program. Hence the other operations such as the most used flag (-m) and the used flag (-u) had to then recalculate the histogram. Hence in some programs the histogram was calculated three times when it was only necessary to calculate it once.

There were many other examples like this - plan your program carefully. The easiest way to avoid the repetition was using only the one array for the image and one for the histogram. When for example the reverse flag operated it changes the contents of the image array and doesn't create a new array - this avoids problems with the interaction of various flags.

DESIGN AND PLAN YOUR PROGRAM - IT WAS OBVIOUS WHO HAD REALLY THOUGHT ABOUT THE STRUCTURE AND DESIGN AND WHO HAD SIMPLY HACKED BITS ON THE END UNTIL THE PROGRAM DID EVERYTHING.

THE WELL DESIGNED PROGRAMS WERE SIMPLE AND SHORT!

It was only necessary to calculate the histogram once for the image if the order of the flags were considered. If the code to segment/threshold/reverse the image was done first the array image[][]could be altered, the histogram could then be calculated.
 
if (segment_flag == 1)
{
       // Code to segment image array
}
else if (threshold_flag==1)
{
    //code to threshold image array
}
if (reverse_flag == 1)
{
    //code to reverse image array
}

//Code to claculate histogram

If the histogram is calculated first we the need:-
//Code to calculate histogram

if (segment_flag == 1)
{
       // Code to segment image array
       // code to recalculate histogram
}
else if (threshold_flag==1)
{
    //code to threshold image array
    // code to recalculate histogram
}
if (reverse_flag == 1)
{
    //code to reverse image array
    // code to recalulate histogram
}


This is obviously more wasteful as we are repeatedly calculating teh histogram instead of waiting until all manipulations of the array image[][] are finished and then calculating the histogram.
 
 
 

See comments below under files Many of you used a function to calculate the mean. However many of you did not return a value from the function - you simply printed the value to the screen. this meant that when you needed the mean for segment - you had to physically retype the code to calculate the mean. This defeats the purpose of functions which are to allow re-use of code. If the mean function returned a value the function could be called to calculate the mean for segment.

For example many of you had:-
 

void mean_func (unsigned char image[][256])
{ float total =0;
float mean;
int i,j;

for (i=0;i<SIZE;i++)
{
    for (j=0;j<SIZE; j++)
    {
        total += image[i][j];
    }
}

mean = total/(SIZE*SIZE);
cout << "The mean is " << mean << "\n";
}
void segment (unsigned char image[][256])
{
float total =0;
float mean;
int i,j;

for (i=0;i<SIZE;i++)
{
    for (j=0;j<SIZE; j++)
    {
        total += image[i][j];
    }
}

mean = total/(SIZE*SIZE);

for (i=0;i<SIZE;i++)
{
    for (j=0;j<SIZE; j++)
    {
        if (image[i][j] >= mean )
            image[i][j] = 255;
        else
            image[i][j] = 0;
    }
}

}
 
whereas if the mean function returned a value - we do not need to repeat the code - we just need to re-call teh function. This is much better.
float mean_func (unsigned char image[][256])
{ float total =0;
float mean;
int i,j;

for (i=0;i<SIZE;i++)
{
    for (j=0;j<SIZE; j++)
    {
        total += image[i][j];
    }
}

mean = total/(SIZE*SIZE);
return (mean);
}
void segment (unsigned char image[][256])
{
 
float mean;
int i,j;
mean = mean_func(image);

for (i=0;i<SIZE;i++)
{
    for (j=0;j<SIZE; j++)
    {
        if (image[i][j] >= mean )
            image[i][j] = 255;
        else
            image[i][j] = 0;
    }
}

}
   


 
 

ARGUMENTS

The programs should have checked if the correct number of arguments were supplied. For those who processed the flags and then opened the files, you should have checked to ensure you had 2 arguments left, unless the -r, -t or -s  flags were selected when you should have had 3 arguments left. This checking could have taken the form:- if (segment_flag ==1 || threshold_flag==1 || reverse_flag ==1)
{ if (argc != 3)
{ cerr << "Incorrect Number of arguments\n");
help(); // Call help function and exit
}
}
else if (argc != 2) help(); // call help function and exit
Many programs worked incorrectly if for example run with hist -r lena lena.hist and no name for output image. This happened especially for those who had attempted some checking but had not been thorough enough.

This checking should also have considered the status of the -s, -t or -r flags. Some checked if the argument count was 2 or 3 and if it was 3 opened the output image file - it would have been better to check based on the flags.

Others had virtually no checking, relying on the fact that if no filenames were specified, the checking for opening the files would fail and the program would terminate. However, remember from Software Engineering, the aim is to produce reliable, easy to use programs. This sort of checking should be in place to tell the user what is wrong!

 

FILES

When you open a file, you should ensure that it has been opened correctly. If it has not been opened there is no point in continuing. Some of you omitted to check if the file was opened correctly. fi.open(argv[1],ios::binary|ios::in);

if (!fi)
{

cerr << "Error in opening file " << argv[0] << "\n";
exit(0);
}
Some checked that some files were opened but not others! You should have checked that all the files which were required were opened. The output file for the image was only required if the –r, -t  and –s flags were selected. Many of you always opened this file no matter which flags were selected. Then to prevent the reporting of errors - never checked if the file was opened correctly. This file should only have been opened depending on the flags. if (segment_flag ==1 || threshold_flag==1 || reverse_flag ==1 )
{ fo_image.open(argv[2], ios::binary| ios::out);

if (!fo_image)
{
    cerr<<<"Error opening file\n";
    exit(0);
}

}  
The function write_image() should also only have been called for the –r, -s or -t flags. Luckily this function didn't report errors if it couldn't write - otherwise many of the programs would not have functioned correctly! (in fact on other computers rather than the Linux boxes the calling of write_image() under the wrong circumstances did result in errors.
  It is only necessary to write to each file ONCE. ALthough the various flags change the histogram and the image array there is no need to write the image to the file until after all manipulations have been carried out. It is not good practise to keep opening the file, writing and closing it, only to re-open and overwrite later.

For example some of you had:-

     }
 

You could simple have had

    if (segment_flag ==1 || thershold_flag ==1 || reverse_flag ==1 )
    {
        fo_image.open(argv[2], ios::binary|ios::out);
        if (!fo_image)
        {
            cerr << "Can't open file" << argv[2] << "\n";
            exit(1);
        }
        write_image(fo_image, image);
        fo_image.close();
    }

Likewise it was only necessaryto write the histogram once.

Many of you, opened the file, wrote data to it and then if alternative data was to be written to the file, you reopened the file and overwrote the existing data in it. A better method would be to plan the program so that you only write the data required to the file and are not reopening and overwriting. This can lead to many problems and careful structuring of the program to avoid this would be beneficial. You should only close files which you have actually opened! Many of you had statements at the end of your program: fi.close();
fo_hist.close;
fo_image.close();
However, the file with file accessed with file pointer fo_image was only opened if -s, -t or -r  was selected and therefore only needs to be closed under those conditions. ie fi.close();
fo_hist.close;
          if (segment_flag ==1 || threshold_flag==1 || reverse_flag ==1 )
                fo_image.close();

MISCELLANEOUS

The header file image.h contained the declarations for the functions read_image(), write_image() etc. There was no need to repeat these declarations at the beginning of your program. They are already there if you have the line

#include "image.h"

Some of you created your own header files for your function declarations. You should also have declared any constants within this user created header file, and by including it in all other files, all parts of the program would be aware of the declarations. In calculating the total number of pixels in the image, (to use for either the mean or the probability flags) some of you wrote loops to go through the histogram array, summing the number of values in it. This was unnecessary since the total number of pixels was simply SIZE*SIZE or (256*256). Some declared the histogram array as a 2D array - for example int hist[2][256]; They then used the 1st row to store the greyscale values (0 to 255) and the 2nd row to store the appropriate number of elements with each of the greyscale values (the histogram). However, the 1st row is unnecessary as the greyscale number is the same as the element index. Using a 2D array probably makes it more complex than is necessary. The mean can be calculated in two different ways – either from the histogram array or from the image.

From the histogram array:-

float total =0;

for (i=0;i<NUM_GREY;i++)

total += i*hist[i]; mean = total/(SIZE*SIZE);
Or from the image array float total =0;
for (i=0;i<SIZE;i++)
{
    for (j=0;j<SIZE; j++)
    {
        total += i*hist[i];
    }
}
mean = total/(SIZE*SIZE); You should if possible never use GLOBAL VARIABLES. Global variables are variables declared outwith any functions and the main program. These variables can be accessed and changed from anywhere in the program. Remembering your Software Engineering and Highly Coupled systems - you should avoid this. It is bad practice and leads to programs which are difficult to follow.

You should pass the variables to the functions and return as required.

Remember - arrays are a special case - when an array is sent to a function - the function can change the array and the main program knows about the changes. See Example on Functions pages of the web pages for the course.

Some of you declared your variables as global and then used then as if they were local !

Some declared variables as global - even when not using any functions. This was unnecessary.

One of the most useful functions could have been a function containing the help information. This allowed what was essentially just lots of output statements to be placed in a function and left the main program looking much simpler. In addition, this function could have been called from several places - for example when the help flag (-h) was selected, if the user entered the wrong number of arguments or if the user entered a flag which was not allowed. In discussing the Software Engineering Principles we discussed the importance of the user requirements and specification and how these were critical to the design and implementation of the software. In presenting you with the assignment, I had essentially provided the Requirements Documentation, and this should have been followed exactly. For example, the following were noted in some programs which did not meet the specification:-

 Marking:-

The marking took into account the above comments. The highest marks were given for programs which completely met the specification, everything worked correctly, constants were used, the array and any variables were initialised as required. Marks for error checking were awarded, for example if code was added to check if files were opened correctly and to ensure that the correct number of arguments were specified. Marks were deducted if for example write_image() was always called, or the output image file always opened. Marks were also awarded if functions were used (but not so many marks if global variables were employed). The structure of the code was also taken into account - did it unnecessarily repeat parts of the code?

Marking Schedule

Marks allocated according to effort and resulting abilities of program.

> 75%              Good well written program. Program all operates correctly and functions were used extensively within the program
70 - 75%         Program all works correctly and some functions were employed, or functions with global variables were used.
65 - 70%         Program all works with a few minor anomalies or program all works but has not employed functions
60 - 65%         More serious anomalies with program including wrong calculations, typically 1 or 2 flags not behaving quite correctly although overall not to serious
55 - 60%         Approximately half of the flags are working correctly
50 - 55%         Only one or two flags are working correctly
45 - 50%         An attempt has been made on the program but it does not compile correctly
Below 45%     Poor attempt at program
 

Up to 5 marks  added/deducted within these bands depending on the structure of the code. Good well structured code where careful consideration is given to the program structure will gain marks and poorly structured repetitious code will lose marks.

Level within bands also depending on the use of the following features (reasons explained above):-

Use of Constants
Proper checking of arguments before trying to open files
Correct initialisation of histogram array if required
Initialisation of variables if required
Repetitious structure – unnecessary if statements, repeated of mean code when function exists, repeated opening/closing of files
Always calling the write_image() function
Always closing image file

If functions are employed, marks dependant on proper use of function arguments and will be deducted if global variables are employed
 
 
 

If you have any queries - please contact me by email (J.Bell@hw.ac.uk)