Software Engineering 1

Assignment Feedback


 

The following is some feedback about your Software Engineering assignments. They are general comments about the most frequently occurring faults etc. which I came across in testing the programs.

The comments can be split into the following main areas:-

  1. Constants
  2. Histogram - creating and initialising
  3. Structure of Program
  4. Files
  5. Arguments
  6. Miscellaneous


  1. CONSTANTS
  2. You should use constants whenever possible in your programs. For example, a constant should have been declared for number of grey levels (256). This could then be used to declare sizes of histogram arrays and when writing out contents of array. Therefore the value 256 should only appear once in program in constant declaration.

    The size of the image was also 256 rows and 256 columns and another constant should have been declared for this.

    The same constant could have been used for each, but as each were representing separate entities (one the number of grey levels and one the size of the image) a separate constant should ideally have been declared and used for each.

    Some of you declared constants but failed to use them fully.

    Throughout the following discussion NUM_GREY has been used as constant to represent the number of grey levels and SIZE to represent the size of the image.


  3. HISTOGRAM CREATING AND INITIALISING
  4. When calculating histogram there is no need to process the image array 256 times, each time looking for grey level 1, then grey level 2 etc. You only need to process the image once:-

    For example:
     

    for (i=0; i< SIZE; i++) for (j=0; j< SIZE;j++)
    { n = image[i][j];
    bin[n]++;
    }
     
    rather than the slower approach:-
      for (i = 0; i < NUM_GREY ; i ++)
    { for (j =0 ; j < SIZE; j++)
    { for (k = 0 ; k < SIZE ; k++)
    { if (image[j][k] == i) hist[i]++; }
    }
    }
     
    This is still valid and will produce the same result, but is slower since it searches through the array 256 times, instead of only once.

    One of the commonest faults encountered in the programs was the failure to initialise the histogram array to contain all zeros prior to starting to calculate the histogram. Both of the above methods assume that the array hist[] originally contains all zeros as it is incrementally added to the appropriate element depending on the grey level. When you declare a variable or an array, you cannot assume that it initially contains the value 0. In declaring a variable, you are simply requesting space in memory for it, that space in memory may have a value in it. The array could simply be initialised using
      for (i=0; i < NUM_GREY; i++) hist[i] = 0;
     

    Many of you had obviously been lucky, in that the compiler you happened to use, and the past history of the memory resulted in the program working correctly without the initialisation. However, in some cases I had to vary compilers and machines to get the program to work without this initialisation. It was particularly important to initialise the array if you calculated the histogram within a function. This is because of the memory management and use of arrays with functions.

    The initialisation of the array, is similar to the need to initialise any variable which you intend to use as a running total. For example, top calculate the mean, you incrementally add to a total, which must be initialised to 0 before use:-
     

    float total = 0;
    for (i=0; i < NUM_GREY; i++) total += i*hist[i]; mean = total/(SIZE*SIZE);
    .
    Remember: when you declare a variable you cannot assume that it has the value of zero, unless you initialise it
     


  5. STRUCTURE
  6. Many of you had given a lot of thought to the structure of their code and others hadn't and produced quite a repetitive program. For example:-
      if(prob_flag==0 && rev_flag==0)
    { if((fo_hist=fopen(argv[1],"w")) == NULL)
    { printf("Error2 cannot open %s for writing\n",argv[1]);
    exit(1);
    }
    for(i=0; i<256; i++)
    fprintf(fo_hist, "%d %d", i, hist[i]);
    fclose(freq_out);
    }

    if(prob_flag==0 && rev_flag==1)
    {

    /* call function Reverse */
    Reverse(hist);

    if((fo_hist=fopen(argv[1],"w")) == NULL)
    {

    printf("Error2 cannot open %s for writing\n",argv[1]);
    exit(1);
    }

    for(i=0; i<256; i++)

    fprintf(fo_hist, "%d %d", i, hist[i]);
    }

    if(prob_flag==1 && rev_flag==0)
    {

    /* call function Prob */
    Prob(hist,prob);

    if((fo_hist=fopen(argv[1],"w")) == NULL)
    {

    printf("Error2 cannot open %s for writing\n",argv[1]);
    exit(1);
    }

    for(i=0; i<256; i++)

    fprintf(fo_hist, "%d %f", i, prob[i]);
    }

    if(prob_flag==1 && rev_flag==1)
    {

    /* call function Reverse_Prob */
    Reverse_Prob(hist,prob_rev);

    if((fo_hist=fopen(argv[1],"w")) == NULL)
    {

    printf("Error2 cannot open %s for writing\n",argv[1]);
    exit(1);
    }

    for(i=0; i<256; i++)

    fprintf(fo_hist, "%d %f", i, prob_rev[i]);
    }
    Here, the code to open the file was included within every option. This was needed no matter what flags were selected so would have been better outside any option, and would then only have been required once. In this example, a function was used to calculate the probability, one for the reverse and one if both were required. The last function was unnecessary as the previously created reverse and probability functions could have been used. (For some of you, some care in the order in which you called these may have been required.)

    There were many other examples like this - plan your program carefully.

    DESIGN AND PLAN YOUR PROGRAM - IT WAS OBVIOUS WHO HAD REALLY THOUGHT ABOUT THE STRUCTURE AND DESIGN AND WHO HAD SIMPLY HACKED BITS ON THE END UNTIL THE PROGRAM DID EVERYTHING.
     
     
     
     

    When reversing the array (-r) there was no need to use an additional array. The swapping could all be achieved without introducing an extra temporary array. This saves using extra memory which was not required and gives a neater structure. For example for the reverse flag (-r), this could have been achieved as:-
      for (i=0; i < NUM_GREY/2; i++)
    { tmp = hist[i];
    hist[i] = hist[NUM_GREY-1-i];
    hist[NUM_GREY-1-i]=tmp;
    }
     
    Rather than as:-
      int tmp_array[NUM_GREY];

    for (i=0; i< NUM_GREY; i++)

    tmp_array[NUM_GREY-1-i] = hist[i];


    for (i=0; i < NUM_GREY; i++)

    hist[i] = tmp_array[i];
     
    Note that the loop only needs to go round NUM_GREY/2 times, because the loop is swapping the 1st element with the last, etc.

    NOTE: this means that element number 0 is swapped with element number 255

    Element 1 is swapped with element 254

    Therefore, element i is swapped with element 255-i (NUM_GREY-1-i) and not element 256-i as many of you implemented (I'm surprised that this didn't cause a core dump!)
     
     

    As for the reverse this could all be achieved without writing to a temporary array and then writing back. For example, many of you used,
      unsigned char temp[SIZE][SIZE];

    for(x=0; x<SIZE; x++)
    {

    for(y=0; y<SIZE; y++)
    { temp[SIZE-1-x][y] = image[x][y]; }
    }

    for(x=0; x<SIZE; x++)
    {

    for(y=0; y<SIZE; y++)
    { image[x][y] = temp[x][y]; }
    }
     
    which again could have been implemented as:-
      int temp;
    for(x=0; x<SIZE/2; x++)
    { for(y=0; y<SIZE; y++)
    { temp = image[x][y];
    image[x][y] = image[SIZE-1-x][y];
    image[SIZE-1-x][y] = temp;
    }
    }
     
    This second method saves declaring an extra array of size 256 by 256 and saves time as it doesn't go round the loops twice. (Many of you did this, but some still declared temp as an array of 256 by 256 elements when all that was required was a single variable.)

    Some used a method of calculating the probability which used two loops when everything could have been achieved in one loop. For example consider the following code structure:-
      float hist[NUM_GREY], prob[NUM_GREY];

    for (i=0; i < NUM_GREY; i++)

    prob[i] = hist[i]/(SIZE*SIZE);


    for (i=0; i < NUM_GREY; i++)

    hist[i] = prob[i};
     
    This could all have been achieved in one loop as:
      for (i=0; i < NUM_GREY; i++) hist[i] = hist[i]/(SIZE*SIZE);
     
    Even, if an integer array was used for hist and float array for prob, there is no need to introduce a second loop.
      int hist[NUM_GREY];
    float prob[NUM_GREY];

    for (i=0; i < NUM_GREY; i++)

    prob[i] = (float)hist[i]/(SIZE*SIZE);

    The value of the mean was required to calculate the variance. However, there was no need to repeat the code verbatim for both the -a and the -v flags. More thought about the structure could have avoided this. Particularly for those of you who opted to use functions. Functions are supposed to reduce code repetition and make it simple to reuse blocks of code required for different things. For example, here is an extract of code and the necessary functions from one program:-
      if (mean_flag == 1) average(image);


    if (variance_flag == 1)

    variance(image);


    void average(unsigned char image[][SIZE])
    {

    int a, b;
    float c = 0;
    float av_value;

    for (a = 0; a < SIZE; a++)
    {

    for (b = 0; b < SIZE; b++)
    { c += image[a][b]; }
    }

    av_value = (c / (SIZE * SIZE));
    printf("The average greyscale value for the image is %0.2f.\n", av_value);
    return;

    }

    void variance(unsigned char image[][SIZE])
    {

    int a, b;
    float c = 0, d = 0;
    float av_value;
    float variance;

    for (a = 0; a < SIZE; a++)
    {

    for (b = 0; b < SIZE; b++)
    { c += image[a][b]; }
    }

    av_value = (c / (SIZE * SIZE));
    for (a = 0; a < SIZE; a++)
    {

    for (b=0; b < SIZE; b++)
    { d += ((image[a][b] - av_value) * (image[a][b] -av_value)); }
    }

    variance = (d / ((SIZE * SIZE) - 1.0));
    printf("The variance of the image is %0.2f.\n\n", variance);
    return;

    }
     
    The clear repetition is obvious and unnecessary. By slightly changing the mean function to return a value, this function could have been called from the variance function. For example:-
     
      if (mean_flag == 1)
    { mean = average(image);
    printf("The average value is %0.2f\n", mean);
    }

    if (variance_flag == 1)

    variance(image);


    float average(unsigned char image[][SIZE])
    {

    int a, b;
    float c = 0;
    float av_value;

    for (a = 0; a < SIZE; a++)
    {

    for (b = 0; b < SIZE; b++)
    { c += image[a][b]; }
    }

    av_value = (c / (SIZE * SIZE));
    return (av_value);

    }

    void variance(unsigned char image[][SIZE])
    {

    int a, b;
    float d = 0;
    float av_value;
    float variance;

    av_value = average(image);

    for (a = 0; a < SIZE; a++)
    {

    for (b=0; b < SIZE; b++)
    { d += ((image[a][b] - av_value) * (image[a][b] -av_value)); }
    }

    variance = (d / ((SIZE * SIZE) - 1.0));
    printf("The variance of the image is %0.2f.\n\n", variance);
    return;

    }
     
    Even if functions had not been used, there was no need to repeat the code verbatim.


  7. FILES
  8. When you open a file, you should ensure that it has been opened correctly. If it has not been opened there is no point in continuing. Some of you omitted to check if the file was opened correctly. The opening and checking can be embedded into one statement
      if ((fi=fopen(argv[0], "r")) == NULL)
    { fprintf(stderr,"Error in opening file %s\n", argv[0]);
    exit(1);
    }
     
    Some opened the file and then repeated the openeing tin the checking statement
      fi=fopen(argv[0], "r");
    if ((fi=fopen(argv[0], "r")) == NULL)
    { fprintf(stderr,"Error in opening file %s\n", argv[0]);
    exit(1);
    }
     
    This is not required, as the fopen statement has been repeated.

    Many of you, opened the file, wrote data to it and then if alternative data was to be written to the file, you reopened the file and overwrote the existing data in it. A better method would be to plan the program so that you only write the data required to the file and are not reopening and overwriting. This can lead to many problems and careful structuring of the program to avoid this would be beneficial.

    You should only close files which you have actually opened! Many of you had statements at the end of your program:
      fclose(fi);
    fclose(fo_hist);
    fclose(fo_image);
     
    However, the file with file accessed with file pointer fo_image was only opened if -u or -f was selected and therefore only needs to be closed under those conditions. This actually caused a core dump using the compiler on the LINUX boxes , but the DECs appeared more forgiving. Remember: only close a file you have actually opened.


  9. ARGUMENTS
  10. The programs should have checked if the correct number of arguments were supplied. For those who processed the flags and then opened the files, you should have checked to ensure you had 2 arguments left, unless the -u or -f flags were selected when you should have had 3 flags left. This checking could have taken the form:-
      if (up_flag == 1 || flip_flag == 1 )
    { if (argc != 3)
    { fprintf(stderr,"Incorrect Number of arguments\n");
    help();
    }
    }
    else if (argc != 2) help();
     
    Many programs dumped core if for example run with

    hist -u lena lena.hist

    and no name for output image. This happened especially for those who had attempted some checking but had not been thorough enough. If for example, the following code was used and the program run with the -u flag and only 2 filenames (as given above), the program dumped core when it got to the stage of writing out the image, because it hadn't been opened
     

    if (argc == 2)
    { if ((fi = fopen (argv[0], "r"))==NULL)
    { fprintf(stderr, "Can't open file %s for reading\n",argv[0]);
    exit(1);
    }
    if ((fo_hist = fopen (argv[1], "w"))==NULL)
    { fprintf(stderr, "Can't open file %s for writing\n",argv [1]);
    exit(1);
    }
    }

    if (argc == 3)
    {

    if ((fi = fopen (argv[0], "r"))==NULL)
    { fprintf(stderr, "Can't open file %s for reading\n", argv [0]);
    exit(1);
    }

    if ((fo_hist = fopen (argv[1], "w"))==NULL)
    {

    fprintf(stderr, "Can't open file %s for writing\n", argv[1]);
    exit(1);
    }

    if ((fo_image = fopen (argv[2], "w"))==NULL)
    {

    fprintf(stderr, "Can't open file %s for writing\n", argv [2]);
    exit(1);
    }
    }
     
    This checking should also have considered the status of the -u and -f flags.

    Others had virtually no checking, relying on the fact that if no filenames were specified, the checking for opening the files would fail and the program would terminate. However, remember from Software Engineering, the aim is to produce reliable, easy to use programs. This sort of checking should be in place to tell the user what is wrong!
     
     


  11. MISCELLANEOUS
  12. Many opted to use functions for the mean and variance. When writing a function which will only return one value this can be returned through the type specifier rather than having to use pointers. If you look at standard library functions in C, they all use the type specifier whenever possible. Therefore, rather than writing the function in the form:-
      void average(float *average, int hist[])
    { int i;
    float total=0;

    for(i=0; i<NUM_GREY; i++)

    total+=i*pixel_no[i]; average*=total/(NUM_GREY*NUM_GREY);
    }
     
    use the type specifier if only returning one thing:-
      float average (int hist[])
    { int i;
    float total = 0;

    for (i=0; i < NUM_GREY; i++)

    total += i*hist[i]; return(total/(NUM_GREY*NUM_GREY));
    }

    The header file image.h contained the declarations for the functions read_image(), write_image() etc. There was no need to repeat these declarations at the beginning of your program. They are already there if you have the line

    #include "image.h"

    Some of you created your own header files for your function declarations. You should also have declared any constants within this user created header file, and by including it in all other files, all parts of the program would be aware of the declarations.

    In calculating the total number of pixels in the image, (to use for either the mean or the probability flags) some of you wrote loops to go through the histogram array, summing the number of values in it. This was unnecessary since the total number of pixels was simply SIZE*SIZE or (256*256).

    Some of you had values which were slightly out for the variance, because you had implemented the formula, incorrectly. The formula for the variance is

    Therefore after taking the sum of the square of each pixel value minus the mean, the total should then have been divided by the total number of pixels minus 1 (256*256 -1).
     
     
     
     


    I have not yet assigned marks to the programs, simply tested them all and noted any comments about them. Should you wish to see the specific comments relating to your program, please contact me. I will not however be able to give you a specific mark, but comments about the program are probably more beneficial to you at this stage anyway.


    Home