Writing Larger Programs

This handout deals with theoretical and practical aspects that need to be considered when writing larger programs.

When writing large programs we should divide them up into modules. These are separate source files: main() would be in one file, main.c say, and the others would contain the remaining functions.

We can create our own library of functions by writing a suite of subroutines in one (or more) modules. In fact modules can be shared amongst many programs by simply including the modules at compilation, as we will see shortly.

There are many advantages to this approach:

·   The modules will naturally divide into common groups of functions.

·   We can compile each module separately and link in compiled modules (more on this later).

·   UNIX utilities such as make help us maintain large systems (see later).

Header files

If we adopt a modular approach then we will naturally want to keep variable definitions, function prototypes etc. with each module. However what if several modules need to share such definitions?

It is best to centralize the definitions in one file and share this file amongst the modules. Such a file is usually called a header file.

Convention states that these files have an .h suffix.

We have met standard library header files already, e.g.:

   #include <stdio.h>

We can define our own header files and include them in our programs via:

   #include "my_head.h"

NOTE: Header files usually ONLY contain definitions of data types,
function prototypes and C preprocessor commands.
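As a rough illustration of the note above, a header of our own might look like the sketch below. The file name my_head.h comes from the text; the include guard MY_HEAD_H (a common convention that the handout does not discuss), the constant MAX_NAME, the type Person and the function person_init are all hypothetical names invented for this example. The function body is shown in the same listing only so that the sketch compiles on its own; in a real program it would live in a .c file that includes the header.

```c
/* ---- my_head.h (sketch): definitions shared between modules ---- */
#ifndef MY_HEAD_H            /* include guard: contents are read only once */
#define MY_HEAD_H

#define MAX_NAME 32          /* preprocessor constant (hypothetical) */

typedef struct {             /* data type definition (hypothetical) */
    char name[MAX_NAME];
    int  age;
} Person;

/* function prototype: announces the function, no body here */
void person_init(Person *p, const char *name, int age);

#endif /* MY_HEAD_H */

/* ---- person.c (sketch): would normally start with #include "my_head.h" ---- */
#include <string.h>

void person_init(Person *p, const char *name, int age)
{
    /* copy at most MAX_NAME - 1 characters and always terminate */
    strncpy(p->name, name, MAX_NAME - 1);
    p->name[MAX_NAME - 1] = '\0';
    p->age = age;
}
```

Any module that includes the header can then declare a Person and call person_init() without repeating the definitions.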

Consider the following simple example of a large program.

[Figure: Modular structure of a C program]

The full listings of main.c, WriteMyString.c and header.h are as follows:

main.c:

/* main.c */
#include "header.h"
#include <stdio.h>

/* Global string, shared with WriteMyString.c via an extern declaration */
char *AnotherString = "Hello Everyone";

int main(void)
{
   printf("Running...\n");

   /* Call WriteMyString() - defined in another file */
   WriteMyString(MY_STRING);

   printf("Finished.\n");
   return 0;
}

WriteMyString.c:

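/* WriteMyString.c */
#include <stdio.h>     /* needed for printf() */
#include "header.h"    /* lets the compiler compare definition and prototype */

/* Defined in main.c; no storage is reserved here */
extern char *AnotherString;

void WriteMyString(char *ThisString)
{
   printf("%s\n", ThisString);
   printf("Global Variable = %s\n", AnotherString);
}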

header.h:

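/* header.h */
#ifndef HEADER_H
#define HEADER_H
#define MY_STRING "Hello World"
void WriteMyString(char *ThisString);   /* full prototype, so calls are checked */
#endif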

We would usually compile each module separately.

Some modules have a #include "header.h" line so that they share common definitions.

Some, like main.c, also include standard header files.

main calls the function WriteMyString(), which is in the WriteMyString.c module.

The function prototype for WriteMyString is declared in header.h.

NOTE that in general we must resolve a tradeoff between the desire for each .c module to see only the information it needs for its job and the practical reality of maintaining lots of header files.

Up to some moderate program size it is probably best to have one or two header files that share the definitions of more than one module.

 

 

One problem is left with the modular approach:

SHARING VARIABLES

If we have global variables declared and instantiated in one module, how can we pass knowledge of them to other modules?

We could pass values as parameters to functions, BUT:

·   This can be laborious if we pass the same parameters to many functions and / or if there are long argument lists involved.

·   Very large arrays and structures are difficult to store locally -- they can exhaust the memory available on the stack.

External variables and functions

``Internal'' variables are the arguments and variables defined inside functions -- Local.

``External'' variables are defined outside of functions -- they are potentially available to the whole program (Global) but NOT necessarily.

External variables are always permanent: they exist, and keep their values, for the whole run of the program.

NOTE: In C, all function definitions are external, since standard C does not allow one function to be defined inside another.
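The difference between internal and external variables can be seen in a small sketch. The names call_count and count_call are hypothetical and chosen only for this illustration: the internal variable is created afresh on every call, while the external one keeps its value between calls.

```c
/* External: defined outside any function, exists for the whole run */
int call_count = 0;

int count_call(void)
{
    int local = 0;    /* internal: created afresh on every call */

    local++;          /* starts from 0 each time, so this is always 1 */
    call_count++;     /* keeps growing from one call to the next */
    return local;
}
```

However many times count_call() is called it returns the same value, whereas call_count records the total number of calls made so far.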

Scope of externals

An external variable (or function) is not always totally global.

C applies the following rule:

The scope of an external variable (or function) begins at its point of declaration and lasts to the end of the file (module) it is declared in.

Consider the following:

 
int main(void)
{ /* ... */ }

int what_scope;
float end_of_scope[10];

void what_global(void)
{ /* ... */ }

char alone;

float fn(void)
{ /* ... */ }

 

 

 

 

main cannot see what_scope or end_of_scope but the functions what_global and fn can. ONLY fn can see alone.

This is also one of the reasons why we should prototype functions before the body of code is given.

So here main will not know anything about the functions what_global and fn. what_global does not know about fn but fn knows about what_global since it is declared above.

NOTE: The other reason we prototype functions is that some checking can be done on the parameters passed to functions.
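A minimal sketch of both points, assuming a hypothetical caller first() and a hypothetical external variable scale (only the name fn is borrowed from the example above, here given a parameter for illustration): the prototype makes fn visible before its definition, while scale is in scope only from its declaration to the end of the file.

```c
/* Prototype: lets code above the definition call fn(), and lets the
   compiler check the argument types of each call. */
float fn(int n);

float first(void)
{
    /* fn is defined further down, but the prototype makes it visible here */
    return fn(3);
}

/* External variable: in scope from here to the end of the file,
   so first() above cannot refer to it by name. */
float scale = 1.5f;

float fn(int n)
{
    return n * scale;
}
```

Without the prototype, first() would be calling a function the compiler has not yet seen; with it, a call such as fn("three") would be reported as a type mismatch.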

If we need to refer to an external variable before it is declared or if it is defined in another module we must declare it as an extern variable. e.g.

   extern int what_scope;

So, returning to the modular example, we have a global string AnotherString defined in main.c and shared with WriteMyString.c, where it is declared extern.

BEWARE: the extern prefix gives a declaration, NOT a definition, i.e. NO STORAGE is set aside in memory for an extern variable -- it is just an announcement of the properties of a variable.

The actual variable must be defined exactly once in the whole program -- you can have as many extern declarations as needed.
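The distinction can be sketched in a single file, reusing the variable name what_scope from the scope example; the function read_scope and the initial value 42 are hypothetical. In a multi-file program the extern lines would sit in the modules that use the variable and the definition in exactly one module.

```c
/* Declarations: announce type and name only, no storage is reserved.
   Repeating them is harmless. */
extern int what_scope;
extern int what_scope;

int read_scope(void)
{
    /* legal because of the extern declaration above */
    return what_scope;
}

/* Definition: the single place where storage is set aside
   (and where an initial value may be given). */
int what_scope = 42;
```

Removing the final definition would leave the declarations with nothing to refer to, and the link step would be expected to fail with an undefined symbol.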

Array sizes must obviously be given with definitions but are not needed with extern declarations, e.g.:

   main.c:    int arr[100];

   file.c:    extern int arr[];
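One consequence, sketched below in a single listing, is that code which sees only the extern declaration cannot use sizeof to find the length of the array. Sharing the length through a second variable is one possible approach; the names arr_len and sum_arr are hypothetical additions to the arr example.

```c
#include <stddef.h>

/* As it might appear in file.c: the size is omitted, so sizeof(arr)
   is not available here and the length must come from elsewhere. */
extern int arr[];
extern const size_t arr_len;

long sum_arr(void)
{
    long total = 0;
    for (size_t i = 0; i < arr_len; i++)
        total += arr[i];
    return total;
}

/* As it might appear in main.c: the definition gives the size. */
int arr[100];
const size_t arr_len = sizeof arr / sizeof arr[0];
```

A #defined constant in a shared header would serve equally well for passing the length between modules.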

Advantages of Using Several Files

The main advantages of spreading a program across several files are:

·   Teams of programmers can work on programs. Each programmer works on a different file.

·   An object-oriented style can be used. Each file defines a particular type of object as a data type and operations on that object as functions. The implementation of the object can be kept private from the rest of the program. This makes for well-structured programs, which are easy to maintain.

·   Files can contain all functions from a related group, for example all matrix operations. These can then be accessed like a function library.

·   Well-implemented objects or function definitions can be re-used in other programs, reducing development time.

·   In very large programs each major function can occupy a file to itself. Any lower level functions used to implement them can be kept in the same file. Then programmers who call the major function need not be distracted by all the lower level work.

·   When changes are made to a file, only that file need be re-compiled to rebuild the program. The UNIX make facility is very useful for rebuilding multi-file programs in this way.
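The object-oriented point in the list above can be sketched with a small module. The handout does not introduce it, but the usual C mechanism for keeping an implementation private to one file is the static keyword, which is what this sketch assumes. The file names counter.h and counter.c and all function names are hypothetical.

```c
/* ---- counter.h (sketch): the public interface other files would include ---- */
void counter_reset(void);
void counter_add(int amount);
int  counter_value(void);

/* ---- counter.c (sketch): the implementation ---- */

/* static at file level: visible only inside this file, so no other
   module can change the total directly */
static int total = 0;

/* static helper: a lower level function that callers never see */
static int clamp_non_negative(int v)
{
    return v < 0 ? 0 : v;
}

void counter_reset(void)     { total = 0; }
void counter_add(int amount) { total = clamp_non_negative(total + amount); }
int  counter_value(void)     { return total; }
```

Other files can only go through the three public functions, so the way the total is stored can later be changed by editing counter.c alone.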

How to Divide a Program between Several Files

Where a program is spread over several files, each file will contain one or more functions. One file will include main while the others will contain functions which are called by others. These other files can be treated as a library of functions.

Programmers usually start designing a program by dividing the problem into easily managed sections. Each of these sections might be implemented as one or more functions. All functions from each section will usually live in a single file.

Where objects are implemented as data structures, it is usual to keep all functions which access that object in the same file. The advantages of this are:

·   The object can easily be re-used in other programs.

·   All related functions are stored together.

·   Later changes to the object require only one file to be modified.

Where the file contains the definition of an object, or of functions which return values, there is a further restriction on using them from another file. Unless the code in the other file is told about the object or function definitions, the compiler will be unable to compile the calls correctly.

The best solution to this problem is to write a header file for each of the C files. This will have the same name as the C file, but ending in .h. The header file contains declarations (prototypes) of all the functions defined in the C file that other files may call.

Whenever a function in another file calls a function from our C file, it can declare the function by making a #include of the appropriate .h file.

Organization of Data in each File

Each file should have its contents organized in a consistent order. This will typically be:

·   An introduction consisting of #defined constants, #included header files and typedefs of important data types.

·   Declaration of global and external variables. Global variables may also be initialized here.

·   One or more functions.

The order of items is important, since every object must be declared before it can be used. Functions which return values must be declared before they are called. This declaration might be one of the following:

·   Where the function is defined and called in the same file, the full definition of the function can be placed ahead of any call to the function.

·   If the function is called from a file where it is not defined, a prototype should appear before the call to the function, either among the global variables at the start of the source file or in a header file which is read in using an #include.
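The layout described in this section might look like the following sketch. All names in it (N_SAMPLES, Sample, samples, average, samples_mean) are hypothetical. For the sake of illustration the prototype of average is placed among the globals even though the function is defined in the same file, which is also permitted.

```c
/* 1. Introduction: included headers, #defined constants, typedefs */
#include <stddef.h>

#define N_SAMPLES 4

typedef double Sample;

/* 2. Global variables (may be initialized here) and prototypes */
Sample samples[N_SAMPLES] = { 1.0, 2.0, 3.0, 6.0 };

Sample average(const Sample *v, size_t n);   /* prototype: defined below */

/* 3. Functions */
Sample samples_mean(void)
{
    /* the call is legal because the prototype appears above */
    return average(samples, N_SAMPLES);
}

Sample average(const Sample *v, size_t n)
{
    Sample sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return n ? sum / n : 0.0;   /* avoid dividing by zero for an empty array */
}
```

If average were moved to another file, its prototype would move to that file's header and the #include at the top would bring it back in.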