The same mechanisms can be used to read or write data from and to files. It is also possible to treat character strings in a similar way, constructing or analysing them and storing results in variables. These variants of the basic input and output commands are discussed in the next section.
The Standard Input Output File
UNIX supplies a standard package for performing input and output to files or the terminal. This contains most of the functions which will be introduced in this section, along with definitions of the datatypes required to use them. To use these facilities, your program must include these definitions by adding the line
#include <stdio.h>
near the start of the program file.
If you do not do this, the compiler may complain about undefined functions or datatypes.
Character Input / Output
This is the lowest level of input and output. It provides very precise control, but is usually too fiddly to be useful. Most computers perform buffering of input and output. This means that they'll not start reading any input until the return key is pressed, and they'll not print characters on the terminal until there is a whole line to be printed.
getchar
getchar returns the next character of keyboard input as an int. At the end of the input, or if there is an error, EOF (end of file) is returned instead. It is therefore usual to compare this value against EOF before using it. The return value should be stored in an int rather than a char: a char cannot hold EOF as a distinct value, so the end of input and error conditions would not be detected reliably.
As an example, here is a program to count the number of characters read until an EOF is encountered. EOF can be generated by typing Control - d.
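#include <stdio.h>
int main(void)
{ int ch, i = 0;             /* ch is an int so that it can hold EOF */
  while((ch = getchar()) != EOF)
    i ++;                    /* Count each character read */
  printf("%d\n", i);
  return 0;
}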
putchar
putchar puts its character argument on the standard output (usually the screen).
The following example program converts any typed input into capital letters. To do this it applies the function toupper from the character conversion library ctype.h to each character in turn.
#include <stdio.h>
#include <ctype.h>
int main(void)
{ int ch;
  while((ch = getchar()) != EOF)
    putchar(toupper(ch));    /* Print the capital letter version */
  return 0;
}
Formatted Input / Output
We have met these functions earlier in the course. They are closest to the facilities offered by Pascal or Fortran, and usually the easiest to use for input and output. The versions offered under C are a little more detailed, offering precise control of layout.
printf
This offers more structured output than putchar. Its arguments are, in order: a control string, which controls what gets printed, followed by a list of values to be substituted for entries in the control string.
Control String Entry    What Gets Printed
%d                      A Decimal Integer
%f                      A Floating Point Value
%c                      A Character
%s                      A Character String
There are several more types available. For full details type
man printf
on your UNIX system.
It is also possible to insert numbers into the control string to control field widths for values to be displayed. For example %6d would print a decimal value in a field 6 spaces wide, %8.2f would print a real value in a field 8 spaces wide with room to show 2 decimal places. Display is right justified by default, but can be left justified by putting a - before the format information, for example %-6d, a decimal integer left justified in a 6 space field.
scanf
scanf allows formatted reading of data from the keyboard. Like printf it has a control string, followed by the list of items to be read. However scanf wants to know the address of the items to be read, since it is a function which will change their values. Therefore the names of variables are preceded by the & sign. Character strings are an exception to this. Since the name of a string already acts as a character pointer, we give the names of string variables unmodified by a leading &.
Control string entries which match values to be read are preceded by the percentage sign in a similar way to their printf equivalents.
Type man scanf for details of all options on your system.
Whole Lines of Input and Output
Where we are not too interested in the format of our data, or perhaps we cannot predict its format in advance, we can read and write whole lines as character strings. This approach allows us to read in a line of input, and then use various string handling functions to analyse it at our leisure.
gets
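gets reads a whole line of input into a string until a newline or EOF is encountered. It is critical to ensure that the string is large enough to hold any expected input lines, because gets has no way of limiting how much it reads. For this reason gets was removed from the C standard in C11, and fgets (described in the section on handling files) is the safer choice in new programs.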
When all input is finished, NULL as defined in stdio.h is returned.
puts
puts writes a string to the output, and follows it with a newline character.
Example: Program which uses gets and puts to double space typed input.
#include <stdio.h>
int main(void)
{ char line[256];              /* Define string sufficiently large to
                                  store a line of input */
  while(gets(line) != NULL)    /* Read line (gets cannot limit its input) */
  { puts(line);                /* Print line */
    printf("\n");              /* Print blank line */
  }
  return 0;
}
Note that putchar, printf and puts can be freely used together. So can getchar, scanf and gets.
Handling Files in C
This section describes the use of C's input / output facilities for reading and writing files. There is also a brief description of string handling functions here.
The functions are all variants on the forms of input / output which were introduced in the previous section.
UNIX File Redirection
UNIX has a facility called redirection which allows a program to access a single input file and a single output file very easily. The program is written to read from the keyboard and write to the terminal screen as normal.
To run prog1 but read data from file infile instead of the keyboard, you would type
prog1 < infile
To run prog1 and write data to outfile instead of the screen, you would type
prog1 > outfile
Both can also be combined as in
prog1 < infile > outfile
Redirection is simple, and allows a single program to read or write data to or from files or the screen and keyboard.
Some programs need to access several files for input or output, and redirection cannot do this. In such cases you will have to use C's file handling facilities.
C File Handling - File Pointers
C communicates with files using a new datatype called a file pointer. This type is defined within stdio.h, and written as FILE *. A file pointer called output_file is declared in a statement like
FILE *output_file;
Opening a file pointer using fopen
Your program must open a file before it can access it. This is done using the fopen function, which returns the required file pointer. If the file cannot be opened for any reason then the value NULL will be returned. You will usually use fopen as follows
if ((output_file = fopen("output_file", "w")) == NULL)
fprintf(stderr, "Cannot open %s\n", "output_file");
fopen takes two arguments, both of which are strings. The first is the name of the file to be opened, the second is an access mode, which is usually one of:
"r" Open file for reading
"w" Create file for writing
"a" Open file for appending
As usual, use the man command for further details by typing man fopen.
Standard file pointers in UNIX
UNIX systems provide three file pointers which are automatically open to all C programs. These are
stdin The standard input. The keyboard or a redirected input file.
stdout The standard output. The screen or a redirected output file.
stderr The standard error. This is normally the screen, even when output is redirected. This is the conventional place to put any error messages.
Since these files are already open, there is no need to use fopen on them.
Closing a file using fclose
The fclose function can be used to disconnect a file pointer from a file. This is usually done so that the pointer can be used to access a different file. Systems have a limit on the number of files which can be open simultaneously, so it is a good idea to close a file when you have finished using it.
This would be done using a statement like
fclose(output_file);
If files are still open when a program exits, the system will close them for you. However it is usually better to close the files properly.
Input and Output using file pointers
Having opened a file pointer, you will wish to use it for either input or output. C supplies a set of functions to allow you to do this. All are very similar to input and output functions that you have already met.
Character Input and Output with Files
This is done using equivalents of getchar and putchar which are called getc and putc. Each takes an extra argument, which identifies the file pointer to be used for input or output.
putchar(c) is equivalent to putc(c, stdout)
getchar() is equivalent to getc(stdin)
Formatted Input Output with File Pointers
Similarly there are equivalents to the functions printf and scanf which read or write data to files. These are called fprintf and fscanf. You have already seen fprintf being used to write data to stderr.
The functions are used in the same way, except that fprintf and fscanf take the file pointer as an additional first argument.
Formatted Input Output with Strings
These are the third set of the printf and scanf families. They are called sprintf and sscanf.
sprintf
puts formatted data into a string which must have sufficient space allocated to hold it. This can be done by declaring it as an array of char. The data is formatted according to a control string of the same form as that for printf.
sscanf
takes data from a string and stores it in other variables as specified by the control string. This is done in the same way that scanf reads input data into variables. sscanf is very useful for converting strings into numeric values.
Whole Line Input and Output using File Pointers
Predictably, equivalents to gets and puts exist called fgets and fputs. The programmer should be careful in using them, since they are incompatible with gets and puts. fgets requires the programmer to specify the maximum number of characters to be read. fgets retains the trailing newline character on the line it reads and fputs does not add one to the line it writes, whereas gets discards the newline and puts appends one.
When transferring data from files to standard input / output channels, the simplest way to avoid incompatibility with the newline is to use fgets and fputs for files and standard channels too.
For example, read a line from the keyboard using
fgets(data_string, 80, stdin);
and write a line to the screen using
fputs(data_string, stdout);
Special Characters
C makes use of some 'invisible' characters which have already been mentioned. However a fuller description seems appropriate here.
NULL, The Null Pointer or Character
NULL is a pointer value. A pointer variable set to NULL does not reference any object (i.e. it is a pointer to nothing). It is usual for functions which return pointers to return NULL if they failed in some way. The return value can be tested. See the section on fopen for an example of this.
NULL is returned by read commands of the gets family when they try to read beyond the end of an input file.
The null character is a separate value, commonly written as '\0'. It is the string termination character which is automatically appended to any string constants in your C program. You usually need not bother about this final '\0', since it is handled automatically. However it sometimes makes a useful target to terminate a string search. There is an example of this in the string_length function example in the section on Functions in C.
EOF, The End of File Marker
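EOF is a special integer value, distinct from every valid character, which indicates the end of a file. It is returned by read commands of the getc family when they try to read beyond the end of a file. Functions of the scanf family also return EOF if the input ends before any value has been read.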
Other String Handling Functions
As well as sprintf and sscanf, the UNIX system has a number of other string handling functions within its libraries. A number of the most useful ones are declared in the header file string.h, which is included by adding the line
#include <string.h>
near to the head of your program file.
A couple of the functions are described below.
strcpy(str1, str2) Copies str2 into str1
strcmp(str1, str2) Compares the contents of str1 and str2. Returns 0 (false) if both are equal.
A full list of these functions can be seen using the man command by typing
man 3 strings
Conclusion
The variety of different types of input and output, using standard input or output, files or character strings, makes C a very powerful language. The addition of character input and output makes it highly suitable for applications where the format of data must be controlled very precisely.