Tuesday, April 17, 2012

The Standard C Library for Linux, Part Five: Miscellaneous Functions

The last article was on <ctype.h> character handling.  This article is on <stdlib.h> which contains many small sections: integer math, sorting and searching, random numbers, string to number conversions, multibyte character conversions, memory allocation and environmental functions.   Because this library contains so many small  yet very important sections I want to discuss each of these groups in its own section.  An example will be given in each section below because these functions are too diverse to have a single example for all of them.

I am assuming a knowledge of C programming on the part of the reader.  There is no guarantee of accuracy in any of this information nor suitability for any purpose.
As always, if you see an error in my documentation please tell me and I will correct myself in a later document.  See corrections at end of the document to review corrections to the previous articles.

Integer Math
    #include <stdlib.h>
    int      abs(int x);
    div_t    div(int numerator, int denominator);
    long int labs(long int x);
    ldiv_t   ldiv(long int numerator, long int denominator);

int x
int numerator
int denominator
The long int versions take the same arguments as the int versions, only declared as long int.

abs returns the absolute value of the argument.
div returns a data structure that contains both the quotient and remainder.
labs is the long version of the abs function.
ldiv is the long version of the div function.

Integer math is math using whole numbers.  No fractions.  This is math from the fourth grade.  If you remember the numerator is divided by the denominator and the answer is the quotient with the left over stuff being the remainder then you have got it.  The div_t and ldiv_t are structures that hold the quotient and the remainder.  These structures look like this:

typedef struct {
    int quot;
    int rem;
} div_t;
 
typedef struct {
    long int quot;
    long int rem;
} ldiv_t;
 
These types are already defined for you in the <stdlib.h> library.  The example file shows a few ways to use these four functions.
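
As a rough sketch of how these four functions fit together (this is not the example file mentioned above), the helper names split_seconds and gap below are made up for illustration.  div always truncates toward zero, so div(-7, 2) gives a quotient of -3 and a remainder of -1.

```c
#include <stdlib.h>

/* Split a duration in seconds into whole minutes and leftover seconds.
   split_seconds is a hypothetical name chosen for this sketch. */
div_t split_seconds(int total_seconds)
{
    /* quot holds the minutes, rem holds the seconds left over */
    return div(total_seconds, 60);
}

/* Distance between two long values, whichever one is larger.
   Assumes the difference a - b does not overflow a long. */
long gap(long a, long b)
{
    return labs(a - b);
}
```

Calling split_seconds(125) should give a quot of 2 and a rem of 5, and gap(3, 10) should give 7.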

String to Number Conversions
    #include <stdlib.h>
    double   atof(const char *string);
    int      atoi(const char *string);
    long int atol(const char *string);
    double   strtod(const char *string, char **endptr);
    long int strtol(const char *string, char **endptr, int base);
    unsigned long int strtoul(const char *string, char **endptr, int base);

const char *string
char **endptr
int base

atof is ASCII to floating point conversion; it returns a double.
atoi is ASCII to integer conversion.
atol is ASCII to long conversion.
strtod is string to double conversion.  On return, endptr points at the first character that was not part of the number.
strtol is string to long conversion, and the string can contain numbers in bases other than base 10.
strtoul is the same as strtol, except that it returns an unsigned long.
 
If you are reading in a number from user input then you will need to use these routines to convert from the digits '1' '2' '3' to the number 123.  The easiest way to convert the other way, from a number to a string,  is to use the sprintf() function.

The example program is just a sample of use of each of the above commands.
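
One thing worth sketching is why strtol is usually a better choice than atoi for user input: atoi gives you no way to tell "0" from "not a number", while strtol reports where it stopped through endptr and reports overflow through errno.  The helper name parse_long below is hypothetical, and the policy of rejecting trailing characters is my own assumption for this sketch.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert text to a long in the given base.
   Returns 1 on success and stores the value in *out.
   Returns 0 if the text holds no number, has trailing characters,
   or is out of range for a long.
   parse_long is a hypothetical helper name. */
int parse_long(const char *text, int base, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, base);

    if (end == text)        /* no digits were consumed */
        return 0;
    if (*end != '\0')       /* something other than the number followed */
        return 0;
    if (errno == ERANGE)    /* the value did not fit in a long */
        return 0;

    *out = value;
    return 1;
}
```

With this sketch, parse_long("123", 10, &n) succeeds with n set to 123, parse_long("ff", 16, &n) succeeds with n set to 255, and parse_long("12abc", 10, &n) fails.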
 
Searching and Sorting
    #include <stdlib.h>
    void  qsort(void *base, size_t num_of_objs, size_t size_of_obj,
                int (*compar)(const void *, const void *));
    void *bsearch(const void *key, const void *base, size_t num_of_objs, size_t size_of_obj,
                  int (*compar)(const void *, const void *));

void *base
size_t num_of_objs
size_t size_of_obj
const void *
const void *key
 
qsort will sort an array of any type of object using a comparison function that you write yourself.
bsearch will search the sorted array using a comparison function that you write yourself.  It returns a pointer to the matching element, or NULL if the key is not found.

You do not need to write your own sorting routines yourself.  Through the use of these functions you can sort and search through memory arrays.

It is important to realize that you must sort an array before you can search it because of the search method used.

In order to generate the information to have something to sort I combined this example with the random number generation.  I initialize a string array with a series of random numbers and then sort it.  I then look to see if the string 1000 is in the table.  I finally print out the sorted array.
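
The example described above sorts strings; as a smaller sketch of the same idea, here is the pattern with plain ints.  The function names compare_ints, sort_ints and find_int are hypothetical.  The one rule to remember is that the comparison function must return a negative number, zero, or a positive number, and that the same function should be used for both sorting and searching.

```c
#include <stdlib.h>

/* Comparison function in the form qsort and bsearch expect:
   negative, zero, or positive as *a is less than, equal to,
   or greater than *b. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow that x - y can cause */
}

/* Sort the array in place, smallest value first. */
void sort_ints(int *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_ints);
}

/* Return a pointer to key inside the already sorted array,
   or NULL if it is not there. */
int *find_int(int key, int *values, size_t count)
{
    return bsearch(&key, values, count, sizeof values[0], compare_ints);
}
```

Call sort_ints first and find_int second; searching an unsorted array with bsearch gives unreliable answers.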

Memory Allocation
    #include <stdlib.h>
    void *calloc(size_t num_of_objs, size_t size_of_objs);
    void  free(void *pointer_to_obj);
    void *malloc(size_t size_of_object);
    void *realloc(void *pointer_to_obj, size_t size_of_obj);

size_t num_of_objs
size_t size_of_objs
void *pointer_to_obj

free will free the specified memory that was previously allocated.  Freeing the same memory twice is an error and will often crash your program with a core dump.
malloc will allocate the specified number of bytes and return a pointer to the memory, or NULL if the allocation fails.
calloc will allocate an array of num_of_objs objects, set it to all zeros, and return a pointer to the array.
realloc allows you to change the size of a memory area "on-the-fly".  You can shrink and grow the memory as you need, but the memory may be moved, so always use the pointer that realloc returns.  Be aware that trying to access memory beyond what you have allocated can cause a core dump.

Runtime memory allocation allows you to write a program that only uses the memory that is needed for that program run.  No need to change a value and recompile if you ask for the memory at runtime.  Also no need to set up arrays to the maximum possible size when the average run is a fraction of the size of the maximum.

The danger of using memory this way is that in complex programs it is easy to forget to free memory when you are done with it.  These "memory leaks" can eventually cause a long-running program to use all available memory on a system and fail.  It is also important not to assume that a memory allocation will always work; check the returned pointer for NULL before using it.  Attempting to use a pointer to a memory location that your program doesn't own will usually cause a core dump.  A more serious problem is when a pointer is overwriting your own program's memory.  This will cause your program to behave very erratically, and the exact problem will be hard to pinpoint.

I had to write two different examples to demonstrate all the diversity of these functions.  In order to actually demonstrate their use I had to actually program something halfway useful.
The first example is a stack program that allocates and deallocates the memory as you push and pop values from the stack.
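
A minimal sketch of such a stack, assuming a singly linked list of ints, might look like the following.  It is not the example program itself; the names node, push and pop are hypothetical.  Each push allocates one node with malloc and each pop releases it with free.

```c
#include <stdlib.h>

/* One stack entry; the top of the stack is the head of the list. */
struct node {
    int value;
    struct node *next;
};

/* Push value onto the stack.  Returns 1 on success, 0 if malloc failed. */
int push(struct node **top, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;           /* never assume the allocation worked */
    n->value = value;
    n->next = *top;
    *top = n;
    return 1;
}

/* Pop the top value into *value.  Returns 1 on success, 0 if the stack is empty. */
int pop(struct node **top, int *value)
{
    struct node *n = *top;
    if (n == NULL)
        return 0;
    *value = n->value;
    *top = n->next;
    free(n);                /* give the memory back as soon as we are done with it */
    return 1;
}
```

An empty stack is just a struct node pointer set to NULL; pass its address to push and pop.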

The second example reads any file into the computer's memory, reallocating the memory as it goes.  I left debug statements in the second program so that you can see that the memory is only reallocated when the program needs more memory.
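
The grow-as-you-go idea can be sketched roughly as below.  This is not the example program; the function name read_all, the starting size of 64 bytes and the choice to double the buffer each time are all assumptions made for the sketch.  Note that the result of realloc goes into a temporary pointer first, so the old buffer is not lost if the call fails.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read everything from fp into one heap buffer.
   On success returns the buffer (the caller must free it) and stores
   the number of bytes read in *length.
   Returns NULL on an allocation failure or a read error.
   read_all is a hypothetical name. */
char *read_all(FILE *fp, size_t *length)
{
    size_t capacity = 64;           /* assumed starting size */
    size_t used = 0;
    char *buffer = malloc(capacity);
    int c;

    if (buffer == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF) {
        if (used == capacity) {     /* only reallocate when the buffer is full */
            char *bigger = realloc(buffer, capacity * 2);
            if (bigger == NULL) {
                free(buffer);       /* the old block is still ours to free */
                return NULL;
            }
            buffer = bigger;
            capacity *= 2;
        }
        buffer[used++] = (char)c;
    }

    if (ferror(fp)) {
        free(buffer);
        return NULL;
    }
    *length = used;
    return buffer;
}
```

The buffer is not null-terminated, so use the returned length rather than strlen when working with the contents.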

Environmental
    #include <stdlib.h>
    void  abort(void);
    int   atexit(void (*func)(void));
    void  exit(int status);
    char *getenv(const char *string);
    int   setenv(const char *name, const char *value, int overwrite);
    int   unsetenv(const char *name);
    int   system(const char *string);

void
void (*func)(void)
int status
const char *string
const char *name
const char *value
int overwrite
 
abort causes the signal SIGABRT to be sent to your program.  Unless your program handles the signal it will exit with an abort error.
atexit will allow you to run a set of function calls upon exit from your program.  You can stack them up quite a bit; the standard guarantees that you can register at least 32 of these.
exit will exit your program with the specified integer return value.
getenv will return the value of the environmental variable specified or a NULL if the environmental variable is not set.
setenv will set the specified variable to the specified value; if the variable already exists it is only changed when overwrite is nonzero.  It will return a -1 on an error.
unsetenv will unset the specified variable.
system will execute the specified command string and return the termination status of the command.

These functions allow you to connect back to the Unix environment that you ran your program from and set exit values, read the values of environmental variables and run commands from within a C program.

The example program demonstrates how to read an environmental variable and the two different methods of setting an environmental variable.  Run this program without TESTING being set and then run `export TESTING=anything` and run the program again.  You will notice the difference between the two runs.  Also notice the order of the atexit() function calls and the order in which they are actually called when the program does exit.  Put one of the abort() calls before the exit and rerun the program; when abort is called, the atexit() functions are not called.
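
As a small sketch of the getenv and setenv part only (again, not the example program), the two helpers below use hypothetical names, env_or_default and set_if_missing.  They assume a system that provides setenv, which comes from POSIX rather than from ANSI C.

```c
#include <stdlib.h>

/* Return the value of the named variable, or fallback if it is not set.
   env_or_default is a hypothetical name. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}

/* Give the variable a value only if it does not already have one.
   Passing 0 as the overwrite argument leaves an existing value alone.
   Returns 0 on success and -1 on error, the same as setenv. */
int set_if_missing(const char *name, const char *value)
{
    return setenv(name, value, 0);
}
```

A change made with setenv is visible to your own program and to commands it starts with system, but not to the shell that launched your program.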

Random Numbers
    #include <stdlib.h>
    int  rand(void);
    void srand(unsigned int seed);

void
unsigned int seed

rand will return a pseudo-random value between 0 and RAND_MAX.
srand starts a new sequence of pseudo-random numbers based on the seed.

The rand function will set the seed to 1 the first time that you call rand in your program unless you set it to something else.  The sequence of numbers that you get from rand will be in the same order if you set seed to the same value each time.  To get closer to truly random numbers you should set the seed to something that won't repeat.  time() is what I use in the example.

The example for this section has been combined with the sorting and searching.
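
For a sketch of the seeding idea on its own, the two hypothetical helpers below seed the generator from the clock and turn rand into a die roll.  Taking the remainder is the simplest way to limit the range; it is slightly biased toward low values unless the number of sides divides RAND_MAX + 1 evenly, which is fine for a demonstration.

```c
#include <stdlib.h>
#include <time.h>

/* Seed the generator from the clock so each run gets a different sequence.
   seed_from_clock is a hypothetical name. */
void seed_from_clock(void)
{
    srand((unsigned int)time(NULL));
}

/* Return a value from 1 to sides inclusive.  Assumes sides is at least 1. */
int roll(int sides)
{
    return rand() % sides + 1;
}
```

Calling srand with a fixed value such as srand(42) instead of seed_from_clock gives the same rolls on every run, which is handy when debugging.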

Multibyte Conversions
    #include <stdlib.h>
    int mblen(const char *s, size_t n);
    int mbtowc(wchar_t *pwc, const char *s, size_t n);
    int wctomb(char *s, wchar_t wchar);
    size_t mbstowcs(wchar_t *pwcs, const char *s, size_t n);
    size_t wcstombs(char *s, const wchar_t *pwcs, size_t n);

This is that newfangled multilanguage character mapping stuff.  I don't think that I am qualified to write about it yet.  I will revisit it once I have covered everything else.  Or maybe someone else could tell us how to use these in everyday programming.



Bibliography:

The ANSI C Programming Language, Second Edition, Brian W. Kernighan, Dennis M. Ritchie, Prentice Hall Software Series, 1988
The Standard C Library, P. J. Plauger, Prentice Hall P T R, 1992
The Standard C Library, Parts 1, 2, and 3, Chuck Allison, C/C++ Users Journal, January, February, March 1995
STDLIB(3), BSD MANPAGE, Linux Programmer's Manual, 29 November 1993
