C PROGRAMMING Considering that there are many feet of books on C at the UCSD bookstore, it is unrealistic to attempt a comprehensive overview of the language in part of a one-quarter course. Rather I hope to provide enough to get you started writing programs and familiar enough with the general idea that you will be able to pick up more skills as your programming experience grows. Why not C++? (1) I don't know C++ (2) Numerical Recipies is available in C, not C++ Why not FORTRAN? (1) More students now know C (2) Some people tell me C is better What if I already know C? (1) You may not need to take most of this course (2) But if you do, you can help other students (and me...) What C textbook would you recommend? I have only looked at a few books. Kernighan and Ritchie (KR) is a standard of sort and is quite readable. But the programming examples are computer-science oriented and not typical of scientific applications. I also bought Brooks (C Programming: The Essentials for Engineers and Scientists); this would be a good choice if you have little programming experience. The examples are more typical of geophysical programs than KR. Although I will try to make these notes self-contained and provide enough information to do the homework assignments, I advise you to buy or borrow a C book to use as a reference to fill in the many gaps. To standardize things, I am going to assume that everyone has access to a Sun workstation and the Sun C compiler. I do not want to have to deal with PCs, CodeWarrier, and pull-down menu stuff. Let's keep it simple, OK? --------------------------- HELLO WORLD ------------------------------------- The best way to learn a computer language is to start writing short programs that work and then gradually add complexity. The traditional first C program prints out "hello, world" and looks something like this: (hello.c) #include int main() { printf("hello, world\n"); return 0; } Let's examine this line by line: #include By itself, C is a quite low-level language that cannot do many of the things that one expects a language to be able to do. To expand the capabilities of the language many standard libraries are available in C installations but they must first be loaded. This is done at the beginning of the program. The library facilitates input and output and is the most commonly used library. As we will see later, the library is necessary for doing all but the most elementary mathematics. EVERY C PROGRAM THAT WE WRITE IN THIS CLASS WILL NEED Note that this line does NOT end in a semi-colon. int main() Every program in C is actually a series of functions. Every program must have one function called "main" that tells the compiler where to start executing. The "int" defines main as an integer function that will return an integer value. The "()" provides space for the function arguments; in this case there are none. Finally, the contents of the function are included between the curly brackets, { and }. For clarity, it is conventional to indent the contents of the function. printf("hello, world\n"); This statement sends "hello, world\n" to the screen. The quotes serve to define a character string. \n is C notation for a newline character so that the cursor advances to the next line. If you leave out \n then the cursor will remain at the end of the line. Thus, we could have written printf("hello, "); printf("world\n"); and we would get the same result. Forgetting the "\n" is a common mistake for beginning programmers (at least it was for me). Finally, you MUST end the statement with a semi-colon. This may seem unnatural if you are used to FORTRAN. A new line within a C function does not mean anything. For example, the program would run just fine as: #include int main() { printf ("hello, world\n"); return 0; } Similarly we can combine two statements on one line if they are separated by a semi-colon. The following will also work: #include int main(){printf("hello, world\n"); return 0;} There are differences in programming style. Some people seem to think that it is more efficient to put multiple statements on the same line. Perhaps they want to be able to brag that their program "only took five lines of code." This makes no sense to me. Their programs don't run any faster; they just are harder to read. Thus, for your class assignments, DO NOT PUT MORE THAN ONE STATEMENT ON EACH LINE. return 0; Functions normally return a value to the main program. In this case. main is an integer function so we return an integer value. We do this simply for the sake of completeness here because this value is never used. This example shows one way to write this program. KR uses a slightly different style: #include main() { printf("hello, world\n"); } This will run just fine. It is not required that the main function be defined as an integer or that a value be returned. Another convention is used by Numerical Recipies. i.e., #include int main(void) { printf("hello, world\n"); return 0; } This matches my convention except that "void" is used to explicitly specify that main has no arguments. ---------------- COMPILING AND RUNNING C PROGRAMS ---------------------- The C source code is just an ascii file (use your favorite text editor); the file name should end in .c, thus hello.c is the name of the program discussed above. To run the program, you first must create an executable version of the code by running the C compiler. On a UNIX system, this is done as follows: cc hello.c -o hello or make hello This should create the executable file "hello". However, if the compiler finds syntax errors in your code, you will get a list of error messages instead. To run the program, just enter the program name: rock% hello hello, world rock% If for some reason, you type "hello" and get a message something like "hello: Command not found" then you probably don't have "set path" set up correctly in your .cshrc file (see UNIX notes). ASSIGNMENT C1 Write a C program to print your favorite pithy phrase. ------------------------ HOW TO MULTIPLY TWO INTEGERS ------------------ Here is a simple C program to multiply 2 and 3: (multint.c) /* test program to multiply two integers */ #include int main() { int a, b, c; /* declare variables */ a = 2; b = 3; c = a * b; printf("Product = %d \n",c); return 0; } Notice that we have included some comment lines in this example. Comments in C are bracketed by /* and */ and can appear anywhere in the code. A common syntax error is to have a mismatch somewhere in these comment endpoints. After compiling, the result of running this program is: rock% multint Product = 6 rock% This program uses three variables: a, b, and c. All variables in C must be declared at the beginning of the function in which they are used. The "int" statement declares them as integers. For the Sun C compiler, these are 4-byte integers that can range from -2147483648 to 2147483647. (Alternatively, these variables could have been defined using float a, b, c; in which case they would be floating point (real) variables. For the Sun compiler, floats range from 1.175494e-38 to 3.402823e+38) The values of a and b are set on separate lines. Sometimes you will see this done in the variable definition, e.g., int a=2, b=3, c; This is permitted but, as a FORTRAN programmer, I'm not used to this and usually prefer to assign the variables on separate lines. c = a * b; This sets c to the product of a and b. As in FORTRAN, * indicates multiplication. Other arithmetic operations are as you might expect (+ for add, - for subtract, / for divide, parentheses can be used to define the order of operation). printf("Product = %d \n",c); This prints the string "Product = " followed by the value of c and a newline character. The "%d" is not printed; instead it defines the format in which c is printed. Some examples of different formats are: %d print as integer %8d print as integer at least 8 characters wide (including blanks) (useful to make output line up nicely in columns) %f print as floating point (real) number %7f print floating point as least 7 characters wide (including decimal point) %7.2f print as floating point at least 7 wide, with 2 after decimal point %c print single character %s print character string variable %i is sometimes used as an integer format but I don't recommend it because it causes input with leading zeros to be interpreted as octal! ASSIGNMENT C2 Write a program called multfloat.c that multiplies 2.88 and 3.14 and prints the output with two digits to the right of the decimal place. --------------- MAKING A TRIG TABLE USING A FOR STATEMENT -------------- Any reasonable programming language must provide a way to loop over a series of values for a variable. In FORTRAN this is done with the "do" statement. In C, this is most naturally implemented with the "for" statement. Here is an example program that generates a table of trig functions: (trigtable.c) #include #include int main() { float theta, stheta, ctheta, ttheta, degrad; degrad = 180./3.1415927; for (theta=0.0; theta<89.5; theta=theta+1.0){ ctheta = cos(theta/degrad); stheta = sin(theta/degrad); ttheta = tan(theta/degrad); printf("%5.1f %6.4f %6.4f %7.4f \n",theta,ctheta,stheta,ttheta); } return 0; } First, notice that we have included the math library so that we can access the trig functions. IN ORDER TO ACCESS THE MATH LIBRARY USING THE SUN COMPILER YOU MUST ALSO USE THE -lm COMPILER OPTION, e.g., cc -lm trigtable.c -o trigtable If you just enter "make trigtable" or "cc trigtable.c -o trigtable" you will get error messages for undefined symbols. This drove me nuts for awhile until I asked somebody about it. None of my C books ever mention this. If you would like the convenience of being able to continue saying simply "make trigtable" in order to compile the program, you will need to create a Makefile (see discussion later in notes for details). The C math library uses radians (not degrees) as the arguments of the trig functions. Thus, following the definitions of the variables as floats, we assign degrad to 180/pi so that we can easily make this conversion. for (theta=0.0; theta<89.5; theta=theta+1.0){ This defines the "for" loop which must eventually be closed with a "}" For clarity, we indent the inside of the loop. This is a loop over values of theta from a starting value of 0.0, incremented by 1.0 each time, until theta is not less than 89.5. Thus theta will assume the values (0.0, 1.0, 2.0, ...., 88.0, 89.0) inside the loop. Note that the statements inside "for ( )" must be separated by semi-colons, not commas. Inside the for loop we compute the cosine, sine and tangent of theta, after converting from degrees to radians by dividing by degrad. We output the results to the screen using a fixed floating point format so that the numbers line up nicely in columns. An alternative way to handle this loop is to use a "while" statement: theta = 0.0; while (theta<89.5){ ctheta = cos(theta/degrad); stheta = sin(theta/degrad); ttheta = tan(theta/degrad); printf("%5.1f %6.4f %6.4f %7.4f \n",theta,ctheta,stheta,ttheta); theta=theta+1.0; } This is a little less natural for this example, however. The conversion variable, degrad, has a single fixed value in this program. It also is likely to be useful in other C functions that we may want to add to the program. Thus it is good programming practice to define degrad as a "symbolic constant" that can be used thoughout the program. In our case, the trig table program would look like this: #include #include #define DEGRAD 57.29577951 int main() { float theta, stheta, ctheta, ttheta; for (theta=0.0; theta<89.5; theta=theta+1.0){ ctheta = cos(theta/DEGRAD); stheta = sin(theta/DEGRAD); ttheta = tan(theta/DEGRAD); printf("%5.1f %6.4f %6.4f %7.4f \n",theta,ctheta,stheta,ttheta); } return 0; } By convention, symbolic constant names are written in upper case so that they can be distinguished from variable names (usually written in lower case). Whenever a symbolic constant appears in the code, it is replaced by the text (character string) listed in the #define statement. Student question: Why will the following statement not work in the program? #define DEGRAD 180./3.1415927 Why will this alternative work? #define DEGRAD (180./3.1415927) Why is this not as efficient as DEGRAD 57.29577951 ? ASSIGNMENT C3 Write a C program to print a table of x, sinh(x), and cosh(x) for values of x ranging from 0.0 to 6.0 at increments of 0.5. Use a suitable format to make nicely aligned columns of numbers. Note: You can use the cosh and sinh functions in the math library. x is NOT in degrees. Do not apply the DEGRAD conversion factor. --------------- IF STATEMENTS + HOW TO FIND MAX(A,B) ---------------- Suppose we want to assign z to the maximum of x and y. There is no built-in MAX function in C like there is in FORTRAN or MATLAB so we will have to figure out how to do this ourselves. One approach is to use an "if" statement: z = x; if (y > x) z = y; This is not very aesthetic but it does the job. The statement inside the parenthesis is tested; if true, the following statement (or a series of statements enclosed in braces) is executed. Here is a summary of the common operators used inside the "if" test and their FORTRAN and MATLAB equivalents C FORTRAN MATLAB meaning == .eq. == equals != .ne. ~= does not equal < .lt. < less than > .gt. > greater than <= .le. <= less than or equal to >= .ge. >= greater than or equal to && .and. & and || .or. | or These operators can be combined to make complex tests.e.g, if ( (a > b && c <= 0 ) || d = 0) z == a; There is, of course, an order of operations for these things which I can't remember very well. Look it up in a C book if you are unsure or, better, just put in enough parenthesis to make it completely clear to anyone reading your code. WARNING: If you use = instead of == in a conditional statement you will probably not get an error message (variable assignments are allowed within conditionals) but you will get strange results. These operators are also used in the FOR and WHILE loops that we discussed earlier. For example, we could write: for (i=0; i<=10 && error != -1; i=i+1) { } In this case we loop from i=0 to i=10, but exit the loop immediately if error = -1 (the error variable could be assigned to this in the code inside the loop). Now, let's return to the example for finding the maximum. Our snippet of code is rather ugly and is inefficient in the sense that z is assigned twice if y>x. We can improve things by using an if-else construction: if (y < x) z = x; else z = y; Now z is only assigned once. Although this takes more lines, it is a better way to code this because it is more efficient and it is easier to understand the author's intent. More generally the if-else construction has the form: if (condition1) {statements 1} else if (condition2) {statements 2} ... ... else {statements n} The program checks each condition in turn. As soon as one is true it executes the appropriate statement(s) and then skips to the end of the construction. If no condition is found to be true, then the statements following "else" are executed. If the final else is absent then no statements are excuted. Our use of "if-else" to find the maximum has an alternative in C called the conditional expression. In our case it would look like this: z = (x > y) ? x : y; /* z = max(x, y) /* ? is called a ternary operator. z is assigned to x or y depending upon whether (x > y) is true. More generally, we have the conditional expression: expr1 ? expr2 : expr3 expr1 is evaluated first. If it is true (non-zero) then expr2 is evaluated and becomes the value of the expression, otherwise expr3 is evaluated and becomes the value of the expression. KR has a cute example of this: printf("You have %d items%s.\n", n, n==1 ? "" : "s"); Finally, you may need to compute max(x,y) many times in your program and you'd like the convenience of actually having a max() function like FORTRAN has. You could write a C function (more about this later) but you would have to do this separately for real and integer variables. A more general method is to use #define to create a macro for MAX, i.e., #define MAX(A,B) ( (A) > (B) ? (A) : (B) ) As in the DEGRAD example above, this statement goes at the top of your program following the #include statements. Now in the program, the line z = MAX(x+y, x-y); will be replaced by the line z = ( (x+y) > (x-y) ? (x+y) : (x-y) ) ------------------ INPUT USING KEYBOARD -------------- So far all of our example programs have run without prompting the user for any information. To expand our abilities, let's learn how to use the "scanf" function, which is the input version of printf. In most programs, we will want to first prompt the user to input the data, so here is an example of how to input two numbers: printf("Enter two integers \n"); scanf("%d %d", &a, &b); The formats are identical to those used in printf, in this case %d and %d indicate that integers are to be input. Now for the confusing part! a and b are not input in scanf in the same way that they would be output in printf. They must be input as what are called "pointers" in scanf. We will have more to say about pointers later; for now the only thing to remember is that the variable names in scanf MUST be preceded by & so we write &a and &b, not a and b. If you forget this, the program will likely compile without error but you will probably get a "Segmentation Fault" message right after you input the numbers during run time. (Here I must add an additional warning. Do NOT include any extra spaces at the end of the format string within the scanf call. For example, do not say scanf("%d %d ", &a, &b); /* THIS WILL NOT WORK PROPERLY!!! */ This will not give a compiler error, but the computer will not move to the next line if you hit a return after entering the second number. Instead, it will wait until you enter another number (or a blank space). This will completely foul up your inputs! This unfortunate behavior of scanf is actually related to a more advanced feature of scanf, its ability to search for strings in the input stream that match a desired target string in the format string. This may be useful in some circumstances if particular data values are "flagged" with ascii codes in the input stream. ) Here is an example of a complete program that multiplies two numbers: (usermult.c) #include int main(){ int a, b, c; printf("Enter two integers \n"); scanf("%d %d", &a, &b); c = a * b; printf("Product = %d \n",c); return 0; } Running this program, we have: rock% usermult Enter two integers 3 5 Product = 15 So far, so good. But what if the user is bad at reading directions and enters a real number by mistake? rock% usermult Enter two integers 3.1 15 Product = -12790968 Clearly our program is not very robust w.r.t. this kind of error. Unfortunately, there is no C equivalent to the FORTRAN free-format read statement (i.e., read *, a, b). One solution would be rewrite the program to read, multiply and output real numbers (floats). In this case there is no way to accidentally read an integer (3 gets read in as 3.0). But this program would still be vulnerable to problems if the user were to enter a character (letter) by mistake rather than a number. Suppose we want to keep the construction of the original program. How can we check for problems? The most direct approach takes advantage of the fact that scanf is actually a function that returns the number of input items successfully read. In this case, this should be 2 so a more robust program would be: (usermult2.c) #include int main(){ int a, b, c; printf("Enter two integers \n"); if ( scanf("%d %d", &a, &b) == 2){ c = a * b; printf("Product = %d \n",c); } else printf("***Error in input \n"); return 0; } Notice that the scanf statement now appears within the if statement. This will look strange to FORTRAN programmers, but this type of construction is extremely common in C. Careless users running this program will now obtain: rock% usermult2 Enter two integers 3.1 5 Error in input rock% This example highlights a key goal in good programming--to anticipate and flag potential problems as the code is being written. This may seem like a trivial example, but this kind of input error could occur in the middle of a complicated program and not cause the program to crash, just to get the wrong answer! Spending a little extra time in writing robust code can save hours of debugging later. Finally, let's change the code so that it will allow the user to continue entering numbers until he/she wants to stop: (usermult3.c) #include int main(){ int a, b, c; while (1) { printf("Enter two integers (zeros to stop)\n"); if ( scanf("%d %d", &a, &b) != 2){ printf("***Error in input \n"); fflush(stdin); } else if (a == 0 && b == 0) break; else { c = a * b; printf("Product = %d \n",c); } } return 0; } There are some new things in this example: (1) while(1) is used for a while loop that will continue forever. Previously we tested a condition to see if it continued to be true. In C, true conditionals are assigned the value 1; thus 1 is always true and the while loop will continue forever. (2) We use the "break" statement to manually exit the while loop in the case where both a and b are zero. "break" is an extremely useful command and can also be used to exit "for" loops. (3) We use "fflush(stdin)" to clear the keyboard buffer following the input error so that we can safely try inputting again. It is often a good idea to precede any scanf statement following the first with fflush(stdin). ASSIGNMENT C4 Write a program to repeatedly prompt the user for a positive real number. If it is negative, ask the user to try again. If it is positive, compute and display the square root using the sqrt() function in (remember to use the -lm option with the Sun compiler. Stop the program if the user enters a zero. Make the program robust with respect to input errors. ------------------------- i++ and i+= SYNTAX ------------------------ So far we used the following syntax when we have wanted to increase a variable by one: i = i + 1; This is normal usage in BASIC or FORTRAN. C, however, allows a more compact expression using the increment operator ++ i++; which has exactly the same meaning. In addition, the two FORTRAN statements: j = i i = i + 1 can now be expressed as: j = i++; Note that one is added to i AFTER j is set to i. We can also write: j = ++i; in which case the one is added BEFORE setting j to i. In other words, this last example in FORTRAN is: i = i + 1 j = i There are corresponding C statements using the decrement operator (--), "i--" and "--i", that work in the obvious way. A related feature of C is to be able to express i = i + 2 as: i += 2; In this case += is called an assignment operator. There are related expressions using -, * and /, i.e., n -= 10; means n = n - 10 x *= 2.0; means x = 2.0 * x zeta /= 8.3; means zeta = zeta/8.3 In all cases of ++, *=, etc., the arguments can be integers or floats. Some C books assert that using the increment or assignment operators (as opposed to their more verbose equivalents) can sometimes make your code run faster. I tested this idea on a few examples, but always found that there was no difference in speed, implying that the Sun compiler is smart enough to optimize things correctly in either case. Thus, if you are more comfortable using the longer versions of these statements, I see no compelling reason to stop (and your code will be more readable to novices in C!). However, you should still be aware of what these statements mean so that you can understand existing C code that you may encounter. Most C programmers will use ++, *=, etc., in order to generate more compact code. --------------------- GREATEST COMMON FACTOR EXAMPLE ------------------ Here is an example program that uses some of the concepts that we have just learned: (gcf.c) /* computes the greatest common factor of two integers */ #include #define MIN(A,B) ( (A) < (B) ? (A) : (B) ) int main() { int a, b, i, imax; while (1) { printf("Enter two integers (zeros to stop)\n"); if ( scanf("%d %d", &a, &b) != 2){ printf("***Error in input \n"); fflush(stdin); } else if (a == 0 && b == 0) break; else { for (i=1; i<=MIN(a,b); i++){ if (a % i == 0 && b % i == 0) imax=i; } printf("Greatest common factor = %d \n",imax); } } return 0; } Remember that % is the modulus operator that returns the remainder of a % i. This is not a particularly efficient algorithm, but it runs plenty fast for small numbers. ASSIGNMENT C5 Modify gfc.c to compute the least common multiple of two integers. -------------------------- USER DEFINED FUNCTIONS ------------------------ As the length and complexity of a computer program grows, it is a good strategy to break the problem down into smaller pieces by defining functions to perform smaller tasks. This provides several advantages: (1) You can test these functions individually to see if they work before trying to get the complete program to work. (2) Your code is more modular and easier to understand. (3) It is easier to use parts of the program in a different program. To illustrate how to define your own function, here again is the greatest common factor program: (gcf2.c) #include #define MIN(A,B) ( (A) < (B) ? (A) : (B) ) int getgcf(int a, int b); /* computes greatest common factor */ int main() { int a, b; while (1) { printf("Enter two integers (zeros to stop)\n"); if ( scanf("%d %d", &a, &b) != 2){ printf("***Error in input \n"); fflush(stdin); } else if (a == 0 && b == 0) break; else { printf("Greatest common factor = %d \n",getgcf(a,b)); } } return 0; } int getgcf(int a, int b) /* MIN must be available as global macro */ { int i, imax; for (i=1; i<=MIN(a,b); i++){ if (a % i == 0 && b % i == 0) imax=i; } return imax; } getgcf is a user-defined function that computes the greatest common factor of two numbers. A program can contain any number of user-defined functions and these functions can call each other as well as be called from the main program. The line immediately before the main program int getgcf(int a, int b); is called a "function prototype" and alerts the compiler to the existence of the function and its arguments. The use of function prototypes was introduced into C to improve its ability to detect errors. Older C programs do not always include these statements. If function prototypes are not used, however, then the functions must be included in the program BEFORE they are called by any other functions. In this case, the program source code will have the awkward structure of having the "main" part of the program come last! Thus, function prototypes should be used to improve the clarity of your code and so that your programs will conform to the newer ANSI C standard. The "int" indicates that the function will return an integer value. The argument list follows inside the parenthesis and indicates what type of numbers are to be passed to the function. The actual variable names in the function prototype (a and b in this example) are optional and are not actually used by the program. They are chosen at the convenience of the programmer and often simply match the variable names in the function or in the calling program. The actual function is normally included after the main program, although the code would still work if the main program came last. In the argument list, a and b are the names of the variables that are used in the function; they must agree in number and type (int, float, etc.) with the variables in the calling program, but they can have different names. The values of a and b in this example are passed to the function, but they are not passed back to the main program. Only the returned value of the function (imax) in this case goes back to the calling program. This is a big difference between C and FORTRAN. In a FORTRAN subroutine, the variables in the argument list are passed back to the main program. In a C function, the variable in the function can be changed without these changes affecting the values of the variables in the calling program. We will have more to say about this difference when we talk about pointers. ASSIGNMENT C6 Modify your program from C5 to compute the least-common multiple as a user-defined function. ASSIGNMENT C7 Write a function that computes the volume of a sphere, given its radius, together with a main program that inputs different values and outputs the result. -------------------------------- ARRAYS -------------------------------- Arrays are indicated by brackets following the variable name and are defined, for example, as follows: float x[100]; x can now assume 100 different values, from x[0] to x[99]. Yes, you read correctly. C assumes that the array indices start with zero! This is a BIG difference from FORTRAN and can be very confusing. In this example, x[100] is out of bounds! Thus, if you want to be able to refer to x[1] through x[100], you should define x as: float x[101]; In this case, x[0] is wasted. If this bothers you, we will show a trick later that uses pointers to avoid this slight inefficiency. Unfortunately, there is no equivalent in C to the FORTRAN usage real x(-100:100),y(0:5) that explicitly allows you to choose the index range of each variable. It is possible to write C functions, however, that provide this kind of control on array indices, and Numerical Recipes includes these in its nrutil.c package. Another big difference from FORTRAN is that brackets are used for array indices, not parenthesis. You will get an error if you write x(15) when you meant x[15]. Finally, higher dimensional arrays are indicated with multiple sets of brackets, not parentheses within a single set of brackets. For example, a 2-dimensional array element x(2,3) in FORTRAN is written in C as: x[2][3] We will have more to say about high-dimensional arrays later. Here is an example program that uses an array to compute prime numbers less than 100: (prime.c) #include #define MAXNUM 1000 int main() { int i, j, prod[MAXNUM+1], nprime=0; for (i=1; i <= MAXNUM; i++) prod[i] = 0; /* set prod array to zero */ for (i=2; i*i <= MAXNUM; i++){ if (prod[i] == 0) { for (j=2; i*j <= MAXNUM; j++) prod[i*j] = 1; } } for (i=2; i <= MAXNUM; i++){ if (prod[i] == 0) { nprime++; printf("%3d ", i); if (nprime % 10 == 0) printf("\n"); } } printf("\n Number of primes found = %d \n",nprime); return 0; } The method is sometimes called the sieve of Eratosthenes, named after a Greek mathematician from the 3rd century BC. We start with a list of numbers from 2 to 1000 and consider them all possible primes. In the program, this list is the array prod. We dimension the array to 1000 + 1 = 1001 so that the final array element is prod[1000]. (recall that arrays start with zero) We initialize the array by setting its values to zero. The strategy is to then flag numbers that are not prime by setting the corresponding array values to one. We then start with the number 2 and eliminate all multiples of 2 up to the maximum value of 1000. We then move to 3 and eliminate (i.e., set prod to 1) all multiples of 3. We can skip 4 because 4 and all its multiples were already eliminated. We need check numbers, i, only up to i*i<=1000 because larger factors would already have been eliminated. When we are finished, we simply print out all numbers that were not eliminated. In our case, this is all i such that prod[i] = 0. We count the number of primes using the counter variable nprime. Note that we output a linefeed after every ten primes so that the output lines do not get too long. ASSIGNMENT C8 The Sun C compiler includes a crude random number generator. Here is an example program that prints out 20 random values from between 0 and 1. (listrand.c) #include #include /* contains srand and rand functions */ int main() { int i; float xran; srand(5); /* initialize generator */ printf("RAND_MAX = %d \n",RAND_MAX); for (i=1; i<=20; i++) { xran = (float)rand()/(float)(RAND_MAX+1); printf("%f \n",xran); } return 0; } rand() returns an integer between 1 and RAND_MAX. Notice how integers can be converted to floating point numbers by putting (float) in front. In a similar fashion you can explicitly convert floats to integers by putting (int) in front (in this case, everything to the right of the decimal point is truncated). Because RAND_MAX for the Sun C compiler is only 32767, there are at most 32767 "random" values returned by this generator. Thus, this random number generater returns only two-byte integers, and, according to Numerical Recipes should be "categorically rejected" for this reason (NR includes code for better random number generators in case you are interested). Write a simple C program to generate 10000 random numbers between 0 and 1. Test the randomness by counting the number of these that fall in each of 10 evenly spaced bins between 0 and 1 (i.e., 0 to 0.1, 0.1 to 0.2, etc.). Print these counts to the screen. Here is an example of what your output might look like: 0 1009 1 1048 2 1001 3 1038 4 1008 5 959 6 993 7 925 8 1017 9 1002 HINT: Set up an integer array with 10 elements and initialize all the values to zero. For each random number generated, then add one to the appropriate array element to keep track of how many of the numbers are in that bin. --------------------- ARRAYS AS FUNCTION ARGUMENTS ------------- Suppose we have defined an array x as: int x[100]; If we use x in the argument list for a function, e.g., y = sumtot(x); we include x without any brackets. It turns out that this means that x is passed by a pointer to its first element rather than by value (more about pointers later). This means that if values of x are changed in a subroutine, these passed values will be returned to the main program (unlike the case for non-array values). For array arguments, C functions act just like FORTRAN subroutines. It is for non-array arguments that the behavior of C is different. The important thing to remember is that in C array variables and non-array variables are treated differently in function calls. ---------------- CHARACTERS AND CHARACTER ARRAYS ------------ A single character variable can be defined, specified, and printed like this: char grade; grade = 'A'; printf("Your grade in this class is: %c \n",grade); Notice that the single quotes (apostrophes) are used to write 'A' because double quotes (") are used to write strings, not a single character. Strings of multiple characters are handled as character arrays. Here is an example: char name[]="Bill Clinton"; printf("Our last president was %s \n",name); Notice that the assignment can be made in the type statement. The length of name consists of the 12 characters in "Bill Clinton" plus a terminating null character '\0' that all string constants must end with. This null character is added automatically in the assignment statement. Unfortunately, you cannot assign string constants like this name = "Bill Clinton"; /* THIS WILL NOT WORK!!! */ You could laboriously assign the elements of name manually (name[0]='B'; name[1]='i'; .... name[12]='\0') but a better approach is to use the strcpy (string copy) function in the library as in this example: (president.c) #include #include int main() { char name[81]; strcpy(name,"Bill Clinton"); printf("Our last president was %s \n",name); strcpy(name,"George Bush"); printf("Our current president is %s \n",name); return 0; } Be sure to define the size of the name array to be larger than any possible string that you might assign to name. In this example, name will handle strings of up to 80 characters. Otherwise you risk overwriting variables in the memory locations beyond the end of the array. Here is a list of some of the string functions. (see also p. 262 of Brooks). These assume that all input and output strings are null terminated. strcpy(s1,s2) copies s2 to s1, returns s1 strncpy(s1,s2,n) copies n characters of s2 to s1 strcat(s1,s2) appends s2 to end of s1, returns modified s1 strncat(s1,s2) appends at most n characters of s2 to s1 strlen(s1) returns length of s1, not counting null term. char. strcmp(s1,s2) compares s1 to s2, returns zero if s1 and s2 are identical negative value if s1s1 strncmp(s1,s2,n) like strcmp but compares at most n values of s1 and s2 strstr(s1,s2) returns a pointer to the first occurrence of string s2 within string s1, returns NULL otherwise Here is an example program that uses the strstr function and also demonstrates how to read strings from the keyboard: (vote.c) #include #include int main() { char name[81]; printf("Who did you vote for? \n"); fflush(stdin); fgets(name,81,stdin); printf("Then you are likely a"); if (strstr(name,"ill") != NULL || strstr(name,"lin") != NULL) printf(" Democrat \n"); else if (strstr(name,"eor") != NULL || strstr(name,"ush") != NULL) printf(" Republican \n"); else printf("n independent voter \n"); return 0; } We have chosen to use the library function "fgets" to read from the keyboard (stdin in this case). 81 gives the maximum number of characters that can be input. Sometimes C programs use the "gets" function to read strings from the keyboard but this is not recommended because "gets" does not set a maximum length and thus there is a risk of entering a string that is too long which will overwrite memory off the end of the allocated space for the string. fgets does have the awkward feature that it will include the linefeed character '\n' as the 2nd to last character in the string (before '\0' at the end). The function "scanf" can also be used to read a string from the keyboard as in: scanf("%s",name); The problem with this in our example is that any blank character will be interpreted as the end of the input. Thus, for example, if the user tried to input George Bush using scanf, the program would assign "George" to name and leave "Bush" in the input buffer. This could screw up the next input unless "fflush(stdin)" is called to clear the input buffer. ASSIGNMENT C9 Write a simple version of the famous doctor program that simulates a psychologist talking to a patient. A humorous fictional account of such a program is contained in "Small World," the David Lodge satire of academia. A current example of this type of program can be found at alicebot.org. Begin by asking the patient: "How are you?" Then use the fgets function to input a line from the user/patient. Search this line for various key words that you can then use to guide the computer's response. Design the program so that it will continue this "conversation" between doctor and patient. Some suggestions: If the input string contains a "?", then respond: "I'd prefer to ask you the questions. Why is it important to you?" If the string contains "!", then respond: "You seem pretty upset. What is really bothering you?" If the string contains "mother" then respond: "Tell me more about your mother." If the string contains any swear words, respond: "There is no need for that kind of language." If you identify no key words on your list, then respond with something generic like: "Go on." or "Tell me more about it." Be creative but don't spend an infinite amount of time on this assignment. Your program will be more realistic if you use random numbers (see above) to randomly select from a series of possible responses so that the computer does not keep saying the same thing. ------------------------- I/O WITH FILES -------------------------- So far, all off our examples have involved input from the keyboard and output to the screen. Now let us see how to open files, read from them, and write data to them. Here is a program that reads pairs of numbers from an input file, computes their product, and outputs the original numbers and their product to an output file. (fileinout.c) #include main() { FILE *fp1, *fp2; char infile[100], outfile[100]; float x, y, z; printf("Enter input file name \n"); scanf("%s",infile); fp1 = fopen(infile,"r"); if (fp1 == NULL) { printf("***Can't open file: %s\n", infile); return 1; } printf("Enter output file name \n"); fflush(stdin); scanf("%s",outfile); fp2 = fopen(outfile,"w"); if (fp2 == NULL) { printf("***Can't open file: %s\n", outfile); return 1; } while (1){ if (fscanf(fp1, "%f %f", &x, &y) == EOF) break; z = x * y; fprintf(fp2,"%8.3f %8.3f %8.3f \n", x, y, z); } fclose(fp1); fclose(fp2); return 0; } To open or close a file, we need to define pointers to the files. The two pointers that we need in this program are defined in: FILE *fp1, *fp2; We define strings infile and outfile to contain the names of the input and output files, respectively. We input the file names using the "scanf" function. Because the file names will normally not contain blanks, scanf works well for this purpose. However, we make sure to clear the input buffer using fflush(stdin) before attempting to read the second file name. Incidentally, if "fgets" were used here instead of "scanf" then the program would not work because the file name would end in a newline character and the program would not be able to find the file. This is an annoying "feature" of fgets. Strangely, the "gets" function does not put '\n' at the end of its input, but, as mentioned above, we are supposed to avoid use of "gets" these days. We open the first file with: fp1 = fopen(infile,"r"); The 2nd argument is called the file mode and can assume the following: "r" open an existing file for input "w" create a new file or position old file for overwriting "a" create a new file or position old file for appending These all assume that the files are in ascii (text). For binary files, use "rb", "rw" and "ra". If the program is not able to open the file it will set the file pointer to NULL. The program tests for this and prints out an appropriate message. To read data from a file, we use the "fscanf" command, which works in a similar way as "scanf" for keyboard input. Notice that the input variables, x and y, must be input as pointers (&x and &y). When the end of the file is reached, fscanf returns an EOF. The program tests for this in order to provide an exit from an otherwise infinite while loop. Finally, the program closes the two open files using the fclose statement. These closing statements could be deleted in the case of this program because open files are closed automatically upon normal termination of a program. However, it is good practice to close files whenever their I/O is finished. ASSIGNMENT C10 Write a program that reads from a text file containing 5 numbers per line. Compute the mean of the 5 numbers and then subtract the mean from each of the 5 original numbers. Output the 5 "demeaned" numbers to an output file. Allow the user to specify the input and output file names. For example, if the input file contains: 10 20 30 40 50 1 2 3 4 5 2 4 6 8 10 5 5 5 5 5 your program should write to the output file: -20.000 -10.000 0.000 10.000 20.000 -2.000 -1.000 0.000 1.000 2.000 -4.000 -2.000 0.000 2.000 4.000 0.000 0.000 0.000 0.000 0.000 Hint: Use the "fileinout.c" program listed above as a template for the file handling part of your code. --------------------- TWO-DIMENSIONAL ARRAYS -------------------- Two-dimensional arrays in C may be defined as follows: int a[4][3]; /* DO NOT WRITE a[4,3]!! */ This defines a matrix in which the first indices will vary from 0 to 3, the second from 0 to 2 (remember that all C array indices start with zero). Arrays can be initialized when they are defined, i.e., we could write: int a[4][3] = { {5, 3, 4}, {9, 1, 1}, {8, 7, 2}, {1, 1, 3} }; Note that the elements are stored such that one sweeps over the second index first. This is easy to remember is one thinks of a[i][j] as really representing (a[i])[j]. In fact, the internal brackets are not needed and the initialization could have been written as: int a[4][3] = {5, 3, 4, 9, 1, 1, 8, 7, 2, 1, 1, 3}; This same convention can, of course, be used to define vectors as well: int b[3] = {8, 6, 8}; Arrays dimensions can also be automatically set by the size of the initialization array, i.e., int b[] = {8, 6, 8}; will also work, as will: int a[][3] = { {5, 3, 4}, {9, 1, 1}, {8, 7, 2}, {1, 1, 3} } Notice in the two-dimensional case that the 2nd array dimensional is required. You cannot initialize the array as: int a[][] = ....... /* DOES NOT WORK */ Finally, if elements are left out of the array initialization, they will be set to zero. int a[2][3] = { {5, 3, 2}, {8, 0, 0} }; could have been written as: int a[2][3] = {5, 3, 2, 8}; Here is an example program that multiplies two 3x3 matrices: (matmult.c) #include #define DIM 3 int printmat(float a[DIM][DIM] ); int main() { int i, j, k; float a[DIM][DIM] = { {-5.1, 3.8, 4.2}, { 9.7, 1.3, -1.3}, {-8.0, -7.3, 2.2} }; float b[DIM][DIM] = { { 9.4, -6.2, 0.5}, {-5.1, 3.3, -2.2}, {-1.1, -1.8, 3.0} }; float c[DIM][DIM]; for (i=0; i . return 0; } Remember that in C, changes in the values of arrays listed in the function argument will be passed back to the main program. This is NOT true for ordinary variables. ---------------------- ARRAYS OF STRINGS ----------------------- As we have discussed, a string is just an array of characters. Thus, a 1-D array of strings will be a 2-D character array. Here is an example of this: (stringarray.c) #include #include int main() { char name[10][81]; strcpy(name[0],"Bill Clinton"); printf("Our last president was %s \n",name[0]); strcpy(name[1],"George Bush"); printf("Our current president is %s \n",name[1]); return 0; } This is a modified version of the president.c program. In this case, char name[10][81]; defines name to be an array of ten strings of 80 characters each (plus the null character \0). Note that the character length of the string must be the LAST array index. As shown in the example, we can now refer to the strings name[0], name[1], etc. --------- A MORE COMPLICATED EXAMPLE OF PROCESSING A DATA FILE ---------- Suppose we have a data file that contains a series of measurements that has the following form: (datafile1) 980204 22:03:34.42 10.8 10.2 9.8 11.6 980205 08:45:22.20 5.5 7.2 6.6 980205 20:13:42.88 15.0 13.8 12.9 14.2 etc. The structure is that there are date/time identifiers and then a series of measurements made at that time. The problem in reading this file is that we don't know how many measurement lines will follow each date/time line so we can't simply perform a formatted read on each line from the file. Let us assume that we want to compute the mean of these measurements and write a new file which stores this information on the same line as the date.time. In this case, our desired output file will look like: (datafile2) 980204 22:03:34.42 10.60 4 980205 08:45:22.20 6.43 3 980205 20:13:42.88 13.97 4 where the final column gives the number of measurements. Here is how such a program could be written: (procfile.c) #include #include main() { FILE *fp1, *fp2; char infile[100], outfile[100]; char linebuf[20], event[20]; int nevent, ndata; float x, sum, xavg; printf("Enter input file name \n"); scanf("%s",infile); fp1 = fopen(infile,"r"); if (fp1 == NULL) { printf("***Can't open file: %s\n", infile); return 1; } printf("Enter output file name \n"); fflush(stdin); scanf("%s",outfile); fp2 = fopen(outfile,"w"); if (fp2 == NULL) { printf("***Can't open file: %s\n", outfile); return 1; } nevent = 0; ndata = 0; sum = 0.0; while (1){ if (fgets(linebuf,20,fp1) == NULL) break; if (strncmp(linebuf," ",1) != 0){ /* no init blank, must be event */ nevent++; if (nevent !=1) { /* need to output previous event info */ if (ndata != 0) xavg = sum / (float)ndata; else xavg = 0.0; fprintf(fp2,"%s %7.2f %5d\n", event, xavg, ndata); ndata = 0; sum = 0.0; } strncpy(event,linebuf,18); } else { /* has starting blank, must be measurement */ sscanf(linebuf,"%f",&x); sum += x; ndata++; } } if (ndata != 0) xavg = sum / (float)ndata; else xavg = 0.0; fprintf(fp2,"%s %7.2f %5d \n", event, xavg, ndata); fclose(fp1); fclose(fp2); return 0; } The beginning of the program where the files are opened is the same as in fileinout.c (our earlier example of file I/O). We use the "fgets" function to read one line at a time into the string variable linebuf. Next, we use the strncmp function to check if the first character in linebuf is blank (" "). If it is not blank, then we know we have a data/time line. In our program, we term this an "event" and increment an event counter at this point. If we are not on the first event (normally the first line of the input file), then we must at this point compute the mean of the previous set of measurements and output the previous data/time line followed by this mean value. Continuing, we set the event string variable to linebuf so that we can keep this information while we overwrite linebuf later in the program. Note that we only copy 18 characters of linebuf to event. This is because fgets has the unfortunate property of including the linefeed character in the string that it inputs. We can exclude the linefeed from event by only copying the first 18 characters. If, on the other hand, the line read into linebuf does contain a leading blank, then we know we have a measurement line. In this case we need to read a number from the line. This is done using an "internal read" and the sscanf function: sscanf(linebuf,"%f",&x); sscanf works like fscanf except that a string variable takes the place of the file pointer. We read the measurement value into x and suitably increment a running sum of x and the number of measurements. When we reach the end of the input file, we have to output the results of the final set of measurements. Thus, the if (ndata != 0) xavg = sum / (float)ndata; else xavg = 0.0; fprintf(fp2,"%s %7.2f %5d \n", event, xavg, ndata); part of the code appears in two places, inside the loop as we output information from the last event before going on to the next one, and outside the loop where we output the information from the last event. Note that we must check to see if there are no measurements for an event line, in which case we do not divide by zero but simply set xavg to zero. There should be no possibility of mistaking this for a real measurement in the output file because the number of measurements in this case will also be zero. ASSIGNMENT C12 Obtain the data file, scsn.phase.dat, from mahi.ucsd.edu/pub/239. This is a list of earthquakes and arrival time data from the Southern California Seismic Network (SCSN) for the first two days of August 2000. (Files like this are readily obtainable from the SCEC Data Center at http://www.scecdc.scec.org if you ever want to obtain more of these data. The format is called the SCEC_DC phase format.) Here is what part of the file looks like: 2000/08/01 14:44:12.6 L 1.1 c 35.962 -117.693 5.2 A 9159024 8 51 0 0 WRC VHZ 911 P IU1 4.41 1.26 WCS VHZ 885 P E 3 9.98 1.85 WVP VHZ 925 P IU1 11.54 2.25 TOW VHZ 813 P E 2 18.28 3.78 CLC VHZ 202 P E 2 18.49 3.37 WMF VHZ 906 P E 3 22.86 3.94 WMF VHZ 906 S E 2 22.86 7.08 WNM VHZ 908 P E 3 23.74 4.23 2000/08/01 14:45:36.8 L 1.3 h 34.806 -116.271 7.3 A 9159025 7 79 0 0 CDY VHZ 184 P IU0 6.66 1.73 RAG VHZ 1034 P IU1 17.58 3.19 RAG VHZ 1034 S E 2 17.58 5.77 RMM VHZ 647 P ID1 37.23 6.33 TPM VHZ 1033 P IU1 52.29 8.88 GRP VHZ 355 P IU1 61.21 10.14 0299 C 65535 P IU1 11395.79 7.39 2000/08/01 15:04:17.9 L 1.4 h 34.810 -116.268 7.1 A 9159030 19 83 0 0 CDY VHZ 184 P IU0 6.76 1.73 RAG VHZ 1034 P IU1 17.76 3.16 etc. There is a line for each earthquake with the date and time (UT not local time). For the first line in the above example, the additional bits include: L = local event 1.1 = magnitude c = how magnitude was computed (coda in this case) 35.962 = lat -117.693 = lon 5.2 = depth (km) A = SCSN assigned quality (A=best) 9159024 = cusp id number 8 = number of lines of phase arrival time info The station info lines include: WRC = station name VHZ = station channel 911 = ? P = phase name (normally P or S) IU1 = phase pick info 4.41 = station-event distance (km) (=X for assignment) 1.26 = travel time (s from event origin time) (=T for assignment) Your task is to write a C program to read this file and write to an output file a series of (X,T) points with (one X-T pair per line) for all P arrivals for all events between 3 and 8 km depth. Then make an X-Y plot of the T-X points using PLOTXY. Finally, write a UNIX script that will run the program, run PLOTXY and display the plot on the screen, all by simply typing the name of the script. E-mail me a copy of the C program source code and the UNIX script. The good news about this data format is that it does include a number in the event line that lists the number of phase lines that will follow. The bad news is that sometimes the last 1 or 2 phase lines are "garbage" in that they have numbers instead of station names and are some kind of calibration info. You will have to figure out how to recognize and discard these lines in your program. Also the phase lines begin with a tab character rather than a long series of blanks; this may complicate things somewhat. Hint: Don't try to write your entire C program all at once. First, get a version working that simply reads the input file and prints out the parts of each line that you will need (i.e., the depth and number of phase lines for the event line, the phase name and the X and P values for the phase lines). Once you have this working correctly, then go on to add the part to output the (X,T) points. ------------------------- POINTERS ------------------------- OK, we can't put it off any longer. Let's talk about pointers. Any variable defined in a program is stored in a particular place in the computer memory. Its "pointer" is the address of that place in memory. Its value is the number that is stored in that location. Thus, all computer programs must use pointers, but some, like FORTRAN, take care of pointers internally so that the programmer does not generally need to be concerned with them. However, C is much more pointer oriented and knowledge of how pointers work is necessary for many C programs. The pointer of a variable in C can be referenced by adding a prefix of "&" so, for example, &x refers to the pointer of the variable named x. In the program x = value of x &x = pointer (memory address) for x You can also declare pointers and give them regular names (i.e., without the &). Pointers must refer to a specific type of variable (i.e., int, float, char, etc.) and this must be defined when the pointer is declared. Let us illustrate this be defining a pointer to an integer; we will call the pointer xpt int *xpt If we have a name defined as a pointer, we can refer to the value in the memory location by adding a prefix of "*" to the name. * is called the "dereferencing" or "indirection" operator. Within the program, we then can thus refer to xpt = pointer (memory location) *xpt = value at pointer position in memory So now let's illustrate how all this could work for a simple program: (testpoint.c) #include int main() { int x, y; int *pt; x = 1; pt = &x; /* pt1 now points to x */ y = *pt; /* y is now set to 1 */ *pt = 2; /* x is now set to 2 */ printf("%d %d %d \n",x,y,*pt); return 0; } This will print out 2 1 2 as the comment lines predict. We can use pointers if we want to write a function that changes the values of the input variables. Recall that C normally passes values to functions by value and returns only the value of the function itself. Here is an example of how a function can be written to compute both the area and circumference of a circle: (circle.c) #include #define PI 3.1415927 int areacircum(float r, float *area, float *circum); int main() { float r, area, circum; r = 1.33; areacircum(r, &area, &circum); printf("r, area, circum = %f %f %f \n", r, area, circum); return 0; } int areacircum(float r, float *area, float *circum) { *area = PI * r*r; *circum = 2. * PI * r; return 0; } The variable r is passed in the normal way to the function. However, area and circum are passed by their pointers (their memory locations). When their values are changed within the function, then these values are passed back to the main program. This is very useful because often you will want to write functions that return a number of different values, not just a single one. ---------------------- POINTERS AND ARRAYS -------------------- Pointers and arrays are closely linked. The name of an array is actually a pointer to the first array element. For example, if we define array x int x[5]; then plain x (no subscript) is a pointer to the value at x[0] (the first element in the array). If we add 1 to the pointer, then it will point to the next array element. Here is a short program to illustrate this: (testarraypt.c) #include int main() { int z[5] = {11, 12, 13, 14, 15}; int *pt; pt = z; printf("%d %d %d \n", *pt, *(pt+1), *(pt+2) ); printf("%d %d %d \n", *z, *(z+1), *(z+2) ); return 0; } The result of running this program will be: rock% testarraypt 11 12 13 11 12 13 rock% One might ask the question: How does the program know what the next memory address is? The answer will depend upon the type of variable that the pointer is addressing. If it is a (long) integer array, then it should move 4 btyes. However, if it is a character array, then it should move only 1 btye. Fortunately, the C compiler keeps track of the type of variable and shifts the appropriate amount. However, this makes it clear why pointers must be declared as referring to a specific kind of variable. We can use the ++ and -- operator with pointers as well, for example, pt++; would move the pointer to the next array element. The link between pointers and arrays should now clarify why array values are passed back to the main program when they are changed in a function. It is because arrays are passed to functions as pointers (the address of the first element in the array). A word of caution. The C compiler interprets x[10] as *(x+10) without checking to see if x was dimensioned large enough so that the value in this memory location was indeed assigned to x. There is no easy way to check whether array indices are out of bounds. You must therefore make very sure that you do not exceed the array limits in your program. A common error is to assume that if x is defined as: int x[10]; that x[10] exists! In fact, since x starts with x[0], the last array element is x[9]. --------------- POINTERS VS. ARRAY INDICES ------------------ Some C books assert that one should always perform operations on arrays using pointers rather than array indices because it will run much faster. Indeed this is sometimes claimed to be a big advantage of C. To test this, I compared the speeds of the following two subroutines for summing the elements of a vector: float sumarray1(float a[], int n){ int i; float sum=0.0; for (i=0; i #include "nrutil.c" int main() { float *a, **bb; a = vector(1,7); a[6] = 6.4; bb = matrix(1,13,1,9); bb[5][5] = 3.3; free_vector(a,1,7); free_matrix(bb,1,13,1,9); return 0; } This assumes that the nrutil.c file is within the same directory as the program. You can grab this file from the class web site or via anonymous ftp to mahi.ucsd.edu/pub/239. To use the vector functions, you must define your vector variable as a pointer (i.e., float *a). The nrutil vector functions include: vector(i1,i2) allocate float vector with range [i1....i2] ivector(i1,i2) allocate int vector with range [i1....i2] cvector(i1,i2) allocate char vector with range [i1....i2] In each case, i1 and i2 are long integers (the default for int with the Sun C compiler). To use the matrix functions, you must first define your matrix variable as a pointer to a pointer (i.e., float **b). The nrutil functions include: matrix(i1,i2,j1,j2) allocate float matrix with range [i1,i2][j1,j2] imatrix(i1,i2,j1,j2) allocate int matrix cmatrix(i1,i2,j1,j2) allocate char matrix These functions use the C function "malloc" for allocating memory. If your program will need this memory later, you can free it with the free_vector and free_matrix functions. This is one of the advantages of C over older version of FORTRAN in which array sizes are fixed and must be declared at the beginning of the program. In C, you can allocate and deallocate memory "on the fly." In scientific programming, it is common to have to deal with arrays of variable dimensions and to be able to pass these arrays to subroutines or functions. The nrutil routines provide a convenient way to be able to do this in C. Also included in this package are functions that allow one to convert arrays defined in the standard way in C to nrutil compatible arrays. This is often necessary in order to use the Numerical Recipes routines themselves. ----------------------- USING A MAKEFILE ------------------------ The #include "nrutil.c" statement in the above example, while sometimes convenient for short programs, is not the standard way to link with function libraries. It is not efficient because the nrutil.c code is compiled every time the main program is compiled. A better approach is to compile the nrutil.c code once and then link to this "object code" whenever another program requires it. The most convenient way to accomplish this is to use a Makefile. A Makefile is a file that sits in the same directory as your programs and has the name "Makefile" with the initial letter capitalized. The Makefile contains instructions about how you want your programs compiled. They can be very complicated, but we will just give some simple examples. Here is a Makefile that will compile the program trigtable.c trigtable : trigtable.c cc -lm trigtable.c -o trigtable Note that the first line just identifies the name of the program that we are "making" -- the executable file trigtable. This is considered the target of the Makefile. We follow this with a colon and then a list of everything upon which the target depends, in this case just the source code trigtable.c. The next line MUST start with a tab. If you use spaces, it will not work! The "cc -lm trigtable.c -o trigtable" is simply what will happen when you enter "make trigtable" at the UNIX command level. The advantage of using a Makefile in this example is that it allows us to avoid having to type in "cc -lm trigtable.c -o trigtable" everytime we want to recompile the program. Now let's see how we can link to the nrutil code. First, we change our definearray.c program to start with: (definearray2.c) #include #include "nrutil.h" Notice that we include the header file "nrutil.h" this time rather than the nrutil.c source code. If we try to compile the program without a Makefile we will get error messages because the computer will not be able to find the vector and matrix functions. So we must create an object code for nrutil by entering at the UNIX command level: cc -c nrutil.c This will create the file nrutil.o which is what we will want to include with our program if we are to access the functions. In the make file, we write definearray2 : definearray2.c nrutil.o cc definearray2.c nrutil.o -o definearray2 In this way, we include the nrutil functions at the time that we compile the definearray2 program. We list nrutil.o both on the first line to indicate that the target, definearray2, depends upon it, and on the second line when we compile the main program. When Makefile is used, the computer will check to see if definearray2.c or nrutil.o have been changed since the last time the target, definearray2, was created. If not, Makefile will say that definearray2 is up to date and will not recompile it. -------------- SORTING EXAMPLE USING NR UTILITIES --------------- To further illustrate the use of the Numerical Recipes routines, here is an example program that generates a list of random numbers, sorts them using the NR function piksrt, and then prints out the sorted list. (sortrand.c) #include #include /* contains srand and rand functions */ #include "nrutil.c" #define NPTS 100 void piksrt(int n, float arr[]); int main() { float *x; int i, j; x = vector(1,NPTS); srand(5); /* initialize generator */ for (i=1; i<=NPTS; i++) x[i] = (float)rand()/(float)(RAND_MAX+1); piksrt(NPTS,x); for (i=0; i 0 && arr[i] > a) { arr[i+1] = arr[i]; i--; } arr[i+1] = a; } } It is often the case that we have functions that we will want to use in more than one program. In this case, we can save them individually or within a subroutine library. For example, we can save piksrt as piksrt.c: (piksrt.c) void piksrt(int n, float arr[]) /* sorts an array arr[1,n] into ascending order by straight insertion. n is input; arr is replaced on output by its sorted rearrangement WARNING: Very slow for large n, use only for n < 50 or so */ { int i, j; float a; for (j=2; j<=n; j++) { a = arr[j]; i = j - 1; while (i > 0 && arr[i] > a) { arr[i+1] = arr[i]; i--; } arr[i+1] = a; } } and then our main program becomes: (sortrand2.c) #include #include /* contains srand and rand functions */ #include "nrutil.c" #include "piksrt.c" #define NPTS 100 int main() { float *x; int i, j; x = vector(1,NPTS); srand(5); /* initialize generator */ for (i=1; i<=NPTS; i++) x[i] = (float)rand()/(float)(RAND_MAX+1); piksrt(NPTS,x); for (i=0; i. For simplicity in presenting this example, we have continued to use the somewhat inefficient method of including nrutil.c and piksrt.c at the start of the program. As discussed above, a better approach for larger programs is to include *.h files instead and then link to the function object files during compilation by using a Makefile. In this case we would have to create a piksrt.h file that would give the function prototypes for the functions contained in piksrt.c (in this case there is just one function in piksrt.c but more generally there might be many functions). ASSIGNMENT C13 Modify the sortrand program to return the median of a user-specified number of random numbers. Do the median computation in the form of a function that returns the median value, but does NOT resort the input array in the process. Hint: Within your median-computing function, copy the input array to another array before calling piksrt. Design your function for x[1, ....., n], not x[0, 1, ... n-1]. This means that you will want to use the NR vector function to define the temporary array for input into piksrt. ------------------ EXAMPLE USING STATIC VARIABLES ------------------ A common convention in geophysics is to name an instrument site with a short ascii string. In seismology, for example, station names are usually designated with 3 to 4 character names. Southern California GPS sites are usually identified with 4 character names. Often data products are labeled by the station name, without including information about the station, such as its location. Data processing programs will often require this information. Thus, there is frequently a need for a computer routine that will retrieve information about a station, given the station name. One could hardwire this information into the program with a series of if statements, but this would become very cumbersome for large numbers of stations and would limit the flexibility and portability of the code. A better approach is to save the station information in a file and have the computer read from the file to access the information when necessary. However, file I/O is relatively slow so we won't want to open, read, and close the file every time we need to determine a station location. It would be much faster to read the station information file once and then save the information during the time that the program is running. For portability, ideally all of this overhead would be performed in a function that we could use in many programs without having to worry about the format of the station file or the size of the array that are required to store it. In order to do this, we need to retain the values of certain variables within the function for subsequent calls. Here is an example that will return station coordinates for some of the GSN (Global Seismic Network) stations. (testgetstat.c) #include #include int getstat(char stname[], float *slat, float *slon, float *sdep); int main() { char stname[5]; float slat, slon, selev; while (1) { printf("Enter station name \n"); scanf("%s",stname); if (getstat(stname, &slat, &slon, &selev) == 0) printf("lat, lon, dep = %f %f %f \n",slat,slon,selev); else printf("Station not found \n"); } return 0; } /* getstat returns station coordinates for given station name Upon first call, reads from file (STFILE) listing station coordinates. THIS FILE MUST BE IN ALPHABETICAL ORDER. Inputs: stname = station name Returns: slat = station latitude (degrees) slon = station longitude (degrees) selev = station elevation (km) */ #define STFILE "stlist.gsn" #define NMAX 5000 int getstat(char stname[], float *slat, float *slon, float *sdep) { static int firstcall=1, nsta=0; static char stlist[NMAX][5]; static float stlat[NMAX], stlon[NMAX], stdep[NMAX]; int i, i1, i2, iter, itest; FILE *fp1; if (firstcall){ firstcall = 0; printf("Reading station file: %s \n",STFILE); fp1 = fopen(STFILE,"r"); if (fp1 == NULL) { printf("***Can't open file: STFILE \n"); return 1; } for (i=0; i */ float sph_dist(float flat1, float flon1, float flat2, float flon2) { float pi=3.141592654, raddeg, theta1, theta2, phi1, phi2, ang, del; raddeg =pi/180.0; theta1 = (90.-flat1)*raddeg; theta2 = (90.-flat2)*raddeg; phi1 = flon1*raddeg; phi2 = flon2*raddeg; ang = acos(sin(theta1)*sin(theta2)*cos(phi2-phi1)+cos(theta1)*cos(theta2)); del = ang/raddeg; return del; } Don't bother to do the separate calculation for small values of del. The nearstat function should be of the form: int nearstat(float plat, float plon, char stname[], float *slat, float *slon, float *sdep); where plat and plon are the coordinates going into the function and stname, slat, slon and sdep are passed back from the function to the main program. HINT: Remember to use the string copy function, strcpy, to set stname once you have found the index for the closest station. ------------------- STRUCTURES AND COMPLEX NUMBERS ------------------ Unlike FORTRAN or MATLAB, C provides no direct method for defining and manipulating complex numbers. From a scientific programming perspective, this is a serious problem because complex numbers are essential for many problems. Fortunately, it is possible to get around this limitation in C by implementing user-defined functions that can handle complex arithmetic. Typically, these take advantage of a C feature, termed a structure, that we have not discussed so far. A structure in C is a user-defined array of data elements that may, or may not, be of the same data type. Suppose, for example, we wanted to set up a data base containing information (name, age, height, weight) about 5 different people. Here is a program that uses a structure to read in this information and print the name of the lightest person: (namebase.c) #include #define NAMELENGTH 12 #define NMAX 5 int main() { typedef struct { char name[NAMELENGTH]; int age; float height, weight; } person_type; person_type students[NMAX]; int i,ii; float weightmin; for (i=0; i #define NAMELENGTH 12 #define NMAX 5 typedef struct { char name[NAMELENGTH]; int age; float height, weight; } person_type; int getpeople(person_type students[NMAX]); person_type findlightest(person_type students[NMAX]); int main() { person_type students[NMAX], lightstudent; getpeople(students); lightstudent = findlightest(students); printf("%s is the lightest student \n", lightstudent.name ); return 0; } int getpeople(person_type students[NMAX]) { int i; for (i=0; i #include #include "complex.c" int main() { fcomplex a, b, c; float x, y, cabs; printf("Enter 1st complex number (real + imag) \n"); scanf("%f %f", &x, &y); a = Complex(x, y); printf("Enter 2nd complex number (real + imag) \n"); scanf("%f %f",&b.r, &b.i); c = Cmul(a, b); cabs = Cabs(c); printf("product = (%f, %f), cabs = %f \n", c.r, c.i, cabs); return 0; } Note that the fcomplex structure is defined within complex.c so it does not appear in the program. Within the program, the statement fcomplex a, b, c; defines a, b and c as complex. The real and imaginary members of b, for example, can then be accessed as b.r and b.i. Note that the dot operator is used to directly input the second complex number. These routines enable complex numbers to be manipulated almost as easily as they are in FORTRAN or C. Perhaps the biggest remaining disadvantage of C in this respect is that the code is less readable than the case when the arithmetic operators (+, -, *, /) can be used directly. ASSIGNMENT C15 Write a complex function called Cexp to add to the complex.c library that computes the complex exponential. Here is the formula e^z = e^(x+iy) = e^x e^(iy) = (e^x cos y) + i (e^x sin y) which you will have to adapt for your function. Include a driver program to test your function for selected input values. Make sure that your code gets the correct answer for these examples: exp( 1.1 + 2.3i) = -2.002 + 2.240i exp( 0.0 + 1.2i) = 0.362 + 0.932i exp(-0.5 + 0.0i) = 0.607 + 0.000i