Calling Fortran Subroutines from C/C++

Calling Fortran Subroutines from C/C++

A lot of excellent numerical software is written in Fortran, so it's often useful to use this software (i.e. to call Fortran subroutines and/or functions) from C and/or C++ programs. In this web page I discuss how to do this, as well as how to write C/C++ functions which can be called from Fortran code. Much of this material is the same for both C and C++, but where there are differences, I've written C++-only sections to describe the extra features C++ offers over C.

Table of Contents:

Link to my C programming page.
Link to my C++ programming page.
Link back to my scientific programming page.
Link back to my home page.


My fortran.h Header File

I have written a C/C++ header file fortran.h to assist in calling Fortran subroutines. This file defines most of the macros and typedefs described in this web page, for a number of different types of computer systems.

On the Institut für Theoretische Physik Unix machines (doppler, galileo, davinci, nottingham, mach, noether, etc.), this header file is installed in the directory /usr/local/share2/include/local/, so you can use it by putting

#include <local/fortran.h>
in your C/C++ program, and then giving the -I/usr/local/share2/include compiler option to gcc or g++. Or, you can download a copy of fortran.h for yourself. (If you extend it to handle additional types of computer systems; please send the changes back to me so I can incorporate them in my master copy.)

C++ Linkage Directives

If C++ code calls a function or subroutine written in another language, the C++ compiler generally needs to treat the call specially (i.e. according to the other language's calling conventions). To tell the C++ compiler how to do this, we add a "linkage directive" to the C function's C++ prototype, for example

extern "C" void Cfunc(int, int);
Here the extern "C" tells the C++ compiler to handle calls on Cfunc() in a C-compatible manner. Linkage directives are described in more detail in
Lippman and Lajoie, section 7.7.

In theory, this mechanism makes it very easy to call Fortran functions from C++ code, just use an extern "Fortran" linkage directive. Alas, most C++ compilers don't support this. :(

Instead, we usually treat Fortran code as if it were C for these purposes, i.e. we use an extern "C" linkage directive (which most C++ compilers do support) on the C++ prototype. In practice, this usually works.


Name Mangling

In order to call a Fortran subroutine or function from a C/C++ program, we first have to be able to name the Fortran routine. That is, given a Fortran subroutine or function, what name do we use to refer to this subroutine from within C/C++ code? In general this won't be precisely the same as the original Fortran name, but will instead be some "mangled" modification of it.

C++ normally also mangles names for its own purposes; this is invisible to the programmer so long as all code is written in C++. On most systems C doesn't mangle names this way, so an extern "C" linkage directive disables the mangling.

Name mangling is system- and compiler-dependent, but for many Unix systems and compilers, the name as seen from C will be the original Fortran name, converted to lower case, with an underscore appended. For example, the Fortran subroutines FOO and bar would be referred to as foo_ and bar_ respectively in C code.

Since name-mangling is system-dependent, it's best to encapsulate it inside a C preprocessor macro in a header file. For example, my fortran.h header file defines the macro

#define FORTRAN_NAME(foo)	/* something appropriate */
(Unfortunately, this doesn't encapsulate the convert-to-lower-case: that still has to be explicit in the code.)

Linking the Object Files Together

Here I want to digress a little bit from the programming, and say a little bit about linking C/C++ and Fortran object files together.

When we compile a program, say with the Unix command

gcc -o myprogram main.c sub.c -lm
there are really two very different operations going on:

Ordinarily, this all happens transparently "behind the scenes", and we don't need to worry about it. But for interlanguage calling we may need to manually intervene in the linking process. The reason is that Fortran programs may (and in practice usually do) make use of some additional subroutine libraries, to implement Fortran I/O and math functions. If you compile and link a Fortran program by using a Fortran compiler (for example g77, the Fortran part of gcc), the compiler knows to link these extra libraries in without you having to explicitly ask for them. But if you just compile your Fortran subroutines, and do the final linking (production of an executable binary) with a C/C++ compiler, that compiler won't know you to include the Fortran I/O library, and it probably won't include the system math library unless you explicitly ask it. The result is that the final C linking stage will fail with error messages about "undefined references", i.e. subroutines which were called (by the Fortran code) which the linker can't find.

There are a couple of different ways to solve the problem. To illustrate them, suppose our previous example had the subroutine sub.c change to be in Fortran, sub.f.

The simplest solution is to first just compile (but not link) your C code, then do the final linking with a Fortran compiler:

gcc -c main.c
g77 -o myprogram main.o sub.f
By specifying the -c option to gcc we tell it just compile (not link) main.c, producing the object file main.o. Then we use g77 to compile sub.f and link that with main.o. Because we're using g77 (a Fortran compiler) to do the linking, it knows to automatically include all of the extra libraries needed for Fortran programs, so everything works fine.

Unfortunately, this first solution fails if you're using C++, because C++ may have it's own support libraries, and there's no way to have two different language's compilers both do the final linking.

A more general solution (and the one I personally prefer) is to find out what extra libraries are used by the Fortran compiler, then specify them explicitly to the C/C++ compiler in the final linking stage. For example, on Unix g77 uses the extra library -lg2c and the system math library -lm. So, we could combine a C main program main.c with a Fortran subroutine sub.f with the Unix commands

g77 -c sub.f
gcc -o myprogram main.c sub.o -lg2c -lm
By specifying the -c option to g77, we tell it just compile (not link) sub.f, producing the object file sub.o. Then we use gcc to compile main.c and link that with sub.o and the g77 Fortran support library -lg2c, as well as the system math library -lm.

Unfortunately, the precise "extra libraries" needed by Fortran code vary from one compiler and operating system to the next. For g77 on Unix systems, the example above is generally correct. But for other compilers you may have to experiment a bit to find out what libraries you need -- this is usually poorly documented. (Possible ways to find out: the linking is generally done by running a separate program, normally called ld on Unix, so see if there's some option which will show you what libraries are being linked in when ld is run. In gcc and g77, there's a --verbose (or -V) option which causes the compiler to show you what it's doing internally. Alternatively, on Unix you can use the ps command to try and see the command line arguments to the ld process.)


Fortran Subroutines versus Functions

In Fortran, there's a distinction in the language syntax between a subroutine, which is invoked with a call statement and does not return a value, and a function, which is invoked from within an arithmetic expression and returns a value.

In C these two concepts are unified, and we always refer to functions. A Fortran subroutine is equivalent to a C function with a return type of void (i.e. the function doesn't return a value), while a Fortran function is equivalent to a C function with a non-void return type (i.e. the function returns a value).


Data Types

In order to do anything useful with C-to-Fortran calling, we need to be able to pass data (subroutine arguments, external/common variables, etc) back and forth between the languages. The basic data types are fairly similar, though the exact equivalencies depend on what type of computer system we're using, and often also on what C/C++/Fortran compilers and/or compiler options we're using.


All Fortran Subroutine/Function Arguments are Pointers

Fortran always uses "call by reference": If you call a subroutine or function, it actually gets passed pointers to the arguments you specify. For example, if you call a Fortran subroutine with call foo(x, y), the subroutine actually gets passed pointers to x and y.

In contrast, in C "what you see is what you get": functions get passed the actual values of their arguments. For example, if you call a C function with foo(x, y), the function gets passed the actual values of x and y. The only exception is for arrays, which are handled like in Fortran, i.e. C actually passes a pointer to the first array element. [I'm actually glossing over some subtle details here; see section 6, "Arrays and Pointers", of the (excellent) comp.lang.c FAQ for a more detailed discussion.]

What this means in practice is that to call a Fortran subroutine or function from a C program, you must explicitly pass pointers to any non-array arguments. Similarly, if Fortran code is going to call your C function, you must write the C function to receive pointers to any non-array arguments.

For example, suppose we have a Fortran function to compute L-k norms:

c       returns k-norm of x and y
        double precision function knorm(k, x, y)
        integer k
        double precision x, y
        return (x**k + y**k) ** (1.0d0/k)
        end
We might call this subroutine in Fortran with code like this:
        integer k
        double precision x, y, z

c             ...

        z = knorm(k, x, y)
Using the data-type typedefs and name-mangling macro from my fortran.h header file, we might call this subroutine from C using code like this:
        #include <local/fortran.h>
        
        /* prototype */
        double FORTRAN_NAME(knorm)(const integer *pk,
                                   const double *px, const double *py);
        
        /* ... */
        
        integer k;
        double x, y, z;
        
        /* ... */
        
        z = FORTRAN_NAME(knorm)(&k, &x, &y);
where we use & to create pointers to the variables. We could write the (Fortran-callable) subroutine itself in C like this:
        #include <math.h>
        #include <local/fortran.h>
        
        /*
         * returns k-norm of x and y
         * note Fortran-style interface: arguments passed via pointers
         */
        double FORTRAN_NAME(knorm)(const integer *pk,
                                   const double *px, const double *py);
        {
        /* fetch the arguments via their pointers */
        integer k = *pk;
        double x = *px;
        double y = *py;
        
        double dk = k;
        return pow( pow(x,dk) + pow(y,dk)  ,  1.0/dk );
        }
where we use * to explicitly fetch the original values via their pointers. (Computer scientists call this "dereferencing" the pointers.)

Now suppose we want to use this subroutine to compute 2-norms, i.e. we want to pass the constant 2 for the p argument. In Fortran we could do this with the obvious code:

        double precision x, y, z

c             ...

        z = knorm(2, x, y)
but in C things are a bit trickier: In C a pointer is only allowed to point to an "lvalue" (roughly speaking, a named variable, an array element, a structure member, or * of a pointer), and a constant isn't an lvalue. The solution (which is just what the Fortran compiler would do internally to implement the above Fortran code) is to store the constant in a temporary variable, then pass a pointer to that:
#include <local/fortran.h>

double FORTRAN_NAME(knorm)(const integer *pk,
                           const double *px, const double *py);
const integer k = 2;
double x, y, z;

/* ... */

z = FORTRAN_NAME(knorm)(&k, &x, &y);

C++ References

Everything in the previous section still applies to C++.

However, C++ offers an additional way of handling function arguments: When you write a C++ function, you can choose on an argument-by-argument basis either C-style "call by value", (the function gets passed the actual value of the argument), or call by reference (the function gets passed a C++ reference to the argument. (A C++ reference is essentially a hidden pointer which is automagically dereferenced whever it's used.) For example, the C++ function

void foo(double x, double &y)
{
// ...
}
will be passed the actual value of x, but a reference (a hidden pointer) to y.

You can use this mechanism to make it easier to call Fortran subroutines/functions from C++: If you tell the C++ compiler that the Fortran subroutine/function expects references to all its non-array arguments (we'll talk about array arguments later), the C++ compiler will automagically create references as necessary when you call the Fortran subroutine/function.

For example, let's take another look at the example in the previous section: Instead of using a prototype which tells the C++ compiler the Fortran function expects pointers to its arguments,

#include <local/fortran.h>
double FORTRAN_NAME(knorm)(const integer *pk,
                           const double *px, const double *py);
in C++ we can instead use a prototype which tells the C++ compiler the Fortran function expects references to its arguments:
#include <local/fortran.h>
double FORTRAN_NAME(knorm)(const integer &pk,
                           const double &px, const double &py);
Using such a prototype, we can now call the function from C++ code in a more natural way:
#include <local/fortran.h>
double FORTRAN_NAME(knorm)(const integer &pk,
                           const double &px, const double &py);

integer k;
double x, y, z;

/* ... */

z = FORTRAN_NAME(knorm)(k, x, y);
Voila!

An additional advantage of using references this way is that the C++ compiler will now automagically create references to constants for us whenever they're needed, so we can pass constant arguments in a natural manner:

#include <local/fortran.h>
double FORTRAN_NAME(knorm)(const integer &pk,
                           const double &px, const double &py);

double x, y;

/* ... */

double z = FORTRAN_NAME(knorm)(2, x, y);

However, references don't solve all Fortran interfacing problems in C++. In particular, you should still pass arrays to Fortran the same way as you would from C. That is, in C++ you should prototype a Fortran array argument as a pointer to the first array element.


Array Indexing Origin

Almost all nontrivial Fortran code uses arrays. C and Fortran arrays are fairly similar, but there are are a couple of key differences.

Probably the most important difference is that array indices in Fortran start at 1, while in C they start at 0. For example, given an array foo, the Fortran array element foo(3) is the same as the C array element foo[2]. More generally, given an array foo with N elements, in Fortran the array elements are foo(1), foo(2), foo(3), ..., foo(N), while in C they're foo[0], foo[1], foo[2], ..., foo[N-1].

For example, suppose we have a Fortran function to compute the mean value of an array:

c       compute mean of a(1...10)
        double precision function mean(a)
        double precision a(10)

        integer i
        double precision mu

        mu = 0.0d0
                do 100 i = 1, 10
                mu = mu + a(i)
100             continue
        mean = 0.1 * mu
        return
        end
Here the loop runs over indices i from 1 to 10.

Using the data-type typedefs and name-mangling macro from my fortran.h header file, we could write this same function (still Fortran-callable) in C/C++ like this:

/* compute mean of a[0...9] */
double FORTRAN_NAME(mean)(double a[10])
{
int i;
double mu = 0.0;

        for (i = 0 ; i < 10 ; ++i)
        {
        mu += a[i];
        }

return 0.1 * mu;
}
Notice that the loop now runs over indices i from 0 to 9.

Variable-Size 1-Dimensional Arrays

In Fortran, it's legitimate (and quite common) for a subroutine or function to be passed an array whose size may vary from one call to the next. To do this, the subroutine/function just declares the array with the right size, typically obtained by having the caller pass the size as another argument. For example, a more generally useful version of example in the previous section might be

c       compute mean of a(1...n)
        double precision function mean(n, a)
        integer n
        double precision a(n)

        integer i
        double precision mu, dn

        mu = 0.0d0
                do 100 i = 1, n
                mu = mu + a(i)
100             continue
        dn = n
        mean = mu / dn
        return
        end
Here the loop runs over indices i from 1 to n.

To do the same thing in C, the function should just omit the array size when declaring the "array" argument [which is actually a pointer, see section 6, "Arrays and Pointers", of the (excellent) comp.lang.c FAQ for details]. For example, using the data-type typedefs and name-mangling macro from my fortran.h header file, we could write this same function (still Fortran-callable) in C like this:

/*
 * compute mean of a[0...n-1]
 * ... note Fortran-style interface: arguments passed via pointers
 */
double FORTRAN_NAME(mean)(integer *pn, double a[])
{
int n = *pn;
int i;
double mu = 0.0;

	for (i = 0 ; i < n ; ++i)
	{
	mu += a[i];
	}

return mu / ((double) n);
}
Here the loop runs over indices i from 0 to n-1.

Or, using C++ references for the non-array arguments, we could write this function in C++ like this:

//
// compute mean of a[0...n-1]
// ... note Fortran-style interface: arguments passed via pointers/references
//
double FORTRAN_NAME(mean)(integer &n, double a[])
{
int i;
double mu = 0.0;

	for (i = 0 ; i < n ; ++i)
	{
	mu += a[i];
	}

return mu / double(n);
}

Warning

The remainder of this web page still needs revision for C++. Everything it says will still work in C++, but there are nicer techniques which it doesn't describe yet. I'll try to revise this sometime soon...


Multidimensional Array Indexing

For multidimensional arrays, there's an additional consideration: multidimensional arrays in C are stored in row-major order; in Fortran they're in column-major order. For a 2-dimensional array (matrix), this means that C stores each row contiguously in memory, while Fortran stores each column contiguously. More generally, for an N-dimensional array, in C the last dimension is contiguous in memory, while in Fortran the first dimension is contiguous.

For example, the Fortran array double precision A(2,3) is stored in memory like this:

A(1,1)   A(2,1)   A(1,2)   A(2,2)   A(1,3)   A(2,3)
while the C array double A[2][3] is stored in memory like this:
A[1][1]  A[1][2]  A[1][3]  A[2][1]  A[2][2]  A[2][3]

In practice, what this means is that if you want to pass a multidimensional array back and forth between C and Fortran, you should write the indices in the opposite order in C than in Fortran. You must do this both when declaring the array and when using (subscripting) it.

For example, to manipulate the Fortran array double precision A(2,3) from C, we'd declare it as double A[3][2]; it would be stored in memory like this:

A[1][1]  A[1][2]  A[2][1]  A[2][2]  A[3][1]  A[3][2]
Notice how this order is exactly the same as the Fortran order, except that the subscripts are reversed.

As a larger example, suppose (using Fortran notation) we have a 1-dimensional array (vector) double precision x(1000), and we want to set up an 1000 by 100 matrix A with A(i,j) = x(i)**(j-1). We could do this in Fortran with the subroutine:

	subroutine setup(x, A)
	double precision x(1000)
	double precision A(1000,100)

	integer i, j
		do 110 i = 1, 1000
			do 100 j = 1, 100
			A(i,j) = x(i) ** (j-1)
100			continue
110		continue
	return
	end

Using the data-type typedefs and the name mangling macro we defined earlier, we could write this same function (still Fortran-callable) in C like this:

#include <math.h>
void FORTRAN_NAME(setup)(const double x[], double A[100][1000])
{
int i, j;

	for (i = 0 ; i < 1000 ; ++i)
	{
		for (j = 0 ; j < 100 ; ++j)
		{
		A[j][i] = pow(x[i], (double)j);
		}
	}
}

Efficient Use of Multidimensional Arrays

On modern computer systems, it's much more efficient (faster) to access memory in sequential order than in any other order. Therefore, when programming with large arrays, it's much more efficient to step through them in sequential order as they're stored in memory. For multidimensional arrays (accessed with nested loops), what this means is that for best efficiency, the innermost (most frequently executed) loop should be on the array index which is contiguous in memory, i.e. the first index in Fortran or the last index in C.

For example, the following code does the same thing as the previous example, but it's much more efficient: the two loops are written in the opposite order, so the matrix is now accessed in sequential order.

In Fortran:

	subroutine setup(x, A)
	double precision x(1000)
	double precision A(1000,100)

	integer i, j
		do 110 j = 1, 100
			do 100 i = 1, 1000
			A(i,j) = x(i) ** (j-1)
100			continue
110		continue
	return
	end

In C (still Fortran-callable):

#include <math.h>
void FORTRAN_NAME(setup)(const double x[], double A[100][1000])
{
int i, j;

	for (j = 0 ; j < 100 ; ++j)
	{
		for (i = 0 ; i < 1000 ; ++i)
		{
		A[j][i] = pow(x[i], (double)j);
		}
	}
}

Variable-Size Multiimensional Arrays

In Fortran, it's easy to have a subroutine or function be passed a multidimensional array whose size varies from one call to the next. For example, a Fortran subroutine to multiply matrices might begin

c	matrix multiply: a(p,q) * b(q,r) --> c(p,r)
	subroutine matmul(a, b, c, p, q, r)
	integer p, q, r
	double precision a(p,q), b(q,r), c(p,r)

Unfortunately, doing this in C is rather tricky, and requires knowing a fair bit about pointers and pointer arithmetic. However, most of the tricky stuff can be hidden behind macros, and using the macros is fairly straightforward. (Setting them up remains a bit tricky.) Sometime in the next week or two I'll try to put my macros for this on this web page.

In the meantime, if you want to try doing variable-size multidimensional arrays yourself, see question 6.19, "How do I write functions which accept two-dimensional arrays when the ``width'' is not known at compile time?" of the (excellent) comp.lang.c FAQ for discussion of various options.

The draft C9X standard (the next update to the ISO C standard) contains a feature which will make variable-size multidimensional arrays easy, though it may still take some hacking around to make them Fortran-compatible. However, most compilers don't yet implement this feature. (One notable exception is gcc, the GNU C compiler. Sometime I ought to play around with gcc's variable-sized arrays and check out their Fortran compatibility. So many things to do, so few hours in each day...)

C++ makes it possible to set things up so multidimensional arrays (both Fortran-compatible and otherwise) are very clean and elegant. (Again, setting this up takes some skill.) The details of this are beyond the scope of this web page, though; see chapter 13 of Barton and Nackman's C++ book for details.


Link to my web page on numerical programming.
Link to my C programming page.
Link to my C++ programming page.
Link back to my home page.
$Revision: 1.2 $ of $Date: 2004/06/14 14:21:17 $.