This is an archival dump of old wiki content --- see scipy.org for current material.
Please see http://scipy-cookbook.readthedocs.org/

C Extensions for Using NumPy Arrays

I've written several C extensions that handle NumPy arrays. They are simple, but they seem to work well. They will show you how to pass Python variables and NumPy arrays to your C code. Once you learn how to do it, it's pretty straightforward. I suspect they will suffice for most numerical code. I've written it up as a draft and have made the code and document file available. I found that for my numerical needs I really only need to pass a limited set of things (integers, floats, strings, and NumPy arrays). If that's your category, this code might help you.

I have tested the routines and so far, so good, but I cannot guarantee anything. I am a bit new to this. If you find any errors, put up a message on the SciPy mailing list.

A link to the tar ball that holds the code and docs is given below.

I have recently updated some information and included more examples. The document presented below is the original documentation which is still useful. The link below holds the latest documentation and source code.

-- Lou Pecora


What follows is the content of Lou's Word document, originally pasted here as version 1. I (DavidLinke) have converted this to wiki markup:

C Extensions to NumPy and Python

By Lou Pecora - 2006-12-07 (Draft version 0.1)

Overview

Introduction – a little background

In my use of Python I came across a typical problem: I needed to speed up particular parts of my code. I am not a Python guru or any kind of coding/computer guru. I use Python for numerical calculations and I make heavy use of Numeric/NumPy. Almost every Python book or tutorial tells you to build C extensions to Python when you need a routine to run fast. C extensions are C code that can be compiled and linked into a shared library that can be imported like any Python module, and you can call specified C routines as if they were Python functions.

Sounds nice, but I had reservations. It looked non-trivial (it is, to an extent). So I searched for other solutions. I found them. They are such approaches as SWIG, Pyrex, ctypes, Psyco, and Weave. I often got the simple examples given to work (not all, however) when I tried these. But I hit a barrier when I tried to apply them to NumPy. Then one gets into typemaps or other hybrid constructs. I am not knocking these approaches, but I could never figure them out and get going on my own code despite lots of online tutorials and helpful suggestions from various Python support groups and mailing lists.

So I decided to see if I could just write my own C extensions. I got help in the form of some simple C extension examples for using Numeric, written about 2000 by Tom Loredo of Cornell University. These sat on my hard drive until, 5 years later and out of desperation, I pulled them out. Using his examples, I was able to quickly put together several C extensions that (at least for me) handle all of the cases (so far) where I want a speedup. These cases mostly involve passing Python integers, floats (=C doubles), strings, and NumPy 1D and 2D float and integer arrays. I rarely need to pass anything else to a C routine to do a calculation. If you are in the same situation as me, then this package I put together might help you. It turns out to be fairly easy once you get going.

Please note, Tom Loredo is not responsible for any errors in my code or instructions, although I am deeply indebted to him. Likewise, this code is for research only. It has been tested only through my own development and usage. It is not guaranteed, and comes with no warranty. Do not use this code where there are any threats of loss of life, limb, property, or money or anything you or others hold dear.

I developed these C extensions and their Python wrappers on a Macintosh G4 laptop using OS X 10.4 (essentially BSD Unix), Python 2.4, NumPy 0.9x, and the GNU compiler and linker gcc. I think most of what I tell you here will be easily translated to Linux and other Unix systems beyond the Mac. I am not sure about Windows. I hope that my low-level approach will make it easy for Windows users, too.

The code (both C and Python) for the extensions may look like a lot, but it is very repetitious. Once you get the main scheme for one extension function you will see that repeated over and over again in all the others with minor variations to handle different arguments or return different objects to the calling routine. Don't be put off by the code. The good news is that for many numerical uses extensions will follow the same format so you can quickly reuse what you already have written for new projects. Focus on one extension function and follow it in detail (in fact, I will do this below). Once you understand it, the other routines will be almost obvious. The same is true of the several utility functions that come with the package. They help you create, test, and manipulate the data and they also have a lot of repetition. The utility functions are also very short and simple so nothing to fear there.

General Scheme for NumPy Extensions

This will be covered in detail below, but first I wanted to give you a sense of how each extension is organized.

Three things must be done in the C source file before your C extension functions appear.

  1. You must include Python and NumPy headers.
  2. Each extension must be named in a defining structure at the beginning of the file. This is a name used to access the extension from a Python function.
  3. Next an initialization set of calls is made to set up the Python and NumPy calls and interface. It will be the same for all extensions involving NumPy and Python unless you add extensions to access other Python packages or classes beyond NumPy arrays. I will not cover any of that here (because I don't know it). So the init calls can be copied to each extension file.

Each C extension will have the following form.

Python Wrapper Functions

It is best to call the C extensions by calling a Python function that then calls the extension. This is called a Python wrapper function. It gives your code a more Pythonic look (e.g. you can use keywords easily) and, as I pointed out above, allows you to easily check that the function arguments and data are correct before you hand them over to the C extension and other C functions for that big calculation. It may seem like an unnecessary extra step, but it's worth it.
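As a rough illustration of such a wrapper, here is a minimal sketch built around the matsq routine described under "The C Code". It is not the wrapper from C_arraytest.py. The name _matsq_backend is a hypothetical pure-NumPy stand-in for the compiled routine so that the sketch runs without building anything, and the particular checks (2D, float64, integer i) are assumptions about what the C side expects.

```python
import numpy as np


def _matsq_backend(a, i, y):
    """Hypothetical stand-in for the compiled C routine (same formula)."""
    return i * y * a ** 2


def matsq(a, i, y, backend=_matsq_backend):
    """Check the arguments in Python, then hand them to the extension."""
    a = np.asarray(a)
    if a.ndim != 2:
        raise ValueError("a must be a 2D array, got %d dimension(s)" % a.ndim)
    if a.dtype != np.float64:
        raise TypeError("a must hold C doubles (float64), got %s" % a.dtype)
    if isinstance(i, bool) or not isinstance(i, (int, np.integer)):
        raise TypeError("i must be an integer")
    # Assumption: the C code wants one contiguous block of doubles.
    a = np.ascontiguousarray(a)
    return backend(a, int(i), float(y))


# Keywords work here even if the C routine only accepts positional arguments.
b = matsq(np.array([[1.0, 2.0], [3.0, 4.0]]), i=2, y=0.5)
print(b)
```

With a real build, the backend argument would presumably be the function imported from the compiled module instead of the stand-in. The point of the design is that bad input raises a readable Python exception before any C code runs.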

The Code

In this section I refer to the code in the source files C_arraytest.h, C_arraytest.c, C_arraytest.py, and C_arraytest.mak. You should keep those files handy (probably printed out) so you can follow the explanations of the code below.

The C Code – one detailed example with utilities

First, I will use the example of code from C_arraytest.h and C_arraytest.c for the routine called matsq. This function takes a (NumPy) matrix A, an integer i, and a (Python) float y as input and returns a (NumPy) matrix B, each of whose components is equal to the square of the corresponding input matrix component times the integer times the float. Mathematically: $B_{jk} = i \, y \, (A_{jk})^2$.
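To make the definition of matsq concrete, the following sketch computes the same quantity in plain Python and NumPy with explicit loops, roughly the way a C routine would walk the matrix. The name matsq_reference is hypothetical and not part of the package. A reference like this may be handy for checking the output of the compiled extension.

```python
import numpy as np


def matsq_reference(a, i, y):
    """Return b with b[j, k] = i * y * a[j, k]**2, using explicit loops."""
    a = np.asarray(a, dtype=float)
    n, m = a.shape
    b = np.zeros((n, m), dtype=float)
    for j in range(n):          # rows
        for k in range(m):      # columns
            b[j, k] = i * y * a[j, k] * a[j, k]
    return b


a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = matsq_reference(a, 3, 2.0)
# The looped result should agree with the vectorized NumPy expression.
assert np.allclose(b, 3 * 2.0 * a ** 2)
```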

The Python code to call the matsq routine is B = matsq(A, i, y). Here is the relevant code in one place:

The Header file, C_arraytest.h:

The Source file, C_arraytest.c:

Now, let's look at the source code in smaller chunks.

Headers

You must include the following headers, with Python.h always the first header included.

I also include the header C_arraytest.h which contains the prototype of the matsq function:


The static keyword in front of a function declaration makes this function private to your extension module. The linker just won't see it. This way you can use the same intuitive function names (e.g. sum, check, trace) in all extension modules without name clashes between them at link time. The return type of the function is PyObject * because it always returns to a Python calling function, so you can (must, actually) return a Python object. The arguments are always the same:


The first one, self, is never used, but it is necessary because of how Python passes arguments. The second, args, is a pointer to a Python tuple that contains all of the arguments (A, i, y) of the function.
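The way the arguments arrive packed into a single tuple can be mimicked in pure Python. The sketch below is only an analogy, not the C code from C_arraytest.c. In C the unpacking is normally done with PyArg_ParseTuple; the helper parse_args and the function matsq_like are hypothetical names used here for illustration.

```python
def parse_args(args):
    """Unpack a 3-tuple (matrix, int, float), loosely mirroring PyArg_ParseTuple."""
    if not isinstance(args, tuple) or len(args) != 3:
        raise TypeError("expected a tuple of exactly 3 arguments")
    a, i, y = args
    if isinstance(i, bool) or not isinstance(i, int):
        raise TypeError("second argument must be an integer")
    return a, i, float(y)


def matsq_like(self, args):
    """Same shape as the C signature: an unused self and one tuple of arguments."""
    a, i, y = parse_args(args)
    return [[i * y * v * v for v in row] for row in a]


# A call matsq(A, i, y) reaches the function as args == (A, i, y).
result = matsq_like(None, ([[1.0, 2.0]], 2, 0.5))
print(result)
```

A wrong argument type is rejected during parsing, before any arithmetic is attempted, which is the same division of labour the C routine relies on.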

Method definitions

This sets up a table of function names that will be the interface from your Python code to your C extension. The name of the C extension module will be _C_arraytest (note the leading underscore). It is important to get the name right each time it is used because there are strict requirements on using the module name in the code. The name appears first in the method definitions table as the first part of the table name:


where I used ellipses (...) to omit other code not relevant to this function. The METH_VARARGS parameter tells the interpreter that you will pass the arguments the usual way, without keywords, as in the example B = matsq(A, i, y) above. There are ways to use Python keywords, but I have not tried them out. The table should always end with {NULL, NULL}, which is just a "marker" to note the end of the table.

Initializations

These functions tell the Python interpreter what to call when the module is loaded. Note that the name of the module (_C_arraytest) must come directly after the init in the name of the initialization function.


The order is important and you must call these two initialization functions first.

The matsq function code

Now here is the actual function that you will call from Python code. I will split it up and explain each section.

The function name and type:


You can see they match the prototype in C_arraytest.h.

The local variables:

SciPy: Cookbook/C_Extensions/NumPy_arrays (last edited 2015-10-24 17:48:26 by anonymous)