accum, a function like MATLAB's accumarray

NumPy doesn't include a function that is equivalent to MATLAB's accumarray function. The following function, accum, is close.

Note that accum can handle n-dimensional arrays, and allows the data type of the result to be specified.

   1 from itertools import product
   2 import numpy as np
   3 
   4 
   5 def accum(accmap, a, func=None, size=None, fill_value=0, dtype=None):
   6     """
   7     An accumulation function similar to Matlab's `accumarray` function.
   8 
   9     Parameters
  10     ----------
  11     accmap : ndarray
  12         This is the "accumulation map".  It maps input (i.e. indices into
  13         `a`) to their destination in the output array.  The first `a.ndim`
  14         dimensions of `accmap` must be the same as `a.shape`.  That is,
  15         `accmap.shape[:a.ndim]` must equal `a.shape`.  For example, if `a`
  16         has shape (15,4), then `accmap.shape[:2]` must equal (15,4).  In this
  17         case `accmap[i,j]` gives the index into the output array where
  18         element (i,j) of `a` is to be accumulated.  If the output is, say,
  19         a 2D, then `accmap` must have shape (15,4,2).  The value in the
  20         last dimension give indices into the output array. If the output is
  21         1D, then the shape of `accmap` can be either (15,4) or (15,4,1) 
  22     a : ndarray
  23         The input data to be accumulated.
  24     func : callable or None
  25         The accumulation function.  The function will be passed a list
  26         of values from `a` to be accumulated.
  27         If None, numpy.sum is assumed.
  28     size : ndarray or None
  29         The size of the output array.  If None, the size will be determined
  30         from `accmap`.
  31     fill_value : scalar
  32         The default value for elements of the output array. 
  33     dtype : numpy data type, or None
  34         The data type of the output array.  If None, the data type of
  35         `a` is used.
  36 
  37     Returns
  38     -------
  39     out : ndarray
  40         The accumulated results.
  41 
  42         The shape of `out` is `size` if `size` is given.  Otherwise the
  43         shape is determined by the (lexicographically) largest indices of
  44         the output found in `accmap`.
  45 
  46 
  47     Examples
  48     --------
  49     >>> from numpy import array, prod
  50     >>> a = array([[1,2,3],[4,-1,6],[-1,8,9]])
  51     >>> a
  52     array([[ 1,  2,  3],
  53            [ 4, -1,  6],
  54            [-1,  8,  9]])
  55     >>> # Sum the diagonals.
  56     >>> accmap = array([[0,1,2],[2,0,1],[1,2,0]])
  57     >>> s = accum(accmap, a)
  58     array([9, 7, 15])
  59     >>> # A 2D output, from sub-arrays with shapes and positions like this:
  60     >>> # [ (2,2) (2,1)]
  61     >>> # [ (1,2) (1,1)]
  62     >>> accmap = array([
  63             [[0,0],[0,0],[0,1]],
  64             [[0,0],[0,0],[0,1]],
  65             [[1,0],[1,0],[1,1]],
  66         ])
  67     >>> # Accumulate using a product.
  68     >>> accum(accmap, a, func=prod, dtype=float)
  69     array([[ -8.,  18.],
  70            [ -8.,   9.]])
  71     >>> # Same accmap, but create an array of lists of values.
  72     >>> accum(accmap, a, func=lambda x: x, dtype='O')
  73     array([[[1, 2, 4, -1], [3, 6]],
  74            [[-1, 8], [9]]], dtype=object)
  75     """
  76 
  77     # Check for bad arguments and handle the defaults.
  78     if accmap.shape[:a.ndim] != a.shape:
  79         raise ValueError("The initial dimensions of accmap must be the same as a.shape")
  80     if func is None:
  81         func = np.sum
  82     if dtype is None:
  83         dtype = a.dtype
  84     if accmap.shape == a.shape:
  85         accmap = np.expand_dims(accmap, -1)
  86     adims = tuple(range(a.ndim))
  87     if size is None:
  88         size = 1 + np.squeeze(np.apply_over_axes(np.max, accmap, axes=adims))
  89     size = np.atleast_1d(size)
  90 
  91     # Create an array of python lists of values.
  92     vals = np.empty(size, dtype='O')
  93     for s in product(*[range(k) for k in size]):
  94         vals[s] = []
  95     for s in product(*[range(k) for k in a.shape]):
  96         indx = tuple(accmap[s])
  97         val = a[s]
  98         vals[indx].append(val)
  99 
 100     # Create the output array.
 101     out = np.empty(size, dtype=dtype)
 102     for s in product(*[range(k) for k in size]):
 103         if vals[s] == []:
 104             out[s] = fill_value
 105         else:
 106             out[s] = func(vals[s])
 107 
 108     return out

Examples

A basic example--sum the diagonals (with wrapping) of a 3 by 3 array:

In [5]: from numpy import array, prod

In [6]: from accum import accum

In [7]: a = array([[1,2,3],[4,-1,6],[-1,8,9]])

In [8]: a
Out[8]: 
array([[ 1,  2,  3],
       [ 4, -1,  6],
       [-1,  8,  9]])

In [9]: accmap = array([[0,1,2],[2,0,1],[1,2,0]])

In [10]: s = accum(accmap, a)

In [11]: s
Out[11]: array([ 9,  7, 15])

Accumulate using multiplication, going from a 3 by 3 array to 2 by 2 array:

In [12]: accmap = array([
   ....:             [[0,0],[0,0],[0,1]],
   ....:             [[0,0],[0,0],[0,1]],
   ....:             [[1,0],[1,0],[1,1]],
   ....:         ])

In [13]: accum(accmap, a, func=prod, dtype=float)
Out[13]: 
array([[ -8.,  18.],
       [ -8.,   9.]])

Create an array of lists containing the values to be accumulated in each position in the output array:

In [14]: accum(accmap, a, func=lambda x: x, dtype='O')
Out[14]: 
array([[[1, 2, 4, -1], [3, 6]],
       [[-1, 8], [9]]], dtype=object)

Use accum to arrange some values from a 1D array in a 2D array (note that using accum for this is overkill; fancy indexing would suffice):

In [15]: subs = np.array([[k,5-k] for k in range(6)])

In [16]: subs
Out[16]: 
array([[0, 5],
       [1, 4],
       [2, 3],
       [3, 2],
       [4, 1],
       [5, 0]])

In [17]: vals = array(range(10,16))

In [18]: accum(subs, vals)
Out[18]: 
array([[ 0,  0,  0,  0,  0, 10],
       [ 0,  0,  0,  0, 11,  0],
       [ 0,  0,  0, 12,  0,  0],
       [ 0,  0, 13,  0,  0,  0],
       [ 0, 14,  0,  0,  0,  0],
       [15,  0,  0,  0,  0,  0]])