In NumPy 1.0.4 and earlier, matrices were treated as containers of matrices rather than as containers of 1d arrays. This leads to some behavior that is surprising for new users and not useful to experienced users. Here are three example anomalies:
>>> x = np.mat('0 1;2 3') >>> x #anomaly #1 matrix([[0, 1]]) >>> np.array([x,x]) #anomaly #2 Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: setting an array element with a sequence. >>> for row in x: #anomaly #3 ... for item in row: ... print item ... [[0 1]] [[2 3]]
Anomaly #1 demonstrates a breakage of a fundamental expecation for indexing. Example #3 demonstrates how this behavior currently undermines the generic use of iteration. (In principle we could change __iter__, at the cost of another deviation from array behavior.) Anomaly #2 demonstrates that the current breakage in the behavior of iteration has implications for all kinds of code that should be functional but currently is not.
A number of proposals have surfaced to eliminate this odd behavior. Unfortunately, some of them conflict with another behavior that some consider desirable, that x == x[0,:] and that x[0,:] is a matrix (i.e., 2d).
Note that NumPy indexing has none of these problems or inconsistencies. A core to several proposals is therefore: restore NumPy indexing to the matrix class, but return matrices when the result is 2d.
However, quite a few people argue that x should raise an error (since matrix indexing should always be 2d). At this point it appears that NumPy will have to pick two of the three following matrix behaviors:
- non-scalar indexing returns a submatrix
- x==x[0,:] and x[0,:] is a matrix.
There appears to be wide but not unanimous agreement that x==x[0,0] is a basic expectation for the behavior of 2d array-like objects, and should not be broken. (Some people argue that x should raise an error for a matrix x, and one person even argued that this should be true for ndarrays as well.) A summary of considerations can be found in the Guidelines section. You can also skip to the List of Proposals.
Two currently related (but potentially independent) questions are: what is x, and what objects are yielded by iteration over a matrix.
What is x?
Suppose x is a matrix, and I say v=A. What is v?
- If v is a 1d array, its behavior is known. E.g., v is the first element and v[0,0] is an IndexError. If we want to keep current behavior of submatrix extraction, we give up x == x[0,:]. On the other hand, if we stick close to ndarray behavior, x[0,:] should not return a matrix either! (We should use the usual NumPy slicing instead: x[:1,:].)
- If v is a matrix, its behavior can be learned by the experienced but breaks basic expectations (e.g., the x problem).
- If v is a "RowVector" what behavior do we get? Suppose the idea is that this will rescue x==x[0,0] and x=x[0,:]. Of course we must give up that x[0,:] is a submatrix. (This may be the right thing to give up!) Must v deviate from submatrix behavior in an important way? Yes: v is an IndexError. (Note that the same problem arises if we just override __getitem__ for the matrix class to special case row and column matrices.) It seems that v is a lot like a 1d array. Should it have other behavior that make it usable in linear algebra? Would any of these properties conflict with the behavior of 1d arrays? If so, is there any motivation for such row vectors other than a desire to be able to use single index rather than 2?
Comment: currently access through iteration yields the same result as access with a scalar index. (While this could be changed, the equivalence seems to be a good thing.) But since it is in principle separable, there is a second question.
What is x[0,:]?
Currently x[0,:] returns a rank-two object, a matrix. Thus to extract element 2, one uses x[0,:][0,2]. This may break some code that assumes (for example) x[i,...] has rank one less than x. One argument for doing it this way is so that x[0,:] still has matrix-like "*" semantics.
What Does Iteration Over x Yield?
This question raises issues similar to the previous question.
List of Proposals
- Introduce separate Row- and ColumnVector objects, which behave like (1xN) and (Nx1) matrices. Their elements can be indexed with a single element. A Matrix becomes a container of Vectors. For an example which explores whether there are advantages to this, see ConjugateGradientExample.
- Let a matrix be a container of 1d arrays.
- Introduce a vector object that is a subclass of matrix that handles scalar indexes differently when there is only one row or one column. (No separate row and column vector objects.)
- Take a new approach to matrices: add meta information to arrays to make them look more like tensors, and then construct a matrix class as a minimal deviation from these tensor-like arrays.
- Make x raise an error, but add row and column attributes to allow iteration across rows or columns. (Here the rows and columns are interpreted as submatrices, so iteration is yielding objects of the matrix type.) Override __iter__ so that iteration on a matrix yields 1d arrays, just like with an ndarray. This keeps behavior close to ndarray behavior but adds attributes that allow for iteration over rows or columns of a matrix (as submatrices). Indexing with non-scalar indexes should always yield a submatrix (e.g., x[0,:]).
- Use ndarray-style indexing. If the result would be a 2d array, return a matrix. If the result would be 1d array, return a 1d array or possibly a new vector object. (Note: this returns to ndarray style indexing and so gives up the current behavior that x[0,:] returns a matrix.)
Something similar to proposal (3) was implemented in r5072, but was reverted in r5084. (The implied behavior that x is a type error if x is a 1 by N matrix was considered untenable.) Proposal (1) remains under active consideration. Proposal (4) has not been fleshed out yet.
All proposals will eliminate the anomaly above. Any option eventually implemented should do so as well. Agreement on this appears general.
One of the proposals above is likely to be implemented in the long run. They are not compatible, so a good decision needs to consider the pros and cons of each approach.
A key question that seems likely to guide the decision is: how will the choice of proposal affect the ability of NumPy to maintain consistency between the behavior of matrices and the behavior of sparse matrices.
- Subclasses of ndarray should introduce deviant behavior only for clear gains in functionality or ease of use.
- Matrices and sparse matrices should behave consistently.
- Matrices should facilitate linear algebra. Specifically, they should should override the * and ** operators as does the current matrix class, and they should allow convenient extraction of submatrices.
- The following questions should be explicitly addressed:
- what behavior allows the most generic code?
- what behavior breaks fewest expectations?
- what behavior is most useful?
This seems to imply answers to the following questions (although agreement on this remains far from universal):
- Do we want to be able to extract a single row or column from a matrix, and have it be indexable like a 1-d object? (Answer: certainly yes for rows. Columns could be gotten from the transpose, e.g. x.T, but see (3) below. A 1d object is most useful, allows the most generic code, and breaks fewest expectations.)
- Do we want to be able to do (1) without calling the Array attribute: M.A[i]? (Yes: this is not less useful, allows the most generic code, and breaks fewest expectations.)
- Do we want to have those 1-d objects to have an orientation so that they act like Row or Column vectors in linear algebra operations (and broadcasting)? (Unclear. Most resulting functionality could be provided by rows and columns attributes that yield corresponding matrices. The exception is access to elements by scalar indexing, but is this important? Maybe not.)
- Do we want to require slice notation to get a submatrix that happens to have a single row or column? (Yes: this is slightly clumsier than current behavior for this purpose, but is consistent with ndarray indexing so that it breaks fewest expectations, and it allows the most generic code.)
Proposal #1: Introduce Row- and ColumnVectors
N-d arrays are organised in the following hierarchy:
- N-d array
- (N-1)-d array
- 1-d array
In other words, an N-dimensional ndarray is a container of (N-1)-dimensional ndarrays. This holds down to N==1, where the ndarray contains scalars.
This proposal (#1) suggests a similar hierarchy for matrices:
- RowVector or ColumnVector
This hierarchy ensures a minimum of behavioral differences between ndarrays and Matrices. Among others, the following expression would yield the same results for ndarrays and Matrices:
- x == x[0,:]
Note that any slice returning a vector preserves its orientation, like the current Matrix implementation does:
- x[0,:] -> Row Vector
- x[:,0] -> Column Vector
This behavior is intentionally different from ndarrays.
See Related Questions #1 above.
- Deviations from the behavior of ndarrays. Specifically, iteration over a matrix will not yield 1d NumPy arrays. (Also, see the Unanswered Question above.
- Inconsistency in the indexing style for producing row vectors and column vectors
- Loss of a standard way to extract a row or a column as a submatrix (rather than a vector)
- No clear gain in functionality over the simpler proposal 2. (But see Functionality Gains below.)
- Increase in complexity
Because this proposal is more complex, it should be implemented only if there are clear gains in functionality.
One possible gain has been suggested: it may prove easier to keep matrix behavior compatible with sparse matrix behavior under this proposal. (This suggestion needs to be fleshed out.)
One gain from this approach relative to say proposal #6 is that we can continue to simply express the quadratic x[0,:]*A*x[:,0] in exactly the same way, which is likely to be intuitive to beginners.
An Alternative Approach to the Same Functionality
The goals of this proposal might be met by introducting new attributes (perhaps rows and columns, allowing iteration over the matrix rows and columns. This appears to be a more consistent way to get the same functionality.
A 2D matrix becomes a container of 1D arrays. This has been superceded by Proposal 6.
Proposal: Special Indexing for Matrices that are Row or Column Vectors
This might involve no more than a change in __getitem__ so that the matrix tests its shape before processing a scalar index.
This introduces an obvious indexing inconsistency, making general code difficult.
Proposal: Meta Data to Support Tensor-Like Arrays
Add new attributes for row and column access, which will return matrices (i.e., 2d).
- Add a rows method and a cols method for returning iterators on the rows and columns.
- Also allow access to specified rows.
The rows and cols methods would render the following code valid:
for row in x.rows(): for col in row.cols(): print col
and is thus a alternative to the currently not working:
for row in x: for col in row: print col
which currently gives a unexpected result.
Comment: with these methods in place, there seems little reason not to let iteration over a matrix yield 1d arrays, which keeps matrix behavior closer to ndarray behavior. This is good, since it means much array code can handle matrices without any special handling.
Access to Specified Rows/Columns
One possibility is to let the rows and cols methods take an argument with a default of None. The default is to return an iterator over the rows/columns. An iterator over subsets of rows/columns could be specified with a sequence. A single row/column could be specified with a scalar argument.
Use standard ndarray indexing syntax. If this would produce a 2d array, return a matrix. Otherwise, return an array.
- iteration over a matrix yields 1d arrays, just like iteration over a 2d array
- x[i] and x.A[i] are identical for integer i (which fixes the example anomalies in the Introduction).
Why this is desirable
- the resulting matrix behavior is more natural and less surprising to new users
- behavioral consistency across array-like objects is a good thing when possible. On this principle, matrix behavior should deviate from 2d array behavior only when there is a clear increase in functionality. The current matrix response to a scalar index is a deviation from 2d array behavior that provides no additional functionality to experienced users and is surprising to new users.
- the quadratic x[0,:]*A*x[:,0] would have to be written in another way, e.g.,
- x[idx,:] returns a different type depending on the type of idx
- if idx is a scalar, the result is a 1d array
- if idx is a list or array, the result is a matrix
- strongly ensures viability of generic code