This roadmap page contains only the most important ideas and needs for SciPy going forward. For a more detailed roadmap, including per-subpackage status, many more ideas, API stability and more, see Detailed SciPy Roadmap.
Evolve BLAS and LAPACK support¶
The Python and Cython interfaces to BLAS and LAPACK in
scipy.linalg are one
of the most important things that SciPy provides. In general
is in good shape, however we can make a number of improvements:
1. Library support. Our released wheels now ship with OpenBLAS, which is currently the only feasible performant option (ATLAS is too slow, MKL cannot be the default due to licensing issues, Accelerate support is dropped because Apple doesn’t update Accelerate anymore). OpenBLAS isn’t very stable though, sometimes its releases break things and it has issues with threading (currently the only issue for using SciPy with PyPy3). We need at the very least better support for debugging OpenBLAS issues, and better documentation on how to build SciPy with it. An option is to use BLIS for a BLAS interface (see numpy gh-7372).
2. Support for newer LAPACK features. In SciPy 1.2.0 we increased the minimum supported version of LAPACK to 3.4.0. Now that we dropped Python 2.7, we can increase that version further (MKL + Python 2.7 was the blocker for >3.4.0 previously) and start adding support for new features in LAPACK.
Implement sparse arrays in addition to sparse matrices¶
The sparse matrix formats are mostly feature-complete, however the main issue
is that they act like
numpy.matrix (which will be deprecated in NumPy at
some point). What we want is sparse arrays that act like
This is being worked on in https://github.com/pydata/sparse, which is quite far
along. The tentative plan is:
Start depending on
pydata/sparseonce it’s feature-complete enough (it still needs a CSC/CSR equivalent) and okay performance-wise.
Add support for
scipy.sparse.linalg(and perhaps to
Indicate in the documentation that for new code users should prefer
pydata/sparseover sparse matrices.
When NumPy deprecates
numpy.matrix, vendor that or maintain it as a stand-alone package.
Fourier transform enhancements¶
scipy.fft subpackage should be extended to add a backend system with
support for PyFFTW and mkl-fft.
Support for distributed arrays and GPU arrays¶
NumPy is splitting its API from its execution engine with
__array_ufunc__. This will enable parts of SciPy
to accept distributed arrays (e.g.
dask.array.Array) and GPU arrays (e.g.
cupy.ndarray) that implement the
ndarray interface. At the moment it is
not yet clear which algorithms will work out of the box, and if there are
significant performance gains when they do. We want to create a map of which
parts of the SciPy API work, and improve support over time.
In addition to making use of NumPy protocols like
__array_function__, we can
make use of these protocols in SciPy as well. That will make it possible to
(re)implement SciPy functions like, e.g., those in
scipy.signal for Dask
or GPU arrays (see
NEP 18 - use outside of NumPy).
Improve source builds on Windows¶
SciPy critically relies on Fortran code. This is still problematic on Windows. There are currently only two options: using Intel Fortran, or using MSVC + gfortran. The former is expensive, while the latter works (it’s what we use for releases) but is quite hard to do correctly. For allowing contributors and end users to reliably build SciPy on Windows, using the Flang compiler looks like the best way forward long-term. Until Flang support materializes, we need to streamline and better document the MSVC + gfortran build.
Improve the options for fitting a probability distribution to data.
Expand the set of hypothesis tests. In particular, include all the basic variations of analysis of variance.
Add confidence intervals for all statistical tests.