# Use cases

Use cases inform the requirements for, and design choices made in, this array
API standard. This section first discusses what types of use cases are
considered, and then works out a few concrete use cases in more detail.

## Types of use cases

- Packages that currently depend on a specific array library and would like
  to support multiple array libraries (e.g., for GPU or distributed array
  support, for improved performance, or to reach a wider user base).
- Writing new libraries/tools that wrap multiple array libraries.
- Projects that implement new types of arrays with, e.g., hardware-specific
  optimizations or auto-parallelization behavior, and need a familiar API to
  put on top for end users.
- End users who want to switch from one library to another without learning
  about all the small differences between those libraries.


## Concrete use cases

- :ref:`use-case-scipy`
- :ref:`use-case-einops`
- :ref:`use-case-xtensor`
- :ref:`use-case-numba`


.. _use-case-scipy:

### Use case 1: add hardware accelerator and distributed support to SciPy

When surveying a representative set of advanced users and research software
engineers in 2019 (for [this NSF proposal](https://figshare.com/articles/Mid-Scale_Research_Infrastructure_-_The_Scientific_Python_Ecosystem/8009441)),
the single most common pain point brought up about SciPy was performance.

SciPy heavily relies on NumPy (its only non-optional runtime dependency).
NumPy provides an array implementation that's in-memory, CPU-only and
single-threaded. Common performance-related wishes users have are:

- parallel algorithms (can be multi-threaded or multiprocessing based)
- support for distributed arrays (with Dask in particular)
- support for GPUs and other hardware accelerators (shortened to just "GPU"
  in the rest of this use case)

Some parallelism can be supported in SciPy: it has a `workers` keyword
(similar to scikit-learn's `n_jobs` keyword) that allows specifying the use
of parallelism in some algorithms. However, SciPy itself will not directly
start depending on a GPU or distributed array implementation, or contain
(e.g.) CUDA code - that's not maintainable given the resources for
development. _However_, there is a way to provide distributed or GPU support.
Part of the solution is provided by NumPy's "array protocols" (see gh-1),
which allow dispatching to other array implementations. The main problem then
becomes how to know whether this will work with a particular distributed or
GPU array implementation - given that there are zero other array
implementations that are even close to providing full NumPy compatibility -
without adding that array implementation as a dependency.

It's clear that SciPy functionality that relies directly on compiled
extensions (C, C++, Cython, Fortran) can't easily be run on an array library
other than NumPy (see :ref:`C-api` for more details about this topic). Pure
Python code can work though. There are two main possibilities:

1. Test with another package, manually or in CI, and simply provide a list
   of functionality that is found to work. Then make ad-hoc fixes to expand
   the set that works.
2. Start relying on a well-defined subset of the NumPy API (or a new
   NumPy-like API), for which compatibility is guaranteed.

Option (2) seems strongly preferable, and that "well-defined subset" is _what
an API standard should provide_. Testing will still be needed to ensure there
are no critical corner cases or bugs between array implementations, but
that's then a very tractable task.

As a concrete example, consider the spectral analysis functions in
`scipy.signal`. All of those functions (e.g., `periodogram`, `spectrogram`,
`csd`, `welch`, `stft`, `istft`) are pure Python - with the exception of
`lombscargle`, which is ~40 lines of Cython - and use NumPy function calls,
array attributes and indexing. The beginning of each function could be
changed to retrieve the module that implements the array API standard for
the given input array type, and then functions from that module could be
used instead of NumPy functions.
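
The pattern described in the previous paragraph could look roughly like the
sketch below. The `get_namespace` helper and the `demean` function are
hypothetical stand-ins for illustration only; it assumes standard-compliant
arrays expose an `__array_namespace__` method, with a NumPy fallback so the
example runs on plain `ndarray`s:

```python
import numpy as np

def get_namespace(x):
    # Hypothetical helper: return the array API namespace for x's type.
    # Assumes standard-compliant arrays expose __array_namespace__;
    # falls back to NumPy for arrays that do not.
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np

def demean(x):
    # Stand-in for a scipy.signal-style pure-Python function: fetch the
    # namespace once at the top, then use it instead of `np.` directly.
    xp = get_namespace(x)
    return x - xp.mean(x)
```

With this pattern, a CuPy or PyTorch input would be handled by its own
library's functions, and the result would stay on the input's device.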

If the user has another array type, say a CuPy or PyTorch array `x` on their
GPU, doing:
```
from scipy import signal

signal.welch(x)
```
will result in:
```
# For CuPy
ValueError: object __array__ method not producing an array

# For PyTorch
TypeError: can't convert cuda:0 device type tensor to numpy.
```
and therefore the user will have to explicitly convert to and from a
`numpy.ndarray` (which is quite inefficient):
```
# For CuPy
x_np = cupy.asnumpy(x)
freq, Pxx = (cupy.asarray(res) for res in signal.welch(x_np))

# For PyTorch
x_np = x.cpu().numpy()
# Note: ends up with tensors on CPU, may still have to move them back
freq, Pxx = (torch.tensor(res) for res in signal.welch(x_np))
```
This code will look a little different for each array library. The end goal
here is to be able to write this instead as:
```
freq, Pxx = signal.welch(x)
```
and have `freq` and `Pxx` be arrays of the same type and on the same device
as `x`.

.. note::

    This type of use case applies to many other libraries, from scikit-learn
    and scikit-image to domain-specific libraries like AstroPy and
    scikit-bio, to code written for a single purpose or user.


.. _use-case-einops:

### Use case 2: simplify einops by removing the backend system

[einops](https://github.com/arogozhnikov/einops) is a library that provides
flexible tensor operations and supports many array libraries (NumPy,
TensorFlow, PyTorch, CuPy, MXNet, JAX). Most of the code in `einops` lives
in two files:

- [einops.py](https://github.com/arogozhnikov/einops/blob/master/einops/einops.py)
  contains the functions it offers as public API (`rearrange`, `reduce`,
  `repeat`).
- [_backends.py](https://github.com/arogozhnikov/einops/blob/master/einops/_backends.py)
  contains the glue code needed to support that many array libraries.

The amount of code in each of those two files is almost the same (~550 LoC
each). The typical pattern in `einops.py` is:
```
def some_func(x):
    ...
    backend = get_backend(x)
    shape = backend.shape(x)
    result = backend.reduce(x)
    ...
```
With a standard array API, the `_backends.py` glue layer could almost
completely disappear, because the purpose it serves (providing a unified
interface to array operations from each of the supported backends) is
already addressed by the array API standard. Hence the complete `einops`
code base could be close to 50% smaller, and easier to maintain or add to.
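
For illustration, the same function written against a standard array API
could drop the `get_backend` indirection entirely. This is a sketch under
the assumption that `x` exposes an `__array_namespace__` method (a NumPy
fallback is included so the example runs on plain `ndarray`s), with `sum`
standing in for whatever reduction the real code performs:

```python
import numpy as np

def some_func(x):
    # The backend glue reduces to asking the array for its own namespace;
    # attribute access and reductions then work uniformly across libraries.
    xp = x.__array_namespace__() if hasattr(x, "__array_namespace__") else np
    shape = x.shape                # replaces backend.shape(x)
    result = xp.sum(x, axis=0)     # replaces backend.reduce(x)
    return result
```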

.. note::

    Other libraries that have a similar backend system to support many array
    libraries include [TensorLy](https://github.com/tensorly/tensorly), the
    (now discontinued) multi-backend version of
    [Keras](https://github.com/keras-team/keras),
    [Unumpy](https://github.com/Quansight-Labs/unumpy) and
    [EagerPy](https://github.com/jonasrauber/eagerpy). Many end users and
    organizations will also have such glue code - it tends to be needed
    whenever one tries to support multiple array types in a single API.

.. _use-case-xtensor:

### Use case 3: adding a Python API to xtensor

[xtensor](https://github.com/xtensor-stack/xtensor) is a NumPy-inspired
C++ array library that provides lazy arrays. It has Python (and Julia and R)
bindings; however, it does not have a Python array API.

Xtensor aims to follow NumPy closely; however, it only implements a subset
of the functionality, and documents some API differences in
[Notable differences with NumPy](https://xtensor.readthedocs.io/en/latest/numpy-differences.html).

Note that other libraries document similar differences, see for example
[this page for JAX](https://jax.readthedocs.io/en/latest/jax.numpy.html) and
[this page for TensorFlow](https://www.tensorflow.org/guide/tf_numpy).

Each time an array library author designs a new API, they have to choose (a)
what subset of NumPy makes sense to implement, and (b) where to deviate
because NumPy's API for a particular function is suboptimal or the semantics
don't fit their execution model.

This array API standard aims to provide an API that can be readily adopted,
without having to make the above-mentioned choices.

.. note::

    XND is another array library, written in C, that still needs a Python
    API. Array implementations in other languages are often in a similar
    situation, and could translate this array API standard 1:1 to their
    language.


.. _use-case-numba:

### Use case 4: make JIT compilation of array computations easier and more robust

[Numba](https://github.com/numba/numba) is a Just-In-Time (JIT) compiler
for numerical functions in Python; it is NumPy-aware. [PyPy](https://pypy.org)
is an implementation of Python with a JIT at its core; its NumPy support
relies on running NumPy itself through a compatibility layer (`cpyext`),
while a previous attempt to implement NumPy support directly was
unsuccessful.

Other array libraries may have an internal JIT (e.g., TensorFlow, PyTorch,
JAX, MXNet) or work with an external JIT like
[XLA](https://www.tensorflow.org/xla) or
[VTA](https://tvm.apache.org/docs/vta/index.html).

Numba currently has to jump through some hoops to accommodate NumPy's
casting rules, and may not attain full compatibility with NumPy in some
cases - see, e.g., [this](https://github.com/numba/numba/issues/4749) or
[this](https://github.com/numba/numba/issues/5907) example issue regarding
(array) scalar return values.

An [explicit suggestion from a Numba developer](https://twitter.com/esc___/status/1295389487485333505)
for this array API standard was:

> for JIT compilers (e.g. Numba) it will be important that the type of the
> returned value(s) depends only on the *types* of the input but not on the
> *values*.
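
The distinction can be illustrated with a small sketch (the function names
here are hypothetical, for illustration only). A function whose result dtype
depends on input *values* cannot be typed ahead of time by a JIT compiler,
while one whose result dtype depends only on input *types* can:

```python
import numpy as np

def value_dependent(x):
    # Result dtype depends on the *values* in x: float64 for all-positive
    # input, the original integer dtype otherwise. A JIT compiler cannot
    # infer the return type from the argument types alone.
    if (x > 0).all():
        return np.sqrt(x)
    return x

def type_dependent(x):
    # Result dtype depends only on the dtype of x: always float64.
    # This is the type-stability property the suggestion above asks for.
    return np.sqrt(x.astype(np.float64))
```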

A concrete goal for this use case is to have better matching between
JIT-compiled and non-JIT execution. Here is an example from the Numba code
base, the need for which should be avoided in the future:

```
def check(x, y):
    got = cfunc(x, y)
    np.testing.assert_array_almost_equal(got, pyfunc(x, y))
    # Check the power operation conserved the input's dtype
    # (this is different from Numpy, whose behaviour depends on
    # the *values* of the arguments -- see PyArray_CanCastArrayTo).
    self.assertEqual(got.dtype, x.dtype)
```