
Commit c6889f1

Merge pull request #35 from data-apis/use-cases
Add use cases section
2 parents d0ec621 + 1db2b52 commit c6889f1

2 files changed: +225 -0 lines changed

spec/design_topics/C_API.md (+2 lines)

.. _C-api:

# C API

spec/use_cases.md (+223 lines)

# Use cases

Use cases inform the requirements for, and design choices made in, this array
API standard. This section first discusses what types of use cases are
considered, and then works out a few concrete use cases in more detail.

## Types of use cases

- Packages that currently depend on a specific array library and would like
  to support multiple of them (e.g. for GPU or distributed array support, for
  improved performance, or for reaching a wider user base).
- Writing new libraries/tools that wrap multiple array libraries.
- Projects that implement new types of arrays with, e.g., hardware-specific
  optimizations or auto-parallelization behavior, and need an API to put on
  top that is familiar to end users.
- End users who want to switch from one library to another without learning
  about all the small differences between those libraries.

## Concrete use cases

- :ref:`use-case-scipy`
- :ref:`use-case-einops`
- :ref:`use-case-xtensor`
- :ref:`use-case-numba`


.. _use-case-scipy:

### Use case 1: add hardware accelerator and distributed support to SciPy
31+
32+
When surveying a representative set of advanced users and research software
33+
engineers in 2019 (for [this NSF proposal](https://figshare.com/articles/Mid-Scale_Research_Infrastructure_-_The_Scientific_Python_Ecosystem/8009441)),
34+
the single most common pain point brought up about SciPy was performance.
35+
36+
SciPy heavily relies on NumPy (its only non-optional runtime dependency).
37+
NumPy provides an array implementation that's in-memory, CPU-only and
38+
single-threaded. Common performance-related wishes users have are:
39+
40+
- parallel algorithms (can be multi-threaded or multiprocessing based)
41+
- support for distributed arrays (with Dask in particular)
42+
- support for GPUs and other hardware accelerators (shortened to just "GPU"
43+
in the rest of this use case)

Some parallelism can be supported in SciPy: it has a `workers` keyword
(similar to scikit-learn's `n_jobs` keyword) that allows opting in to
parallelism in some algorithms. However, SciPy itself will not directly start
depending on a GPU or distributed array implementation, or contain (e.g.)
CUDA code - that's not maintainable given the resources for development.
_However_, there is a way to provide distributed or GPU support. Part of the
solution is provided by NumPy's "array protocols" (see gh-1), which allow
dispatching to other array implementations. The main problem then becomes how
to know whether this will work with a particular distributed or GPU array
implementation - given that there are zero other array implementations that
are even close to providing full NumPy compatibility - without adding that
array implementation as a dependency.

It's clear that SciPy functionality that relies directly on compiled
extensions (C, C++, Cython, Fortran) can't easily be run on an array library
other than NumPy (see :ref:`C-api` for more details about this topic). Pure
Python code can work though. There are two main possibilities:

1. Test with another package, manually or in CI, and simply provide a list
   of functionality that is found to work. Then make ad-hoc fixes to expand
   the set that works.
2. Start relying on a well-defined subset of the NumPy API (or a new
   NumPy-like API), for which compatibility is guaranteed.

Option (2) seems strongly preferable, and that "well-defined subset" is _what
an API standard should provide_. Testing will still be needed, to ensure there
are no critical corner cases or bugs between array implementations; however,
that's then a very tractable task.

As a concrete example, consider the spectral analysis functions in `scipy.signal`.
All of those functions (e.g., `periodogram`, `spectrogram`, `csd`, `welch`, `stft`,
`istft`) are pure Python - with the exception of `lombscargle`, which is ~40
lines of Cython - and use NumPy function calls, array attributes and
indexing. The beginning of each function could be changed to retrieve the
module that implements the array API standard for the given input array type,
and then functions from that module could be used instead of NumPy functions.

If the user has another array type, say a CuPy or PyTorch array `x` on their
GPU, doing:
```
from scipy import signal

signal.welch(x)
```
will result in:
```
# For CuPy
ValueError: object __array__ method not producing an array

# For PyTorch
TypeError: can't convert cuda:0 device type tensor to numpy.
```
and therefore the user will have to explicitly convert to and from a
`numpy.ndarray` (which is quite inefficient):
```
# For CuPy
x_np = cupy.asnumpy(x)
freq, Pxx = (cupy.asarray(res) for res in signal.welch(x_np))

# For PyTorch
x_np = x.cpu().numpy()
# Note: ends up with tensors on CPU, may still have to move them back
freq, Pxx = (torch.tensor(res) for res in signal.welch(x_np))
```
This code will look a little different for each array library. The end goal
here is to be able to write this instead as:
```
freq, Pxx = signal.welch(x)
```
and have `freq`, `Pxx` be arrays of the same type and on the same device as `x`.
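
Internally, such a function could start by retrieving the module that
implements the array API standard for the input's type, and then use that
module instead of NumPy. Below is a minimal sketch of that idea; the
`get_namespace` helper and the `__array_namespace__()` hook shown here are
hypothetical and purely illustrative - the exact retrieval mechanism is a
design choice for the standard:
```
import numpy

def get_namespace(x):
    # Hypothetical helper: ask the array which module implements the array
    # API standard for its type, and fall back to NumPy otherwise.
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return numpy

def welch(x, fs=1.0):
    # Retrieve the array API namespace for the input's type once, up front.
    xp = get_namespace(x)
    # Use functions from `xp` rather than `np.*`, so results stay of the same
    # array type and on the same device as `x`, e.g.:
    x = x - xp.mean(x)
    # ... rest of the algorithm, written against the same `xp` namespace ...
    return x
```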

.. note::

    This type of use case applies to many other libraries, from scikit-learn
    and scikit-image to domain-specific libraries like AstroPy and
    scikit-bio, to code written for a single purpose or user.

.. _use-case-einops:
### Use case 2: simplify einops by removing the backend system

[einops](https://github.com/arogozhnikov/einops) is a library that provides flexible tensor operations and supports many array libraries (NumPy, TensorFlow, PyTorch, CuPy, MXNet, JAX).
Most of the code in `einops` is in two files:

- [einops.py](https://github.com/arogozhnikov/einops/blob/master/einops/einops.py)
  contains the functions it offers as public API (`rearrange`, `reduce`, `repeat`).
- [_backends.py](https://github.com/arogozhnikov/einops/blob/master/einops/_backends.py)
  contains the glue code needed to support that many array libraries.

The amount of code in each of those two files is almost the same (~550 LoC each).
The typical pattern in `einops.py` is:
```
def some_func(x):
    ...
    backend = get_backend(x)
    shape = backend.shape(x)
    result = backend.reduce(x)
    ...
```
With a standard array API, the `_backends.py` glue layer could almost completely disappear,
because the purpose it serves (providing a unified interface to array operations from each
of the supported backends) is already addressed by the array API standard.
Hence the complete `einops` code base could be close to 50% smaller, and easier to maintain or add to.
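
For illustration, the same pattern written directly against a standardized
array API could look roughly like the sketch below. This is hypothetical: it
assumes the standard offers a way to obtain the array's API namespace
(shown here as an `__array_namespace__()` method) and uses `sum` as a
stand-in for whatever reduction is actually needed:
```
def some_func(x):
    # Obtain the array API namespace from the array itself, instead of going
    # through a hand-written backend wrapper.
    xp = x.__array_namespace__()
    shape = x.shape
    result = xp.sum(x, axis=-1)
    return result
```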

.. note::

    Other libraries that have a similar backend system to support many array libraries
    include [TensorLy](https://github.com/tensorly/tensorly), the (now discontinued)
    multi-backend version of [Keras](https://github.com/keras-team/keras),
    [Unumpy](https://github.com/Quansight-Labs/unumpy) and
    [EagerPy](https://github.com/jonasrauber/eagerpy). Many end users and organizations
    will also have such glue code - it tends to be needed whenever one tries to support
    multiple array types in a single API.

.. _use-case-xtensor:
### Use case 3: adding a Python API to xtensor

[xtensor](https://github.com/xtensor-stack/xtensor) is a C++ array library
that is NumPy-inspired and provides lazy arrays. It has Python (and Julia and R)
bindings; however, it does not have a Python array API.

Xtensor aims to follow NumPy closely; however, it only implements a subset of
functionality and documents some API differences in
[Notable differences with NumPy](https://xtensor.readthedocs.io/en/latest/numpy-differences.html).

Note that other libraries document similar differences; see, for example,
[this page for JAX](https://jax.readthedocs.io/en/latest/jax.numpy.html) and
[this page for TensorFlow](https://www.tensorflow.org/guide/tf_numpy).

Each time an array library author designs a new API, they have to choose (a)
what subset of NumPy makes sense to implement, and (b) where to deviate
because NumPy's API for a particular function is suboptimal or the semantics
don't fit their execution model.

This array API standard aims to provide an API that can be readily adopted,
without having to make the above-mentioned choices.

.. note::

    XND is another array library, written in C, that still needs a Python API.
    Array implementations in other languages are often in a similar situation,
    and could translate this array API standard 1:1 to their language.

.. _use-case-numba:
### Use case 4: make JIT compilation of array computations easier and more robust

[Numba](https://github.com/numba/numba) is a Just-In-Time (JIT) compiler for
numerical functions in Python; it is NumPy-aware. [PyPy](https://pypy.org)
is an implementation of Python with a JIT at its core; its NumPy support relies
on running NumPy itself through a compatibility layer (`cpyext`), while a
previous attempt to implement NumPy support directly was unsuccessful.

Other array libraries may have an internal JIT (e.g., TensorFlow, PyTorch,
JAX, MXNet) or work with an external JIT like
[XLA](https://www.tensorflow.org/xla) or [VTA](https://tvm.apache.org/docs/vta/index.html).

Numba currently has to jump through some hoops to accommodate NumPy's casting rules
and may not attain full compatibility with NumPy in some cases - see, e.g.,
[this](https://github.com/numba/numba/issues/4749) or
[this](https://github.com/numba/numba/issues/5907) example issue regarding (array) scalar
return values.

An [explicit suggestion from a Numba developer](https://twitter.com/esc___/status/1295389487485333505)
for this array API standard was:

> for JIT compilers (e.g. Numba) it will be important, that the type of the
> returned value(s) depends only on the *types* of the input but not on the
> *values*.
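
To see why this matters, note that NumPy's result dtypes have historically not
been purely type-dependent. A small illustration of that value-based casting
behavior follows; the output shown is for NumPy's legacy promotion rules (the
default at the time of writing), and later NumPy versions may behave differently:
```
import numpy as np

x = np.arange(3, dtype=np.uint8)

# The Python int's *value* influences the result dtype:
print((x + 1).dtype)     # uint8  -- 1 fits in uint8
print((x + 1000).dtype)  # uint16 -- 1000 does not fit in uint8
```
A JIT compiler has to determine the output type from the input types alone, so
promotion rules that depend on runtime values are hard to support.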

A concrete goal for this use case is to have better matching between
JIT-compiled and non-JIT execution. Here is an example from the Numba code
base, the need for which should be avoided in the future:

```
def check(x, y):
    got = cfunc(x, y)
    np.testing.assert_array_almost_equal(got, pyfunc(x, y))
    # Check the power operation conserved the input's dtype
    # (this is different from Numpy, whose behaviour depends on
    # the *values* of the arguments -- see PyArray_CanCastArrayTo).
    self.assertEqual(got.dtype, x.dtype)
```
