-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict fastpath isel indexes to the case of all PandasIndex #10066
Conversation
Thank you for tackling this in a very cool way!!! On main:
On this branch:
On v2024.03.0
It seems that the performance improvements that were brought over 2024 are maintained (I think the 300its/s drop is real, but the main gains were in the 1000s of it/s in my benchmarks). In 2024 there was also a bug that made timestamps very slow to load in which is why my benchmark tests against data where the timestamp is also stored as a float64. import xarray
from tqdm import tqdm
for filename in [
'software_timestamp.nc',
'software_timestamp_float64.nc',
]:
print(filename)
for engine in ['h5netcdf', 'netcdf4']:
for computed in [False, True]:
dataset = xarray.open_dataset(filename, engine=engine)
N_frames = dataset.sizes['frame_number']
N_iters = 100_000
software_timestamp = dataset['software_timestamp']
if computed:
software_timestamp = software_timestamp.compute()
if computed:
computed_str = "computed"
else:
computed_str = "lazy"
desc = f"{engine:10s}{computed_str:10s}"
for i in tqdm(range(N_iters), desc=desc):
software_timestamp.isel(frame_number=i % N_frames)
dataset.close() |
Looking at your benchmark, I think we could (should?) cache We could also cache I think this will provide benchmark results at least on par with the fastest ones above while this will work with all kinds of indexes (both single and multi-coordinate) and without duplicating |
whats-new.rst
Implement #10063 (comment). Since we use
type(idx) is ...
instead ofisinstance(idx, ...)
this should be even faster.The only performance degradation would be in the case of single coordinate custom indexes implemented in 3rd-party libraries, which could go through the fastpath as well but is harder to detect, but I think that's ok for now.
@hmaarrfk @dcherian