feat: Clickhouse offline store #4725

Merged

iamhatesz
Contributor

What this PR does / why we need it:

This PR adds a new contrib offline store backed by Clickhouse.

Which issue(s) this PR fixes:

Lack of Clickhouse support :)

Misc

The implementation is heavily based on the Postgres store and was tested against it. The resulting features were identical, to the extent that this is possible with two different backends (e.g., allowing for different data types).

I added a helper to run integration tests: make test-python-universal-clickhouse-offline. Unfortunately, 3 test cases are failing:

ERROR sdk/python/tests/integration/registration/test_universal_types.py::test_feature_get_historical_features_types_match[TypeTestConfig(feature_dtype='float', feature_is_list=True, has_empty_list=False)-ParameterSet(values=(LOCAL:Clickhouse:RedisOnlineStoreCreator:python_fs:False,), marks=[], id=None)] - TypeError: object of type 'NoneType' has no len()
ERROR sdk/python/tests/integration/registration/test_universal_types.py::test_feature_get_historical_features_types_match[TypeTestConfig(feature_dtype='bool', feature_is_list=True, has_empty_list=False)-ParameterSet(values=(LOCAL:Clickhouse:RedisOnlineStoreCreator:python_fs:False,), marks=[], id=None)] - TypeError: object of type 'NoneType' has no len()
ERROR sdk/python/tests/integration/registration/test_universal_types.py::test_feature_get_historical_features_types_match[TypeTestConfig(feature_dtype='datetime', feature_is_list=True, has_empty_list=False)-ParameterSet(values=(LOCAL:Clickhouse:RedisOnlineStoreCreator:python_fs:False,), marks=[], id=None)] - TypeError: object of type 'NoneType' has no len()

This is because Clickhouse doesn't support the Nullable(Array(...)) type. I could have added test_universal_types to the ignore list, but I thought it was worth keeping enabled, since many other test cases from this test are passing.
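
To illustrate the limitation, here is a minimal sketch of a type-mapping helper. The names `SCALAR_TYPES`, `clickhouse_column_type` and `list_length` are hypothetical and are not taken from this PR. The sketch assumes list features are declared as plain `Array(...)` columns, while scalar features can be wrapped in `Nullable(...)`. It also shows how a missing list value can lead to the `TypeError` seen in the log above when `len()` is called on `None`.

```python
from typing import Optional

# Hypothetical mapping from a feature dtype name to a ClickHouse type name.
# The keys are illustrative and are not the names used in this PR.
SCALAR_TYPES = {
    "int32": "Int32",
    "int64": "Int64",
    "float": "Float64",
    "bool": "Bool",
    "string": "String",
    "datetime": "DateTime64(6)",
}


def clickhouse_column_type(dtype: str, is_list: bool) -> str:
    """Return a ClickHouse column type for a feature (illustrative only)."""
    base = SCALAR_TYPES[dtype]
    if is_list:
        # ClickHouse rejects Nullable(Array(...)), so a list column cannot
        # hold NULL; a missing list has to be represented some other way,
        # for example as an empty array.
        return f"Array({base})"
    return f"Nullable({base})"


def list_length(value: Optional[list]) -> int:
    """Length of a list feature, treating a missing value as empty."""
    # len(None) raises "TypeError: object of type 'NoneType' has no len()",
    # so guard against None before measuring the list.
    return 0 if value is None else len(value)


if __name__ == "__main__":
    print(clickhouse_column_type("float", is_list=True))
    print(clickhouse_column_type("float", is_list=False))
    print(list_length(None))
```

Whether the store in this PR maps missing lists to empty arrays is an assumption of the sketch, not something stated in the conversation.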

Signed-off-by: Tomasz Wrona <tomasz@cast.ai>
@iamhatesz iamhatesz requested a review from a team as a code owner October 31, 2024 13:31
Signed-off-by: Tomasz Wrona <tomasz@cast.ai>
@zerafachris

Hi @iamhatesz, I patched the requirements on your branch to hopefully fix the errors being thrown by the checks.

Merge at your earliest convenience.

Looking forward to using this. ATM, running on PG and would like to move to CH asap

@franciscojavierarceo
Member

Bunch of stuff failing here, think you also need to rebase. Let me know if you need some help!

@iamhatesz
Contributor Author

@zerafachris @franciscojavierarceo thanks! I will take a look and push some updates.

…fline-store

# Conflicts:
#	sdk/python/docs/source/feast.infra.online_stores.contrib.rst
#	sdk/python/requirements/py3.10-ci-requirements.txt
#	sdk/python/requirements/py3.10-requirements.txt
#	sdk/python/requirements/py3.11-ci-requirements.txt
#	sdk/python/requirements/py3.11-requirements.txt
#	sdk/python/requirements/py3.9-ci-requirements.txt
#	sdk/python/requirements/py3.9-requirements.txt
#	setup.py
Signed-off-by: Tomasz Wrona <tomasz@cast.ai>
@iamhatesz iamhatesz force-pushed the feast-clickhouse-offline-store branch from 97678aa to bcb90ca Compare February 11, 2025 16:59
@iamhatesz
Contributor Author

Hey @franciscojavierarceo! I fixed some issues related to Python 3.9, regenerated lock files and merged the latest changes. Could you please approve the workflows to see if the tests are passing now?

@iamhatesz
Contributor Author

FAILED sdk/python/tests/integration/online_store/test_universal_online.py::test_async_online_retrieval_with_event_timestamps_dynamo[ParameterSet(values=(LOCAL:File:dynamodb:python_fs:False,), marks=[], id=None)] - botocore.exceptions.HTTPClientError: An HTTP Client raised an unhandled exception: Event loop is closed

This failure doesn't seem to be related to the content of this PR.

@laurynas-stasys

Would love to have this functionality as part of Feast ❤️

@iamhatesz
Contributor Author

Hey @franciscojavierarceo! Could you please help me find a reviewer for this PR?

…fline-store

# Conflicts:
#	sdk/python/docs/source/feast.infra.offline_stores.contrib.rst
#	sdk/python/docs/source/feast.infra.utils.rst
#	sdk/python/requirements/py3.10-ci-requirements.txt
#	sdk/python/requirements/py3.10-requirements.txt
#	sdk/python/requirements/py3.11-ci-requirements.txt
#	sdk/python/requirements/py3.11-requirements.txt
#	sdk/python/requirements/py3.9-ci-requirements.txt
#	sdk/python/requirements/py3.9-requirements.txt
#	setup.py
Signed-off-by: Tomasz Wrona <tomasz@cast.ai>
@iamhatesz iamhatesz force-pushed the feast-clickhouse-offline-store branch from 57c04ff to 80bf5d7 Compare March 6, 2025 10:13
@franciscojavierarceo
Member

Yes, reviewing now!

@franciscojavierarceo
Member

For the failed test cases, can you mark them as ignored for Clickhouse? It's probably worth documenting that Nullable Arrays aren't supported, yeah?

Apologies for the delay on this, frankly, this is embarrassing on my part. Do feel free to tag me directly going forward. I'll get this addressed ASAP, thank you for the contribution. 🙏

Signed-off-by: Tomasz Wrona <tomasz@cast.ai>
@iamhatesz
Contributor Author

Hey @franciscojavierarceo! Thanks for the review. I hope I addressed all your concerns. I had to add the ignore rule as part of a fixture, since pytest -k doesn't seem to work with complex test case names (e.g., with parentheses). I hope it's not a problem, since I based it on a similar hack for Redshift.
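
The fixture-based ignore rule described above could look roughly like the sketch below. It is not the code from this PR: `FAILING_DTYPES` and `should_skip` are hypothetical names, and the matching relies only on the parametrized test IDs shown in the failure log earlier in this thread. The decision is kept in a plain function so it can be checked without running pytest.

```python
# Dtypes whose list variants fail on Clickhouse, per the log in this thread.
FAILING_DTYPES = ("float", "bool", "datetime")


def should_skip(test_name: str) -> bool:
    """Decide from a parametrized test ID whether to skip the case.

    Hypothetical helper: plain substring checks are used here, so the
    parentheses and quotes in the ID need no special escaping.
    """
    if "Clickhouse" not in test_name:
        return False
    if "feature_is_list=True" not in test_name:
        return False
    return any(f"feature_dtype='{d}'" in test_name for d in FAILING_DTYPES)


if __name__ == "__main__":
    example = (
        "test_feature_get_historical_features_types_match["
        "TypeTestConfig(feature_dtype='float', feature_is_list=True, "
        "has_empty_list=False)-ParameterSet(values=("
        "LOCAL:Clickhouse:RedisOnlineStoreCreator:python_fs:False,), "
        "marks=[], id=None)]"
    )
    print(should_skip(example))
```

In a conftest.py, an autouse fixture could take pytest's `request` fixture and call `pytest.skip(...)` whenever `should_skip(request.node.name)` returns true. That wiring is an assumption about how such a rule might be hooked up, not a description of the Redshift or Clickhouse fixtures in the repository.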

@masterlexa

Hello, we are really looking forward to this feature. Can you tell us when it will be added?

Member

@franciscojavierarceo franciscojavierarceo left a comment

lgtm!

@franciscojavierarceo
Member

@iamhatesz sorry one last conflict needs to be resolved 😞

…fline-store

# Conflicts:
#	sdk/python/requirements/py3.10-ci-requirements.txt
#	sdk/python/requirements/py3.11-ci-requirements.txt
#	sdk/python/requirements/py3.9-ci-requirements.txt
Signed-off-by: Tomasz Wrona <tomasz@cast.ai>
@iamhatesz
Contributor Author

@franciscojavierarceo merged the latest changes and regenerated lock files with make lock-python-dependencies-all but it also updates all packages to their latest versions. After updating torch (which is completely irrelevant to this PR), two unit tests failed due to missing wheels. Let me know how I can fix this. How do you guys update the lock files if not with that command?

The e2e test failure seems unrelated as well.

@franciscojavierarceo
Member

franciscojavierarceo commented Mar 12, 2025

@ntkathole @jyejare

@ntkathole
Contributor

ntkathole commented Mar 12, 2025

> @franciscojavierarceo merged the latest changes and regenerated lock files with make lock-python-dependencies-all but it also updates all packages to their latest versions. After updating torch (which is completely irrelevant to this PR), two unit tests failed due to missing wheels. Let me know how I can fix this. How do you guys update the lock files if not with that command?
>
> The e2e test failure seems unrelated as well.

@iamhatesz can you please pin torch to version 2.2.2 in setup.py and pyproject.toml? This is happening because PyTorch 2.2.x is the last version that supports macOS x64.

Signed-off-by: Tomasz Wrona <tomasz@cast.ai>
@iamhatesz
Contributor Author

@ntkathole done

Contributor

@ntkathole ntkathole left a comment

Looks good!

@franciscojavierarceo franciscojavierarceo merged commit 86794c2 into feast-dev:master Mar 12, 2025
32 checks passed
7 participants