feat: rebuild with pydantic #6

Merged
merged 2 commits into from
Jan 12, 2025

@kengz (Owner) commented Jan 12, 2025

How does it work

Feature Transform builds an sklearn ColumnTransformer, together with its estimators and pipelines, from a spec file whose entries map 1-1 onto sklearn objects:

  1. The spec is defined with Pydantic models in feature_transform/validator/. These define:
    • spec: the Estimator, Pipeline, and ColumnTransformer
  2. The shape of the spec's transformers entries selects the constructor:
    1. transformers=list[(name, transformer, columns)] uses ColumnTransformer
    2. transformers=list[(transformer, columns)] uses make_column_transformer, with auto-generated names
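The choice between the two constructors can be sketched with plain sklearn. This is only an illustration of the rule above, not the project's code: `build_column_transformer` is a hypothetical helper, and the column names are made up.

```python
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_column_transformer(transformers):
    """Pick the sklearn constructor from the tuple shape (hypothetical helper)."""
    lengths = {len(t) for t in transformers}
    if lengths == {3}:
        # (name, transformer, columns): names are given explicitly
        return ColumnTransformer(transformers=transformers)
    if lengths == {2}:
        # (transformer, columns): sklearn derives names from the class names
        return make_column_transformer(*transformers)
    raise ValueError("expected all 3-tuples or all 2-tuples")


# Column names below are illustrative assumptions.
named = build_column_transformer([
    ("num", StandardScaler(), ["age", "income"]),
    ("cat", OneHotEncoder(), ["city"]),
])
unnamed = build_column_transformer([
    (StandardScaler(), ["age", "income"]),
    (OneHotEncoder(), ["city"]),
])

named_names = [name for name, _, _ in named.transformers]
unnamed_names = [name for name, _, _ in unnamed.transformers]
print(named_names)
print(unnamed_names)
```

In the second case the names come from sklearn's make_column_transformer, which lowercases each transformer's class name.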

See the pydantic spec definition for more details.

Guiding principles

The design of Feature Transform follows these principles:

  1. simple: the module spec is straightforward:
    1. it is just an sklearn class name with kwargs.
    2. it supports official sklearn estimators, Pipeline, and custom modules registered via ft.register_class
  2. expressive: it can build both simple and advanced ColumnTransformers easily
  3. portable: it returns a ColumnTransformer that can be used anywhere; it is not a framework.
  4. parametrizable: defining feature transformation as data unlocks fast experimentation, e.g. by building logic for hyperparameter / data feature search
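A minimal sketch of the "class name with kwargs" idea, and of why a data-defined spec is easy to vary in a search. Everything here is an assumption made for illustration: the `{"ClassName": {kwargs}}` spec shape, the `REGISTRY` dict, and the `build` / `build_pipeline` helpers are hypothetical and are not the project's schema or its ft.register_class API.

```python
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical name-to-class registry; a stand-in, not ft.register_class.
REGISTRY = {cls.__name__: cls for cls in (SimpleImputer, StandardScaler, MinMaxScaler)}


def build(spec):
    """Turn {"ClassName": {kwargs}} into an instance (assumed spec shape)."""
    ((name, kwargs),) = spec.items()
    return REGISTRY[name](**(kwargs or {}))


def build_pipeline(step_specs):
    """Build a Pipeline, naming each step after its lowercased class name."""
    steps = [build(s) for s in step_specs]
    return Pipeline([(type(s).__name__.lower(), s) for s in steps])


pipe = build_pipeline([
    {"SimpleImputer": {"strategy": "median"}},
    {"StandardScaler": {"with_mean": False}},
])

# Because the spec is plain data, search candidates are just edited copies.
variants = [
    build_pipeline([{"SimpleImputer": {"strategy": s}}, {"MinMaxScaler": None}])
    for s in ("mean", "median")
]
```

The result is an ordinary sklearn Pipeline, so it can be fitted or embedded in a ColumnTransformer like any hand-written one.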

BREAKING CHANGE: rebuild with pydantic
@kengz merged commit c4d721c into main on Jan 12, 2025
2 checks passed
@kengz deleted the v2 branch on January 12, 2025 05:46

🎉 This PR is included in version 1.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
