Skip to content

refactor: break down gbq.py file to several smaller ones #909

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

sycai
Copy link
Contributor

@sycai sycai commented Apr 22, 2025

The intention of this PR is to re-organize the code to make it easier to implement the dry_run feature: #585

The file gbq.py contains too many lines which makes it hard for code navigations.

Change summary:

  • GbqConnector and its helper functions are relocated from gbq.py to a new file gbq_connector.py
  • Exceptions defined in gbq.py are relocated to exceptions.py file.
  • Context class and context global instance are relocated from gbq.py to a new file context.py.

These items are all imported back to the gbq.py file to avoid breaking imports of our downstream products.

@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. api: bigquery Issues related to the googleapis/python-bigquery-pandas API. labels Apr 22, 2025
@sycai sycai marked this pull request as ready for review April 22, 2025 21:00
@sycai sycai requested review from a team as code owners April 22, 2025 21:00
@sycai sycai requested a review from tswast April 22, 2025 21:00
import pandas_gbq.exceptions
from pandas_gbq.exceptions import GenericGBQException, QueryTimeout
from pandas_gbq.contexts import context
from pandas_gbq.exceptions import (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be explicit about what aliases we are keeping around with __all__ + a comment explaining why these "unused" aliases are here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__all__ may give us extra burden due to potential star imports. I used # noqa to mark all unused imports instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're refactoring, I wonder if we actually want GbqConnector at all for our main implementation? It seems like one of those "you don't need a class" situations https://www.youtube.com/watch?v=o9pEzgHorH0 since we aren't actually benefiting from any object oriented isms here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline. We won't address this issue for now.

That said, GbqConnector does seem like an over-engineered OOP class.

@sycai sycai added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Apr 23, 2025
@sycai sycai requested a review from tswast April 23, 2025 21:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-pandas API. do not merge Indicates a pull request not ready for merge, due to either quality or timing. size: xl Pull request size is extra large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants