Neptune client #104
Conversation
… into one `client` object.
- Add a builder object to facilitate creating the client with various options
- Remove specification of `iam_credentials_provider_type` and instead make use of the default boto3 session for obtaining AWS credentials (as we do for the SageMaker integration)
- Organize all tests using pytest to more easily filter which tests should be run

The Neptune client can be built directly with its constructor:

```python
from graph_notebook.neptune.client import Client

c = Client(host=foo)  # 'foo' stands in for your Neptune endpoint host
c.status()
```

It can also be created using our builder class:

```python
from botocore.session import get_session
from graph_notebook.neptune.client import ClientBuilder

# 'config' is the loaded graph-notebook configuration
builder = ClientBuilder() \
    .with_host(config.host) \
    .with_port(config.port) \
    .with_region(config.aws_region) \
    .with_tls(config.ssl) \
    .with_iam(get_session())

c = builder.build()
c.status()
```

The `Client` object has some components which are Neptune-specific, and some which are not:

**Not Neptune-specific**

- `sparql` - takes any SPARQL query and determines whether it should be issued as type `query` or type `update`
- `sparql_query` - sends a query request to the configured SPARQL endpoint with the payload `{'query': 'YOUR QUERY'}`
- `sparql_update` - sends an update request to the configured SPARQL endpoint with the payload `{'update': 'YOUR QUERY'}`
- `do_sparql_request` - submits the given payload to the configured SPARQL endpoint
- `get_gremlin_connection` - returns a websocket connection to the configured Gremlin endpoint
- `gremlin_query` - obtains a new Gremlin connection and submits the given query; the opened connection is closed after the query results are obtained
- `gremlin_http_query` - executes the given Gremlin query via http(s) instead of websocket
- `gremlin_status` - returns the status of running Gremlin queries on the configured Neptune endpoint; takes an optional `query_id` input to obtain the status of a specific query

**Neptune-specific**

- `sparql_explain` - obtains an explain query plan for the given SPARQL query (can be of type update or query)
- `sparql_status` - returns the status of running SPARQL queries on the configured Neptune endpoint; takes an optional `query_id` input to obtain the status of a specific query
- `sparql_cancel` - cancels the running SPARQL query with the provided `query_id`
- `gremlin_cancel` - cancels the running Gremlin query with the provided `query_id`
- `gremlin_explain` - obtains an explain query plan for a given Gremlin query
- `gremlin_profile` - obtains a profile query plan for a given Gremlin query
- `status` - retrieves the status of the configured Neptune endpoint
- `load` - submits a new bulk load job with the provided parameters
- `load_status` - obtains the status of the bulk loader; takes an optional `query_id` to obtain the status of a specific loader job
- `cancel_load` - cancels the provided bulk loader job id
- `initiate_reset` - obtains a token needed to execute a fast reset of your configured Neptune endpoint
- `perform_reset` - takes a token obtained from `initiate_reset` and performs the reset
- `dataprocessing_start` - starts a NeptuneML dataprocessing job with the provided parameters
- `dataprocessing_job_status` - obtains the status of a given dataprocessing job id
- `dataprocessing_status` - obtains the status of the configured Neptune dataprocessing endpoint
- `dataprocessing_stop` - stops the given dataprocessing job id
- `modeltraining_start` - starts a NeptuneML modeltraining job with the provided parameters
- `modeltraining_job_status` - obtains the status of a given modeltraining job id
- `modeltraining_status` - obtains the status of the configured Neptune modeltraining endpoint
- `modeltraining_stop` - stops the given modeltraining job id
- `endpoints_create` - creates a NeptuneML endpoint with the provided parameters
- `endpoints_status` - obtains the status of a given endpoint job
- `endpoints_delete` - deletes a given endpoint id
- `endpoints` - obtains the status of all endpoints for the configured Neptune database
- `export` - helper function to call the Neptune exporter for NeptuneML; note that this is not a Neptune endpoint
- `export_status` - obtains the status of the configured exporter endpoint
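For orientation, here is a minimal, hedged usage sketch exercising a few of the methods listed above. The host, port, and queries are placeholders rather than values from this PR, and the exact shape of each return value is not shown; treat this as an illustration of the client's surface area, not a definitive example.

```python
from graph_notebook.neptune.client import ClientBuilder

# Placeholder endpoint details; substitute your own Neptune cluster values.
c = ClientBuilder() \
    .with_host('my-neptune-cluster.cluster-abc123.us-east-1.neptune.amazonaws.com') \
    .with_port(8182) \
    .with_tls(True) \
    .build()

# Neptune instance status.
print(c.status())

# SPARQL: `sparql` inspects the query text and routes it as a query or an update.
print(c.sparql('SELECT * WHERE { ?s ?p ?o } LIMIT 5'))

# Gremlin over websocket: a connection is opened, the query submitted, then closed.
print(c.gremlin_query("g.V().limit(5)"))

# Bulk loader status; pass a load id for a specific job, or omit it for an overview.
print(c.load_status())
```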
  query_check_for_airports = "g.V('3684').outE().inV().has(id, '3444')"
- res = do_gremlin_query(query_check_for_airports, self.host, self.port, self.ssl, self.client_provider)
+ res = self.client.gremlin_query(query_check_for_airports)
It would be better not to use explicit ID values here, in case the data set ever changes and that route gets deleted. I am not sure exactly what is needed, but a different test might be more future-proof.
Yeah, this was put in place to ensure that all airports were added (by checking for the last one); we could instead rewrite it to look for the content.
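For illustration only, a sketch of what a content-based check might look like using the new `Client` API from this PR. The property key `'code'`, the edge label `'route'`, and the specific airport codes are assumptions about the test data set, not values taken from this change:

```python
# Hypothetical content-based assertion: confirm the route exists by airport codes
# rather than by explicit vertex IDs ('code' and 'route' are assumed dataset names).
query_check_for_route = "g.V().has('code','AUS').out('route').has('code','ANC').count()"
res = self.client.gremlin_query(query_check_for_route)
# Expect exactly one such route; res is assumed to be the list of traversal results.
self.assertEqual(1, res[0])
```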
Looks like this PR could fix one reported bug:
Looks good to me.