
[BUG] Some gremlin queries not generating graphs in Air-Routes-Gremlin.ipynb #254


Closed
holleyism opened this issue Feb 2, 2022 · 10 comments
Labels
bug Something isn't working

Comments

@holleyism

Describe the bug
Several of the cells in the Air-Routes-Gremlin.ipynb do not generate results in the graph tab.

To Reproduce
Steps to reproduce the behavior:

  1. Go to Air-Routes-Gremlin.ipynb
  2. Scroll down to the text "The next query also produces a result that is fun to explore using the Graph tab"
  3. Run the "my_node_labels" cell
  4. Run the gremlin query cell
  5. Only the Console and Query Metadata tabs appear; there is no Graph tab.

Expected behavior
A graph tab with interesting results.

Screenshots
image

Desktop (please complete the following information):

  • OS: Ubuntu 20.04
  • Browser: Chrome
  • Version: 97.0.4692.99

Additional context
Latest version of graph-notebook 3.1.1
Backend gremlin-server using instructions here.
Seeded with %seed in notebook.

@holleyism holleyism added the bug Something isn't working label Feb 2, 2022
@krlawrence
Contributor

Hi Adam!

I just tested this against Neptune as the backend and it seems to work OK. Which DB are you using in your case?
image

@holleyism
Author

holleyism commented Feb 2, 2022

Hey! I'm using the gremlin-server backend. I updated the properties per the docs. Is there an easy way to debug? I'm not seeing any errors in the console where the notebook was started. What's weird is some queries work and some don't. I'll dig through the results to see if there's an obvious difference.

@holleyism
Author

It appears that if the output is a simple path, e.g.:
path[YT, e[56072][3640-contains->1670], DZA]
the graph tab is available. When the path contains objects, it isn't, e.g.:
path[{<T.id: 1>: '3', <T.label: 4>: 'airport', 'country': ['US'], 'code': ['AUS'], 'longest': [12250], 'city': ['Austin'], 'elev': [542], 'icao': ['KAUS'], 'lon': [Decimal('-97.66989898681640625')], 'type': ['airport'], 'region': ['US-TX'], 'runways': [2], 'lat': [Decimal('30.19449996948240055871792719699442386627197265625')], 'desc': ['Austin Bergstrom International Airport']}, 722, {<T.id: 1>: '187', <T.label: 4>: 'airport', 'country': ['US'], 'code': ['STL'], 'longest': [11019], 'city': ['St Louis'], 'elev': [618], 'icao': ['KSTL'], 'lon': [Decimal('-90.370002746582002828290569595992565155029296875')], 'type': ['airport'], 'region': ['US-MO'], 'runways': [4], 'lat': [Decimal('38.74869918823240055871792719699442386627197265625')], 'desc': ['Lambert St Louis International Airport']}, 3191, {<T.id: 1>: '217', <T.label: 4>: 'airport', 'country': ['IS'], 'code': ['KEF'], 'longest': [10056], 'city': ['Reykjavik'], 'elev': [171], 'icao': ['BIKF'], 'lon': [Decimal('-22.60560035705569958963678800500929355621337890625')], 'type': ['airport'], 'region': ['IS-2'], 'runways': [2], 'lat': [Decimal('63.98500061035159802713678800500929355621337890625')], 'desc': ['Reykjavik, Keflavik International Airport']}, 870, {<T.id: 1>: '1811', <T.label: 4>: 'airport', 'country': ['GL'], 'code': ['GOH'], 'longest': [3117], 'city': ['Nuuk'], 'elev': [283], 'icao': ['BGGH'], 'lon': [Decimal('-51.6781005858999975544065819121897220611572265625')], 'type': ['airport'], 'region': ['GL-U-A'], 'runways': [1], 'lat': [Decimal('64.1909027100000031396120903082191944122314453125')], 'desc': ['Godthaab / Nuuk Airport']}, 887, {<T.id: 1>: '3017', <T.label: 4>: 'airport', 'country': ['IS'], 'code': ['RKV'], 'longest': [5141], 'city': ['Reykjavik'], 'elev': [48], 'icao': ['BIRK'], 'lon': [Decimal('-21.940599441500001631766281207092106342315673828125')], 'type': ['airport'], 'region': ['IS-1'], 'runways': [3], 'lat': [Decimal('64.129997253400006229639984667301177978515625')], 'desc': ['Reykjavik 
Airport']}]

@holleyism
Author

I have some more information. It appears that if I remove lat and lon from the valueMap, all queries seem to render the graph tab. So it appears that when something like [Decimal('44.123123123')] is in the value of a property, the graph tab does not render. Still looking.

@krlawrence
Contributor

krlawrence commented Feb 3, 2022

Thanks for the updated info. I think that is probably it. We can look into fixes on the graph-notebook side, but for now is it possible to just use Double/Float for those?

@holleyism
Author

I'm not sure how I'd change them. They're coming like that from gremlin. The graphml has it as double. The query I'm running is below. Can you typecast in valueMap?

%%gremlin -p v,oute,inv,oute,inv,oute,inv,oute,inv -g country -d $my_node_labels 
g.V().
  has('code','AUS').
  repeat(outE().inV().where(without('x')).store('x')).
  times(4).
  limit(50).
  path().
    by(valueMap(true,'airport','country','code','type','desc','longest','city','elev','icao','region','runways')).
    by('dist')

@krlawrence
Contributor

Thanks for the follow up. It seems that for some reason the values are getting serialized back as Decimals even though they are created as doubles. Will dig into this more. Seems we have two actions:

  1. Figure out why Gremlin Server is sending back Decimal types.
  2. Add support for Decimal types to the graph-notebook code, which may also require looking at the Jupyter JSON libraries as I don't think they support Decimal today.
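For the second action, here is a minimal sketch of what Decimal handling might look like on the Python side. The standard `json` module raises `TypeError` when given a `decimal.Decimal`, which is consistent with the Graph tab failing to render these results, although this thread does not confirm that this is the exact failure point. The helper name `to_json_safe` and the sample data are hypothetical and are not part of graph-notebook.

```python
import json
from decimal import Decimal


def to_json_safe(value):
    """Recursively replace Decimal values with floats (hypothetical helper)."""
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, dict):
        # Keys such as T.id / T.label are stringified so json can emit them.
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v) for v in value]
    return value


# Shaped like one valueMap entry from the path output earlier in the thread.
airport = {"code": ["AUS"], "lat": [Decimal("30.1944999694824")], "runways": [2]}

try:
    json.dumps(airport)
    raised = False
except TypeError:
    raised = True  # json cannot serialize Decimal out of the box

safe_json = json.dumps(to_json_safe(airport))
print(raised, safe_json)
```

Converting to `float` loses nothing here if the values started life as doubles, but it would not be appropriate for data that genuinely needs arbitrary precision.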

I ran a small test to help figure out at least what that value conversion you are seeing is using:

from decimal import Decimal
# Building a Decimal from a float keeps the float's exact binary value
a = Decimal(64.1299972534)
print(a)

64.129997253400006229639984667301177978515625
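The long digit string is the exact value of the binary double nearest to 64.1299972534. A short sketch of the contrast, using only the standard `decimal` module: building a `Decimal` from a float keeps that exact binary value, building one from a string keeps the digits as typed, and converting back to `float` recovers the original double. The variable names are illustrative only.

```python
from decimal import Decimal

from_float = Decimal(64.1299972534)    # exact expansion of the nearest binary double
from_text = Decimal("64.1299972534")   # keeps the digits as written

print(from_float)
print(from_text)
print(float(from_float) == 64.1299972534)
```

This suggests the lat/lon values in the path output are ordinary doubles that were widened to a decimal type somewhere along the way, rather than values that ever carried extra precision.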

@krlawrence
Contributor

I set up a test environment with the latest graph-notebook and Gremlin Server 3.5.2 running locally. I loaded air routes into the graph using g.io('air-routes.xml').read() from a Gremlin Console and then tried your queries from the notebook, and everything worked fine. My local Java is still Java 8 and my Python version is 3.7.6. I'll keep digging.

@krlawrence
Contributor

krlawrence commented Feb 4, 2022

After much experimentation we figured out the difference. I had originally used g.io() to load data into my Gremlin Server and it was working fine. The GraphML file explicitly states the values are doubles. When loaded via %seed (which uses Gremlin addV steps) the data is getting converted to java.math.BigDecimal form. The short-term workaround for Gremlin Server is to load the air-routes data using g.io() from a Gremlin Console. The %seed command sends Gremlin steps to the Gremlin Server, which uses Groovy to parse the data. It is Groovy that is creating the BigDecimal types. Another workaround would be to edit all of the values in the %seed source files to be explicitly coded as doubles, such as 23.1234d.
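The second workaround (marking values as explicit doubles) could be scripted rather than done by hand. Below is a rough sketch that assumes each seed line is Gremlin text with single-quoted strings and bare decimal literals. The function name `force_double_literals` and the sample line are hypothetical, and real seed files may need extra cases such as exponent notation.

```python
import re

# Bare decimal literal such as -97.6698 that does not already carry a type suffix.
_DECIMAL = re.compile(r"(?<![\w.])(-?\d+\.\d+)(?![\w.])")
# Single-quoted Groovy string, kept as a capture group so split() returns it.
_QUOTED = re.compile(r"('(?:[^'\\]|\\.)*')")


def force_double_literals(line: str) -> str:
    """Append 'd' to unquoted decimal literals so Groovy parses them as doubles."""
    parts = _QUOTED.split(line)
    # Even indexes are outside quotes, odd indexes are the quoted strings.
    for i in range(0, len(parts), 2):
        parts[i] = _DECIMAL.sub(r"\1d", parts[i])
    return "".join(parts)


sample = "g.addV('airport').property('code','AUS').property('lat',30.1944).property('lon',-97.6698)"
converted = force_double_literals(sample)
print(converted)
```

Quoted strings are skipped so that text such as an airport description containing a number is left alone, and literals that already end in `d` are not touched.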

@krlawrence
Contributor

I have pushed a change for csv-gremlin that we can use to re-build the air-routes %seed data. This will stop the problem that @holleyism ran into. We should also update the notebooks to handle Decimal types correctly. awslabs/amazon-neptune-tools#201

Projects
Status: Resolved

3 participants