Taxonomic matching #934

Closed
derek-mba opened this issue Jul 31, 2023 · 7 comments

Comments

@derek-mba

When a record is submitted via the IPT that contains a valid scientificNameID and a scientificName, the scientificNameID should be considered authoritative.

See https://discourse.gbif.org/t/millipedes-in-the-ocean/3991

The core of the problem here is that GBIF is using scientificName instead of scientificNameID (in this case, an AphiaID). The latter should be definitive, and it is correct on the MBA records. scientificName should only be used if scientificNameID is not present. It's true that, for some reason, our scientificName didn't match the scientificNameID, but OBIS harvests these same records and gets the classifications right. (I am a little surprised that EurOBIS, which has very stringent taxonomy checks, had not rejected these records because the scientificName didn't match the AphiaID, but I can hardly blame them for our bad data!)
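To make the requested precedence concrete, here is a minimal sketch of the rule described above: use scientificNameID when it holds a valid identifier, and fall back to scientificName only when it does not. This is not GBIF's or OBIS's actual pipeline. The function names, the return shape, and the assumption that the identifier is a WoRMS-style LSID ending in a numeric AphiaID are all illustrative.

```python
from typing import Optional

# Assumption: scientificNameID carries a WoRMS-style LSID whose last
# segment is the numeric AphiaID. Other identifier schemes are ignored here.
APHIA_LSID_PREFIX = "urn:lsid:marinespecies.org:taxname:"


def parse_aphia_id(scientific_name_id: Optional[str]) -> Optional[int]:
    """Return the numeric AphiaID from an LSID, or None if absent or malformed."""
    if not scientific_name_id:
        return None
    value = scientific_name_id.strip()
    if not value.startswith(APHIA_LSID_PREFIX):
        return None
    suffix = value[len(APHIA_LSID_PREFIX):]
    if suffix.isascii() and suffix.isdigit():
        return int(suffix)
    return None


def choose_match_key(record: dict) -> tuple:
    """Pick the field to match on (hypothetical helper).

    scientificNameID wins whenever it parses; scientificName is used
    only when no valid ID is present.
    """
    aphia_id = parse_aphia_id(record.get("scientificNameID"))
    if aphia_id is not None:
        return ("aphiaID", aphia_id)
    name = (record.get("scientificName") or "").strip()
    if name:
        return ("scientificName", name)
    return ("unmatched", None)
```

With this rule, a record whose scientificName disagrees with its scientificNameID is still classified by the ID; the disagreement would be something to flag back to the publisher rather than resolve silently.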

As for GBIF "fixing" the data, please don't. We're always ready to fix our own once we know there's an issue. Perhaps some data providers do ignore flags, but if this had been brought to our attention earlier, we'd have fixed it (and have done now, though I'm not sure how soon the data will be republished).

@MattBlissett transferred this issue from gbif/ipt Jul 31, 2023
@rubenpp7

rubenpp7 commented Aug 1, 2023

Hi,

Thanks for highlighting this Derek.
In EurOBIS we have an internal check (soon to be part of our public QC tool http://rshiny.lifewatch.be/BioCheck/) that compares the AphiaID under scientificNameID with the value under scientificName. So we do consider it relevant to provide scientificName, carrying the original identification, together with scientificNameID so that this cross-check is possible.
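A cross-check of this kind could be sketched roughly as follows. This is not the EurOBIS or BioCheck implementation. The function names and status strings are hypothetical, and `names_by_aphia_id` is a stand-in for whatever source maps an AphiaID to its registered name.

```python
def _normalise(name: str) -> str:
    """Collapse whitespace and ignore case so trivial differences do not count."""
    return " ".join(name.split()).casefold()


def crosscheck_name(record: dict, names_by_aphia_id: dict) -> str:
    """Compare scientificName with the name registered for the record's AphiaID.

    Returns "ok", "mismatch", "unknown_id", or "missing_id" (hypothetical statuses).
    """
    raw_id = (record.get("scientificNameID") or "").strip()
    # Assumption: the AphiaID is the trailing numeric segment of the identifier.
    tail = raw_id.rsplit(":", 1)[-1]
    if not (tail.isascii() and tail.isdigit()):
        return "missing_id"
    expected = names_by_aphia_id.get(int(tail))
    if expected is None:
        return "unknown_id"
    supplied = record.get("scientificName") or ""
    return "ok" if _normalise(supplied) == _normalise(expected) else "mismatch"
```

A "mismatch" result is the case reported in this issue: the ID is valid, but the name supplied alongside it points somewhere else.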

This issue has, however, given me an idea for improving that taxonomy check by also including the higher classification.

Thank you!

@bart-v

bart-v commented Aug 7, 2023

FYI: GBIF not using the ScientificNameID is a known issue #217
And it's a shame: why are we using PIDs after all then...

@ymgan

ymgan commented Aug 7, 2023

@bart-v I agree. Please see #895 for a different concern that arises when scientificNameID is not interpreted.
It can be confusing for data users, and our data provider was confused about why this happens when they had done their best to provide data with the utmost clarity.

@timrobertson100
Member

I'll close this and link it to the original issue that already captures it: #217

@derek-mba
Author

Please don't close issues as "completed", when they're not. This should have been "merged" into #217.

@timrobertson100
Member

Sorry @derek-mba

GitHub doesn't have a merge option for issues, so I linked them and closed this only to try and keep the discussion together on the original issue. The alternative was to close this using the "won't fix" option.

I'll reopen this

@timrobertson100
Member

Now that #217 has been closed with an implementation, I'll close this again too, as I don't think there is anything here that isn't covered in that thread. Please comment if I am mistaken.


5 participants