TG2-VALIDATION_COORDINATESTERRESTRIALMARINE_CONSISTENT #51

Open
iDigBioBot opened this issue Jan 5, 2018 · 51 comments
Labels
CODED Consistency CORE TG2 CORE tests Parameterized Test requires a parameter SPACE Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2 Validation VOCABULARY

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

TestField Value
GUID b9c184ce-a859-410c-9d12-71a338200380
Label VALIDATION_COORDINATESTERRESTRIALMARINE_CONSISTENT
Description Does the marine/non-marine biome of a taxon from the bdq:sourceAuthority match the biome at the location given by the coordinates?
TestType Validation
Darwin Core Class dcterms:Location
Information Elements ActedUpon dwc:decimalLatitude
dwc:decimalLongitude
Information Elements Consulted dwc:scientificName
Expected Response EXTERNAL_PREREQUISITES_NOT_MET if either bdq:taxonIsMarine or bdq:geospatialLand are not available; INTERNAL_PREREQUISITES_NOT_MET if (1) dwc:scientificName is bdq:Empty or (2) the values of dwc:decimalLatitude or dwc:decimalLongitude are bdq:Empty or (3) if bdq:assumptionOnUnknownBiome is noassumption and the marine/nonmarine status of the taxon is not interpretable from bdq:taxonIsMarine; COMPLIANT if (1) the taxon marine/nonmarine status from bdq:taxonIsMarine matches the marine/nonmarine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:geospatialLand plus an exterior buffer given by bdq:spatialBufferInMeters or (2) if the marine/nonmarine status of the taxon is not interpretable from bdq:taxonIsMarine and bdq:assumptionOnUnknownBiome matches the marine/nonmarine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:geospatialLand plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT
Data Quality Dimension Consistency
Term-Actions COORDINATESTERRESTRIALMARINE_CONSISTENT
Parameter(s) bdq:taxonIsMarine
bdq:geospatialLand
bdq:spatialBufferInMeters
bdq:assumptionOnUnknownBiome
Source Authority bdq:taxonIsMarine default = "World Register of Marine Species (WoRMS)" {[https://www.marinespecies.org/]} {Web service [https://www.marinespecies.org/aphia.php?p=webservice]}
bdq:geospatialLand default = "Union of NaturalEarth 10m-physical-vectors for Land and NaturalEarth Minor Islands" {[https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip], [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_minor_islands.zip]}
bdq:spatialBufferInMeters default = "3000"
bdq:assumptionOnUnknownBiome default = "noassumption"
Specification Last Updated 2024-08-30
Examples [dwc:decimalLatitude="-41.0525925872862", dwc:decimalLongitude="-71.5310546742521", dwc:scientificName="Aegla neuquensis": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="The species is freshwater aquatic and the coordinates fall in a lake and thus COMPLIANT"]
[dwc:decimalLatitude="20.0", dwc:decimalLongitude="-30.0", dwc:scientificName="Viviparus contectus (Millet, 1813)": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:scientificName is non-marine according to bdq:taxonIsMarine but coordinates are marine"]
Source ALA, OBIS
References
Example Implementations (Mechanisms) Kurator/FilteredPush geo_ref_qc Library DOI: 10.5281/zenodo.14064324
Link to Specification Source Code https://github.com/FilteredPush/geo_ref_qc/blob/v2.0.1/src/main/java/org/filteredpush/qc/georeference/DwCGeoRefDQ.java#L3198
Notes dwc:coordinatePrecision and dwc:coordinateUncertaintyInMeters (if present) imply a potential displacement of the provided coordinates. These two terms can be considered spatial buffers. Likewise, country polygons cannot be 100% accurate at all scales (Dooley 2005), so a spatial buffer of the country boundaries is justified. Taking the spatial buffers into account does, however, greatly complicate both the logic and the implementation of such tests. The same applies to potential conversion of the Spatial Reference System (SRS) of dwc:decimalLatitude and dwc:decimalLongitude to the SRS used in the bdq:sourceAuthority. Note that, in the current implementation, tests treat "brackish" in WoRMS as both marine and terrestrial. Note that both bdq:taxonIsMarine and bdq:geospatialLand are bdq:sourceAuthorities, but as they form two parameters, distinct names are used for them.
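
The Expected Response above can be read as an ordered decision procedure. The following is a minimal Python sketch of that reading, not the geo_ref_qc implementation. It assumes the caller has already looked up the taxon in bdq:taxonIsMarine and the buffered location in bdq:geospatialLand. The names `Biome`, `Response` and `coordinates_terrestrialmarine_consistent` are hypothetical, as are the non-default parameter values "marine" and "nonmarine" for bdq:assumptionOnUnknownBiome and the use of sets, which lets a brackish taxon or a point near the coast count as both.

```python
from enum import Enum
from typing import NamedTuple, Optional, Set


class Biome(Enum):
    MARINE = "marine"        # assumed parameter value, not given in the spec
    NONMARINE = "nonmarine"  # assumed parameter value, not given in the spec


class Response(NamedTuple):
    status: str
    result: Optional[str]


def is_empty(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as bdq:Empty."""
    return value is None or value.strip() == ""


def coordinates_terrestrialmarine_consistent(
    scientific_name: Optional[str],
    decimal_latitude: Optional[str],
    decimal_longitude: Optional[str],
    taxon_biomes: Optional[Set[Biome]],
    location_biomes: Optional[Set[Biome]],
    assumption_on_unknown_biome: str = "noassumption",
) -> Response:
    """Evaluate the conditions in the order the Expected Response lists them.

    taxon_biomes: None if bdq:taxonIsMarine is unavailable, an empty set if
    the taxon's status is not interpretable, otherwise its possible biomes.
    location_biomes: None if bdq:geospatialLand is unavailable, otherwise
    the biomes the coordinates may fall in once the buffer is applied.
    """
    if taxon_biomes is None or location_biomes is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET", None)
    if (is_empty(scientific_name) or is_empty(decimal_latitude)
            or is_empty(decimal_longitude)):
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None)
    if not taxon_biomes:
        if assumption_on_unknown_biome == "noassumption":
            return Response("INTERNAL_PREREQUISITES_NOT_MET", None)
        # Fall back to the biome named by the parameter.
        taxon_biomes = {Biome(assumption_on_unknown_biome)}
    matches = bool(taxon_biomes & location_biomes)
    return Response("RUN_HAS_RESULT",
                    "COMPLIANT" if matches else "NOT_COMPLIANT")


# The two Examples above, with the lookups supplied by hand.
print(coordinates_terrestrialmarine_consistent(
    "Aegla neuquensis", "-41.0525925872862", "-71.5310546742521",
    {Biome.NONMARINE}, {Biome.NONMARINE}))
print(coordinates_terrestrialmarine_consistent(
    "Viviparus contectus (Millet, 1813)", "20.0", "-30.0",
    {Biome.NONMARINE}, {Biome.MARINE}))
```

Keeping the lookups outside the function makes the response logic testable without network access to WoRMS or a loaded land layer.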
@iDigBioBot
Collaborator Author

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet:
Should we consider whether we are talking about paleo records?

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
Terrestrial spatial layer isn't a Darwin Core term; it needs to be listed in a different column. Nearshore environments are difficult to work with: the precision of the GIS layers may not be high enough, and coastal boundaries and occurrence locations may need to be buffered as part of a test.

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
OBIS codes taxa as marine, freshwater, brackish, or terrestrial. The definition of terrestrial is unclear here: does it mean on land, or non-marine? Would freshwater taxa be expected to fall inside terrestrial polygons? Some taxa can be in different environments depending on where they are in their lifecycle; freshwater, brackish water, and marine phases are not unusual. Brackish water and nearshore environments tend to make it hard to detect problematic coordinates, particularly without high resolution GIS layers.

@iDigBioBot
Collaborator Author

Comment by Christian Gendreau (@cgendreau) migrated from spreadsheet:
Don't forget it also implies you "understood" the species, so it's extremely difficult to implement. I would definitely not include it in the core.

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
Difficult to implement.

@ArthurChapman
Collaborator

This probably needs further discussion. There are two ways this can be done. Some earlier discussion seemed to indicate that it may be better to use the taxon to decide if it is Marine or not - i.e. go the OBIS way and use WORMS - and basically decide on a Taxon basis. Alternatively (and harder to implement) is to use similar techniques to #73 and use a GIS layer. If the latter, then we probably need to add the 3km buffer - but what about things in estuaries (where coastlines are particularly unreliable). I think I favour using the taxon to decide, using WORMS. Whatever we do it needs to be rewritten.
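
To make the buffer idea concrete, here is a rough Python sketch of a 3 km tolerance test around a coastline. It is only an illustration under simplifying assumptions: the coastline is represented by a list of sampled vertices rather than polygon edges, distances use the haversine formula on a spherical Earth, and the function names `haversine_m` and `within_buffer_of_coast` are hypothetical. A real implementation would more likely buffer the land polygons in a GIS library.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius, spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_buffer_of_coast(lat, lon, coast_vertices, buffer_m=3000.0):
    """True if the point lies within buffer_m of any sampled coast vertex.

    Measuring to vertices only underestimates closeness where vertices are
    sparse, so this is a coarse stand-in for a true polygon buffer.
    """
    nearest = min(haversine_m(lat, lon, clat, clon)
                  for clat, clon in coast_vertices)
    return nearest <= buffer_m


# A toy coastline with a single vertex at the origin.
coast = [(0.0, 0.0)]
print(within_buffer_of_coast(0.01, 0.0, coast))  # roughly 1.1 km away
print(within_buffer_of_coast(0.1, 0.0, coast))   # roughly 11 km away
```

Points that fall inside the buffer could then be treated as consistent with either biome, which is one way to tolerate unreliable coastlines in estuaries.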

@ArthurChapman
Collaborator

Also see comment by @iDigBioBot above

@Tasilee
Collaborator

Tasilee commented May 5, 2020

The test is certainly useful if it can flag species in a wrong location and marine vs non-marine is the most broad first cut.

This test requires (a) the habitat identification of the taxon, (b) the location using dwc:decimalLatitude and dwc:decimalLongitude, and (c) a spatial buffer. (a) could be iffy if it is based on WORMS or IRMNG.

@ArthurChapman
Collaborator

ArthurChapman commented May 6, 2020

Note that if you are using layers for the marine/terrestrial boundaries, the scale of the land/water interface in the EEZ layers on marineregions.org is a lot coarser than that of the GADM country boundaries (and hence their land/marine interface), so I would suggest the latter.

@Tasilee
Collaborator

Tasilee commented May 6, 2020

Is the EEZ relevant here? All we need to know is land vs water? Maybe. Australia has a category called 'External territories' that includes a bunch of islands like Cocos, Heard, McDonald, Norfolk and I guess from https://www.ga.gov.au/scientific-topics/marine/jurisdiction/maritime-boundary-definitions that these are part of the EEZ.

@ArthurChapman
Collaborator

The GADM country layers should have the islands, so the EEZ is probably not as relevant and is at a much worse scale. I would just use GADM.

@ArthurChapman
Collaborator

Looking at the Expected Response in this one - it is not clear if we are using geographic boundaries or relying on taxon IRMNG and the OBIS codes to determine isMarine. As written we seem to be having bets both ways.

@Tasilee
Collaborator

Tasilee commented May 8, 2020

As far as I am concerned, there are at least two bdq:sourceAuthority references here. The first is either WORMS or IRMNG and the second is GADM. A potential third may be EEZs.

I cannot see why we don't have an EXTERNAL_PREREQUISITES_NOT_MET

We again have the potential problem of spatial buffers. As per #73 (and others), buffering makes for serious complications. I would be happier to accept the false positive here if we skip the complexities (and put scenarios with buffers in Notes) as a VALIDATION, than I would for an AMENDMENT as in #73.

@ArthurChapman
Collaborator

It depends on which route you take: if you go the spatial route, then GADM; if you go the taxon route, then IRMNG. I think we have to use one or the other and not both? The IRMNG, I think, includes the OBIS codes for marine, freshwater, brackish, or terrestrial; perhaps these could be used in some way. If you go the spatial route, then you have the problem of fuzzy coastlines (so it is not going to be accurate within 3km if using global layers and not a localised GIS), and the IRMNG may thus be more accurate. It may need to be something that is tested against some real data. If we go the taxon route, then we are at least consistent with OBIS.

@davewatts3

OBIS uses the habitat values from WoRMS, not IRMNG. A taxon can have more than one value. Only those tagged with marine appear in the OBIS portal. OBIS does report on suspect terrestrial locations, but the data still appears. Obviously if it is a seabird it can be inland, migrating or nesting. We have seen terrestrial birds fly south in the southern ocean (likely never to return), yet that is a valid observation. Many observations of marine animals are made from the coast and so appear in the wrong spot. Salt water crocs travel up rivers and so appear inland even if tagged as marine. I think the WoRMS taxonomic editors are more likely to tag the taxa correctly as marine than an approach that uses observation records to define 'marine'.
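
Because a taxon can carry more than one habitat value, a marine/non-marine interpretation has to collapse several flags into a set of possible biomes. The Python sketch below shows one way to do that. It assumes a record shaped like a dictionary with `isMarine`, `isBrackish`, `isFreshwater` and `isTerrestrial` flags (modelled on habitat flags in WoRMS records; treat the exact field names and values as assumptions), and it counts brackish as both marine and non-marine. The function name `interpret_habitat_flags` and the sample records are hypothetical, not real WoRMS data.

```python
def interpret_habitat_flags(record: dict) -> set:
    """Collapse habitat flags into the set of biomes the taxon may occupy.

    Returns an empty set when no flag is set, meaning the marine/non-marine
    status is not interpretable from the record.
    """
    biomes = set()
    if record.get("isMarine"):
        biomes.add("marine")
    if record.get("isFreshwater") or record.get("isTerrestrial"):
        biomes.add("nonmarine")
    if record.get("isBrackish"):
        # Assumption: brackish taxa are accepted on both sides of the coast.
        biomes.update({"marine", "nonmarine"})
    return biomes


# Illustrative records only; the flag values are made up for the example.
print(interpret_habitat_flags({"isMarine": 1, "isFreshwater": 1}))
print(interpret_habitat_flags({"isTerrestrial": 1, "isMarine": 0}))
print(interpret_habitat_flags({"isMarine": None, "isBrackish": None}))
```

A taxon that maps to both biomes can never be NOT_COMPLIANT under this reading, which matches the caution above about seabirds and crocodiles but also means the test says little about such taxa.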

@ArthurChapman
Collaborator

Thanks @davewatts3 - very valuable contribution.

@tucotuco
Member

Even biome is too generic. I vote for leaving it as is.

@ArthurChapman
Collaborator

OK as is - we have TERRESTRIALMARINE defined in #152

@tucotuco
Member

I suggest the Description:

'Does the marine/non-marine biome of a taxon from the bdq:sourceAuthority match the biome at the location given by the coordinates?'

in place of:

'Does the marine/nonmarine status of a taxon from bdq:sourceAuthority[taxonomyismarine] match the location given by the coordinates?'

@Tasilee Tasilee removed the NEEDS WORK label Apr 3, 2022
@Tasilee
Collaborator

Tasilee commented Jun 2, 2022

Source Authority of land and islands merged to "bdq:sourceAuthority[geospatialland] default = the union of "NaturalEarth 10m-physical-vectors for Land" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip] and "NaturalEarth Minor Islands" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_minor_islands.zip]"

@Tasilee
Collaborator

Tasilee commented Jun 13, 2023

Restructured Parameter(s) and Source authority

@ArthurChapman
Collaborator

Updated Expected Response, Parameter(s), Source Authority and Specification Last Updated to

replace:
bdq:sourceAuthority[taxonomyismarine] with bdq:taxonomyIsMarine
bdq:sourceAuthority[geospatialland] with bdq:geospatialLand

@Tasilee
Collaborator

Tasilee commented Jul 11, 2023

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:taxonIsMarine default = "WORMS" [https://www.marinespecies.org/aphia.php?p=webservice]
bdq:geospatialLand default = the spatial union of "NaturalEarth 10m-physical-vectors for Land" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip] and "NaturalEarth Minor Islands" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_minor_islands.zip]
bdq:spatialBufferInMeters default = "3000"

to

bdq:taxonIsMarine default = "World Register of Marine Organisms (WORMS") {[https://www.marinespecies.org/]}
{Web service [https://www.marinespecies.org/aphia.php?p=webservice]}
{bdq:geospatialLand default = The spatial union of "NaturalEarth 10m-physical-vectors for Land" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip] and "NaturalEarth Minor Islands" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_minor_islands.zip]}
bdq:spatialBufferInMeters default = "3000"

@chicoreus
Collaborator

chicoreus commented Jul 11, 2023 via email

@Tasilee
Collaborator

Tasilee commented Sep 7, 2023

This test should have Data Quality Dimension "Consistency" rather than "Conformance". Edited.

@ymgan
Collaborator

ymgan commented Sep 13, 2023

Hi,

bdq:taxonIsMarine default = "World Register of Marine Organisms (WoRMS") {[https://www.marinespecies.org/]} {Web service [https://www.marinespecies.org/aphia.php?p=webservice]}

I believe there is a typo, WoRMS is "World Register of Marine Species", thanks a lot!

@Tasilee
Collaborator

Tasilee commented Sep 14, 2023

Thanks @ymgan. Corrected.

@Tasilee
Collaborator

Tasilee commented Sep 16, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField" and "Output Type" to "TestType".

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023
@Tasilee
Collaborator

Tasilee commented Apr 15, 2024

Changed "was" to "is" to align with standard phrasing in ER as in "INTERNAL_PREREQUISITES_NOT_MET if xxx is EMPTY"

@chicoreus
Collaborator

The default source authority identifier must be a single string, it can't be a list of source authorities. Thus change from:

{bdq:geospatialLand default = The spatial union of "NaturalEarth 10m-physical-vectors for Land" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip] and "NaturalEarth Minor Islands" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_minor_islands.zip]}

To:

{bdq:geospatialLand default = "Union of NaturalEarth 10m-physical-vectors for Land and NaturalEarth Minor Islands" [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip], [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_minor_islands.zip]}
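
As an illustration of the single-string rule, the Python sketch below keeps each parameter default as one string and holds the list of layer downloads in a separate lookup keyed by that string. The class name `CoordinateBiomeParameters`, the `LAND_LAYER_RESOURCES` mapping and the `buffer_m` helper are hypothetical; the default values and URLs are copied from this issue.

```python
from dataclasses import dataclass

NE_BASE = "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/"


@dataclass(frozen=True)
class CoordinateBiomeParameters:
    """Parameter defaults for this test, each held as a single string."""
    taxon_is_marine: str = "World Register of Marine Species (WoRMS)"
    geospatial_land: str = (
        "Union of NaturalEarth 10m-physical-vectors for Land "
        "and NaturalEarth Minor Islands"
    )
    spatial_buffer_in_meters: str = "3000"
    assumption_on_unknown_biome: str = "noassumption"

    def buffer_m(self) -> float:
        """Numeric form of the buffer, parsed from its string value."""
        return float(self.spatial_buffer_in_meters)


# One identifier string may resolve to several downloadable layers.
LAND_LAYER_RESOURCES = {
    CoordinateBiomeParameters().geospatial_land: [
        NE_BASE + "ne_10m_land.zip",
        NE_BASE + "ne_10m_minor_islands.zip",
    ],
}

params = CoordinateBiomeParameters()
print(params.geospatial_land)
print(LAND_LAYER_RESOURCES[params.geospatial_land])
print(params.buffer_m())
```

Separating the identifier from its resources keeps the parameter value comparable as plain text while still recording that the default land layer is a union of two datasets.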

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 17, 2024
…TERRESTRIALMARINE. Including sci_name_qc library as dependency to provide WoRMSService for looking up names and habitats in WoRMS.
chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 17, 2024
@ArthurChapman
Collaborator

Expected Response changed from

EXTERNAL_PREREQUISITES_NOT_MET if either bdq:taxonomyIsMarine or bdq:geospatialLand are not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName is EMPTY or the marine/non-marine status of the taxon is not interpretable from bdq:taxonomyIsMarine or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT if the taxon marine/non-marine status from bdq:taxonomyIsMarine matches the marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:geospatialLand plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT

to

EXTERNAL_PREREQUISITES_NOT_MET if either bdq:taxonomyIsMarine or bdq:geospatialLand are not available; INTERNAL_PREREQUISITES_NOT_MET if (1) dwc:scientificName is EMPTY or (2) the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY or (3) if bdq:assumptionOnUnknownHabitat is NoAssumption and the marine/non-marine status of the taxon is not interpretable from bdq:taxonomyIsMarine; COMPLIANT if (1) the taxon marine/non-marine status from bdq:taxonomyIsMarine matches the marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:geospatialLand plus an exterior buffer given by bdq:spatialBufferInMeters or (2) if the marine/non-marine status of the taxon is not interpretable from bdq:taxonomyIsMarine and the taxon marine/non-marine status from bdq:assumptionOnUnknownHabitat matches the marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:geospatialLand plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT

This was to allow terrestrial values to be COMPLIANT, whereas previously anything not in WoRMS, for example, would report INTERNAL_PREREQUISITES_NOT_MET.

Note added to report that, in the current implementation, tests treat "brackish" in WoRMS as both marine and terrestrial.

Updated Specification Last Updated

@ArthurChapman
Collaborator

Changed "bdq:taxonomyIsMarine" to "bdq:taxonIsMarine" throughout

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 25, 2024
…patial check against land, adding more types of name matches as matching. Adding a minimal unit test.
@chicoreus chicoreus changed the title TG2-VALIDATION_COORDINATES_TERRESTRIALMARINE TG2-VALIDATION_COORDINATESTERRESTRIALMARINE_CONSISTENT Aug 30, 2024
@chicoreus
Collaborator

Changed name to make consistent TERM_ACTION string, with an action.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Nov 10, 2024
@chicoreus chicoreus added the CODED label Mar 5, 2025