Description of a taxonomic entity in RDF #359
Comments
I think the underlying answer to these questions has to do with the standards process as it exists in TDWG. As an RDF user, you can use Darwin Core properties to say anything you want. But if you don't use them the same way as others, nobody will understand what you mean. It would be the job of TDWG to define how those properties should be used in a stable way that makes sense to most possible users rather than just one person. As I've said in other responses, using Darwin Core terms in RDF was an afterthought. So when we wrote the RDF Guide, a question in our minds was "can we describe a way to use these terms that is not likely to change in the future?" Once a normative change to a standard is adopted, the Maintenance Group is required to assess all future changes to determine if they will disrupt the stability of the standard. If terms are required to be used in a certain way and the Maintenance Group changes that, things will break. (This is the "stability" requirement discussed in section 3.1 of the Vocabulary Maintenance Specification.) So standards maintainers are reluctant to adopt changes to a standard that are not likely to be stable. In the case of the Taxon class terms, at the time the RDF Guide was created, there had not been enough work done on modeling taxa/taxonConcepts/TNUs for there to be a consensus on how they should be described in RDF. So it seemed best not to attempt to prescribe how Darwin Core terms should be used in that way, given that those terms were really designed with tabular data users in mind. If we had tried to hack together a way to use those terms in RDF, it probably would not have been stable. Since the RDF Guide was written, a task group has been formed to create a robust model for taxa/taxonConcepts/TNUs. You can find their work here and here. If you are interested in this modeling work, the task group is open to anyone who wants to participate.
The idea of "convenience" properties is described in detail in section 2.7 of the RDF Guide, so I won't go into it here. But the main point is that certain sets of properties in Darwin Core are not intended to be used to create descriptions of resources. Rather, they are intended as aids for searching. For example, imagine that I have a table with a description of 1000 insects. In each row, I provide a value of "Insecta" for dwc:class and "Arthropoda" for dwc:phylum. Is my intention really to describe the relationship between the class Insecta and the phylum Arthropoda 1000 times? That's silly; one person only needs to do that once. The reason I include those values is so that when someone is searching records in GBIF or some other aggregator, they can easily search for insect or arthropod records. So indicating that the string-valued Taxon terms should be used with an Identification instance is a hack to get around the fact that we don't yet have a system for robustly defining taxa in RDF. If we provide a bunch of literal values for taxon-related convenience terms, what we are really doing is saying, "I'd like to link to a permanent IRI of a taxon that's described well in RDF, but since I can't because it doesn't exist yet, here are a bunch of search terms that you could use to find it in the future if someday it is created." The term dwciri:toTaxon was provided in the RDF Guide to make this linking possible at some point in the future.
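To make the contrast concrete, here is a minimal Python sketch, using only the standard library, that emits N-Triples for both patterns: literal convenience terms repeated on every Identification, versus a single dwciri:toTaxon link per Identification. The `example.org` IRIs and the helper names are hypothetical and exist only for illustration; the `dwc:` and `dwciri:` namespace IRIs are the ones used in the RDF Guide. It does not attempt to describe the taxon itself, since how to do that in RDF is exactly what has not been settled.

```python
# Namespace IRIs used by the Darwin Core RDF Guide.
DWC = "http://rs.tdwg.org/dwc/terms/"
DWCIRI = "http://rs.tdwg.org/dwc/iri/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical IRIs, for illustration only (no such taxon IRI is assumed to exist).
TAXON_IRI = "http://example.org/taxon/Insecta"
ID_BASE = "http://example.org/identification/"


def iri(value):
    """Format an IRI as an N-Triples term."""
    return f"<{value}>"


def literal(value):
    """Format a plain string as an N-Triples literal, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def convenience_triples(identification):
    """Literal-valued convenience terms, repeated on every Identification."""
    subject = iri(identification)
    return [
        (subject, iri(RDF_TYPE), iri(DWC + "Identification")),
        (subject, iri(DWC + "class"), literal("Insecta")),
        (subject, iri(DWC + "phylum"), literal("Arthropoda")),
    ]


def linked_triples(identification):
    """One IRI-valued link to a taxon that would be described once, elsewhere."""
    subject = iri(identification)
    return [
        (subject, iri(RDF_TYPE), iri(DWC + "Identification")),
        (subject, iri(DWCIRI + "toTaxon"), iri(TAXON_IRI)),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# 1000 insect identifications, as in the table example above.
identifications = [f"{ID_BASE}{n}" for n in range(1, 1001)]
convenience = [t for i in identifications for t in convenience_triples(i)]
linked = [t for i in identifications for t in linked_triples(i)]

print(to_ntriples(convenience[:3]))
print(to_ntriples(linked[:2]))
```

In the first pattern the strings "Insecta" and "Arthropoda" appear on all 1000 Identifications purely as search aids; in the second, each Identification carries one link, and anything about the taxon would be stated once at the taxon's own IRI.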
Thank you, Steve, for the detailed explanation. My use case involves trying to map a description of Taxon and taxon names to DWC, in RDF (https://www.sandre.eaufrance.fr/urn.php?urn=urn:sandre:dictionnaire:APT:FRA:::ressource:2.1:::pdf, pages 45 and 46 for UML diagrams). I understand from your explanation that this is simply outside the scope of Darwin Core in RDF, correct? (As you said, I could always do it, but I would not be conformant to DWC, and this is not what I want.) But if I were to try to map the same model to DWC in XML, it would be in scope of DWC, correct? (The XML guide at https://dwc.tdwg.org/xml/ shows examples of dwc:Taxon XML elements with dwc:scientificName, family, order, class, genus, etc.) So the same set of terms, when used in different serializations, has different usage rules? Some have tried to embed DWC in JSON-LD in their webpages to describe Taxon, like https://inpn.mnhn.fr/espece/cd_nom/20704 (look at the source starting at line 195; I think the JSON-LD is incorrect and mixes DWC with schema.org, but that's not the point here). This is not consistent with using DWC in RDF, correct? If my understanding is correct (thanks to your explanations!), then that's a major pitfall in using DWC.
I think that it would be great for you to bring your use case to the Taxon Names and Concepts task group (@nielsklazenga is the convener). The kind of modeling you are trying to do is similar to what they want to enable with the development of that standard, and I believe that a robust RDF model is more likely to come from that group than from Darwin Core. I think it has not yet been determined exactly how the new TNC standard would be used together with Darwin Core, but I think your use case wouldn't really need Darwin Core if you had the new TNC standard. I'm familiar with the attempt to use schema.org and JSON-LD that you mentioned. My feeling is that their approach is more of a "quick and dirty" Linked Data approach (to make data available easily to clients that "understand" schema.org) rather than a robust Semantic Web approach that depends on careful modeling. The places in their JSON-LD where I see them actually using
@tfrancart, I think you are right that there is no reason why you could not use We have just had a Task Group approved to produce a new version of TCS, which will be written up as a vocabulary standard just like Darwin Core. Our repository should be up in another week or so. We'll be most happy to have your input on how to link everything up. In the meantime, this publication by Viktor Senderov and others might be useful. Also our earlier discussions in the TNC Repository.
This extract from https://dwc.tdwg.org/rdf/ §2.7.4 leaves me skeptical:
If I understand properly what is written here:
I don't understand the logical entailment between the two propositions above. Why would the absence of defined object properties to describe a Taxon prevent the use of string properties to describe a Taxon?
Besides, as DWC properties do not define a range, why couldn't I use them as object properties?
I don't understand why, as a user of an RDF serialization, I should be forced to make different choices of property-class associations than a user of an XML-based serialization.
See the related discussion about domain/range specification of properties here: #357
Sorry if these remarks are obvious or out of scope here; I come from the RDF/OWL world, am pretty new to DWC, and am trying to sort out how best to use DWC in RDF. Please also direct me to other channels if this is not the best place to raise these topics.