Change term - move typeStatus from Identification to MaterialEntity #525

Open
nielsklazenga opened this issue Sep 23, 2024 · 21 comments

@nielsklazenga
Member

nielsklazenga commented Sep 23, 2024

Term change

  • Submitter: Niels Klazenga
  • Efficacy Justification (why is this change necessary?): The placement of typeStatus in the Identification class is problematic, as discussed in New term - typifiedName #28, but until now there was no better place to put it. The new MaterialEntity class, however, is a very good fit.
  • Demand Justification (if the change is semantic in nature, name at least two organizations that independently need this term): N/A
  • Stability Justification (what concerns are there that this might affect existing implementations?): N/A
  • Implications for dwciri: namespace (does this change affect a dwciri term version)?: N/A

Current Term definition: https://dwc.tdwg.org/list/#dwc_typeStatus

Proposed attributes of the new term version (Please put actual changes to be implemented in bold and strikethrough):

  • Term name (in lowerCamelCase for properties, UpperCamelCase for classes): typeStatus
  • Term label (English, not normative): Type Status
  • Organized in Class (e.g., Occurrence, Event, Location, Taxon): MaterialEntity (changed from Identification)
  • Definition of the term (normative): A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject.
@tucotuco
Member

@nielsklazenga Would you also go so far as to change the definition to be explicit? "A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the dwc:MaterialEntity"?

@nielsklazenga
Member Author

@tucotuco, yes, that makes sense. However, if we are going to change the definition, I would make another small change:

A list (concatenated and separated) of nomenclatural types (at a minimum type of type and typified scientific name) applied to the dwc:MaterialEntity.

I think we should leave the publication out of the definition, as nine out of ten times it is the same as the namePublishedIn publication and also in the case of lectotypes, neotypes, epitypes and conserved types, where it is different, it would normally not be provided in the typeStatus string. I would consider it metadata of the Nomenclatural Type, rather than really part of it. The publication I cite in my type status annotations on the specimen is the publication of the name.

Also, while we are at it, in the examples holotype of Pinus abies should be changed to holotype of Pinus abies L. and holotype of Picea abies should be deleted as Picea abies (L.) H.Karst. is a combination of Pinus abies L. The other one I would change to holotype of Ctenomys sociabilis Pearson & Christie, 1985. Pearson O. P., and M. I. Christie. 1985. Historia Natural, 5(37):388, but I am not a zoologist, so do not take my word for it.

@tucotuco
Member

There are other implications to think of going forward. It makes me a bit nervous that a generic physical object should have a type status attribute. It seems too much of a specialized characteristic. Is that a characteristic inherent in the material? Or is it a designation applied to a piece of material following some process? The latter makes for a better model, I think, and allows the process to be repeated for additional type designations. None of that is inherent in the material, and it is much more problematic to model as if it were. As we seek to build a semantic layer for Darwin Core, the semantics will be crucial.

@nielsklazenga
Member Author

nielsklazenga commented Sep 24, 2024

There was a reason I did not start tinkering with the definition immediately. Yes, it is a bit counterintuitive. In a nomenclatural type designation, the Name is the subject and the Specimen is the object. So, if things were simple and we did not want to hang more information off this relation, we could just have a isNomenclaturalTypeOf property on the dwc:MaterialEntity and a hasNomenclaturalType property on the tcs:TaxonName.

We have defined (or propose to define) a tcs:NomenclaturalType in TCS (see tcs:NomenclaturalType). dwciri:typeStatus is the inverse of tcs:typeSpecimen.

The NomenclaturalType (or 'type status') is of course just a special case of a dwc:ResourceRelationship, where, coming from dwc:MaterialEntity, dwc:resourceID is the ID of the dwc:MaterialEntity, dwc:relatedResourceID is the ID of the tcs:TaxonName, and the value of tcs:relationshipOfResource is isHolotypeOf, isIsotypeOf, isSyntypeOf, etc.
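
As a rough illustration of the relationship described above, the sketch below builds such records as plain Python dictionaries. The identifiers (`urn:example:...`) and the helper `type_designation` are hypothetical, and the record layout is an assumption for illustration rather than any confirmed serialization; only the term names come from this comment.

```python
def type_designation(material_entity_id, taxon_name_id, relationship):
    """Build one ResourceRelationship-style record linking a specimen to a name.

    Key names mirror the terms mentioned in the comment above; the record
    layout itself is an assumption made for illustration.
    """
    return {
        "resourceID": material_entity_id,        # the dwc:MaterialEntity
        "relatedResourceID": taxon_name_id,      # the tcs:TaxonName
        "relationshipOfResource": relationship,  # e.g. isHolotypeOf
    }


# Hypothetical identifiers: one specimen that is the holotype of two names.
records = [
    type_designation("urn:example:specimen:1", "urn:example:name:A", "isHolotypeOf"),
    type_designation("urn:example:specimen:1", "urn:example:name:B", "isHolotypeOf"),
]
```

Each type designation is its own record here, so a specimen that typifies several names simply gets several records and no list-valued field is needed.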

So, people who are concerned about models would not use dwc:typeStatus at all, but would use dwc:ResourceRelationship. I do not think a proposal to sink dwc:typeStatus into dwc:materialEntityRemarks would go very far (I do not even support it myself), but a dwc:materialEntityRemarks is all that it really is (in this form).

@tucotuco
Member

tucotuco commented Mar 1, 2025

Changed label to 'normative' as there is a semantic change proposed to the definition.

@mdoering
Contributor

As we already closed #328, which tried to change the dwc:typeStatus definition, because of stability arguments, I would suggest sticking to the original proposal to simply move the term to the MaterialEntity class and leave everything else as it was.

Can we agree on that?

@nielsklazenga
Member Author

I am happy either way but the changes proposed to the definition seem uncontroversial to me and do not change the meaning of the term, so why pass up an opportunity to improve the definition?

@mdoering
Contributor

mdoering commented Mar 12, 2025

Removing the publication from the definition, you mean?

Current definition:

A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject.

Proposed by @tucotuco above:

A list (concatenated and separated) of nomenclatural types (at a minimum type of type and typified scientific name) applied to the dwc:MaterialEntity.

I don't think 'at a minimum' is a good addition, as the typified scientific name is not given frequently in current practice. It would also slightly change the meaning, just as removing the publication does.

On a side note, the phrase "A list of nomenclatural types" does trigger some very different meaning in my head. If someone tells me to give him a list of nomenclatural types, I would expect to see type material citations ;) For me the entire sentence "A list of nomenclatural types ... applied to the MaterialEntity" makes no sense. The only understandable bit is the list in brackets and the examples.

@deepreef

@tucotuco captured my concerns with this sentence:

> It makes me a bit nervous that a generic physical object should have a type status attribute.

Many people think of it this way (that typeStatus is a property of a physical object), because there is a label on a specimen that includes a typeStatus value. But as I've said elsewhere, this is a lazy way of representing the actual situation, which is that a typeStatus is a property of the intersection of a physical object and a scientific name (aka: Identification), as asserted within a publication.

Just because many people don't have the publication information, doesn't mean it's not important information; and as @nielsklazenga noted, 90% of the time it's the same as namePublishedIn -- which means that 10% of the time it's not the same as namePublishedIn.

@nielsklazenga
Member Author

> @tucotuco captured my concerns with this sentence:
>
> It makes me a bit nervous that a generic physical object should have a type status attribute.

How on earth can that be a concern? 99 per cent of physical objects will not have a typeStatus attribute. The definition of a nomenclatural type is "the element to which the name of a taxon is permanently attached". dwc:MaterialEntity is the class in Darwin Core that best fits the bill. Call me lazy, but I think creativity is not something we are looking for in standards. dwc:MaterialEntity does not have to be perfect. It only has to be better than dwc:Identification which seems like a no-brainer to me.

Nobody says that the place a typification is published is not important, but that does not mean it should be required. There is also the issue that a publication cited in the citation of a nomenclatural type will most likely be the dwc:namePublishedIn rather than the publication in which the type was designated (and the current definition does not say which it should be). I think the purpose of the term is to capture the data rather than prescribe what a typification note on a specimen should look like. The 'type of type' and 'typified name' elements are sufficient to differentiate dwc:typeStatus from dwc:materialEntityRemarks.

@nielsklazenga
Member Author

@mdoering, the 'at a minimum' does not have to be there for me, but it indicates that typeStatus can have other elements than the 'type of type' and 'typified name' (for example the publication in which the type was published) and if you remove it the definition will still say that typeStatus has to include the typified name.

I do not really like the 'A list of ...' either, but it just indicates that a specimen can be the nomenclatural type of more than one name and is written in the same way as in various other Darwin Core terms that can have multiple values. It is also already in the current definition and I thought I'd pick my battles.

@mdoering
Contributor

@nielsklazenga my point is that 92% of records in GBIF are simply the type status, e.g. holotype, and do not include anything further. Changing the definition to "at a minimum type of type and typified scientific name" seems to render all these values wrong as they miss the name. Plus "type of type" is pretty hard to understand for normal people. Overall I think the definition we had was better.

@nielsklazenga
Member Author

@mdoering, looks like GBIF has a major data quality issue on its hands. Those values are wrong under the current definition, which already says that typeStatus has to include the typified name. The 'at a minimum' I added merely indicates that typeStatus can include other elements as well. The new definition is actually less prescriptive than the current one.

We could move the entire bit in parentheses to usage notes, but usage notes in Darwin Core are not normative (I think) and the typified name being part of the typeStatus should be normative and the type of type should never be used without the typified name. I was the one who wrote the proposal to let typeStatus only be the 'type of type' (#328), because I was made to think that was the majority opinion, even though it is not my own, but that got no support at all so ended up on the cutting room floor. I think there is no point revisiting that, because it will always be controversial at best.

The term 'type of type' (https://github.com/tdwg/ontology/blob/master/ontology/voc/TaxonName.rdf#L389) being hard to understand for normal people is a funny argument given that the same normal people got the meaning of typeStatus so wrong.

@mdoering
Contributor

> The term 'type of type' (https://github.com/tdwg/ontology/blob/master/ontology/voc/TaxonName.rdf#L389) being hard to understand for normal people is a funny argument given that the same normal people got the meaning of typeStatus so wrong.

The values you see, like holotype, are what people (including me) intuitively, without reading the exact dwc definition, think type status means. Type status is intuitively just the status. We currently even use the term recursively in its own definition, with the inner usage again meaning just the status. A mess.

@mdoering
Contributor

I am linking a GBIF comment from the typifiedName discussion, that GBIF interprets dwc:typeStatus just as the status value for the type and clearly recommends against including the typified name or publication which would get flagged as invalid values.

@nielsklazenga
Member Author

@tucotuco, I think it would be best to table this proposal for when the work on a more structured approach to Darwin Core that was alluded to in #28 takes place and remove it from the current milestone if that is possible. Although I stand behind the proposal, it could be quite disruptive, as moving dwc:typeStatus to dwc:MaterialEntity will mean that it will disappear from the Identification History extension of the Darwin Core Archive, which will affect implementations that use it there.

@tucotuco
Member

@nielsklazenga The work on structured Darwin Core (called Darwin Core Data Package, 'DwC-DP', which will come with a Semantic Layer Applicability Statement) is actually quite mature. We expect to open the public review for the whole thing at the beginning of September. In that model, typeStatus is a property of the Identification class. The Identification class acts precisely as the Identification History Extension, allowing things that are Identifiable (Material, Organisms in Occurrences, GeneticSequences) to have as many Identifications as desired. If typeStatus and the accompanying typifiedName were properties of the MaterialEntity class they would have to support lists, and since the two terms would be separate, the content of those two lists would be decoupled, so different orders in the lists would misconnect the typeStatus and the typifiedName. That can't happen in an Identification record, where lists are not supported. THE typeStatus in the Identification would be singularly coupled with THE typifiedName in that Identification record - no confusion possible.
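
The list-decoupling concern above can be sketched in a few lines of Python. This is only an illustration: the `|` separator, the helper `pair_by_position`, and the placeholder names "Aus bus" and "Cus dus" are assumptions, not values from any real dataset.

```python
def pair_by_position(type_status, typified_name, sep="|"):
    """Pair two separator-delimited strings item by item, by position only."""
    statuses = [s.strip() for s in type_status.split(sep)]
    names = [n.strip() for n in typified_name.split(sep)]
    return list(zip(statuses, names))


# Hypothetical flattened values on a single MaterialEntity record.
type_status = "holotype | isotype"
names_same_order = "Aus bus | Cus dus"
names_other_order = "Cus dus | Aus bus"

# Same order: each status lines up with the intended name.
print(pair_by_position(type_status, names_same_order))
# Different order: the statuses are silently attached to the wrong names.
print(pair_by_position(type_status, names_other_order))

# One record per Identification: each status sits next to its own name,
# so there is no ordering to get wrong.
identifications = [
    {"typeStatus": "holotype", "typifiedName": "Aus bus"},
    {"typeStatus": "isotype", "typifiedName": "Cus dus"},
]
```

Nothing in the two flattened strings records which pairing was intended, which is the misconnection risk described in the comment.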

To me, having resolution on the typeStatus/typifiedName in this current public review would be ideal, because there is going to be SOOOO much more to consider in the DwC-DP review, each one of which might end up in extended discussions. The fewer the number of extended discussions to have to resolve in that review, the better.

@nielsklazenga
Member Author

Thanks @tucotuco. So I take it you are interested in pursuing the discussion here? I am actually going to take a bit of a break from this, as I want to have TCS out for public review before the TDWG Working Sessions, but I can get back to it when that is out of the way. Just noting for now that dwc:typeStatus is a list by definition, so any class it is in would have to support lists, and that, when dwc:Identification is considered to include determinations as well as type status annotations, dwc:typifiedName is superfluous as it would be the same as dwc:scientificName. I could also make the case that when you do this, dwc:typeStatus, again by its definition rather than any associations people might make with the term, would be part of (or at least considerably overlap with) dwc:previousIdentifications (and one would not put dwc:previousIdentifications in dwc:Identification). Also, dwc:typeStatus and dwc:typifiedName cannot be used side by side as one includes the other. A third term, equivalent with abcd://Unit/SpecimenUnit/NomenclaturalTypeDesignations/NomenclaturalTypeDesignation/TypeStatus, is needed to make dwc:typifiedName really useful (I myself think that the typified name is the more important part of the type status, so I think it is useful on its own).

There is also an issue for a JSON-LD guide (#447). I presume it is the intention that there will be some concordance between DwC-DP and the JSON-LD. In JSON-LD it is not possible to have typeStatus on dwc:Identification, as in RDF classes actually mean something. dwc:Identification has a dwciri:toTaxon property (for which the dwc:Taxon does not qualify as an object as per the Darwin Core RDF Guide), so a dwc:Identification can only lead to a taxon and taxa do not have types.

I have not heard about this for a while, but a few years ago there was this thing going on about alignment between TDWG standards. ABCD 2.06 is a TDWG standard and ABCD has got this right. Why reinvent the wheel?

@deepreef

@nielsklazenga raises an important point regarding the fact that the definition of dwc:typeStatus is presented as a "list":

A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject.

This makes absolutely no sense in the context of dwc:typeStatus when organized within the Identification Class. I have to assume this was framed as such (i.e., as a "list") either before dwc terms were organized into classes (and/or before dwc:typeStatus was organized within the Identification class), or with the realization that people would ignore the classes in which dwc terms are organized, and "flatten" the information to accompany a single Occurrence record (as many people often do for specimen records...sigh...).

Also, the point about JSON-LD underscores a much more fundamental and long-standing issue within TDWG/DwC-space, which is the conflation of nomenclatural information with taxonomic information. There are good reasons why they've historically been conflated (i.e., they've been conflated by a long history of actual taxonomic practice, and biodiversity informatics wasn't [and kind of still isn't] mature enough to adequately disentangle them). All my soap-box preaching over the years about the notion of "Taxon[omic] Name Usages" stems from the recognition that such TNUs lie at the core of both Taxonomy and Nomenclature, and may offer a pathway for the much-sought-after "GUMTI" ("Grand Unified Model for Taxonomic Information").

In any case, I agree with what @nielsklazenga says above, especially with respect to the current definition of dwc:typeStatus and the subtle but important distinctions between properties of nomenclature and properties of taxonomy, and the pitfalls of conflating the two in an effort to [over-?]simplify our biodiversity data standards. This is not a criticism at all (we wouldn't be as far with biodiversity informatics as we are now without a path that included extensive simplification of things to get us here). Rather, it's an observation with a dash of genuine hope that we can continue down that path to even better places.

@afuchs1

afuchs1 commented Mar 17, 2025

It seems to me that the current definition (https://dwc.tdwg.org/terms/#dwc:typeStatus) works as a list within Identification, because an occurrence represented by a catalogNumber can have multiple TYPE of TYPE values for different scientific names.

Example data : CANB 377016.1
HOLOTYPE of Hibbertia complanata Toelken | HOLOTYPE of Hibbertia persquamata Toelken subsp. persquamata

https://avh.ala.org.au/occurrences/ea4a580a-2f0f-49ce-b57e-dbc2901ef69c (noting the supplied data has been post processed)

So the definition in Identification works. If the data is to be split up, it needs to be in a different structure so that information is not lost.
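
A minimal sketch of how a value like the example above could be split without losing information, assuming `|` as the separator and a "<TYPE> of <name>" pattern for each item. The helper `parse_type_status` is hypothetical and not part of any Darwin Core tooling.

```python
def parse_type_status(value, sep="|"):
    """Split a typeStatus string into (type of type, typified name) pairs.

    Assumes each item looks like "<TYPE> of <name>"; an item without " of "
    is returned with None as its typified name.
    """
    pairs = []
    for item in value.split(sep):
        item = item.strip()
        if not item:
            continue
        kind, found, name = item.partition(" of ")
        pairs.append((kind.strip(), name.strip() if found else None))
    return pairs


value = ("HOLOTYPE of Hibbertia complanata Toelken | "
         "HOLOTYPE of Hibbertia persquamata Toelken subsp. persquamata")
for kind, name in parse_type_status(value):
    print(kind, "->", name)
```

A bare value such as `holotype` comes back with `None` for the name, which makes records lacking a typified name easy to spot.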

@deepreef
Copy link

@afuchs1 yes, exactly -- this is needed to accommodate multiple typeStatus instances for a particular specimen (MaterialSample/MaterialEntity). But in the context of an Identification, there would be multiple separate instances, one of which asserts CANB 377016.1 as Hibbertia complanata with dwc:typeStatus "HOLOTYPE", and a separate instance asserting CANB 377016.1 as Hibbertia persquamata subsp. persquamata with dwc:typeStatus "HOLOTYPE". In this case, each instance would not require multiple values (as you would need if dwc:typeStatus was flattened and represented as a property of an instance of MaterialSample/MaterialEntity).
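
To make the contrast concrete, here is a small sketch with one record per Identification instance, plus the flattening step that produces a list-valued string at the specimen level. The `Identification` dataclass, its snake_case field names, and `flatten_type_status` are illustrative assumptions, not an actual Darwin Core schema.

```python
from dataclasses import dataclass


@dataclass
class Identification:
    """Minimal stand-in for an Identification record (illustrative only)."""
    catalog_number: str
    scientific_name: str
    type_status: str


# Two separate instances for the same specimen, each with a single value.
identifications = [
    Identification("CANB 377016.1", "Hibbertia complanata", "HOLOTYPE"),
    Identification("CANB 377016.1", "Hibbertia persquamata subsp. persquamata", "HOLOTYPE"),
]


def flatten_type_status(records, catalog_number, sep=" | "):
    """Join the per-Identification values into one list-valued string."""
    return sep.join(
        f"{r.type_status} of {r.scientific_name}"
        for r in records
        if r.catalog_number == catalog_number
    )


print(flatten_type_status(identifications, "CANB 377016.1"))
```

Only the flattened string needs a separator; the individual Identification records never hold more than one typeStatus value.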

5 participants