MaterialType - definition and controlled vocabulary needed. #5
Comments
The Ontology for Biobanking (OBIB) has the start of an ontology for material types, but it does not extend to the non-biological specimens that we would need. |
Links to #14 |
@only1chunts These seem mainly to relate to tissue preparations of internal body organs for biobanking purposes. I'm not sure how relevant that is for natural science collections. Having said that, the Smithsonian NMNH Biorepository is an example of where we might want some new material types such as 'silica dried' and 'liquid nitrogen preserved'. I picked these up from https://doi.org/10.3897/BDJ.5.e11625. There might be more. According to those published guidelines, these terms are likely to be used in conjunction with some other material descriptor such as 'leaf' or 'flower'. Perhaps materialType needs to be a two-part information element in which we say what is preserved and how it is preserved. For reference, dwc:preparations is just a list of values separated by a vertical bar, e.g., fossil, cast, photograph, DNA extract, skin | skull | skeleton, whole animal (ETOH) | tissue (EDTA). |
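To make the "what is preserved / how it is preserved" idea concrete, here is a minimal Python sketch that splits a dwc:preparations-style string on the vertical bar and pulls a parenthesised medium out of each value. The function name `split_preparations` and the assumption that the "how" part always appears in trailing parentheses are mine, for illustration only; real data will be messier.

```python
import re

# Matches "what (how)", e.g. "tissue (EDTA)"; the parenthesised part is optional.
# Assumption: the preservation medium, when present, is in trailing parentheses.
_PART = re.compile(r"^(?P<what>[^()]+?)\s*(?:\((?P<how>[^()]*)\))?$")


def split_preparations(value):
    """Split a pipe-separated preparations string into (what, how) pairs.

    `how` is None when no parenthesised medium is given.
    """
    pairs = []
    for item in value.split("|"):
        item = item.strip()
        if not item:
            continue  # skip empty segments such as a trailing "|"
        match = _PART.match(item)
        if match is None:
            # Unexpected shape: keep the raw text rather than guess.
            pairs.append((item, None))
            continue
        how = match.group("how")
        pairs.append((match.group("what").strip(), how.strip() if how else None))
    return pairs


if __name__ == "__main__":
    print(split_preparations("whole animal (ETOH) | tissue (EDTA)"))
    print(split_preparations("skin | skull | skeleton"))
```

A two-part materialType would store these pairs directly instead of having to recover them from free text.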
There may be useful concepts in the GGBN standard. They have terms for preparation and/or preservation and some vocabs that may be relevant although with a molecular bent. There is a current mapping activity going on between GGBN and DwC... |
The approach that's been used in the originally geoscience-focused IGSN system is After reviewing sample description datasets from various sources in the iSamples work, we're thinking a third facet of basic sample categorization would be useful; it is currently labeled 'sampledFeature type' and is intended to capture in broad terms the kind of thing the sample represents. Here's the decision tree for the Sampled Feature Type current draft. For a cross-domain sample registration system, some convergence on these basic, common facets for categorizing samples is critical for interoperability. The vocabularies for these facets should be relatively small, in the range of ~20 terms, to keep them manageable. User interfaces for this kind of system will need to present users with terminology they are familiar with, probably using categorization schemes that are more granular; in the back end, these domain-specific schemes will need to map to the high-level categories for interoperability, while maintaining the granular domain terminology for local usage. |
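The back-end mapping described above could look roughly like the following Python sketch. Every term in it is a placeholder chosen for illustration; none of these lists is the actual iSamples, IGSN or MIDS vocabulary, and the fallback category `other` is an assumption.

```python
# Hypothetical high-level facet vocabulary (placeholder terms, not a published list).
HIGH_LEVEL = {"organism part", "whole organism", "aggregation", "other"}

# Hypothetical mapping from granular, domain-specific terms to the high-level facet.
DOMAIN_TO_HIGH_LEVEL = {
    "leaf": "organism part",
    "flower": "organism part",
    "whole animal": "whole organism",
    "lot": "aggregation",
}


def to_high_level(term):
    """Return (local_term, high_level_term).

    The local term is kept for domain users; the high-level term is what a
    cross-domain registry would index. Unmapped terms fall back to 'other'.
    """
    local = term.strip().lower()
    return local, DOMAIN_TO_HIGH_LEVEL.get(local, "other")


if __name__ == "__main__":
    for raw in ["Leaf", "lot", "stub"]:
        print(to_high_level(raw))
```

Keeping both values on the record preserves the familiar domain term for local use while still allowing queries across domains on the small shared list.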
@wouteraddink made a proposal at Sixth TG meeting, 6th May based on six questions:
My rough spreadsheet tries to compare against this proposal from perspectives of TDWG CD, iSamples, DwC, ABCD/EFG and then to make a MIDS proposal, summarized in the table below and derived from the proposal by @wouteraddink. For MIDS there are several issues:
|
I think Question 6 is needed, as certain parts of the collection will be digitized as a lot. Element name: MaterialSample? |
iSamples specimenType vocabulary includes classes that cover the 'whole', 'part', 'lot' distinction I think. 'lot' is named 'aggregation' in the current draft. I think 6 and 1 can be combined. |
I think the list is missing an important facet: what kind of thing does the sample represent (what kind of sampledFeature)? |
For MIDS level 1, I agree with Alex's table above, i.e. objectType, materialType and preparationType. |
I also suggest combining 1 and 6, as material sample type covers where it is whole, part, or aggregate. Strongly agree with Alex's comment about a tree structure. For iSamples, we are only defining a top level, and we expect domain or discipline specific details to fall below them in one or more trees. |
Thinking about digitisation, I think 6 is an additional level of detail to 1: in a digitisation street the object type may be scored for all objects in a batch at once, but whether it is a whole, a part or a lot needs to be scored for each object individually. So I would not combine them, and I would keep 6 in MIDS level 2. |
@smrgeoinfo: important data, but the feature that is sampled does not seem relevant to the digitisation of a specimen. I see it more as metadata that is part of the sampling event, not as part of the specimen metadata (it will be the same for all specimens collected in the sampling event). |
I would put discipline in MIDS 1, as it is easy to score in batch during digitisation, and I think it makes sense to use discipline in e.g. monitoring progress in digitisation, assigning digitisation priorities and perhaps also in aligning digitisation policies. Monitoring progress by objectType seems to make less sense: looking at the iSamples values, that would lead to a dashboard where you compare progress in digitisation of e.g. organism parts vs organism products and biome aggregations, rather than progress in botany collections vs zoology invertebrates collections. |
Looking at the current version of the spreadsheet comparison, I think that what the proposed terms are expected to represent needs further clarification (together with an update of the proposed definitions). There are a number of cases where an element turns up as an example for different attributes in the different schemes compared (e.g., fossil). If (1) and (3) are to operate as a hierarchical tandem, I think it will be necessary to determine the desired level of granularity in each case (and reflect it in the corresponding definitions), even if the actual categories / the controlled vocabulary will be determined later. Regarding (6), I think that there are many examples where a collection object can legitimately be considered a whole and a part at the same time (e.g., a bone of a skeleton, a mineral on a larger piece of rock). I would favor representing these relations between physical specimens (and the downstream relations between their digital representations) as attributes connecting object IDs. |
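As a rough illustration of representing whole/part relations as attributes connecting object IDs, here is a small Python sketch. The record class, the `part_of` attribute name and the identifiers are all hypothetical; they are not taken from MIDS, DwC or any other schema discussed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpecimenRecord:
    object_id: str
    label: str
    part_of: Optional[str] = None  # object_id of the parent specimen, if any


def children_of(records, parent_id):
    """Return the IDs of all records that point at `parent_id`."""
    return [r.object_id for r in records if r.part_of == parent_id]


def ancestors(records, object_id):
    """Follow part_of links upward and return the chain of parent IDs."""
    by_id = {r.object_id: r for r in records}
    chain = []
    current = by_id[object_id].part_of
    while current is not None:
        if current in chain:
            raise ValueError("cycle in part_of links")
        chain.append(current)
        current = by_id[current].part_of
    return chain


# Hypothetical example: a bone that is both a part (of a skeleton)
# and a whole (from which a fragment was later taken).
RECORDS = [
    SpecimenRecord("spec-1", "skeleton"),
    SpecimenRecord("spec-2", "bone", part_of="spec-1"),
    SpecimenRecord("spec-3", "bone fragment", part_of="spec-2"),
]

if __name__ == "__main__":
    print(children_of(RECORDS, "spec-1"))
    print(ancestors(RECORDS, "spec-3"))
```

With links like these, "whole" and "part" are not a single flag on the record; they are read off the relations, so the bone can be both without contradiction.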
I wonder if we could trim this down to:
With these three, we keep track of what the specimen was initially (1), what it is now (2) and what happened to it along the way (3). This does imply that the current vocabularies need to be modified so that the distinction between objectType and preparationType is clearer. A lot of fishes in a jar does not represent a lot of fishes in a jar, but a lot of (whole) fish. The lot was preserved with a certain fluid and is now kept in a jar.
Maybe we want an explicitly taxonomic term in addition to the I'm not sure what the added value is of As other have suggested, I would include the whole, lot, part distinction in an ontology for objectType. |
I disagree @wouteraddink. Since a material sample is always a sample of something, it is very important to know what it is a sample of. Yes, this is also linked to the sampling event, but since we are not building an ontology here that fully describes the sampling event, I think it is very important to include the sampled feature as part of the metadata for the sample. Now if you want to build an ontology... :) |
Again, if you were building an ontology, you could probably deduce what a sample was made of by its sample type, but you are not building an ontology. Therefore, you need to capture all of the information that could be used to build one. |
At RBGE, we have looked at the RBGE Herbarium collection data to determine how the 6 questions proposed by Wouter could be implemented. In doing this, we considered the kinds of objects that we have in the collections, the use cases for the categorisation and the purpose and outcome of categorising the objects. I think that considering the use cases for categorising could be helpful for discussions. The high level use cases we identified are:
Within each of these there are additional use cases for categorising objects including:
For each of these use cases, the following aspects were considered to be important:
This was a start, and I'm sure we will be missing things. Just to reiterate that we were focussing very much on Herbarium collections. As part of this exercise, we started with the following list of objects: Herbarium sheet We then started pulling this together into a framework, working through each item, moving them into either object type or preservation method or structure, or part of organism, etc. We refined this process during several iterations. We have now gone through the RBGE list of objects and mapped them to the following categories:
When we looked at these in terms of the Element Issue format and mapping we then came up with the following examples: This is just to contribute to the discussion. |
Analysis of object types from #5 (comment)
The object is a mounting sheet.
The object is ??? what???
The object is what???
Object is a glass mounting sheet
This is a preservation process??? What was dried??
????
Object is...? probably some kind of container with the DNA in it. DNA is a material, not an object
?? I guess this means the material has been consumed in some analytical process? What was the object?
Object is (probably) a piece of wood, or a bag of pieces of wood (hopefully from the same plant)
Object is (probably) a seed, or a bag of seeds (hopefully from the same plant)
Object is a print (??? piece of paper, or other imaging material???). Analogous to photograph of specimen?
Object is a piece of film
Object is a piece of paper
The photo is a related resource about the specimen, not the specimen
The illustration is a related resource about the specimen, not the specimen
Object is (likely) a bag of granular material
Object is a packet (container) containing plant fragments (from the same plant??, or from individuals of same species?)
Object is a stub (some kind of mounting object, analogous to herbarium sheet or microscope slide)
?? object is what??? granular aggregate of plant parts, individual plant part? whole plant?
The image is a related resource about the specimen, not the specimen
Object is some kind of culture container? |
I would modify somewhat:
|
I can live with inclusion of sampled feature. What I get from the discussions though is that already at the very minimal level of MIDS1 we seem to have different needs for different classes of objects: preservation mode only for fossil specimens, material type for non-biological specimens (earth samples), preservation method only for preserved specimens. I think we therefore need different metadata profiles for these different classes, which we should perhaps treat as different digital specimen (sub-)types: preserved biological specimen, fossil biological specimen, living biological specimen, earth sample specimen, recorded specimen (e.g. sound recordings, drawings, photos). These should be extendible with classes for non-natural history specimens in the future if there is a need for it. |
I think that profiles are a good idea, and would suggest that the specimenType property could be the basis for determining the profile. Likely there would be a hierarchy of profiles, e.g. 'Whole organism' profile might have child profiles for 'preserved', and 'living'. In our iSamples thinking, drawings, or photos of physical specimens are related resources, not physical specimens. Sound recordings (e.g. bird sounds) are an interesting case; my off the cuff reaction is that the recording is a kind of dataset that is linked to a physical thing (the bird) in the world. |
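A hierarchy of profiles keyed on specimenType might be sketched in Python as below. The profile names and the required element names are assumptions made up for the example; they are not an agreed MIDS profile list.

```python
# Hypothetical profile table: each profile names its parent and the
# elements it adds on top of what the parent already requires.
PROFILES = {
    "whole organism": {"parent": None, "required": ["objectType"]},
    "whole organism/preserved": {
        "parent": "whole organism",
        "required": ["preservationMethod"],
    },
    "whole organism/living": {"parent": "whole organism", "required": []},
}


def required_fields(profile_name):
    """Collect required elements from the root profile down to `profile_name`."""
    chain = []
    name = profile_name
    while name is not None:
        if name in chain:
            raise ValueError("cycle in profile hierarchy")
        chain.append(name)
        name = PROFILES[name]["parent"]
    fields = []
    for name in reversed(chain):  # root first, so parent elements come first
        for element in PROFILES[name]["required"]:
            if element not in fields:
                fields.append(element)
    return fields


if __name__ == "__main__":
    print(required_fields("whole organism/preserved"))
    print(required_fields("whole organism/living"))
```

A child profile only lists what it adds, so a change to the parent profile carries through to 'preserved' and 'living' without duplication.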
Object type versus preparation/preservation method are not sufficiently well differentiated in present practical usage.
An opportunity exists to introduce a controlled vocabulary for a new MaterialType information element to tidy this up.
This spreadsheet contains examples of how different kinds of specimens have been mapped to various existing terms/fields.
Source: CETAF Digitization Working Group, 7th December 2020.