MaterialType - definition and controlled vocabulary needed. #5

hardistyar · 2020-12-07T17:09:22Z

Object type versus preparation/preservation method are not sufficiently well differentiated in present practical usage.

An opportunity exists to introduce a controlled vocabulary for a new MaterialType information element to tidy this up.

This spreadsheet contains examples of how different kinds of specimens have been mapped to various existing terms/fields.

Source: CETAF Digitization Working Group, 7th December 2020.

only1chunts · 2021-01-07T16:26:45Z

The Ontology for Biobanking (OBIB) has the start of an ontology for material types, but it does not extend to the non-biological specimens that we would need.

RBGE-Herbarium · 2021-01-20T09:31:42Z

Links to #14

hardistyar · 2021-03-04T14:10:47Z

> The Ontology for Biobanking (OBIB) has the start of an ontology for material types, but it does not extend to the non-biological specimens that we would need.

@only1chunts These seem mainly to relate to tissue preparations of internal body organs for biobanking purposes. I'm not sure how relevant that is for natural science collections.

Having said that, the Smithsonian NMNH Biorepository is an example of where we might want some new materialType values such as 'silica dried' and 'liquid nitrogen preserved' - I picked these up from https://doi.org/10.3897/BDJ.5.e11625. There might be more. According to those published guidelines, these terms are likely to be used in conjunction with some other material descriptor such as 'leaf' or 'flower'. Perhaps materialType needs to be a two-part information element in which we say what is preserved and how it is preserved.

For reference, dwc:preparations is just a flat list whose values are separated by a vertical bar. Example values include: fossil; cast; photograph; DNA extract; skin | skull | skeleton; whole animal (ETOH) | tissue (EDTA).
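To make the flat structure concrete, here is a minimal Python sketch that splits a dwc:preparations value on the vertical bar and then tries to separate "what is preserved" from "how it is preserved". The function names `split_preparations` and `to_two_part` are hypothetical, and the rule that a trailing parenthesised qualifier is the preservation medium is only an assumption based on the example values above, not something Darwin Core defines.

```python
import re
from typing import Optional


def split_preparations(value: str) -> list:
    """Split a pipe-delimited dwc:preparations string into trimmed items."""
    return [part.strip() for part in value.split("|") if part.strip()]


def to_two_part(item: str) -> tuple:
    """Return (what, how) for one item.

    Assumption: a trailing "(...)" holds the preservation medium,
    as in "tissue (EDTA)". Items without it get how=None.
    """
    match = re.fullmatch(r"(.*?)\s*\((.+)\)", item)
    if match:
        return match.group(1), match.group(2)
    return item, None


for raw in ["skin | skull | skeleton", "whole animal (ETOH) | tissue (EDTA)"]:
    print([to_two_part(item) for item in split_preparations(raw)])
```

Most items carry no "how" at all, which is one way to see the lack of differentiation this issue is about.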

jmacklin · 2021-03-04T14:47:58Z

There may be useful concepts in the GGBN standard. They have terms for preparation and/or preservation and some vocabs that may be relevant although with a molecular bent. There is a current mapping activity going on between GGBN and DwC...

https://wiki.ggbn.org/ggbn/GGBN_Data_Standard_v1

smrgeoinfo · 2021-04-13T17:16:21Z

The approach that's been used in the originally geoscience-focused IGSN system is:
MaterialType -- what is the specimen itself composed of? (distinct from any preservation media). Here's a decision tree for our current Material type draft for iSamples. You'll see it's pretty high level, intended for faceting search results across a registry of samples from geoscience, archaeology and biology in current work.
ObjectType -- what kind of thing is the specimen? This has been renamed SpecimenType in our current iSamples work; here's the decision tree for Specimen Type in the current vocabulary draft.

After reviewing sample description datasets from various sources in the iSamples work, we think a third facet of basic sample categorization would be useful. It is currently labeled 'sampledFeature type' and is intended to capture, in broad terms, the kind of thing the sample represents. Here's the decision tree for the current Sampled Feature Type draft.

For a cross-domain sample registration system, some convergence on these basic, common facets for categorizing samples is critical for interoperability. The vocabularies for these facets should be relatively small, in the range of ~20 terms, to keep them manageable. User interfaces for this kind of system will need to present users with terminology they are familiar with, probably using categorization schemes that are more granular; in the back end, these domain-specific schemes will need to map to the high-level categories for interoperability, while maintaining the granular domain terminology for local usage.
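The back-end mapping described above could look roughly like the following Python sketch. All term names, the `LOCAL_TO_BROAD` table and the `broad_category` helper are hypothetical and are not taken from the iSamples drafts; the point is only that the local term is kept on the record while a small high-level category is derived for faceting.

```python
from collections import Counter

# Hypothetical mapping from granular, domain-specific terms to a small
# set of high-level material categories (illustrative values only).
LOCAL_TO_BROAD = {
    "granite": "rock",
    "basalt": "rock",
    "leaf tissue": "organic material",
    "topsoil": "soil",
}


def broad_category(local_term: str) -> str:
    """Look up the high-level category, flagging terms with no mapping."""
    return LOCAL_TO_BROAD.get(local_term.lower(), "unmapped")


# Records keep their local terminology; the facet is computed on top.
records = [
    {"id": "A1", "materialType": "Granite"},
    {"id": "A2", "materialType": "basalt"},
    {"id": "B1", "materialType": "leaf tissue"},
    {"id": "C1", "materialType": "amber"},
]

facet_counts = Counter(broad_category(r["materialType"]) for r in records)
print(facet_counts)
```

Reporting unmapped terms instead of dropping them makes gaps in the mapping visible during curation.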

hardistyar · 2021-05-26T12:12:02Z

@wouteraddink made a proposal at the sixth TG meeting on 6th May, based on six questions:

What kind of object is it?
Under which discipline was it described?
What is it made of?
What does it look like? (in the sense of how it is prepared)
How is it fixed/preserved?
Is it a whole, a part or a lot?

My rough spreadsheet tries to compare against this proposal from perspectives of TDWG CD, iSamples, DwC, ABCD/EFG and then to make a MIDS proposal, summarized in the table below and derived from the proposal by @wouteraddink.

For MIDS there are several issues:

Which elements should be included at MIDS level 1 and at MIDS level 2? At level 1 this should be enough to support discovery of relevant specimens and to aid further digitization. The table proposes three elements are needed at level 1 and that other more specific information should appear at level 2.
What should be the name of the information elements corresponding to each of the six questions above? To what extent should MIDS (and openDS) maintain alignment with pre-existing term names/labels in other standards, especially when there are several options? What implications might this have elsewhere in MIDS and other new standards? Generality towards/across a range of disciplines (biology/zoology/botany, geology, archaeology) might be becoming more important than maintaining alignment with, for example, Darwin Core. The table contains current suggestions.
What should be the controlled vocabulary associated with each of the six information elements? Here there is a tension between existing, familiar lists of words that are used in a loose and fragmented manner and the new, more controlled lists of words being proposed by, for example, iSamples. This is to be discussed further (t.b.d.). One suggestion might be to adopt short controlled vocabularies (<20 words) for each of the two coarsest levels (Q1, Q3), e.g., as has been suggested by iSamples, and to prepare longer controlled or partially controlled lists of preparationTypes (Q4) and preservationMethods (Q5) lower down. A tree structure can help.

Question	MIDS level	Element name	Vocabulary
1. What kind of object is it?	1	materialSampleType or objectType (latter is preferred)	t.b.d.
2. Under which discipline was it described?	2	discipline	t.b.d.
3. What is it made of?	1	materialType	t.b.d.
4. What does it look like? (in the sense of how it is prepared)	1	preparationType	t.b.d.
5. How is it fixed/preserved?	2	preservationMethod	t.b.d.
6. Is it a whole, a part or a lot?	2	?? Is it needed?	t.b.d.
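As a rough illustration of how the table could be used, the Python sketch below encodes the six rows and reports which elements a record still lacks for a given MIDS level. The `InfoElement` class, the `missing_for_level` helper and the placeholder name `wholePartLot` for question 6 are hypothetical (the table leaves that name open), and treating level 2 as including all level 1 elements is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoElement:
    question: str
    mids_level: int
    name: str


ELEMENTS = [
    InfoElement("What kind of object is it?", 1, "objectType"),
    InfoElement("Under which discipline was it described?", 2, "discipline"),
    InfoElement("What is it made of?", 1, "materialType"),
    InfoElement("What does it look like?", 1, "preparationType"),
    InfoElement("How is it fixed/preserved?", 2, "preservationMethod"),
    # Placeholder name: the table does not yet propose one for question 6.
    InfoElement("Is it a whole, a part or a lot?", 2, "wholePartLot"),
]


def missing_for_level(record: dict, level: int) -> list:
    """List element names required at or below `level` that are empty."""
    return [
        e.name
        for e in ELEMENTS
        if e.mids_level <= level and not record.get(e.name)
    ]


record = {"objectType": "(some value)", "materialType": "(some value)"}
print(missing_for_level(record, 1))
```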

Rindiser · 2021-05-26T15:19:18Z

I think Question 6 is needed, as certain parts of the collection will be digitized as lots. Element name: MaterialSample?

smrgeoinfo · 2021-05-28T22:27:12Z

iSamples specimenType vocabulary includes classes that cover the 'whole', 'part', 'lot' distinction I think. 'lot' is named 'aggregation' in the current draft. I think 6 and 1 can be combined.

smrgeoinfo · 2021-05-28T22:30:50Z

I think the list is missing an important facet: what kind of thing does the sample represent (what kind of sampledFeature)?

only1chunts · 2021-06-01T13:16:49Z

For MIDS level 1, I agree with Alex's table above, i.e. objectType, materialType and preparationType.
I think concentrating on getting those 3 defined with appropriate controlled vocabularies should be the focus. Then when we come to look at level 2 terms we can address the others.

ramonawalls · 2021-06-01T22:02:29Z

I also suggest combining 1 and 6, as material sample type covers whether it is a whole, a part, or an aggregate.

Strongly agree with Alex's comment about a tree structure. For iSamples, we are only defining a top level, and we expect domain- or discipline-specific details to fall below it in one or more trees.

wouteraddink · 2021-06-02T09:56:16Z

Thinking about digitisation, I think 6 is an additional level of detail to 1. In a digitisation street the object type may be scored for all objects in a batch at once, but whether it is a whole, a part or a lot needs to be scored for each object individually. I would therefore not combine them, and would keep 6 in MIDS level 2.

wouteraddink · 2021-06-02T10:04:05Z

@smrgeoinfo: important data, but the feature that is sampled does not seem relevant in the digitisation of a specimen. I see it more as metadata that is part of the sampling event, not as part of the specimen metadata (it will be the same for all specimens collected in the sampling event).

wouteraddink · 2021-06-02T10:17:46Z

I would put discipline in MIDS 1, as it is easy to score in batch during digitisation, and I think it makes sense to use discipline in, e.g., monitoring progress in digitisation, assigning digitisation priorities and perhaps also aligning digitisation policies. Monitoring progress by objectType seems to make less sense: looking at the iSamples values, that would lead to a dashboard where you compare progress in digitisation of, e.g., organism parts vs organism products and biome aggregations, rather than progress in botany collections vs zoology invertebrate collections.
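A small Python sketch of the kind of progress monitoring meant here, assuming each record carries a batch-scored discipline value and a digitised flag. The records, field names and the `progress_by` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records; discipline values are illustrative only.
records = [
    {"id": "s1", "discipline": "botany", "digitised": True},
    {"id": "s2", "discipline": "botany", "digitised": False},
    {"id": "s3", "discipline": "zoology invertebrates", "digitised": True},
    {"id": "s4", "discipline": "zoology invertebrates", "digitised": True},
]


def progress_by(records: list, key: str) -> dict:
    """Return the fraction of digitised records per value of `key`."""
    totals = defaultdict(int)
    done = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["digitised"]:
            done[group] += 1
    return {group: done[group] / totals[group] for group in totals}


print(progress_by(records, "discipline"))
```

The same function could group by objectType instead, which is the comparison argued against above.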

cboelling · 2021-06-02T11:00:01Z

Looking at the current version of the spreadsheet comparison, I think that what the proposed terms are expected to represent needs to be clarified further (together with an update on the proposed definitions). There are a number of cases where an element turns up as an example for different attributes in the different schemes compared (e.g., fossil).
While orthogonality between the attributes is perhaps not necessary, it would be nice to have a more clearly defined complementarity especially among objectType, materialType and preparationType.

If (1) and (3) are to operate as a hierarchical tandem I think it will be necessary to determine the desired level of granularity in each case (and reflect it in corresponding definitions) even if the actual categories / the controlled vocabulary will be determined later.

Regarding (6), I think that there are many examples where a collection object can be legitimately considered a whole and a part at the same time (e.g., a bone of a skeleton, a mineral on a larger piece of rock). I would favor representing these relations between physical specimens (and the downstream relations between their digital representations) as attributes connecting object IDs.
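One possible reading of "attributes connecting object IDs" is sketched below in Python. The `Specimen` class, the `part_of` attribute and the object IDs are hypothetical; the idea is that whole/part status is derived from the links rather than stored as a fixed category on each object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Specimen:
    object_id: str
    description: str
    part_of: Optional[str] = None  # ID of the containing object, if any


# Hypothetical objects based on the examples in the comment above.
SPECIMENS = {
    s.object_id: s
    for s in [
        Specimen("obj-1", "skeleton"),
        Specimen("obj-2", "bone", part_of="obj-1"),
        Specimen("obj-3", "rock"),
        Specimen("obj-4", "mineral", part_of="obj-3"),
    ]
}


def parts_of(object_id: str) -> list:
    """Return the IDs of all objects that point at `object_id`."""
    return sorted(
        s.object_id for s in SPECIMENS.values() if s.part_of == object_id
    )


def is_part(object_id: str) -> bool:
    """An object is a part if it links to a containing object."""
    return SPECIMENS[object_id].part_of is not None


print(parts_of("obj-1"), is_part("obj-2"), is_part("obj-1"))
```

A bone can then be catalogued as an object in its own right while still being linked to its skeleton, without choosing between "whole" and "part".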

matdillen · 2021-06-02T12:53:19Z

I wonder if we could trim this down to:

What does the object represent? objectType (MIDS1)
What does the object look like? preparationType (MIDS1)
How is the object preserved? preservationMethod (MIDS2)

(+ preservationMode for fossils) (MIDS2)

With these three, we keep track of what the specimen was initially (1), what it is now (2) and what happened to it along the way (3). This does imply that the current vocabularies need to be modified so that the distinction between objectType and preparationType is clearer. A lot of fishes in a jar does not represent a lot of fishes in a jar, but a lot of (whole) fish. The lot was preserved with a certain fluid and is now kept in a jar.

discipline seems much better suited to being defined as a taxonomic term, at least for the biological specimens. If we're looking at this from a more general, curatorial perspective, we should probably be looking at collectionCode (currently MIDS2). Connecting a specimen to a scientific discipline is much less obvious. Is this the intent of the collector? How do we determine that? What if there are multiple scientific disciplines that may apply?

Maybe we want an explicitly taxonomic term in addition to the name (dc:title analogue) in MIDS1? But I'm not sure how that works for nonbiological specimens.

I'm not sure what the added value is of materialSample (what is it made of), when compared to the information present in the three concepts listed above. In what scenario is the content of this field useful? Can it not always be deduced from the other ones? Are we not always going to say that a soil sample is made out of soil, an animal out of organic material and a rock out of rock?

As others have suggested, I would include the whole, lot, part distinction in an ontology for objectType.

ramonawalls · 2021-06-03T02:06:09Z

> @smrgeoinfo: important data, but the feature that is sampled seems not relevant in digitisation of a specimen, I see it more as metadata that is part of the sampling event not as part of the specimen metadata (it will be the same for all specimens collected in the sampling event).

I disagree, @wouteraddink. Since a material sample is always a sample of something, it is very important to know what it is a sample of. Yes, this is also linked to the sampling event, but since we are not building an ontology here that fully describes the sampling event, I think it is very important to include the sampled feature as part of the metadata for the sample. Now if you want to build an ontology... :)

ramonawalls · 2021-06-03T02:08:30Z

> I wonder if we could trim this down to:
>
> What does the object represent? objectType (MIDS1)
>
> What does the object look like? preparationType (MIDS1)
>
> How is the object preserved? preservationMethod (MIDS2)
>
> (+ preservationMode for fossils) (MIDS2)
>
> With these three, we keep track of what the specimen was initially (1), what it is now (2) and what happened to it along the way (3). This does imply that the current vocabularies need to be modified so that the distinction between objectType and preparationType is clearer. A lot of fishes in a jar does not represent a lot of fishes in a jar, but a lot of (whole) fish. The lot was preserved with a certain fluid and is now kept in a jar.
>
> discipline seems much better suited to being defined as a taxonomic term, at least for the biological specimens. If we're looking at this from a more general, curatorial perspective, we should probably be looking at collectionCode (currently MIDS2). Connecting a specimen to a scientific discipline is much less obvious. Is this the intent of the collector? How do we determine that? What if there are multiple scientific disciplines that may apply?
>
> Maybe we want an explicitly taxonomic term in addition to the name (dc:title analogue) in MIDS1? But I'm not sure how that works for nonbiological specimens.
>
> I'm not sure what the added value is of materialSample (what is it made of), when compared to the information present in the three concepts listed above. In what scenario is the content of this field useful? Can it not always be deduced from the other ones? Are we not always going to say that a soil sample is made out of soil, an animal out of organic material and a rock out of rock?
>
> As others have suggested, I would include the whole, lot, part distinction in an ontology for objectType.

Again, if you were building an ontology, you could probably deduce what a sample was made of by its sample type, but you are not building an ontology. Therefore, you need to capture all of the information that could be used to build one.

emhaston · 2021-06-03T10:23:19Z

At RBGE, we have looked at the RBGE Herbarium collection data to determine how the 6 questions proposed by Wouter could be implemented.

In doing this, we considered the kinds of objects that we have in the collections, the use cases for the categorisation and the purpose and outcome of categorising the objects. I think that considering the use cases for categorising could be helpful for discussions.

The high level use cases we identified are:

Curation
Digitisation
Research
Exhibitions

Within each of these there are additional use cases for categorising objects including:

Location (findability)
Digitisation method (equipment and pipelines)
Research discipline, expertise, techniques & equipment
Storage requirements (environmental conditions, space, etc.)
Access (physical, virtual, including loanability)

For each of these use cases, the following aspects were considered to be important:

Size
Shape
Container
Physical and chemical structure
Preservation method

This was a start, and I'm sure we will be missing things. Just to reiterate that we were focussing very much on Herbarium collections.

As part of this exercise, we started with the following list of objects:

Herbarium sheet
Carpological specimen
Spirit
Microscope slide
Silica-dried
TLC
Extracted DNA
Destructive sample
Wood samples
Seed sample
Spore prints
Photographic slides
Photographs
Photographs of specimen
Illustrations
Soil sample
Herbarium packet
SEM stubs
Air/silica dried material
SEM images
Soil/water sample/ cultures

We then started pulling this together into a framework, working through each item, moving them into either object type or preservation method or structure, or part of organism, etc. We refined this process during several iterations. We have now gone through the RBGE list of objects and mapped them to the following categories:

Object Type
Preparation Type
Preservation method
Structure
Format
Whole organism / what part of organism

When we looked at these in terms of the Element Issue format and mapping we then came up with the following examples:

This is just to contribute to the discussion.

smrgeoinfo · 2021-06-03T15:43:21Z

Analysis of object types from #5 (comment)

Herbarium sheet

The object is a mounting sheet.

Carpological specimen

The object is ??? what???

Spirit

The object is what???

Microscope slide

Object is a glass mounting sheet

Silica-dried

This is a preservation process??? What was dried??

TLC

????

Extracted DNA

Object is...? probably some kind of container with the DNA in it. DNA is a material, not an object

Destructive sample

?? I guess this means the material has been consumed in some analytical process? What was the object?

Wood samples

object (probably) a piece of wood, or a bag of pieces of wood (hopefully from the same plant)

Seed sample

object (probably) a seed, or a bag of seeds (hopefully from the same plant)

Spore prints

Object is a print (??? piece of paper, or other imaging material???). Analogous to photograph of specimen?

Photographic slides

Object is a piece of film

Photographs

Object is a piece of paper

Photographs of specimen

The photo is a related resource about the specimen, not the specimen

Illustrations

The illustration is a related resource about the specimen, not the specimen

Soil sample

Object is (likely) a bag of granular material

Herbarium packet

Object is a packet (container) containing plant fragments (from the same plant??, or from individuals of same species?)

SEM stubs

Object is a stub (some kind of mounting object, analogous to herbarium sheet or microscope slide)

Air/silica dried material

?? object is what??? granular aggregate of plant parts, individual plant part? whole plant?

SEM images

The image is a related resource about the specimen, not the specimen

Soil/water sample/ cultures

Object is some kind of culture container?

smrgeoinfo · 2021-06-03T18:15:13Z

> I wonder if we could trim this down to:
>
> What does the object represent? objectType (MIDS1)
>
> What does the object look like? preparationType (MIDS1)
>
> How is the object preserved? preservationMethod (MIDS2)
>
> (+ preservationMode for fossils) (MIDS2)

I would modify somewhat:

what kind of object is it (objectType)
what is it composed of (materialType)
what does it represent (sampled feature)
how is it preserved (preservationMethod)? This is not applicable to kinds of non-biological specimens that don't need preservation.
preservation mode (taphonomy) is very specific to fossil specimens, and should be an additional property required in a fossil specimen profile.

wouteraddink · 2021-06-25T16:07:07Z

I can live with inclusion of sampled feature. What I get from the discussions though is that already at the very minimal level of MIDS1 we seem to have different needs for different classes of objects: preservation mode only for fossil specimens, material type for non-biological specimens (earth samples), preservation method only for preserved specimens. I think we therefore need different metadata profiles for these different classes, which we should perhaps treat as different digital specimen (sub-)types: preserved biological specimen, fossil biological specimen, living biological specimen, earth sample specimen, recorded specimen (e.g. sound recordings, drawings, photos). These should be extendible with classes for non-natural history specimens in the future if there is a need for it.

smrgeoinfo · 2021-06-25T20:14:55Z

I think that profiles are a good idea, and would suggest that the specimenType property could be the basis for determining the profile. Likely there would be a hierarchy of profiles, e.g. a 'Whole organism' profile might have child profiles for 'preserved' and 'living'. In our iSamples thinking, drawings or photos of physical specimens are related resources, not physical specimens. Sound recordings (e.g. bird sounds) are an interesting case; my off-the-cuff reaction is that the recording is a kind of dataset that is linked to a physical thing (the bird) in the world.
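A hierarchy of profiles keyed on specimen type might be sketched as follows in Python. The profile names, the `PROFILES` table, the `required_elements` helper and the choice of required elements are hypothetical illustrations of the idea, not an agreed MIDS or iSamples structure.

```python
# Hypothetical profile hierarchy: each profile names its parent and the
# extra elements it requires on top of what it inherits.
PROFILES = {
    "specimen": {"parent": None, "required": ["objectType"]},
    "whole organism": {"parent": "specimen", "required": []},
    "preserved whole organism": {
        "parent": "whole organism",
        "required": ["preservationMethod"],
    },
    "living whole organism": {"parent": "whole organism", "required": []},
    "fossil": {"parent": "specimen", "required": ["preservationMode"]},
    "earth sample": {"parent": "specimen", "required": ["materialType"]},
}


def required_elements(profile_name: str) -> list:
    """Collect required elements from the root profile down to this one."""
    chain = []
    current = profile_name
    while current is not None:
        chain.append(current)
        current = PROFILES[current]["parent"]
    required = []
    for name in reversed(chain):
        for element in PROFILES[name]["required"]:
            if element not in required:
                required.append(element)
    return required


print(required_elements("preserved whole organism"))
print(required_elements("fossil"))
```

With this shape, preservationMode only becomes mandatory for fossils and materialType only for earth samples, which matches the differing needs noted in the preceding comments.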

hardistyar added the MIDS-1 (Info element appears at MIDS level 1) label Dec 7, 2020

wouteraddink mentioned this issue Jun 25, 2021

Create a Galaxy tool to create an open digital specimen object DiSSCo/SDR#23

Closed

emhaston removed the MIDS-1 (Info element appears at MIDS level 1) label May 5, 2022
