TG2-VALIDATION_YEAR_STANDARD #141

Closed
Tasilee opened this issue Mar 23, 2018 · 38 comments
Labels
Conformance; DO NOT IMPLEMENT (A potential test that it is not recommended be implemented); Parameterized (Test requires a parameter); Test (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT); TG2; TIME; Validation

Comments

@Tasilee
Collaborator

Tasilee commented Mar 23, 2018

| TestField | Value |
|---|---|
| GUID | 8e74db19-cfb3-4426-a3ab-e89249712681 |
| Label | VALIDATION_YEAR_STANDARD |
| Description | Can the value for year be interpreted as a valid year? |
| TestType | Validation |
| Darwin Core Class | Event |
| Information Elements ActedUpon | dwc:year |
| Information Elements Consulted | |
| Expected Response | INTERNAL_PREREQUISITES_NOT_MET if dwc:year is bdq:Empty or cannot be cast as an integer; COMPLIANT if the value of dwc:year cast as an integer does not extend outside optionally-provided begin and end years; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Conformance |
| Term-Actions | YEAR_STANDARD |
| Parameter(s) | Default values: earliest year = 1600, latest year = current year |
| Source Authority | |
| Specification Last Updated | 2024-02-23 |
| Examples | [dwc:year="1623": Response.status=COMPLIANT, Response.result="", Response.comment="The value in year is between the earliest date and current date"]<br>[dwc:year="X": Response.status=NOT_COMPLIANT, Response.result="", Response.comment="The value in year cannot be interpreted as a valid year"] |
| Source | @Tasilee, @ArthurChapman, VertNet |
| References | |
| Example Implementations (Mechanisms) | Kurator:event_date_qc |
| Link to Specification Source Code | https://github.com/FilteredPush/event_date_qc/blob/fb472b8fe25b72fc1203472b261900def3af61a9/src/main/java/org/filteredpush/qc/date/DwCEventDQ.java#L1601 unit test at https://github.com/FilteredPush/event_date_qc/blob/fb472b8fe25b72fc1203472b261900def3af61a9/src/test/java/org/filteredpush/qc/date/DwcEventDQTest.java#L1794 |
| Notes | The results of this test are time-dependent. Next year is not valid now; next year it will be. This test provides the option to designate lower and upper limits to the year. The upper limit, if not provided, should default to the year when the test is run. There should be no default lower limit. NB By convention, use 1600 as a lower limit for collecting dates of biological specimens. |
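
As an illustration only, the Expected Response in the specification above could be sketched in Python as follows. This is not the linked event_date_qc code: the function name `validate_year_standard` and its keyword arguments are hypothetical, Python's `int()` stands in for "cast as an integer", and only the status string is returned. The sketch follows the Expected Response wording, under which a value such as "X" yields INTERNAL_PREREQUISITES_NOT_MET rather than the NOT_COMPLIANT shown in the second example.

```python
from datetime import date
from typing import Optional


def validate_year_standard(
    year: Optional[str],
    earliest: int = 1600,          # default earliest year from Parameter(s)
    latest: Optional[int] = None,  # None means "the year the test is run"
) -> str:
    """Return a status string for dwc:year (hypothetical helper, not the TG2 code)."""
    if latest is None:
        latest = date.today().year

    # Treat a missing or whitespace-only value as bdq:Empty (an assumption).
    if year is None or year.strip() == "":
        return "INTERNAL_PREREQUISITES_NOT_MET"

    # Assumption: Python's int() approximates "can be cast as an integer".
    try:
        value = int(year.strip())
    except ValueError:
        return "INTERNAL_PREREQUISITES_NOT_MET"

    if earliest <= value <= latest:
        return "COMPLIANT"
    return "NOT_COMPLIANT"
```

With the defaults, `validate_year_standard("1623")` returns "COMPLIANT", while a dataset with much older events could pass a lower bound explicitly, for example `validate_year_standard("900", earliest=-4000)`.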
@Tasilee added the Test label Aug 14, 2018
@chicoreus
Collaborator

Added missing guid.

@tucotuco
Member

Working with Zooarch, I do not agree with the rejection of years that are less than four digits. We have plenty of existing dated examples with those characteristics.

@ArthurChapman
Collaborator

But don't you want to identify them, @tuco? This links to our #129, which we are talking about getting rid of. If it is unambiguous to convert those years with fewer than four digits to four-digit years, then perhaps we need to keep #129?

@tucotuco
Member

Identify them as non-standard? Not in the Zooarch case. 900 is a great year in zooarch, for example. So is 50. So is -4000. Perhaps if this test was similarly parametrized it would make all the difference.

@Tasilee
Collaborator Author

Tasilee commented Jan 24, 2019

I agree with @tucotuco regards parameterized/paramatrized :). Regarding #129, flagging a potential/actual issue is easier than knowing what was intended. We cannot have 'tests' that potentially lower the 'quality' of a record.

@Tasilee added the Parameterized label Jan 24, 2019
@ArthurChapman
Collaborator

We still need to add a reference to why it is parameterized, for example to set the earliest date (for most data it will be 1700). I wonder if in the notes we should say that if a parameter is not set it defaults to a four-digit year (or to 1700) or something like that. This is the year of an Event; it interests me that we have a lot of fossil collections that were "collected" as far back as 50 or -4000. I guess that boils down to what is meant by an Event. Interesting!

@ArthurChapman
Collaborator

ArthurChapman commented Jan 24, 2019

We need to perhaps change the Expected Response to something like

INTERNAL_PREREQUISITES_NOT_MET if the field dwc:year is not present or is EMPTY; COMPLIANT if the value of the field dwc:year was unambiguously interpreted to be an integer less than the current year, and optionally not extending before a year designated when the test is run; otherwise NOT_COMPLIANT

@tucotuco
Member

I agree with the @ArthurChapman solution. I still don't like the test in Issue #129.

@Tasilee
Collaborator Author

Tasilee commented Jan 30, 2019

I have made some changes to the Expected Response and the Notes. Please check. And #129 is no longer.

@ArthurChapman
Collaborator

Looks better. I added "set" after "Parameterized" in the Expected Response. Should we say when the test was "run" rather than "established"? I think we have used "run" elsewhere.

Do you want to add to the Note something about "Some palaeontological records may be earlier than the year 1000 and thus two or three digits would be acceptable in those cases."

@Tasilee
Collaborator Author

Tasilee commented Jan 30, 2019

I agree, @ArthurChapman. I've also added negative integers as an example. What about things like 1.5BP?

@ArthurChapman
Collaborator

1.5BP would not be valid; it would need to be translated (that is another example for the now-deleted #129, so not only MCMXXVIII etc.)

@ArthurChapman
Collaborator

Should 123 in the Example stay, or should it say dwc:year="123" (if the Parameter is set to 1753, for example)?

@Tasilee
Collaborator Author

Tasilee commented Jan 30, 2019

I would think we do need the "If"

@tucotuco
Member

I would change the expected response from

"INTERNAL_PREREQUISITES_NOT_MET if the field dwc:year is not present or is EMPTY; COMPLIANT if the value of the field dwc:year was unambiguously interpreted to be an integer less than or equal to the current year, and not before a year designated by the parameter set when the test was established; otherwise NOT_COMPLIANT"

to

"INTERNAL_PREREQUISITES_NOT_MET if dwc:year is not present or is EMPTY; COMPLIANT if the value of dwc:year lies between optionally-provided begin and end years; otherwise NOT_COMPLIANT"

I would change the notes from

"This test should detect at least 1, 2 and 3 digit values for year where interpretation may be ambiguous and non integer values. The test is Parameterized by having a value set for the minimal (earliest) year that will be acceptable in the environment. Palaeontological records may be earlier than the year 1000 and thus two or three digits would be acceptable, and in some cases, negative integers."

to

"The results of this test are time-dependent. Next year is not valid now. Next year it will be. This test provides the option to designate lower and upper limits to the year. The upper limit, if not provided, should default to the year when the test is run. There should be no default lower limit. NB By convention, use 1700 as a lower limit for collecting dates of biological specimens."
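
The time dependence described in the proposed notes can be made concrete with a small sketch. It assumes the value has already been parsed to an integer, and it follows the wording quoted here (no default lower limit; the upper limit defaults to the run year). The name `year_in_range` and its `today` argument, which exists only to pin the run date, are hypothetical and not taken from any existing implementation.

```python
from datetime import date
from typing import Optional


def year_in_range(
    year: int,
    begin: Optional[int] = None,   # no default lower limit
    end: Optional[int] = None,     # defaults to the year the test is run
    today: Optional[date] = None,  # hypothetical hook to fix the run date
) -> bool:
    """Check an integer year against optional begin and end years."""
    run_date = today if today is not None else date.today()
    upper = end if end is not None else run_date.year
    if year > upper:
        return False
    if begin is not None and year < begin:
        return False
    return True


# The same value flips from out of range to in range as the run date advances.
print(year_in_range(2020, today=date(2019, 6, 1)))  # False
print(year_in_range(2020, today=date(2020, 1, 1)))  # True
```

Because no lower limit is applied unless one is supplied, a year such as -4000 passes by default, and a caller who wants the conventional limit for biological specimens passes `begin` explicitly.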

@ArthurChapman
Collaborator

ArthurChapman commented May 15, 2019

So, following the principle of #178 and adding a field called Parameter, it would read something like
| Parameter | Set Parameter(1) value as the minimal (earliest) year that will be acceptable in the environment. Default=1700. Set Parameter(2) to the maximum (latest) year that will be acceptable in the environment. Default=current year |
And leave the note.
I note, @tucotuco, that although your last message implies a parameterized upper limit ("The upper limit, if not provided..."), the Expected Response mentions only the current year.

@Tasilee
Collaborator Author

Tasilee commented May 15, 2019

I've edited the table, according to comments

@ArthurChapman
Collaborator

I would suggest the default earliest year should be 1700, and if there are palaeontological records they could set it to -500 or something. This would pick up all the living material earlier than 1700.

@Tasilee
Collaborator Author

Tasilee commented May 15, 2019

Seems reasonable.

@ArthurChapman
Collaborator

Not #76, which should be 1753 (can't identify it before the nomenclatural system date), but what about #131 and #130 (we don't have that parameterized as written, but should we?). BTW, the Example in #130 doesn't look correct; I think the last part should be deleted (looks like it came from an amendment).

@ArthurChapman
Collaborator

And #121

@Tasilee
Collaborator Author

Tasilee commented Jun 12, 2019

#130 or #131 would not have a parameter.
#121 isn't relevant, surely?
#130 examples seem ok.

@Tasilee
Collaborator Author

Tasilee commented Jun 12, 2019

I have updated #36, #84 and #141.

@Tasilee
Collaborator Author

Tasilee commented Jun 12, 2019

Above comments from @tucotuco down should have been on #178?

@ArthurChapman
Collaborator

Woops - I think I meant #129

@tucotuco
Member

I have taken the liberty to edit the Parameter(s) to be explicit about what the parameters are and what their default values are. Was, "Default values = 1600 and current year". Changed to, "Default values: earliest year = 1600, latest year = current year".

@tucotuco
Member

Following the discussions arising from the event date case study for the BISS paper, I believe that this test should be updated to use the expected response of TG2-VALIDATION_YEAR_OUTOFRANGE (#84), which should then be deprecated. Everything else about this test is already up to date with the proposed change.

@ArthurChapman
Collaborator

I agree with this assessment, John. Once the wording is combined, then #84 deprecated. They overlap and I think #84 is redundant.

@tucotuco
Member

Updated to incorporate TG2_VALIDATION_YEAR_OUTOFRANGE (#84).

@chicoreus
Collaborator

The current wording of the specification is problematic for implementors. "cast" has a specific meaning in typed programming languages that is probably not what is meant here, and conflicts with the meaning of nonstandard.

This looks like it comes from bringing in the language from #84, which should probably be reopened and the parameters switched to there. The two tests are for different concepts. #84 tests if a validly formatted year is within range; #141 tests if a year is validly formatted and would go in parallel with an amendment which would convert a value that can be unambiguously interpreted as an integer into that integer. We had this set of parallel tests on the board in Gainesville, and it is important that we don't merge multiple concepts into single tests.

This test, and the accompanying amendment, should assert "unambiguously interpreted to be an integer", and should not include a parameter for range.

Test #84 can assert "cast as an integer" (meaning that the string "1" can be treated as the integer 1), and should include the range parameters.
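
One way to picture the separation argued for here is as two independent checks, one for format and one for range. This is a sketch under stated assumptions: the function names are hypothetical, and a plain signed decimal pattern is used as a stand-in for "unambiguously interpreted to be an integer", which the thread does not define precisely.

```python
import re
from typing import Optional

# Assumption: an optional sign followed by ASCII digits counts as an
# unambiguous integer; anything else (Roman numerals, "1.5BP") does not.
_INTEGER_PATTERN = re.compile(r"[+-]?[0-9]+")


def is_integer_year(value: Optional[str]) -> bool:
    """Format check only: no range parameters involved."""
    return value is not None and _INTEGER_PATTERN.fullmatch(value.strip()) is not None


def is_year_in_range(year: int, earliest: int, latest: int) -> bool:
    """Range check only: expects a value that already passed the format check."""
    return earliest <= year <= latest
```

Kept apart like this, a value such as "900" passes the format check regardless of dataset, and whether it is in range depends only on the bounds a given community supplies.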

@chicoreus
Collaborator

See note in #84, I think we should mark this test as do not implement, and use #84 instead.

@chicoreus
Collaborator

The name and dimension of this test no longer agree with the specification.

@Tasilee closed this as completed Aug 13, 2019
@chicoreus added the DO NOT IMPLEMENT label Aug 15, 2019
@ArthurChapman added the Supplementary label (Tests supplementary to the core test suite. These are tests that the team regarded as not CORE.) and removed the Test and DO NOT IMPLEMENT labels Sep 18, 2023
@chicoreus added the Test and DO NOT IMPLEMENT labels and removed the Supplementary and Test labels Sep 18, 2023
@Tasilee added the Test label Jan 14, 2024
@Tasilee changed the title from TG2-VALIDATION_YEAR_NOTSTANDARD to TG2-VALIDATION_YEAR_STANDARD Feb 22, 2024
@Tasilee
Collaborator Author

Tasilee commented Feb 22, 2024

Specifications updated to align with the current template

@Tasilee
Collaborator Author

Tasilee commented Apr 19, 2024

Corrected the TERM-ACTION from "YEAR_NOTATANDARD" to "YEAR_STANDARD"
