ocrd_zip: drop Manifestation-Depth, disallow fetch.txt #182

Merged Apr 28, 2022 (1 commit)


kba (Member) commented May 31, 2021

This PR removes the Ocrd-Manifestation-Depth parameter and disallows the fetch.txt mechanism.
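To illustrate what the change means for consumers, here is a minimal sketch of a check that flags both removed mechanisms in an unpacked bag. The function name `find_violations` is hypothetical and not part of any OCR-D tool; the sketch assumes the usual BagIt layout, with `fetch.txt` and `bag-info.txt` at the bag root, and compares tag labels case-insensitively as a cautious assumption.

```python
from pathlib import Path


def find_violations(bag_dir):
    """Return a list of problems found in an unpacked bag directory.

    Hypothetical helper for illustration only, not the OCR-D validator.
    """
    bag_dir = Path(bag_dir)
    problems = []

    # fetch.txt would let the bag reference payload files it does not contain.
    if (bag_dir / "fetch.txt").exists():
        problems.append("fetch.txt is not allowed")

    bag_info = bag_dir / "bag-info.txt"
    if bag_info.is_file():
        for line in bag_info.read_text(encoding="utf-8").splitlines():
            # Tag lines look like "Label: value"; lines starting with
            # whitespace continue the previous value and are skipped.
            if line[:1].isspace() or ":" not in line:
                continue
            label = line.split(":", 1)[0].strip()
            if label.lower() == "ocrd-manifestation-depth":
                problems.append("Ocrd-Manifestation-Depth is no longer supported")

    return problems
```

An empty result means neither mechanism is present, i.e. the bag is meant to be a complete manifestation on its own.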

We introduced these to allow for iterative ingestions of only the changed files into OCR-D GT endpoints like the OCR-D GT Repo or OLA-HD.

However, I think this flexibility is a premature optimization. Yes, workspaces can become very large and ingesting full manifestations for every update is inefficient. But bandwidth has not been an issue so far and it will be difficult to map these mechanisms to the (messy) real-life data we want to process, e.g. with hard-to-categorize @xlink:href (think file:/ URL from Goobi/Kitodo).

I therefore think it would be better if we focussed on packaging all the OCR-D produced data in a well-defined way and ensure that data consumers don't have to do any extra steps (that might fail!) to create a complete manifestation. "What you see is what you get" is more important than maximum efficiency for re-ingestion.

We should still have a mechanism for updating an ingested OCRD-ZIP, but we should find a more efficient and less ambiguous way than POSTing an incomplete bag (such as a set of patches against the contents of the OCRD-ZIP or an API to PUT specific results in the OCRD-ZIP).

cneud (Member) commented Apr 27, 2022

> this flexibility is a premature optimization

I wholeheartedly agree. Let's keep it simple but safe.

If there are no objections from the perspective of OLA-HD this has my full support.

@cneud cneud mentioned this pull request Apr 27, 2022
kba (Member, Author) commented Apr 28, 2022

I don't think there's a reason not to change this. Will merge and also fix the URL to the profile (which has been broken for a while, ht @joschrew).

@kba kba merged commit 4818174 into master Apr 28, 2022
@kba kba deleted the simplify-bagit branch September 6, 2022 15:46