Improve binary data definitions #430

Closed
webron opened this issue Jul 31, 2015 · 10 comments

webron commented Jul 31, 2015

Following the discussion in #50, for the next version of the spec we should provide a better way of defining the transfer of binary data. This should also cover multiple file upload (#254) and file transfer using other MIME types (#326).

evigeant commented Dec 1, 2015

As mentioned here: swagger-api/swagger-core#1531 (comment)

I would like the spec to support a binary format which would indicate that the data needs to be streamed. I believe a lot of APIs performing file upload and download would benefit from this, as the current `file` type is too restrictive (it only works for multipart upload and cannot be used for generated content).

I reproduce here my proposal for an additional string format:

Data Types

| Common Name | type | format | Comments |
| --- | --- | --- | --- |
| string | string | | |
| byte | string | byte | base64-encoded characters |
| binary | string | binary | any sequence of octets |
| blob | string | blob | a long sequence of octets (treated as a stream) |

The generators would then use an appropriate data type (e.g. `java.io.InputStream`) to represent these parameters.

The name `blob` is only a suggestion; it comes from the database world, but it does convey the notion of an object too large to be handled in memory.
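To make the proposal concrete, here is a hypothetical Swagger 2.0 fragment using the suggested format. Note that `format: blob` is the proposal above, not a value defined by the current spec, and the path and parameter names are illustrative:

```yaml
# Hypothetical: 'format: blob' is the proposed value, not part of Swagger 2.0.
paths:
  /reports/{id}/archive:
    get:
      summary: Download a large generated archive
      produces:
        - application/octet-stream
      parameters:
        - name: id
          in: path
          required: true
          type: string
      responses:
        '200':
          description: The archive contents, delivered as a stream
          schema:
            type: string
            format: blob   # hint to generators: map to a stream, e.g. java.io.InputStream
```

A generator seeing `format: blob` would emit a streaming type instead of `byte[]`, leaving `byte` and `binary` unchanged for in-memory payloads.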

webron commented Dec 1, 2015

Well, I'll copy my comment too ;)

I'm not sure the spec should deal with how things are handled, but rather with what they are. To me, in your case, `binary` and `blob` are the same. If you want something that's implementation-specific, that's where the extensions come in.

evigeant commented Dec 1, 2015

Sorry for the cross-posting, I looked around a lot in the issues to find problems similar to what I was having. I'll continue the discussion here only.

I understand your point, and I agree that in general the spec should document the data type rather than implementation details. However, this specific problem is quite common, and one could argue that binary data that cannot be handled in memory is different from binary data that can.

But the real issue is that an API for large file download cannot be represented well in Swagger today. If I specify `binary` as my result type, all (most?) clients will be generated using `byte[]`, which will not work because the data cannot be held entirely in memory. This happens regardless of any vendor extensions I put in the API specification. One of the major benefits of Swagger, namely generating client code, is therefore lost for any API doing file download or raw (non-multipart) file upload.

webron commented Dec 1, 2015

No worries about the cross-posting, and happy to see you moved the discussion to the right place.

Thanks for providing the additional details; they should be taken into account. The question is: can we really tell clients what's 'too big' for them to handle? I get the need to hint at it, we just need to be careful about how we end up documenting it.

evigeant commented Dec 1, 2015

Actually, if there are two data types, the API designers will have to make this decision based on their knowledge of what the API is used for.

I agree with you that if such a type were introduced, the spec would need to be clear about the difference and when to use `binary` vs `blob`. However, if you consider that a Swagger-generated client could run on a watch or other embedded device, 'too big' might not be that big at all.

In the limit, all binary data could be streamed. Then we would not need `blob` at all; `binary` could always be represented as a stream. But when small amounts of binary data are embedded inside a larger structure, this becomes inconvenient again (although one could use `byte` for that case).

To me, `byte` and `binary` are closer to each other than `binary` and `blob` are, and the only reason I see for keeping `byte` and `binary` as they are is to support existing APIs which send small amounts of binary data either base64-encoded or raw.
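For contrast, a sketch of the embedded case mentioned above, where a small base64 payload sits inside a larger JSON structure and `format: byte` is perfectly adequate (the definition and property names are illustrative, not from the thread):

```yaml
definitions:
  Avatar:
    type: object
    properties:
      filename:
        type: string
      thumbnail:
        type: string
        format: byte    # small base64-encoded payload; fine to hold in memory
```

Only a large standalone payload, such as a full-size image download, would need the streamed representation.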

webron commented Dec 1, 2015

So it looks like we're in sync on the challenges here. If you have any additional thoughts, those would be welcome. Hopefully, over time, more people will voice their opinions as well.

webron commented Mar 16, 2016

Parent issues: #565 and #579.

webron commented Jul 21, 2016

Tackling PR: #741

webron commented Feb 22, 2017

This was handled in #878.
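For reference, the shape this took in OpenAPI 3.0: raw binary payloads are described by pairing a media type with `type: string, format: binary`, and multiple file upload uses a `multipart` request body. The paths and property names below are illustrative:

```yaml
paths:
  /files:
    post:
      requestBody:
        content:
          application/octet-stream:   # raw (non-multipart) binary upload
            schema:
              type: string
              format: binary
  /files/batch:
    post:
      requestBody:
        content:
          multipart/form-data:        # multiple file upload (cf. #254)
            schema:
              type: object
              properties:
                files:
                  type: array
                  items:
                    type: string
                    format: binary
```

Whether a client streams such a payload or buffers it is left to the tooling rather than encoded in the format name.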

spacether commented Sep 7, 2022

Related #3024
