Recreating long submission bytestream datum #268

Closed
theeldermillenial opened this issue Sep 20, 2023 · 15 comments · Fixed by #269

Comments

@theeldermillenial
Contributor

Describe the bug
I am trying to submit to a contract. One of the bytestrings in the datum is more than 64 bytes long. Every time I submit to Blockfrost, I get an error back saying that a bytestring cannot be longer than 64 bytes. I narrowed it down to the bytestring that is more than 64 bytes long.

I have even gone so far as to copy the hexstring from the application of interest, ingest it with pycardano, sign, and submit, and I get the same error. I am certain it is not a configuration error on my part, unless the transaction is being parsed incorrectly on ingestion.

I guess I'm wondering what the issue could be here. Blockfrost? Is the application using an invalid bytestring length? If so, how are they able to submit this? Could Pycardano be sending the transaction incorrectly to Blockfrost?

Logs
Here is a reference transaction:
https://preprod.cardanoscan.io/transaction/a165c606d2600324b51b15d138bb91eb6bef46e33d51eaebbc7d78e0702f12c1

And the associated datum:
https://preprod.cardanoscan.io/datumInspector?datum=d8799f581c100bba42156945783bfc45ad9b0f9969a5fc34d186914cd528f439bf581cb92203c0c15c0b0da3e7ac99d48de85cad1beda530b8f54a12e9e33b584024ccdb90a49a970539327b45d585624d027f1d4c05c521a2c9cbe805d3d02cb74c08f25d32f5cc923e22badd041902424d5dbe53886f349adac7912b89a095075f584066756e64696e672d622758205c7831643365715c7864625c786663635c7839322c5c783837735c783865795c7864324a5c7839665c7865345c7864355c78656158295c7831393a5c783935365c7839665c786662745c7831665c7863645c7831335c7831622e5c78633127ff1a02747cd00a01ff

@nielstron
Contributor

It might boil down to how the bytestring is serialized internally. The Cardano node specification demands that any occurring bytestring be transferred as a list of bytestrings, each at most 64 bytes long. You cannot see this difference in serialization in the cardanoscan datum inspector, but you can in more detailed tools such as cbor.nemo157.com/
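
For readers following along, here is a minimal standard-library sketch of what "a list of bytestrings at most 64 bytes long" looks like at the CBOR level: a bytestring that fits in 64 bytes is written with a normal definite-length header, and a longer one is written as an indefinite-length bytestring (0x5f, then chunks, then the 0xff break). The helper names `cbor_bytes_head` and `encode_bytes_chunked` are hypothetical and are not part of pycardano or cbor2; this only illustrates the wire format, not how either library implements it.

```python
def cbor_bytes_head(length: int) -> bytes:
    """Return the CBOR head for a definite-length byte string (major type 2)."""
    if length < 24:
        return bytes([0x40 | length])
    if length < 0x100:
        return bytes([0x58, length])
    if length < 0x10000:
        return b"\x59" + length.to_bytes(2, "big")
    if length < 0x100000000:
        return b"\x5a" + length.to_bytes(4, "big")
    return b"\x5b" + length.to_bytes(8, "big")


def encode_bytes_chunked(data: bytes, chunk_size: int = 64) -> bytes:
    """Encode data as one definite-length byte string if it fits in chunk_size,
    otherwise as an indefinite-length byte string made of chunk_size pieces."""
    if len(data) <= chunk_size:
        return cbor_bytes_head(len(data)) + data
    out = bytearray(b"\x5f")  # start of an indefinite-length byte string
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out += cbor_bytes_head(len(chunk)) + chunk
    out += b"\xff"  # CBOR "break" stop code
    return bytes(out)


short_enc = encode_bytes_chunked(b"\x01" * 28)
long_enc = encode_bytes_chunked(b"\x02" * 100)
print(short_enc[:2].hex())  # 581c
print(long_enc[:3].hex())   # 5f5840
```

The `581c` prefix is the same header that appears in front of the 28-byte fields in the datum above, while the long field is wrapped in `5f ... ff`.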

@theeldermillenial
Contributor Author

Well, that's fun. Definitely bookmarked that link. Based on what I'm seeing there versus what I see in pycardano, it definitely looks different. I was wondering if maybe this was the issue, because I have made other transactions with lists of bytes.

I'll experiment and report back. I guess this could turn into a feature request.

@theeldermillenial
Contributor Author

Okay, looks like that made some progress. What I did was make the type a List[bytes], but that was clearly the wrong approach because it was actually serialized as a list. I think I'll play around with things in that tool you shared, and in the worst-case scenario I'll create a hacky way to break that message up. Or maybe I'll do it the right way and open a PR.
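
To illustrate why a list type does not work here, the sketch below builds three encodings of the same two chunks by hand: a definite-length CBOR array, an indefinite-length array (0x9f), and an indefinite-length bytestring (0x5f). Arrays are CBOR major type 4 and bytestrings are major type 2, so a decoder sees a list in the first two cases and a single bytestring in the third. The helper `bstr` is hypothetical and only handles chunks shorter than 256 bytes.

```python
def bstr(chunk: bytes) -> bytes:
    """Definite-length CBOR byte string; only supports chunks under 256 bytes."""
    n = len(chunk)
    if n >= 256:
        raise ValueError("this sketch only handles chunks shorter than 256 bytes")
    head = bytes([0x40 | n]) if n < 24 else bytes([0x58, n])
    return head + chunk


chunks = [b"\xaa" * 64, b"\xbb" * 10]
body = b"".join(bstr(c) for c in chunks)

# Definite-length array of two byte strings (major type 4, length 2).
as_array = bytes([0x80 | len(chunks)]) + body
# Indefinite-length array: still a list, just without an up-front length.
as_indef_array = b"\x9f" + body + b"\xff"
# Indefinite-length byte string: one logical bytes value sent in pieces.
as_chunked_bytes = b"\x5f" + body + b"\xff"

for name, enc in [("array", as_array),
                  ("indefinite array", as_indef_array),
                  ("chunked bytes", as_chunked_bytes)]:
    print(f"{name}: first byte 0x{enc[0]:02x}, {len(enc)} bytes total")
```

The last two encodings differ only in their first byte, which is why the output can look almost right in a hex dump and still be rejected.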

@nielstron
Contributor

There is a custom CBOR encoder in pycardano that splits up bytestrings correctly... maybe it's just not invoked correctly in your case?

@theeldermillenial
Contributor Author

It could be. Basically, I get a bytestring from an application. Then I set a PlutusData value to that bytestring.

Here's a minimal reproducible example.

from dataclasses import dataclass

from pycardano import PlutusData


@dataclass
class LongBytesTest(PlutusData):
    CONSTR_ID = 0
    long_bytes: bytes


datum = LongBytesTest(
    long_bytes=b"funding-b'X z\\xef6\\x0c\\x12\\xac\\x17\\x12\\xadv@i\\x03\\x91\\x0c\\x02\\xce\\xa0CGG\\x01\\xd3\\xcf\\xbd\\x9d\\xa3\\x7f\\xa2\\x0e\\t\\x03'"
)

print(datum.to_cbor_hex())

That evaluates to a bytestring that is 115 bytes long.

@theeldermillenial
Contributor Author

theeldermillenial commented Sep 20, 2023

Taking it a step further, a round trip of the datum does not appear to succeed:

from pycardano import PlutusData
original_cbor = "d8799f581c100bba42156945783bfc45ad9b0f9969a5fc34d186914cd528f439bf581ca61d35467c816cceeeadbe1ab8c1ca63402c5b1f4a02b234965069e65840a6ac3ef82abcb42647751de776a9b75b9dee8cfc072873b9d523169e45a0786830a3ede0ae5c2153265e0a500e2ee590737b712d3e6372888462b483933f39015f584066756e64696e672d622758205c7831617b485c78386533235c7831395c7838625c786434755c7862367a525c7839315c783835265c783030525c7830305c78385820625c7865375c7862665c7863355c7830623d435c7864305c7862386925334627ff1a027302e60a01ff"
datum = PlutusData.from_cbor(original_cbor)

# This produces an assertion error
assert original_cbor == datum.to_cbor_hex()

@theeldermillenial
Contributor Author

Actually... to_cbor_hex appears to generate an empty CBOR object...

@theeldermillenial
Contributor Author

Okay, so it looks like there is an IndefiniteList, and I can create something that resembles what it should look like, but it's not quite right. It looks like we might need an IndefiniteByteString or something. I can manually change the cbor to get it to be correct.

@theeldermillenial
Contributor Author

Alright, I think I have it figured out. I tracked down the way things are being serialized in PlutusData and ArrayCBORSerializable. There does not appear to be any special handling of long bytes objects, but there should be. Somewhere.

I can open a PR for this because I am fairly certain of how to fix it; the question I have is where to put the fix. To me, it seems incorrect to put it in the ArrayCBORSerializable class, because there may be places where long bytes are permitted? I'm still kind of new to all this, but based on this article it seems like the bytestring length limit applies only to metadata:
https://developers.cardano.org/docs/get-started/cardano-serialization-lib/transaction-metadata/

It seems to me that a custom encoder might be needed? That feels wrong though. If you can point me to where this should be implemented, I'll open a PR. This should be an easy thing.

@theeldermillenial
Contributor Author

Based on this, it looks like the 64-byte limit should only be applied to PlutusData, so maybe a custom/modified version of the default_encoder is the right approach.
IntersectMBO/cardano-ledger#2216

@theeldermillenial
Contributor Author

Alright, sorry for the spam. I created an alternative encoder for PlutusData, but I see that getting this integrated properly is challenging. If I create the PlutusData and encode to cbor, it gives me the expected result. However, when I put that in a transaction, it falls back to the default encoder. I'll leave this here in case you might find it useful. However, I think this does amount to an actual "bug" in pycardano in that metadata bytestrings should be broken up.

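from typing import Union

from cbor2 import CBOREncoder

# Import path assumed from the names used below; adjust to your pycardano version.
from pycardano.serialization import (
    CBORSerializable,
    IndefiniteFrozenList,
    IndefiniteList,
    default_encoder,
)


def plutus_encoder(
    encoder: CBOREncoder, value: Union[CBORSerializable, IndefiniteList]
):
    """Overload for default_encoder to properly break up bytestrings."""
    if isinstance(value, (IndefiniteList, IndefiniteFrozenList)):
        # Currently, cbor2 doesn't support indefinite list, therefore we need special
        # handling here to explicitly write header (b'\x9f'), each body item, and footer (b'\xff') to
        # the output bytestring.
        encoder.write(b"\x9f")
        for item in value:
            if isinstance(item, bytes) and len(item) > 64:
                # Long bytestring: write the indefinite bytestring header (b'\x5f'),
                # then 64-byte chunks, then the break byte (b'\xff').
                encoder.write(b"\x5f")
                for i in range(0, len(item), 64):
                    imax = min(i + 64, len(item))
                    encoder.encode(item[i:imax])
                encoder.write(b"\xff")
            else:
                encoder.encode(item)
        encoder.write(b"\xff")
    else:
        default_encoder(encoder, value)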

@nielstron
Contributor

I am wondering... there is a custom default encoder. Is it incorrectly called? Does it not perform the bytestring mapping that would be needed?

https://github.com/Python-Cardano/pycardano/blob/main/pycardano/serialization.py#L156

@nielstron
Contributor

nielstron commented Sep 21, 2023

Or maybe we need to call this encoder for PlutusData. It should handle encoding correctly. But I recall copying it/parts of it from somewhere else in pycardano.

https://github.com/OpShin/uplc/blob/448f634cc1225de6dd7390b670b01396d2e71156/uplc/ast.py#L430

@theeldermillenial
Contributor Author

In response to your first message, I looked at that and it is missing the indefinite bytestring header that I included in my plutus_encoder (the 0x5f byte header). The default_encoder appears to be called correctly, it's just missing the special catch for long bytestrings on plutus data.

In response to your second message, that looks like it would do the trick. However, I don't recognize those classes as a part of pycardano, so I suspect some tooling would be involved. I guess I could study that and figure out how they are calling it. I've been meaning to start learning opshin, so this might be a good first exercise.
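
As a rough way to check for the 0x5f header mentioned above, the following standard-library sketch reads a CBOR bytestring at a given offset and reports whether it was written in definite or indefinite (chunked) form. The function names `_read_definite` and `read_byte_string` are hypothetical, and the sketch assumes well-formed input; it is not how pycardano or cbor2 decode data.

```python
from typing import Tuple


def _read_definite(buf: bytes, pos: int) -> Tuple[bytes, int]:
    """Read one definite-length byte string; return (payload, next_pos)."""
    first = buf[pos]
    if first >> 5 != 2:
        raise ValueError(f"not a byte string at offset {pos}: 0x{first:02x}")
    info = first & 0x1F
    pos += 1
    if info < 24:
        length = info
    elif info <= 27:
        width = 1 << (info - 24)  # 1, 2, 4, or 8 length bytes
        length = int.from_bytes(buf[pos:pos + width], "big")
        pos += width
    else:
        raise ValueError("expected a definite-length byte string")
    return buf[pos:pos + length], pos + length


def read_byte_string(buf: bytes, pos: int = 0) -> Tuple[bytes, bool, int]:
    """Return (payload, was_indefinite, next_pos) for the byte string at pos."""
    if buf[pos] == 0x5F:
        pos += 1
        chunks = []
        while buf[pos] != 0xFF:  # 0xff is the "break" that ends the chunks
            chunk, pos = _read_definite(buf, pos)
            chunks.append(chunk)
        return b"".join(chunks), True, pos + 1
    payload, pos = _read_definite(buf, pos)
    return payload, False, pos


chunked = b"\x5f" + b"\x58\x40" + b"a" * 64 + b"\x43" + b"bcd" + b"\xff"
plain = b"\x58\x1c" + b"z" * 28

payload, indefinite, end = read_byte_string(chunked)
print(len(payload), indefinite, end)  # 67 True 72
payload2, indefinite2, end2 = read_byte_string(plain)
print(len(payload2), indefinite2, end2)  # 28 False 30
```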

@theeldermillenial
Contributor Author

Pretty sure I figured it out. See the PR I just opened.
