secret: add-debug-port/import-result for secret-arithmetic op #1550
base: main
Are you sure you want to change the base?
Conversation
I suppose the symlinks are fine, but I suspect this may behave badly with bazel in the long run. If possible, it would be better to have the rule's `srcs` point directly to the file, and you may need to use https://bazel.build/reference/be/functions#exports_files to make the file visible from the BUILD file of the source files, or else wrap the files in a `filegroup` rule and use it as a data dependency.
If you can't get either to work relatively quickly, we can submit as-is and I can patch it later.
I'll leave it for another PR.
tests/Examples/openfhe/ckks/dot_product_8f_debug/dot_product_8f_debug_test.cpp
For these I wonder if it makes sense to have a general "forward-propagate an attribute" pass that would do something like: given an attribute name, for each op that has that attribute attached to its results, propagate it forward to downstream user ops, unless the user op itself already has that attribute. Then you could call this pass to do cleanup whenever needed.
Comments addressed. I am still a little concerned about the artifact.

EDIT: I agree with such an approach. Let's leave it for another PR when we need it.
It is not hard to add a proper bazel rule for these. For one tool in isolation, it would be similar to what we already have.

The slightly trickier part is that you want to run the generated binary, capture the stdout to a file, and then use that file as an input to another rule. The easiest way to do this is probably to wrap the execution of the generated binary in a genrule, though you could also make a custom rule. The main reason for needing a rule is that any time a file is produced that needs to be passed between rules, the file itself must be declared as an output by one rule and as an input by another rule. Then you could chain all these rules together in a bazel macro.

I'm fine if we merge this as-is and do this in a followup. If you can't figure it out, I will probably have time to do it in early April.
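A rough sketch of the genrule chaining described above (all target and file names here are made up for illustration, not from this PR):

```starlark
# Hypothetical BUILD sketch: build the debug binary, then run it and
# declare its captured stdout as an output file that downstream rules
# can consume as an input.
cc_binary(
    name = "debug_main",
    srcs = ["main.cpp"],  # assumed driver source
)

genrule(
    name = "run_debug_main",
    outs = ["trace.log"],
    tools = [":debug_main"],
    cmd = "$(location :debug_main) > $@",
)
```

Because `trace.log` is a declared output, any other rule can list `:run_debug_main` (or `trace.log`) in its `srcs`, and a macro could stitch the whole chain together.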
Should be rebased after #1513 is in.
The idea behind this PR is that, with the plaintext backend, we can compare the plaintext backend's execution result with the CKKS execution result to measure the precision loss.
Knowing the precision loss is the first step towards implementing/validating a CKKS noise model.
However, as these two pipelines are distinct and we have two executables (a plaintext executable and a CKKS executable), we cannot run them in one line to get the result. Let's look at the design first.
Design
There are two passes added, `--secret-add-debug-port` and `--secret-import-execution-result`. The first is similar to `--lwe-add-debug-port`, while the second is responsible for importing the execution result of the executable back into the IR. I will show why this design is needed.

For the plaintext backend, it can now have a debug handler like `__heir_debug_tensor_8xf32_` to print intermediate values during execution. Such an execution result is then imported back into the IR via `--secret-import-execution-result=file-name=trace.log`, and the resulting IR is annotated with the imported values. This information can then be used by the CKKS backend to determine precision.
Why
A natural question is: why do we need to import the execution result in the first place? The reason is that further passes beyond `secret-arithmetic` will change the IR with additional operations (like mgmt ops and the `tensor.extract` lowering).

For example, in the plaintext backend for the following IR you will only see 3 results.

But when lowered, after `lwe-add-debug-port` there will be 7 printed lines. How do we know which line corresponds to which line when comparing? Only the compiler knows. So the compiler should annotate the plaintext result, and any further transformation should inherit the attribute.

Current known issues
- `secret-insert-mgmt` won't inherit the `plaintext.result` attribute from the previous op.
- `optimize-relinearization` will not preserve the attribute, as it just removes the relin op and re-inserts it.
- The `tensor.extract` lowering is too late, and we cannot simulate it in the plaintext backend.

About the plaintext backend
Currently, `a.out` in the plaintext backend is built by lit in a sandbox, so it cannot persist. I tried to build it with bazel (then we would get rid of the annoying `cc` issue), but then we do not have the `mlir_translate.bzl` and `llc.bzl` macros to use. So the dumping of results in the plaintext backend is ugly for now.

Example
The result would be:

```
# bazel test --test_output=all //tests/Examples/openfhe/ckks/dot_product_8f_debug/...
Input Precision lost: 2^-47.9
Input Precision lost: 2^-47.3
openfhe.mul_no_relin Precision lost: 2^-25.4
openfhe.relin
openfhe.add_plain Precision lost: 2^-15.3
openfhe.rot Precision lost: 2^-15.3
openfhe.rot Precision lost: 2^-25.4
openfhe.add Precision lost: 2^-15.3
openfhe.add Precision lost: 2^-15.3
openfhe.rot Precision lost: 2^-15.3
openfhe.add Precision lost: 2^-15.3
openfhe.add Precision lost: 2^-15.3
openfhe.rot Precision lost: 2^-15.3
openfhe.add Precision lost: 2^-15.3
openfhe.add Precision lost: 2^-15.3
openfhe.rot Precision lost: 2^-15.3
openfhe.add Precision lost: 2^-15.3
openfhe.mod_reduce
openfhe.mul_plain Precision lost: 2^1.32
openfhe.rot Precision lost: 2^-15.3
lwe.reinterpret_application_data
openfhe.mod_reduce
```
You can notice that there is no comparison result for mgmt ops like relin/mod-reduce, and the comparison for `tensor.extract` is wrong.