
Barracuda + MiDaS v2 #187


Closed
GeorgeAdamon opened this issue Jun 6, 2021 · 17 comments

Comments


GeorgeAdamon commented Jun 6, 2021

Hello, I'm trying to run Intel's MiDaS v2 monocular-image-to-depth model, which the authors provide in .onnx format. Here's my environment:

Platform            Version
Unity               2021.1.2f1
com.unity.barracuda 1.0.4 - 2.1.0 preview

The model loads fine in Unity, without warnings, and I generate my tensors by providing a texture in the right format (RGB24, 384 x 384). However, when I try to execute the model, I get this error pointing to the DepthwiseConv2D operator:

NotImplementedException: The method or operation is not implemented.

Unity.Barracuda.ReferenceCPUOps.DepthwiseConv2D (Unity.Barracuda.Tensor X, Unity.Barracuda.Tensor K, Unity.Barracuda.Tensor B, System.Int32[] stride, System.Int32[] pad, Unity.Barracuda.Layer+FusedActivation fusedActivation) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/BarracudaReferenceCPU.cs:412)

Unity.Barracuda.ReferenceComputeOps.DepthwiseConv2D (Unity.Barracuda.Tensor X, Unity.Barracuda.Tensor K, Unity.Barracuda.Tensor B, System.Int32[] stride, System.Int32[] pad, Unity.Barracuda.Layer+FusedActivation fusedActivation) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/BarracudaReferenceCompute.cs:920)

Unity.Barracuda.ComputeOps.DepthwiseConv2D (Unity.Barracuda.Tensor X, Unity.Barracuda.Tensor K, Unity.Barracuda.Tensor B, System.Int32[] stride, System.Int32[] pad, Unity.Barracuda.Layer+FusedActivation fusedActivation) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/BarracudaCompute.cs:925)

Unity.Barracuda.PrecompiledComputeOps.DepthwiseConv2D (Unity.Barracuda.Tensor X, Unity.Barracuda.Tensor K, Unity.Barracuda.Tensor B, System.Int32[] stride, System.Int32[] pad, Unity.Barracuda.Layer+FusedActivation fusedActivation) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/BarracudaPrecompiledCompute.cs:661)

Unity.Barracuda.StatsOps.Unity.Barracuda.IOps.DepthwiseConv2D (Unity.Barracuda.Tensor X, Unity.Barracuda.Tensor K, Unity.Barracuda.Tensor B, System.Int32[] stride, System.Int32[] pad, Unity.Barracuda.Layer+FusedActivation fusedActivation) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/StatsOps.cs:79)

Unity.Barracuda.GenericWorker+<StartManualSchedule>d__30.MoveNext () (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/GenericWorker.cs:221)

Unity.Barracuda.GenericWorker.Execute () (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/GenericWorker.cs:121)

Unity.Barracuda.GenericWorker.Execute (Unity.Barracuda.Tensor input) (at Library/PackageCache/com.unity.barracuda@1.0.4/Barracuda/Runtime/Core/Backends/GenericWorker.cs:115)

The error seems to be triggered here:

public virtual Tensor DepthwiseConv2D(Tensor X, Tensor K, Tensor B, int[] stride, int[] pad, Layer.FusedActivation fusedActivation)
    {
        if (K.kernelDepth != 1)
            throw new NotImplementedException();

Is this genuinely something that is not supported at the moment, or does the .onnx model need to be transformed slightly for it to play nicely with Barracuda?

Any help would be appreciated!

@FlorentGuinier

Hi @GeorgeAdamon, thanks for reporting; we will look into it!

@GeorgeAdamon
Author

Thanks in advance for your time @FlorentGuinier!

@FlorentGuinier

FlorentGuinier commented Jun 7, 2021

Sure! I was looking into converting the model to reproduce the issue; however, at least on my system, that turned out to be a bit of a rabbit hole.
Could you please share the .onnx file (either here or via barracuda-support @ unity3d.com)?

PS: random weights are fine.

@GeorgeAdamon
Author

GeorgeAdamon commented Jun 7, 2021

@FlorentGuinier

FlorentGuinier commented Jun 8, 2021

Thanks!

So the problem here is that some convolutions in the model use a "group" value that is neither 1 nor the input channel count. At the moment we only support those two variants (i.e., regular convolution with group == 1, and depthwise convolution with group == input channel count).

I will clarify the error message; however, the real question here is "Do you need depthwise convolution where group != input channel count?" If so, please let us know your use case and deadline and we will open a feature request.
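In other words, the supported cases can be sketched as a small check. This is an illustrative sketch only; the `classify_conv_group` helper below is hypothetical, not Barracuda source code:

```python
def classify_conv_group(group: int, in_channels: int) -> str:
    """Classify a convolution by its ONNX 'group' attribute.

    Hypothetical helper mirroring the support rule described above;
    not actual Barracuda code.
    """
    if group == 1:
        return "regular"        # ordinary convolution
    if group == in_channels:
        return "depthwise"      # one filter per input channel
    return "unsupported"        # grouped conv with 1 < group < in_channels

# MiDaS v2 (large) contains convolutions in this last category,
# which is what triggers the NotImplementedException.
print(classify_conv_group(1, 64))   # regular
print(classify_conv_group(64, 64))  # depthwise
print(classify_conv_group(8, 64))   # unsupported
```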

@GeorgeAdamon
Author

Thanks for your reply @FlorentGuinier!

To be honest, I am not familiar with the design philosophy of this model, so I don't know what led the authors to this kind of convolution or whether the approach is strictly necessary. I can try raising this with Intel ISL, linking to this issue.

The use case is an academic one: a few groups of UCL interactive design students rely on this algorithm to sense depth using plain smartphone cameras (not LiDAR iPhones). Their tests work with the offline Python scripts, but their projects are real-time 3D, so the Python workflow would add significant overhead. They would ideally like to have something presentable by the end of June (intermediate crits).

@GeorgeAdamon
Author

@FlorentGuinier as you can see in isl-org/MiDaS#113, this architectural feature is crucial for the MiDaS model.
I will try their workaround and report back.

Are grouped convolutions something that Unity Barracuda would be interested in supporting?
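For reference, the equivalence behind such workarounds can be sketched in NumPy: a grouped convolution is the same as a regular convolution with block-diagonal weights, so it can be decomposed into a channel split, one regular convolution per group, and a concatenation. Below is a minimal sketch using 1x1 convolutions; it is illustrative only and may differ from the actual workaround discussed in isl-org/MiDaS#113:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 6 input channels, 4 output channels, 2 groups, 1x1 kernels.
c_in, c_out, groups, h, w = 6, 4, 2, 3, 3
cin_g, cout_g = c_in // groups, c_out // groups

x = rng.standard_normal((c_in, h, w))     # input feature map (C, H, W)
wg = rng.standard_normal((c_out, cin_g))  # grouped 1x1 conv weights

def conv1x1(x, weights):
    # A regular 1x1 convolution is a matrix multiply over the channel axis.
    return np.einsum("oc,chw->ohw", weights, x)

# Path 1: the grouped conv expressed as ONE regular conv whose weight
# matrix is block-diagonal (each output group only sees its input group).
w_full = np.zeros((c_out, c_in))
for g in range(groups):
    w_full[g*cout_g:(g+1)*cout_g, g*cin_g:(g+1)*cin_g] = wg[g*cout_g:(g+1)*cout_g]
grouped = conv1x1(x, w_full)

# Path 2: split channels, run a regular conv per group, then concatenate.
parts = [conv1x1(x[g*cin_g:(g+1)*cin_g], wg[g*cout_g:(g+1)*cout_g])
         for g in range(groups)]
decomposed = np.concatenate(parts, axis=0)

assert np.allclose(grouped, decomposed)  # the two paths agree
```

The split/conv/concat form only uses operators Barracuda already supports, at the cost of a larger graph and some overhead per group.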

@FlorentGuinier

FlorentGuinier commented Jun 8, 2021

Unfortunately we have no immediate plan to support depthwise conv with group != channel count at the moment; also, performance might be hard to achieve with the current implementation for a real-time use case on mobile. Have you considered XRFoundation for the depth estimation?

@GeorgeAdamon
Author

GeorgeAdamon commented Jun 8, 2021

https://github.com/GeorgeAdamon/monocular-depth-unity

Followed @ranftlr's advice and used an alternative, smaller model. Works like a charm!

It's absolutely worth doing performance/quality comparisons on mobile between MiDaS & XRFoundation.

Thanks a lot for your help @FlorentGuinier

@wtesler

wtesler commented Dec 25, 2021

Still a shame that the larger MiDaS model doesn't work with Barracuda. I'm using it on desktop, and the PC is more than capable of running inference on that model in real time. For now I'm stuck with the MiDaS small model... Hopefully MiDaS 3 will work with Barracuda.

@FlorentGuinier

Hi @wtesler,
Indeed, for desktop the larger MiDaS model makes sense. If you need the improved accuracy for your project, this is something we can look into. By the way, why MiDaS in particular? I haven't dug in, but it seems there are quite a few other depth estimation papers out there. Finally, this might be of interest to you: https://github.com/compphoto/BoostingMonocularDepth

Happy new year!
Florent

@wtesler

wtesler commented Jan 3, 2022

Hi @FlorentGuinier,
Thank you for the response. MiDaS is nice because it runs in real time and the authors provide their models as ONNX files, which makes it very accessible for me to integrate into Unity. I have seen alternative models for monocular depth estimation, such as LeReS and this one from Facebook: https://github.com/facebookresearch/consistent_depth, but MiDaS is the only one that can run in real time with decent quality.

I am hoping that an ONNX distribution of MiDaS 3 comes out soon; it is based on something called a Dense Prediction Transformer (so totally different from MiDaS 2). In the meantime, using the larger MiDaS 2 model would be my short-term goal.

@FlorentGuinier

Interesting!
So I guess, depending on your project deadlines and the performance of the new models on your target hardware, it could be good either to wait for MiDaS v3 (they report a 21% accuracy increase) or to go with the large MiDaS 2.1 model.

Back to square one :)

  • If you need the depthwise conv with group != channel count feature, please let us know your project deadline so we can prioritize extending depthwise conv support.
  • If MiDaS 3 makes more sense for your project, please note that there is a risk it might require work on our side; in that case, we will be ready to help for sure.

Florent

@wtesler

wtesler commented Jan 4, 2022

@FlorentGuinier I would say the depthwise conv would be nice, and other people may benefit from it, because MiDaS is excellent for AR applications and thus a natural fit for Unity. I don't have a specific deadline for the project because the smaller MiDaS model at least works to some degree; it's just a matter of quality improvement to see the larger model work.

@FlorentGuinier

I'm adding this to the backlog, thanks for your feedback.

@GeorgeAdamon
Author

@FlorentGuinier Is there any status update on this item (depthwise conv)?
It seems MiDaS .onnx support is stuck at 2.1, unfortunately.

@AlexRibard
Collaborator

I would recommend signing up for Sentis.
It is the newest, improved version of Barracuda and would run MiDaS without any trouble.


4 participants