About Multi-image preprocessing, such as dino, voxelize and 3d vae #177

Open
LeeCASC opened this issue Feb 10, 2025 · 1 comment
Comments

@LeeCASC

LeeCASC commented Feb 10, 2025

I want to confirm the experimental process.
First, 150 images are fed into the DINOv2 model to obtain features of shape (150, 1024, 37, 37).
Then, through voxelization, the features are projected onto the image plane, projected back into voxel space, and averaged per voxel. Does this yield a feature of shape (1024, 64, 64, 64)? Does the averaging reduce the number of channels (1024)? And how are the resulting features filtered from 64x64x64 = 262,144 voxels down to 20,000?
Finally, the pre-trained 3D VAE is used to process the features. What is this pre-trained model, and what data was used to pretrain the 3D VAE?

@JeffreyXiang
Collaborator

Hi, we provide dataset_toolkits, which contains the DINO feature fusion process. The VAE is also trained on the 500K assets we filtered.
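
For readers following the shapes in the question, here is a minimal NumPy sketch of multi-view feature fusion. It is not taken from dataset_toolkits, so treat it only as an illustration. The function name `fuse_view_features`, the toy orthographic projection, and the occupancy grid are all hypothetical. It also assumes that the drop from 262,144 voxels to a much smaller set comes from keeping only occupied voxels, which the thread does not confirm. What the sketch does show is that averaging over views removes the view axis and leaves the channel axis untouched.

```python
import numpy as np

def fuse_view_features(feats, uv, valid):
    """Average per-view features onto voxels (illustrative, not the repo's code).

    feats: (V, C, H, W) per-view feature maps, e.g. (150, 1024, 37, 37).
    uv:    (V, N, 2) projected voxel centers per view, (x, y) in [-1, 1].
    valid: (V, N) bool, True where voxel n is visible in view v.
    Returns (N, C): the mean over the views that see each voxel.
    """
    V, C, H, W = feats.shape
    N = uv.shape[1]
    # Nearest-neighbour lookup for brevity; bilinear sampling is a common alternative.
    px = np.clip(np.round((uv[..., 0] + 1) / 2 * (W - 1)).astype(int), 0, W - 1)
    py = np.clip(np.round((uv[..., 1] + 1) / 2 * (H - 1)).astype(int), 0, H - 1)
    acc = np.zeros((N, C), dtype=np.float64)
    cnt = np.zeros((N, 1), dtype=np.float64)
    for v in range(V):
        sampled = feats[v][:, py[v], px[v]].T   # (N, C) features seen by view v
        mask = valid[v][:, None]                # (N, 1)
        acc += sampled * mask
        cnt += mask
    # Voxels seen by no view keep a zero feature instead of dividing by zero.
    return acc / np.maximum(cnt, 1)

# Toy sizes so the example runs quickly; the real numbers in the question are larger.
rng = np.random.default_rng(0)
V, C, H, W, R = 4, 8, 37, 37, 16
feats = rng.standard_normal((V, C, H, W))

# Assumption: only occupied voxels are kept, which is what shrinks R**3 to N.
occ = np.zeros((R, R, R), dtype=bool)
occ[4:12, 4:12, 4:12] = True
coords = np.argwhere(occ)                       # (N, 3) active voxel indices
centers = (coords + 0.5) / R * 2 - 1            # voxel centers in [-1, 1]^3

# Hypothetical orthographic cameras looking down z: every view sees the same (x, y).
uv = np.broadcast_to(centers[:, :2], (V, len(coords), 2))
valid = np.ones((V, len(coords)), dtype=bool)

fused = fuse_view_features(feats, uv, valid)
print(fused.shape)  # (512, 8): one C-dimensional feature per active voxel
```

Under these assumptions the result is a sparse set of N features with C channels each, not a dense (C, 64, 64, 64) grid. A real pipeline would replace the toy projection with the actual camera extrinsics and intrinsics of each rendered view.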
