How to process SDF data for 3D generative models

This is the simplest process I have found for preparing SDF data, which is a necessary step for training 3D generative models. It does not require an explicit watertight conversion. My projects 3DILG, 3DShape2VecSet, Functional Diffusion, and LaGeM are all based on this code.

🌏 Environment Setup

You can skip any step whose packages you have already installed.

# install pytorch
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia

# install nvcc
conda install cuda-nvcc=12.4 -c nvidia

# install the necessary development packages
conda install libcusparse-dev
conda install libcublas-dev
conda install libcusolver-dev

# install mesh processing packages (either one can be used; the snippets below load with trimesh and sample with point_cloud_utils)
pip install point_cloud_utils
pip install trimesh

# compile the CUDA code (adapted from DualSDF)
cd mesh2sdf2_cuda
python setup.py install

💾 Mesh loading

The following snippet loads the mesh vertices and faces into v and f,

import trimesh

mesh = trimesh.load(mesh_path, skip_materials=True, process=False, force='mesh')
v = mesh.vertices
f = mesh.faces

📝 Mesh normalization

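First, normalize all meshes consistently. You can use either sphere normalization, which centers the mesh and scales it so that the farthest vertex lies on the unit sphere,

import numpy as np

shifts = (v.max(axis=0) + v.min(axis=0)) / 2  # bounding-box center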
v = v - shifts
distances = np.linalg.norm(v, axis=1)
scale = 1 / np.max(distances)
v *= scale

or box normalization, which scales the centered mesh so that its largest absolute coordinate is 0.99 (just inside [-1, 1]),

shifts = (v.max(axis=0) + v.min(axis=0)) / 2
v = v - shifts
scale = (1 / np.abs(v).max()) * 0.99
v *= scale

I use box normalization in 3DILG, 3DShape2VecSet, and Functional Diffusion. In my latest work LaGeM, I use sphere normalization.
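
The two normalizations above can be wrapped in one helper, which also makes it easy to map results back to the original coordinates later. This is only a sketch: normalize_vertices and denormalize_points are hypothetical names that do not exist in the repository, and the margin parameter simply mirrors the 0.99 factor used above.

```python
import numpy as np

def normalize_vertices(v, mode="box", margin=0.99):
    """Center vertices on the bounding-box midpoint and rescale them.

    mode="sphere": the farthest vertex ends up on the unit sphere.
    mode="box": the largest absolute coordinate becomes `margin`.
    Returns the normalized copy plus the shifts and scale that were applied.
    """
    v = np.asarray(v, dtype=np.float64)
    shifts = (v.max(axis=0) + v.min(axis=0)) / 2
    centered = v - shifts
    if mode == "sphere":
        scale = 1.0 / np.linalg.norm(centered, axis=1).max()
    elif mode == "box":
        scale = margin / np.abs(centered).max()
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return centered * scale, shifts, scale

def denormalize_points(p, shifts, scale):
    """Map normalized points back to the original mesh coordinates."""
    return np.asarray(p) / scale + shifts
```

Returning shifts and scale alongside the vertices keeps them available for the np.savez call at the end of the pipeline.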

🔨 Processing

In this section, we process the meshes. There are multiple types of points:

- Surface points: sampled on the mesh surface.
- Labeled points: sampled within the bounding volume and labeled with signed distances.
  - Volume points: uniformly sampled in the bounding volume.
  - Near-surface points: sampled in the near-surface region. They are obtained by jittering surface points.

We begin by determining how many points to sample.

N_vol = 250000 # volume points
N_near = 125000 # surface points; each is jittered at two noise scales, giving 2 * N_near near-surface points

Surface points sampling.

import point_cloud_utils as pcu

fid, bc = pcu.sample_mesh_random(v, f, N_near)
surface_points = pcu.interpolate_barycentric_coords(f, fid, bc, v) # N_near x 3

Volume points sampling.

If we are using box normalization,

vol_points = np.random.rand(N_vol, 3) * 2 - 1

If we are using sphere normalization,

vol_points = np.random.randn(N_vol, 3)
vol_points = vol_points / np.linalg.norm(vol_points, axis=1)[:, None] * np.sqrt(3)
vol_points = vol_points * np.power(np.random.rand(N_vol), 1. / 3)[:, None]
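
The cube root in the last line is what makes the samples uniform in volume. The volume inside radius r grows as r^3, so drawing u uniformly from [0, 1] and using u^(1/3) as the relative radius places the right fraction of points at each depth. A minimal sketch of the same idea as a reusable function follows. sample_uniform_ball is a hypothetical name, and the default radius just mirrors the sqrt(3) used above.

```python
import numpy as np

def sample_uniform_ball(n, radius=np.sqrt(3), rng=None):
    """Draw n points uniformly from a solid ball centered at the origin."""
    rng = np.random.default_rng() if rng is None else rng
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    directions = rng.standard_normal((n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Cube root of a uniform variable gives radii that are uniform in volume.
    radii = radius * np.cbrt(rng.random(n))
    return directions * radii[:, None]
```

A quick sanity check is that about one eighth of the points should fall within half the radius, since (1/2)^3 = 1/8.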

Near-surface points sampling.

near_points = [
    surface_points + np.random.normal(scale=0.005, size=(N_near, 3)),
    surface_points + np.random.normal(scale=0.05, size=(N_near, 3)),
]
near_points = np.concatenate(near_points)

Calculation of signed distances

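We transfer the mesh data to the GPU (using CUDA).

import torch

v = torch.from_numpy(v).float().cuda()
f = torch.from_numpy(f).long().cuda()  # integer indices into v
mesh = v[f]  # (num_faces, 3, 3): the three corner coordinates of every triangle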

The package mesh2sdf is adapted from DualSDF.

import mesh2sdf

vol_points = torch.from_numpy(vol_points).float().cuda()
vol_sdf = mesh2sdf.mesh2sdf_gpu(vol_points, mesh)[0].cpu().numpy()

near_points = torch.from_numpy(near_points).float().cuda()
near_sdf = mesh2sdf.mesh2sdf_gpu(near_points, mesh)[0].cpu().numpy()

Save data

np.savez(
    save_filename, 
    shifts=shifts,
    scale=scale,
    vol_points=vol_points.cpu().numpy().astype(np.float32),
    vol_sdf=vol_sdf.astype(np.float32), 
    near_points=near_points.cpu().numpy().astype(np.float32), 
    near_sdf=near_sdf.astype(np.float32), 
    surface_points=surface_points.astype(np.float32),
)
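
To read a saved file back, something like the following may be convenient. It is a minimal sketch: load_sdf_sample and to_original_frame are hypothetical helper names that are not part of the repository, and the only assumption is that the file was written with the np.savez call above. Because the normalization computes (v - shifts) * scale, points map back as p / scale + shifts and distances as d / scale.

```python
import numpy as np

def load_sdf_sample(path):
    """Load every array stored in one .npz file into a plain dict."""
    with np.load(path) as data:
        return {key: data[key] for key in data.files}

def to_original_frame(points, sdf, shifts, scale):
    """Undo the normalization for points and their signed distances.

    Positions are unscaled and then shifted back. Distances are only
    unscaled, because a translation does not change them.
    """
    return points / scale + shifts, sdf / scale
```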

🍴 Data Split

The data splits used in my projects can be found in the shapenet_split folder of this repository.

☎️ FAQ

How long does it take?

A single mesh takes less than about 10 seconds. The time depends largely on the number of triangles in the mesh.

How do I get a watertight mesh?

The method does not explicitly output a watertight mesh. To obtain one, evaluate the signed distances on a regular grid and apply marching cubes to the result.
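
A possible way to set this up is sketched below. Everything here is an assumption rather than code from the repository: make_query_grid, sphere_sdf, and extract_surface are hypothetical names, scikit-image is assumed to be installed for the marching cubes step, and an analytic sphere SDF stands in for the GPU query so that the example runs without CUDA. In the real pipeline the grid points would be passed to mesh2sdf.mesh2sdf_gpu in the same way as vol_points.

```python
import numpy as np

def make_query_grid(resolution=64, bound=1.0):
    """Return (resolution**3, 3) points covering [-bound, bound]^3 and the voxel spacing."""
    axis = np.linspace(-bound, bound, resolution)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3), axis[1] - axis[0]

def sphere_sdf(points, radius=0.5):
    """Stand-in SDF of a sphere at the origin (negative inside, positive outside)."""
    return np.linalg.norm(points, axis=1) - radius

def extract_surface(sdf_values, resolution, spacing, bound=1.0):
    """Run marching cubes on flat SDF samples taken at make_query_grid points."""
    # Imported lazily so the grid helper works without scikit-image.
    from skimage import measure
    volume = sdf_values.reshape(resolution, resolution, resolution)
    verts, faces, _, _ = measure.marching_cubes(
        volume, level=0.0, spacing=(spacing,) * 3
    )
    # marching_cubes measures from the grid corner, so shift back to [-bound, bound]^3.
    return verts - bound, faces
```

With these helpers, calling make_query_grid(64), evaluating the SDF at the returned points, and passing the values to extract_surface yields the vertices and faces of a closed surface in normalized coordinates.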

How does the SDF calculation work?

The calculation method is based on ray stabbing.
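
For readers unfamiliar with the term, the general idea of ray stabbing is to shoot rays from a query point and count how many triangles each ray crosses: an odd count suggests the point is inside, and a vote over several rays makes the decision more robust on meshes that are not watertight. The NumPy sketch below illustrates that idea only. It is not the CUDA implementation in mesh2sdf2_cuda, and count_ray_hits and is_inside are hypothetical names. The magnitude of the distance would be computed separately, for example as the distance to the nearest triangle.

```python
import numpy as np

def count_ray_hits(origin, direction, triangles, eps=1e-9):
    """Count triangles crossed by the ray origin + t * direction with t > 0.

    triangles has shape (M, 3, 3). Uses the Moller-Trumbore intersection test.
    """
    origin = np.asarray(origin, dtype=np.float64)
    direction = np.asarray(direction, dtype=np.float64)
    v0, v1, v2 = triangles[:, 0], triangles[:, 1], triangles[:, 2]
    e1, e2 = v1 - v0, v2 - v0
    pvec = np.cross(direction, e2)
    det = np.einsum("ij,ij->i", e1, pvec)
    valid = np.abs(det) > eps  # skip triangles parallel to the ray
    inv_det = np.zeros_like(det)
    inv_det[valid] = 1.0 / det[valid]
    tvec = origin - v0
    u = np.einsum("ij,ij->i", tvec, pvec) * inv_det
    qvec = np.cross(tvec, e1)
    w = (qvec @ direction) * inv_det
    t = np.einsum("ij,ij->i", e2, qvec) * inv_det
    hit = valid & (u >= 0) & (w >= 0) & (u + w <= 1) & (t > eps)
    return int(hit.sum())

def is_inside(point, triangles, directions):
    """Majority vote over rays: an odd number of crossings counts as 'inside'."""
    votes = [count_ray_hits(point, d, triangles) % 2 == 1 for d in directions]
    return sum(votes) * 2 > len(votes)
```

Rays that pass exactly through an edge or vertex can be counted twice, which is one reason to vote over several directions instead of trusting a single ray.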

How many points do I need to train 3D generative models?

The more the better. The numbers shown in this post work well on ShapeNet and Objaverse.

📧 Contact

Send me an email (biao.zhang@kaust.edu.sa or biao.zhang.ai@outlook.com) if you have further questions.

📘 Citation

If you use the code in your projects, please consider citing the related papers,

@inproceedings{Biao_2022_3DILG,
author = {Zhang, Biao and Nie\ss{}ner, Matthias and Wonka, Peter},
title = {{3DILG}: irregular latent grids for 3D generative modeling},
year = {2022},
isbn = {9781713871088},
publisher = {Curran Associates Inc.},
address = {Red Hook, NY, USA},
booktitle = {Proceedings of the 36th International Conference on Neural Information Processing Systems},
articleno = {1590},
numpages = {15},
location = {New Orleans, LA, USA},
series = {NIPS '22}
}

@article{Biao_2023_VecSet,
author = {Zhang, Biao and Tang, Jiapeng and Nie\ss{}ner, Matthias and Wonka, Peter},
title = {{3DShape2VecSet}: A 3D Shape Representation for Neural Fields and Generative Diffusion Models},
year = {2023},
issue_date = {August 2023},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {42},
number = {4},
issn = {0730-0301},
url = {https://doi.org/10.1145/3592442},
doi = {10.1145/3592442},
journal = {ACM Trans. Graph.},
month = jul,
articleno = {92},
numpages = {16},
keywords = {3D shape generation, 3D shape representation, diffusion models, shape reconstruction, generative models}
}

@InProceedings{Biao_2024_Functional,
    author    = {Zhang, Biao and Wonka, Peter},
    title     = {Functional Diffusion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {4723-4732}
}

@inproceedings{Biao_2024_LaGeM,
title={{LaGeM}: A Large Geometry Model for 3D Representation Learning and Diffusion},
author={Biao Zhang and Peter Wonka},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=72OSO38a2z}
}
