FoldMark: Safeguarding Protein Structure Generative Models with Distributional and Evolutionary Watermarking
We've created an interactive demo on Hugging Face Spaces where you can:
- Input protein sequences and get watermarked structure predictions
- Compare watermarked vs. non-watermarked structures
- Visualize the differences in 3D
- Access pretrained checkpoints and inference code
FoldMark is a novel watermarking framework for protein generative models that embeds user-specific identification codes into generated protein structures. It:
- Leverages evolutionary principles to adaptively embed watermarks (higher capacity in flexible regions, minimal disruption in conserved areas)
- Maintains structural quality (>0.9 scTM scores) while achieving >95% watermark bit accuracy at 32 bits
- Enables tracking of up to 1 million users and detection of unauthorized model training (even with only 30% watermarked data)
- Works with leading models like AlphaFold3, ESMFold, RFDiffusion, and RFDiffusionAA
- Withstands post-processing and adaptive attacks, offering a general solution for responsible protein-AI deployment
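To give a rough feel for the capacity claims above, here is a minimal sketch (all names and helpers are our own illustration, not FoldMark's API): one million users need only 20 bits (2^20 ≈ 1.05M), so a 32-bit code leaves headroom, and watermark recovery is scored as the fraction of matching bits.

```python
# Hypothetical sketch: user IDs as 32-bit watermark codes plus bit-accuracy
# scoring. Sizes follow the README's claims (32-bit codes, >95% bit accuracy,
# up to ~1M distinct users); none of these names come from the FoldMark code.

def user_code(user_id: int, n_bits: int = 32) -> list[int]:
    """Map a user ID to a fixed-length bit string (MSB first)."""
    if not 0 <= user_id < 2 ** n_bits:
        raise ValueError("user_id does not fit in the code length")
    return [(user_id >> i) & 1 for i in reversed(range(n_bits))]

def bit_accuracy(embedded: list[int], recovered: list[int]) -> float:
    """Fraction of watermark bits recovered correctly."""
    assert len(embedded) == len(recovered)
    matches = sum(e == r for e, r in zip(embedded, recovered))
    return matches / len(embedded)

code = user_code(123_456)           # ~1M users fit comfortably in 32 bits
noisy = code.copy()
noisy[0] ^= 1                       # flip one bit to mimic imperfect recovery
print(bit_accuracy(code, noisy))    # 31/32 = 0.96875, above the 95% mark
```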
```shell
# Create and activate the conda environment
conda env create -f foldmark.yml
conda activate fm

# Install torch-scatter (wheels built for torch 2.0.0 + CUDA 11.7)
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.0.0+cu117.html

# Install the local package
pip install -e .
```
- Download preprocessed SCOPe dataset (~280MB): Download Link
- Extract the data:
```shell
tar -xvzf preprocessed_scope.tar.gz
rm preprocessed_scope.tar.gz
```
- Pretrain the model:
```shell
python -W ignore experiments/pretrain.py
```
- Finetune with watermarking:
```shell
python -W ignore experiments/finetune.py
```
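To illustrate how detection of unauthorized training can be made statistical (our illustration, not the paper's exact procedure): if a model trained on watermarked data reproduces the embedded bits noticeably better than chance, a one-sided binomial test against 50% random guessing flags it.

```python
# Illustrative sketch (not FoldMark's actual detector): test whether the
# number of correctly recovered watermark bits exceeds random guessing.
from math import comb

def binom_pvalue(n_bits: int, n_correct: int) -> float:
    """One-sided p-value for >= n_correct matching bits under p = 0.5."""
    return sum(comb(n_bits, k) for k in range(n_correct, n_bits + 1)) / 2 ** n_bits

# Recovering 30 of 32 bits is vastly beyond chance:
print(binom_pvalue(32, 30) < 1e-6)  # True
```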
If you find this work helpful, please cite our paper:
```bibtex
@article{zhang2024foldmark,
  title={FoldMark: Protecting Protein Generative Models with Watermarking},
  author={Zhang, Zaixi and Jin, Ruofan and Fu, Kaidi and Cong, Le and Zitnik, Marinka and Wang, Mengdi},
  journal={bioRxiv},
  pages={2024--10},
  year={2024},
  publisher={Cold Spring Harbor Laboratory}
}
```
We thank the open-source projects whose code this work builds on.
This project is licensed under the MIT License - see the LICENSE file for details.