
MD-SFA: Molecular Dynamics Slow Feature Analysis CLI Tool

Overview

MD-SFA is a command-line interface (CLI) tool for analyzing molecular dynamics (MD) simulations. Built on the md_sfa library, it loads MD trajectory data, featurizes trajectories, runs Slow Feature Analysis (SFA), and creates PLUMED files for biasing simulations along SFA components.

Custom Installation

Before installing MD-SFA, install two dependencies: a custom version of sklearnsfa, which is packaged within the md-sfa repository, and msmbuilder2022, found at https://github.com/msmbuilder/msmbuilder2022. The custom sklearnsfa ensures compatibility for SFA computations within MD-SFA. Follow these steps to install msmbuilder2022, the custom sklearnsfa, and md-sfa:

  1. Install msmbuilder2022 using conda: conda install -c conda-forge testmsm

Alternatively, one can install msmbuilder2022 with pip by cloning the repository at https://github.com/msmbuilder/msmbuilder2022 and running:

pip install ./msmbuilder2022

  2. Clone the md-sfa repository to your local machine: git clone https://github.com/svats73/md-sfa-msm.git

  3. Navigate to the cloned repository directory: cd md-sfa-msm

  4. Install the custom sklearnsfa package: pip install ./sklearn-sfa

  5. Install the md_sfa package: pip install .

This installation process will ensure that you have both the md-sfa tool and the custom sklearn-sfa library installed and ready for your MD analysis tasks.

Usage

The MD-SFA CLI tool supports various commands for processing and analyzing your MD trajectories. Below is a guide to using these commands:

Loading Trajectories

md-sfa load-trajectories --path_to_trajectories PATH --topology_file FILE --stride N --atom_indices "selection"

  • --path_to_trajectories: Directory containing trajectory files.
  • --topology_file: Topology file path.
  • --stride: Interval for loading frames (optional).
  • --atom_indices: Atom selection string (optional).

Featurizing Dihedrals

md-sfa featurize --types TYPE1 --types TYPE2 --nosincos

  • --types: Dihedral types to featurize, such as chi1, chi2, phi, psi. To specify multiple types, repeat the --types flag before each one.
  • --nosincos: Disables the sin/cos transformation if set.
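The sin/cos transformation exists because dihedral angles are periodic: values just below +pi and just above -pi describe nearly the same conformation but are numerically far apart. The sketch below illustrates the idea with NumPy only. The function name `sincos_transform` and the column ordering are assumptions made for illustration, not the layout MD-SFA itself produces.

```python
import numpy as np

def sincos_transform(dihedrals: np.ndarray) -> np.ndarray:
    """Map angles in radians, shape (n_frames, n_dihedrals), to
    [sin, cos] features, shape (n_frames, 2 * n_dihedrals).
    Hypothetical helper; the column order is an assumption."""
    return np.concatenate([np.sin(dihedrals), np.cos(dihedrals)], axis=1)

# Two frames whose single angle sits on either side of the +/-pi boundary.
angles = np.array([[np.pi - 0.01], [-np.pi + 0.01]])

# Distance between the raw angles: large, although the conformations are close.
raw_gap = np.abs(angles[0] - angles[1]).max()

# Distance between the encoded frames: small, as expected physically.
features = sincos_transform(angles)
encoded_gap = np.linalg.norm(features[0] - features[1])
```

Passing --nosincos keeps the raw angles, which is simpler to interpret but reintroduces the discontinuity at the periodic boundary.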

Describing Features

md-sfa describe-features --nosincos

  • --nosincos: Disables the sin/cos transformation if set.

Dumping Description

md-sfa dump-description --description_file_path PATH --nosincos

  • --description_file_path: File path to save the feature description.
  • --nosincos: Dump non-transformed feature description if created.

Dumping Featurized Data

md-sfa dump-featurized --dump_file_path PATH --nosincos

  • --dump_file_path: File path to save the featurized data.
  • --nosincos: Dump non-transformed features if created.

Running Slow Feature Analysis (SFA)

md-sfa run-sfa --n_components N --tau T

  • --n_components: Number of SFA components to extract.
  • --tau: The tau parameter for SFA.
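For intuition about what this step computes, here is a minimal linear SFA sketch in NumPy. It is not the custom sklearnsfa implementation bundled with MD-SFA. It assumes that tau is a lag time measured in frames and that the feature covariance is full rank. The name `linear_sfa` is hypothetical.

```python
import numpy as np

def linear_sfa(X, n_components, tau=1):
    """Minimal linear SFA sketch (assumes tau is a lag in frames).
    X has shape (n_frames, n_features); returns (components, weights)."""
    X = X - X.mean(axis=0)                       # center each feature
    cov = X.T @ X / len(X)
    evals, evecs = np.linalg.eigh(cov)
    whitener = evecs / np.sqrt(evals)            # scale each eigenvector column
    Z = X @ whitener                             # whitened data, unit covariance
    dZ = Z[tau:] - Z[:-tau]                      # change over a lag of tau frames
    dcov = dZ.T @ dZ / len(dZ)
    _, d_evecs = np.linalg.eigh(dcov)            # ascending order: slowest first
    W = whitener @ d_evecs[:, :n_components]     # weights on the centered features
    return X @ W, W

# Toy data: a slow and a fast sine wave, linearly mixed into two features.
t = np.linspace(0, 2 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(50 * t)
X = np.column_stack([slow + 0.5 * fast, fast - 0.3 * slow])
Y, W = linear_sfa(X, n_components=1, tau=1)
```

In this toy case the first component tracks the slow sine wave up to sign and scale, which is the behavior one hopes for when applying SFA to dihedral features.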

Creating PLUMED File

md-sfa create-plumed_file --plumed_filename FILENAME

  • --plumed_filename: File path to save the generated PLUMED file.
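To give a rough idea of how a linear SFA component over sin/cos dihedral features can be expressed in PLUMED, the sketch below builds such an input as text. It is an illustration under assumptions, not the file MD-SFA writes: the function `plumed_linear_cv`, the labels, the atom indices, and the weights are all hypothetical, and mean-centering of the features is omitted. TORSION, CUSTOM, COMBINE, and PRINT are standard PLUMED actions.

```python
def plumed_linear_cv(label, torsions, weights):
    """Build PLUMED input text for one linear CV (hypothetical helper).
    torsions: {name: (a, b, c, d)} with 1-based atom indices.
    weights:  {name: (w_sin, w_cos)} coefficients for sin and cos."""
    lines, args, coeffs = [], [], []
    for name, atoms in torsions.items():
        # Define the dihedral angle itself.
        lines.append(f"{name}: TORSION ATOMS={','.join(map(str, atoms))}")
        # Define its sin and cos, and record their coefficients.
        for func, w in zip(("sin", "cos"), weights[name]):
            lines.append(
                f"{func}_{name}: CUSTOM ARG={name} FUNC={func}(x) PERIODIC=NO"
            )
            args.append(f"{func}_{name}")
            coeffs.append(f"{w:.6f}")
    # Weighted sum of all sin/cos terms gives the collective variable.
    lines.append(
        f"{label}: COMBINE ARG={','.join(args)} "
        f"COEFFICIENTS={','.join(coeffs)} PERIODIC=NO"
    )
    lines.append(f"PRINT ARG={label} FILE=COLVAR STRIDE=100")
    return "\n".join(lines) + "\n"

# Made-up example: one phi angle with illustrative weights.
text = plumed_linear_cv("sfa1", {"phi2": (5, 7, 9, 15)}, {"phi2": (0.8, -0.6)})
```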

Dumping SFA Components

md-sfa dump-sfa-components --save_file FILE

  • --save_file: File path to save the SFA components.

Cluster on SFA Components

md-sfa cluster --algorithm ALGORITHM_NAME --n_clusters NUMBER_OF_CLUSTERS

  • --algorithm: Name of clustering algorithm to use (currently supporting 'kcenters', 'kmeans', and 'gmm')
  • --n_clusters: For 'kcenters' and 'kmeans', how many cluster centers to use (optional, unused for 'gmm')
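As background on the 'kcenters' option, the sketch below shows the standard farthest-point k-centers heuristic in NumPy. It is an independent illustration and may differ from the implementation MD-SFA calls. The function name `kcenters` and the `seed` parameter are assumptions.

```python
import numpy as np

def kcenters(points, n_clusters, seed=0):
    """Farthest-point k-centers sketch (hypothetical helper).
    points has shape (n_points, n_dims); returns (center_indices, labels)."""
    rng = np.random.default_rng(seed)
    centers = [int(rng.integers(len(points)))]           # random first center
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(n_clusters - 1):
        nxt = int(dist.argmax())                         # farthest from all centers
        centers.append(nxt)
        # Keep, for each point, the distance to its nearest center so far.
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    # Assign every point to its nearest center.
    diffs = points[:, None, :] - points[centers][None, :, :]
    labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
    return np.array(centers), labels

# Toy data standing in for frames projected onto two SFA components.
rng = np.random.default_rng(1)
blobs = [rng.normal(loc, 0.1, size=(50, 2)) for loc in ((0, 0), (10, 0), (0, 10))]
points = np.concatenate(blobs)
centers, labels = kcenters(points, n_clusters=3)
```

Because each new center is the point farthest from the existing ones, k-centers tends to cover the sampled space evenly, whereas k-means pulls centers toward densely populated regions.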

Dumping Structures Clustered on SFA Components

md-sfa dump-clusters --num_samples NUMBER_OF_SAMPLES

  • --num_samples: Number of structures to sample from the cluster centers produced by clustering and write out.

Classify Ensembles

md-sfa classify --ensemble_one PATH_TO_FIRST_ENSEMBLES --ensemble_two PATH_TO_SECOND_ENSEMBLES --ensemble_features PATH_TO_FEATURES

  • --ensemble_one: Path to first featurized ensemble of trajectories
  • --ensemble_two: Path to second featurized ensemble of trajectories
  • --ensemble_features: Path to dataframe which contains featurization information for the given ensembles

Dump Classified Ensembles as a PLUMED File

md-sfa create-classifier-plumed --plumed_filename FILENAME

  • --plumed_filename: File path to save the generated PLUMED file.

Dumping SFA Weights as B-Factors

md-sfa plumed-bfactor --dat_file FILE --pdb_input PDB_INPUT_FILE --pdb_output PDB_OUTPUT_FILE

  • --dat_file: File path containing saved SFA components.
  • --pdb_input: File path to PDB on which SFA weights will be dumped.
  • --pdb_output: Output filename for PDB with added SFA weights.
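The PDB format stores the B-factor in columns 61-66 of each ATOM or HETATM record, so storing weights there lets molecular viewers color the structure by them. The sketch below shows that column-level edit using only the standard library. How MD-SFA maps SFA weights onto atoms is not described here, so the per-residue dictionary and the name `set_bfactors` are assumptions for illustration.

```python
def set_bfactors(pdb_lines, weight_by_residue):
    """Return PDB lines with the B-factor field (columns 61-66) replaced.
    weight_by_residue maps residue number to weight (hypothetical input);
    residues missing from the mapping get 0.00."""
    out = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            resseq = int(line[22:26])                 # residue number, columns 23-26
            b = weight_by_residue.get(resseq, 0.0)
            line = line.ljust(66)                     # pad short records
            line = f"{line[:60]}{b:6.2f}{line[66:]}"  # overwrite columns 61-66
        out.append(line)
    return out

# Build one fixed-width ATOM record with made-up coordinates.
record = (
    "ATOM  {:5d} {:<4s}{:1s}{:>3s} {:1s}{:4d}{:1s}   "
    "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
).format(1, " N", "", "ALA", "A", 1, "", 11.104, 6.134, -6.504, 1.0, 0.0)

updated = set_bfactors([record], {1: 0.37})[0]
```

The field is written with two decimal places, so very small weights may need rescaling before they are stored.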


Restarting the Tool

To clear the current state and start fresh:

md-sfa restart

This command deletes any serialized state, allowing you to start a new analysis without interference from previous runs.
