A program for predicting unknown proton affinities of small molecules and metabolites

mitkeng/ProFinity

ProFinity: proton affinity prediction

Introduction

ProFinity is a machine learning program for predicting the proton affinities (PA) of small molecules and metabolites. Because PA is a gas-phase property, diverse experimental PA measurements are scarce. ProFinity is therefore a practical way to obtain accurate PA predictions for unmeasured chemicals, as long as they lie within the models' interpolation limit.

ProFinity uses two neural network models: (1) a model that predicts PA values and (2) a model that corrects the prediction error. Together, the two models deliver error-attenuated results.
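
The two-model idea can be illustrated with a minimal sketch. It assumes the correction model outputs an additive correction that is summed with the first model's prediction; this README does not state how the two outputs are actually combined. The function `corrected_prediction` and the stand-in lambda models are hypothetical and are not part of ProFinity.

```python
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]


def corrected_prediction(features: Sequence[float],
                         predict_pa: Model,
                         predict_correction: Model) -> float:
    """Combine a base PA prediction with a learned correction.

    ASSUMPTION: the correction is additive. ProFinity's real
    combination rule is not documented in this README.
    """
    base = predict_pa(features)                # first model: raw PA estimate
    correction = predict_correction(features)  # second model: estimated error term
    return base + correction


# Stand-in models for illustration only (hypothetical values).
base_model = lambda features: 190.0
correction_model = lambda features: -0.5

example = corrected_prediction([1.0], base_model, correction_model)
```

Passing the models in as callables keeps the sketch independent of any particular framework, so the same shape would apply to TensorFlow models wrapped in small functions.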

Performance

[Two performance figures; images not reproduced here.]

Functionality

The program supports a single PA query or batch PA queries. A single query requires only a canonical SMILES string as input. For batch queries, provide an input CSV file that mirrors the layout below:

SMILES
Cc1cccc(C)c1
...
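
Before submitting a batch, it can help to confirm that the file matches this layout. The sketch below uses only the Python standard library; the function name `load_smiles` is hypothetical, and the check covers the `SMILES` header and empty rows only, not chemical validity.

```python
import csv
import io


def load_smiles(handle):
    """Return the non-empty SMILES strings from a batch input CSV.

    Raises ValueError if the file has no 'SMILES' header column.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or "SMILES" not in reader.fieldnames:
        raise ValueError("batch input must have a 'SMILES' header row")
    smiles = []
    for row in reader:
        value = (row["SMILES"] or "").strip()
        if value:  # skip rows whose SMILES cell is empty
            smiles.append(value)
    return smiles


# In-memory stand-in for a file opened with open("small_batch.csv", newline="").
sample = io.StringIO("SMILES\nCc1cccc(C)c1\nO\n")
batch = load_smiles(sample)
```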

When a task completes, a tabulated result like the one below is saved to a CSV file.

SMILES          PA (kcal/mol)
Cc1cccc(C)c1    189.36111
...             ...
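
Results are reported in kcal/mol, while many PA tables use kJ/mol. The following sketch reads a results file and adds the converted value, using 1 kcal = 4.184 kJ. It assumes the saved file is comma-separated and that its headers match the table above exactly; the function name `read_results` is hypothetical.

```python
import csv
import io

KJ_PER_KCAL = 4.184  # thermochemical calorie, exact by definition


def read_results(handle):
    """Return (SMILES, PA in kcal/mol, PA in kJ/mol) tuples from a results CSV.

    ASSUMPTION: columns are named 'SMILES' and 'PA (kcal/mol)'.
    """
    rows = []
    for row in csv.DictReader(handle):
        pa_kcal = float(row["PA (kcal/mol)"])
        rows.append((row["SMILES"], pa_kcal, pa_kcal * KJ_PER_KCAL))
    return rows


# In-memory stand-in for the saved results file.
sample = io.StringIO("SMILES,PA (kcal/mol)\nCc1cccc(C)c1,189.36111\n")
results = read_results(sample)
```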

Requirement

A Google account is needed to access the Google Colab notebook.

Support

To create a small batch-query CSV input file ad hoc:

import pandas as pd

# SMILES strings to query, for example carbon dioxide and water.
comp_list = ["C(=O)=O", "O"]

# Build a single-column table that matches the batch input layout.
small_batch = pd.DataFrame({"SMILES": comp_list})

# Write the file with a header row and without the index column.
small_batch.to_csv("small_batch.csv", index=False)

Limitations

ProFinity currently supports only chemicals containing the following atom types: H, He, B, C, N, O, F, P, S, Cl, Fe, As, Br, I, and Xe. The models were trained on small molecules and metabolites, so they may significantly underperform when applied to large biomolecules. Training also did not explicitly account for temperature or electric field effects.
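
A query can be pre-screened against the supported atom types before it is submitted. The sketch below is a simplified regular-expression scan, not a full SMILES parser, and it does not represent how ProFinity itself validates input. The function names are hypothetical, and the scan assumes the input is a syntactically valid SMILES string.

```python
import re

# Atom types listed as supported in the Limitations section.
SUPPORTED = {"H", "He", "B", "C", "N", "O", "F", "P", "S",
             "Cl", "Fe", "As", "Br", "I", "Xe"}

# Bracket atom: optional isotope digits, then an element symbol
# (aromatic forms such as "se" are lowercase) or the wildcard "*".
BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2}|\*)")
# Unbracketed atoms: the SMILES organic subset, two-letter symbols first.
ORGANIC_ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")
BRACKET_SPAN = re.compile(r"\[[^\]]*\]")


def elements_in_smiles(smiles):
    """Return the set of element symbols found in a SMILES string."""
    elements = {s.capitalize() for s in BRACKET_ATOM.findall(smiles)}
    # Blank out bracket atoms so their hydrogen counts and charges
    # are not re-read as organic-subset atoms.
    outside = BRACKET_SPAN.sub(" ", smiles)
    elements.update(s.capitalize() for s in ORGANIC_ATOM.findall(outside))
    return elements


def unsupported_elements(smiles):
    """Return the elements in a SMILES string that are not in SUPPORTED."""
    return elements_in_smiles(smiles) - SUPPORTED


ok = unsupported_elements("Cc1cccc(C)c1")   # only carbon atoms
bad = unsupported_elements("C[Si](C)(C)C")  # contains silicon
```

An empty result means every atom type found is on the supported list. It says nothing about molecule size, so large biomolecules still need separate judgment.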

Accessibility

Use the Open In Colab link to access the ProFinity platform.

