
ProFinity is a machine learning program for predicting the proton affinities (PA) of small molecules and metabolites. Because PA is a gas-phase property, diverse experimental PA measurements are scarce, so ProFinity can be a practical way to predict PA values for uncharacterized chemicals that fall within the models' interpolation limit.
ProFinity uses two neural network models: 1) a model for predicting PA values and 2) a model for error correction. Used together, the two models deliver error-attenuated results.


The program supports a single PA query or batch PA queries. For a single query, only a canonical SMILES string is required as input. For batch queries, the input CSV file should mirror the layout below:

SMILES |
---|
Cc1cccc(C)c1 |
... |

Upon completion of a task, a tabulated result like the table below is saved to a CSV file.

SMILES | PA (kcal/mol) |
---|---|
Cc1cccc(C)c1 | 189.36111 |
... | ... |
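
A result file in this layout can be post-processed with pandas. The following is a minimal sketch, assuming the saved file is comma-delimited and uses the column headers `SMILES` and `PA (kcal/mol)` exactly as in the table above; the in-memory `example_csv` is a stand-in for the real file, whose name is not specified here. It adds a kJ/mol column using 1 kcal = 4.184 kJ.

```python
import io

import pandas as pd

KCAL_TO_KJ = 4.184  # thermochemical calorie: 1 kcal = 4.184 kJ

# Stand-in for a saved result file (assumed layout, taken from the table above).
# For a real run, pass the path of the saved CSV to pd.read_csv instead.
example_csv = io.StringIO("SMILES,PA (kcal/mol)\nCc1cccc(C)c1,189.36111\n")

results = pd.read_csv(example_csv)

# Add a converted column alongside the original values.
results["PA (kJ/mol)"] = results["PA (kcal/mol)"] * KCAL_TO_KJ

print(results)
```
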

A Google account is needed to access the Google Colab notebook.
To create a small batch query CSV input file ad hoc:
import pandas as pd

# List the SMILES strings to query, for example ["C(=O)=O", "O"].
comp_list = ["C(=O)=O", "O"]

# Build a one-column table with the required "SMILES" header.
small_batch = pd.DataFrame({"SMILES": comp_list})

# Write the batch input file; index=False keeps the row index out of the CSV.
small_batch.to_csv("small_batch.csv", index=False)
ProFinity currently supports only chemicals containing the following atom types: H, He, B, C, N, O, F, P, S, Cl, Fe, As, Br, I, and Xe. The models were trained on small molecules and metabolites, so they may significantly underperform when applied to large biomolecules. Also, training did not explicitly account for temperature or electric field effects.
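
Before submitting a query, it can help to screen SMILES strings against the supported atom types. The sketch below is not part of ProFinity; the helper names `elements_in` and `unsupported_elements` are hypothetical, and the scan is a lightweight regular-expression pass rather than a full SMILES parser, so a cheminformatics toolkit would be more robust for unusual inputs.

```python
import re

# Atom types listed as supported in the text above.
SUPPORTED = {"H", "He", "B", "C", "N", "O", "F", "P", "S",
             "Cl", "Fe", "As", "Br", "I", "Xe"}

# Bracket atom: optional isotope digits, then the element symbol
# (uppercase plus optional lowercase, or a lowercase aromatic symbol).
BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")
# Outside brackets only the SMILES organic subset can appear;
# two-letter symbols come first so "Cl" is not read as "C".
ORGANIC_ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")
ANY_BRACKET = re.compile(r"\[[^\]]*\]")


def elements_in(smiles):
    """Return the set of heavy-atom element symbols found in a SMILES string."""
    found = {m.group(1).capitalize() for m in BRACKET_ATOM.finditer(smiles)}
    # Blank out bracket atoms so their contents are not scanned twice.
    outside = ANY_BRACKET.sub(" ", smiles)
    found.update(m.group(0).capitalize() for m in ORGANIC_ATOM.finditer(outside))
    return found


def unsupported_elements(smiles):
    """Return a sorted list of elements that are not in SUPPORTED."""
    return sorted(elements_in(smiles) - SUPPORTED)


for smi in ["Cc1cccc(C)c1", "C[Si](C)(C)C"]:
    print(smi, unsupported_elements(smi))
```

An empty list means every atom found in the string is on the supported list; any symbol returned marks a molecule that should be left out of the query. Hydrogens written inside brackets are not collected, which is harmless here because H is supported.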
to access the ProFinity platform.