Predicting gasification results for biomasses.
biomassml allows you to predict gasification results for biomass samples as a function of the biomass properties and the process operating conditions.
The approach is intended for optimizing biomass utilization in gasification: it helps identify promising gasification pathways for a given biomass and find optimal process operating conditions.
The biomassml command line tool is automatically installed. It can be used from the shell with the --help flag to show all subcommands:
$ biomassml --help
The most recent code and data can be installed directly from GitHub with:
$ pip install git+https://github.com/vgvinter/biomassml.git
To install in development mode, use the following:
$ git clone https://github.com/vgvinter/biomassml.git
$ cd biomassml
$ pip install -e .
The data for training the models can be found in data/data_GASIF_biomass.csv, which contains data on biomass properties, gasification operating conditions, and results of the gasification process.
The data for the new biomasses used in this work to predict gasification results can be found in data/data_NEW_biomasses.csv, which contains data on the properties of different biomasses collected from the literature.
The code to build the Gaussian Process Regression (GPR) models used in this work can be found in src/biomassml/build_model.py. Code to build single-output GPR and coregionalized GPR models is included.
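As a rough illustration of what a single-output GPR computes, the sketch below evaluates the posterior mean of a zero-mean Gaussian process in plain Python. It is not the code in build_model.py: the squared-exponential (RBF) kernel, the hyperparameter values, all function names, and the toy numbers are assumptions made for this example, and the coregionalized case is not covered.

```python
import math


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        residual = aug[row][n] - sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = residual / aug[row][row]
    return x


def gpr_fit(X, y, noise=1e-6, **kernel_args):
    """Precompute alpha = (K + noise * I)^-1 y for a zero-mean GP."""
    n = len(X)
    K = [
        [rbf_kernel(X[i], X[j], **kernel_args) + (noise if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return {"X": X, "alpha": solve_linear_system(K, y), "kernel_args": kernel_args}


def gpr_predict_mean(model, x_new):
    """Posterior mean at x_new: k(x_new, X) @ alpha."""
    k_star = [rbf_kernel(x_new, x, **model["kernel_args"]) for x in model["X"]]
    return sum(k * a for k, a in zip(k_star, model["alpha"]))


# Toy, made-up data: one feature, three samples.
X_train = [[0.0], [1.0], [2.0]]
y_train = [1.0, 3.0, 2.0]
model = gpr_fit(X_train, y_train, length_scale=1.0)
print(gpr_predict_mean(model, [1.5]))
```

With a very small noise term the posterior mean passes almost exactly through the training targets, and far from the training data it falls back to the zero prior mean.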
Helper functions to perform leave-one-out cross-validation (LOOCV) on the model and to calculate metrics can be found in src/biomassml/metrics.py. Functions to perform LOOCV on a given kernel can be found in src/biomassml/pipeline.py.
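The idea behind LOOCV can be sketched in a few lines: each sample is held out once, the model is fitted on the remaining samples, and the held-out sample is predicted. The function names, the choice of MAE and RMSE as metrics, and the mean-predictor stand-in model below are assumptions for illustration and do not reproduce the helpers in metrics.py or pipeline.py.

```python
import math


def loocv_predictions(X, y, fit, predict):
    """Hold out each sample once, fit on the rest, and predict the held-out one."""
    predictions = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        predictions.append(predict(model, X[i]))
    return predictions


def regression_metrics(y_true, y_pred):
    """Mean absolute error and root mean squared error."""
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    n = len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }


# Hypothetical stand-in model: always predicts the mean of the training targets.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model


# Toy, made-up data.
X_toy = [[0.0], [1.0], [2.0]]
y_toy = [1.0, 2.0, 3.0]
loo_pred = loocv_predictions(X_toy, y_toy, fit_mean, predict_mean)
loo_metrics = regression_metrics(y_toy, loo_pred)
print(loo_pred, loo_metrics)
```

Any model can be plugged in by passing a different pair of fit and predict callables, for example a GPR with a fixed kernel.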
The code for the analysis of the feature importance can be found in src/biomassml/feature_importance.py, including partial dependence plots and SHapley Additive exPlanations (SHAP) analysis.
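For readers unfamiliar with partial dependence, the curve behind such a plot can be computed as follows: one feature is forced to each value on a grid for every sample, and the model predictions are averaged. This is a generic sketch, not the implementation in feature_importance.py; the function name, the linear stand-in predictor, and the toy data are assumptions.

```python
def partial_dependence(predict, X, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)  # copy so the input data is left untouched
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(X))
    return curve


# Hypothetical stand-in for a trained model.
def toy_predict(row):
    return 2.0 * row[0] + row[1]


# Toy, made-up data with two features.
X_toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
pd_curve = partial_dependence(toy_predict, X_toy, feature_index=0, grid=[0.0, 1.0, 2.0])
print(pd_curve)
```

Plotting the grid against the returned curve gives the partial dependence plot for that feature.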
The code to predict outputs can be found in src/biomassml/predict_outputs.py, which includes the functions used in this work to predict the biomass gasification outputs.
We provide the Gaussian Process Regression (GPR) models trained in this work in the models directory. The model trained using leave-one-out cross-validation (LOOCV) can be found in models/model_GPR_loocv. The model retrained on all data can be found in models/model_GPR_retrained.
The use of the main functions of this package is shown in Jupyter Notebooks in the notebooks directory. The training of the Gaussian Process Regression (GPR) models used in this work and their performance evaluation can be found in notebooks/train_GPR_model.ipynb. The analysis of the feature importance can be found in notebooks/feature_importance.ipynb. The prediction of the gasification results for different biomasses from the literature, together with the optimization of the process operating conditions, can be found in notebooks/predictions_new_dataset.ipynb. The cluster analysis can be found in notebooks/cluster_analysis.ipynb.
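One simple way to search for good operating conditions with a trained predictor is an exhaustive grid search, sketched below. This is an illustrative technique and not necessarily the one used in the notebook; the function names, the feature names (carbon, temperature, equivalence_ratio), the stand-in predictor, and all numbers are hypothetical.

```python
from itertools import product


def best_operating_conditions(predict, biomass, condition_grid):
    """Return the grid point that maximizes the predicted output for one biomass."""
    names = list(condition_grid)
    best_conditions, best_value = None, float("-inf")
    for values in product(*(condition_grid[name] for name in names)):
        conditions = dict(zip(names, values))
        value = predict(biomass, conditions)
        if value > best_value:
            best_conditions, best_value = conditions, value
    return best_conditions, best_value


# Hypothetical stand-in for a trained model: the predicted output peaks
# at temperature 850 and equivalence ratio 0.3.
def toy_predict(biomass, conditions):
    temperature_term = -((conditions["temperature"] - 850.0) / 100.0) ** 2
    ratio_term = -((conditions["equivalence_ratio"] - 0.3) * 10.0) ** 2
    return biomass["carbon"] / 100.0 + temperature_term + ratio_term


# Made-up biomass properties and candidate operating conditions.
biomass = {"carbon": 50.0}
grid = {
    "temperature": [750.0, 800.0, 850.0, 900.0],
    "equivalence_ratio": [0.2, 0.3, 0.4],
}
best_conditions, best_value = best_operating_conditions(toy_predict, biomass, grid)
print(best_conditions, best_value)
```

A grid search scales poorly with the number of operating variables, so it is mainly suitable when only a few conditions are varied.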
Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.rst for more information on getting involved.
The code in this package is licensed under the MIT License.
This work was carried out with financial support from the Spanish Agencia Estatal de Investigación (AEI) through Grant TED2021-131693B-I00 funded by MCIN/AEI/10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR”, and from the Spanish National Research Council (CSIC) through the Programme for Internationalization i-LINK 2021 (Project LINKA20412).
This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.
See developer instructions
The final section of the README is for those who want to get involved by making a code contribution.
After cloning the repository and installing tox with pip install tox, the unit tests in the tests/ folder can be run reproducibly with:
$ tox
Additionally, these tests are automatically re-run with each commit in a GitHub Action.
After installing the package in development mode and installing tox with pip install tox, the commands for making a new release are contained within the finish environment in tox.ini. Run the following from the shell:
$ tox -e finish
This script does the following:
- Uses BumpVersion to switch the version number in setup.cfg and src/biomassml/version.py to not have the -dev suffix
- Packages the code in both a tar archive and a wheel
- Uploads to PyPI using twine. Be sure to have a .pypirc file configured to avoid the need for manual input at this step
- Pushes to GitHub. You'll need to make a release going with the commit where the version was bumped.
- Bumps the version to the next patch. If you made big changes and want to bump the version by minor, you can use tox -e bumpversion minor afterwards.
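The version changes in the steps above can be pictured with a small sketch. It only mimics the string manipulation and is not BumpVersion itself; the function names and version numbers are hypothetical, and it assumes plain MAJOR.MINOR.PATCH versions where the next development version carries the -dev suffix again.

```python
DEV_SUFFIX = "-dev"


def strip_dev_suffix(version):
    """Turn a development version such as '0.1.1-dev' into the release '0.1.1'."""
    if not version.endswith(DEV_SUFFIX):
        raise ValueError(f"expected a version ending in {DEV_SUFFIX!r}, got {version!r}")
    return version[: -len(DEV_SUFFIX)]


def next_dev_version(release, part="patch"):
    """Return the development version that follows a release.

    Assumption: bumping the minor part resets the patch part to zero.
    """
    major, minor, patch = (int(piece) for piece in release.split("."))
    if part == "patch":
        patch += 1
    elif part == "minor":
        minor, patch = minor + 1, 0
    else:
        raise ValueError(f"unsupported part: {part!r}")
    return f"{major}.{minor}.{patch}{DEV_SUFFIX}"


# Hypothetical version numbers.
released = strip_dev_suffix("0.1.1-dev")
print(released, next_dev_version(released), next_dev_version(released, "minor"))
```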