diff --git a/docs/docs/tutorials/protein-folding-nft-minting.md b/docs/docs/tutorials/protein-folding-nft-minting.md
deleted file mode 100644
index f20c68dee..000000000
--- a/docs/docs/tutorials/protein-folding-nft-minting.md
+++ /dev/null
@@ -1,11 +0,0 @@
----
-title: Minting ProofOfScience tokens
-sidebar_label: ProofOfScience NFTs
-sidebar_position: 3
----
-
-import OpenInColab from '../../src/components/OpenInColab.js';
-
-The following interactive notebook demo has been prepared to demonstrate minting ProofOfScience tokens using plex. It is an extension of the Protein Folding tutorial with a few additional modules appended at the end. To use the notebook, please visit the Google Colab link below.
-
-
\ No newline at end of file
diff --git a/docs/docs/tutorials/protein-folding.md b/docs/docs/tutorials/protein-folding.md
index bc9553a41..2651c3a20 100644
--- a/docs/docs/tutorials/protein-folding.md
+++ b/docs/docs/tutorials/protein-folding.md
@@ -6,23 +6,25 @@ sidebar_position: 2
import OpenInColab from '../../src/components/OpenInColab.js';
-
+
## Protein folding in silico
-In this tutorial, we perform protein folding with PLEX.
+In this tutorial we perform protein folding with **plex**.
-There are multiple reasons we believe PLEX is a new standard for computational biology 🧫:
-1. With a simple python interface, running containerised tools with your data is only a few commands away
-2. The infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node
-3. Every event on the compute network is tracked - no more results are lost in an interactive compute session. You can base your decisions and publications on fully reproducible results.
-4. We made adding new tools to the network as easy as possible - moving your favorite tool to PLEX is one JSON document away.
+There are multiple reasons we believe plex is a new standard for computational biology 🧫:
+1. with a simple Python interface, running containerised tools with your data is only a few commands away
+2. the infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node
+3. every event on the compute network is tracked - no more results lost in an interactive compute session, so you can base your decisions and publications on fully reproducible results
+4. we made adding new tools to the network as easy as possible - moving your favorite tool to plex is one JSON document away
-We'll walk through an example of how to use PLEX to predict a protein's 3D structure using [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We will use the sequence of the Streptavidin protein for this demo.
+We'll walk through an example of how to use plex to predict a protein's 3D structure using [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We will use the sequence of the Streptavidin protein for this demo.
-
+We will also walk through the process of minting a ProofOfScience NFT. These tokens are on-chain, verifiable records of a compute job and its input/output data, which helps make scientific results reproducible.
-## Install PLEX
+
+
+## Install plex
```python
@@ -30,14 +32,15 @@ We'll walk through an example of how to use PLEX to predict a protein's 3D struc
```
Collecting PlexLabExchange
- Downloading PlexLabExchange-0.8.18-py3-none-manylinux2014_x86_64.whl (26.9 MB)
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 20.1 MB/s eta 0:00:00
+ Downloading PlexLabExchange-0.8.20-py3-none-manylinux2014_x86_64.whl (26.9 MB)
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 16.6 MB/s eta 0:00:00
Installing collected packages: PlexLabExchange
- Successfully installed PlexLabExchange-0.8.18
+ Successfully installed PlexLabExchange-0.8.20
Then, create a directory where we can save our project files.
+
```python
import os
@@ -52,41 +55,51 @@ dir_path = f"{cwd}/project"
We'll download a `.fasta` file containing the sequence of the protein we want to fold. Here, we're using the sequence of Streptavidin.
+
+
```python
!wget https://rest.uniprot.org/uniprotkb/P22629.fasta -O {dir_path}/P22629.fasta # Streptavidin
```
- --2023-08-01 21:39:21-- https://rest.uniprot.org/uniprotkb/P22629.fasta
+ --2023-08-08 18:49:21-- https://rest.uniprot.org/uniprotkb/P22629.fasta
Resolving rest.uniprot.org (rest.uniprot.org)... 193.62.193.81
Connecting to rest.uniprot.org (rest.uniprot.org)|193.62.193.81|:443... connected.
HTTP request sent, awaiting response... 200 OK
- Length: 264 [text/plain]
+ Length: unspecified [text/plain]
Saving to: ‘/content/project/P22629.fasta’
-
- /content/project/P2 100%[===================>] 264 --.-KB/s in 0s
-
- 2023-08-01 21:39:21 (144 MB/s) - ‘/content/project/P22629.fasta’ saved [264/264]
+
+ /content/project/P2 [ <=> ] 264 --.-KB/s in 0s
+
+ 2023-08-08 18:49:21 (157 MB/s) - ‘/content/project/P22629.fasta’ saved [264]
+
## Fold the protein
With the sequence downloaded, we can now use ColabFold to fold the protein.
+
+
+
```python
-from plex import CoreTools, plex_create
+from plex import CoreTools, plex_init
-initial_io_cid = plex_create(CoreTools.COLABFOLD_MINI.value, dir_path)
+fasta_local_filepaths = [f"{dir_path}/P22629.fasta"]
+
+initial_io_cid = plex_init(
+    CoreTools.COLABFOLD_MINI.value,
+    sequence=fasta_local_filepaths
+)
```
+ plex init -t QmcRH74qfqDBJFku3mEDGxkAf6CSpaHTpdbe1pMkHnbcZD -i {"sequence": ["/content/project/P22629.fasta"]} --scatteringMethod=dotProduct
Plex version (v0.8.4) up to date.
- Temporary directory created: /tmp/9ed8c638-c1b0-43da-bf92-7f054517d45c2889128719
- Reading tool config: QmcRH74qfqDBJFku3mEDGxkAf6CSpaHTpdbe1pMkHnbcZD
- Creating IO entries from input directory: /content/project
- Initialized IO file at: /tmp/9ed8c638-c1b0-43da-bf92-7f054517d45c2889128719/io.json
- Initial IO JSON file CID: QmUhysTE4aLZNw2ePRMCxHWko868xmQoXnGP25fKM1aofb
+ Pinned IO JSON CID: QmZgLQypfjvK9kTsqLXwbNRiFifEU5CC7eduWWPbminybi
+
This code initiates the folding process. We'll need to run it to complete the operation.
+
```python
from plex import plex_run
@@ -94,27 +107,29 @@ completed_io_cid, completed_io_filepath = plex_run(initial_io_cid, dir_path)
```
Plex version (v0.8.4) up to date.
- Created working directory: /content/project/2ef79c16-6f59-4e44-aea7-c39db85280cb
- Initialized IO file at: /content/project/2ef79c16-6f59-4e44-aea7-c39db85280cb/io.json
+ Created working directory: /content/project/9102a179-ac65-4823-9a03-93766ea32671
+ Initialized IO file at: /content/project/9102a179-ac65-4823-9a03-93766ea32671/io.json
Processing IO Entries
Starting to process IO entry 0
Job running...
- Bacalhau job id: 476d232b-e1c6-42d6-b1c0-2f4d237244b1
-
+ Bacalhau job id: 271f4b64-cb2d-4be6-86af-ed16186e69e0
+
Computing default go-libp2p Resource Manager limits based on:
- 'Swarm.ResourceMgr.MaxMemory': "6.8 GB"
- 'Swarm.ResourceMgr.MaxFileDescriptors': 524288
-
+
Applying any user-supplied overrides on top.
Run 'ipfs swarm limit all' to see the resulting limits.
-
+
Success processing IO entry 0
- Finished processing, results written to /content/project/2ef79c16-6f59-4e44-aea7-c39db85280cb/io.json
+ Finished processing, results written to /content/project/9102a179-ac65-4823-9a03-93766ea32671/io.json
Completed IO JSON CID: QmdnjMsUar6nTqGwgjCwN1Fyjaan4i3zyht9SE9L235YRm
+ 2023/08/08 18:51:17 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.
+
+
+After the job is complete, we can retrieve and view the results. The state of each job is recorded in the IO JSON, and every input and output file is referenced by a unique content address (its IPFS CID).
-## Viewing the results
-After the job is complete, we can retrieve and view the results. The state of each object is written in a JSON object. Every file has a unique content address.
```python
@@ -181,6 +196,21 @@ with open(completed_io_filepath, 'r') as f:
}
]
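
A minimal sketch of how the completed IO JSON could be read programmatically, assuming each entry carries the `state`, `errMsg`, and `outputs.best_folded_protein` fields seen in the ColabFold output of this tutorial. The helper name `best_folded_proteins` and the trimmed inline sample are our own illustrations, not part of plex; in the notebook the entries would come from `json.load` on `completed_io_filepath`.

```python
import json

# Trimmed sample modeled on the ColabFold IO JSON printed in this tutorial.
# In the notebook, load the real file from completed_io_filepath instead.
io_json_text = """
[
  {
    "errMsg": "",
    "outputs": {
      "best_folded_protein": {
        "class": "File",
        "filepath": "P22629_unrelaxed_rank_1_model_1.pdb",
        "ipfs": "QmTxVHTSUr8kLa9W8yM7KUNth2pNn8m3x6M18x8yiaV2SU"
      }
    },
    "state": "completed"
  }
]
"""


def best_folded_proteins(io_entries):
    """Return (filepath, cid) for the best fold of every completed entry.

    Hypothetical helper: entries that did not complete, or that report an
    error message, are skipped rather than raising.
    """
    results = []
    for entry in io_entries:
        if entry.get("state") != "completed" or entry.get("errMsg"):
            continue
        best = entry["outputs"]["best_folded_protein"]
        results.append((best["filepath"], best["ipfs"]))
    return results


entries = json.loads(io_json_text)
best = best_folded_proteins(entries)
for filepath, cid in best:
    print(f"{filepath}: {cid}")
```

Checking `state` and `errMsg` before reading `outputs` is a defensive choice on our side; we have not verified how plex populates these fields for failed jobs.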
-The output is a JSON file with information about the folded protein structures. This can be used for further analysis, visualization, and more.
-
+The results can also be viewed using an IPFS gateway. Below, the state of the IO JSON is read using the ipfs.io gateway.
+
+**Note:** Depending on how long it takes for the results to propagate to the ipfs.io nodes, the data may not be available immediately. The results can also be viewed in IPFS Desktop or by opening `ipfs://<completed_io_cid>` in the Brave browser.
+
+
+```python
+print(f"View this result on IPFS: https://ipfs.io/ipfs/{completed_io_cid}")
+```
+
+ View this result on IPFS: https://ipfs.io/ipfs/QmdnjMsUar6nTqGwgjCwN1Fyjaan4i3zyht9SE9L235YRm
+
+
+## Visualization and NFT minting
+
+For visualization and NFT minting steps, please visit the Colab notebook below.
+
+
diff --git a/docs/docs/tutorials/small-molecule-binding.md b/docs/docs/tutorials/small-molecule-binding.md
index 93758900c..183dada5d 100644
--- a/docs/docs/tutorials/small-molecule-binding.md
+++ b/docs/docs/tutorials/small-molecule-binding.md
@@ -8,17 +8,23 @@ import OpenInColab from '../../src/components/OpenInColab.js';
-## Small molecule binding in silico
+## Small molecule docking with plex
-Small molecule binding is a fundamental aspect of drug discovery, facilitating the interaction of potential drugs with target proteins. With PLEX, this intricate process is simplified and made efficient.
+In this tutorial we perform small molecule docking with **plex**.
-In the following tutorial, we illustrate how PLEX can be used to conduct small molecule binding studies to explore potential drug interactions with proteins. We demonstrate this with [Equibind](https://hannes-stark.com/assets/EquiBind.pdf).
+There are multiple reasons we believe plex is a new standard for computational biology 🧫:
+1. with a simple Python interface, running containerised tools with your data is only a few commands away
+2. the infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node
+3. every event on the compute network is tracked - no more results lost in an interactive compute session, so you can base your decisions and publications on fully reproducible results
+4. we made adding new tools to the network as easy as possible - moving your favorite tool to plex is one JSON document away
-
+In the following tutorial, we illustrate how plex can be used to conduct small molecule binding studies to explore potential drug interactions with proteins. We demonstrate this with [Equibind](https://hannes-stark.com/assets/EquiBind.pdf).
-## Install PLEX
+We will also walk through the process of minting a ProofOfScience NFT. These tokens are on-chain, verifiable records of a compute job and its input/output data, which helps make scientific results reproducible.
-We first install the plex pip package.
+
+
+## Install plex
```python
@@ -26,54 +32,99 @@ We first install the plex pip package.
```
Collecting PlexLabExchange
- Downloading PlexLabExchange-0.8.18-py3-none-manylinux2014_x86_64.whl (26.9 MB)
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 19.2 MB/s eta 0:00:00
+ Downloading PlexLabExchange-0.8.20-py3-none-manylinux2014_x86_64.whl (26.9 MB)
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 20.1 MB/s eta 0:00:00
Installing collected packages: PlexLabExchange
- Successfully installed PlexLabExchange-0.8.18
+ Successfully installed PlexLabExchange-0.8.20
-## Load small molecule and protein data
+Then, create a directory where we can save our project files.
-Next, we need to load the data about the small molecule and the protein that we're studying. This data, which is available on IPFS, will be used to initialize an IO JSON. This JSON file will serve as the job instructions for our binding study.
```python
-small_molecule_path = ["QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz/ZINC000003986735.sdf"]
-protein_path = ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"]
+import os
+
+cwd = os.getcwd()
+!mkdir project
+
+dir_path = f"{cwd}/project"
```
+## Download small molecule and protein data
+
+We'll download the small molecule `.sdf` file and the protein `.pdb` file we want to dock with Equibind.
+
+
+```python
+# small molecule
+!wget https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/ZINC000003986735.sdf -O {dir_path}/ZINC000003986735.sdf
+# protein
+!wget https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/7n9g.pdb -O {dir_path}/7n9g.pdb
+```
+
+ --2023-08-08 18:56:14-- https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/ZINC000003986735.sdf
+ Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
+ Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
+ HTTP request sent, awaiting response... 200 OK
+ Length: 2967 (2.9K) [text/plain]
+ Saving to: ‘/content/project/ZINC000003986735.sdf’
+
+ /content/project/ZI 100%[===================>] 2.90K --.-KB/s in 0s
+
+ 2023-08-08 18:56:14 (47.2 MB/s) - ‘/content/project/ZINC000003986735.sdf’ saved [2967/2967]
+
+ --2023-08-08 18:56:14-- https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/7n9g.pdb
+ Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ...
+ Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
+ HTTP request sent, awaiting response... 200 OK
+ Length: 580284 (567K) [text/plain]
+ Saving to: ‘/content/project/7n9g.pdb’
+
+ /content/project/7n 100%[===================>] 566.68K --.-KB/s in 0.05s
+
+ 2023-08-08 18:56:14 (12.1 MB/s) - ‘/content/project/7n9g.pdb’ saved [580284/580284]
+
+
+
+## Small molecule docking
+
+With the small molecule and protein files downloaded, we can now use Equibind to run a docking simulation.
+
```python
from plex import CoreTools, plex_init
+protein_path = [f"{dir_path}/7n9g.pdb"]
+small_molecule_path = [f"{dir_path}/ZINC000003986735.sdf"]
+
initial_io_cid = plex_init(
CoreTools.EQUIBIND.value,
protein=protein_path,
- small_molecule=small_molecule_path
+ small_molecule=small_molecule_path,
)
```
- plex init -t QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB -i {"protein": ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"], "small_molecule": ["QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz/ZINC000003986735.sdf"]} --scatteringMethod=dotProduct
- Plex version (v0.8.3) up to date.
+ plex init -t QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB -i {"protein": ["/content/project/7n9g.pdb"], "small_molecule": ["/content/project/ZINC000003986735.sdf"]} --scatteringMethod=dotProduct
+ Plex version (v0.8.4) up to date.
Pinned IO JSON CID: QmShD7ApeDBUqqy98RuuKdyv8AdmBsvyZqqxSLAEvB9EKP
-## Dock the small molecule and protein using Equibind
+This code creates the IO JSON that serves as the job instructions and pins it to IPFS. To run the docking job itself, we pass the resulting CID to `plex_run`.
-Now that we've prepared our job instructions, we're ready to dock the small molecule and protein using Equibind. With the IO JSON created and pinned to IPFS, we submit the job to the LabDAO Bacalhau cluster for computation.
```python
from plex import plex_run
-completed_io_cid, io_local_filepath = plex_run(initial_io_cid)
+completed_io_cid, io_local_filepath = plex_run(initial_io_cid, dir_path)
```
- Plex version (v0.8.3) up to date.
- Created working directory: /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866
- Initialized IO file at: /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866/io.json
+ Plex version (v0.8.4) up to date.
+ Created working directory: /content/project/2e3a8afd-928d-4fb7-a381-fff63c7d51de
+ Initialized IO file at: /content/project/2e3a8afd-928d-4fb7-a381-fff63c7d51de/io.json
Processing IO Entries
Starting to process IO entry 0
Job running...
- Bacalhau job id: a292c5fc-a717-47d5-a5b4-4d3401670a4f
+ Bacalhau job id: 892bf30d-7f6d-4cc7-a490-c1fa17d82171
Computing default go-libp2p Resource Manager limits based on:
- 'Swarm.ResourceMgr.MaxMemory': "6.8 GB"
@@ -83,13 +134,12 @@ completed_io_cid, io_local_filepath = plex_run(initial_io_cid)
Run 'ipfs swarm limit all' to see the resulting limits.
Success processing IO entry 0
- Finished processing, results written to /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866/io.json
+ Finished processing, results written to /content/project/2e3a8afd-928d-4fb7-a381-fff63c7d51de/io.json
Completed IO JSON CID: QmVG4mT2kkPSb6wzT5QxYZndB5VbKLU8nH2dErZW2zxae6
+ 2023/08/08 18:56:21 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.
-## Viewing the results
-
-The final step is to view our results. We read in the IO JSON file that contains the output from our job and print it. This data includes the best docked small molecule and the protein used, each with their own IPFS CIDs.
+After the job is complete, we can retrieve and view the results. The state of each job is recorded in the IO JSON, and every input and output file is referenced by a unique content address (its IPFS CID).
```python
@@ -136,6 +186,22 @@ with open(io_local_filepath, 'r') as f:
}
]
+
This output provides us with key information about the small molecule-protein interaction. The "best_docked_small_molecule" represents the most likely interaction between the protein and the small molecule, which can inform subsequent analysis and experiments.
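
To make the structure of the IO JSON concrete, here is a small sketch that lists every input and output file with its CID. It assumes the layout printed by this tutorial: each entry has `inputs` and `outputs` maps whose items are either a `File` or, as in the ColabFold tutorial, an `Array` with a `files` list. The helper name `list_files` and the inline sample are hypothetical; in the notebook the entries would come from `json.load` on `io_local_filepath`.

```python
import json

# Sample modeled on the Equibind IO JSON printed in this tutorial.
io_json_text = """
[
  {
    "errMsg": "",
    "inputs": {
      "protein": {
        "class": "File",
        "filepath": "7n9g.pdb",
        "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
      },
      "small_molecule": {
        "class": "File",
        "filepath": "ZINC000003986735.sdf",
        "ipfs": "QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz"
      }
    },
    "outputs": {
      "best_docked_small_molecule": {
        "class": "File",
        "filepath": "7n9g_ZINC000003986735_docked.sdf",
        "ipfs": "QmZdoaKEGtESnLoHFMb9bvqdwXjyUuRK6DbEoYz8PYpZ8W"
      },
      "protein": {
        "class": "File",
        "filepath": "7n9g.pdb",
        "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
      }
    },
    "state": "completed"
  }
]
"""


def list_files(io_entries):
    """Flatten an IO JSON into (section, name, filepath, cid) rows."""
    rows = []
    for entry in io_entries:
        for section in ("inputs", "outputs"):
            for name, item in entry.get(section, {}).items():
                # A "File" item is a single file; an "Array" item wraps several.
                files = item["files"] if item.get("class") == "Array" else [item]
                for file_info in files:
                    rows.append((section, name, file_info["filepath"], file_info["ipfs"]))
    return rows


rows = list_files(json.loads(io_json_text))
for section, name, filepath, cid in rows:
    print(f"{section:<8} {name:<28} {filepath:<34} {cid}")
```

Note that the input protein and the output protein share the same CID, which is what we would expect from content addressing: identical bytes yield an identical address.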
+The results can also be viewed using an IPFS gateway. Below, the state of the IO JSON is read using the ipfs.io gateway.
+
+**Note:** Depending on how long it takes for the results to propagate to the ipfs.io nodes, the data may not be available immediately. The results can also be viewed in IPFS Desktop or by opening `ipfs://<completed_io_cid>` in the Brave browser.
+
+
+```python
+print(f"View this result on IPFS: https://ipfs.io/ipfs/{completed_io_cid}")
+```
+
+ View this result on IPFS: https://ipfs.io/ipfs/QmVG4mT2kkPSb6wzT5QxYZndB5VbKLU8nH2dErZW2zxae6
+
+## Visualization and NFT minting
+
+For visualization and NFT minting steps, please visit the Colab notebook below.
+
diff --git a/docs/static/notebooks/plex_tutorial_colabfold.md b/docs/static/notebooks/plex_tutorial_colabfold.md
deleted file mode 100644
index b11c50e5c..000000000
--- a/docs/static/notebooks/plex_tutorial_colabfold.md
+++ /dev/null
@@ -1,165 +0,0 @@
-## Install PLEX
-
-
-```python
-!pip install PlexLabExchange
-```
-
- Collecting PlexLabExchange
- Downloading PlexLabExchange-0.8.18-py3-none-manylinux2014_x86_64.whl (26.9 MB)
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 20.1 MB/s eta 0:00:00
- Installing collected packages: PlexLabExchange
- Successfully installed PlexLabExchange-0.8.18
-
-
-
-```python
-import os
-
-cwd = os.getcwd()
-!mkdir project
-
-dir_path = f"{cwd}/project"
-```
-
-## Download `.fasta` file
-
-
-```python
-!pip install requests
-
-import requests
-
-def download_file(url, directory, filename=None):
-    local_filename = filename if filename else url.split('/')[-1]
-    with requests.get(url, stream=True) as r:
-        r.raise_for_status()
-        with open(os.path.join(directory, local_filename), 'wb') as f:
-            for chunk in r.iter_content(chunk_size=8192):
-                f.write(chunk)
-    return local_filename
-
-url = 'https://rest.uniprot.org/uniprotkb/P22629.fasta' # Streptavidin
-
-fasta_filepath = download_file(url, dir_path)
-```
-
- Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (2.27.1)
- Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests) (1.26.16)
- Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests) (2023.5.7)
- Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests) (2.0.12)
- Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests) (3.4)
-
-
-## Fold the protein using ColabFold
-
-
-```python
-from plex import CoreTools, plex_create
-
-sequences = [fasta_filepath]
-
-initial_io_cid = plex_create(CoreTools.COLABFOLD_MINI.value, dir_path)
-```
-
- Plex version (v0.8.3) up to date.
- Temporary directory created: /tmp/2604ada3-04ec-4d58-9ecc-1e65134c15674117000244
- Reading tool config: QmcRH74qfqDBJFku3mEDGxkAf6CSpaHTpdbe1pMkHnbcZD
- Creating IO entries from input directory: /content/project
- Initialized IO file at: /tmp/2604ada3-04ec-4d58-9ecc-1e65134c15674117000244/io.json
- Initial IO JSON file CID: QmUhysTE4aLZNw2ePRMCxHWko868xmQoXnGP25fKM1aofb
-
-
-
-```python
-from plex import plex_run
-
-completed_io_cid, completed_io_filepath = plex_run(initial_io_cid, dir_path)
-```
-
- Plex version (v0.8.3) up to date.
- Created working directory: /content/project/03ef6ae4-b2ff-424b-894c-05f8fbe48888
- Initialized IO file at: /content/project/03ef6ae4-b2ff-424b-894c-05f8fbe48888/io.json
- Processing IO Entries
- Starting to process IO entry 0
- Job running...
- Bacalhau job id: ac42f8de-1fea-4e09-9644-75c940bdbd5c
-
- Computing default go-libp2p Resource Manager limits based on:
- - 'Swarm.ResourceMgr.MaxMemory': "6.8 GB"
- - 'Swarm.ResourceMgr.MaxFileDescriptors': 524288
-
- Applying any user-supplied overrides on top.
- Run 'ipfs swarm limit all' to see the resulting limits.
-
- Success processing IO entry 0
- Finished processing, results written to /content/project/03ef6ae4-b2ff-424b-894c-05f8fbe48888/io.json
- Completed IO JSON CID: QmdnjMsUar6nTqGwgjCwN1Fyjaan4i3zyht9SE9L235YRm
- 2023/07/20 04:50:10 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.
-
-
-
-```python
-import json
-
-with open(completed_io_filepath, 'r') as f:
- data = json.load(f)
- pretty_data = json.dumps(data, indent=4, sort_keys=True)
- print(pretty_data)
-```
-
- [
- {
- "errMsg": "",
- "inputs": {
- "sequence": {
- "class": "File",
- "filepath": "P22629.fasta",
- "ipfs": "QmR3TRtG1EWszHJTpZWZut6VFqzBPWT5KYVJvaMdXFLWXn"
- }
- },
- "outputs": {
- "all_folded_proteins": {
- "class": "Array",
- "files": [
- {
- "class": "File",
- "filepath": "P22629_unrelaxed_rank_1_model_1.pdb",
- "ipfs": "QmXZHhB7qP1tnJNyR2TeH7m4gB1R5UF84SzvK94eYB9qdL"
- },
- {
- "class": "File",
- "filepath": "P22629_unrelaxed_rank_2_model_4.pdb",
- "ipfs": "QmPWGR36mbm5qptniHxd5KjUQKVn8EFMc57DMJzwcetNnU"
- },
- {
- "class": "File",
- "filepath": "P22629_unrelaxed_rank_3_model_3.pdb",
- "ipfs": "QmXQ1F8xD3TP1qDvU1HDhpuR5JDZvxv1G2udJSdTsimKvH"
- },
- {
- "class": "File",
- "filepath": "P22629_unrelaxed_rank_4_model_2.pdb",
- "ipfs": "QmV4TZJyWbu4CcmLTvD6nKM8YpzDK4fBsiiA3KQkHjW1RG"
- },
- {
- "class": "File",
- "filepath": "P22629_unrelaxed_rank_5_model_5.pdb",
- "ipfs": "QmVHT7nQzmNkxDJsRTJPAFqwqhqEgmD3QBGZpUPneogVqX"
- }
- ]
- },
- "best_folded_protein": {
- "class": "File",
- "filepath": "P22629_unrelaxed_rank_1_model_1.pdb",
- "ipfs": "QmTxVHTSUr8kLa9W8yM7KUNth2pNn8m3x6M18x8yiaV2SU"
- }
- },
- "state": "completed",
- "tool": {
- "ipfs": "QmcRH74qfqDBJFku3mEDGxkAf6CSpaHTpdbe1pMkHnbcZD",
- "name": "colabfold-mini"
- }
- }
- ]
-
diff --git a/docs/static/notebooks/plex_tutorial_docking_outputs.md b/docs/static/notebooks/plex_tutorial_docking_outputs.md
new file mode 100644
index 000000000..fea3b4c29
--- /dev/null
+++ b/docs/static/notebooks/plex_tutorial_docking_outputs.md
@@ -0,0 +1,191 @@
+## Small molecule docking with plex
+
+In this tutorial we perform small molecule docking with **plex**.
+
+There are multiple reasons we believe plex is a new standard for computational biology 🧫:
+1. with a simple Python interface, running containerised tools with your data is only a few commands away
+2. the infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node
+3. every event on the compute network is tracked - no more results lost in an interactive compute session, so you can base your decisions and publications on fully reproducible results
+4. we made adding new tools to the network as easy as possible - moving your favorite tool to plex is one JSON document away
+
+In the following tutorial, we illustrate how plex can be used to conduct small molecule binding studies to explore potential drug interactions with proteins. We demonstrate this with [Equibind](https://hannes-stark.com/assets/EquiBind.pdf).
+
+We will also walk through the process of minting a ProofOfScience NFT. These tokens are on-chain, verifiable records of a compute job and its input/output data, which helps make scientific results reproducible.
+
+
+
+## Install plex
+
+
+```python
+!pip install PlexLabExchange
+```
+
+ Collecting PlexLabExchange
+ Downloading PlexLabExchange-0.8.20-py3-none-manylinux2014_x86_64.whl (26.9 MB)
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 20.1 MB/s eta 0:00:00
+ Installing collected packages: PlexLabExchange
+ Successfully installed PlexLabExchange-0.8.20
+
+
+Then, create a directory where we can save our project files.
+
+
+```python
+import os
+
+cwd = os.getcwd()
+!mkdir project
+
+dir_path = f"{cwd}/project"
+```
+
+## Download small molecule and protein data
+
+We'll download the small molecule `.sdf` file and the protein `.pdb` file we want to dock with Equibind.
+
+
+```python
+# small molecule
+!wget https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/ZINC000003986735.sdf -O {dir_path}/ZINC000003986735.sdf
+# protein
+!wget https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/7n9g.pdb -O {dir_path}/7n9g.pdb
+```
+
+ --2023-08-08 18:56:14-- https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/ZINC000003986735.sdf
+ Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
+ Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
+ HTTP request sent, awaiting response... 200 OK
+ Length: 2967 (2.9K) [text/plain]
+ Saving to: ‘/content/project/ZINC000003986735.sdf’
+
+ /content/project/ZI 100%[===================>] 2.90K --.-KB/s in 0s
+
+ 2023-08-08 18:56:14 (47.2 MB/s) - ‘/content/project/ZINC000003986735.sdf’ saved [2967/2967]
+
+ --2023-08-08 18:56:14-- https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/7n9g.pdb
+ Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ...
+ Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
+ HTTP request sent, awaiting response... 200 OK
+ Length: 580284 (567K) [text/plain]
+ Saving to: ‘/content/project/7n9g.pdb’
+
+ /content/project/7n 100%[===================>] 566.68K --.-KB/s in 0.05s
+
+ 2023-08-08 18:56:14 (12.1 MB/s) - ‘/content/project/7n9g.pdb’ saved [580284/580284]
+
+
+
+## Small molecule docking
+
+With the small molecule and protein files downloaded, we can now use Equibind to run a docking simulation.
+
+
+```python
+from plex import CoreTools, plex_init
+
+protein_path = [f"{dir_path}/7n9g.pdb"]
+small_molecule_path = [f"{dir_path}/ZINC000003986735.sdf"]
+
+initial_io_cid = plex_init(
+    CoreTools.EQUIBIND.value,
+    protein=protein_path,
+    small_molecule=small_molecule_path,
+)
+```
+
+ plex init -t QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB -i {"protein": ["/content/project/7n9g.pdb"], "small_molecule": ["/content/project/ZINC000003986735.sdf"]} --scatteringMethod=dotProduct
+ Plex version (v0.8.4) up to date.
+ Pinned IO JSON CID: QmShD7ApeDBUqqy98RuuKdyv8AdmBsvyZqqxSLAEvB9EKP
+
+
+This code creates the IO JSON that serves as the job instructions and pins it to IPFS. To run the docking job itself, we pass the resulting CID to `plex_run`.
+
+
+```python
+from plex import plex_run
+
+completed_io_cid, io_local_filepath = plex_run(initial_io_cid, dir_path)
+```
+
+ Plex version (v0.8.4) up to date.
+ Created working directory: /content/project/2e3a8afd-928d-4fb7-a381-fff63c7d51de
+ Initialized IO file at: /content/project/2e3a8afd-928d-4fb7-a381-fff63c7d51de/io.json
+ Processing IO Entries
+ Starting to process IO entry 0
+ Job running...
+ Bacalhau job id: 892bf30d-7f6d-4cc7-a490-c1fa17d82171
+
+ Computing default go-libp2p Resource Manager limits based on:
+ - 'Swarm.ResourceMgr.MaxMemory': "6.8 GB"
+ - 'Swarm.ResourceMgr.MaxFileDescriptors': 524288
+
+ Applying any user-supplied overrides on top.
+ Run 'ipfs swarm limit all' to see the resulting limits.
+
+ Success processing IO entry 0
+ Finished processing, results written to /content/project/2e3a8afd-928d-4fb7-a381-fff63c7d51de/io.json
+ Completed IO JSON CID: QmVG4mT2kkPSb6wzT5QxYZndB5VbKLU8nH2dErZW2zxae6
+ 2023/08/08 18:56:21 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.
+
+
+After the job is complete, we can retrieve and view the results. The state of each job is recorded in the IO JSON, and every input and output file is referenced by a unique content address (its IPFS CID).
+
+
+```python
+import json
+
+with open(io_local_filepath, 'r') as f:
+    data = json.load(f)
+    pretty_data = json.dumps(data, indent=4, sort_keys=True)
+    print(pretty_data)
+```
+
+    [
+        {
+            "errMsg": "",
+            "inputs": {
+                "protein": {
+                    "class": "File",
+                    "filepath": "7n9g.pdb",
+                    "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
+                },
+                "small_molecule": {
+                    "class": "File",
+                    "filepath": "ZINC000003986735.sdf",
+                    "ipfs": "QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz"
+                }
+            },
+            "outputs": {
+                "best_docked_small_molecule": {
+                    "class": "File",
+                    "filepath": "7n9g_ZINC000003986735_docked.sdf",
+                    "ipfs": "QmZdoaKEGtESnLoHFMb9bvqdwXjyUuRK6DbEoYz8PYpZ8W"
+                },
+                "protein": {
+                    "class": "File",
+                    "filepath": "7n9g.pdb",
+                    "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
+                }
+            },
+            "state": "completed",
+            "tool": {
+                "ipfs": "QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB",
+                "name": "equibind"
+            }
+        }
+    ]
+
+
+This output provides us with key information about the small molecule-protein interaction. The "best_docked_small_molecule" represents the most likely interaction between the protein and the small molecule, which can inform subsequent analysis and experiments.
+
+The results can also be viewed using an IPFS gateway. Below, the state of the IO JSON is read using the ipfs.io gateway.
+
+**Note:** Depending on how long it takes for the results to propagate to the ipfs.io nodes, the data may not be available immediately. The results can also be viewed in IPFS Desktop or by opening `ipfs://<completed_io_cid>` in the Brave browser.
+
+
+```python
+print(f"View this result on IPFS: https://ipfs.io/ipfs/{completed_io_cid}")
+```
+
+ View this result on IPFS: https://ipfs.io/ipfs/QmVG4mT2kkPSb6wzT5QxYZndB5VbKLU8nH2dErZW2zxae6
diff --git a/docs/static/notebooks/plex_tutorial_equibind.md b/docs/static/notebooks/plex_tutorial_equibind.md
deleted file mode 100644
index 10d0669c5..000000000
--- a/docs/static/notebooks/plex_tutorial_equibind.md
+++ /dev/null
@@ -1,120 +0,0 @@
-## Install PLEX
-
-We first install the plex pip package.
-
-
-```python
-!pip install PlexLabExchange
-```
-
- Collecting PlexLabExchange
- Downloading PlexLabExchange-0.8.18-py3-none-manylinux2014_x86_64.whl (26.9 MB)
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 19.2 MB/s eta 0:00:00
- Installing collected packages: PlexLabExchange
- Successfully installed PlexLabExchange-0.8.18
-
-
-## Load small molecule and protein data
-
-Next, we load the small molecule and protein data which are available on IPFS. This data is used to initialize an IO JSON that serves as job instructions.
-
-
-```python
-small_molecule_path = ["QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz/ZINC000003986735.sdf"]
-protein_path = ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"]
-```
-
-
-```python
-from plex import CoreTools, plex_init
-
-initial_io_cid = plex_init(
- CoreTools.EQUIBIND.value,
- protein=protein_path,
- small_molecule=small_molecule_path
-)
-```
-
- plex init -t QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB -i {"protein": ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"], "small_molecule": ["QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz/ZINC000003986735.sdf"]} --scatteringMethod=dotProduct
- Plex version (v0.8.3) up to date.
- Pinned IO JSON CID: QmShD7ApeDBUqqy98RuuKdyv8AdmBsvyZqqxSLAEvB9EKP
-
-
-## Dock the small molecule and protein using Equibind
-
-With the IO JSON created and pinned to IPFS, we can now submit the job to the LabDAO Bacalhau cluster for compute.
-
-
-```python
-from plex import plex_run
-
-completed_io_cid, io_local_filepath = plex_run(initial_io_cid)
-```
-
- Plex version (v0.8.3) up to date.
- Created working directory: /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866
- Initialized IO file at: /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866/io.json
- Processing IO Entries
- Starting to process IO entry 0
- Job running...
- Bacalhau job id: a292c5fc-a717-47d5-a5b4-4d3401670a4f
-
- Computing default go-libp2p Resource Manager limits based on:
- - 'Swarm.ResourceMgr.MaxMemory': "6.8 GB"
- - 'Swarm.ResourceMgr.MaxFileDescriptors': 524288
-
- Applying any user-supplied overrides on top.
- Run 'ipfs swarm limit all' to see the resulting limits.
-
- Success processing IO entry 0
- Finished processing, results written to /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866/io.json
- Completed IO JSON CID: QmVG4mT2kkPSb6wzT5QxYZndB5VbKLU8nH2dErZW2zxae6
-
-
-Time to view our results!
-
-
-```python
-import json
-
-with open(io_local_filepath, 'r') as f:
- data = json.load(f)
- pretty_data = json.dumps(data, indent=4, sort_keys=True)
- print(pretty_data)
-```
-
- [
- {
- "errMsg": "",
- "inputs": {
- "protein": {
- "class": "File",
- "filepath": "7n9g.pdb",
- "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
- },
- "small_molecule": {
- "class": "File",
- "filepath": "ZINC000003986735.sdf",
- "ipfs": "QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz"
- }
- },
- "outputs": {
- "best_docked_small_molecule": {
- "class": "File",
- "filepath": "7n9g_ZINC000003986735_docked.sdf",
- "ipfs": "QmZdoaKEGtESnLoHFMb9bvqdwXjyUuRK6DbEoYz8PYpZ8W"
- },
- "protein": {
- "class": "File",
- "filepath": "7n9g.pdb",
- "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
- }
- },
- "state": "completed",
- "tool": {
- "ipfs": "QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB",
- "name": "equibind"
- }
- }
- ]
-
diff --git a/docs/static/notebooks/plex_tutorial_protein_folding_outputs.md b/docs/static/notebooks/plex_tutorial_protein_folding_outputs.md
new file mode 100644
index 000000000..a8373d99b
--- /dev/null
+++ b/docs/static/notebooks/plex_tutorial_protein_folding_outputs.md
@@ -0,0 +1,199 @@
+## Protein folding in silico
+
+In this tutorial we perform protein folding with **plex**.
+
+There are multiple reasons we believe plex is a new standard for computational biology 🧫:
+1. with a simple python interface, running containerised tools with your data is only a few commands away
+2. the infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node
+3. every event on the compute network is tracked - no more results are lost in an interactive compute session. You can base your decisions and publications on fully reproducible results.
+4. we made adding new tools to the network as easy as possible - moving your favorite tool to plex is one JSON document away.
+
+In this tutorial, we'll walk through an example of how to use plex to predict a protein's 3D structure using [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We will use the sequence of the Streptavidin protein for this demo.
+
+We will also walk through the process of minting a ProofOfScience NFT. These tokens represent on-chain, verifiable records of the compute job and its input/output data. This enables reproducible scientific results.
+
+
+
+## Install plex
+
+
+```python
+!pip install PlexLabExchange
+```
+
+ Collecting PlexLabExchange
+ Downloading PlexLabExchange-0.8.20-py3-none-manylinux2014_x86_64.whl (26.9 MB)
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.9/26.9 MB 16.6 MB/s eta 0:00:00
+ Installing collected packages: PlexLabExchange
+ Successfully installed PlexLabExchange-0.8.20
+
+
+Then, create a directory where we can save our project files.
+
+
+```python
+import os
+
+cwd = os.getcwd()
+!mkdir project
+
+dir_path = f"{cwd}/project"
+```
+
+## Download protein sequence
+
+We'll download a `.fasta` file containing the sequence of the protein we want to fold. Here, we're using the sequence of Streptavidin.
+
+
+
+
+```python
+!wget https://rest.uniprot.org/uniprotkb/P22629.fasta -O {dir_path}/P22629.fasta # Streptavidin
+```
+
+ --2023-08-08 18:49:21-- https://rest.uniprot.org/uniprotkb/P22629.fasta
+ Resolving rest.uniprot.org (rest.uniprot.org)... 193.62.193.81
+ Connecting to rest.uniprot.org (rest.uniprot.org)|193.62.193.81|:443... connected.
+ HTTP request sent, awaiting response... 200 OK
+ Length: unspecified [text/plain]
+ Saving to: ‘/content/project/P22629.fasta’
+
+ /content/project/P2 [ <=> ] 264 --.-KB/s in 0s
+
+ 2023-08-08 18:49:21 (157 MB/s) - ‘/content/project/P22629.fasta’ saved [264]
+
+
+
+## Fold the protein
+
+With the sequence downloaded, we can now use ColabFold to fold the protein.
+
+
+
+
+```python
+from plex import CoreTools, plex_init
+
+fasta_local_filepaths = [f"{dir_path}/P22629.fasta"]
+
+initial_io_cid = plex_init(
+ CoreTools.COLABFOLD_MINI.value,
+ sequence=fasta_local_filepaths
+)
+```
+
+ plex init -t QmcRH74qfqDBJFku3mEDGxkAf6CSpaHTpdbe1pMkHnbcZD -i {"sequence": ["/content/project/P22629.fasta"]} --scatteringMethod=dotProduct
+ Plex version (v0.8.4) up to date.
+ Pinned IO JSON CID: QmZgLQypfjvK9kTsqLXwbNRiFifEU5CC7eduWWPbminybi
+
+
+This initializes the job: `plex_init` writes an IO JSON describing the tool and its inputs and pins it to IPFS. Next, we run the job with `plex_run` to carry out the folding.
+
+
+```python
+from plex import plex_run
+
+completed_io_cid, completed_io_filepath = plex_run(initial_io_cid, dir_path)
+```
+
+ Plex version (v0.8.4) up to date.
+ Created working directory: /content/project/9102a179-ac65-4823-9a03-93766ea32671
+ Initialized IO file at: /content/project/9102a179-ac65-4823-9a03-93766ea32671/io.json
+ Processing IO Entries
+ Starting to process IO entry 0
+ Job running...
+ Bacalhau job id: 271f4b64-cb2d-4be6-86af-ed16186e69e0
+
+ Computing default go-libp2p Resource Manager limits based on:
+ - 'Swarm.ResourceMgr.MaxMemory': "6.8 GB"
+ - 'Swarm.ResourceMgr.MaxFileDescriptors': 524288
+
+ Applying any user-supplied overrides on top.
+ Run 'ipfs swarm limit all' to see the resulting limits.
+
+ Success processing IO entry 0
+ Finished processing, results written to /content/project/9102a179-ac65-4823-9a03-93766ea32671/io.json
+ Completed IO JSON CID: QmdnjMsUar6nTqGwgjCwN1Fyjaan4i3zyht9SE9L235YRm
+ 2023/08/08 18:51:17 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.
+
+
+After the job is complete, we can retrieve and view the results. The state of the job is recorded in the IO JSON, and every input and output file is identified by a unique content address (its IPFS CID).
+
+
+
+
+```python
+import json
+
+with open(completed_io_filepath, 'r') as f:
+ data = json.load(f)
+ pretty_data = json.dumps(data, indent=4, sort_keys=True)
+ print(pretty_data)
+```
+
+ [
+ {
+ "errMsg": "",
+ "inputs": {
+ "sequence": {
+ "class": "File",
+ "filepath": "P22629.fasta",
+ "ipfs": "QmR3TRtG1EWszHJTpZWZut6VFqzBPWT5KYVJvaMdXFLWXn"
+ }
+ },
+ "outputs": {
+ "all_folded_proteins": {
+ "class": "Array",
+ "files": [
+ {
+ "class": "File",
+ "filepath": "P22629_unrelaxed_rank_1_model_1.pdb",
+ "ipfs": "QmXZHhB7qP1tnJNyR2TeH7m4gB1R5UF84SzvK94eYB9qdL"
+ },
+ {
+ "class": "File",
+ "filepath": "P22629_unrelaxed_rank_2_model_4.pdb",
+ "ipfs": "QmPWGR36mbm5qptniHxd5KjUQKVn8EFMc57DMJzwcetNnU"
+ },
+ {
+ "class": "File",
+ "filepath": "P22629_unrelaxed_rank_3_model_3.pdb",
+ "ipfs": "QmXQ1F8xD3TP1qDvU1HDhpuR5JDZvxv1G2udJSdTsimKvH"
+ },
+ {
+ "class": "File",
+ "filepath": "P22629_unrelaxed_rank_4_model_2.pdb",
+ "ipfs": "QmV4TZJyWbu4CcmLTvD6nKM8YpzDK4fBsiiA3KQkHjW1RG"
+ },
+ {
+ "class": "File",
+ "filepath": "P22629_unrelaxed_rank_5_model_5.pdb",
+ "ipfs": "QmVHT7nQzmNkxDJsRTJPAFqwqhqEgmD3QBGZpUPneogVqX"
+ }
+ ]
+ },
+ "best_folded_protein": {
+ "class": "File",
+ "filepath": "P22629_unrelaxed_rank_1_model_1.pdb",
+ "ipfs": "QmTxVHTSUr8kLa9W8yM7KUNth2pNn8m3x6M18x8yiaV2SU"
+ }
+ },
+ "state": "completed",
+ "tool": {
+ "ipfs": "QmcRH74qfqDBJFku3mEDGxkAf6CSpaHTpdbe1pMkHnbcZD",
+ "name": "colabfold-mini"
+ }
+ }
+ ]
+
+
+The results can also be viewed through an IPFS gateway. The cell below prints a link to the completed IO JSON on the ipfs.io gateway.
+
+**Note:** Results can take some time to propagate to the ipfs.io nodes, so the data may not be available immediately. The results can also be viewed in IPFS Desktop or in the Brave browser (`ipfs://<completed_io_cid>`).
+
+
+```python
+print(f"View this result on IPFS: https://ipfs.io/ipfs/{completed_io_cid}")
+```
+
+ View this result on IPFS: https://ipfs.io/ipfs/QmdnjMsUar6nTqGwgjCwN1Fyjaan4i3zyht9SE9L235YRm
diff --git a/python/notebooks/colab/plex_tutorial_docking.ipynb b/python/notebooks/colab/plex_tutorial_docking.ipynb
new file mode 100644
index 000000000..2a53d80cb
--- /dev/null
+++ b/python/notebooks/colab/plex_tutorial_docking.ipynb
@@ -0,0 +1,338 @@
+{
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Small molecule docking with plex\n",
+ "\n",
+ "In this tutorial we perform small molecule docking with **plex**.\n",
+ "\n",
+ "There are multiple reasons we believe plex is a new standard for computational biology 🧫:\n",
+ "1. with a simple python interface, running containerised tools with your data is only a few commands away\n",
+ "2. the infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node\n",
+ "3. every event on the compute network is tracked - no more results are lost in an interactive compute session. You can base your decisions and publications on fully reproducible results.\n",
+ "4. we made adding new tools to the network as easy as possible - moving your favorite tool to plex is one JSON document away.\n",
+ "\n",
+ "In the following tutorial, we illustrate how plex can be used to conduct small molecule binding studies to explore potential drug interactions with proteins. We demonstrate this with [Equibind](https://hannes-stark.com/assets/EquiBind.pdf).\n",
+ "\n",
+ "We will also walk through the process of minting a ProofOfScience NFT. These tokens represent on-chain, verifiable records of the compute job and its input/output data. This enables reproducible scientific results.\n",
+ "\n",
+ ""
+ ],
+ "metadata": {
+ "id": "Ql6qNQbwfiCx"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Install plex"
+ ],
+ "metadata": {
+ "id": "X_WSZx7OSFll"
+ }
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "99yQg2fbNPBT"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install PlexLabExchange"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Then, create a directory where we can save our project files."
+ ],
+ "metadata": {
+ "id": "flNVIfd32JnL"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "import os\n",
+ "\n",
+ "cwd = os.getcwd()\n",
+ "!mkdir project\n",
+ "\n",
+ "dir_path = f\"{cwd}/project\""
+ ],
+ "metadata": {
+ "id": "Urna1EK2gMvr"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Download small molecule and protein data\n",
+ "\n",
+ "We'll download the small molecule `.sdf` and protein `.pdb` we want to dock with Equibind."
+ ],
+ "metadata": {
+ "id": "GAqQ9Eg0SOTS"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# small molecule\n",
+ "!wget https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/ZINC000003986735.sdf -O {dir_path}/ZINC000003986735.sdf\n",
+ "# protein\n",
+ "!wget https://raw.githubusercontent.com/labdao/plex/main/testdata/binding/abl/7n9g.pdb -O {dir_path}/7n9g.pdb"
+ ],
+ "metadata": {
+ "id": "ZnM0Tm7bNb1R"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Small molecule docking\n",
+ "\n",
+ "With the small molecule and protein files downloaded, we can now use Equibind to run a docking simulation."
+ ],
+ "metadata": {
+ "id": "yZMizxZY2znl"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from plex import CoreTools, plex_init\n",
+ "\n",
+ "protein_path = [f\"{dir_path}/7n9g.pdb\"]\n",
+ "small_molecule_path = [f\"{dir_path}/ZINC000003986735.sdf\"]\n",
+ "\n",
+ "initial_io_cid = plex_init(\n",
+ " CoreTools.EQUIBIND.value,\n",
+ " protein=protein_path,\n",
+ " small_molecule=small_molecule_path,\n",
+ ")"
+ ],
+ "metadata": {
+ "id": "ZT2weQliNeGX"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "This initializes the job: `plex_init` writes an IO JSON describing the tool and its inputs and pins it to IPFS. Next, we run the job with `plex_run` to carry out the docking."
+ ],
+ "metadata": {
+ "id": "iaOF6_5b3D_E"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from plex import plex_run\n",
+ "\n",
+ "completed_io_cid, io_local_filepath = plex_run(initial_io_cid, dir_path)"
+ ],
+ "metadata": {
+ "id": "sRkhy9HPRmbp"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "After the job is complete, we can retrieve and view the results. The state of the job is recorded in the IO JSON, and every input and output file is identified by a unique content address (its IPFS CID)."
+ ],
+ "metadata": {
+ "id": "rAVv3gtxVga6"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "import json\n",
+ "\n",
+ "with open(io_local_filepath, 'r') as f:\n",
+ " data = json.load(f)\n",
+ " pretty_data = json.dumps(data, indent=4, sort_keys=True)\n",
+ " print(pretty_data)"
+ ],
+ "metadata": {
+ "id": "5Q9yPul1SsYO"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "This output records the result of the docking run. The `best_docked_small_molecule` entry points to the docked pose that Equibind predicts for the small molecule on the protein, which can inform subsequent analysis and experiments."
+ ],
+ "metadata": {
+ "id": "o0zJu176gWPu"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "The results can also be viewed through an IPFS gateway. The cell below prints a link to the completed IO JSON on the ipfs.io gateway.\n",
+ "\n",
+ "**Note:** Results can take some time to propagate to the ipfs.io nodes, so the data may not be available immediately. The results can also be viewed in IPFS Desktop or in the Brave browser (`ipfs://<completed_io_cid>`)."
+ ],
+ "metadata": {
+ "id": "rr8OsUZQ3YMS"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "print(f\"View this result on IPFS: https://ipfs.io/ipfs/{completed_io_cid}\")"
+ ],
+ "metadata": {
+ "id": "bQ4EtVYIAjxX"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Visualize the small molecule docking"
+ ],
+ "metadata": {
+ "id": "vC1u6QfW3qUk"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from plex import plex_vectorize\n",
+ "\n",
+ "results = plex_vectorize(completed_io_cid, CoreTools.EQUIBIND.value)\n",
+ "\n",
+ "best_docked_small_molecule_path = results['best_docked_small_molecule']['filePaths'][0]\n",
+ "best_docked_small_molecule_cid = results['best_docked_small_molecule']['cidPaths'][0]\n",
+ "\n",
+ "print(results)\n",
+ "print(best_docked_small_molecule_path)\n",
+ "print(best_docked_small_molecule_cid)"
+ ],
+ "metadata": {
+ "id": "qNDIb1N0idP9"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!pip install py3Dmol"
+ ],
+ "metadata": {
+ "id": "cnrU4D9ci7KL"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "import py3Dmol\n",
+ "\n",
+ "def show_pdb_and_sdf(pdb_file, sdf_file):\n",
+ " viewer = py3Dmol.view()\n",
+ "\n",
+ " # Add PDB model\n",
+ " with open(pdb_file, 'r') as f:\n",
+ " viewer.addModel(f.read(), 'pdb')\n",
+ "\n",
+ " # Add SDF model\n",
+ " with open(sdf_file, 'r') as f:\n",
+ " viewer.addModel(f.read(), 'sdf')\n",
+ "\n",
+ " # Set style for the visualization\n",
+ " viewer.setStyle({'model':0}, {'cartoon': {'color':'spectrum'}})\n",
+ " viewer.setStyle({'model':1}, {'stick': {}})\n",
+ "\n",
+ " viewer.zoomTo()\n",
+ " viewer.show()\n",
+ "\n",
+ "show_pdb_and_sdf(results['protein']['filePaths'][0], best_docked_small_molecule_path)"
+ ],
+ "metadata": {
+ "id": "p4sh-SccjBvT"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Mint a ProofOfScience NFT\n",
+ "\n",
+ "We can now mint a ProofOfScience token by providing the IPFS CID of the completed IO JSON to the `plex_mint` function."
+ ],
+ "metadata": {
+ "id": "dgJyNyJAmyqD"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "os.environ[\"RECIPIENT_WALLET\"] = \"\" # enter your wallet address"
+ ],
+ "metadata": {
+ "id": "psLIAje7EkOy"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from plex import plex_mint\n",
+ "\n",
+ "# using the autotask webhook enables gasless minting\n",
+ "os.environ[\"AUTOTASK_WEBHOOK\"] = \"https://api.defender.openzeppelin.com/autotasks/e15b3f39-28f8-4d30-9bf3-5d569bdf2e78/runs/webhook/8315d17c-c493-4d04-a257-79209f95bb64/2gmqi9SRRAQMoy1SRdktai\"\n",
+ "\n",
+ "plex_mint(completed_io_cid)"
+ ],
+ "metadata": {
+ "id": "vvIF_WXamws4"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Congratulations on making it through this tutorial! If you'd like to stay up to date with LabDAO, please consider signing up for our [newsletter]()."
+ ],
+ "metadata": {
+ "id": "nDV858Andmbl"
+ }
+ }
+ ]
+}
\ No newline at end of file
diff --git a/python/notebooks/colab/plex_tutorial_equibind.ipynb b/python/notebooks/colab/plex_tutorial_equibind.ipynb
deleted file mode 100644
index 699c6903e..000000000
--- a/python/notebooks/colab/plex_tutorial_equibind.ipynb
+++ /dev/null
@@ -1,234 +0,0 @@
-{
- "nbformat": 4,
- "nbformat_minor": 0,
- "metadata": {
- "colab": {
- "provenance": []
- },
- "kernelspec": {
- "name": "python3",
- "display_name": "Python 3"
- },
- "language_info": {
- "name": "python"
- }
- },
- "cells": [
- {
- "cell_type": "markdown",
- "source": [
- "## Install PLEX\n",
- "\n",
- "We first install the plex pip package."
- ],
- "metadata": {
- "id": "X_WSZx7OSFll"
- }
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "99yQg2fbNPBT",
- "outputId": "7f884352-2ffc-49f5-be67-7bf9422c3e53"
- },
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "Collecting PlexLabExchange\n",
- " Downloading PlexLabExchange-0.8.18-py3-none-manylinux2014_x86_64.whl (26.9 MB)\n",
- "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m26.9/26.9 MB\u001b[0m \u001b[31m19.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
- "\u001b[?25hInstalling collected packages: PlexLabExchange\n",
- "Successfully installed PlexLabExchange-0.8.18\n"
- ]
- }
- ],
- "source": [
- "!pip install PlexLabExchange"
- ]
- },
- {
- "cell_type": "markdown",
- "source": [
- "## Load small molecule and protein data\n",
- "\n",
- "Next, we load the small molecule and protein data which are available on IPFS. This data is used to initialize an IO JSON that serves as job instructions."
- ],
- "metadata": {
- "id": "GAqQ9Eg0SOTS"
- }
- },
- {
- "cell_type": "code",
- "source": [
- "small_molecule_path = [\"QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz/ZINC000003986735.sdf\"]\n",
- "protein_path = [\"QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb\"]"
- ],
- "metadata": {
- "id": "ZnM0Tm7bNb1R"
- },
- "execution_count": 2,
- "outputs": []
- },
- {
- "cell_type": "code",
- "source": [
- "from plex import CoreTools, plex_init\n",
- "\n",
- "initial_io_cid = plex_init(\n",
- " CoreTools.EQUIBIND.value,\n",
- " protein=protein_path,\n",
- " small_molecule=small_molecule_path\n",
- ")"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "ZT2weQliNeGX",
- "outputId": "ca7aef97-96de-4d8b-aa55-2018db6eb833"
- },
- "execution_count": 3,
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "plex init -t QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB -i {\"protein\": [\"QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb\"], \"small_molecule\": [\"QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz/ZINC000003986735.sdf\"]} --scatteringMethod=dotProduct\n",
- "Plex version (v0.8.3) up to date.\n",
- "Pinned IO JSON CID: QmShD7ApeDBUqqy98RuuKdyv8AdmBsvyZqqxSLAEvB9EKP\n"
- ]
- }
- ]
- },
- {
- "cell_type": "markdown",
- "source": [
- "## Dock the small molecule and protein using Equibind\n",
- "\n",
- "With the IO JSON created and pinned to IPFS, we can now submit the job to the LabDAO Bacalhau cluster for compute."
- ],
- "metadata": {
- "id": "9Rkyd4_KSZG0"
- }
- },
- {
- "cell_type": "code",
- "source": [
- "from plex import plex_run\n",
- "\n",
- "completed_io_cid, io_local_filepath = plex_run(initial_io_cid)"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "sRkhy9HPRmbp",
- "outputId": "b1ec1b5d-1c97-4261-9578-29e3f1e8d1e3"
- },
- "execution_count": 4,
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "Plex version (v0.8.3) up to date.\n",
- "Created working directory: /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866\n",
- "Initialized IO file at: /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866/io.json\n",
- "Processing IO Entries\n",
- "Starting to process IO entry 0 \n",
- "Job running...\n",
- "Bacalhau job id: a292c5fc-a717-47d5-a5b4-4d3401670a4f \n",
- "\n",
- "Computing default go-libp2p Resource Manager limits based on:\n",
- " - 'Swarm.ResourceMgr.MaxMemory': \"6.8 GB\"\n",
- " - 'Swarm.ResourceMgr.MaxFileDescriptors': 524288\n",
- "\n",
- "Applying any user-supplied overrides on top.\n",
- "Run 'ipfs swarm limit all' to see the resulting limits.\n",
- "\n",
- "Success processing IO entry 0 \n",
- "Finished processing, results written to /jobs/3f9b386d-a74d-463c-8ca6-a882d053c866/io.json\n",
- "Completed IO JSON CID: QmVG4mT2kkPSb6wzT5QxYZndB5VbKLU8nH2dErZW2zxae6\n",
- "2023/07/20 03:28:45 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.\n"
- ]
- }
- ]
- },
- {
- "cell_type": "markdown",
- "source": [
- "Time to view our results!"
- ],
- "metadata": {
- "id": "rAVv3gtxVga6"
- }
- },
- {
- "cell_type": "code",
- "source": [
- "import json\n",
- "\n",
- "with open(io_local_filepath, 'r') as f:\n",
- " data = json.load(f)\n",
- " pretty_data = json.dumps(data, indent=4, sort_keys=True)\n",
- " print(pretty_data)"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "5Q9yPul1SsYO",
- "outputId": "dd9ef771-bcdb-455d-ee44-25ca7d84426c"
- },
- "execution_count": 5,
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "[\n",
- " {\n",
- " \"errMsg\": \"\",\n",
- " \"inputs\": {\n",
- " \"protein\": {\n",
- " \"class\": \"File\",\n",
- " \"filepath\": \"7n9g.pdb\",\n",
- " \"ipfs\": \"QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd\"\n",
- " },\n",
- " \"small_molecule\": {\n",
- " \"class\": \"File\",\n",
- " \"filepath\": \"ZINC000003986735.sdf\",\n",
- " \"ipfs\": \"QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz\"\n",
- " }\n",
- " },\n",
- " \"outputs\": {\n",
- " \"best_docked_small_molecule\": {\n",
- " \"class\": \"File\",\n",
- " \"filepath\": \"7n9g_ZINC000003986735_docked.sdf\",\n",
- " \"ipfs\": \"QmZdoaKEGtESnLoHFMb9bvqdwXjyUuRK6DbEoYz8PYpZ8W\"\n",
- " },\n",
- " \"protein\": {\n",
- " \"class\": \"File\",\n",
- " \"filepath\": \"7n9g.pdb\",\n",
- " \"ipfs\": \"QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd\"\n",
- " }\n",
- " },\n",
- " \"state\": \"completed\",\n",
- " \"tool\": {\n",
- " \"ipfs\": \"QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB\",\n",
- " \"name\": \"equibind\"\n",
- " }\n",
- " }\n",
- "]\n"
- ]
- }
- ]
- }
- ]
-}
\ No newline at end of file
diff --git a/python/notebooks/colab/plex_tutorial_protein_folding.ipynb b/python/notebooks/colab/plex_tutorial_protein_folding.ipynb
index f0df4e95e..4370845b0 100644
--- a/python/notebooks/colab/plex_tutorial_protein_folding.ipynb
+++ b/python/notebooks/colab/plex_tutorial_protein_folding.ipynb
@@ -1,44 +1,56 @@
{
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
"cells": [
{
"cell_type": "markdown",
- "metadata": {
- "id": "eAGjvflZbCDh"
- },
"source": [
"## Protein folding in silico\n",
"\n",
- "In this tutorial, we perform protein folding with PLEX.\n",
+ "In this tutorial we perform protein folding with **plex**.\n",
"\n",
- "There are multiple reasons we believe PLEX is a new standard for computational biology 🧫:\n",
- "1. With a simple python interface, running containerised tools with your data is only a few commands away\n",
- "2. The infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node\n",
- "3. Every event on the compute network is tracked - no more results are lost in an interactive compute session. You can base your decisions and publications on fully reproducible results.\n",
- "4. We made adding new tools to the network as easy as possible - moving your favorite tool to PLEX is one JSON document away.\n",
+ "There are multiple reasons we believe plex is a new standard for computational biology 🧫:\n",
+ "1. with a simple python interface, running containerised tools with your data is only a few commands away\n",
+ "2. the infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node\n",
+ "3. every event on the compute network is tracked - no more results are lost in an interactive compute session. You can base your decisions and publications on fully reproducible results.\n",
+ "4. we made adding new tools to the network as easy as possible - moving your favorite tool to plex is one JSON document away.\n",
"\n",
- "We'll walk through an example of how to use PLEX to predict a protein's 3D structure using [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We will use the sequence of the Streptavidin protein for this demo.\n",
+ "In this tutorial, we'll walk through an example of how to use plex to predict a protein's 3D structure using [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We will use the sequence of the Streptavidin protein for this demo.\n",
+ "\n",
+ "We will also walk through the process of minting a ProofOfScience NFT. These tokens represent on-chain, verifiable records of the compute job and its input/output data. This enables reproducible scientific results.\n",
"\n",
""
- ]
+ ],
+ "metadata": {
+ "id": "eAGjvflZbCDh"
+ }
},
{
"cell_type": "markdown",
+ "source": [
+ "## Install plex"
+ ],
"metadata": {
"id": "s2c3TpEImg_r"
- },
- "source": [
- "## Install PLEX"
- ]
+ }
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "Hs1hTk0umdD6",
- "outputId": "abd4c812-6743-4740-9fdb-df88e5124381"
+ "id": "Hs1hTk0umdD6"
},
"outputs": [],
"source": [
@@ -47,20 +59,15 @@
},
{
"cell_type": "markdown",
+ "source": [
+ "Then, create a directory where we can save our project files."
+ ],
"metadata": {
"id": "GHwHKauOgwEE"
- },
- "source": [
- "Then, create a directory where we can save our project files.\n"
- ]
+ }
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "ZF0r7o8wnNrv"
- },
- "outputs": [],
"source": [
"import os\n",
"\n",
@@ -68,111 +75,100 @@
"!mkdir project\n",
"\n",
"dir_path = f\"{cwd}/project\""
- ]
+ ],
+ "metadata": {
+ "id": "ZF0r7o8wnNrv"
+ },
+ "execution_count": null,
+ "outputs": []
},
{
"cell_type": "markdown",
- "metadata": {
- "id": "LTOFbSGBm38S"
- },
"source": [
"## Download protein sequence\n",
"\n",
"We'll download a `.fasta` file containing the sequence of the protein we want to fold. Here, we're using the sequence of Streptavidin.\n",
"\n"
- ]
+ ],
+ "metadata": {
+ "id": "LTOFbSGBm38S"
+ }
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "drMqTIxrm805",
- "outputId": "7d615b05-738e-4740-b427-c984fe9f44ad"
- },
- "outputs": [],
"source": [
"!wget https://rest.uniprot.org/uniprotkb/P22629.fasta -O {dir_path}/P22629.fasta # Streptavidin"
- ]
+ ],
+ "metadata": {
+ "id": "drMqTIxrm805"
+ },
+ "execution_count": null,
+ "outputs": []
},
{
"cell_type": "markdown",
- "metadata": {
- "id": "hCTgrr1nnDqn"
- },
"source": [
"## Fold the protein\n",
"\n",
"With the sequence downloaded, we can now use ColabFold to fold the protein.\n",
"\n"
- ]
+ ],
+ "metadata": {
+ "id": "hCTgrr1nnDqn"
+ }
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "vXP9OKCVnX1u",
- "outputId": "73b0803f-7305-44ba-e0b6-1e440a52f550"
- },
- "outputs": [],
"source": [
- "from plex import CoreTools, plex_create\n",
+ "from plex import CoreTools, plex_init\n",
"\n",
- "initial_io_cid = plex_create(CoreTools.COLABFOLD_MINI.value, dir_path)"
- ]
+ "fasta_local_filepaths = [f\"{dir_path}/P22629.fasta\"]\n",
+ "\n",
+ "initial_io_cid = plex_init(\n",
+ " CoreTools.COLABFOLD_MINI.value,\n",
+ " sequence=fasta_local_filepaths\n",
+ ")"
+ ],
+ "metadata": {
+ "id": "vXP9OKCVnX1u"
+ },
+ "execution_count": null,
+ "outputs": []
},
{
"cell_type": "markdown",
- "metadata": {
- "id": "aYRh4KjfhKLC"
- },
"source": [
"This code initiates the folding process. We'll need to run it to complete the operation."
- ]
+ ],
+ "metadata": {
+ "id": "aYRh4KjfhKLC"
+ }
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "ZWlkedBYnfwo",
- "outputId": "bfc01da1-8caf-4170-e209-4b542848e20d"
- },
- "outputs": [],
"source": [
"from plex import plex_run\n",
"\n",
"completed_io_cid, completed_io_filepath = plex_run(initial_io_cid, dir_path)"
- ]
+ ],
+ "metadata": {
+ "id": "ZWlkedBYnfwo"
+ },
+ "execution_count": null,
+ "outputs": []
},
{
"cell_type": "markdown",
- "metadata": {
- "id": "OCFxmkr8hOnZ"
- },
"source": [
"After the job is complete, we can retrieve and view the results. The state of each object is written in a JSON object. Every file has a unique content-address.\n",
"\n"
- ]
+ ],
+ "metadata": {
+ "id": "OCFxmkr8hOnZ"
+ }
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "eUf66N5Anrcf",
- "outputId": "db57bf1e-7bb6-4af6-bffb-700b0d62342d"
- },
- "outputs": [],
"source": [
"import json\n",
"\n",
@@ -180,21 +176,139 @@
" data = json.load(f)\n",
" pretty_data = json.dumps(data, indent=4, sort_keys=True)\n",
" print(pretty_data)"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "provenance": []
+ ],
+ "metadata": {
+ "id": "eUf66N5Anrcf"
+ },
+ "execution_count": null,
+ "outputs": []
},
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
+ {
+ "cell_type": "markdown",
+ "source": [
+ "The results can also be viewed using an IPFS gateway. Below, the state of the IO JSON is read using the ipfs.io gateway.\n",
+ "\n",
+    "**Note:** Depending on how long it takes for the results to propagate to the ipfs.io nodes, the data may not be available immediately. The results can also be viewed in IPFS Desktop or by opening `ipfs://<completed_io_cid>` in the Brave browser, replacing the placeholder with the CID printed by the next cell."
+ ],
+ "metadata": {
+ "id": "Fs2SXcsE-OGq"
+ }
},
- "language_info": {
- "name": "python"
+ {
+ "cell_type": "code",
+ "source": [
+ "print(f\"View this result on IPFS: https://ipfs.io/ipfs/{completed_io_cid}\")"
+ ],
+ "metadata": {
+ "id": "BrpWcc3R4F_n"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Visualize the folded protein"
+ ],
+ "metadata": {
+ "id": "EZV9f8g3o6Es"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from plex import plex_vectorize\n",
+ "\n",
+ "results = plex_vectorize(completed_io_cid, CoreTools.COLABFOLD_MINI.value)\n",
+ "\n",
+ "print(results)\n",
+ "print(results['best_folded_protein']['filePaths']) # for local file path references\n",
+ "print(results['best_folded_protein']['cidPaths']) # IPFS path references"
+ ],
+ "metadata": {
+ "id": "hyvNDdU40UbC"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!pip install py3Dmol"
+ ],
+ "metadata": {
+ "id": "pZsoCfCn1F2B"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "import py3Dmol\n",
+ "\n",
+    "def show_pdbfile(pdbfile):\n",
+    "    # Load the PDB text into a py3Dmol viewer and draw it as a spectrum-coloured cartoon\n",
+    "    viewer = py3Dmol.view()\n",
+    "    with open(pdbfile, 'r') as f:\n",
+    "        viewer.addModel(f.read(), 'pdb')\n",
+    "    viewer.setStyle({'cartoon': {'color': 'spectrum'}})\n",
+    "    viewer.zoomTo()  # centre the camera on the loaded model\n",
+    "    viewer.show()\n",
+    "\n",
+    "# Show the best folded structure from its local PDB file\n",
+    "show_pdbfile(results['best_folded_protein']['filePaths'][0])"
+ ],
+ "metadata": {
+ "id": "s9Kakx_IN7Lf"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Mint a ProofOfScience NFT\n",
+ "\n",
+ "We can now mint a ProofOfScience token by providing the IPFS CID of the completed IO JSON to the `plex_mint` function."
+ ],
+ "metadata": {
+ "id": "QV4QoStFpE_h"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "os.environ[\"RECIPIENT_WALLET\"] = \"\" # enter your wallet address"
+ ],
+ "metadata": {
+ "id": "tvCqDi9ucNSF"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from plex import plex_mint\n",
+ "\n",
+ "# using the autotask webhook enables gasless minting\n",
+ "os.environ[\"AUTOTASK_WEBHOOK\"] = \"https://api.defender.openzeppelin.com/autotasks/e15b3f39-28f8-4d30-9bf3-5d569bdf2e78/runs/webhook/8315d17c-c493-4d04-a257-79209f95bb64/2gmqi9SRRAQMoy1SRdktai\"\n",
+ "\n",
+ "plex_mint(completed_io_cid)"
+ ],
+ "metadata": {
+ "id": "bwSJNGwLOTFf"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Congratulations on making it through this tutorial! If you'd like to stay up to date with LabDAO, please consider signing up for our [newsletter]()."
+ ],
+ "metadata": {
+ "id": "1VXkE8BtcuTq"
+ }
}
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
+ ]
+}
\ No newline at end of file
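Review note on the IPFS gateway cell added above: the notebook warns that results may take a while to propagate to ipfs.io, so a reader who opens the printed link immediately may see nothing. A small polling helper could make that wait explicit. The following is a minimal sketch under stated assumptions, not part of plex: `gateway_url`, `http_fetch`, `wait_for_cid`, the retry count, and the delay are hypothetical names and values chosen for illustration.

```python
import time
import urllib.request
from typing import Callable, Optional


def gateway_url(cid: str, gateway: str = "https://ipfs.io") -> str:
    """Build the HTTP gateway URL for a CID (hypothetical helper, not part of plex)."""
    return f"{gateway.rstrip('/')}/ipfs/{cid}"


def http_fetch(url: str, timeout: float = 30.0) -> Optional[bytes]:
    """Return the response body, or None if the gateway cannot serve the content yet."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None


def wait_for_cid(
    cid: str,
    fetch: Callable[[str], Optional[bytes]] = http_fetch,
    attempts: int = 5,
    delay_seconds: float = 10.0,
) -> Optional[bytes]:
    """Poll the gateway until the content is served or the attempts run out."""
    url = gateway_url(cid)
    for attempt in range(attempts):
        body = fetch(url)
        if body is not None:
            return body
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # give the gateway time to find the content
    return None
```

In the notebook this could be called as `wait_for_cid(completed_io_cid)`; a `None` result only means the gateway has not served the content within the chosen window, not that the job failed. The `fetch` parameter is injectable so the polling logic can be exercised without network access.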
diff --git a/python/notebooks/colab/plex_tutorial_protein_folding_nft_minting.ipynb b/python/notebooks/colab/plex_tutorial_protein_folding_nft_minting.ipynb
deleted file mode 100644
index 1ad713544..000000000
--- a/python/notebooks/colab/plex_tutorial_protein_folding_nft_minting.ipynb
+++ /dev/null
@@ -1,298 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "eAGjvflZbCDh"
- },
- "source": [
- "## Protein folding in silico\n",
- "\n",
- "In this tutorial we perform protein folding with PLEX.\n",
- "\n",
-    "There are multiple reasons we believe PLEX is a new standard for computational biology 🧫:\n",
- "1. with a simple python interface, running containerised tools with your data is only a few commands away\n",
- "2. the infrastructure of the compute network is fully open source - use the public network or work with us to set up your own node\n",
- "3. every event on the compute network is tracked - no more results are lost in an interactive compute session. You can base your decisions and publications on fully reproducible results.\n",
- "4. we made adding new tools to the network as easy as possible - moving your favorite tool to PLEX is one JSON document away.\n",
- "\n",
- "In this tutorial, we'll walk through an example of how to use PLEX to predict a protein's 3D structure using [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We will use the sequence of the Streptavidin protein for this demo.\n",
- "\n",
- "We will also walk through the process of minting a ProofOfScience NFT. These tokens represent on-chain, verifiable records of the compute job and its input/output data. This enables reproducible scientific results.\n",
- "\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "s2c3TpEImg_r"
- },
- "source": [
- "## Install PLEX"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Hs1hTk0umdD6"
- },
- "outputs": [],
- "source": [
- "!pip install PlexLabExchange"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "GHwHKauOgwEE"
- },
- "source": [
- "Then, create a directory where we can save our project files.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "ZF0r7o8wnNrv"
- },
- "outputs": [],
- "source": [
- "import os\n",
- "\n",
- "cwd = os.getcwd()\n",
- "!mkdir project\n",
- "\n",
- "dir_path = f\"{cwd}/project\""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "LTOFbSGBm38S"
- },
- "source": [
- "## Download protein sequence\n",
- "\n",
- "We'll download a `.fasta` file containing the sequence of the protein we want to fold. Here, we're using the sequence of Streptavidin.\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "drMqTIxrm805"
- },
- "outputs": [],
- "source": [
- "!wget https://rest.uniprot.org/uniprotkb/P22629.fasta -O {dir_path}/P22629.fasta # Streptavidin"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "hCTgrr1nnDqn"
- },
- "source": [
- "## Fold the protein\n",
- "\n",
- "With the sequence downloaded, we can now use ColabFold to fold the protein.\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "vXP9OKCVnX1u"
- },
- "outputs": [],
- "source": [
- "from plex import CoreTools, plex_init\n",
- "\n",
- "fasta_local_filepaths = [f\"{dir_path}/P22629.fasta\"]\n",
- "\n",
- "initial_io_cid = plex_init(\n",
- " CoreTools.COLABFOLD_MINI.value,\n",
- " sequence=fasta_local_filepaths\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "aYRh4KjfhKLC"
- },
- "source": [
-    "The call above only sets up the folding job and returns the CID of its initial IO JSON. To carry out the folding, we still need to run the job."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "ZWlkedBYnfwo"
- },
- "outputs": [],
- "source": [
- "from plex import plex_run\n",
- "\n",
- "completed_io_cid, completed_io_filepath = plex_run(initial_io_cid, dir_path)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "OCFxmkr8hOnZ"
- },
- "source": [
-    "After the job is complete, we can retrieve and view the results. The state of each object is recorded in the completed IO JSON file, and every file has a unique content address (its CID).\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "eUf66N5Anrcf"
- },
- "outputs": [],
- "source": [
- "import json\n",
- "\n",
- "with open(completed_io_filepath, 'r') as f:\n",
- " data = json.load(f)\n",
- " pretty_data = json.dumps(data, indent=4, sort_keys=True)\n",
- " print(pretty_data)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "Fs2SXcsE-OGq"
- },
- "source": [
- "The results can also be viewed using an IPFS gateway. Below, the state of the IO JSON is read using the ipfs.io gateway.\n",
- "\n",
-    "**Note:** Depending on how long it takes for the results to propagate to the ipfs.io nodes, the data may not be available immediately. The results can also be viewed in IPFS Desktop or by opening `ipfs://<completed_io_cid>` in the Brave browser, replacing the placeholder with the CID printed by the next cell."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "BrpWcc3R4F_n"
- },
- "outputs": [],
- "source": [
- "print(f\"View this result on IPFS: https://ipfs.io/ipfs/{completed_io_cid}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "EZV9f8g3o6Es"
- },
- "source": [
- "## Visualize the folded protein"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "hyvNDdU40UbC"
- },
- "outputs": [],
- "source": [
- "from plex import plex_vectorize\n",
- "\n",
- "results = plex_vectorize(completed_io_cid, CoreTools.COLABFOLD_MINI.value)\n",
- "\n",
- "print(results)\n",
- "print(results['best_folded_protein']['filePaths']) # for local file path references\n",
- "print(results['best_folded_protein']['cidPaths']) # IPFS path references"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "pZsoCfCn1F2B"
- },
- "outputs": [],
- "source": [
- "!pip install py3Dmol"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "s9Kakx_IN7Lf"
- },
- "outputs": [],
- "source": [
- "import py3Dmol\n",
- "\n",
- "def show_pdbfile(pdbfile):\n",
- " viewer = py3Dmol.view()\n",
- " with open(pdbfile, 'r') as f:\n",
- " viewer.addModel(f.read(), 'pdb')\n",
- " viewer.setStyle({'cartoon': {'color':'spectrum'}})\n",
- " viewer.show()\n",
- "\n",
- "# Use the function to show a protein from a PDB file\n",
- "show_pdbfile(results['best_folded_protein']['filePaths'][0])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "QV4QoStFpE_h"
- },
- "source": [
- "## Mint a ProofOfScience NFT\n",
- "\n",
- "We can now mint a ProofOfScience token by providing the IPFS CID of the completed IO JSON to the `plex_mint` function."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "bwSJNGwLOTFf"
- },
- "outputs": [],
- "source": [
- "from plex import plex_mint\n",
- "\n",
- "os.environ[\"RECIPIENT_WALLET\"] = \"\" # enter your wallet address\n",
- "\n",
- "# using the autotask webhook enables gasless minting\n",
- "os.environ[\"AUTOTASK_WEBHOOK\"] = \"https://api.defender.openzeppelin.com/autotasks/e15b3f39-28f8-4d30-9bf3-5d569bdf2e78/runs/webhook/8315d17c-c493-4d04-a257-79209f95bb64/2gmqi9SRRAQMoy1SRdktai\"\n",
- "\n",
- "plex_mint(completed_io_cid)\n",
- "\n",
- "print(\"View your ProofOfScience tokens here: https://testnets.opensea.io/account\")"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "provenance": []
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
- },
- "language_info": {
- "name": "python"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
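Review note on the minting cell: `RECIPIENT_WALLET` is left as an empty string for the reader to fill in, so a forgotten or mistyped address would only surface once `plex_mint` is called. A shape check before minting could catch that earlier. This is a minimal sketch, assuming the recipient is an EVM-style address ("0x" followed by 40 hexadecimal characters); that format is an assumption rather than something the notebook states, and `is_probable_evm_address` and `require_recipient_wallet` are hypothetical helpers, not part of plex.

```python
import os
import re

# Assumption: the recipient is an EVM-style address ("0x" + 40 hex characters).
_EVM_ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")


def is_probable_evm_address(value: str) -> bool:
    """Check only the shape of the address; the EIP-55 checksum is not verified."""
    return _EVM_ADDRESS.fullmatch(value.strip()) is not None


def require_recipient_wallet(env=os.environ) -> str:
    """Return RECIPIENT_WALLET, raising early if it is missing or malformed."""
    wallet = env.get("RECIPIENT_WALLET", "").strip()
    if not is_probable_evm_address(wallet):
        raise ValueError(
            "RECIPIENT_WALLET must be set to a 0x-prefixed, 40-hex-character address"
        )
    return wallet
```

Calling `require_recipient_wallet()` just before `plex_mint(completed_io_cid)` would stop the cell with a clear message instead of attempting a mint with an empty recipient.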