Automated workflow for small RNA sequence data

Snakemake workflow for processing small RNA-seq libraries produced by Illumina small sequencing kits.

Requirements

Demultiplexed fastq files located in the data directory. They need to be named in the form {sample}_R1.fastq.gz.
Snakefile shipped with this repository.
config.yaml shipped with this repository. It contains all parameters and settings to customize the processing of the current dataset.
samples.csv listing all samples in the data directory without the _R1.fastq.gz suffix. The first line is the header, i.e. the word library. An example is shipped with this repository and can be used as a template.
Optional: environment.yaml to create the software environment if conda is used.
An installation of snakemake and, optionally, conda.
If conda is not used, bowtie, fastqc, samtools and deeptools need to be in the PATH.
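
If conda is not used, a quick check that the required tools are on the PATH can save a failed run. The following is a minimal sketch: the helper name check_tools is hypothetical and not part of this repository, and the executable names in the usage line are assumptions taken from the list above (deeptools commands in particular are often invoked under names such as bamCoverage), so they may need adjusting.

```shell
# check_tools: hypothetical helper, not shipped with this repository.
# Prints every tool that cannot be found on the PATH and returns
# non-zero if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (tool names are assumptions based on the list above):
# check_tools bowtie fastqc samtools deeptools
```

The function takes the tool names as arguments so the list can be adapted without editing the function itself.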

The above files can be downloaded as a whole by cloning the repository (which requires git):

git clone https://github.com/seb-mueller/snakemake_sRNAseq.git

Or individually, for example the Snakefile using wget:

wget https://raw.githubusercontent.com/seb-mueller/snakemake_sRNAseq/master/Snakefile

Creating the conda environment:

conda env create --file environment.yaml --name srna_mapping

Activate the environment:

conda activate srna_mapping

To deactivate the environment, run:

conda deactivate

Update:

git pull
conda env update --file environment.yaml --name srna_mapping

Usage:

In a Unix shell, navigate to the base directory that contains the files listed above plus the data directory with the fastq files, as in this example:

.
├── data
│   ├── test2_R1.fastq.gz
│   └── test3_R1.fastq.gz
├── config.yaml
├── environment.yaml
├── samples.csv
└── Snakefile
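
For a layout like the one above, samples.csv can be derived from the file names in the data directory. The sketch below assumes a single-column file with the header library followed by one sample name per line. The function name make_samples_csv is hypothetical, and the template samples.csv shipped with the repository remains the reference for the exact format.

```shell
# make_samples_csv: hypothetical helper, not shipped with this repository.
# Assumes samples.csv has one column: the header "library", then one
# sample name per line (file name without the _R1.fastq.gz suffix).
make_samples_csv() {
  data_dir="${1:-data}"
  echo "library"
  for f in "$data_dir"/*_R1.fastq.gz; do
    # skip the unexpanded pattern if no file matches
    [ -e "$f" ] || continue
    basename "$f" _R1.fastq.gz
  done
}

# Usage, run from the base directory:
# make_samples_csv data > samples.csv
```

For the two files in the example, this prints library, test2 and test3, one per line.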

Then just run snakemake in the base directory:

# the most basic usage
snakemake
# recommended: automatic conda management in a central location
snakemake --use-conda --conda-prefix ~/.myconda -p

Useful parameters:

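--cores                    maximum number of cores (threads) to use
-n                         dry run: show what would be done without executing anything
-p                         print the shell commands that are executed
--use-conda                let snakemake create and use the conda environments defined for the rules
--conda-prefix ~/.myconda  directory in which the conda environments are stored
--forcerun postmapping     force the rerun of a given rule (e.g. postmapping)
--keep-going               if one sample fails, the pipeline still tries to process the other samples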
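
The parameters above can be combined. One common pattern is a dry run first, followed by the real run only if the dry run succeeds. The sketch below wraps this in a hypothetical function run_workflow, which is not part of this repository; the default of 4 cores is an arbitrary choice, and only flags from the list above are used.

```shell
# run_workflow: hypothetical wrapper, not shipped with this repository.
# $1: number of cores (default 4, an arbitrary choice)
# $2: conda prefix (default ~/.myconda, as in the example above)
run_workflow() {
  cores="${1:-4}"
  conda_prefix="${2:-$HOME/.myconda}"
  # dry run: print the planned jobs and commands without executing them
  snakemake -n -p --cores "$cores" --use-conda --conda-prefix "$conda_prefix" || return 1
  # real run: keep processing the other samples if one of them fails
  snakemake -p --cores "$cores" --use-conda --conda-prefix "$conda_prefix" --keep-going
}

# Usage, run from the base directory:
# run_workflow 8
```

Stopping after a failed dry run avoids starting a long job when the configuration or the sample list is broken.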

Output:

The trimmed, log and mapped directories, containing the trimming and mapping results.

Update: added STAR support

# create star index (goes in staridx folder)
snakemake -p --skip-script-cleanup staridx --cores 3
# then map using star
snakemake -p --skip-script-cleanup starmap --cores 3
# TODO: create bw files from STAR mapping

