tadKeys/tabsat

Latest commit: 34ce3a4 · May 8, 2019

NEW VERSION OF TABSAT

Please check out our new version of TABSAT.
-> https://tabsat.ait.ac.at



TABSAT

TABSAT (Targeted Amplicon Bisulfite Sequencing Analysis Tool) analyzes targeted bisulfite sequencing data generated on an Ion Torrent PGM or Illumina MiSeq. It performs:

  • Quality Assessment
  • Alignment using Bismark
  • Result aggregation into a table
  • Visualization as lollipop plots

Available as

  • Fully configured Docker image (Dockerfile); see the Docker section below.
  • Source code

Collaboration

Please contact us if you need help running your analyses. We have also developed an extended version for our collaborators with the following additional features:

  • Interactive web-based visualization
  • Download FASTA of target regions
  • Strand specific CpGs
  • Automatic mapping of primers
  • Restriction enzyme positions
  • Start using web frontend
  • Pattern visualization and analysis

Publication

TABSAT is published:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0160227

Example usage

${TABSAT} -l NONDIR -g hg19 -q 20 -m 10 -p 0.8 -r 0 -t target.csv -a tmap -o output_dir input.fastq

-t Target list in CSV format [mandatory]. Strand can be "+", "-" or "+/-".
-e Sequencing library type: SE or PE (PE reads must be named *_1.fastq and *_2.fastq)
-g Genome (hg19, mm10)
-l Library mode of the bisulfite experiment
-a [optional] Aligner to use
-m [optional] Minimum read length. Reads shorter than this threshold are filtered out.
-q [optional] Quality threshold for read trimming. Bases below this threshold are removed from the 3’ end of the reads.
-p [optional] Percent of the target that a read must cover to be included in pattern creation and analysis (the example above passes 0.8).
-r [optional] Minimum number of mapped reads that must be present at each CpG site.
-s [optional] Sorted list of samples that sets their order in the lollipop plots.
-o Output directory
-d [optional] Directory of input files (absolute path). If not specified, the input files are given at the end of the command line.
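
Because paired-end input must follow the *_1.fastq / *_2.fastq naming scheme, it can help to check a directory before starting a run. The following is a minimal sketch, not part of TABSAT; check_pe_pairs is a hypothetical helper name, and it assumes uncompressed files with exactly these suffixes.

```shell
# Sketch: verify that every *_1.fastq in a directory has a *_2.fastq mate.
# check_pe_pairs is a hypothetical helper, not shipped with TABSAT.
check_pe_pairs() (
    dir=$1
    status=0
    for r1 in "$dir"/*_1.fastq; do
        # If the glob matched nothing, the literal pattern is left in place.
        [ -e "$r1" ] || { echo "no *_1.fastq files in $dir" >&2; exit 1; }
        # Derive the expected mate name by swapping the suffix.
        r2="${r1%_1.fastq}_2.fastq"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1 (expected $r2)" >&2
            status=1
        fi
    done
    exit "$status"
)
```

Calling check_pe_pairs /path/to/input_dir returns a non-zero status if any mate file is missing, so it can guard a subsequent tabsat call with &&.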

Examples

Test with input file directory

tabsat -l NONDIR -g hg19 -t target.csv -d test_input_dir -a tmap -o test_output_dir
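
The option list asks for an absolute path for -d. If the input directory is only known relative to the current directory, it can be resolved first. This is a small sketch under that assumption; abs_dir is a hypothetical helper, not a TABSAT command.

```shell
# Sketch: print the absolute path of an existing directory.
# abs_dir is a hypothetical helper, not part of TABSAT.
abs_dir() (
    # Run in a subshell so the caller's working directory is unchanged.
    # CDPATH is cleared so that cd does not print or search other locations.
    CDPATH= cd -- "$1" 2>/dev/null && pwd
)

# Possible usage (not executed here):
#   tabsat -l NONDIR -g hg19 -t target.csv -d "$(abs_dir test_input_dir)" -a tmap -o test_output_dir
```

The function fails with a non-zero status if the directory does not exist.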

Test with separate input files

tabsat -l NONDIR -g hg19 -t target.csv -o test_output_files xy.fastq abs.fastq

Test data

Test data is available here

Installation

  • Prepare the reference
    tabsat/reference/prepareReference.sh
  • Prepare the CpG file
    apt-get install p7zip-full
    7za e tabsat/tools/ait/all_cpgs_only_pos_hg19.7z
    7za e tabsat/tools/ait/all_cpgs_only_pos_mm10.7z
  • Install Perl modules
    • Cairo.pm
    • Switch.pm
  • Run the 'install' script in the tabsat folder (installs SAMtools, Bedtools)
    ./install
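
After these steps, a quick preflight check can confirm that the pieces named above are visible from the shell. This is a sketch only: the helper names are hypothetical, and it assumes the tools are installed on the PATH under their usual command names (samtools, bedtools, 7za, perl). It does not check aligners or the reference files.

```shell
# Sketch: check that the tools and Perl modules from the installation
# steps are available. All function names here are hypothetical.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

# Succeeds if Perl can load the given module (e.g. Cairo, Switch).
have_perl_module() { perl -M"$1" -e 1 >/dev/null 2>&1; }

preflight() {
    missing=0
    for cmd in samtools bedtools 7za perl; do
        have_cmd "$cmd" || { echo "missing command: $cmd" >&2; missing=1; }
    done
    for mod in Cairo Switch; do
        have_perl_module "$mod" || { echo "missing Perl module: $mod" >&2; missing=1; }
    done
    return "$missing"
}
```

Running preflight prints one line per missing item and returns a non-zero status if anything is missing.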

Run example

Command line

  • After installation, go to tabsat/tools/zz_test
  • Execute
    ./test_tabsat_tmap.sh
  • Inspect output at tabsat/tabsat_test_output

Docker

  • Build the Docker image
    docker build -t tabsat:v1 .

  • Run it
    docker run -t --name tabsat -d tabsat:v1

  • Connect to the running container
    docker exec -ti tabsat /bin/bash

  • Stop container
    docker stop tabsat

  • Remove container
    docker rm tabsat

  • Remove image
    docker rmi tabsat:v1
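
The commands above start the container without any access to data on the host. One common way to make input files visible inside a container is a bind mount with docker's -v option. The sketch below assumes /data as the mount point inside the container, which is an arbitrary choice and not a path defined by the TABSAT image. run_tabsat_container is a hypothetical helper, and the DOCKER variable exists only so the command can be previewed without running it.

```shell
# Sketch: start the container with a host directory mounted at /data.
# /data is an assumed mount point, not a path defined by the TABSAT image.
# run_tabsat_container is a hypothetical helper.
run_tabsat_container() {
    host_dir=$1   # must be an absolute path on the host
    # Set DOCKER=echo to print the arguments instead of running docker.
    ${DOCKER:-docker} run -t --name tabsat -d -v "$host_dir:/data" tabsat:v1
}

# Possible usage (not executed here):
#   run_tabsat_container "$PWD/fastq"
#   docker exec -ti tabsat /bin/bash
```

Files placed in the host directory would then appear under /data in the container, and results written there remain on the host after the container is removed.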