Seasonal human coronavirus (HCoV) genome annotation

How to annotate HCoV sequences with VADR

HCoV VADR model libraries

Additional VADR documentation

References

How to annotate HCoV genomes with VADR

Installation instructions:

1. Install VADR

Option A: Use a pre-built Docker image

You can use the StaPH-B Docker image for VADR 1.6.3-hav-flu2 created by Curtis Kapsak (Docker image names: staphb/vadr:1.6.3-hav-flu2 and staphb/vadr:latest). The image is available from Docker Hub and Quay. You can pull the image using:

docker pull --platform linux/amd64 staphb/vadr:1.6.3-hav-flu2
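Once the image is pulled, VADR commands can be run inside a container with the input files mounted from the host. The following is a minimal sketch, not part of VADR or the StaPH-B image: the helper name vadr_in_docker, the /data mount point, and the DRY_RUN switch are all assumptions made for illustration.

```shell
# vadr_in_docker is a hypothetical helper, not part of VADR or the StaPH-B image.
# Usage: vadr_in_docker <host-dir> <command> [args...]
vadr_in_docker() {
  hostdir=$1
  shift
  # Assumption: the host directory is mounted at /data, which is also the
  # working directory, so relative paths resolve inside the mount.
  set -- docker run --rm --platform linux/amd64 \
    -v "$hostdir":/data -w /data \
    staphb/vadr:1.6.3-hav-flu2 "$@"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"   # print the command for review instead of running it
  else
    "$@"
  fi
}
```

For example, setting DRY_RUN=1 and calling vadr_in_docker "$PWD" v-annotate.pl -h prints the full docker run command without starting a container.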

Option B: Install VADR from source

Alternatively, you can download and install the latest version of VADR, following the instructions on the VADR GitHub.

2. Download the HCoV VADR Model

Clone the latest HCoV VADR model (release v1.01):
git clone git@github.com:greninger-lab/vadr-models-hcov.git

3. Run HCoV annotation

Note: Nucleotide sequences must be in FASTA format and should not be aligned. The software only recognizes IUPAC nucleotide codes and does not accept symbols such as - (which indicates a deletion in alignments). Remove any terminal ambiguous nucleotides (e.g. "N"), which typically represent regions with no sequencing coverage. You can use the script fasta-trim-terminal-ambigs.pl, located in $VADRSCRIPTSDIR/miniscripts/, to clean your sequences accordingly. To trim terminal ambiguous nucleotides, remove sequences that are too short or too long, and write the result to a new file <trimmed-fasta-file>, execute:

$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl --minlen 50 --maxlen 33000 <input-fasta-file> > <trimmed-fasta-file>
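Because gap characters from alignments are not accepted, it can help to scan a FASTA file for unexpected symbols before annotation. The sketch below is an illustration only: fasta_non_iupac is a hypothetical helper, and the accepted alphabet (the IUPAC nucleotide codes ACGTURYSWKMBDHVN, case-insensitive) is an assumption about what counts as valid input.

```shell
# fasta_non_iupac is a hypothetical helper, not part of VADR.
# Prints "<sequence-name><TAB><offending characters>" for every residue line
# that contains characters outside the assumed IUPAC nucleotide alphabet.
fasta_non_iupac() {
  awk '
    /^>/ { name = $1; sub(/^>/, "", name); next }   # header line: remember the name
    {
      residues = toupper($0)
      gsub(/[ACGTURYSWKMBDHVN]/, "", residues)      # drop every accepted code
      if (residues != "") print name "\t" residues  # anything left is unexpected
    }
  ' "$1"
}
```

An empty result means no unexpected symbols were found; a line such as a sequence name followed by - points to an aligned sequence that still contains gaps.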

Run the v-annotate.pl program on the trimmed input FASTA file of HCoV sequences using the recommended command below. Note that <hcov-models-dir-path> is the path to the model directory, including the specific HCoV species subdirectory (e.g. </path/to/vadr-models-hcov>/229E or </path/to/vadr-models-hcov>/NL63). In addition, <hcov-key> must indicate the HCoV species: 229E, HKU1, NL63, or OC43.

Use the following command lines:

229E and NL63:

v-annotate.pl -s --glsearch -r -f --mkey <hcov-key> --mdir <hcov-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>

HKU1 and OC43:

v-annotate.pl -s --glsearch -r -f --alt_pass discontn --mkey <hcov-key> --mdir <hcov-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>
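Since the two species groups differ only in the --alt_pass discontn option, the choice can be scripted. The following is a minimal sketch under stated assumptions: hcov_vadr_opts and hcov_vadr_cmd are hypothetical helper names, and the command is printed rather than executed so it can be checked first.

```shell
# Hypothetical helpers, not part of VADR.
# Print the v-annotate.pl options recommended above for a given HCoV species.
hcov_vadr_opts() {
  case "$1" in
    229E|NL63) echo "-s --glsearch -r -f" ;;
    HKU1|OC43) echo "-s --glsearch -r -f --alt_pass discontn" ;;
    *) echo "unknown HCoV species: $1" >&2; return 1 ;;
  esac
}

# Print the full command line for review.
# $1 = species, $2 = model key, $3 = models dir, $4 = FASTA file, $5 = output dir
hcov_vadr_cmd() {
  opts=$(hcov_vadr_opts "$1") || return 1
  echo "v-annotate.pl $opts --mkey $2 --mdir $3 $4 $5"
}
```

Calling hcov_vadr_cmd with a species, the placeholders' real values, and an output directory prints one command line that can be pasted into the shell once it looks right.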

After running v-annotate.pl, a number of files will be generated in <output-directory-to-create>. Among these are 5-column tab-delimited feature table files with the suffix .tbl. There are separate files for passing (XXXXX.vadr.pass.tbl) and failing (XXXXX.vadr.fail.tbl) sequences. The format of the .tbl files is described here: https://www.ncbi.nlm.nih.gov/genbank/feature_table/
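A quick way to see how many sequences passed and failed is to count the sequence entries in each feature table. The sketch below assumes that every sequence in a .tbl file starts with a line beginning with >Feature, as in the GenBank feature table format; count_tbl_seqs and vadr_tbl_summary are hypothetical helper names.

```shell
# Hypothetical helpers, not part of VADR.
# Count sequence entries in one feature table (lines starting with ">Feature").
count_tbl_seqs() {
  awk '/^>Feature/ { n++ } END { print n + 0 }' "$1"
}

# Print "<file-name><TAB><count>" for the pass and fail tables in an output directory.
vadr_tbl_summary() {
  outdir=$1
  for f in "$outdir"/*.vadr.pass.tbl "$outdir"/*.vadr.fail.tbl; do
    [ -e "$f" ] || continue   # skip patterns that matched no file
    printf '%s\t%s\n' "$(basename "$f")" "$(count_tbl_seqs "$f")"
  done
}
```

Running vadr_tbl_summary on the output directory prints one line per table, with the pass table listed first.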

More information about understanding failures and error alerts can be found in the VADR documentation here: https://github.com/ncbi/vadr/blob/master/documentation/annotate.md

HCoV VADR model libraries

The VADR model libraries for HCoV annotation include models for species 229E, HKU1 (genotypes A, B and C), NL63, and OC43.
Some of the model genomes have been modified slightly on either the 3' or 5' ends to facilitate accurate annotation of sequences of greater length. These include:
- KY996417 (229E) 3' +15 As
- AY884001 (HKU1 B) 5' +3 Ts
- MT118678 (OC43) 5' +1 G, 3' +16 As

Additional VADR documentation

Reference

The recommended citation for using VADR is: Alejandro A Schäffer, Eneida L Hatcher, Linda Yankie, Lara Shonkwiler, J Rodney Brister, Ilene Karsch-Mizrachi, Eric P Nawrocki; VADR: validation and annotation of virus sequence submissions to GenBank. BMC Bioinformatics 21, 211 (2020). https://doi.org/10.1186/s12859-020-3537-3
