failed correction #1073

Closed
aldertzomer opened this issue Sep 5, 2018 · 3 comments

aldertzomer commented Sep 5, 2018

Canu 1.7.1

Correction fails when I try to assemble an E. coli genome carrying two ~10 kb plasmids with a high copy number (roughly 70-100). My Sequel FASTA file contains about 1500 Mb of sequence, which should be roughly 300-fold coverage, but many of the reads are probably plasmid reads. The commands I used:

/usr/local/canu-1.7.1/Linux-amd64/bin/canu -d BVCB000015H -p BVCB000015H genomeSize=4.7m -pacbio-raw BVCB000015H_33514_subreads.fasta.gz

This failed. The manual suggests setting corOutCoverage to a high value when the coverage distribution is uneven:

/usr/local/canu-1.7.1/Linux-amd64/bin/canu -d BVCB000015H2 -p BVCB000015H2 genomeSize=4.7m -pacbio-raw BVCB000015H_33514_subreads.fasta.gz corOutCoverage=999

This also fails at the same step. The problem appears to be in falconsense (see below).
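
For context on the coverage figures in this report, here is a rough back-of-the-envelope sketch of how high-copy plasmids reduce chromosome coverage. It assumes that reads are sampled in proportion to DNA mass, that 4.7 Mb is the chromosome size, and that the copy number is 85 (an assumed midpoint of the reported 70-100 range). The function names are hypothetical and not part of Canu.

```python
def naive_coverage(total_mb: float, genome_mb: float) -> float:
    """Total sequenced bases divided by the genome size."""
    return total_mb / genome_mb


def chromosome_coverage(total_mb: float, genome_mb: float,
                        n_plasmids: int, plasmid_mb: float,
                        copy_number: float) -> float:
    """Chromosome coverage when plasmids absorb part of the reads.

    Assumption: reads are sampled in proportion to DNA mass, so each
    plasmid counts copy_number times toward the effective genome size.
    """
    effective_mb = genome_mb + n_plasmids * plasmid_mb * copy_number
    return total_mb / effective_mb


# Figures from the report: 1500 Mb of reads, genomeSize=4.7m, two ~10 kb plasmids.
# The copy number of 85 is an assumption, not a measured value.
print(round(naive_coverage(1500, 4.7)))
print(round(chromosome_coverage(1500, 4.7, 2, 0.01, 85)))
```

Under these assumptions the naive estimate is about 319-fold, while the chromosome itself would receive about 234-fold, with the remainder going to the plasmids.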

The failure Canu reports is:

-- Finished on Wed Sep  5 13:42:38 2018 (4693 seconds) with 10149.058 GB free disk space
----------------------------------------
--
-- Read correction jobs failed, retry.
--   job 2-correction/results/0002.cns FAILED.
--   job 2-correction/results/0056.cns FAILED.
--   job 2-correction/results/0059.cns FAILED.
--
--
-- Running jobs.  Second attempt out of 2.
----------------------------------------
-- Starting 'cor' concurrent execution on Wed Sep  5 13:42:39 2018 with 10149.058 GB free disk space (3 processes; 4 concurrently)

    cd correction/2-correction
    ./correctReads.sh 2 > ./correctReads.000002.out 2>&1
    ./correctReads.sh 56 > ./correctReads.000056.out 2>&1
    ./correctReads.sh 59 > ./correctReads.000059.out 2>&1

-- Finished on Wed Sep  5 13:51:11 2018 (512 seconds) with 10148.894 GB free disk space
----------------------------------------
--
-- Read correction jobs failed, tried 2 times, giving up.
--   job 2-correction/results/0002.cns FAILED.
--   job 2-correction/results/0056.cns FAILED.
--   job 2-correction/results/0059.cns FAILED.
--

ABORT:
ABORT: Canu 1.7.1
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:

Exploring 0002.cns further:
$ tail -n 35 0002.err

ALIGN to 329-6553 length 6553
read18693 #16 location 0 to template 315-3783 length 3468 diff 0.207901
read18693 #16 location 1 to template 315-3784 length 3469 diff 0.207841
read18693 #16 location 2 to template 315-3785 length 3470 diff 0.207781
mapped 18693     0- 3398 to template    315-  3784 trimmed by      0-     1 TGATGGATAT TTATG-ATAT
ALIGN to 624-4142 length 6553
read366875 #20 location 0 to template 299-3194 length 2895 diff 0.204490
falconsense: overlapInCore/libedlib/edlib.C:347: void edlibAlignmentToStrings(const unsigned char*, int, int, int, int, int, const char*, const char*, char*, char*): Assertion `strlen(qry_aln_str) == alignmentLength && strlen(tgt_aln_str) == alignmentLength' failed.

Failed with 'Aborted'; backtrace (libbacktrace):
read134431 #17 location 0 to template 512-6202 length 5690 diff 0.235677
mapped 134431     0- 5884 to template    512-  6203 trimmed by      0-     0 GTT-TTGGCA TTTGTTGGCA
ALIGN to 936-2815 length 6553
read94346 #19 location 0 to template 537-5828 length 5291 diff 0.194292
mapped 94346     0- 5336 to template    866-  6158 trimmed by      0-     0 GTGGGT-TCA -TGGGTATCA
ALIGN to 621-6553 length 6553
read159249 #18 location 0 to template 598-6479 length 5881 diff 0.210848
read197332 #21 location 0 to template 143-1638 length 1495 diff 0.209365
mapped 197332     0- 1492 to template   1079-  2575 trimmed by      0-     0 AGCACAGAAC AGCACAGAAC
mapped 159249     0- 5995 to template    671-  6553 trimmed by      0-     6 TAACATTAAC T----T---C
ALIGN to 754-6553 length 6553
ALIGN to 1042-4847 length 6553
AS_UTL/AS_UTL_stackTrace.C::97 in _Z17AS_UTL_catchCrashiP7siginfoPv()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
overlapInCore/libedlib/edlib.C::347 in _Z23edlibAlignmentToStringsPKhiiiiiPKcS2_PcS3_()
correction/falconConsensus-alignTag.C::252 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
Aborted (core dumped)

skoren commented Sep 5, 2018

Do you have non-ACGT bases in your reads? This would be the same as issues #881 and #1026. The fix for this is only in the tip, so you'd have to either remove non-ACGT bases from your input or compile the code from source.

aldertzomer commented Sep 5, 2018

cat BVCB000015H_33514_subreads.fasta |grep -v ">" |tr -d [ATCGatcg] |tr -d "\n"
This results in empty output, so there are no non-ACGT bases. That's not it, unfortunately. I'm now assembling with a subset of the data.
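
A line-by-line check can complement the one-liner above, because it reports where any unexpected characters occur rather than collapsing the whole file into one stream. The following is a minimal sketch, not part of Canu; `open_fasta` and `find_suspect_lines` are hypothetical helper names, and the example records are made up for illustration.

```python
import gzip
from typing import Iterable, Iterator, Tuple

ALLOWED = set("ACGTacgt")


def open_fasta(path: str):
    """Open a plain or gzip-compressed FASTA file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def find_suspect_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for every non-header line that contains
    anything other than A, C, G or T (upper or lower case)."""
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA headers
        if not set(line) <= ALLOWED:
            yield number, line


# Made-up example records; a real run would pass open_fasta(path) instead.
example = [">read1\n", "ACGTTGCA\n", "processed 10 reads\n", ">read2\n", "ACGNNT\n"]
for number, line in find_suspect_lines(example):
    print(number, line)
```

Reporting line numbers makes it easy to inspect the surrounding records by hand before deciding whether to edit the input.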

@aldertzomer

I may have found the issue. For some reason, this line was in the FASTA file from my sequencing provider:

[M::bam2fq_mainloop] processed 406256 reads

which does not really belong in a fasta file.
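
The `[M::bam2fq_mainloop]` prefix appears to be a status message from a BAM-to-FASTQ conversion step that ended up in the sequence file. One way to clean such a file is to drop whole lines that look like these messages. The sketch below assumes the messages always occupy their own line and follow the `[X::function_name]` pattern; `drop_log_lines` is a hypothetical helper, and a message fused onto a sequence line would need manual repair.

```python
import re
from typing import Iterable, Iterator

# Assumed shape of the stray messages: "[M::name] ...", "[W::name] ...", etc.
LOG_LINE = re.compile(r"^\[[A-Z]::\w+\]")


def drop_log_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield every line except those that look like tool status messages."""
    for line in lines:
        if LOG_LINE.match(line):
            continue
        yield line


# Illustration with made-up reads around the message quoted above.
dirty = [
    ">read1\n",
    "ACGT\n",
    "[M::bam2fq_mainloop] processed 406256 reads\n",
    ">read2\n",
    "GGCC\n",
]
cleaned = list(drop_log_lines(dirty))
print("".join(cleaned), end="")
```

Matching only the bracketed prefix keeps the filter conservative, so sequence lines containing ambiguity codes such as N are left untouched.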
