Spectrum in mgf file not found #2

Open
nukaemon opened this issue Apr 7, 2021 · 4 comments
@nukaemon

nukaemon commented Apr 7, 2021

Dear developers

Thank you for providing this useful tool.
I set up DeepRescore on AWS (CentOS 7) with a GPU backend, and it ran successfully on the test data.

I then tried it on my own data, but encountered an error at the 'process_pDeep2_results' step.
The nextflow command I ran is below, and I believe it is correct. The identification file (output.2021_04_07_02_59_45.t.xml) was generated from the same MGF file (myown.mgf) using X!Tandem.

nextflow run ${DEEPRESCORE} --id_file output.2021_04_07_02_59_45.t.xml --ms_file myown.mgf --se xtandem --ms_instrument Lumos --ms_energy 0.34 --prefix d2 --decoy_prefix XXX_  --cpu 4 --mem 12

The error message reported by nextflow indicates that the failure occurred in the Java step:

Command error:
  Exception in thread "main" java.io.IOException: Spectrum 'File: D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw; SpectrumID: 2220; scans: 2975' in mgf file 'myown.mgf' not found!
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:788)
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:730)
        at PDVGUI.GenerateSpectrumTable.process(GenerateSpectrumTable.java:84)
        at PDVGUI.GenerateSpectrumTable.<init>(GenerateSpectrumTable.java:31)
        at PDVGUI.GenerateSpectrumTable.main(GenerateSpectrumTable.java:21)

I checked myown.mgf, and it does contain an entry for SpectrumID: 2220; scans: 2975:

.
.
.
BEGIN IONS
TITLE=File: "D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw"; SpectrumID: "2220"; scans: "2975"
PEPMASS=496.76181 16035.10645
CHARGE=2+
RTINSECONDS=724
SCANS=2975
168.998 9.06042
171.242 47.312
176.236 15.0943
183.230 16.0298
186.481 12.8785
.
.
.
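
Comparing the two by eye, the title in the error message has no double quotes, while the TITLE line in the MGF quotes each value. Whether that difference is why the lookup fails is only an assumption; nothing here confirms how PDV matches titles. A minimal Python sketch to test for it, using the two strings copied from above (the helper names `mgf_titles`, `strip_quotes`, and `find_title` are hypothetical, not part of DeepRescore or PDV):

```python
def mgf_titles(mgf_text):
    """Return the TITLE values of all spectra in MGF-formatted text."""
    return [line[len("TITLE="):].strip()
            for line in mgf_text.splitlines()
            if line.startswith("TITLE=")]


def strip_quotes(title):
    """Remove double quotes, e.g. 'SpectrumID: "2220"' -> 'SpectrumID: 2220'."""
    return title.replace('"', "")


def find_title(wanted, titles):
    """Classify the match for `wanted`: 'exact', 'after_unquoting', or 'missing'."""
    if wanted in titles:
        return "exact"
    if strip_quotes(wanted) in {strip_quotes(t) for t in titles}:
        return "after_unquoting"
    return "missing"


# TITLE line copied from the MGF excerpt; peak lines omitted, END IONS added
# only to close the record.
MGF_SAMPLE = r'''BEGIN IONS
TITLE=File: "D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw"; SpectrumID: "2220"; scans: "2975"
PEPMASS=496.76181 16035.10645
CHARGE=2+
END IONS
'''

# Title copied from the Java error message.
WANTED = r"File: D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw; SpectrumID: 2220; scans: 2975"

if __name__ == "__main__":
    print(find_title(WANTED, mgf_titles(MGF_SAMPLE)))  # prints "after_unquoting"
```

For the full file, `mgf_titles(open("myown.mgf").read())` would give the list to check against. A result of "after_unquoting" would mean the spectrum is present and only the quoting differs.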

Also, SpectrumID "2220" appears on the first line of 'd2_format_titles.txt', so the error seems to occur as soon as 'd2_format_titles.txt' is loaded.

I also manually ran each command in the process_pDeep2_results step and confirmed that 'd2_spectrum_pairs.txt' was generated but was empty after running PDV-1.6.1.beta.features-jar-with-dependencies.jar.
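
The same check can be scripted to find the first intermediate file that is missing or empty. This is a rough Python sketch, assuming it is run from the nextflow work directory; the file names are the ones that appear in the executed command with the `d2` prefix, and `empty_or_missing` is a hypothetical helper:

```python
from pathlib import Path


def empty_or_missing(paths):
    """Return the paths that do not exist or have size 0, in the given order."""
    return [p for p in map(Path, paths)
            if not p.is_file() or p.stat().st_size == 0]


if __name__ == "__main__":
    # Outputs of the process_pDeep2_results step, in the order they are written.
    outputs = [
        "d2_format_titles.txt",
        "d2_spectrum_pairs.txt",
        "d2_similarity_SA.txt",
    ]
    for path in empty_or_missing(outputs):
        print(f"empty or missing: {path}")
```

Because the step's shell script keeps running after the Java command fails, the first file reported is the one to look at; later ones are likely follow-on failures.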

Do you have any idea how to solve this problem?
I have pasted the full nextflow log below, just in case.

log message from nextflow
[37/ea937f] process > xml2mzid (d2)                       [100%] 1 of 1 ✔
[1a/aa62b3] process > calc_basic_features_xt (d2)         [100%] 1 of 1 ✔
[89/0c6d10] process > pga_fdr_control (d2)                [100%] 1 of 1 ✔
[1d/7ff0ff] process > generate_train_prediction_data (d2) [100%] 1 of 1 ✔
[29/3b4f5d] process > run_pdeep2 (d2)                     [100%] 1 of 1 ✔
[f0/b0788f] process > process_pDeep2_results (d2)         [100%] 1 of 1, failed: 1 ✘
[-        ] process > train_autoRT                        -
[-        ] process > predicte_autoRT                     -
[-        ] process > generate_percolator_input           -
[-        ] process > run_percolator                      -
[-        ] process > generate_pdv_input                  -
Error executing process > 'process_pDeep2_results (d2)'

Caused by:
  Process `process_pDeep2_results (d2)` terminated with an error exit status (2)

Command executed:

  #!/bin/sh
  mv d2_pdeep2_prediction_results.txt d2_pdeep2_prediction_results.txt.mgf
  Rscript /home/centos/DeepRescore/bin/format_pDeep2_titile.R d2_pdeep2_prediction.txt d2-rawPSMs.txt ./d2_format_titles.txt

  java -Xmx12g -cp /home/centos/DeepRescore/bin/PDV-1.6.1.beta.features/PDV-1.6.1.beta.features-jar-with-dependencies.jar PDVGUI.GenerateSpectrumTable         ./d2_format_titles.txt myown.mgf d2_pdeep2_prediction_results.txt.mgf ./d2_spectrum_pairs.txt xtandem
  mkdir sections sections_results
  Rscript /home/centos/DeepRescore/bin/similarity/devide_file.R ./d2_spectrum_pairs.txt 4 ./sections/
  for file in ./sections/*
  do
      name=`basename $file`
      Rscript /home/centos/DeepRescore/bin/similarity/calculate_similarity_SA.R $file ./sections_results/${name}_results.txt &
  done
  wait
  awk 'NR==1 {header=$_} FNR==1 && NR!=1 { $_ ~ $header getline; } {print}' ./sections_results/*_results.txt > ./d2_similarity_SA.txt

Command exit status:
  2

Command output:
  (empty)

Command error:
  Exception in thread "main" java.io.IOException: Spectrum 'File: D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw; SpectrumID: 2220; scans: 2975' in mgf file 'myown.mgf' not found!
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:788)
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:730)
        at PDVGUI.GenerateSpectrumTable.process(GenerateSpectrumTable.java:84)
        at PDVGUI.GenerateSpectrumTable.<init>(GenerateSpectrumTable.java:31)
        at PDVGUI.GenerateSpectrumTable.main(GenerateSpectrumTable.java:21)
  Bioconductor version 3.10 (BiocManager 1.30.10), ?BiocManager::install for help
  Bioconductor version '3.10' is out-of-date; the current release version '3.12'
    is available with R version '4.0'; see https://bioconductor.org/install
  ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
  ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
  ✔ tibble  2.1.3     ✔ dplyr   0.8.4
  ✔ tidyr   1.0.0     ✔ stringr 1.4.0
  ✔ readr   1.3.1     ✔ forcats 0.4.0
  ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
  ✖ dplyr::filter() masks stats::filter()
  ✖ dplyr::lag()    masks stats::lag()
  
  Attaching package: ‘data.table’
  
  The following objects are masked from ‘package:dplyr’:
  
      between, first, last
  
  The following object is masked from ‘package:purrr’:
  
      transpose
  
  Warning message:
  In fread(args[1]) :
    File './d2_spectrum_pairs.txt' has size 0. Returning a NULL data.table.
  Bioconductor version 3.10 (BiocManager 1.30.10), ?BiocManager::install for help
  Bioconductor version '3.10' is out-of-date; the current release version '3.12'
    is available with R version '4.0'; see https://bioconductor.org/install
  ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
  ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
  ✔ tibble  2.1.3     ✔ dplyr   0.8.4
  ✔ tidyr   1.0.0     ✔ stringr 1.4.0
  ✔ readr   1.3.1     ✔ forcats 0.4.0
  ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
  ✖ dplyr::between()   masks data.table::between()
  ✖ dplyr::filter()    masks stats::filter()
  ✖ dplyr::first()     masks data.table::first()
  ✖ dplyr::lag()       masks stats::lag()
  ✖ dplyr::last()      masks data.table::last()
  ✖ purrr::transpose() masks data.table::transpose()
  Error in fread(args[1]) : 
    File './sections/*' does not exist or is non-readable. getwd()=='/home/centos/Work/test/work/f0/b0788f2a27d40b620301bb2776920b'
  Execution halted
  awk: cannot open ./sections_results/*_results.txt (No such file or directory)

Work dir:
  /home/centos/Work/test/work/f0/b0788f2a27d40b620301bb2776920b

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
@KaiLiCn
Collaborator

KaiLiCn commented Apr 8, 2021

Hi,

Sorry for the inconvenience. Could you please share all inputs of process_pDeep2_results with me? I will test and fix it this week.

Kai

@nukaemon
Author

nukaemon commented Apr 9, 2021

Hello @KaiLiCn

Thank you for your kind reply and support.
I can share the input files via AWS by providing you with an S3 presigned URL.
Can I send it to the Gmail address in your profile?

@KaiLiCn
Collaborator

KaiLiCn commented Apr 9, 2021

Sure. My email address is likaicnsh@gmail.com.

@wenbostar
Collaborator

@KaiLiCn any updates on this?
